Fast, Accurate Detection of 100,000 Object Classes on a Single Machine: Technical Supplement

Thomas Dean
Mark Ruzon
Mark Segal
Jonathon Shlens
Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, IEEE Computer Society, Washington, DC, USA (2013)
Google Scholar

Abstract

In the companion paper published in CVPR 2013, we presented a method that can directly use deformable part models (DPMs) trained as in [Felzenszwalb et al CVPR 2008]. After training, HOG based part filters are hashed, and, during inference, counts of hashing collisions summed over all hash bands serve as a proxy for part-filter / sliding-window dot products, i.e., filter responses. These counts are an approximation and so we take the original HOG-based filters for the top hash counts and calculate the exact dot products for scoring.

It is possible to train DPM models not on HOG data but on a hashed WTA [Yagnik et al ICCV 2011] version of this data. The resulting part filters are sparse, real-valued vectors the size of WTA vectors computed from sliding windows. Given the WTA hash of a window, we exactly recover dot products of the top responses using an extension of locality-sensitive hashing. In this supplement, we sketch a method for training such WTA-based models.

Research Areas