Classification by retrieval: Binarizing data and classifiers

Publication Type:
Conference Proceeding
SIGIR 2017 - Proceedings of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2017, pp. 595 - 604
Issue Date:
Filename Description Size
p595-shen.pdfPublished version1.13 MB
Adobe PDF
Full metadata record
© 2017 Copyright held by the owner/author(s). This paper proposes a generic formulation that significantly expedites the training and deployment of image classification models, particularly under the scenarios of many image categories and high feature dimensions. As the core idea, our method represents both the images and learned classifiers using binary hash codes, which are simultaneously learned from the training data. Classifying an image thereby reduces to retrieving its nearest class codes in the Hamming space. Specifically, we formulate multiclass image classification as an optimization problem over binary variables. The optimization alternatingly proceeds over the binary classifiers and image hash codes. Profiting from the special property of binary codes, we show that the sub-problems can be efficiently solved through either a binary quadratic program (BQP) or a linear program. In particular, for attacking the BQP problem, we propose a novel bit-flipping procedure which enjoys high efficacy and a local optimality guarantee. Our formulation supports a large family of empirical loss functions and is, in specific, instantiated by exponential and linear losses. Comprehensive evaluations are conducted on several representative image benchmarks. The experiments consistently exhibit reduced computational and memory complexities of model training and deployment, without sacrificing classification accuracy.
Please use this identifier to cite or link to this item: