Exploiting visual word co-occurrence for image retrieval

Publication Type:
Conference Proceeding
MM 2012 - Proceedings of the 20th ACM International Conference on Multimedia, 2012, pp. 69 - 78
Bag-of-visual-words (BOVW)-based image representation has received intense attention in recent years and has significantly improved content-based image retrieval (CBIR). However, BOVW does not consider the spatial correlation between visual words in natural images, and thus biases the generated visual words towards noise when the corresponding visual features are not stable. In this paper, we construct a visual word co-occurrence table from visual word co-occurrences extracted from small affine-invariant regions in a large collection of natural images. Based on this table, we first present a novel high-order predictor to accelerate the generation of neighboring visual words. A co-occurrence matrix is then introduced to refine the similarity measure for image ranking. Like the inverse document frequency (idf), it down-weights the contribution of words that are less discriminative because of frequent co-occurrence. We conduct experiments on the Oxford and Paris Buildings datasets, with the ImageNet dataset used for a large-scale evaluation. Thorough experimental results suggest that our method outperforms the state-of-the-art, especially when the vocabulary size is comparatively small. In addition, our method is not much more costly than the BOVW model. © 2012 ACM.
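The idf-like down-weighting the abstract describes can be illustrated with a minimal sketch: build a co-occurrence matrix over visual-word assignments, then assign lower weight to words that co-occur with many others. The function name, the smoothing constant, and the exact weighting formula below are illustrative assumptions, not the paper's method.

```python
import numpy as np

def cooccurrence_weights(word_ids_per_image, vocab_size):
    """Build a visual-word co-occurrence matrix from per-image word lists
    and derive idf-like down-weights for frequently co-occurring words.
    Hypothetical sketch; the paper's exact weighting may differ."""
    C = np.zeros((vocab_size, vocab_size), dtype=np.float64)
    for words in word_ids_per_image:
        unique = np.unique(words)
        # Count each co-occurring pair of distinct words once per image.
        for i in unique:
            for j in unique:
                if i != j:
                    C[i, j] += 1.0
    # A word that co-occurs frequently with many others is less
    # discriminative, so down-weight it in analogy with idf.
    co_freq = C.sum(axis=1)                # total co-occurrence count per word
    weights = 1.0 / np.log(2.0 + co_freq)  # smooth, monotone down-weighting
    return C, weights

# Toy example: 3 images, vocabulary of 4 visual words.
images = [[0, 1, 1, 2], [0, 1], [0, 3]]
C, w = cooccurrence_weights(images, vocab_size=4)
# Word 0 co-occurs with words 1, 2, and 3, so it receives the lowest
# weight; word 3 co-occurs only with word 0 and keeps a higher weight.
```

In a retrieval pipeline these weights would multiply the per-word terms of the image similarity score, so ubiquitous word pairs contribute less to the ranking.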