Enhanced 3-D modeling for landmark image classification

Publication Type:
Journal Article
IEEE Transactions on Multimedia, 2012, 14 (4 PART 2), pp. 1246 - 1258
Issue Date:
Landmark image classification is challenging because landmark images are taken under widely varying conditions, e.g., illumination, viewpoint, zoom, and occlusion. Most existing approaches utilize features extracted from the whole image, including both landmark and non-landmark areas. However, non-landmark areas introduce redundant and noisy information. In this paper, we propose a novel three-step approach to improve landmark image classification. First, an attention-based 3-D reconstruction method is proposed to reconstruct sparse 3-D landmark models. Second, the sparse 3-D models are projected onto iconic images in order to identify images of the hot regions. Hot regions are the parts of a landmark that attract photographers' attention and are frequently captured in photos. These hot region images are later used to enhance the reconstructed sparse 3-D models. Third, the landmark regions are obtained by mapping the enhanced 3-D models to landmark images. A k-dimensional tree (kd-tree) is then constructed for each landmark from scale-invariant feature transform (SIFT) features extracted from the landmark area, and is used to classify unlabeled images into pre-defined landmark categories. The proposed method is evaluated using 291,661 images of 51 landmarks. Comparative experiments indicate that our method outperforms the bag-of-words (BoW) based approach by 18.5% and the spatial pyramid matching using sparse coding (ScSPM) method by 8.4%. © 2012 IEEE.
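The final classification step, one kd-tree per landmark queried with an unlabeled image's descriptors, can be illustrated with a minimal sketch. This is not the authors' implementation. The abstract does not say how the per-landmark trees produce a decision, so the rule used here (choose the landmark with the smallest summed nearest-neighbor distance) is an assumption. The 2-D vectors stand in for 128-dimensional SIFT descriptors, and the names build_kdtree, nearest_sq_dist, classify, landmark_a and landmark_b are hypothetical.

```python
import math

def build_kdtree(points, depth=0):
    """Recursively build a kd-tree as nested tuples (point, axis, left, right)."""
    if not points:
        return None
    axis = depth % len(points[0])          # cycle through the dimensions
    points = sorted(points, key=lambda p: p[axis])
    mid = len(points) // 2                 # median point splits the set
    return (points[mid], axis,
            build_kdtree(points[:mid], depth + 1),
            build_kdtree(points[mid + 1:], depth + 1))

def nearest_sq_dist(node, query, best=math.inf):
    """Squared Euclidean distance from query to its nearest point in the tree."""
    if node is None:
        return best
    point, axis, left, right = node
    best = min(best, sum((a - b) ** 2 for a, b in zip(point, query)))
    diff = query[axis] - point[axis]
    near, far = (left, right) if diff < 0 else (right, left)
    best = nearest_sq_dist(near, query, best)
    # Search the far side only if the splitting plane is closer than the best hit.
    if diff * diff < best:
        best = nearest_sq_dist(far, query, best)
    return best

def classify(trees, descriptors):
    """Assumed decision rule: smallest total nearest-neighbor distance wins."""
    def cost(label):
        return sum(nearest_sq_dist(trees[label], d) for d in descriptors)
    return min(trees, key=cost)

# Toy stand-ins for SIFT descriptors taken from each landmark's region.
training = {
    "landmark_a": [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1)],
    "landmark_b": [(1.0, 1.0), (0.9, 1.1), (1.1, 0.8)],
}
trees = {label: build_kdtree(pts) for label, pts in training.items()}

# Descriptors of one unlabeled image.
query = [(0.05, 0.1), (0.15, 0.15)]
print(classify(trees, query))
```

In this sketch only descriptors from the landmark region are indexed, which mirrors the abstract's point that non-landmark areas add redundant and noisy information. A production system would more likely use an approximate nearest-neighbor library than a hand-written exact tree.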