Enhanced 3-D modeling for landmark image classification

Xiao, X; Xu, C; Wang, J; Xu, M

Publication Type: Journal Article
Citation: IEEE Transactions on Multimedia, 2012, 14 (4 PART 2), pp. 1246 - 1258
Issue Date: 2012-07-27

Closed Access

	Filename	Description	Size	Format
	2012001451OK.pdf		2.94 MB	Adobe PDF

This item is closed access and not available.

Full metadata record

Field	Value	Language
dc.contributor.author	Xiao, X	en_US
dc.contributor.author	Xu, C	en_US
dc.contributor.author	Wang, J	en_US
dc.contributor.author	Xu, M https://orcid.org/0000-0001-9581-8849	en_US
dc.date.issued	2012-07-27	en_US
dc.identifier.citation	IEEE Transactions on Multimedia, 2012, 14 (4 PART 2), pp. 1246 - 1258	en_US
dc.identifier.issn	1520-9210	en_US
dc.identifier.uri	http://hdl.handle.net/10453/22856
dc.description.abstract	Landmark image classification is a challenging task because landmark images are taken under widely varying conditions, e.g., illumination, viewpoint, zoom level, and occlusion. Most existing approaches use features extracted from the whole image, including both landmark and non-landmark areas. However, non-landmark areas introduce redundant and noisy information. In this paper, we propose a novel three-step approach to improve landmark image classification. First, an attention-based 3-D reconstruction method is proposed to reconstruct sparse 3-D landmark models. Second, the sparse 3-D models are projected onto iconic images in order to identify images of the hot regions. For a landmark, hot regions are the parts that attract photographers' attention and are frequently captured in photos. These hot region images are later used to enhance the reconstructed sparse 3-D models. Third, the landmark regions are obtained by mapping the enhanced 3-D models to landmark images. A k-dimensional tree (kd-tree) is then constructed for each landmark from scale-invariant feature transform (SIFT) features extracted from the landmark area, and is used to classify unlabeled images into predefined landmark categories. The proposed method is evaluated on 291,661 images of 51 landmarks. Comparative experiments indicate that our method outperforms the bag-of-words (BoW) based approach by 18.5% and spatial pyramid matching using sparse coding (ScSPM) by 8.4%. © 2012 IEEE.	en_US
dc.relation.ispartof	IEEE Transactions on Multimedia	en_US
dc.relation.isbasedon	10.1109/TMM.2012.2190384	en_US
dc.subject.classification	Artificial Intelligence & Image Processing	en_US
dc.title	Enhanced 3-D modeling for landmark image classification	en_US
dc.type	Journal Article
utslib.citation.volume	4 PART 2	en_US
utslib.citation.volume	14	en_US
utslib.for	0801 Artificial Intelligence and Image Processing	en_US
utslib.for	08 Information and Computing Sciences	en_US
utslib.for	09 Engineering	en_US
pubs.embargo.period	Not known	en_US
pubs.organisational-group	/University of Technology Sydney
pubs.organisational-group	/University of Technology Sydney/Faculty of Engineering and Information Technology
pubs.organisational-group	/University of Technology Sydney/Faculty of Engineering and Information Technology/School of Electrical and Data Engineering
pubs.organisational-group	/University of Technology Sydney/Strength - GBDTC - Global Big Data Technologies
pubs.organisational-group	/University of Technology Sydney/Strength - INEXT - Innovation in IT Services and Applications
utslib.copyright.status	closed_access
pubs.issue	4 PART 2	en_US
pubs.publication-status	Published	en_US
pubs.volume	14	en_US

Abstract:

Landmark image classification is a challenging task because landmark images are taken under widely varying conditions, e.g., illumination, viewpoint, zoom level, and occlusion. Most existing approaches use features extracted from the whole image, including both landmark and non-landmark areas. However, non-landmark areas introduce redundant and noisy information. In this paper, we propose a novel three-step approach to improve landmark image classification. First, an attention-based 3-D reconstruction method is proposed to reconstruct sparse 3-D landmark models. Second, the sparse 3-D models are projected onto iconic images in order to identify images of the hot regions. For a landmark, hot regions are the parts that attract photographers' attention and are frequently captured in photos. These hot region images are later used to enhance the reconstructed sparse 3-D models. Third, the landmark regions are obtained by mapping the enhanced 3-D models to landmark images. A k-dimensional tree (kd-tree) is then constructed for each landmark from scale-invariant feature transform (SIFT) features extracted from the landmark area, and is used to classify unlabeled images into predefined landmark categories. The proposed method is evaluated on 291,661 images of 51 landmarks. Comparative experiments indicate that our method outperforms the bag-of-words (BoW) based approach by 18.5% and spatial pyramid matching using sparse coding (ScSPM) by 8.4%. © 2012 IEEE.
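
Illustrative sketch (not the authors' implementation): the final step in the abstract, one kd-tree per landmark built from local descriptors and used to label an unseen image, might look roughly like the Python below. The abstract does not state the decision rule, so the majority vote over nearest-neighbor matches is an assumption. The toy 8-dimensional Gaussian vectors stand in for 128-dimensional SIFT descriptors, and all names (build_kdtree, nearest_sq_dist, classify, landmark_a, landmark_b) are hypothetical.

```python
import math
import random
from typing import Dict, List, Optional, Sequence

Vector = Sequence[float]


class KDNode:
    """One node of a kd-tree: a stored point plus the axis it splits on."""

    def __init__(self, point: Vector, axis: int,
                 left: Optional["KDNode"], right: Optional["KDNode"]) -> None:
        self.point, self.axis, self.left, self.right = point, axis, left, right


def build_kdtree(points: List[Vector], depth: int = 0) -> Optional[KDNode]:
    """Build a kd-tree by splitting on the median, cycling through the axes."""
    if not points:
        return None
    axis = depth % len(points[0])
    ordered = sorted(points, key=lambda p: p[axis])
    mid = len(ordered) // 2
    return KDNode(ordered[mid], axis,
                  build_kdtree(ordered[:mid], depth + 1),
                  build_kdtree(ordered[mid + 1:], depth + 1))


def nearest_sq_dist(node: Optional[KDNode], query: Vector,
                    best: float = math.inf) -> float:
    """Return the squared distance from query to its nearest stored point."""
    if node is None:
        return best
    dist = sum((a - b) ** 2 for a, b in zip(node.point, query))
    best = min(best, dist)
    diff = query[node.axis] - node.point[node.axis]
    near, far = (node.left, node.right) if diff < 0 else (node.right, node.left)
    best = nearest_sq_dist(near, query, best)
    # Visit the far side only if the splitting plane is closer than the best match.
    if diff * diff < best:
        best = nearest_sq_dist(far, query, best)
    return best


def classify(descriptors: List[Vector], trees: Dict[str, Optional[KDNode]]) -> str:
    """Assumed rule: each descriptor votes for the landmark whose tree holds
    its nearest neighbor; the landmark with the most votes wins."""
    votes = {name: 0 for name in trees}
    for d in descriptors:
        winner = min(trees, key=lambda name: nearest_sq_dist(trees[name], d))
        votes[winner] += 1
    return max(votes, key=votes.get)


def make_descriptors(rng: random.Random, center: float, n: int,
                     dim: int = 8) -> List[List[float]]:
    """Toy stand-in for SIFT descriptors: Gaussian noise around a center value."""
    return [[rng.gauss(center, 0.05) for _ in range(dim)] for _ in range(n)]


rng = random.Random(0)
# One tree per landmark, built only from descriptors of the landmark region.
trees = {
    "landmark_a": build_kdtree(make_descriptors(rng, 0.2, 200)),
    "landmark_b": build_kdtree(make_descriptors(rng, 0.8, 200)),
}
query = make_descriptors(rng, 0.8, 30)  # descriptors from an unlabeled image
print(classify(query, trees))
```

In practice a library kd-tree with approximate search would likely replace the pure-Python tree here, since exact kd-tree search degrades in high-dimensional descriptor spaces.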

Please use this identifier to cite or link to this item:

http://hdl.handle.net/10453/22856