Metric learning based structural appearance model for robust visual tracking

Wu, Y; Ma, B; Yang, M; Zhang, J; Jia, Y

Publication Type: Journal Article
Citation: IEEE Transactions on Circuits and Systems for Video Technology, 2014, 24 (5), pp. 865-877
Issue Date: 2014-01-01

Closed Access

This item is closed access; the full text is not available.

Full metadata record

Field	Value	Language
dc.contributor.author	Wu, Y	en_US
dc.contributor.author	Ma, B	en_US
dc.contributor.author	Yang, M	en_US
dc.contributor.author	Zhang, J https://orcid.org/0000-0002-7240-3541	en_US
dc.contributor.author	Jia, Y	en_US
dc.date.issued	2014-01-01	en_US
dc.identifier.citation	IEEE Transactions on Circuits and Systems for Video Technology, 2014, 24 (5), pp. 865 - 877	en_US
dc.identifier.issn	1051-8215	en_US
dc.identifier.uri	http://hdl.handle.net/10453/35483
dc.relation.ispartof	IEEE Transactions on Circuits and Systems for Video Technology	en_US
dc.relation.isbasedon	10.1109/TCSVT.2013.2291283	en_US
dc.subject.classification	Artificial Intelligence & Image Processing	en_US
dc.title	Metric learning based structural appearance model for robust visual tracking	en_US
dc.type	Journal Article
utslib.citation.volume	5	en_US
utslib.citation.volume	24	en_US
utslib.for	0906 Electrical and Electronic Engineering	en_US
utslib.for	0801 Artificial Intelligence and Image Processing	en_US
pubs.embargo.period	Not known	en_US
pubs.organisational-group	/University of Technology Sydney
pubs.organisational-group	/University of Technology Sydney/Faculty of Engineering and Information Technology
pubs.organisational-group	/University of Technology Sydney/Faculty of Engineering and Information Technology/School of Electrical and Data Engineering
pubs.organisational-group	/University of Technology Sydney/Strength - GBDTC - Global Big Data Technologies
utslib.copyright.status	closed_access
pubs.issue	5	en_US
pubs.publication-status	Published	en_US
pubs.volume	24	en_US

Abstract:

Appearance modeling is a key issue for the success of a visual tracker. Sparse representation based appearance modeling has received an increasing amount of interest in recent years. However, most existing work uses reconstruction errors to compute the observation likelihood under the generative framework, which may perform poorly, especially under significant appearance variations. In this paper, we advocate an approach to visual tracking that seeks an appropriate metric in the feature space of sparse codes and propose a metric learning based structural appearance model for more accurate matching of different appearances. This structural representation is acquired by performing multiscale max pooling on the weighted local sparse codes of image patches. An online multiple instance metric learning algorithm is proposed that learns a discriminative and adaptive metric, thereby better distinguishing the visual object of interest from the background. The multiple instance setting alleviates the drift problem that can be caused by misaligned training examples. Tracking is then carried out within a Bayesian inference framework, in which the learned metric and the structural object representation are used to construct the observation model. Comprehensive experiments on challenging image sequences demonstrate qualitatively and quantitatively that the proposed algorithm outperforms state-of-the-art methods. © 2013 IEEE.
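
The abstract names two building blocks, multiscale max pooling over weighted local sparse codes and an observation model built from a learned metric, without giving their details. The following is a minimal sketch of how such pieces could fit together, not the authors' implementation. The grid layout, the pooling levels, the use of absolute code values, the exponential likelihood with scale `sigma`, and all function names are assumptions made for illustration. The metric matrix `M` is taken as given here; the online multiple instance learning step that would produce it is not shown.

```python
import math
from typing import List, Sequence

Vector = List[float]


def multiscale_max_pool(codes: Sequence[Sequence[Vector]],
                        weights: Sequence[Sequence[float]],
                        levels: Sequence[int] = (1, 2)) -> Vector:
    """Pool a grid of weighted local sparse codes over several spatial scales.

    codes[r][c] is the sparse code of the patch at grid position (r, c) and
    weights[r][c] is that patch's weight. For each level n (assumed layout),
    the grid is split into n x n cells and every code dimension is max-pooled
    inside each cell. The pooled vectors are concatenated into one feature.
    """
    rows, cols = len(codes), len(codes[0])
    dim = len(codes[0][0])
    feature: Vector = []
    for n in levels:
        for i in range(n):
            for j in range(n):
                r0, r1 = i * rows // n, (i + 1) * rows // n
                c0, c1 = j * cols // n, (j + 1) * cols // n
                pooled = [0.0] * dim
                for r in range(r0, r1):
                    for c in range(c0, c1):
                        for k in range(dim):
                            # Assumption: pool weighted absolute responses.
                            v = weights[r][c] * abs(codes[r][c][k])
                            if v > pooled[k]:
                                pooled[k] = v
                feature.extend(pooled)
    return feature


def metric_distance(x: Vector, y: Vector, M: Sequence[Sequence[float]]) -> float:
    """Squared Mahalanobis-style distance (x - y)^T M (x - y)."""
    d = [a - b for a, b in zip(x, y)]
    n = len(d)
    return sum(d[i] * M[i][j] * d[j] for i in range(n) for j in range(n))


def observation_likelihood(candidate: Vector, template: Vector,
                           M: Sequence[Sequence[float]],
                           sigma: float = 1.0) -> float:
    """Assumed likelihood form: exp(-distance / sigma)."""
    return math.exp(-metric_distance(candidate, template, M) / sigma)
```

In a Bayesian tracking loop of the kind the abstract describes, each candidate region would be encoded, pooled with `multiscale_max_pool`, and scored against the object template with `observation_likelihood`. A smaller distance under `M` gives a higher likelihood, so the metric rather than a reconstruction error decides which candidate matches best.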

Please use this identifier to cite or link to this item:

http://hdl.handle.net/10453/35483