They are Not Equally Reliable: Semantic Event Search Using Differentiated Concept Classifiers

Chang, X; Yu, YL; Yang, Y; Xing, EP

Publication Type: Conference Proceeding
Citation: Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2016, 2016-December, pp. 1884-1893
Issue Date: 2016-12-09

This item is open access.

Full metadata record

Field	Value	Language
dc.contributor.author	Chang, X https://orcid.org/0000-0002-7778-8807	en_US
dc.contributor.author	Yu, YL	en_US
dc.contributor.author	Yang, Y https://orcid.org/0000-0001-5528-0546	en_US
dc.contributor.author	Xing, EP	en_US
dc.date.issued	2016-12-09	en_US
dc.identifier.citation	Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2016, 2016-December pp. 1884 - 1893	en_US
dc.identifier.isbn	9781467388504	en_US
dc.identifier.issn	1063-6919	en_US
dc.identifier.uri	http://hdl.handle.net/10453/58433
dc.description.abstract	© 2016 IEEE. Complex event detection on unconstrained Internet videos has seen much progress in recent years. However, state-of-the-art performance degrades dramatically when the number of positive training exemplars falls short. Since label acquisition is costly, laborious, and time-consuming, there is a real need to consider the much more challenging semantic event search problem, where no example video is given. In this paper, we present a state-of-the-art event search system without any example videos. Relying on the key observation that events (e.g. dog show) are usually compositions of multiple mid-level concepts (e.g. 'dog,' 'theater,' and 'dog jumping'), we first train a skip-gram model to measure the relevance of each concept with the event of interest. The relevant concept classifiers then cast votes on the test videos but their reliability, due to lack of labeled training videos, has been largely unaddressed. We propose to combine the concept classifiers based on a principled estimate of their accuracy on the unlabeled test videos. A novel warping technique is proposed to improve the performance and an efficient highly-scalable algorithm is provided to quickly solve the resulting optimization. We conduct extensive experiments on the latest TRECVID MEDTest 2014, MEDTest 2013 and CCV datasets, and achieve state-of-the-art performances.	en_US
dc.relation	http://purl.org/au-research/grants/arc/DE130101311
dc.relation	http://purl.org/au-research/grants/arc/DP150103008
dc.relation.ispartof	Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition	en_US
dc.relation.isbasedon	10.1109/CVPR.2016.208	en_US
dc.title	They are Not Equally Reliable: Semantic Event Search Using Differentiated Concept Classifiers	en_US
dc.type	Conference Proceeding
utslib.citation.volume	2016-December	en_US
utslib.for	0801 Artificial Intelligence and Image Processing	en_US
pubs.embargo.period	Not known	en_US
pubs.organisational-group	/University of Technology Sydney
pubs.organisational-group	/University of Technology Sydney/Faculty of Engineering and Information Technology
pubs.organisational-group	/University of Technology Sydney/Strength - CAI - Centre for Artificial Intelligence
pubs.organisational-group	/University of Technology Sydney/Students
utslib.copyright.status	open_access
pubs.publication-status	Published	en_US
pubs.volume	2016-December	en_US

Abstract:

© 2016 IEEE. Complex event detection on unconstrained Internet videos has seen much progress in recent years. However, state-of-the-art performance degrades dramatically when the number of positive training exemplars falls short. Since label acquisition is costly, laborious, and time-consuming, there is a real need to consider the much more challenging semantic event search problem, where no example video is given. In this paper, we present a state-of-the-art event search system without any example videos. Relying on the key observation that events (e.g. dog show) are usually compositions of multiple mid-level concepts (e.g. 'dog,' 'theater,' and 'dog jumping'), we first train a skip-gram model to measure the relevance of each concept with the event of interest. The relevant concept classifiers then cast votes on the test videos but their reliability, due to lack of labeled training videos, has been largely unaddressed. We propose to combine the concept classifiers based on a principled estimate of their accuracy on the unlabeled test videos. A novel warping technique is proposed to improve the performance and an efficient highly-scalable algorithm is provided to quickly solve the resulting optimization. We conduct extensive experiments on the latest TRECVID MEDTest 2014, MEDTest 2013 and CCV datasets, and achieve state-of-the-art performances.
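The pipeline outlined in the abstract (score each concept's relevance to the event, then let the relevant concept classifiers vote with weights reflecting how reliable each one is) can be illustrated with a small sketch. This is not the authors' algorithm: the abstract does not give the reliability estimate, the warping technique, or the optimization, so the code below assumes cosine similarity between pre-computed embedding vectors as the relevance measure and takes the reliability weights as given inputs. All function names, the toy vectors, and the weighting rule (relevance times reliability, normalized) are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def select_concepts(event_vec, concept_vecs, top_k):
    """Rank concepts by similarity to the event embedding and keep the top_k.

    Assumption: embeddings come from some word-embedding model (e.g. skip-gram);
    here they are just plain lists of floats.
    """
    scored = [(name, cosine(event_vec, vec)) for name, vec in concept_vecs.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_k]


def fuse_scores(video_scores, relevance, reliability):
    """Weighted vote over concept classifier scores for one video.

    Assumption: weight = max(relevance, 0) * reliability, normalized to sum to 1.
    The reliability values are placeholders supplied by the caller; the paper
    estimates them from unlabeled test videos, which is not reproduced here.
    """
    weights = {c: max(relevance[c], 0.0) * reliability[c] for c in relevance}
    total = sum(weights.values())
    if total == 0.0:
        return 0.0
    return sum(weights[c] * video_scores[c] for c in weights) / total


if __name__ == "__main__":
    # Toy 2-D embeddings, purely illustrative.
    event_vec = [1.0, 0.0]  # stands in for "dog show"
    concept_vecs = {"dog": [1.0, 0.0], "theater": [1.0, 1.0], "car": [0.0, 1.0]}
    relevant = dict(select_concepts(event_vec, concept_vecs, top_k=2))
    reliability = {"dog": 0.75, "theater": 0.25}  # hypothetical values
    video_scores = {"dog": 0.9, "theater": 0.4}   # hypothetical classifier outputs
    print(fuse_scores(video_scores, relevant, reliability))
```

Under these assumptions, a classifier that is judged unreliable contributes little to the final score even when its concept is highly relevant, which is the intuition behind treating the classifiers as not equally reliable.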

Please use this identifier to cite or link to this item:

http://hdl.handle.net/10453/58433