Dynamic concept composition for zero-example event detection

Chang, X; Yang, Y; Long, G; Zhang, C; Hauptmann, AG

Publication Type: Conference Proceeding
Citation: 30th AAAI Conference on Artificial Intelligence, AAAI 2016, 2016, pp. 3464 - 3470
Issue Date: 2016-01-01

This item is open access.

Full metadata record

Field	Value	Language
dc.contributor.author	Chang, X https://orcid.org/0000-0002-7778-8807	en_US
dc.contributor.author	Yang, Y https://orcid.org/0000-0001-5528-0546	en_US
dc.contributor.author	Long, G https://orcid.org/0000-0003-3740-9515	en_US
dc.contributor.author	Zhang, C https://orcid.org/0000-0001-5715-7154	en_US
dc.contributor.author	Hauptmann, AG	en_US
dc.date.issued	2016-01-01	en_US
dc.identifier.citation	30th AAAI Conference on Artificial Intelligence, AAAI 2016, 2016, pp. 3464 - 3470	en_US
dc.identifier.isbn	9781577357605	en_US
dc.identifier.uri	http://hdl.handle.net/10453/100649
dc.description.abstract	© Copyright 2016, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved. In this paper, we focus on automatically detecting events in unconstrained videos without the use of any visual training exemplars. In principle, zero-shot learning makes it possible to train an event detection model based on the assumption that events (e.g. birthday party) can be described by multiple mid-level semantic concepts (e.g. "blowing candle", "birthday cake"). Towards this goal, we first pre-train a bundle of concept classifiers using data from other sources. Then we evaluate the semantic correlation of each concept w.r.t. the event of interest and select the relevant concept classifiers, which are applied to all test videos to get multiple prediction score vectors. While most existing systems combine the predictions of the concept classifiers with fixed weights, we propose to learn the optimal weights of the concept classifiers for each testing video by exploring a set of videos available online with free-form text descriptions of their content. To validate the effectiveness of the proposed approach, we have conducted extensive experiments on the latest TRECVID MEDTest 2014, MEDTest 2013 and CCV datasets. The experimental results confirm the superiority of the proposed approach.	en_US
dc.relation	http://purl.org/au-research/grants/arc/LP160100630
dc.relation.ispartof	30th AAAI Conference on Artificial Intelligence, AAAI 2016	en_US
dc.title	Dynamic concept composition for zero-example event detection	en_US
dc.type	Conference Proceeding
utslib.for	080101 Adaptive Agents and Intelligent Robotics	en_US
utslib.for	080109 Pattern Recognition and Data Mining	en_US
pubs.embargo.period	Not known	en_US
pubs.organisational-group	/University of Technology Sydney
pubs.organisational-group	/University of Technology Sydney/DVC (International)
pubs.organisational-group	/University of Technology Sydney/Faculty of Engineering and Information Technology
pubs.organisational-group	/University of Technology Sydney/Strength - ACRI - Australia China Relations Institute
pubs.organisational-group	/University of Technology Sydney/Strength - CAI - Centre for Artificial Intelligence
pubs.organisational-group	/University of Technology Sydney/Students
utslib.copyright.status	open_access
pubs.publication-status	Published	en_US

Abstract:

© Copyright 2016, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved. In this paper, we focus on automatically detecting events in unconstrained videos without the use of any visual training exemplars. In principle, zero-shot learning makes it possible to train an event detection model based on the assumption that events (e.g. birthday party) can be described by multiple mid-level semantic concepts (e.g. "blowing candle", "birthday cake"). Towards this goal, we first pre-train a bundle of concept classifiers using data from other sources. Then we evaluate the semantic correlation of each concept w.r.t. the event of interest and select the relevant concept classifiers, which are applied to all test videos to get multiple prediction score vectors. While most existing systems combine the predictions of the concept classifiers with fixed weights, we propose to learn the optimal weights of the concept classifiers for each testing video by exploring a set of videos available online with free-form text descriptions of their content. To validate the effectiveness of the proposed approach, we have conducted extensive experiments on the latest TRECVID MEDTest 2014, MEDTest 2013 and CCV datasets. The experimental results confirm the superiority of the proposed approach.
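
The pipeline outlined in the abstract (score concepts against the event, keep the relevant ones, then fuse their per-video predictions) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the two-dimensional concept vectors, the use of cosine similarity as the semantic correlation, the function names, and the toy classifier scores. In particular, `per_video_score` is only a hypothetical stand-in showing what "weights that differ per video" could look like; it is not the weight-learning procedure proposed in the paper, which draws on web videos with text descriptions.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def select_concepts(event_vec, concept_vecs, k):
    """Rank concepts by similarity to the event and keep the k most relevant."""
    relevance = {name: cosine(event_vec, vec) for name, vec in concept_vecs.items()}
    top = sorted(relevance, key=relevance.get, reverse=True)[:k]
    return {name: relevance[name] for name in top}


def fixed_weight_score(concept_scores, weights):
    """Baseline fusion: the same normalised weights are used for every video."""
    total = sum(weights.values())
    return sum(weights[c] * concept_scores[c] for c in weights) / total


def per_video_score(concept_scores, weights):
    """Hypothetical per-video fusion: each weight is rescaled by the classifier's
    own score on this video. A stand-in only, not the paper's learned weights."""
    adapted = {c: weights[c] * concept_scores[c] for c in weights}
    total = sum(adapted.values())
    if total == 0:
        return 0.0
    return sum(adapted[c] * concept_scores[c] for c in adapted) / total


# Toy, made-up embeddings for the event "birthday party" and three concepts.
event_vec = [1.0, 0.0]
concept_vecs = {
    "blowing candle": [0.9, 0.1],
    "birthday cake": [0.8, 0.2],
    "car racing": [0.0, 1.0],
}
selected = select_concepts(event_vec, concept_vecs, k=2)

# Made-up concept classifier outputs for two test videos.
video_a = {"blowing candle": 0.9, "birthday cake": 0.7, "car racing": 0.1}
video_b = {"blowing candle": 0.1, "birthday cake": 0.2, "car racing": 0.9}

for name, video in (("video_a", video_a), ("video_b", video_b)):
    print(name, fixed_weight_score(video, selected), per_video_score(video, selected))
```

In this toy setup both fusion rules rank `video_a` above `video_b` for the event. The difference is that the fixed rule applies one weight vector to every video, whereas the per-video rule derives a separate weight vector for each one.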

Please use this identifier to cite or link to this item:

http://hdl.handle.net/10453/100649