Recognizing an Action Using Its Name: A Knowledge-Based Approach

Gan, C; Yang, Y; Zhu, L; Zhao, D; Zhuang, Y

Publication Type: Journal Article
Citation: International Journal of Computer Vision, 2016, 120 (1), pp. 61 - 77
Issue Date: 2016-10-01

Closed Access

Filename	Description	Size	Format
art%3A10.1007%2Fs11263-016-0893-6.pdf	Published Version	5.05 MB	Adobe PDF

This item is closed access and not available.

Full metadata record

Field	Value	Language
dc.contributor.author	Gan, C	en_US
dc.contributor.author	Yang, Y https://orcid.org/0000-0001-5528-0546	en_US
dc.contributor.author	Zhu, L https://orcid.org/0000-0002-4093-7557	en_US
dc.contributor.author	Zhao, D	en_US
dc.contributor.author	Zhuang, Y	en_US
dc.date.issued	2016-10-01	en_US
dc.identifier.citation	International Journal of Computer Vision, 2016, 120 (1), pp. 61 - 77	en_US
dc.identifier.issn	0920-5691	en_US
dc.identifier.uri	http://hdl.handle.net/10453/121791
dc.description.abstract	© 2016, Springer Science+Business Media New York. Existing action recognition algorithms require a set of positive exemplars to train a classifier for each action. However, the number of action classes is very large and users’ queries vary dramatically. It is impractical to define all possible action classes beforehand. To address this issue, we propose to perform action recognition with no positive exemplars, which is often known as zero-shot learning. Current zero-shot learning paradigms usually train a series of attribute classifiers and then recognize the target actions based on the attribute representation. To ensure the maximum coverage of ad-hoc action classes, the attribute-based approaches require large numbers of reliable and accurate attribute classifiers, which are often unavailable in the real world. In this paper, we propose an approach that takes only an action name as input to recognize the action of interest, without any pre-trained attribute classifiers or positive exemplars. Given an action name, we first build an analogy pool according to an external ontology, and each action in the analogy pool is related to the target action at different levels. The correlation information inferred from the external ontology may be noisy. We then propose an algorithm, namely adaptive multi-model rank-preserving mapping (AMRM), to train a classifier for action recognition, which is able to evaluate the relatedness of each video in the analogy pool adaptively. As multiple mapping models are employed, our algorithm is better able to bridge the gap between visual features and the semantic information inferred from the ontology. Extensive experiments demonstrate that our method achieves promising performance for action recognition using only action names, with no attributes or positive exemplars available.	en_US
dc.relation	http://purl.org/au-research/grants/arc/DP150103008
dc.relation	http://purl.org/au-research/grants/arc/DE130101311
dc.relation	http://purl.org/au-research/grants/arc/LP160100630
dc.relation.ispartof	International Journal of Computer Vision	en_US
dc.relation.isbasedon	10.1007/s11263-016-0893-6	en_US
dc.subject.classification	Artificial Intelligence & Image Processing	en_US
dc.title	Recognizing an Action Using Its Name: A Knowledge-Based Approach	en_US
dc.type	Journal Article
utslib.citation.volume	1	en_US
utslib.citation.volume	120	en_US
utslib.for	0801 Artificial Intelligence and Image Processing	en_US
pubs.embargo.period	Not known	en_US
pubs.organisational-group	/University of Technology Sydney
pubs.organisational-group	/University of Technology Sydney/Faculty of Engineering and Information Technology
pubs.organisational-group	/University of Technology Sydney/Faculty of Engineering and Information Technology/School of Computer Science
pubs.organisational-group	/University of Technology Sydney/Strength - CAI - Centre for Artificial Intelligence
utslib.copyright.status	closed_access
pubs.issue	1	en_US
pubs.publication-status	Published	en_US
pubs.volume	120	en_US
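
The first step described in the abstract, building an analogy pool of known actions related to a target action name, can be illustrated with a minimal sketch. This is not the authors' implementation and it does not cover AMRM. It assumes each action name already has a semantic vector; the paper derives relatedness from an external ontology, so the toy vectors, the cosine measure, and the names `cosine`, `build_analogy_pool`, and `TOY_VECTORS` are all hypothetical stand-ins.

```python
from math import sqrt


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def build_analogy_pool(target, name_vectors, k=2):
    """Return the k known actions most related to `target`, with scores.

    Assumption: relatedness is approximated by cosine similarity of
    name vectors, standing in for ontology-derived correlation.
    """
    target_vec = name_vectors[target]
    scored = [
        (name, cosine(target_vec, vec))
        for name, vec in name_vectors.items()
        if name != target
    ]
    # Most related actions first.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical toy vectors; real ones would come from an external resource.
TOY_VECTORS = {
    "horse riding": [0.9, 0.1, 0.0],
    "horse race": [0.8, 0.2, 0.1],
    "biking": [0.5, 0.5, 0.0],
    "playing piano": [0.0, 0.1, 0.9],
}

pool = build_analogy_pool("horse riding", TOY_VECTORS, k=2)
for name, score in pool:
    print(f"{name}: {score:.3f}")
```

In this sketch the scores could serve as initial relatedness weights for the videos of each pooled action. The abstract notes that such ontology-derived correlations may be noisy, which is why the paper's method re-evaluates relatedness adaptively during training rather than trusting them as fixed.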

Please use this identifier to cite or link to this item:

http://hdl.handle.net/10453/121791