Action recognition by multiple features and hyper-sphere multi-class SVM

Liu, J; Yang, J; Zhang, Y; He, X


Publication Type: Conference Proceeding
Citation: Proceedings - International Conference on Pattern Recognition, 2010, pp. 3744 - 3747
Issue Date: 2010-11-18

Closed Access

	Filename	Description	Size	Format
	2009007669OK.pdf		857.74 kB	Adobe PDF


This item is closed access and not available.

Full metadata record

Field	Value	Language
dc.contributor.author	Liu, J	en_US
dc.contributor.author	Yang, J	en_US
dc.contributor.author	Zhang, Y	en_US
dc.contributor.author	He, X https://orcid.org/0000-0001-8962-540X	en_US
dc.date.issued	2010-11-18	en_US
dc.identifier.citation	Proceedings - International Conference on Pattern Recognition, 2010, pp. 3744 - 3747	en_US
dc.identifier.isbn	9780769541099	en_US
dc.identifier.issn	1051-4651	en_US
dc.identifier.uri	http://hdl.handle.net/10453/16277
dc.description.abstract	In this paper we propose a novel framework, based on multiple features, for improving action recognition in videos. The fusion of multiple features is important for recognizing actions, as a representation based on a single feature is often not enough to capture imaging variations (view-point, illumination, etc.) and attributes of individuals (size, age, gender, etc.). Hence, we use two kinds of features: i) a quantized vocabulary of local spatio-temporal (ST) volumes (cuboids and 2-D SIFT), and ii) higher-order statistical models of interest points, which aim to capture the global information of the actor. We construct a video representation in terms of local space-time features and global features and integrate these representations with a hyper-sphere multi-class SVM. Experiments on publicly available datasets show that our proposed approach is effective. An additional experiment shows that using both local and global features provides a richer representation of human action than a single feature type. © 2010 IEEE.	en_US
dc.relation.ispartof	Proceedings - International Conference on Pattern Recognition	en_US
dc.relation.isbasedon	10.1109/ICPR.2010.912	en_US
dc.title	Action recognition by multiple features and hyper-sphere multi-class SVM	en_US
dc.type	Conference Proceeding
utslib.for	0801 Artificial Intelligence and Image Processing	en_US
dc.location.activity	Istanbul Turkey	en_US
pubs.embargo.period	Not known	en_US
pubs.organisational-group	/University of Technology Sydney
pubs.organisational-group	/University of Technology Sydney/Faculty of Engineering and Information Technology
pubs.organisational-group	/University of Technology Sydney/Faculty of Engineering and Information Technology/School of Electrical and Data Engineering
pubs.organisational-group	/University of Technology Sydney/Strength - CRIN - Realtime Information Networks
pubs.organisational-group	/University of Technology Sydney/Strength - GBDTC - Global Big Data Technologies
utslib.copyright.status	closed_access
pubs.publication-status	Published	en_US

Abstract:

In this paper we propose a novel framework, based on multiple features, for improving action recognition in videos. The fusion of multiple features is important for recognizing actions, as a representation based on a single feature is often not enough to capture imaging variations (view-point, illumination, etc.) and attributes of individuals (size, age, gender, etc.). Hence, we use two kinds of features: i) a quantized vocabulary of local spatio-temporal (ST) volumes (cuboids and 2-D SIFT), and ii) higher-order statistical models of interest points, which aim to capture the global information of the actor. We construct a video representation in terms of local space-time features and global features and integrate these representations with a hyper-sphere multi-class SVM. Experiments on publicly available datasets show that our proposed approach is effective. An additional experiment shows that using both local and global features provides a richer representation of human action than a single feature type. © 2010 IEEE.
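
The pipeline outlined in the abstract can be illustrated with a rough Python sketch. It is not the authors' implementation. All function names are hypothetical, fusing the two feature types by concatenation is an assumption, and the per-class sphere below simply takes the class mean as its center and the farthest training sample as its radius, whereas a hyper-sphere SVM would typically obtain center and radius from a minimum-enclosing-sphere optimization with slack variables.

```python
import math


def nearest_word(descriptor, codebook):
    """Index of the codebook entry closest to one local descriptor."""
    return min(range(len(codebook)),
               key=lambda k: math.dist(descriptor, codebook[k]))


def bag_of_words(descriptors, codebook):
    """Normalized histogram of visual-word counts for one video."""
    hist = [0.0] * len(codebook)
    for d in descriptors:
        hist[nearest_word(d, codebook)] += 1.0
    total = sum(hist) or 1.0  # avoid dividing by zero for an empty video
    return [h / total for h in hist]


def fuse(local_hist, global_feat):
    """Fuse local and global features by concatenation (an assumption)."""
    return list(local_hist) + list(global_feat)


def fit_spheres(samples, labels):
    """One sphere per class: center = class mean, radius = farthest member.

    Simplified stand-in for the sphere a hyper-sphere SVM would learn.
    """
    spheres = {}
    for label in set(labels):
        members = [s for s, l in zip(samples, labels) if l == label]
        center = [sum(col) / len(members) for col in zip(*members)]
        radius = max(math.dist(m, center) for m in members)
        spheres[label] = (center, max(radius, 1e-12))  # guard zero radius
    return spheres


def predict(sample, spheres):
    """Pick the class with the smallest distance-to-radius ratio."""
    return min(spheres,
               key=lambda c: math.dist(sample, spheres[c][0]) / spheres[c][1])
```

In this sketch a video would be encoded with bag_of_words, combined with a global descriptor through fuse, and classified with predict against the spheres returned by fit_spheres. Dividing the distance by the radius lets classes with different spreads compete on a comparable scale.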

Please use this identifier to cite or link to this item:

http://hdl.handle.net/10453/16277