Compact representation for large-scale unconstrained video analysis

Wang, S; Pan, P; Long, G; Chen, W; Li, X; Sheng, QZ

Publication Type: Journal Article
Citation: World Wide Web, 2016, 19 (2), pp. 231 - 246
Issue Date: 2016-03-01

Closed Access

	Filename	Description	Size	Format
	WWWJ-2016.pdf	Published Version	827.79 kB	Adobe PDF

This item is closed access and not available.

Full metadata record

Field	Value	Language
dc.contributor.author	Wang, S	en_US
dc.contributor.author	Pan, P	en_US
dc.contributor.author	Long, G https://orcid.org/0000-0003-3740-9515	en_US
dc.contributor.author	Chen, W	en_US
dc.contributor.author	Li, X	en_US
dc.contributor.author	Sheng, QZ	en_US
dc.date.issued	2016-03-01	en_US
dc.identifier.citation	World Wide Web, 2016, 19 (2), pp. 231 - 246	en_US
dc.identifier.issn	1386-145X	en_US
dc.identifier.uri	http://hdl.handle.net/10453/43361
dc.description.abstract	© 2015, Springer Science+Business Media New York. Recently, newly invented features (e.g. Fisher vector, VLAD) have achieved state-of-the-art performance in large-scale video analysis systems that aim to understand the contents of videos, through tasks such as concept recognition and event detection. However, these features are high-dimensional, which markedly increases computational cost and correspondingly degrades the performance of subsequent learning tasks. Notably, the situation becomes even worse when dealing with large-scale video data where the number of class labels is limited. To address this problem, we propose a novel algorithm to compactly represent huge amounts of unconstrained video data. Specifically, redundant feature dimensions are removed by our proposed feature selection algorithm. Since unlabeled videos are easy to obtain on the web, we apply this feature selection algorithm in a semi-supervised framework to cope with the shortage of class information. Unlike most existing semi-supervised feature selection algorithms, our proposed algorithm does not rely on manifold approximation, i.e. the graph Laplacian, which is quite expensive to compute for large amounts of data. Thus, it is possible to apply the proposed algorithm to a real large-scale video analysis system. In addition, because the non-smooth objective function is difficult to solve, we develop an efficient iterative approach to seek the global optimum. Extensive experiments are conducted on several real-world video datasets, including KTH, CCV, and HMDB. The experimental results demonstrate the effectiveness of the proposed algorithm.	en_US
dc.relation.ispartof	World Wide Web	en_US
dc.relation.isbasedon	10.1007/s11280-015-0354-0	en_US
dc.subject.classification	Information Systems	en_US
dc.title	Compact representation for large-scale unconstrained video analysis	en_US
dc.type	Journal Article
utslib.citation.volume	2	en_US
utslib.citation.volume	19	en_US
utslib.for	0805 Distributed Computing	en_US
utslib.for	0806 Information Systems	en_US
utslib.for	0804 Data Format	en_US
pubs.embargo.period	Not known	en_US
pubs.organisational-group	/University of Technology Sydney
pubs.organisational-group	/University of Technology Sydney/Faculty of Engineering and Information Technology
pubs.organisational-group	/University of Technology Sydney/Strength - CAI - Centre for Artificial Intelligence
pubs.organisational-group	/University of Technology Sydney/Students
utslib.copyright.status	closed_access
pubs.issue	2	en_US
pubs.publication-status	Published	en_US
pubs.volume	19	en_US

Abstract:

© 2015, Springer Science+Business Media New York. Recently, newly invented features (e.g. Fisher vector, VLAD) have achieved state-of-the-art performance in large-scale video analysis systems that aim to understand the contents of videos, through tasks such as concept recognition and event detection. However, these features are high-dimensional, which markedly increases computational cost and correspondingly degrades the performance of subsequent learning tasks. Notably, the situation becomes even worse when dealing with large-scale video data where the number of class labels is limited. To address this problem, we propose a novel algorithm to compactly represent huge amounts of unconstrained video data. Specifically, redundant feature dimensions are removed by our proposed feature selection algorithm. Since unlabeled videos are easy to obtain on the web, we apply this feature selection algorithm in a semi-supervised framework to cope with the shortage of class information. Unlike most existing semi-supervised feature selection algorithms, our proposed algorithm does not rely on manifold approximation, i.e. the graph Laplacian, which is quite expensive to compute for large amounts of data. Thus, it is possible to apply the proposed algorithm to a real large-scale video analysis system. In addition, because the non-smooth objective function is difficult to solve, we develop an efficient iterative approach to seek the global optimum. Extensive experiments are conducted on several real-world video datasets, including KTH, CCV, and HMDB. The experimental results demonstrate the effectiveness of the proposed algorithm.
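
Illustrative sketch (not taken from the paper): the abstract does not state the objective function, so the following is only a generic example of the kind of method it alludes to, namely sparsity-based feature selection with a non-smooth regularizer solved by an iterative scheme. It assumes a supervised least-squares loss with an l2,1-norm penalty on the weight matrix, minimized by iterative reweighting; the semi-supervised part of the proposed algorithm is not reproduced. The function names, the parameter `gamma`, and the synthetic data are all hypothetical.

```python
import numpy as np

def l21_feature_selection(X, Y, gamma=1.0, n_iter=30, eps=1e-8):
    """Approximately minimize ||XW - Y||_F^2 + gamma * ||W||_{2,1}.

    ASSUMPTION: a generic l2,1-regularized least-squares objective,
    used for illustration only; it is not the paper's objective.
    X: (n_samples, n_features), Y: (n_samples, n_classes) one-hot labels.
    """
    n_features = X.shape[1]
    d = np.ones(n_features)          # diagonal of the reweighting matrix
    XtX = X.T @ X
    XtY = X.T @ Y
    for _ in range(n_iter):
        # With the weights fixed, the problem is a ridge-like linear system.
        W = np.linalg.solve(XtX + gamma * np.diag(d), XtY)
        # Update weights from the row norms of W; eps avoids division by zero.
        row_norms = np.sqrt((W ** 2).sum(axis=1))
        d = 1.0 / (2.0 * row_norms + eps)
    return W

def rank_features(W):
    """Rank features by the l2 norm of their rows in W, largest first."""
    return np.argsort(-np.linalg.norm(W, axis=1))

# Synthetic data (hypothetical): only features 3 and 7 determine the label.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 20))
labels = (X[:, 3] + X[:, 7] > 0).astype(int)
Y = np.eye(2)[labels]                # one-hot encoding of the two classes

W = l21_feature_selection(X, Y, gamma=5.0)
top = rank_features(W)[:2]           # indices of the two highest-ranked features
```

Rows of `W` with small norm correspond to feature dimensions that contribute little to predicting the labels, so keeping only the top-ranked rows yields a more compact representation. Each iteration solves one linear system in the feature dimension and needs no sample-by-sample graph, which is the kind of cost profile the abstract contrasts with graph-Laplacian methods.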

Please use this identifier to cite or link to this item:

http://hdl.handle.net/10453/43361