Sparse coding-based spatiotemporal saliency for action recognition

Publisher:
IEEE
Publication Type:
Conference Proceeding
Citation:
Proceedings - International Conference on Image Processing, ICIP, 2015, pp. 2045 - 2049
Issue Date:
2015-12-09
© 2015 IEEE. In this paper, we address the problem of human action recognition by representing image sequences as a sparse collection of patch-level spatiotemporal events that are salient in both the space and time domains. Our method uses a multi-scale volumetric representation of video and adaptively selects the space-time scale at which the saliency of a patch is most significant. The input image sequences are first partitioned into non-overlapping patches. Each patch is then represented by a vector of coefficients that linearly reconstruct it from a learned dictionary of basis patches. We propose to measure the spatiotemporal saliency of patches using Shannon's self-information, where a patch's saliency is determined by the information variation in the contents of its spatiotemporal neighborhood. Experimental results on two benchmark datasets demonstrate the effectiveness of the proposed method.
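The pipeline described in the abstract (patch extraction, sparse coding against a learned dictionary, then self-information saliency over a neighborhood of codes) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: ridge-regularized least squares stands in for a true sparse coder, a Gaussian model of the neighborhood codes stands in for whatever density estimate the paper uses, and all function names are hypothetical.

```python
import numpy as np

def sparse_codes(patches, dictionary, lam=0.1):
    """Encode patches (n, d) against a dictionary D (d, k).

    Hypothetical stand-in: ridge-regularized least squares instead of a
    true sparse solver (e.g. OMP or LASSO), so the sketch stays
    self-contained while keeping the 'coefficients reconstruct the
    patch from basis atoms' structure.
    """
    D = dictionary
    gram = D.T @ D + lam * np.eye(D.shape[1])
    return np.linalg.solve(gram, D.T @ patches.T).T  # (n, k) codes

def self_information_saliency(codes, neighbor_codes):
    """Saliency of codes (m, k) w.r.t. a spatiotemporal neighborhood.

    Assumes an independent-Gaussian model of the neighborhood's codes;
    saliency is Shannon self-information, -log p(code), so patches whose
    codes deviate from their neighborhood score higher.
    """
    mu = neighbor_codes.mean(axis=0)
    var = neighbor_codes.var(axis=0) + 1e-6  # avoid division by zero
    log_p = -0.5 * np.sum(
        (codes - mu) ** 2 / var + np.log(2 * np.pi * var), axis=-1
    )
    return -log_p

# Toy usage: an outlier patch is more salient than a typical one.
rng = np.random.default_rng(0)
D = rng.normal(size=(16, 8))            # 16-dim patches, 8 dictionary atoms
neighborhood = rng.normal(size=(10, 16))
nbr_codes = sparse_codes(neighborhood, D)
typical = nbr_codes.mean(axis=0, keepdims=True)
outlier = typical + 5.0
saliency = self_information_saliency(np.vstack([typical, outlier]), nbr_codes)
```

In the paper's setting the neighborhood would be the patch's spatiotemporal surroundings at the adaptively selected scale; here it is just a random batch for demonstration.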