Enhanced Spatial-Temporal Salience for Cross-View Gait Recognition

Huang, T; Ben, X; Gong, C; Zhang, B; Yan, R; Wu, Q

Publisher: IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
Publication Type: Journal Article
Citation: IEEE Transactions on Circuits and Systems for Video Technology, 2022, 32, (10), pp. 6967-6980
Issue Date: 2022-10-01

Closed Access

Filename	Description	Size	Format
Enhanced Spatial-Temporal Salience for Cross-View Gait Recognition.pdf	Published version	3.13 MB	Adobe PDF

This item is closed access and not available.

Full metadata record

Field	Value	Language
dc.contributor.author	Huang, T
dc.contributor.author	Ben, X
dc.contributor.author	Gong, C
dc.contributor.author	Zhang, B
dc.contributor.author	Yan, R
dc.contributor.author	Wu, Q https://orcid.org/0000-0001-5641-2483
dc.date.accessioned	2023-04-11T02:11:48Z
dc.date.available	2023-04-11T02:11:48Z
dc.date.issued	2022-10-01
dc.identifier.citation	IEEE Transactions on Circuits and Systems for Video Technology, 2022, 32, (10), pp. 6967-6980
dc.identifier.issn	1051-8215
dc.identifier.issn	1558-2205
dc.identifier.uri	http://hdl.handle.net/10453/169503
dc.description.abstract	Gait recognition can be used for person identification and re-identification, either by itself or in conjunction with other biometrics. Gait has both spatial and temporal attributes, and it has been observed that decoupling the spatial and temporal features can better exploit gait features at the fine-grained level. However, the spatial-temporal correlations of gait video signals are lost in the decoupling process. Direct 3D convolution approaches can retain such correlations, but they also introduce unnecessary interference. Instead of a common 3D convolution solution, this paper proposes integrating the decoupling process into a 3D convolution framework for cross-view gait recognition. In particular, a novel block consisting of a Parallel-insight Convolution layer integrated with a Spatial-Temporal Dual-Attention (STDA) unit is proposed as the basic block for global spatial-temporal information extraction. Under the guidance of the STDA unit, this block can integrate the spatial-temporal information extracted by two decoupled models while retaining the spatial-temporal correlations. In addition, a Multi-Scale Salient Feature Extractor is proposed to further exploit fine-grained features by extending the context awareness of part-based features and adaptively aggregating the spatial features. Extensive experiments on three popular gait datasets, namely CASIA-B, OULP and OUMVLP, demonstrate that the proposed method outperforms state-of-the-art methods.
dc.language	English
dc.publisher	IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
dc.relation.ispartof	IEEE Transactions on Circuits and Systems for Video Technology
dc.relation.isbasedon	10.1109/TCSVT.2022.3175959
dc.rights	info:eu-repo/semantics/closedAccess
dc.subject	0801 Artificial Intelligence and Image Processing, 0906 Electrical and Electronic Engineering
dc.subject.classification	Artificial Intelligence & Image Processing
dc.title	Enhanced Spatial-Temporal Salience for Cross-View Gait Recognition
dc.type	Journal Article
utslib.citation.volume	32
utslib.for	0801 Artificial Intelligence and Image Processing
utslib.for	0906 Electrical and Electronic Engineering
pubs.organisational-group	/University of Technology Sydney
pubs.organisational-group	/University of Technology Sydney/Faculty of Engineering and Information Technology
pubs.organisational-group	/University of Technology Sydney/Strength - INEXT - Innovation in IT Services and Applications
pubs.organisational-group	/University of Technology Sydney/Strength - GBDTC - Global Big Data Technologies
pubs.organisational-group	/University of Technology Sydney/Faculty of Engineering and Information Technology/School of Electrical and Data Engineering
utslib.copyright.status	closed_access	*
dc.date.updated	2023-04-11T02:11:47Z
pubs.issue	10
pubs.publication-status	Published
pubs.volume	32
utslib.citation.issue	10

Abstract:

Gait recognition can be used for person identification and re-identification, either by itself or in conjunction with other biometrics. Gait has both spatial and temporal attributes, and it has been observed that decoupling the spatial and temporal features can better exploit gait features at the fine-grained level. However, the spatial-temporal correlations of gait video signals are lost in the decoupling process. Direct 3D convolution approaches can retain such correlations, but they also introduce unnecessary interference. Instead of a common 3D convolution solution, this paper proposes integrating the decoupling process into a 3D convolution framework for cross-view gait recognition. In particular, a novel block consisting of a Parallel-insight Convolution layer integrated with a Spatial-Temporal Dual-Attention (STDA) unit is proposed as the basic block for global spatial-temporal information extraction. Under the guidance of the STDA unit, this block can integrate the spatial-temporal information extracted by two decoupled models while retaining the spatial-temporal correlations. In addition, a Multi-Scale Salient Feature Extractor is proposed to further exploit fine-grained features by extending the context awareness of part-based features and adaptively aggregating the spatial features. Extensive experiments on three popular gait datasets, namely CASIA-B, OULP and OUMVLP, demonstrate that the proposed method outperforms state-of-the-art methods.
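
The idea of running decoupled spatial and temporal filters in parallel and then fusing them under an attention-style gate can be illustrated with a toy sketch. This is not the authors' Parallel-insight Convolution layer or STDA unit, whose designs are not described in this record. Everything here is an assumption made for illustration: the function names (`spatial_branch`, `temporal_branch`, `fuse`), the fixed averaging kernels standing in for learned convolution weights, and the softmax gate over branch means standing in for a learned attention unit.

```python
import math


def zeros(t_len, h, w):
    """Allocate a T x H x W volume of zeros as nested lists."""
    return [[[0.0] * w for _ in range(h)] for _ in range(t_len)]


def spatial_branch(seq):
    """Hypothetical spatial branch: 3x3 mean filter per frame, zero padding."""
    t_len, h, w = len(seq), len(seq[0]), len(seq[0][0])
    out = zeros(t_len, h, w)
    for t in range(t_len):
        for y in range(h):
            for x in range(w):
                total = 0.0
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        yy, xx = y + dy, x + dx
                        if 0 <= yy < h and 0 <= xx < w:
                            total += seq[t][yy][xx]
                out[t][y][x] = total / 9.0
    return out


def temporal_branch(seq):
    """Hypothetical temporal branch: 3-tap mean filter per pixel along time."""
    t_len, h, w = len(seq), len(seq[0]), len(seq[0][0])
    out = zeros(t_len, h, w)
    for t in range(t_len):
        for y in range(h):
            for x in range(w):
                total = sum(seq[tt][y][x]
                            for tt in (t - 1, t, t + 1) if 0 <= tt < t_len)
                out[t][y][x] = total / 3.0
    return out


def volume_mean(vol):
    """Global average over all frames, rows, and columns."""
    vals = [v for frame in vol for row in frame for v in row]
    return sum(vals) / len(vals)


def fuse(spatial, temporal):
    """Assumed gate: softmax over the two branch means, then a weighted sum."""
    s_score, t_score = volume_mean(spatial), volume_mean(temporal)
    m = max(s_score, t_score)  # subtract the max for numerical stability
    e_s, e_t = math.exp(s_score - m), math.exp(t_score - m)
    w_s, w_t = e_s / (e_s + e_t), e_t / (e_s + e_t)
    fused = [[[w_s * sv + w_t * tv for sv, tv in zip(s_row, t_row)]
              for s_row, t_row in zip(s_frame, t_frame)]
             for s_frame, t_frame in zip(spatial, temporal)]
    return fused, (w_s, w_t)


# Toy 4-frame, 5x5 clip: a single bright pixel moving one column per frame.
clip = zeros(4, 5, 5)
for t in range(4):
    clip[t][2][t] = 1.0

fused, weights = fuse(spatial_branch(clip), temporal_branch(clip))
```

In this sketch the fused volume keeps the input's shape (frames x height x width), and the two gate weights sum to one, so each output value is a convex combination of the spatial and temporal responses at that position. A learned model would replace the fixed kernels and the scalar gate with trainable parameters.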

Please use this identifier to cite or link to this item:

http://hdl.handle.net/10453/169503