3D Shape Temporal Aggregation for Video-Based Clothing-Change Person Re-identification

Publisher:
Springer
Publication Type:
Conference Proceeding
Citation:
Computer Vision – ACCV 2022, 2023, 13845 LNCS, pp. 71-88
Issue Date:
2023-01-01
Abstract:
The 3D shape of the human body can provide discriminative, clothing-independent information for video-based clothing-change person re-identification (Re-ID). However, existing Re-ID methods usually generate 3D body shapes without modelling identity, which severely weakens the discriminability of the estimated shapes. In addition, different video frames provide highly similar 3D shapes, yet existing methods cannot capture how 3D shapes differ over time. They are therefore insensitive to the unique and discriminative 3D shape information in each frame and ineffectively aggregate many redundant frame-wise shapes into a video-wise representation for Re-ID. To address these problems, we propose a 3D Shape Temporal Aggregation (3STA) model for video-based clothing-change Re-ID. First, to generate a discriminative 3D shape for each frame, we introduce an identity-aware 3D shape generation module that embeds identity information into shape generation through the joint learning of shape estimation and identity recognition. Second, a difference-aware shape aggregation module measures inter-frame differences between 3D human shapes and automatically selects the shape information unique to each frame, which helps minimise redundancy and maximise complementarity in temporal shape aggregation. We further construct a Video-based Clothing-Change Re-ID (VCCR) dataset to address the lack of publicly available datasets for video-based clothing-change Re-ID. Extensive experiments on the VCCR dataset demonstrate the effectiveness of the proposed 3STA model. The dataset is available at https://vhank.github.io/vccr.github.io.
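
The abstract does not include implementation details. As a rough, hypothetical illustration of the difference-aware aggregation idea (not the authors' code), the following PyTorch-style sketch weights each frame's 3D shape feature by how much it differs from the other frames before pooling over time; all module and parameter names are assumptions introduced here for illustration.

# Minimal sketch, assuming frame-wise shape features of size feat_dim:
# frames that differ more from the rest receive larger aggregation weights,
# so redundant frames contribute less to the video-level representation.
import torch
import torch.nn as nn
import torch.nn.functional as F

class DifferenceAwareAggregation(nn.Module):
    def __init__(self, feat_dim: int = 256):
        super().__init__()
        # Small MLP that scores the "uniqueness" of each frame from its
        # mean difference to all other frames (an assumed design choice).
        self.score = nn.Sequential(
            nn.Linear(feat_dim, feat_dim // 2),
            nn.ReLU(inplace=True),
            nn.Linear(feat_dim // 2, 1),
        )

    def forward(self, shape_feats: torch.Tensor) -> torch.Tensor:
        # shape_feats: (batch, num_frames, feat_dim) frame-wise 3D shape features
        b, t, d = shape_feats.shape
        # Pairwise differences between frames: (b, t, t, d)
        diff = shape_feats.unsqueeze(2) - shape_feats.unsqueeze(1)
        # Mean absolute difference of each frame to the others: (b, t, d)
        mean_diff = diff.abs().sum(dim=2) / max(t - 1, 1)
        # Higher uniqueness -> higher aggregation weight over time
        weights = F.softmax(self.score(mean_diff), dim=1)   # (b, t, 1)
        # Weighted sum over frames gives the video-level shape representation
        return (weights * shape_feats).sum(dim=1)           # (b, d)

if __name__ == "__main__":
    feats = torch.randn(2, 8, 256)   # 2 tracklets, 8 frames each
    video_repr = DifferenceAwareAggregation(256)(feats)
    print(video_repr.shape)          # torch.Size([2, 256])

In this sketch the softmax-normalised uniqueness scores play the role of temporal attention; the paper's actual module may compute and use inter-frame differences differently.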