Beyond modality alignment: Learning part-level representation for visible-infrared person re-identification

Publisher:
ELSEVIER
Publication Type:
Journal Article
Citation:
Image and Vision Computing, 2021, 108
Issue Date:
2021-04-01
Abstract:
Visible-Infrared person re-IDentification (VI-reID) aims to automatically retrieve a pedestrian of interest captured by sensors of different modalities, e.g., a visible camera vs. an infrared sensor. The task requires learning representations that are both modality-invariant and discriminant. Unfortunately, existing VI-reID work mainly focuses on tackling the modality difference, while fine-grained discriminant information has not been well investigated, which leads to inferior identification performance. To address this problem, we propose a Dual-Alignment Part-aware Representation (DAPR) framework that simultaneously alleviates the modality bias and mines discriminant representations at different levels. In particular, DAPR hierarchically reduces the modality discrepancy of high-level features by back-propagating reversed gradients from a modality classifier, so as to learn a modality-invariant feature space. Meanwhile, multiple classifier heads with an improved part-aware BNNeck supervise the network to produce identity-discriminant representations with respect to both local details and global structures in the learned modality-invariant space. Trained in an end-to-end manner, the proposed DAPR produces camera- and modality-invariant yet discriminant features for person matching across modalities. Extensive experiments on two benchmarks, i.e., SYSU-MM01 and RegDB, demonstrate the effectiveness of the proposed method.
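The abstract names two mechanisms: a gradient-reversal signal from a modality classifier (to push the backbone toward modality-invariant features) and per-part identity heads with a BNNeck-style batch norm before the classifier. The sketch below illustrates these two ideas in PyTorch under stated assumptions; it is not the authors' released code, and all class and parameter names (GradReverse, ModalityClassifier, PartBNNeckHead, lambda_, num_ids) are illustrative.

```python
# Minimal sketch (assumptions, not the paper's implementation) of:
# (1) a gradient-reversal layer feeding a modality classifier, and
# (2) a BNNeck-style identity head for one part-level feature.
import torch
import torch.nn as nn


class GradReverse(torch.autograd.Function):
    """Identity in the forward pass; negated, scaled gradient in the backward pass."""

    @staticmethod
    def forward(ctx, x, lambda_):
        ctx.lambda_ = lambda_
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        # Reversed gradient discourages the backbone from encoding modality cues.
        return -ctx.lambda_ * grad_output, None


class ModalityClassifier(nn.Module):
    """Predicts visible vs. infrared from features passed through GradReverse."""

    def __init__(self, feat_dim=2048, lambda_=1.0):
        super().__init__()
        self.lambda_ = lambda_
        self.fc = nn.Linear(feat_dim, 2)  # two modalities

    def forward(self, feats):
        return self.fc(GradReverse.apply(feats, self.lambda_))


class PartBNNeckHead(nn.Module):
    """BNNeck-style head for one body-part feature: the raw feature can feed a
    metric loss, the batch-normalized one feeds the identity classifier."""

    def __init__(self, feat_dim=2048, num_ids=395):  # num_ids is dataset-dependent
        super().__init__()
        self.bn = nn.BatchNorm1d(feat_dim)
        self.classifier = nn.Linear(feat_dim, num_ids, bias=False)

    def forward(self, part_feat):
        bn_feat = self.bn(part_feat)
        return self.classifier(bn_feat), bn_feat


# Usage sketch: the adversarial modality loss is summed with per-part identity
# losses, so the shared features become modality-invariant yet identity-discriminant.
feats = torch.randn(8, 2048)                 # pooled backbone features (one part)
modality = torch.randint(0, 2, (8,))         # 0 = visible, 1 = infrared
ids = torch.randint(0, 395, (8,))            # identity labels
loss_mod = nn.CrossEntropyLoss()(ModalityClassifier()(feats), modality)
logits, _ = PartBNNeckHead()(feats)
loss_id = nn.CrossEntropyLoss()(logits, ids)
(loss_mod + loss_id).backward()
```

In the full framework described by the abstract, one such identity head would be attached per part as well as to the global feature, and the modality classifier's reversed gradient would be applied hierarchically to high-level features; the single-head version above only shows the shape of the losses involved.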