Partial Alignment for Object Detection in the Wild

He, Z; Zhang, L; Yang, Y; Gao, X

Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Publication Type: Journal Article
Citation: IEEE Transactions on Circuits and Systems for Video Technology, 2021, PP, (99), pp. 1-1
Issue Date: 2021-01-01

Closed Access

	Filename	Description	Size	Format
	Partial_Alignment_for_Object_Detection_in_the_Wild.pdf	Published version	11.38 MB	Adobe PDF

This item is closed access and not available.

Full metadata record

Field	Value	Language
dc.contributor.author	He, Z
dc.contributor.author	Zhang, L
dc.contributor.author	Yang, Y https://orcid.org/0000-0002-0512-880X
dc.contributor.author	Gao, X
dc.date.accessioned	2022-05-20T05:52:31Z
dc.date.available	2022-05-20T05:52:31Z
dc.date.issued	2021-01-01
dc.identifier.citation	IEEE Transactions on Circuits and Systems for Video Technology, 2021, PP, (99), pp. 1-1
dc.identifier.issn	1051-8215
dc.identifier.issn	1558-2205
dc.identifier.uri	http://hdl.handle.net/10453/157554
dc.description.abstract	Conventional object detectors often suffer substantial performance drops under the domain shift caused by environmental changes. However, labeling sufficient training data from various domains is costly and labor-intensive. To this end, unsupervised domain adaptive object detection (DAOD), in which the detector is transferred from a label-rich source domain to a label-agnostic target domain, has attracted much attention. Most existing cross-domain object detectors use a parameter-shared network architecture, which has an inherent flaw: it accumulates source errors caused by the inaccurate target distribution. As a result, the target data, rather than the decisive source data, may dominate learning and drive the model towards collapse. To overcome this risk, we propose an Asymmetric Tri-way Faster-RCNN (ATF) for DAOD tasks, in which a novel ancillary net is deployed alongside the chief net, bringing two merits. 1) The ancillary net enables the source distribution to be preserved without accumulating source error/distortion, such that the target distribution is dominated by source data and model collapse is alleviated. 2) The ancillary net can generate ancillary target features, which further enhance the discriminability of the target-domain object detector. Furthermore, to remove useless source-specific knowledge and exploit informative domain-invariant knowledge, we propose a Partial Alignment based ATF (PA-ATF) model, which provides an adversarial shuffling based mutual-information minimization strategy for domain disentanglement. Intuitively, transferring only the domain-invariant source information to the target domain is more reasonable. Extensive experiments on benchmark datasets, including Cityscapes, Foggy Cityscapes, Pascal VOC, Clipart, Watercolor, SIM10K, and KITTI, demonstrate the strong performance of our models over other state-of-the-art approaches.
dc.language	en
dc.publisher	Institute of Electrical and Electronics Engineers (IEEE)
dc.relation.ispartof	IEEE Transactions on Circuits and Systems for Video Technology
dc.relation.isbasedon	10.1109/TCSVT.2021.3138851
dc.rights	info:eu-repo/semantics/closedAccess
dc.subject	0801 Artificial Intelligence and Image Processing, 0906 Electrical and Electronic Engineering
dc.subject.classification	Artificial Intelligence & Image Processing
dc.title	Partial Alignment for Object Detection in the Wild
dc.type	Journal Article
utslib.citation.volume	PP
utslib.for	0801 Artificial Intelligence and Image Processing
utslib.for	0906 Electrical and Electronic Engineering
pubs.organisational-group	/University of Technology Sydney
pubs.organisational-group	/University of Technology Sydney/Faculty of Engineering and Information Technology
pubs.organisational-group	/University of Technology Sydney/Strength - AAII - Australian Artificial Intelligence Institute
utslib.copyright.status	closed_access	*
dc.date.updated	2022-05-20T05:52:28Z
pubs.issue	99
pubs.publication-status	Published
pubs.volume	PP
utslib.citation.issue	99

Please use this identifier to cite or link to this item:

http://hdl.handle.net/10453/157554