Partial Alignment for Object Detection in the Wild

Publisher:
Institute of Electrical and Electronics Engineers (IEEE)
Publication Type:
Journal Article
Citation:
IEEE Transactions on Circuits and Systems for Video Technology, 2021, PP, (99), pp. 1-1
Issue Date:
2021-01-01
Filename:
Partial_Alignment_for_Object_Detection_in_the_Wild.pdf
Description:
Published version
Size:
11.38 MB
Format:
Adobe PDF
Abstract:
Conventional object detectors often suffer substantial performance drops under the domain shift caused by environmental changes. However, labeling sufficient training data drawn from various domains is costly and labor-intensive. To this end, unsupervised domain adaptive object detection (DAOD), in which the detector is transferred from a label-rich source domain to a label-agnostic target domain, has attracted much attention. Most existing cross-domain object detectors are designed with a parameter-shared network architecture, which has an inherent flaw: it accumulates the source errors caused by the inaccurate target distribution. As a result, the target data, rather than the decisive source data, may dominate learning and drive the model toward collapse. To mitigate this risk, we propose an Asymmetric Tri-way Faster-RCNN (ATF) for DAOD tasks, in which a novel ancillary net is deployed alongside the chief net, bringing two merits. 1) The ancillary net preserves the source distribution without accumulating source error or distortion, so that the target distribution is dominated by source data and model collapse is alleviated. 2) The ancillary net generates ancillary target features, which further enhance the discrimination of the target-domain object detector. Furthermore, to remove useless source-specific knowledge and exploit informative domain-invariant knowledge, we propose a Partial Alignment based ATF (PA-ATF) model, which provides an adversarial-shuffling-based mutual-information minimization strategy for domain disentanglement. Intuitively, transferring only the domain-invariant information from the source domain to the target domain is more reasonable. Extensive experiments on benchmark datasets, including Cityscapes, Foggy Cityscapes, Pascal VOC, Clipart, Watercolor, SIM10K, and KITTI, demonstrate the strong performance of our models compared with other state-of-the-art approaches.
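The abstract names a shuffling-based mutual-information minimization strategy but does not spell it out. The sketch below is not the authors' implementation. It illustrates one common way such a strategy is set up, under the assumption that each image yields a domain-invariant feature and a domain-specific feature. Pairs taken from the same image approximate the joint distribution, while pairs formed after shuffling one side across the batch approximate the product of the marginals. A discriminator is then trained to tell the two apart. The names `make_pairs`, `discriminator_loss`, and `score` are hypothetical, and features are plain Python tuples to keep the example dependency-free.

```python
import math
import random


def make_pairs(invariant, specific, seed=0):
    """Build 'joint' and 'shuffled' feature pairs for one batch.

    Assumption: invariant[i] and specific[i] come from the same image.
    """
    if len(invariant) != len(specific):
        raise ValueError("both feature lists must have the same batch size")
    rng = random.Random(seed)
    # Joint pairs keep the two features of the same image together.
    joint = list(zip(invariant, specific))
    # Shuffled pairs break the per-image link by permuting one side,
    # which approximates sampling from the product of the marginals.
    permuted = list(specific)  # copy so the caller's list is not mutated
    rng.shuffle(permuted)
    shuffled = list(zip(invariant, permuted))
    return joint, shuffled


def _softplus(x):
    # Numerically stable log(1 + exp(x)).
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def discriminator_loss(score, joint, shuffled):
    """Binary cross-entropy with joint pairs labelled 1 and shuffled pairs 0.

    `score(a, b)` is a hypothetical discriminator returning a real-valued logit.
    """
    total = 0.0
    for a, b in joint:
        total += _softplus(-score(a, b))  # equals -log(sigmoid(s))
    for a, b in shuffled:
        total += _softplus(score(a, b))   # equals -log(1 - sigmoid(s))
    return total / (len(joint) + len(shuffled))
```

In an adversarial setup of this kind, the discriminator minimizes this loss while the feature extractor is updated to push it back up. When the discriminator can do no better than a constant logit of zero, the loss sits at log 2, which suggests that the two feature groups share little information.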