DETR with Additional Global Aggregation for Cross-domain Weakly Supervised Object Detection

Publisher:
IEEE
Publication Type:
Conference Proceeding
Citation:
2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2023, 2023-June, pp. 11422-11432
Issue Date:
2023-01-01
Filename:
2304.07082v1.pdf
Description:
Published version
Size:
1.4 MB
Format:
Adobe PDF
This paper presents a DETR-based method for cross-domain weakly supervised object detection (CDWSOD), which aims to adapt a detector from a source domain to a target domain through weak supervision. We argue that DETR has strong potential for CDWSOD because of one insight: the encoder and the decoder in DETR are both based on the attention mechanism and are thus capable of aggregating semantics across the entire image. The aggregation results, i.e., image-level predictions, can naturally exploit the weak supervision for domain alignment. Motivated by this, we propose DETR with additional Global Aggregation (DETR-GA), a CDWSOD detector that simultaneously makes instance-level and image-level predictions and uses both strong and weak supervision. The key idea of DETR-GA is simple: we add multiple class queries to the encoder and a foreground query to the decoder, and use them to aggregate the semantics into image-level predictions. Our query-based aggregation has two advantages. First, in the encoder, the weakly supervised class queries are capable of roughly locating the corresponding positions and excluding distraction from non-relevant regions. Second, through our design, the object queries and the foreground query in the decoder share consensus on the class semantics, so that the strong and weak supervision benefit each other for domain alignment. Extensive experiments on four popular cross-domain benchmarks show that DETR-GA significantly improves cross-domain detection accuracy (e.g., from 29.0 to 79.4 mAP on the PASCAL VOC → Clipart_all dataset) and advances the state of the art.
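As a rough illustration of the query-based aggregation described in the abstract, the sketch below pools image tokens with one query per class and turns each pooled vector into an image-level score trained against weak (image-level) labels. It is not the authors' implementation. The single-head attention pooling, the per-class linear classifiers, the binary cross-entropy loss, and every name here (`aggregate`, `image_level_predictions`, `weak_bce_loss`, the toy tensors) are assumptions made for illustration. In the method itself the class queries take part in the encoder's attention layers, whereas this sketch uses a standalone pooling step.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def aggregate(query, tokens):
    """Single-head attention pooling: one query attends over all image tokens.

    Returns the pooled feature and the attention weights. The weights act as
    a coarse localization map for whatever the query responds to.
    """
    scale = math.sqrt(len(query))
    weights = softmax([dot(query, t) / scale for t in tokens])
    dim = len(tokens[0])
    pooled = [sum(w * t[d] for w, t in zip(weights, tokens)) for d in range(dim)]
    return pooled, weights


def image_level_predictions(class_queries, tokens, classifiers):
    """One sigmoid score per class, each computed from its own class query.

    `classifiers` holds one hypothetical linear weight vector per class.
    """
    probs, attn_maps = [], []
    for query, weight in zip(class_queries, classifiers):
        pooled, attn = aggregate(query, tokens)
        probs.append(1.0 / (1.0 + math.exp(-dot(weight, pooled))))
        attn_maps.append(attn)
    return probs, attn_maps


def weak_bce_loss(probs, image_labels, eps=1e-7):
    """Binary cross-entropy against image-level labels (no boxes needed)."""
    terms = [
        -(y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps))
        for p, y in zip(probs, image_labels)
    ]
    return sum(terms) / len(terms)


# Toy data (made up): three image tokens with 2-d features, two classes.
tokens = [[4.0, 0.0], [0.0, 4.0], [0.0, 0.0]]
class_queries = [[2.0, 0.0], [0.0, 2.0]]
classifiers = [[1.0, 0.0], [0.0, 1.0]]

probs, attn_maps = image_level_predictions(class_queries, tokens, classifiers)
loss = weak_bce_loss(probs, image_labels=[1, 0])  # image contains class 0 only
```

In this toy setup each class query puts most of its attention on the token that matches it, which mirrors the abstract's point that weakly supervised class queries can roughly locate the relevant positions. A foreground query in the decoder could be sketched in the same way, with the decoder's inputs in place of `tokens`.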