Weakly Supervised Person Search with Region Siamese Networks

Publisher:
IEEE
Publication Type:
Conference Proceeding
Citation:
2021 IEEE/CVF International Conference on Computer Vision (ICCV), 2022, 00, pp. 11986-11995
Issue Date:
2022-02-28
Filename: Weakly Supervised Person Search with Region Siamese Networks.pdf
Description: Published version
Size: 5.2 MB (Adobe PDF)
Supervised learning is dominant in person search, but it requires elaborate labeling of bounding boxes and identities. Large-scale labeled training data is often difficult to collect, especially for person identities. A natural question is whether a good person search model can be trained without identity supervision. In this paper, we present a weakly supervised setting where only bounding box annotations are available. Based on this new setting, we provide an effective baseline model termed Region Siamese Networks (R-SiamNets). To learn useful representations for recognition in the absence of identity labels, we supervise the R-SiamNet with an instance-level consistency loss and a cluster-level contrastive loss. For instance-level consistency learning, the R-SiamNet is constrained to extract consistent features from each person region with or without out-of-region context. For cluster-level contrastive learning, we enforce the aggregation of the closest instances and the separation of dissimilar ones in feature space. Extensive experiments validate the utility of our weakly supervised method. Our model achieves a rank-1 accuracy of 87.1% and an mAP of 86.0% on the CUHK-SYSU benchmark, surpassing several fully supervised methods, such as OIM [36] and MGTS [4], by a clear margin. Even better performance can be reached by incorporating extra training data. We hope this work encourages future research in this field.
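
The two training signals named in the abstract can be illustrated with a minimal NumPy sketch. It is not the paper's implementation: the abstract does not give the exact loss formulas, so the cosine-distance consistency term, the softmax-over-centroids contrastive term, the temperature value, and every function and parameter name below are assumptions chosen for illustration. Cluster assignments are taken as given; how they are produced is outside the scope of this sketch.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale each row of x to unit L2 norm."""
    return x / (np.linalg.norm(x, axis=1, keepdims=True) + eps)


def instance_consistency_loss(feat_with_context, feat_region_only):
    """Assumed form: mean cosine distance between the two feature sets.

    Row i of both arrays describes the same person region, once extracted
    with the surrounding scene and once from the region alone.
    """
    a = l2_normalize(feat_with_context)
    b = l2_normalize(feat_region_only)
    return float(np.mean(1.0 - np.sum(a * b, axis=1)))


def cluster_contrastive_loss(features, cluster_ids, centroids, temperature=0.1):
    """Assumed form: cross-entropy over similarities to cluster centroids.

    Each feature is pulled toward the centroid of its assigned cluster and
    pushed away from all other centroids.
    """
    f = l2_normalize(features)
    c = l2_normalize(centroids)
    logits = f @ c.T / temperature
    # Subtract the row maximum for a numerically stable log-softmax.
    logits = logits - logits.max(axis=1, keepdims=True)
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(log_prob[np.arange(len(f)), cluster_ids]))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    region_only = rng.normal(size=(4, 8))
    # Hypothetical "with context" features: a small perturbation of the above.
    with_context = region_only + 0.05 * rng.normal(size=(4, 8))
    centroids = rng.normal(size=(3, 8))
    cluster_ids = np.array([0, 1, 2, 0])
    print(instance_consistency_loss(with_context, region_only))
    print(cluster_contrastive_loss(region_only, cluster_ids, centroids))
```

Under these assumptions, the consistency term falls to zero when the two feature sets agree in direction, and the contrastive term shrinks as each feature aligns with its own cluster centroid and moves away from the others.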