PSDiff: Diffusion Model for Person Search with Iterative and Collaborative Refinement

Publisher:
Institute of Electrical and Electronics Engineers (IEEE)
Publication Type:
Journal Article
Citation:
IEEE Transactions on Circuits and Systems for Video Technology, 2024, PP, (99), pp. 1-1
Issue Date:
2024-01-01
Abstract:
Dominant Person Search methods aim to localize and recognize query persons in a unified network that jointly optimizes the two sub-tasks of pedestrian detection and Re-Identification (ReID). Despite significant progress, current methods face two primary challenges: 1) the pedestrian candidates learned within detectors are suboptimal for the ReID task; and 2) the potential for collaboration between the two sub-tasks is overlooked. To address these issues, we present PSDiff, a novel Person Search framework based on the diffusion model. PSDiff formulates person search as a dual denoising process that maps noisy boxes and noisy ReID embeddings to their ground truths. Unlike the conventional Detection-to-ReID approach, our denoising paradigm discards the prior pedestrian candidates generated by detectors, thereby avoiding the local optimum problem of the ReID task. Following this paradigm, we further design a Collaborative Denoising Layer (CDL) that optimizes the detection and ReID sub-tasks iteratively and collaboratively, making the two sub-tasks mutually beneficial. Extensive experiments on standard benchmarks show that PSDiff achieves state-of-the-art performance with fewer parameters and elastic computing overhead.
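The "dual denoising" formulation can be illustrated with a minimal sketch of the forward (noising) half of a standard diffusion process, applied to a ground-truth box and a ground-truth ReID embedding at the same timestep. This is not the authors' implementation: the abstract does not specify the noise schedule, the box parameterization, or how training pairs are built. The cosine schedule, the normalized (cx, cy, w, h) box format, and the function names `cosine_alpha_bar`, `add_noise`, and `noisy_pair` are all assumptions made for illustration.

```python
import math
import random


def cosine_alpha_bar(t, num_steps, s=0.008):
    """Cumulative signal rate at step t under a cosine schedule (assumed, not from the paper)."""
    def f(u):
        return math.cos((u / num_steps + s) / (1 + s) * math.pi / 2) ** 2
    return f(t) / f(0)


def add_noise(x0, t, num_steps, rng):
    """Standard forward diffusion step: x_t = sqrt(a) * x0 + sqrt(1 - a) * eps, eps ~ N(0, 1)."""
    a = cosine_alpha_bar(t, num_steps)
    signal, noise = math.sqrt(a), math.sqrt(1.0 - a)
    return [signal * v + noise * rng.gauss(0.0, 1.0) for v in x0]


def noisy_pair(gt_box, gt_embedding, t, num_steps=1000, seed=0):
    """Noise a ground-truth box and its ReID embedding at the same timestep.

    Hypothetical helper: a denoising network would be trained to recover
    (gt_box, gt_embedding) from the pair returned here.
    """
    rng = random.Random(seed)
    noisy_box = add_noise(gt_box, t, num_steps, rng)
    noisy_embedding = add_noise(gt_embedding, t, num_steps, rng)
    return noisy_box, noisy_embedding


if __name__ == "__main__":
    box = [0.5, 0.5, 0.2, 0.4]      # assumed normalized (cx, cy, w, h)
    embedding = [0.1, -0.3, 0.7]    # toy 3-dimensional ReID embedding
    print(noisy_pair(box, embedding, t=500))
```

At `t=0` the pair is returned unchanged, and as `t` approaches `num_steps` both the box and the embedding approach pure Gaussian noise. Inference would run in the opposite direction, starting from random boxes and embeddings rather than from detector proposals, which is the sense in which the paradigm discards prior pedestrian candidates.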