SFGN: Representing the sequence with one super frame for video person re-identification
- Publisher: Elsevier
- Publication Type: Journal Article
- Citation: Knowledge-Based Systems, 2022, 249, pp. 108884
- Issue Date: 2022-08-05
Closed Access
Filename | Description | Size
---|---|---
1-s2.0-S095070512200421X-main.pdf | Published version | 1.43 MB
This item is closed access and not available.
Video-based person re-identification (V-Re-ID) is more robust than image-based person re-identification (I-Re-ID) because of the additional temporal information. However, the high storage overhead of video sequences largely limits the application of V-Re-ID. To reduce this overhead, we propose to represent each video sequence with only one frame. However, directly picking one frame from each sequence degrades performance dramatically. We therefore propose a new framework, the super frame generation network (SFGN), which encodes the spatial–temporal information of a video sequence into a single generated frame, called a "super frame" to distinguish it from a directly picked "key frame". To obtain super frames of high visual quality and representation ability, we carefully design the specific-frame-feature fused skip-connection generator (SFSG). SFSG plays the role of a feature encoder, and the co-trained image model can be seen as the corresponding feature decoder. To reduce the information loss in this encoding–decoding process, we further propose the feature recovery loss (FRL). To the best of our knowledge, we are the first to identify and address this issue. Extensive experiments on MARS, iLIDS-VID, and PRID2011 show that the proposed SFGN can generate super frames of high visual quality and representation ability. For the code, please visit the project website.
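The abstract's encoder–decoder framing (a generator that fuses a tracklet into one super frame, a co-trained image model acting as the feature decoder, and a feature recovery loss that limits information loss) can be illustrated with a minimal sketch. The modules, tensor shapes, and names below (`SuperFrameGenerator`, `feature_recovery_loss`, the 3D-convolution aggregation, and mean-pooled sequence features) are illustrative assumptions, not the authors' released SFGN/SFSG implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class SuperFrameGenerator(nn.Module):
    """Hypothetical sketch: fuse T frames of a tracklet into one 'super frame'.

    Loosely mirrors the abstract's idea of a generator that encodes the
    spatial-temporal information of a sequence into a single RGB frame.
    """

    def __init__(self, hidden=64):
        super().__init__()
        # 3D convolutions mix information across the temporal axis.
        self.temporal_encoder = nn.Sequential(
            nn.Conv3d(3, hidden, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv3d(hidden, hidden, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
        )
        # Project the aggregated features back to a 3-channel image.
        self.to_frame = nn.Conv2d(hidden, 3, kernel_size=3, padding=1)

    def forward(self, clip):                    # clip: (B, T, 3, H, W)
        x = clip.permute(0, 2, 1, 3, 4)         # -> (B, 3, T, H, W)
        x = self.temporal_encoder(x)            # -> (B, C, T, H, W)
        x = x.mean(dim=2)                       # collapse time -> (B, C, H, W)
        return torch.sigmoid(self.to_frame(x))  # super frame in [0, 1]


def feature_recovery_loss(image_model, super_frame, clip):
    """Hypothetical feature recovery loss.

    The co-trained image model should recover from the super frame a feature
    close to the sequence-level feature obtained by averaging per-frame
    features of the original clip.
    """
    b, t = clip.shape[:2]
    with torch.no_grad():
        frame_feats = image_model(clip.flatten(0, 1))      # (B*T, D)
        seq_feat = frame_feats.view(b, t, -1).mean(dim=1)  # (B, D)
    super_feat = image_model(super_frame)                  # (B, D)
    return F.mse_loss(super_feat, seq_feat)
```

In this reading, the loss asks the single generated frame to carry the same identity information the image model would extract from the whole sequence, which is one plausible interpretation of "reducing information loss in the encoding–decoding process"; the paper's actual loss formulation may differ.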