Bilaterally Normalized Scale-Consistent Sinkhorn Distance for Few-Shot Image Classification.
- Publisher:
- Institute of Electrical and Electronics Engineers
- Publication Type:
- Journal Article
- Citation:
- IEEE Transactions on Neural Networks and Learning Systems, 2024, PP, (8), pp. 11475-11485
- Issue Date:
- 2024-04-17
Closed Access
Filename | Description | Size |
---|---|---|
1641334.pdf | Published version | 1.8 MB |
This item is closed access and not available.
Few-shot image classification aims to learn transferable features from base classes in order to recognize images of unseen novel classes from only a few labeled images. Existing methods usually compare support features with query features, either by matching global feature vectors or by matching local feature maps at the same position. However, a few labeled images cannot capture all of the diverse context and intraclass variation, which leads to mismatch issues in existing methods. On the one hand, misaligned positions and cluttered backgrounds cause an object mismatch issue. On the other hand, scale inconsistency between images causes a scale mismatch issue. In this article, we propose the bilaterally normalized scale-consistent Sinkhorn distance (BSSD) to solve these issues. First, instead of same-position matching, we use the Sinkhorn distance to find an optimal matching between images, mitigating the object mismatch caused by misaligned positions. Meanwhile, we propose intraimage and interimage attentions as a bilateral normalization of the Sinkhorn distance to suppress the object mismatch caused by background clutter. Second, local feature maps are enhanced with a multiscale pooling strategy, enabling the Sinkhorn distance to find a consistent matching scale between images. Experimental results show the effectiveness of the proposed approach, which achieves state-of-the-art results on three few-shot benchmarks.
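The matching idea in the abstract can be illustrated with a minimal sketch. It is not the authors' BSSD implementation: the function names, the cosine cost, the non-overlapping average pooling, the pooling sizes, and the values of `eps` and `n_iters` are all assumptions chosen for illustration. In particular, the marginals `r` and `c` are uniform here, whereas the article derives its bilateral normalization from intraimage and interimage attentions, which this sketch does not reproduce.

```python
import numpy as np

def multiscale_local_features(fmap, scales=(1, 2, 4)):
    """Pool an (H, W, d) feature map with non-overlapping k x k average
    windows for each k in `scales` and stack all pooled vectors into one
    set. Assumes H and W are divisible by every k (illustrative choice)."""
    h, w, d = fmap.shape
    pooled = []
    for k in scales:
        blocks = fmap.reshape(h // k, k, w // k, k, d).mean(axis=(1, 3))
        pooled.append(blocks.reshape(-1, d))
    return np.concatenate(pooled, axis=0)

def cosine_cost(a, b):
    """Cost matrix 1 - cosine similarity between rows of a and rows of b."""
    a = a / np.linalg.norm(a, axis=1, keepdims=True)
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    return 1.0 - a @ b.T

def sinkhorn_plan(cost, r, c, eps=0.1, n_iters=200):
    """Entropy-regularized transport plan via Sinkhorn iterations.
    r and c are the row and column marginals (each sums to 1)."""
    kernel = np.exp(-cost / eps)
    u = np.ones_like(r)
    for _ in range(n_iters):
        v = c / (kernel.T @ u)   # rescale columns toward c
        u = r / (kernel @ v)     # rescale rows toward r
    return u[:, None] * kernel * v[None, :]

def sinkhorn_distance(cost, r, c, eps=0.1, n_iters=200):
    """Transport cost of the regularized plan: sum of plan * cost."""
    plan = sinkhorn_plan(cost, r, c, eps, n_iters)
    return float((plan * cost).sum())

# Usage with random stand-in feature maps (no real images or backbone).
rng = np.random.default_rng(0)
support = multiscale_local_features(rng.random((4, 4, 8)))
query = multiscale_local_features(rng.random((4, 4, 8)))
cost = cosine_cost(support, query)
r = np.full(len(support), 1.0 / len(support))  # uniform marginals (assumption)
c = np.full(len(query), 1.0 / len(query))
print(sinkhorn_distance(cost, r, c))
```

Because vectors pooled at different window sizes sit in the same set, the transport plan is free to match a coarse region in one image to a fine region in the other, which is one way to read the "consistent matching scale" described above. Replacing the uniform `r` and `c` with non-uniform weights would down-weight selected local features, which is where an attention-based normalization would plug in.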
Please use this identifier to cite or link to this item: