Deformable attention-oriented feature pyramid network for semantic segmentation

Publication Type: Journal Article
Knowledge-Based Systems, 2022, 254
In computer vision, pyramid features can significantly improve network performance. However, the misalignment of semantic information and the scale limitation of small-scale features lead to an imbalance of feature contributions, which severely limits the performance of the feature pyramid network. To address the decline in model efficiency caused by this imbalance, we propose a deformable attention-oriented feature pyramid network (DAFPN). Unlike previous models, which focus solely on the semantic information between features, DAFPN uses a deformable attention mechanism to model the relationship between multiple features and then merges them during pyramid feature fusion. Based on DAFPN, we further propose a fully transformer-based semantic segmentation head, which achieves high performance and good scalability. Comparisons on multiple backbones show that the proposed model outperforms the baseline model: under the same conditions, our method improves mIoU by 1 to 4% over the baseline semantic segmentation model.
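The abstract reports its gains in mIoU (mean intersection over union). As a reading aid, here is a minimal sketch of how that metric is commonly computed from per-pixel class labels. The function name `mean_iou` and the toy label lists are hypothetical and are not taken from the paper. Benchmarks usually accumulate intersections and unions over a whole dataset, whereas this sketch scores a single flattened prediction.

```python
def mean_iou(pred, target, num_classes):
    """Mean IoU over the classes that occur in pred or target.

    pred and target are equal-length sequences of integer class labels
    (for example, a flattened segmentation map).
    """
    ious = []
    for c in range(num_classes):
        # Pixels where both prediction and ground truth are class c.
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        # Pixels where either prediction or ground truth is class c.
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union > 0:  # skip classes absent from both maps
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else 0.0


# Toy labels, invented for illustration only (not results from the paper).
target = [0, 0, 1, 1, 2, 2]
baseline_pred = [0, 1, 1, 1, 2, 0]
improved_pred = [0, 0, 1, 1, 2, 0]

baseline_miou = mean_iou(baseline_pred, target, num_classes=3)
improved_miou = mean_iou(improved_pred, target, num_classes=3)
print(f"baseline mIoU: {baseline_miou:.3f}, improved mIoU: {improved_miou:.3f}")
```

Comparing the two scores under identical labels and class counts mirrors the "same conditions" comparison the abstract describes, though the paper's exact evaluation protocol is not specified here.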