Deformable attention-oriented feature pyramid network for semantic segmentation

Lu, L; Xiao, Y; Chang, X; Wang, X; Ren, P; Ren, Z

Publisher: ELSEVIER
Publication Type: Journal Article
Citation: Knowledge-Based Systems, 2022, 254
Issue Date: 2022-10-27

Closed Access

Filename	Description	Size
Deformable attention-oriented feature pyramid network for semantic segmentation.pdf	Published version	1.7 MB

This item is closed access and not available.

Full metadata record

Field	Value	Language
dc.contributor.author	Lu, L
dc.contributor.author	Xiao, Y
dc.contributor.author	Chang, X https://orcid.org/0000-0002-7778-8807
dc.contributor.author	Wang, X
dc.contributor.author	Ren, P
dc.contributor.author	Ren, Z
dc.date.accessioned	2023-03-21T22:54:44Z
dc.date.available	2023-03-21T22:54:44Z
dc.date.issued	2022-10-27
dc.identifier.citation	Knowledge-Based Systems, 2022, 254
dc.identifier.issn	0950-7051
dc.identifier.issn	1872-7409
dc.identifier.uri	http://hdl.handle.net/10453/167988
dc.language	English
dc.publisher	ELSEVIER
dc.relation.ispartof	Knowledge-Based Systems
dc.relation.isbasedon	10.1016/j.knosys.2022.109623
dc.rights	info:eu-repo/semantics/closedAccess
dc.subject	08 Information and Computing Sciences, 15 Commerce, Management, Tourism and Services, 17 Psychology and Cognitive Sciences
dc.subject.classification	Artificial Intelligence & Image Processing
dc.title	Deformable attention-oriented feature pyramid network for semantic segmentation
dc.type	Journal Article
utslib.citation.volume	254
utslib.for	08 Information and Computing Sciences
utslib.for	15 Commerce, Management, Tourism and Services
utslib.for	17 Psychology and Cognitive Sciences
pubs.organisational-group	/University of Technology Sydney
pubs.organisational-group	/University of Technology Sydney/Faculty of Engineering and Information Technology
pubs.organisational-group	/University of Technology Sydney/Strength - AAII - Australian Artificial Intelligence Institute
utslib.copyright.status	closed_access	*
dc.date.updated	2023-03-21T22:54:42Z
pubs.publication-status	Published
pubs.volume	254

Abstract:

In computer vision, pyramid features can significantly improve network performance. However, misaligned semantic information and the limited scale of small-scale features lead to an imbalance of feature contributions, which severely limits the performance of the feature pyramid network. To address the loss of model efficiency caused by this imbalance, we propose a deformable attention-oriented feature pyramid network (DAFPN). Unlike previous models, which focus solely on the semantic information between features, DAFPN uses a deformable attention mechanism to model the relationships among multiple features and then merges them during pyramid feature fusion. Building on DAFPN, we further propose a fully transformer-based semantic segmentation head that achieves high performance and good scalability. Comparisons across multiple backbones show that the proposed model outperforms the baseline: under the same conditions, it improves mIoU by 1 to 4% over the baseline semantic segmentation model.

Please use this identifier to cite or link to this item:

http://hdl.handle.net/10453/167988