Collaborative Video Object Segmentation by Multi-Scale Foreground-Background Integration

Yang, Z; Wei, Y; Yang, Y

Publisher: Institute of Electrical and Electronics Engineers
Publication Type: Journal Article
Citation: IEEE Transactions on Pattern Analysis and Machine Intelligence, 2022, 44, (9), pp. 4701-4712
Issue Date: 2022-05-18

Closed Access

Filename	Description	Size
Collaborative Video Object Segmentation by Multi-Scale Foreground-Background Integration.pdf	Published version	3.77 MB

This item is closed access and not available.

Full metadata record

Field	Value	Language
dc.contributor.author	Yang, Z
dc.contributor.author	Wei, Y
dc.contributor.author	Yang, Y https://orcid.org/0000-0002-0512-880X
dc.date.accessioned	2023-03-21T00:31:36Z
dc.date.available	2023-03-21T00:31:36Z
dc.date.issued	2022-05-18
dc.identifier.citation	IEEE Transactions on Pattern Analysis and Machine Intelligence, 2022, 44, (9), pp. 4701-4712
dc.identifier.issn	0162-8828
dc.identifier.issn	1939-3539
dc.identifier.uri	http://hdl.handle.net/10453/167849
dc.description.abstract	This paper investigates the principles of embedding learning to tackle the challenging task of semi-supervised video object segmentation. Unlike previous practices that focus on exploring the embedding learning of foreground object(s), we argue that the background should be treated equally. Thus, we propose a Collaborative video object segmentation by Foreground-Background Integration (CFBI) approach. CFBI separates the feature embedding into the foreground object region and its corresponding background region, implicitly promoting them to be more contrastive and improving the segmentation results accordingly. Moreover, CFBI performs both pixel-level matching processes and instance-level attention mechanisms between the reference and the predicted sequence, making CFBI robust to various object scales. Based on CFBI, we introduce a multi-scale matching structure and propose an Atrous Matching strategy, resulting in a more robust and efficient framework, CFBI+. We conduct extensive experiments on two popular benchmarks, i.e., DAVIS and YouTube-VOS. Without applying any simulated data for pre-training, our CFBI+ achieves performance (J&F) of 82.9 and 82.8 percent, outperforming all the other state-of-the-art methods. Code: https://github.com/z-x-yang/CFBI.
dc.format	Print-Electronic
dc.language	eng
dc.publisher	Institute of Electrical and Electronics Engineers
dc.relation.ispartof	IEEE Transactions on Pattern Analysis and Machine Intelligence
dc.relation.isbasedon	10.1109/tpami.2021.3081597
dc.rights	info:eu-repo/semantics/closedAccess
dc.subject	0801 Artificial Intelligence and Image Processing, 0806 Information Systems, 0906 Electrical and Electronic Engineering
dc.subject.classification	Artificial Intelligence & Image Processing
dc.title	Collaborative Video Object Segmentation by Multi-Scale Foreground-Background Integration
dc.type	Journal Article
utslib.citation.volume	44
utslib.location.activity	United States
utslib.for	0801 Artificial Intelligence and Image Processing
utslib.for	0806 Information Systems
utslib.for	0906 Electrical and Electronic Engineering
pubs.organisational-group	/University of Technology Sydney
pubs.organisational-group	/University of Technology Sydney/Faculty of Engineering and Information Technology
utslib.copyright.status	closed_access	*
pubs.consider-herdc	false
dc.date.updated	2023-03-21T00:31:35Z
pubs.issue	9
pubs.publication-status	Accepted
pubs.volume	44
utslib.citation.issue	9

Abstract:

This paper investigates the principles of embedding learning to tackle the challenging task of semi-supervised video object segmentation. Unlike previous practices that focus on exploring the embedding learning of foreground object(s), we argue that the background should be treated equally. Thus, we propose a Collaborative video object segmentation by Foreground-Background Integration (CFBI) approach. CFBI separates the feature embedding into the foreground object region and its corresponding background region, implicitly promoting them to be more contrastive and improving the segmentation results accordingly. Moreover, CFBI performs both pixel-level matching processes and instance-level attention mechanisms between the reference and the predicted sequence, making CFBI robust to various object scales. Based on CFBI, we introduce a multi-scale matching structure and propose an Atrous Matching strategy, resulting in a more robust and efficient framework, CFBI+. We conduct extensive experiments on two popular benchmarks, i.e., DAVIS and YouTube-VOS. Without applying any simulated data for pre-training, our CFBI+ achieves performance (J&F) of 82.9 and 82.8 percent, outperforming all the other state-of-the-art methods. Code: https://github.com/z-x-yang/CFBI.
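
Illustrative sketch (not the authors' implementation): the pixel-level matching idea in the abstract can be pictured as computing, for every pixel of the current frame, its distance to the nearest foreground pixel and to the nearest background pixel of a reference frame. The NumPy code below is a minimal toy version under several assumptions that do not come from this record: plain squared Euclidean distance between embeddings, a single reference frame, and a hypothetical `stride` parameter that subsamples the reference pixels as one plausible reading of "Atrous Matching". The function names are invented for illustration; the released code at the URL above is the authoritative reference.

```python
import numpy as np


def min_distance_map(cur, ref, ref_mask):
    """Distance from each current-frame pixel to its nearest selected reference pixel.

    cur, ref: float arrays of shape (H, W, C) holding per-pixel embeddings.
    ref_mask: boolean array matching ref's spatial shape; True marks the
              reference pixels to match against.
    Assumption: squared Euclidean distance, chosen here only for simplicity.
    """
    h, w, c = cur.shape
    cur_flat = cur.reshape(-1, c)          # (H*W, C)
    ref_sel = ref[ref_mask]                # (N, C) selected reference embeddings
    if ref_sel.shape[0] == 0:
        # No reference pixel of this class: every distance is infinite.
        return np.full((h, w), np.inf)
    diff = cur_flat[:, None, :] - ref_sel[None, :, :]   # (H*W, N, C)
    dist = (diff ** 2).sum(axis=-1)                     # (H*W, N)
    return dist.min(axis=1).reshape(h, w)


def fg_bg_matching(cur, ref, ref_fg_mask, stride=1):
    """Return (foreground, background) nearest-distance maps for the current frame.

    stride > 1 keeps only every stride-th reference pixel along each axis,
    which shrinks the number of comparisons (hypothetical atrous-style variant).
    """
    ref_s = ref[::stride, ::stride]
    mask_s = ref_fg_mask[::stride, ::stride]
    d_fg = min_distance_map(cur, ref_s, mask_s)
    d_bg = min_distance_map(cur, ref_s, ~mask_s)
    return d_fg, d_bg


# Toy data: a 4x4 frame whose left half is "object" and right half is "background".
mask = np.zeros((4, 4), dtype=bool)
mask[:, :2] = True
ref = np.where(mask[..., None], np.array([1.0, 0.0]), np.array([0.0, 1.0]))
cur = ref.copy()

d_fg, d_bg = fg_bg_matching(cur, ref, mask)
pred = d_fg < d_bg                      # label a pixel foreground if it is closer to foreground

d_fg2, d_bg2 = fg_bg_matching(cur, ref, mask, stride=2)
pred_strided = d_fg2 < d_bg2
```

In this toy case both the dense and the strided variants recover the reference mask, because each region has a distinct embedding. Using both distance maps, rather than the foreground one alone, is what lets a pixel be rejected for resembling the background, which is the intuition the abstract attributes to treating foreground and background equally.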

Please use this identifier to cite or link to this item:

http://hdl.handle.net/10453/167849