Disentangled Pre-training for Image Matting

Li, Y; Huang, Z; Yu, G; Chen, L; Wei, Y; Jiao, J

Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Publication Type: Conference Proceeding
Citation: Proceedings - 2024 IEEE Winter Conference on Applications of Computer Vision, WACV 2024, 2024, 00, pp. 168-177
Issue Date: 2024-01-01

Closed Access

Filename	Description	Size
1721248.pdf	Published version	3.3 MB

This item is closed access and not available.

Full metadata record

Field	Value	Language
dc.contributor.author	Li, Y
dc.contributor.author	Huang, Z
dc.contributor.author	Yu, G
dc.contributor.author	Chen, L https://orcid.org/0000-0002-6468-5729
dc.contributor.author	Wei, Y
dc.contributor.author	Jiao, J
dc.date	2024-01-03
dc.date.accessioned	2024-08-07T05:23:31Z
dc.date.available	2024-08-07T05:23:31Z
dc.date.issued	2024-01-01
dc.identifier.citation	Proceedings - 2024 IEEE Winter Conference on Applications of Computer Vision, WACV 2024, 2024, 00, pp. 168-177
dc.identifier.uri	http://hdl.handle.net/10453/180306
dc.language	en
dc.publisher	Institute of Electrical and Electronics Engineers (IEEE)
dc.relation.ispartof	Proceedings - 2024 IEEE Winter Conference on Applications of Computer Vision, WACV 2024
dc.relation.ispartof	2024 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)
dc.relation.isbasedon	10.1109/WACV57701.2024.00024
dc.rights	info:eu-repo/semantics/closedAccess
dc.title	Disentangled Pre-training for Image Matting
dc.type	Conference Proceeding
utslib.citation.volume	00
pubs.organisational-group	University of Technology Sydney
pubs.organisational-group	University of Technology Sydney/Faculty of Engineering and Information Technology
pubs.organisational-group	University of Technology Sydney/Strength - AAII - Australian Artificial Intelligence Institute
pubs.organisational-group	University of Technology Sydney/All Manual Groups
pubs.organisational-group	University of Technology Sydney/All Manual Groups/Australian Artificial Intelligence Institute (AAII)
utslib.copyright.status	closed_access	*
dc.date.updated	2024-08-07T05:23:27Z
pubs.finish-date	2024-01-08
pubs.publication-status	Published
pubs.start-date	2024-01-03
pubs.volume	00

Abstract:

In recent literature, image matting relies on high-quality pixel-level human annotations to train deep models. Such annotation is costly and hard to scale, which significantly holds back research in this area. In this work, we make the first attempt to address this problem by proposing a self-supervised pre-training approach that can leverage an unlimited amount of data to boost matting performance. The pre-training task is designed to resemble image matting: random trimaps and alpha mattes are generated to achieve an image disentanglement objective. The pre-trained model is then used to initialise the downstream matting task for fine-tuning. Extensive experimental evaluations show that the proposed approach outperforms both state-of-the-art matting methods and alternative self-supervised initialisation approaches by a large margin. We also show the robustness of the proposed approach across different backbone architectures. Our project page is available at https://crystraldo.github.io/dpt-mat/.
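
The abstract does not spell out how the random trimaps and alpha mattes are generated, so the following is only a minimal sketch of the general idea, not the authors' procedure. It assumes two unlabelled greyscale images, blends them with the standard compositing equation I = alpha * F + (1 - alpha) * B, and derives a trimap from the synthetic alpha. The disc-shaped matte and the names `random_alpha`, `trimap_from_alpha`, `composite`, and `make_sample` are hypothetical choices made for illustration.

```python
import random


def random_alpha(height, width, rng):
    """Hypothetical random alpha matte: a soft-edged disc (an assumption,
    not the paper's generator). Values lie in [0, 1]."""
    cy, cx = rng.uniform(0, height), rng.uniform(0, width)
    radius = rng.uniform(min(height, width) / 4, min(height, width) / 2)
    soft = max(1.0, radius / 4)  # half-width of the soft transition band
    alpha = []
    for y in range(height):
        row = []
        for x in range(width):
            dist = ((y - cy) ** 2 + (x - cx) ** 2) ** 0.5
            # 1 well inside the disc, 0 well outside, linear ramp in between
            value = (radius + soft - dist) / (2 * soft)
            row.append(min(1.0, max(0.0, value)))
        alpha.append(row)
    return alpha


def trimap_from_alpha(alpha):
    """0 = background, 255 = foreground, 128 = unknown (fractional alpha)."""
    return [
        [0 if a == 0.0 else 255 if a == 1.0 else 128 for a in row]
        for row in alpha
    ]


def composite(fg, bg, alpha):
    """Per-pixel compositing: I = alpha * F + (1 - alpha) * B."""
    return [
        [a * f + (1 - a) * b for f, b, a in zip(f_row, b_row, a_row)]
        for f_row, b_row, a_row in zip(fg, bg, alpha)
    ]


def make_sample(fg, bg, rng):
    """Build one self-supervised sample from two unlabelled images.

    Returns (image, trimap, alpha): the blended input, the trimap hint,
    and the synthetic alpha that serves as a free training target.
    """
    alpha = random_alpha(len(fg), len(fg[0]), rng)
    return composite(fg, bg, alpha), trimap_from_alpha(alpha), alpha


if __name__ == "__main__":
    rng = random.Random(0)
    h, w = 8, 8
    img_a = [[rng.random() for _ in range(w)] for _ in range(h)]
    img_b = [[rng.random() for _ in range(w)] for _ in range(h)]
    image, trimap, alpha = make_sample(img_a, img_b, rng)
    unknown = sum(v == 128 for row in trimap for v in row)
    print("unknown pixels:", unknown)
```

Because the alpha matte is synthesised rather than annotated, every pair of unlabelled images yields a training sample with a known target, which is what would let such a pretext task scale without human labelling.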

Please use this identifier to cite or link to this item:

http://hdl.handle.net/10453/180306