Large-scale Video Panoptic Segmentation in the Wild: A Benchmark

Miao, J; Wang, X; Wu, Y; Li, W; Zhang, X; Wei, Y; Yang, Y

Publisher: IEEE
Publication Type: Conference Proceeding
Citation: Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2022, 2022-June, pp. 21001-21011
Issue Date: 2022-01-01

Closed Access

Filename	Description	Size	Format
CVPR22_VIPSeg_OPUS.pdf	Accepted version	8.57 MB	Adobe PDF

This item is closed access and not available.

Full metadata record

Field	Value	Language
dc.contributor.author	Miao, J
dc.contributor.author	Wang, X
dc.contributor.author	Wu, Y
dc.contributor.author	Li, W
dc.contributor.author	Zhang, X
dc.contributor.author	Wei, Y
dc.contributor.author	Yang, Y https://orcid.org/0000-0002-0512-880X
dc.date	2022-06-18
dc.date.accessioned	2023-03-14T06:58:46Z
dc.date.available	2023-03-14T06:58:46Z
dc.date.issued	2022-01-01
dc.identifier.citation	Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2022, 2022-June, pp. 21001-21011
dc.identifier.isbn	9781665469463
dc.identifier.issn	1063-6919
dc.identifier.uri	http://hdl.handle.net/10453/167318
dc.description.abstract	In this paper, we present a new large-scale dataset for the video panoptic segmentation task, which aims to assign semantic classes and track identities to all pixels in a video. As the ground truth for this task is difficult to annotate, previous datasets for video panoptic segmentation are limited by either small scales or the number of scenes. In contrast, our large-scale VIdeo Panoptic Segmentation in the Wild (VIPSeg) dataset provides 3,536 videos and 84,750 frames with pixel-level panoptic annotations, covering a wide range of real-world scenarios and categories. To the best of our knowledge, our VIPSeg is the first attempt to tackle the challenging video panoptic segmentation task in the wild by considering diverse scenarios. Based on VIPSeg, we evaluate existing video panoptic segmentation approaches and propose an efficient and effective clip-based baseline method to analyze our VIPSeg dataset. Our dataset is available at https://github.com/VIPSeg-Dataset/VIPSeg-Dataset/.
dc.language	en
dc.publisher	IEEE
dc.relation.ispartof	Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition
dc.relation.ispartof	2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
dc.relation.isbasedon	10.1109/CVPR52688.2022.02036
dc.rights	info:eu-repo/semantics/closedAccess
dc.title	Large-scale Video Panoptic Segmentation in the Wild: A Benchmark
dc.type	Conference Proceeding
utslib.citation.volume	2022-June
pubs.organisational-group	/University of Technology Sydney
pubs.organisational-group	/University of Technology Sydney/Faculty of Engineering and Information Technology
utslib.copyright.status	closed_access	*
dc.date.updated	2023-03-14T06:58:34Z
pubs.finish-date	2022-06-24
pubs.publication-status	Published
pubs.start-date	2022-06-18
pubs.volume	2022-June

Abstract:

In this paper, we present a new large-scale dataset for the video panoptic segmentation task, which aims to assign semantic classes and track identities to all pixels in a video. As the ground truth for this task is difficult to annotate, previous datasets for video panoptic segmentation are limited by either small scales or the number of scenes. In contrast, our large-scale VIdeo Panoptic Segmentation in the Wild (VIPSeg) dataset provides 3,536 videos and 84,750 frames with pixel-level panoptic annotations, covering a wide range of real-world scenarios and categories. To the best of our knowledge, our VIPSeg is the first attempt to tackle the challenging video panoptic segmentation task in the wild by considering diverse scenarios. Based on VIPSeg, we evaluate existing video panoptic segmentation approaches and propose an efficient and effective clip-based baseline method to analyze our VIPSeg dataset. Our dataset is available at https://github.com/VIPSeg-Dataset/VIPSeg-Dataset/.

Please use this identifier to cite or link to this item:

http://hdl.handle.net/10453/167318