Clustering based Point Cloud Representation Learning for 3D Analysis
- Publisher:
- Institute of Electrical and Electronics Engineers (IEEE)
- Publication Type:
- Conference Proceeding
- Citation:
- Proceedings of the IEEE International Conference on Computer Vision, 2023, 00, pp. 8249-8260
- Issue Date:
- 2023-01-01
Closed Access
Filename | Description | Size | |||
---|---|---|---|---|---|
Clustering_based_Point_Cloud_Representation_Learning_for_3D_Analysis.pdf | Published version | 2.72 MB |
Copyright Clearance Process
- Recently Added
- In Progress
- Closed Access
This item is closed access and not available.
Point cloud analysis (such as 3D segmentation and detection) is a challenging task, because of not only the irregular geometries of many millions of unordered points, but also the great variations caused by depth, viewpoint, occlusion, etc. Current studies put much focus on the adaption of neural networks to the complex geometries of point clouds, but are blind to a fundamental question: how to learn an appropriate point embedding space that is aware of both discriminative semantics and challenging variations? As a response, we propose a clustering based supervised learning scheme for point cloud analysis. Unlike current de-facto, scene-wise training paradigm, our algorithm conducts within-class clustering on the point embedding space for automatically discovering subclass patterns which are latent yet representative across scenes. The mined patterns are, in turn, used to repaint the embedding space, so as to respect the underlying distribution of the entire training dataset and improve the robustness to the variations. Our algorithm is principled and readily pluggable to modern point cloud segmentation networks during training, without extra overhead during testing. With various 3D network architectures (i.e., voxel-based, point-based, Transformer-based, automatically searched), our algorithm shows notable improvements on famous point cloud segmentation datasets (i.e., 2.0-2.6% on single-scan and 2.0-2.2% multi-scan of SemanticKITTI, 1.8-1.9% on S3DIS, in terms of mIoU). Our algorithm also demonstrates utility in 3D detection, showing 2.0-3.4% mAP gains on KITTI.
Please use this identifier to cite or link to this item: