Research on Point Cloud Segmentation and 3D Scene Understanding

Publication Type: Thesis
Issue Date: 2022
Recently, the amount of 3D data has been increasing sharply thanks to widely available 3D sensors such as LiDAR, Kinect, and RealSense. Compared with 2D images, 3D data provides explicit topological and geometric information, which is important for many computer vision applications. Among the many types of 3D data, point clouds are widely used because they are easy to acquire and store. This thesis summarizes work on understanding the high-level semantics and basic structure of 3D scenes from point cloud data.

Firstly, a fully-supervised point cloud semantic segmentation framework is designed. Contextual information can help resolve ambiguity and improve the robustness of a recognition system. To capture long-range context, a long-short-term feature bank based framework is introduced that exploits patch-wise relationships for point cloud semantic segmentation. This approach can capture context at an arbitrary range with only a small additional computational cost over a standard segmentation model. Experiments demonstrate that the proposed approach outperforms other point cloud semantic segmentation approaches that exploit long-range context.

Manually labeling point cloud datasets for fully-supervised methods, especially with point-wise annotations, is expensive. To avoid manually annotating 3D keypoints, a weakly-supervised 3D keypoint extraction method for point cloud registration, called KPSNet, is proposed. It uses only the relative transformation matrices between input pairs of point clouds as weak labels, from which point-to-point correspondences are established on-the-fly for training. Moreover, KPSNet simultaneously detects 3D keypoints and learns their feature representations. Experimental results show that the proposed method outperforms other methods that do not use keypoint annotations on point cloud registration.

To eliminate the need for manual point-wise annotations, a weakly-supervised point cloud semantic segmentation approach is proposed that uses only coarse labels. The approach reformulates the problem as semi-supervised point cloud semantic segmentation with noisy pseudo labels. A three-branch network is then proposed that reduces the impact of noise in the pseudo labels and leverages the inner structure of point clouds to improve segmentation performance. On benchmark datasets, this method surpasses other weakly-supervised methods as well as the fully-supervised PointNet++, and narrows the gap with other strong fully-supervised approaches.

Subsequently, another weakly-supervised point cloud semantic segmentation framework is introduced, which uses incomplete point-level labels. It explores contrastive learning, with both cross-sample contrast and low-level contrast, together with a pseudo-label refinement module to mine more useful supervision from pseudo labels. This method achieves state-of-the-art performance on benchmark datasets under several weakly-supervised annotation settings.
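To illustrate the weak-label idea behind KPSNet, the sketch below shows one generic way to derive point-to-point correspondences on-the-fly from a known relative transform: warp the source cloud with the ground-truth transform and match each point to its nearest neighbour in the target. The function name, the nearest-neighbour matching, and the max_dist threshold are assumptions made for this example only, not the procedure actually used in the thesis.

```python
import numpy as np
from scipy.spatial import cKDTree

def correspondences_from_transform(src, tgt, R, t, max_dist=0.05):
    """Warp the source cloud with the known relative transform (R, t) and pair
    each warped point with its nearest neighbour in the target cloud.
    Pairs farther apart than max_dist are discarded, giving point-to-point
    correspondences without any manual keypoint labels."""
    warped = src @ R.T + t                    # apply the weak label (R, t)
    dist, idx = cKDTree(tgt).query(warped, k=1)
    keep = dist < max_dist                    # drop points with no close match
    return np.flatnonzero(keep), idx[keep]

# Toy usage: a random cloud and a rigidly transformed copy of it.
rng = np.random.default_rng(0)
src = rng.uniform(-1.0, 1.0, size=(2048, 3))
theta = np.pi / 6
R = np.array([[np.cos(theta), -np.sin(theta), 0.0],
              [np.sin(theta),  np.cos(theta), 0.0],
              [0.0,            0.0,           1.0]])
t = np.array([0.1, -0.2, 0.05])
tgt = src @ R.T + t
src_idx, tgt_idx = correspondences_from_transform(src, tgt, R, t)
print(len(src_idx), "correspondences found")
```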
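Similarly, the contrastive learning used in the final framework can be illustrated with a generic InfoNCE-style loss. The info_nce function below, its tensor shapes, and the temperature value are hypothetical stand-ins for whatever cross-sample and low-level contrast formulations the thesis actually adopts; it is a minimal sketch of the general technique, not the thesis's implementation.

```python
import torch
import torch.nn.functional as F

def info_nce(anchor, positive, negatives, temperature=0.07):
    """Generic InfoNCE-style contrastive loss.
    anchor, positive: (N, C) feature tensors; negatives: (N, K, C).
    Each anchor is pulled toward its positive (e.g. a feature of the same
    semantic class from another sample) and pushed away from its negatives."""
    anchor = F.normalize(anchor, dim=-1)
    positive = F.normalize(positive, dim=-1)
    negatives = F.normalize(negatives, dim=-1)
    pos_logit = (anchor * positive).sum(dim=-1, keepdim=True)     # (N, 1)
    neg_logits = torch.einsum('nc,nkc->nk', anchor, negatives)    # (N, K)
    logits = torch.cat([pos_logit, neg_logits], dim=1) / temperature
    targets = torch.zeros(anchor.size(0), dtype=torch.long,
                          device=anchor.device)                   # index 0 = positive
    return F.cross_entropy(logits, targets)

# Toy usage with random features.
loss = info_nce(torch.randn(32, 64), torch.randn(32, 64), torch.randn(32, 16, 64))
print(loss.item())
```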