High-Performance and Interpretable 3D Point Cloud Analysis

Publication Type: Thesis
Issue Date: 2025
Abstract:
Three-dimensional computer vision is advancing rapidly, yet processing large, dynamic scenes still faces limits in feature extraction, clustering-based learning, efficiency, and model interpretability. This thesis proposes four complementary methods to address these gaps. Cluster3D is a clustering-driven representation learning approach that discovers fine-grained subclass patterns in point clouds and improves supervised segmentation on both static and dynamic data. LSK3DNet is a 3D backbone with large sparse kernels that uses spatial-wise dynamic sparsity and channel-wise weight selection to reduce computation while boosting accuracy for semantic segmentation and object detection. Interpretable3D is a prototype-based classifier that embeds interpretability directly into the architecture, providing transparent, case-level explanations while maintaining competitive results on shape classification and part segmentation. Shape2Scene is a scalable pretraining strategy that bridges shape-level learning and scene-level tasks, delivering stronger transfer and better scalability than prior approaches. Experiments across multiple benchmarks validate these contributions with state-of-the-art or competitive accuracy and notable efficiency gains. Overall, the thesis advances feature learning, clustering-based learning, efficiency, and interpretability for 3D perception, enabling more robust and accountable systems for real-world applications such as autonomous driving, robotics, and augmented reality.
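The general idea behind a prototype-based classifier, as described for Interpretable3D above, can be sketched as follows. This is a minimal illustrative example, not the thesis's actual architecture: the prototype shapes, distance metric, and function names here are assumptions.

```python
import numpy as np

# Minimal sketch of prototype-based classification: each class is represented
# by a prototype vector in feature space; a sample is assigned to the class
# whose prototype is nearest. The per-class distances themselves serve as a
# transparent, case-level explanation ("this sample looks most like prototype k").
# All names and shapes are illustrative, not Interpretable3D's actual API.

rng = np.random.default_rng(0)

num_classes, feat_dim = 3, 8
prototypes = rng.normal(size=(num_classes, feat_dim))  # one prototype per class

def classify(feature: np.ndarray):
    """Return the predicted class index and per-class distances (the explanation)."""
    dists = np.linalg.norm(prototypes - feature, axis=1)
    return int(np.argmin(dists)), dists

# A feature lying exactly on prototype 1 is assigned to class 1 with distance 0.
pred, dists = classify(prototypes[1])
assert pred == 1 and dists[1] == 0.0
```

In practice the prototypes would be learned jointly with a point-cloud feature encoder, but the nearest-prototype decision rule is what makes each prediction directly attributable to an interpretable exemplar.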