Normal Transformer: Extracting Surface Geometry from LiDAR Points Enhanced by Visual Semantics

Publisher:
Institute of Electrical and Electronics Engineers (IEEE)
Publication Type:
Journal Article
Citation:
IEEE Transactions on Intelligent Vehicles, 2024, PP, (99), pp. 1-11
Issue Date:
2024-01-01
Filename Description Size
1705998.pdfPublished version16.76 MB
Adobe PDF
Full metadata record
High-quality surface normal can help improve geometry estimation in problems faced by autonomous vehicles, such as collision avoidance and occlusion inference. While a considerable volume of literature focuses on densely scanned indoor scenarios, normal estimation during autonomous driving remains an intricate problem due to the sparse, non-uniform, and noisy nature of real-world LiDAR scans. In this paper, we introduce a multi-modal technique that leverages 3D point clouds and 2D colour images obtained from LiDAR and camera sensors for surface normal estimation. We present the Hybrid Geometric Transformer (HGT), a novel transformer-based neural network architecture that proficiently fuses visual semantic and 3D geometric information. Furthermore, we developed an effective learning strategy for the multi-modal data. Experimental results demonstrate the superior effectiveness of our information fusion approach compared to existing methods. It has also been verified that the proposed model can learn from a simulated 3D environment that mimics a traffic scene. The learned geometric knowledge is transferable and can be applied to real-world 3D scenes in the KITTI dataset. Further tasks built upon the estimated normal vectors in the KITTI dataset show that the proposed estimator has an advantage over existing methods.
Please use this identifier to cite or link to this item: