Connecting the dots for real-time LiDAR-based object detection with YOLO
- Publication Type: Conference Proceeding
- Australasian Conference on Robotics and Automation, ACRA, 2018, 2018-December
- Issue Date: 2018
This item is open access.
© 2018 Australasian Robotics and Automation Association. All rights reserved.

In this paper we introduce a generic method for people and vehicle detection using LiDAR data only, leveraging a pre-trained Convolutional Neural Network (CNN) from the RGB domain. With machine learning algorithms there is typically an inherent trade-off between the amount of training data available and the need for engineered features. The current state of the art in object detection and classification relies heavily on deep CNNs trained on enormous RGB image datasets. To take advantage of this built-in knowledge, we propose to fine-tune the You Only Look Once (YOLO) network, transferring its understanding of object shapes to upsampled LiDAR images. Our method creates a dense depth/intensity map, which highlights object contours, from the 3D point cloud of a LiDAR scan. The proposed method is hardware agnostic and can therefore be used with any LiDAR data, independently of the number of channels or beams. Overall, the proposed pipeline exploits the notable similarity between upsampled LiDAR images and RGB images, removing the need to train a deep CNN from scratch. This transfer learning makes our method data efficient while avoiding the creation of heavily engineered features. Evaluation results show that our proposed LiDAR-only detection model performs on par with its RGB-only counterpart.
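The core preprocessing step described in the abstract, projecting a 3D LiDAR scan onto a dense 2D depth/intensity image that a pre-trained RGB network can consume, can be sketched as follows. This is a minimal illustration, not the paper's implementation: the image resolution (64×512), the spherical projection, and the neighbourhood-averaging hole filler used as a stand-in for the upsampling step are all assumptions for the example.

```python
import numpy as np

def lidar_to_depth_image(points, h=64, w=512):
    """Project a point cloud of shape (N, 4) with columns (x, y, z,
    intensity) onto a spherical depth image and an intensity image.
    The (h, w) grid size is an illustrative choice, not the paper's."""
    x, y, z, intensity = points.T
    r = np.sqrt(x**2 + y**2 + z**2)
    azimuth = np.arctan2(y, x)                      # horizontal angle, [-pi, pi]
    elevation = np.arcsin(z / np.maximum(r, 1e-9))  # vertical angle
    # Map angles to integer pixel coordinates.
    u = ((azimuth + np.pi) / (2 * np.pi) * (w - 1)).astype(int)
    span = max(np.ptp(elevation), 1e-9)
    v = ((elevation.max() - elevation) / span * (h - 1)).astype(int)
    depth = np.zeros((h, w), dtype=np.float32)
    inten = np.zeros((h, w), dtype=np.float32)
    depth[v, u] = r
    inten[v, u] = np.abs(intensity)
    return depth, inten

def upsample_fill(img, passes=3):
    """Fill empty (zero) pixels with the mean of their non-empty 3x3
    neighbours -- a crude stand-in for the dense upsampling step."""
    img = img.copy()
    h, w = img.shape
    for _ in range(passes):
        mask = img == 0
        padded = np.pad(img, 1)
        # Nine shifted views covering the 3x3 neighbourhood of each pixel.
        neigh = np.stack([padded[i:i + h, j:j + w]
                          for i in range(3) for j in range(3)])
        counts = (neigh > 0).sum(axis=0)
        sums = neigh.sum(axis=0)
        fill = np.where(counts > 0, sums / np.maximum(counts, 1), 0)
        img[mask] = fill[mask]
    return img
```

The resulting two-channel (depth, intensity) image resembles a grayscale photograph closely enough that, as the abstract argues, an RGB-pretrained detector such as YOLO can be fine-tuned on it directly; because the projection only needs the raw points, the same code works regardless of how many beams the sensor has.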