CAMRL: A Joint Method of Channel Attention and Multidimensional Regression Loss for 3D Object Detection in Automated Vehicles

Gao, H; Fang, D; Xiao, J; Hussain, W; Kim, JY

Publisher: IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
Publication Type: Journal Article
Citation: IEEE Transactions on Intelligent Transportation Systems, 2022, PP, (99)
Issue Date: 2022-01-01

Closed Access

Filename	Description	Size
CAMRL_A_Joint_Method_of_Channel_Attention_and_Multidimensional_Regression_Loss_for_3D_Object_Detection_in_Automated_Vehicles.pdf	Published version	3.01 MB

This item is closed access and not available.

Full metadata record

Field	Value	Language
dc.contributor.author	Gao, H
dc.contributor.author	Fang, D
dc.contributor.author	Xiao, J
dc.contributor.author	Hussain, W https://orcid.org/0000-0003-0610-4006
dc.contributor.author	Kim, JY
dc.date.accessioned	2023-03-20T22:33:22Z
dc.date.available	2023-03-20T22:33:22Z
dc.date.issued	2022-01-01
dc.identifier.citation	IEEE Transactions on Intelligent Transportation Systems, 2022, PP, (99)
dc.identifier.issn	1524-9050
dc.identifier.issn	1558-0016
dc.identifier.uri	http://hdl.handle.net/10453/167813
dc.language	English
dc.publisher	IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
dc.relation.ispartof	IEEE Transactions on Intelligent Transportation Systems
dc.relation.isbasedon	10.1109/TITS.2022.3219474
dc.rights	info:eu-repo/semantics/closedAccess
dc.subject	0801 Artificial Intelligence and Image Processing, 0905 Civil Engineering, 1507 Transportation and Freight Services
dc.subject.classification	Logistics & Transportation
dc.title	CAMRL: A Joint Method of Channel Attention and Multidimensional Regression Loss for 3D Object Detection in Automated Vehicles
dc.type	Journal Article
utslib.citation.volume	PP
utslib.for	0801 Artificial Intelligence and Image Processing
utslib.for	0905 Civil Engineering
utslib.for	1507 Transportation and Freight Services
pubs.organisational-group	/University of Technology Sydney
pubs.organisational-group	/University of Technology Sydney/Faculty of Engineering and Information Technology
pubs.organisational-group	/University of Technology Sydney/Faculty of Engineering and Information Technology/School of Computer Science
utslib.copyright.status	closed_access	*
dc.date.updated	2023-03-20T22:33:21Z
pubs.issue	99
pubs.publication-status	Published
pubs.volume	PP
utslib.citation.issue	99

Abstract:

Fully automated vehicles collect information about their road environments to adjust their driving actions, such as braking and slowing down. The development of artificial intelligence (AI) and the Internet of Things (IoT) has improved the cognitive abilities of vehicles, allowing them to detect traffic signs, pedestrians, and obstacles for increasing the intelligence of these transportation systems. Three-dimensional (3D) object detection in front-view images taken by vehicle cameras is important for both object detection and depth estimation. In this paper, a joint channel attention and multidimensional regression loss method for 3D object detection in automated vehicles (called CAMRL) is proposed to improve the average precision of 3D object detection by focusing on the model’s ability to infer the locations and sizes of objects. First, channel attention is introduced to effectively learn the yaw angles from the road images captured by vehicle cameras. Second, a multidimensional regression loss algorithm is designed to further optimize the size and position parameters during the training process. Third, the intrinsic parameters of the camera and the depth estimate of the model are combined to reduce the object depth computation error, allowing us to calculate the distance between an object and the camera after the object’s size is confirmed. As a result, objects are detected, and their depth estimations are validated. Then, the vehicle can determine when and how to stop if an object is nearby. Finally, experiments conducted on the KITTI dataset demonstrate that our method is effective and performs better than other baseline methods, especially in terms of 3D object detection and bird’s-eye view (BEV) evaluation.
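
The third step in the abstract combines the camera's intrinsic parameters with an object's estimated size to obtain its distance from the camera. This record does not give the paper's formula, so the sketch below only assumes the standard pinhole-camera relation Z = f * H / h, where f is the focal length in pixels, H is the object's physical height in metres, and h is its projected height in pixels. The function name and the numeric values are hypothetical and are not taken from the paper.

```python
def depth_from_height(focal_length_px: float,
                      object_height_m: float,
                      bbox_height_px: float) -> float:
    """Estimate distance along the optical axis, in metres.

    Assumes an ideal pinhole camera (no lens distortion) and an object
    whose height is roughly parallel to the image plane.
    """
    if focal_length_px <= 0 or object_height_m <= 0 or bbox_height_px <= 0:
        raise ValueError("all inputs must be positive")
    # Similar triangles: bbox_height_px / focal_length_px == object_height_m / depth
    return focal_length_px * object_height_m / bbox_height_px


# Illustrative values only: a 720 px focal length, a 1.5 m tall car,
# and a 2D bounding box that is 60 px tall.
distance_m = depth_from_height(720.0, 1.5, 60.0)
print(distance_m)  # 18.0
```

Under this relation, an error in the estimated object height carries over proportionally into the depth, which is consistent with the abstract's point that distance is computed after the object's size is confirmed.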

Please use this identifier to cite or link to this item:

http://hdl.handle.net/10453/167813