Spatio-temporal DenseNet for real-time intent prediction of pedestrians in urban traffic environments

Saleh, K; Hossny, M; Nahavandi, S

Publisher: ELSEVIER
Publication Type: Journal Article
Citation: Neurocomputing, 2020, 386, pp. 317-324
Issue Date: 2020-04-21

Closed Access

	Filename	Description	Size	Format
	1-s2.0-S0925231219318065-main.pdf		1.81 MB	Adobe PDF

This item is closed access and not available.

Full metadata record

Field	Value	Language
dc.contributor.author	Saleh, K
dc.contributor.author	Hossny, M
dc.contributor.author	Nahavandi, S
dc.date.accessioned	2021-03-11T03:50:20Z
dc.date.available	2021-03-11T03:50:20Z
dc.date.issued	2020-04-21
dc.identifier.citation	Neurocomputing, 2020, 386, pp. 317-324
dc.identifier.issn	0925-2312
dc.identifier.issn	1872-8286
dc.identifier.uri	http://hdl.handle.net/10453/147030
dc.description.abstract	© 2019. Autonomous ground vehicles are increasingly finding their way into real-life applications, ranging from food/parcel delivery to self-driving vehicles. Even so, understanding the behaviours and intentions of humans is still one of the main challenges autonomous ground vehicles face. More specifically, in complex environments such as urban traffic scenes, inferring the intentions and actions of vulnerable road users such as pedestrians becomes even harder. In this paper, we address the problem of predicting the intended actions of pedestrians in urban traffic environments using only image sequences from a monocular RGB camera. We propose a real-time framework that can accurately detect, track and predict the intended actions of pedestrians based on a tracking-by-detection technique in conjunction with a novel spatio-temporal DenseNet model. We trained and evaluated our framework on real data collected from urban traffic environments. Our framework has shown resilient and competitive results in comparison to other baseline approaches. Overall, we achieved an average precision score of 84.76% with real-time performance at 20 FPS.
dc.language	English
dc.publisher	ELSEVIER
dc.relation.ispartof	Neurocomputing
dc.relation.isbasedon	10.1016/j.neucom.2019.12.091
dc.rights	info:eu-repo/semantics/closedAccess
dc.subject	08 Information and Computing Sciences, 09 Engineering, 17 Psychology and Cognitive Sciences
dc.subject.classification	Artificial Intelligence & Image Processing
dc.title	Spatio-temporal DenseNet for real-time intent prediction of pedestrians in urban traffic environments
dc.type	Journal Article
utslib.citation.volume	386
utslib.for	08 Information and Computing Sciences
utslib.for	09 Engineering
utslib.for	17 Psychology and Cognitive Sciences
pubs.organisational-group	/University of Technology Sydney
pubs.organisational-group	/University of Technology Sydney/Faculty of Engineering and Information Technology
pubs.organisational-group	/University of Technology Sydney/Faculty of Engineering and Information Technology/A/DRsch The Data Science Institute
utslib.copyright.status	closed_access	*
pubs.consider-herdc	false
dc.date.updated	2021-03-11T03:50:19Z
pubs.publication-status	Published
pubs.volume	386

Please use this identifier to cite or link to this item:

http://hdl.handle.net/10453/147030