Deep Learning Aided Visual Localisation in Urban Pedestrian Environments

Jayasuriya, Maleen

Publication Type: Thesis
Issue Date: 2021

This item is open access.

Full metadata record

Field	Value	Language
dc.contributor.author	Jayasuriya, Maleen
dc.date.accessioned	2021-11-10T07:08:38Z
dc.date.available	2021-11-10T07:08:38Z
dc.date.issued	2021
dc.identifier.uri	http://hdl.handle.net/10453/151478
dc.description	University of Technology Sydney. Faculty of Engineering and Information Technology.	en_US.UTF-8
dc.description.abstract	Localisation of a mobile robot is a fundamental problem in robotics research. In a known environment, localisation can be performed using a prebuilt map, whereas the much more complex simultaneous localisation and mapping (SLAM), which estimates both the robot location and the map, is required when operating in an unknown environment. This thesis focuses on the localisation of low-speed vehicles, ranging from personal mobility devices to delivery robots, operating in a known outdoor urban environment using low-cost cameras, with the objective of improving their functionality and safety. Existing techniques for vision-only localisation, even when operating in known environments, require SLAM due to the difficulty of building reliable maps that persist across long time frames. This thesis proposes an approach that circumvents this problem by utilising convolutional neural network (CNN) based perception of (a) persistent pole-like landmarks such as lamp posts, trees, street signs and parking meters, and (b) important ground surface boundaries related to persistent infrastructure such as curbs, pavement edges and manhole covers, found in urban environments. Localisation is carried out on a prebuilt map consisting of the 2D locations of these landmarks and a vector distance transform (VDT) representation of the ground surface boundaries. An extended Kalman filter (EKF) fuses these observations to carry out pose estimation while robustly dealing with missed detections and wrong classifications. This approach is further extended by utilising an omnidirectional camera to improve the effective field of view (FoV) of the landmark detection system. The framework utilises an information-theoretic strategy to decide the best viewpoint to serve as an input to the CNN in a given iteration, instead of the full 360-degree coverage offered by an omnidirectional camera, in order to leverage the advantage of a wider field of view without compromising on performance. Finally, a strategy to decide when and how to incorporate traversable path boundaries such as pavements or footpath edges is also proposed. Real-world experiments carried out in dynamic urban environments across large time gaps in the year and at different distance scales, utilising an instrumented mobility scooter, are presented to highlight the effectiveness of the contributions of this thesis in contrast to state-of-the-art visual SLAM-based approaches.	en_US.UTF-8
dc.format	Thesis (PhD)
dc.language.iso	en_US	en_US.UTF-8
dc.relation	https://opus.lib.uts.edu.au/bitstream/10453/151478/2/02whole.pdf
dc.rights	info:eu-repo/semantics/openAccess
dc.rights	The author owns the copyright in this thesis including all reproduction and reuse rights for the work. The work may not be altered without the permission of the copyright owner. Attribution is essential when quoting or paraphrasing from this thesis.
dc.rights	au.edu.uts.lib/ppc
dc.title	Deep Learning Aided Visual Localisation in Urban Pedestrian Environments	en_US.UTF-8
dc.type	Thesis
utslib.copyright.status	open_access	*

Abstract:

Localisation of a mobile robot is a fundamental problem in robotics research. In a known environment, localisation can be performed using a prebuilt map, whereas the much more complex simultaneous localisation and mapping (SLAM), which estimates both the robot location and the map, is required when operating in an unknown environment. This thesis focuses on the localisation of low-speed vehicles, ranging from personal mobility devices to delivery robots, operating in a known outdoor urban environment using low-cost cameras, with the objective of improving their functionality and safety. Existing techniques for vision-only localisation, even when operating in known environments, require SLAM due to the difficulty of building reliable maps that persist across long time frames. This thesis proposes an approach that circumvents this problem by utilising convolutional neural network (CNN) based perception of (a) persistent pole-like landmarks such as lamp posts, trees, street signs and parking meters, and (b) important ground surface boundaries related to persistent infrastructure such as curbs, pavement edges and manhole covers, found in urban environments. Localisation is carried out on a prebuilt map consisting of the 2D locations of these landmarks and a vector distance transform (VDT) representation of the ground surface boundaries. An extended Kalman filter (EKF) fuses these observations to carry out pose estimation while robustly dealing with missed detections and wrong classifications. This approach is further extended by utilising an omnidirectional camera to improve the effective field of view (FoV) of the landmark detection system. The framework utilises an information-theoretic strategy to decide the best viewpoint to serve as an input to the CNN in a given iteration, instead of the full 360-degree coverage offered by an omnidirectional camera, in order to leverage the advantage of a wider field of view without compromising on performance. Finally, a strategy to decide when and how to incorporate traversable path boundaries such as pavements or footpath edges is also proposed. Real-world experiments carried out in dynamic urban environments across large time gaps in the year and at different distance scales, utilising an instrumented mobility scooter, are presented to highlight the effectiveness of the contributions of this thesis in contrast to state-of-the-art visual SLAM-based approaches.
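
Illustrative sketch (not taken from the thesis): the abstract mentions an EKF that fuses landmark observations against a prebuilt 2D map while tolerating missed detections and wrong classifications. The Python below shows one common way such an update can look, under several assumptions: the pose is [x, y, heading], the camera yields a bearing-only measurement to a pole-like landmark whose map position is already associated, and outliers are rejected with a chi-square gate on the innovation. The function names, the gate value and the noise figures are all hypothetical; the thesis may use a different measurement model and a different rejection scheme.

```python
import math

# Assumed 95% chi-square threshold for a 1-D innovation (hypothetical choice).
CHI2_GATE_1DOF = 3.84


def wrap_angle(a):
    """Wrap an angle in radians to the interval [-pi, pi)."""
    return (a + math.pi) % (2.0 * math.pi) - math.pi


def ekf_bearing_update(x, P, z, landmark, r_var, gate=CHI2_GATE_1DOF):
    """One EKF update of pose x = [px, py, heading] with covariance P (3x3
    nested lists) from a bearing z (rad, vehicle frame) to a mapped landmark
    (lx, ly). r_var is the bearing noise variance.
    Returns (x, P, accepted); a gated-out measurement leaves x and P as-is."""
    px, py, th = x
    dx, dy = landmark[0] - px, landmark[1] - py
    q = dx * dx + dy * dy  # squared range; assumed non-zero

    # Predicted bearing and its Jacobian with respect to the pose.
    z_hat = wrap_angle(math.atan2(dy, dx) - th)
    H = [dy / q, -dx / q, -1.0]

    # Innovation and its scalar covariance S = H P H^T + R.
    nu = wrap_angle(z - z_hat)
    PHt = [sum(P[i][j] * H[j] for j in range(3)) for i in range(3)]
    S = sum(H[i] * PHt[i] for i in range(3)) + r_var

    # Gate: skip measurements that are statistically inconsistent with the
    # current estimate (e.g. a misclassified or wrongly associated pole).
    if nu * nu / S > gate:
        return x, P, False

    # Kalman gain, state update and covariance update P - K S K^T.
    K = [v / S for v in PHt]
    x_new = [x[i] + K[i] * nu for i in range(3)]
    x_new[2] = wrap_angle(x_new[2])
    P_new = [[P[i][j] - K[i] * S * K[j] for j in range(3)] for i in range(3)]
    return x_new, P_new, True


if __name__ == "__main__":
    x0 = [0.0, 0.0, 0.0]
    P0 = [[0.25, 0.0, 0.0], [0.0, 0.25, 0.0], [0.0, 0.0, 0.01]]
    pole = (5.0, 0.0)  # hypothetical mapped lamp post
    print(ekf_bearing_update(x0, P0, 0.05, pole, 0.01))  # consistent bearing
    print(ekf_bearing_update(x0, P0, 1.00, pole, 0.01))  # gated out
```

In this sketch a missed detection needs no special handling: if no bearing arrives in an iteration, the update is simply not called and the filter carries on with its prediction alone.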

Please use this identifier to cite or link to this item:

http://hdl.handle.net/10453/151478