Deep Learning Aided Visual Localisation in Urban Pedestrian Environments

Publication Type:
Thesis
Issue Date:
2021
Localisation of a mobile robot is a fundamental problem in robotics research. In a known environment, localisation can be performed using a prebuilt map, whereas the much more complex simultaneous localisation and mapping (SLAM), which estimates both the robot location and the map, is required when operating in an unknown environment. This thesis focuses on the localisation of low-speed vehicles, ranging from personal mobility devices to delivery robots, operating in a known outdoor urban environment using low-cost cameras, with the objective of improving their functionality and safety. Existing techniques for vision-only localisation require SLAM even when operating in known environments, because it is difficult to build reliable maps that persist across long time frames. This thesis proposes an approach that circumvents this problem by utilising convolutional neural network (CNN) based perception of (a) persistent pole-like landmarks such as lamp posts, trees, street signs and parking meters, and (b) important ground surface boundaries related to persistent infrastructure such as curbs, pavement edges and manhole covers, found in urban environments. Localisation is carried out on a prebuilt map consisting of the 2D locations of these landmarks and a vector distance transform (VDT) representation of the ground surface boundaries. An extended Kalman filter (EKF) fuses these observations to carry out pose estimation while robustly dealing with missed detections and wrong classifications. This approach is further extended by utilising an omnidirectional camera to improve the effective field of view (FoV) of the landmark detection system. Rather than processing the full 360-degree coverage offered by the omnidirectional camera, the framework uses an information-theoretic strategy to decide the best viewpoint to serve as the input to the CNN in a given iteration, gaining the advantage of a wider field of view without compromising performance.
Finally, a strategy to decide when and how to incorporate traversable path boundaries, such as pavement or footpath edges, is also proposed. Real-world experiments carried out with an instrumented mobility scooter in dynamic urban environments, across large time gaps in the year and at different distance scales, are presented to highlight the effectiveness of the contributions of this thesis in contrast to state-of-the-art visual SLAM based approaches.
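
As an illustration of the map representation mentioned in the abstract, the sketch below builds a vector distance transform on a small grid: every cell stores the offset to its nearest boundary cell rather than only a scalar distance. The function name, the grid layout, the example curb and the brute-force search are assumptions made for illustration and are not taken from the thesis.

```python
import math


def vector_distance_transform(boundary, width, height):
    """Map each (x, y) grid cell to the (dx, dy) offset of its nearest boundary cell.

    Hypothetical helper for illustration only; it uses a brute-force search,
    which is adequate for a toy grid but not for a full-size map.
    """
    if not boundary:
        raise ValueError("at least one boundary cell is required")
    vdt = {}
    for y in range(height):
        for x in range(width):
            # Nearest boundary cell by squared Euclidean distance.
            bx, by = min(boundary, key=lambda b: (b[0] - x) ** 2 + (b[1] - y) ** 2)
            vdt[(x, y)] = (bx - x, by - y)
    return vdt


# Hypothetical curb running along the column x = 3 of a 6 x 4 grid.
curb = [(3, y) for y in range(4)]
vdt = vector_distance_transform(curb, width=6, height=4)

# The stored vector gives both the distance and the direction to the boundary.
dx, dy = vdt[(0, 2)]
distance = math.hypot(dx, dy)
```

Because each cell keeps the full offset, a lookup at an estimated position returns the direction towards the nearest boundary as well as its distance, which is the property that distinguishes a VDT from a scalar distance transform.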