An audio-visual solution to sound source localization and tracking with applications to HRI

Publication Type:
Conference Proceeding
Australasian Conference on Robotics and Automation (ACRA), 2016, 2016-December, pp. 268-277
Files in This Item:
Filename | Description | Size | Format
ACRA16.pdf | Accepted Manuscript version | 4.05 MB | Adobe PDF
© 2018 Australasian Robotics and Automation Association. All rights reserved.
Robot audition is an emerging and growing branch of robotics and is necessary for natural Human-Robot Interaction (HRI). In this paper, we propose a framework that integrates advances from Simultaneous Localization And Mapping (SLAM), bearing-only target tracking, and robot audition into a unified system for sound source identification, localization, and tracking. Indoors, acoustic observations are often highly noisy, corrupted by reverberation, robot egomotion, and background noise, and may be discontinuous. Therefore, in everyday interaction scenarios, the system must accommodate outliers, perform robust data association, and manage the landmarks, i.e. the sound sources, appropriately. We solve the robot self-localization and environment representation problems using an RGB-D SLAM algorithm, and we solve sound source localization and tracking using recursive Bayesian estimation in the form of an extended Kalman filter with unknown data associations and an unknown number of landmarks. The experimental results show that the proposed system performs well in a medium-sized, cluttered indoor environment.
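The bearing-only extended Kalman filter update with outlier rejection mentioned in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes a single static 2D sound source, a robot pose already supplied by SLAM, and a simple chi-square gate on the innovation as a stand-in for data association. The function names, the 95% gate value, and the noise variance are all hypothetical choices made for illustration.

```python
import math

# Assumed 95% chi-square threshold for a 1-D innovation (illustrative choice).
CHI2_GATE_1DOF = 3.84


def wrap_angle(a):
    """Wrap an angle in radians to the interval [-pi, pi]."""
    return math.atan2(math.sin(a), math.cos(a))


def bearing_update(src, P, robot_pose, z, r_var):
    """One EKF update of a static 2D source from a single bearing.

    src        -- (x, y) current source estimate
    P          -- 2x2 covariance as nested lists (assumed symmetric)
    robot_pose -- (rx, ry, heading), assumed known, e.g. from SLAM
    z          -- measured bearing in the robot frame, radians
    r_var      -- bearing noise variance, radians squared
    Returns (new_src, new_P, accepted). Assumes src != robot position.
    """
    x, y = src
    rx, ry, heading = robot_pose
    dx, dy = x - rx, y - ry
    q = dx * dx + dy * dy

    # Predicted bearing and its Jacobian with respect to (x, y).
    z_hat = wrap_angle(math.atan2(dy, dx) - heading)
    H = (-dy / q, dx / q)

    # Innovation and its variance S = H P H^T + R.
    nu = wrap_angle(z - z_hat)
    PHt = (P[0][0] * H[0] + P[0][1] * H[1],
           P[1][0] * H[0] + P[1][1] * H[1])
    S = H[0] * PHt[0] + H[1] * PHt[1] + r_var

    # Gate: reject measurements with a large normalized innovation.
    if nu * nu / S > CHI2_GATE_1DOF:
        return src, P, False

    # Kalman gain, state update, and covariance update P - K (H P).
    K = (PHt[0] / S, PHt[1] / S)
    new_src = (x + K[0] * nu, y + K[1] * nu)
    new_P = [[P[i][j] - K[i] * PHt[j] for j in range(2)] for i in range(2)]
    return new_src, new_P, True
```

In a fuller system, a measurement rejected by the gate for every tracked source would be a candidate for either an outlier or a new landmark; that management logic is left out of this sketch.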