Listening for people: Exploiting the spectral structure of speech to robustly perceive the presence of people

Publication Type:
Conference Proceeding
2011 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2011, pp. 2903 - 2909
As the desire to see robots become ubiquitous in society grows, so does the need to give robots the means of building awareness of any humans with whom they may be sharing the environment. This paper presents a system, suitable for real-world use, that enables robots to robustly perceive the presence of people acoustically. The proposed binaural system first identifies voiced signals by means of a novel approach to Voice Activity Detection that exploits the spectral signature and characteristics of speech without relying on a priori knowledge. Bearing estimates for each speaker are then made using a multi-track particle filter whose belief update function is composed of a cross-correlation bearing estimate and an estimate of the speaker's fundamental frequency. Results are presented and discussed for an evaluation of each of the major system components and for a system evaluation in which the robot successfully built human-centric situational awareness of the three humans with whom it shared an office lunch-room containing typical background noises.
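The cross-correlation bearing cue named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it shows only the generic idea of estimating the inter-microphone delay by cross-correlation and converting it to a bearing. The function names, the far-field geometry, the 343 m/s speed of sound and the sign convention are all assumptions made for illustration.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s; assumed room-temperature value, not from the paper


def best_lag(left, right, max_lag):
    """Return the lag (in samples) at which `right` best matches `left`.

    A positive lag means the right channel is a delayed copy of the left,
    i.e. the sound reached the left microphone first.
    """
    n = len(left)
    best, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        # Plain time-domain cross-correlation at this lag.
        score = 0.0
        for i in range(n):
            j = i + lag
            if 0 <= j < n:
                score += left[i] * right[j]
        if score > best_score:
            best, best_score = lag, score
    return best


def bearing_from_lag(lag, sample_rate, mic_spacing):
    """Convert a sample lag to a far-field bearing in degrees.

    0 degrees is straight ahead; positive bearings point toward the left
    microphone (an assumed convention).
    """
    delay = lag / sample_rate
    # Clamp so measurement noise cannot push asin outside its domain.
    ratio = max(-1.0, min(1.0, delay * SPEED_OF_SOUND / mic_spacing))
    return math.degrees(math.asin(ratio))
```

In a tracker of the kind the abstract describes, a per-frame bearing like this would be one of the observations used to weight particles; how the paper combines it with the fundamental-frequency estimate is not stated here and is not reproduced by this sketch.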