A general compression approach to multi-channel three-dimensional audio

Publication Type:
Journal Article
Citation:
IEEE Transactions on Audio, Speech and Language Processing, 2013, 21 (8), pp. 1676 - 1688
Issue Date:
2013-05-22
Files in This Item:
06508842.pdf (Published Version, Adobe PDF, 2.13 MB)
Abstract:
This paper presents a technique for low bit rate compression of three-dimensional (3D) audio produced by multiple loudspeaker channels. The approach is based on time-frequency analysis of the localization of spatial sound sources within the 3D space rendered by a multi-channel audio signal (in this case, 16 channels). This analysis yields a stereo downmix signal representing the original 16 channels. Alternatively, a mono downmix signal can be derived, together with side information representing the location of sound sources within the 3D spatial scene. The resulting downmix signals are then compressed with a traditional audio coder, giving a representation of the 3D soundfield at bit rates comparable to those of existing stereo audio coders while maintaining the perceptual quality obtained by encoding each channel separately. © 2006-2012 IEEE.
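The mono-downmix-plus-side-information idea can be illustrated with a minimal sketch. It is not the paper's algorithm: it assumes that each time-frequency tile is summarised by summing the channel coefficients into a mono value and by taking the energy-weighted average of the loudspeaker directions as the source location. The function names, the coordinate convention, and the three-loudspeaker layout are hypothetical choices made only for illustration.

```python
import math


def speaker_unit_vector(azimuth_deg, elevation_deg):
    """Convert a loudspeaker direction to a 3D unit vector.

    Assumed convention (not from the paper): x points front, y left, z up.
    """
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    return (math.cos(el) * math.cos(az),
            math.cos(el) * math.sin(az),
            math.sin(el))


def analyse_tile(coeffs, directions):
    """Reduce one time-frequency tile to a mono value plus a direction.

    coeffs:     one complex spectral coefficient per loudspeaker channel.
    directions: matching list of (azimuth_deg, elevation_deg) pairs.
    Returns (mono, azimuth_deg, elevation_deg).
    """
    mono = sum(coeffs)                        # plain sum as the mono downmix
    energies = [abs(c) ** 2 for c in coeffs]  # per-channel energy in this tile
    total = sum(energies)
    if total == 0.0:
        return mono, 0.0, 0.0                 # silent tile: direction is arbitrary

    # Energy-weighted average of the loudspeaker direction vectors.
    vectors = [speaker_unit_vector(az, el) for az, el in directions]
    x = sum(e * v[0] for e, v in zip(energies, vectors)) / total
    y = sum(e * v[1] for e, v in zip(energies, vectors)) / total
    z = sum(e * v[2] for e, v in zip(energies, vectors)) / total

    azimuth = math.degrees(math.atan2(y, x))
    elevation = math.degrees(math.atan2(z, math.hypot(x, y)))
    return mono, azimuth, elevation


# Hypothetical layout: front, left, and overhead loudspeakers.
layout = [(0.0, 0.0), (90.0, 0.0), (0.0, 90.0)]
mono, az, el = analyse_tile([1 + 0j, 1 + 0j, 0j], layout)
print(mono, round(az, 3), round(el, 3))
```

In this sketch, equal energy in the front and left loudspeakers places the estimated source midway between them in the horizontal plane. A decoder would need the mono signal and the two angles per tile to re-pan the sound over the loudspeaker layout; how the paper quantises and transmits that side information is not stated in the abstract.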