Recognition of Emotions in User-Generated Videos with Kernelized Features

Publication Type:
Journal Article
IEEE Transactions on Multimedia, 2018, 20 (10), pp. 2824 - 2835
Abstract:
Recognition of emotions in user-generated videos has attracted increasing research attention. Most existing approaches are based on spatial features extracted from video frames. However, due to the broad affective gap between the spatial features of images and high-level emotions, the performance of existing approaches is limited. To bridge this affective gap, we propose recognizing emotions in user-generated videos with kernelized features. We reformulate the discrete Fourier transform as a linear kernel function and construct a polynomial kernel function on top of this linear kernel. Applying the polynomial kernel to the spatial features of video frames yields kernelized features, which show superior discriminative capability compared with the original spatial features. Moreover, we are the first to apply a sparse representation method to reduce the impact of noise contained in videos, which further improves performance. Extensive experiments are conducted on two challenging benchmark datasets, namely VideoEmotion-8 and Ekman-6. The experimental results demonstrate that the proposed method achieves state-of-the-art performance.
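The core idea in the abstract can be illustrated with a short sketch. Each DFT coefficient X[k] is the inner product of the signal with a complex exponential basis vector, i.e., a linear kernel evaluation; a polynomial kernel can then be built on top of those inner products. The function names, the bias `c`, and the degree `d` below are illustrative assumptions, not the paper's exact formulation:

```python
import numpy as np

def dft_linear_kernel(x):
    """Linear-kernel view of the DFT: coefficient X[k] equals the inner
    product <x, w_k> with the DFT basis vector w_k.
    (Illustrative sketch; the paper's exact formulation may differ.)"""
    n = len(x)
    idx = np.arange(n)
    # DFT basis matrix: W[k, m] = exp(-2j * pi * k * m / n)
    W = np.exp(-2j * np.pi * np.outer(idx, idx) / n)
    return W @ x  # identical to np.fft.fft(x)

def polynomial_kernel_features(x, c=1.0, d=2):
    """Polynomial kernel built on the linear (DFT) kernel, applied
    coefficient-wise: phi_k(x) = (<x, w_k> + c) ** d.
    Parameters c and d are hypothetical choices for illustration."""
    linear = dft_linear_kernel(x)
    return (linear + c) ** d

# Example: a toy "spatial feature" vector for one video frame.
x = np.random.default_rng(0).standard_normal(8)
print(np.allclose(dft_linear_kernel(x), np.fft.fft(x)))  # True
kernelized = polynomial_kernel_features(x)
```

Writing the DFT this way makes the kernel interpretation explicit: the transform itself is a bank of linear kernel evaluations, so any kernel composed from inner products (such as the polynomial kernel) can be applied directly to the frame features.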