Joint structured sparsity regularized multiview dimension reduction for video-based facial expression recognition

Publication Type:
Journal Article
ACM Transactions on Intelligent Systems and Technology, 2016, 8 (2)
Issue Date:
Filename Description Size
a28-xie.pdfPublished Version1.44 MB
Adobe PDF
Full metadata record
© 2016 ACM. Video-based facial expression recognition (FER) has recently received increased attention as a result of its widespread application. Using only one type of feature to describe facial expression in video sequences is often inadequate, because the information available is very complex. With the emergence of different features to represent different properties of facial expressions in videos, an appropriate combination of these features becomes an important, yet challenging, problem. Considering that the dimensionality of these features is usually high, we thus introduce multiview dimension reduction (MVDR) into video-based FER. In MVDR, it is critical to explore the relationships between and within different feature views. To achieve this goal, we propose a novel framework of MVDR by enforcing joint structured sparsity at both inter- and intraview levels. In this way, correlations on and between the feature spaces of different views tend to be well-exploited. In addition, a transformation matrix is learned for each view to discover the patterns contained in the original features, so that the different views are comparable in finding a common representation. The model can be not only performed in an unsupervised manner, but also easily extended to a semisupervised setting by incorporating some domain knowledge. An alternating algorithm is developed for problem optimization, and each subproblem can be efficiently solved. Experiments on two challenging video-based FER datasets demonstrate the effectiveness of the proposed framework.
Please use this identifier to cite or link to this item: