Dependency exploitation: A unified CNN-RNN approach for visual emotion recognition

Publication Type:
Conference Proceeding
Citation:
IJCAI International Joint Conference on Artificial Intelligence, 2017, pp. 3595 - 3601
Issue Date:
2017-01-01
Files in This Item:
Filename | Description | Size | Format
0503.pdf | Published version | 8.69 MB | Adobe PDF
Abstract:
Visual emotion recognition aims to associate images with appropriate emotions. The visual stimuli that affect human emotion range from low-level to high-level, such as color, texture, parts, and objects. However, most existing methods treat features at different levels as independent entities and lack an effective method for fusing them. In this paper, we propose a unified CNN-RNN model that predicts emotion from features fused across levels by exploiting the dependencies among them. Our proposed architecture uses a convolutional neural network (CNN) with multiple layers to extract features at different levels within a multi-task learning framework, in which two related loss functions are introduced to learn the feature representation. To account for the dependencies among the low-level and high-level features, a bidirectional recurrent neural network (RNN) integrates the features learned at different layers of the CNN. Extensive experiments on both Internet images and art photo datasets demonstrate that our method outperforms state-of-the-art methods, with a performance improvement of at least 7%.
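The fusion step described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes each CNN level has already been pooled and projected to a vector of the same size, uses a plain tanh recurrent cell, averages the bidirectional states before a softmax classifier, and omits the multi-task losses entirely. All names, dimensions, and the number of emotion classes are hypothetical, and the weights are random rather than trained.

```python
import math
import random


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def rand_matrix(rows, cols, rng):
    """Small random weights; a trained model would learn these."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def rnn_pass(sequence, w_in, w_rec):
    """Run a plain tanh RNN over the sequence and return every hidden state."""
    hidden = [0.0] * len(w_rec)
    states = []
    for x in sequence:
        pre = [a + b for a, b in zip(matvec(w_in, x), matvec(w_rec, hidden))]
        hidden = [math.tanh(p) for p in pre]
        states.append(hidden)
    return states


def softmax(scores):
    """Numerically stable softmax."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def init_params(feat_dim, hidden_dim, n_classes, seed=0):
    """Hypothetical parameter layout for the sketch."""
    rng = random.Random(seed)
    return {
        "w_in_f": rand_matrix(hidden_dim, feat_dim, rng),
        "w_rec_f": rand_matrix(hidden_dim, hidden_dim, rng),
        "w_in_b": rand_matrix(hidden_dim, feat_dim, rng),
        "w_rec_b": rand_matrix(hidden_dim, hidden_dim, rng),
        "w_out": rand_matrix(n_classes, 2 * hidden_dim, rng),
    }


def fuse_and_classify(level_features, params):
    """Fuse per-level features with a bidirectional RNN, then classify."""
    # Forward pass reads the levels from low to high.
    fwd = rnn_pass(level_features, params["w_in_f"], params["w_rec_f"])
    # Backward pass reads from high to low; re-reverse to align with levels.
    bwd = rnn_pass(level_features[::-1], params["w_in_b"], params["w_rec_b"])[::-1]
    # Concatenate both directions at each level.
    steps = [f + b for f, b in zip(fwd, bwd)]
    # Assumption: average over levels to obtain one fused vector.
    fused = [sum(column) / len(steps) for column in zip(*steps)]
    return softmax(matvec(params["w_out"], fused))


# Assumed sizes, chosen only for illustration.
N_LEVELS, FEAT_DIM, HIDDEN_DIM, N_CLASSES = 4, 16, 8, 8

# Stand-in for features taken from four CNN layers (low-level to high-level).
feature_rng = random.Random(1)
level_features = [
    [feature_rng.random() for _ in range(FEAT_DIM)] for _ in range(N_LEVELS)
]

params = init_params(FEAT_DIM, HIDDEN_DIM, N_CLASSES)
probs = fuse_and_classify(level_features, params)
print(len(probs))
```

Because the levels are treated as a sequence, the forward direction lets high-level states depend on low-level ones and the backward direction does the reverse, which is one way to read "exploiting the dependency among them".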