Classifier-specific intermediate representation for multimedia tasks

Publication Type: Conference Proceeding
Proceedings of the 2nd ACM International Conference on Multimedia Retrieval, ICMR 2012, 2012
Issue Date:
File: a50-ma.pdf (Published version, Adobe PDF, 407.85 kB)
Video annotation and multimedia classification play important roles in many applications such as video indexing and retrieval. To improve video annotation and event detection, researchers have proposed using intermediate concept classifiers with concept lexica to help understand the videos. Yet it is difficult to judge how many concepts, and which ones, would be sufficient for a particular video analysis task. Additionally, obtaining robust semantic concept classifiers requires a large number of positive training examples, which in turn incurs a high human annotation cost. In this paper, we propose an approach that automatically learns an intermediate representation from video features together with a classifier. The joint optimization of the two components makes them mutually beneficial, so that the intermediate representation and the classifier are tightly correlated. The classifier-dependent intermediate representation not only accurately reflects the task semantics but is also better suited to the specific classifier. Thus we have created a discriminative semantic analysis framework based on a tightly coupled intermediate representation. Several experiments on video annotation and multimedia event detection using real-world videos demonstrate the effectiveness of the proposed approach. Copyright © 2012 ACM.
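The general idea of learning an intermediate representation jointly with a classifier can be illustrated with a minimal sketch. This is not the paper's formulation, which the abstract does not specify; it assumes a tanh projection followed by a logistic classifier, trained together by gradient descent on synthetic data. All names (`train_joint`, `predict`, `P`, `w`, `k`) and the data are hypothetical.

```python
import numpy as np

def train_joint(X, y, k=4, lr=0.5, epochs=300, seed=0):
    """Jointly fit a projection P (features -> k-dim intermediate
    representation) and a logistic classifier (w, b) on top of it.
    Illustrative stand-in only, not the method from the paper."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    P = rng.normal(scale=0.1, size=(d, k))  # projection to intermediate representation
    w = rng.normal(scale=0.1, size=k)       # classifier weights
    b = 0.0                                 # classifier bias
    losses = []
    for _ in range(epochs):
        Z = np.tanh(X @ P)                          # intermediate representation
        p = 1.0 / (1.0 + np.exp(-(Z @ w + b)))      # predicted probabilities
        eps = 1e-12                                 # avoids log(0)
        losses.append(-np.mean(y * np.log(p + eps) + (1 - y) * np.log(1 - p + eps)))
        g = (p - y) / n                             # gradient of the loss w.r.t. the logits
        grad_w = Z.T @ g
        grad_b = g.sum()
        # The classifier's gradient flows back into the projection, so the
        # representation is shaped by the classifier it feeds.
        grad_P = X.T @ (np.outer(g, w) * (1.0 - Z ** 2))
        w -= lr * grad_w
        b -= lr * grad_b
        P -= lr * grad_P
    return P, w, b, losses

def predict(X, P, w, b):
    """Label 1 where the classifier's logit on the representation is positive."""
    return (np.tanh(X @ P) @ w + b > 0).astype(int)

# Synthetic stand-in for video features: 200 samples, 10 dimensions.
data_rng = np.random.default_rng(1)
X = data_rng.normal(size=(200, 10))
y = (X[:, 0] + X[:, 1] > 0).astype(float)

P, w, b, losses = train_joint(X, y)
train_accuracy = float(np.mean(predict(X, P, w, b) == y))
```

Because both `P` and `w` are updated from the same loss, neither is fixed in advance: the representation adapts to the classifier and the classifier to the representation, which is the coupling the abstract describes in general terms.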