Multiple features but few labels? A symbiotic solution exemplified for video analysis
- Publication Type:
- Conference Proceeding
- Citation:
- MM 2014 - Proceedings of the 2014 ACM Conference on Multimedia, 2014, pp. 77 - 86
- Issue Date:
- 2014-01-01
Closed Access
Filename | Description | Size | |||
---|---|---|---|---|---|
1bf8ce40212336c41db66ebbd1ec2d9d6deb.pdf | Published version | 311.41 kB |
Copyright Clearance Process
- Recently Added
- In Progress
- Closed Access
This item is closed access and not available.
Video analysis has been attracting increasing research due to the proliferation of internet videos. In this paper, we investigate how to improve the performance on internet quality video analysis. Particularly, we work on the scenario of few labeled training videos being provided, which is less focused in multimedia. To being with, we consider how to more effectively harness the evidences from the low-level features. Researchers have developed several promising features to represent videos to capture the semantic information. However, as videos usually characterize rich semantic contents, the analysis performance by using one single feature is potentially limited. Simply combining multiple features through early fusion or late fusion to incorporate more informative cues is doable but not optimal due to the heterogeneity and different predicting capability of these features. For better exploitation of multiple features, we propose to mine the importance of different features and cast it into the learning of the classification model. Our method is based on multiple graphs from different features and uses the Riemannian metric to evaluate the feature importance. On the other hand, to be able to use limited labeled training videos for a respectable accuracy we formulate our method in a semi-supervised way. The main contribution of this paper is a novel scheme of evaluating the feature importance that is further casted into a unified framework of harnessing multiple weighted features with limited labeled training videos. We perform extensive experiments on video action recognition and multimedia event recognition and the comparison to other state-of-the-art multi-feature learning algorithms has validated the efficacy of our framework.
Please use this identifier to cite or link to this item: