Transfer tagging from image to video

Publication Type:
Conference Proceeding
Citation:
MM'11 - Proceedings of the 2011 ACM Multimedia Conference and Co-Located Workshops, 2011, pp. 1137 - 1140
Issue Date:
2011-12-29
Filename:
p1137-yang.pdf
Description:
Published version
Size:
777.48 kB (Adobe PDF)
A massive amount of web video data has emerged on the Internet. To achieve effective and efficient video retrieval, it is critical to automatically assign semantic keywords to videos via content analysis. However, most existing video tagging methods lack sufficient tagged training videos because manual tagging is labor-intensive. Inspired by the observation that far more well-labeled data is available in other, yet relevant, types of media (e.g., images), in this paper we study how to build a "cross-media tunnel" to transfer external tag knowledge from image to video. Meanwhile, the intrinsic data structures of both the image and video spaces are explored for inferring tags. We propose a Cross-Media Tag Transfer (CMTT) paradigm which is able to: 1) transfer tag knowledge between image and video by minimizing their distribution difference; 2) infer tags by revealing the underlying manifold structures embedded within both the image and video spaces. We also learn an explicit mapping function to handle unseen videos. Experimental results are reported and analyzed to illustrate the superiority of our proposal. Copyright 2011 ACM.
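The two ideas named in the abstract, reducing the distribution difference between image and video features and inferring tags from manifold structure, can be illustrated with a toy sketch. This is not the CMTT algorithm, whose formulation the abstract does not give. As stand-ins, the sketch assumes a mean-matching step (per-domain centering, scored by the squared distance between domain means) and a simple graph-based label propagation. All function names, the toy 2-D features, and the `sigma` and `iters` parameters are hypothetical.

```python
import math

def mean_vector(rows):
    """Column-wise mean of a list of equal-length feature vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def mean_discrepancy(images, videos):
    """Squared distance between the two domain means (assumed stand-in
    for the 'distribution difference' mentioned in the abstract)."""
    mi, mv = mean_vector(images), mean_vector(videos)
    return sum((a - b) ** 2 for a, b in zip(mi, mv))

def center(rows):
    """Subtract the domain mean so both domains share a zero mean."""
    m = mean_vector(rows)
    return [[x - mu for x, mu in zip(r, m)] for r in rows]

def affinity(points, sigma=1.0):
    """Gaussian affinity matrix with a zero diagonal."""
    n = len(points)
    w = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                d2 = sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
                w[i][j] = math.exp(-d2 / (2 * sigma ** 2))
    return w

def propagate(points, labels, iters=50, sigma=1.0):
    """Spread tag scores over the affinity graph.
    labels[i] is a float for tagged samples and None for untagged ones;
    tagged samples stay clamped to their given score."""
    n = len(points)
    w = affinity(points, sigma)
    scores = [l if l is not None else 0.0 for l in labels]
    for _ in range(iters):
        updated = []
        for i, l in enumerate(labels):
            if l is not None:
                updated.append(l)
            else:
                total = sum(w[i])
                avg = sum(w[i][j] * scores[j] for j in range(n)) / total if total else 0.0
                updated.append(avg)
        scores = updated
    return scores

# Toy data: tagged image features and untagged video features that live
# in a shifted region of the same 2-D feature space.
images = [[0.0, 0.0], [0.2, 0.1], [3.0, 3.0], [3.1, 2.9]]
image_tags = [1.0, 1.0, 0.0, 0.0]  # 1.0 = tag applies, 0.0 = it does not
videos = [[5.1, 5.0], [8.0, 8.1]]

gap_before = mean_discrepancy(images, videos)
images_c, videos_c = center(images), center(videos)
gap_after = mean_discrepancy(images_c, videos_c)

scores = propagate(images_c + videos_c, image_tags + [None] * len(videos))
video_scores = scores[len(images):]
```

In this toy setting, centering removes the offset between the two domains, after which each video inherits a tag score from the tagged images closest to it on the graph. The explicit mapping function for unseen videos mentioned in the abstract is not covered here.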
Please use this identifier to cite or link to this item: