Effective transfer tagging from image to video

Publication Type: Journal Article
ACM Transactions on Multimedia Computing, Communications and Applications, 2013, 9 (2)
Recent years have witnessed a great explosion of user-generated videos on the Web. To achieve effective and efficient video search, it is critical for modern video search engines to associate videos with semantic keywords automatically. Most existing video tagging methods can hardly achieve reliable performance due to a deficiency of training data. However, abundant well-tagged data are available in other relevant types of media (e.g., images). In this article, we propose a novel video tagging framework, termed Cross-Media Tag Transfer (CMTT), which utilizes the abundance of well-tagged images to facilitate video tagging. Specifically, we build a cross-media tunnel to transfer knowledge from images to videos. To this end, an optimal kernel space, in which the distribution distance between images and videos is minimized, is found to tackle the domain-shift problem. A novel cross-media video tagging model is proposed to infer tags by exploring the intrinsic local structures of both labeled and unlabeled data, and to learn reliable video classifiers. An efficient algorithm is designed to optimize the proposed model in an iterative, alternating way. Extensive experiments illustrate the superiority of our proposal over state-of-the-art algorithms. © 2013 ACM.
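The distribution matching described in the abstract (finding a kernel space in which the distance between the image and video distributions is minimized) is commonly measured with a criterion such as Maximum Mean Discrepancy (MMD). The sketch below illustrates the idea under simplifying assumptions: an RBF kernel with a fixed bandwidth stands in for the paper's learned optimal kernel, and the `gamma` parameter and synthetic feature arrays are illustrative, not from the article.

```python
import numpy as np

def rbf_kernel(X, Y, gamma=1.0):
    """RBF (Gaussian) kernel matrix between rows of X and rows of Y."""
    # Pairwise squared Euclidean distances via the expansion ||x-y||^2
    sq_dists = (np.sum(X**2, axis=1)[:, None]
                + np.sum(Y**2, axis=1)[None, :]
                - 2.0 * X @ Y.T)
    return np.exp(-gamma * np.maximum(sq_dists, 0.0))

def mmd2(X, Y, gamma=1.0):
    """Squared MMD between two samples, e.g. image features X and
    video features Y. Smaller values mean better-aligned distributions."""
    k_xx = rbf_kernel(X, X, gamma).mean()
    k_yy = rbf_kernel(Y, Y, gamma).mean()
    k_xy = rbf_kernel(X, Y, gamma).mean()
    return k_xx + k_yy - 2.0 * k_xy

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    images = rng.normal(0.0, 1.0, size=(200, 5))   # hypothetical image features
    videos_close = rng.normal(0.0, 1.0, size=(200, 5))   # well-aligned domain
    videos_shifted = rng.normal(2.0, 1.0, size=(200, 5)) # domain-shifted features
    # A shifted domain yields a noticeably larger MMD than an aligned one.
    print(mmd2(images, videos_close), mmd2(images, videos_shifted))
```

In the full framework such a discrepancy term would be minimized jointly with the tagging objective (e.g., by learning kernel parameters or feature projections), rather than merely measured as done here.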