V-JAUNE: A framework for joint action recognition and video summarization

Publication Type:
Journal Article
Citation:
ACM Transactions on Multimedia Computing, Communications and Applications, 2017, 13 (2)
Issue Date:
2017-04-01
Filename: revision_v1.pdf
Description: Accepted Manuscript Version
Size: 13.33 MB
Format: Adobe PDF
© 2017 ACM. Video summarization and action recognition are two important areas of multimedia video analysis. While these two areas have been tackled separately to date, in this article we present a latent structural SVM framework that recognizes the action and derives the summary of a video jointly. Efficient inference is provided by a submodular score function that accounts for the action and summary jointly. We also define a novel measure to evaluate the quality of a predicted video summary against the annotations of multiple annotators. Quantitative and qualitative results over two challenging action datasets, ACE and MSR DailyActivity3D, show that the proposed joint approach leads to higher action recognition accuracy and equivalent or better summary quality than comparable approaches that perform these tasks separately.
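The abstract attributes efficient inference to a submodular score function. As a rough illustration of why submodularity helps, the sketch below greedily selects summary frames by marginal gain. It is not the article's score function: it assumes a generic facility-location objective over a made-up frame-similarity matrix, and the names `facility_location`, `greedy_summary`, and `toy_sim` are hypothetical.

```python
from typing import List, Sequence


def facility_location(sim: Sequence[Sequence[float]], selected: List[int]) -> float:
    """Assumed stand-in score: each frame is credited with its best
    similarity to any selected frame (monotone submodular for sim >= 0)."""
    if not selected:
        return 0.0
    return sum(max(row[j] for j in selected) for row in sim)


def greedy_summary(sim: Sequence[Sequence[float]], budget: int) -> List[int]:
    """Pick up to `budget` frame indices, adding the frame with the
    largest positive marginal gain at each step."""
    selected: List[int] = []
    candidates = set(range(len(sim)))
    current = 0.0
    while candidates and len(selected) < budget:
        best_gain, best_j = 0.0, None
        for j in sorted(candidates):
            gain = facility_location(sim, selected + [j]) - current
            if gain > best_gain:
                best_gain, best_j = gain, j
        if best_j is None:  # no candidate improves the score
            break
        selected.append(best_j)
        candidates.remove(best_j)
        current += best_gain
    return selected


# Hypothetical similarities for 4 frames: 0 and 1 are alike, 2 and 3 are alike.
toy_sim = [
    [1.0, 0.9, 0.1, 0.0],
    [0.9, 1.0, 0.2, 0.1],
    [0.1, 0.2, 1.0, 0.8],
    [0.0, 0.1, 0.8, 1.0],
]
summary = greedy_summary(toy_sim, budget=2)
```

With a budget of two, the greedy pass takes one frame from each group of similar frames, since a second frame from an already covered group adds little to the score. For monotone submodular objectives under a cardinality budget, this kind of greedy selection carries a known approximation guarantee, which is the usual reason such score functions are described as efficient to optimize.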