Hierarchical Latent Concept Discovery for Video Event Detection

Publication Type: Journal Article
IEEE Transactions on Image Processing, 2017, 26 (5), pp. 2149 - 2162
Issue Date:
Filename: 07858791.pdf
Description: Published Version
Size: 4.03 MB
Format: Adobe PDF
© 1992-2012 IEEE. Semantic information is important for video event detection, but automatically discovering, modeling, and exploiting it remains a challenging problem. In this paper, we propose a novel hierarchical video event detection model that unifies the processes of underlying semantics discovery and event modeling from video data. Specifically, unlike most approaches, which rely on manually pre-defined concepts, we devise a model that automatically uncovers video semantics by hierarchically capturing latent static-visual concepts at the frame level and latent activity concepts (i.e., temporal sequence relationships of static-visual concepts) at the segment level. The unified model not only enables a discriminative and descriptive representation of videos, but also alleviates the error propagation from video representation to event modeling that affects previous methods. The model is learned within a max-margin framework. Extensive experiments on four challenging video event datasets (MED11, CCV, UQE50, and FCVID) demonstrate the effectiveness of the proposed method.