Multi-layer multi-view topic model for classifying advertising video

Publication Type:
Journal Article
Pattern Recognition, 2017, 68 pp. 66 - 81
Issue Date:
Filename Description Size
1-s2.0-S0031320317301036-main.pdfPublished Version2.9 MB
Adobe PDF
Full metadata record
© 2017 Elsevier Ltd The recent proliferation of advertising (ad) videos has driven the research in multiple applications, ranging from video analysis to video indexing and retrieval. Among them, classifying ad video is a key task because it allows automatic organization of videos according to categories or genres, and this further enables ad video indexing and retrieval. However, classifying ad video is challenging compared to other types of video classification because of its unconstrained content. While many studies focus on embedding ads relevant to videos, to our knowledge, few focus on ad video classification. In order to classify ad video, this paper proposes a novel ad video representation that aims to sufficiently capture the latent semantics of video content from multiple views in an unsupervised manner. In particular, we represent ad videos from four views, including bag-of-feature (BOF), vector of locally aggregated descriptors (VLAD), fisher vector (FV) and object bank (OB). We then devise a multi-layer multi-view topic model, mlmv_LDA, which models the topics of videos from different views. A topical representation for video, supporting category-related task, is finally achieved by the proposed method. Our empirical classification results on 10,111 real-world ad videos demonstrate that the proposed approach effectively differentiate ad videos.
Please use this identifier to cite or link to this item: