How unlabeled web videos help complex event detection?

Publisher:
IJCAI-INT JOINT CONF ARTIF INTELL
Publication Type:
Conference Proceeding
Citation:
IJCAI International Joint Conference on Artificial Intelligence, 2017, 0, pp. 4040-4046
Issue Date:
2017-01-01
Abstract:
The lack of labeled exemplars is an important factor that makes the task of multimedia event detection (MED) complicated and challenging. Utilizing manually selected and labeled external sources is an effective way to enhance MED performance. However, building such data usually requires professional human annotators, and the procedure is too time-consuming and costly to scale. In this paper, we propose a new robust dictionary learning framework for complex event detection that handles both labeled videos and easy-to-obtain unlabeled web videos by sharing the same dictionary. By employing an lq-norm based loss jointly with structured sparsity based regularization, our model shows strong robustness against the substantial noise and outlier videos found in open web sources. We develop an effective optimization algorithm for the resulting highly non-smooth and non-convex problem. Extensive experimental results on the standard TRECVID MEDTest 2013 and TRECVID MEDTest 2014 datasets demonstrate the effectiveness and superiority of the proposed framework for complex event detection.
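The general technique described in the abstract — fitting one shared dictionary to labeled and unlabeled video features under a robust lq-norm loss with an l2,1 structured sparsity regularizer — can be sketched with an iteratively reweighted alternating least-squares scheme. This is a minimal illustrative sketch, not the paper's exact algorithm: the function name, the IRLS-style solver, and all parameter values are assumptions introduced here for illustration.

```python
# Illustrative sketch (NOT the authors' exact method): robust dictionary
# learning, min_{D,A} sum_i ||x_i - D a_i||_2^q + lam * ||A||_{2,1},
# approximated by iteratively reweighted alternating least squares (IRLS).
import numpy as np

def robust_dictionary_learning(X, n_atoms=8, q=0.5, lam=0.01,
                               n_iters=40, eps=1e-6, seed=0):
    """X: (d, n) feature matrix pooling labeled and unlabeled videos.
    Returns a dictionary D of shape (d, n_atoms) and codes A of (n_atoms, n)."""
    rng = np.random.default_rng(seed)
    d, n = X.shape
    D = rng.standard_normal((d, n_atoms))
    D /= np.linalg.norm(D, axis=0, keepdims=True)   # unit-norm atoms
    A = rng.standard_normal((n_atoms, n)) * 0.01
    for _ in range(n_iters):
        R = X - D @ A
        # lq loss -> per-sample IRLS weight w_i ~ (||r_i||^2 + eps)^(q/2 - 1):
        # large residuals (noisy/outlier videos) get small weight.
        w = (q / 2.0) * (np.sum(R**2, axis=0) + eps) ** (q / 2.0 - 1.0)
        # l2,1 regularizer -> per-row reweighting of the codes.
        g = lam / (2.0 * (np.sqrt(np.sum(A**2, axis=1)) + eps))
        # Code update, column by column:
        # (w_i D^T D + diag(g)) a_i = w_i D^T x_i
        DtD, DtX = D.T @ D, D.T @ X
        for i in range(n):
            A[:, i] = np.linalg.solve(w[i] * DtD + np.diag(g),
                                      w[i] * DtX[:, i])
        # Dictionary update: weighted least squares with the same weights.
        W = A * w                                       # A diag(w)
        G = W @ A.T + 1e-8 * np.eye(n_atoms)
        D = (X @ W.T) @ np.linalg.inv(G)
        # Renormalize atoms and rescale codes so D @ A is unchanged.
        norms = np.linalg.norm(D, axis=0, keepdims=True)
        norms[norms < eps] = 1.0
        D /= norms
        A *= norms.T
    return D, A
```

Because the per-sample weights shrink as residual norms grow, heavily corrupted web videos contribute little to either update, which is the intuition behind using a non-smooth lq loss (q < 1) for robustness.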