Classifier-specific intermediate representation for multimedia tasks

Ma, Z; Yang, Y; Hauptmann, AG; Sebe, N

Publication Type: Conference Proceeding
Citation: Proceedings of the 2nd ACM International Conference on Multimedia Retrieval, ICMR 2012, 2012
Issue Date: 2012-07-27

Closed Access

Filename	Description	Size	Format
a50-ma.pdf	Published version	407.85 kB	Adobe PDF

This item is closed access and not available.

Full metadata record

Field	Value	Language
dc.contributor.author	Ma, Z	en_US
dc.contributor.author	Yang, Y https://orcid.org/0000-0001-5528-0546	en_US
dc.contributor.author	Hauptmann, AG	en_US
dc.contributor.author	Sebe, N	en_US
dc.date.issued	2012-07-27	en_US
dc.identifier.citation	Proceedings of the 2nd ACM International Conference on Multimedia Retrieval, ICMR 2012, 2012	en_US
dc.identifier.isbn	9781450313292	en_US
dc.identifier.uri	http://hdl.handle.net/10453/121230
dc.description.abstract	Video annotation and multimedia classification play important roles in many applications such as video indexing and retrieval. To improve video annotation and event detection, researchers have proposed using intermediate concept classifiers with concept lexica to help understand the videos. Yet it is difficult to judge how many and which concepts would be sufficient for a particular video analysis task. Additionally, obtaining robust semantic concept classifiers requires a large number of positive training examples, which in turn incurs a high human annotation cost. In this paper, we propose an approach that is able to automatically learn an intermediate representation from video features together with a classifier. The joint optimization of the two components makes them mutually beneficial and reciprocal. Effectively, the intermediate representation and the classifier are tightly correlated. The classifier-dependent intermediate representation not only accurately reflects the task semantics but is also more suitable for the specific classifier. Thus we have created a discriminative semantic analysis framework based on a tightly-coupled intermediate representation. Several experiments on video annotation and multimedia event detection using real-world videos demonstrate the effectiveness of the proposed approach. Copyright © 2012 ACM.	en_US
dc.relation.ispartof	Proceedings of the 2nd ACM International Conference on Multimedia Retrieval, ICMR 2012	en_US
dc.relation.isbasedon	10.1145/2324796.2324854	en_US
dc.title	Classifier-specific intermediate representation for multimedia tasks	en_US
dc.type	Conference Proceeding
utslib.for	0801 Artificial Intelligence and Image Processing	en_US
pubs.embargo.period	Not known	en_US
pubs.organisational-group	/University of Technology Sydney
pubs.organisational-group	/University of Technology Sydney/Faculty of Engineering and Information Technology
pubs.organisational-group	/University of Technology Sydney/Strength - CAI - Centre for Artificial Intelligence
utslib.copyright.status	closed_access
pubs.publication-status	Published	en_US

Please use this identifier to cite or link to this item:

http://hdl.handle.net/10453/121230