Revealing Event Saliency in Unconstrained Video Collection

Publisher:
Institute of Electrical and Electronics Engineers
Publication Type:
Journal Article
Citation:
IEEE Transactions on Image Processing, vol. 26, no. 4, pp. 1746-1758, Apr. 2017
Issue Date:
2017-04
Abstract:
Recent progress in multimedia event detection has enabled us to find videos about a predefined event in a large-scale video collection. Research toward more intrinsic, unsupervised video understanding, however, remains understudied. Specifically, given a collection of videos sharing a common event of interest, the goal is to discover from each video the salient fragments, i.e., the brief video fragments that concisely portray the underlying event of interest. To explore this novel direction, this paper proposes an unsupervised event saliency revealing framework. It first extracts features from multiple modalities to represent each shot in the given video collection. These shots are then clustered, and a cluster-level saliency model exploits three information cues (i.e., the intra-cluster prior, inter-cluster discriminability, and inter-cluster smoothness) through a concise optimization model. Compared with existing methods, our approach can highlight the intrinsic stimulus of an unseen event within a video in an unsupervised fashion, and can therefore benefit a wide range of multimedia tasks such as video browsing, understanding, and search. To quantitatively verify the proposed method, we systematically compare it against a number of baseline methods on the TRECVID benchmarks. Experimental results demonstrate its effectiveness and efficiency.
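To make the cluster-level idea concrete, the sketch below illustrates one plausible reading of the pipeline in the abstract: shots (represented by precomputed multimodal feature vectors) are clustered, and each shot is scored using a rough stand-in for the intra-cluster prior (closeness to its own centroid) and inter-cluster discriminability (distance to the other centroids). This is not the authors' code or their optimization model; the function name `shot_saliency`, the use of k-means, and the scoring heuristics are illustrative assumptions, and the inter-cluster smoothness cue (which requires shot ordering) is omitted.

```python
# Minimal sketch of a cluster-level shot-saliency scorer (illustrative only;
# NOT the paper's optimization model). Assumes each shot is already
# represented by a fused multimodal feature vector.

import numpy as np
from sklearn.cluster import KMeans

def shot_saliency(features: np.ndarray, n_clusters: int = 8) -> np.ndarray:
    """Score each shot; higher scores suggest more event-salient shots.

    features: (n_shots, dim) array of per-shot multimodal descriptors.
    """
    km = KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit(features)
    centroids = km.cluster_centers_  # (k, dim)
    labels = km.labels_              # (n_shots,)

    # Distance from every shot to every centroid: shape (n_shots, k).
    dists = np.linalg.norm(features[:, None, :] - centroids[None, :, :], axis=2)

    # Intra-cluster prior (stand-in): shots near their own centroid are
    # typical of recurring, hence event-relevant, content in the collection.
    intra = -dists[np.arange(len(labels)), labels]

    # Inter-cluster discriminability (stand-in): shots far from the *other*
    # centroids are distinctive rather than generic.
    mask = np.ones_like(dists, dtype=bool)
    mask[np.arange(len(labels)), labels] = False
    inter = dists[mask].reshape(len(labels), -1).mean(axis=1)

    score = intra + inter
    # Normalize to [0, 1] so scores are easy to threshold.
    return (score - score.min()) / (score.max() - score.min() + 1e-12)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    fake_shots = rng.normal(size=(200, 64))  # stand-in for real shot features
    print(shot_saliency(fake_shots)[:10])
```

In practice, a temporal smoothness term over adjacent shots within each video would be added to this objective, as the abstract's third cue suggests.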