Web-based semantic fragment discovery for online lingual-visual similarity

Sun, X; Cao, J; Li, C; Zhu, L; Shen, HT

Publication Type: Conference Proceeding
Citation: 31st AAAI Conference on Artificial Intelligence, AAAI 2017, 2017, pp. 182-188
Issue Date: 2017-01-01

Closed Access

Filename	Description	Size	Format
Web-based semantic fragment discovery for online lingual-visual similarity.pdf	Published version	3.72 MB	Adobe PDF

This item is closed access and not available.

Full metadata record

Field	Value	Language
dc.contributor.author	Sun, X	en_US
dc.contributor.author	Cao, J	en_US
dc.contributor.author	Li, C	en_US
dc.contributor.author	Zhu, L	en_US
dc.contributor.author	Shen, HT	en_US
dc.date.issued	2017-01-01	en_US
dc.identifier.citation	31st AAAI Conference on Artificial Intelligence, AAAI 2017, 2017, pp. 182 - 188	en_US
dc.identifier.uri	http://hdl.handle.net/10453/125901
dc.description.abstract	Copyright © 2017, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved. In this paper, we present an automatic approach for online discovery of visual-lingual semantic fragments from weakly labeled Internet images. Instead of learning region-entity correspondences from well-labeled image-sentence pairs, our approach directly collects and enhances weakly labeled visual content from the Web and constructs an adaptive visual representation that automatically links generic lingual phrases to their related visual content. To ensure reliable and efficient semantic discovery, we adopt non-parametric density estimation to re-rank the related visual instances and propose a fast self-similarity-based quality assessment method to identify high-quality semantic fragments. The discovered semantic fragments provide an adaptive joint representation for texts and images, based on which lingual-visual similarity can be defined for further co-analysis of heterogeneous multimedia data. Experimental results on semantic fragment quality assessment, sentence-based image retrieval, and automatic multimedia insertion and ordering demonstrate the effectiveness of the proposed framework. The experiments show that the proposed methods make effective use of Web knowledge and generate competitive results compared to state-of-the-art approaches in various tasks.	en_US
dc.relation.ispartof	31st AAAI Conference on Artificial Intelligence, AAAI 2017	en_US
dc.title	Web-based semantic fragment discovery for online lingual-visual similarity	en_US
dc.type	Conference Proceeding
utslib.for	0801 Artificial Intelligence and Image Processing	en_US
pubs.embargo.period	Not known	en_US
pubs.organisational-group	/University of Technology Sydney
pubs.organisational-group	/University of Technology Sydney/Faculty of Engineering and Information Technology
pubs.organisational-group	/University of Technology Sydney/Faculty of Engineering and Information Technology/School of Software
utslib.copyright.status	closed_access
pubs.publication-status	Published	en_US

Abstract:

Copyright © 2017, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved. In this paper, we present an automatic approach for online discovery of visual-lingual semantic fragments from weakly labeled Internet images. Instead of learning region-entity correspondences from well-labeled image-sentence pairs, our approach directly collects and enhances weakly labeled visual content from the Web and constructs an adaptive visual representation that automatically links generic lingual phrases to their related visual content. To ensure reliable and efficient semantic discovery, we adopt non-parametric density estimation to re-rank the related visual instances and propose a fast self-similarity-based quality assessment method to identify high-quality semantic fragments. The discovered semantic fragments provide an adaptive joint representation for texts and images, based on which lingual-visual similarity can be defined for further co-analysis of heterogeneous multimedia data. Experimental results on semantic fragment quality assessment, sentence-based image retrieval, and automatic multimedia insertion and ordering demonstrate the effectiveness of the proposed framework. The experiments show that the proposed methods make effective use of Web knowledge and generate competitive results compared to state-of-the-art approaches in various tasks.
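
The abstract names two steps without giving their details: re-ranking visual instances by non-parametric density estimation, and scoring a fragment by self-similarity. The following is only an illustrative sketch of those two ideas, not the authors' method. It assumes each image is already represented by a non-zero feature vector, swaps in a plain Gaussian kernel density estimate with a hand-picked bandwidth, and uses mean pairwise cosine similarity as a stand-in quality score. The function names `kde_rerank` and `self_similarity` are hypothetical.

```python
import numpy as np


def kde_rerank(features, bandwidth=1.0):
    """Rank instances by a Gaussian kernel density estimate (densest first).

    Assumption: `features` is an (n, d) array of image descriptors and
    `bandwidth` is chosen by hand; the paper's estimator may differ.
    """
    x = np.asarray(features, dtype=float)
    # Squared Euclidean distance between every pair of instances.
    sq_dist = np.sum((x[:, None, :] - x[None, :, :]) ** 2, axis=-1)
    # Unnormalised density at each instance: average kernel response.
    density = np.exp(-sq_dist / (2.0 * bandwidth ** 2)).mean(axis=1)
    order = np.argsort(-density, kind="stable")
    return order, density


def self_similarity(features):
    """Mean pairwise cosine similarity, excluding self-pairs.

    Assumption: at least two instances, none with zero norm. A higher
    value is read here as a more visually coherent fragment.
    """
    x = np.asarray(features, dtype=float)
    unit = x / np.linalg.norm(x, axis=1, keepdims=True)
    sim = unit @ unit.T
    n = len(x)
    return (sim.sum() - np.trace(sim)) / (n * (n - 1))
```

Under these assumptions, an instance far from the main cluster of Web images for a phrase receives a low density and falls to the end of the ranking, and a phrase whose retrieved images disagree with one another receives a low self-similarity score.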

Please use this identifier to cite or link to this item:

http://hdl.handle.net/10453/125901