Triplet-Based Deep Hashing Network for Cross-Modal Retrieval

Publication Type:
Journal Article
IEEE Transactions on Image Processing, 2018, 27 (8), pp. 3893-3903
Abstract:
Owing to its low storage requirements and high retrieval efficiency, hashing has recently received increasing attention. In particular, cross-modal hashing has been widely and successfully applied to multimedia similarity search. However, most existing cross-modal hashing methods fail to learn powerful hash codes because they ignore the relative similarity between heterogeneous data, which carries richer semantic information; this leads to unsatisfactory retrieval performance. In this paper, we propose a triplet-based deep hashing (TDH) network for cross-modal retrieval. First, we use triplet labels, which describe the relative relationships among three instances, as supervision to capture more general semantic correlations between cross-modal instances. We then establish a loss function from both the inter-modal view and the intra-modal view to boost the discriminative ability of the hash codes. Finally, graph regularization is introduced into the proposed TDH method to preserve the original semantic similarity between hash codes in Hamming space. Experimental results show that our proposed method outperforms several state-of-the-art approaches on two popular cross-modal data sets.
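The abstract describes a triplet-supervised loss computed from inter-modal and intra-modal views over hash codes. The paper's exact formulation is not reproduced here; the following is a minimal, illustrative PyTorch sketch of a hinge-style triplet loss on relaxed (tanh) hash codes, where names such as `triplet_hash_loss`, `img_codes`, `txt_pos`, and `margin` are placeholders chosen for this example, not identifiers from the paper.

```python
# Illustrative sketch only: a triplet margin loss on relaxed hash codes,
# evaluated from the inter-modal view (image anchor, text positive/negative).
# The intra-modal view would reuse the same loss with all three instances
# drawn from one modality. Graph regularization is omitted for brevity.
import torch
import torch.nn.functional as F

def triplet_hash_loss(anchor, positive, negative, margin=1.0):
    """Pull the anchor closer to the positive than to the negative
    by at least `margin`, using squared Euclidean distance as a
    differentiable surrogate for Hamming distance on tanh outputs."""
    d_pos = (anchor - positive).pow(2).sum(dim=1)
    d_neg = (anchor - negative).pow(2).sum(dim=1)
    return F.relu(d_pos - d_neg + margin).mean()

# Toy usage: 8 triplets of 32-bit relaxed codes from two modalities.
torch.manual_seed(0)
img_codes = torch.tanh(torch.randn(8, 32, requires_grad=True))  # anchors (image modality)
txt_pos = torch.tanh(torch.randn(8, 32))                        # semantically similar text
txt_neg = torch.tanh(torch.randn(8, 32))                        # dissimilar text

inter_modal_loss = triplet_hash_loss(img_codes, txt_pos, txt_neg)
print(inter_modal_loss.item())
```

At retrieval time, the relaxed codes would be binarized (e.g., by taking the sign) so that similarity search can be carried out with fast Hamming-distance comparisons.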