Top-K nearest keyword search on large graphs

Qiao, M; Qin, L; Cheng, H; Yu, JX; Tian, W

Publication Type: Conference Proceeding
Citation: Proceedings of the VLDB Endowment, 2013, 6 (10), pp. 901-912
Issue Date: 2013-01-01

Closed Access

This item is closed access and not available.

Full metadata record

Field	Value	Language
dc.contributor.author	Qiao, M	en_US
dc.contributor.author	Qin, L https://orcid.org/0000-0001-6068-5062	en_US
dc.contributor.author	Cheng, H	en_US
dc.contributor.author	Yu, JX https://orcid.org/0000-0002-9738-827X	en_US
dc.contributor.author	Tian, W	en_US
dc.date.issued	2013-01-01	en_US
dc.identifier.citation	Proceedings of the VLDB Endowment, 2013, 6 (10), pp. 901 - 912	en_US
dc.identifier.uri	http://hdl.handle.net/10453/28936
dc.description.abstract	It is quite common for networks emerging nowadays to have labels or textual contents on the nodes. On such networks, we study the problem of top-k nearest keyword (k-NK) search. In a network G modeled as an undirected graph, each node is attached with zero or more keywords, and each edge is assigned with a weight measuring its length. Given a query node q in G and a keyword λ, a k-NK query seeks k nodes which contain λ and are nearest to q. k-NK is not only useful as a stand-alone query but also as a building block for tackling complex graph pattern matching problems. The key to an accurate k-NK result is a precise shortest distance estimation in a graph. Based on the latest distance oracle technique, we build a shortest path tree for a distance oracle and use the tree distance as a more accurate estimation. With such representation, the original k-NK query on a graph can be reduced to answering the query on a set of trees and then assembling the results obtained from the trees. We propose two efficient algorithms to report the exact k-NK result on a tree. One is query time optimized for a scenario when a small number of result nodes are of interest to users. The other handles k-NK queries for an arbitrarily large k efficiently. In obtaining a k-NK result on a graph from that on trees, a global storage technique is proposed to further reduce the index size and the query time. Extensive experimental results conform with our theoretical findings, and demonstrate the effectiveness and efficiency of our k-NK algorithms on large real graphs. © 2013 VLDB Endowment.	en_US
dc.relation.ispartof	Proceedings of the VLDB Endowment	en_US
dc.relation.isbasedon	10.14778/2536206.2536217	en_US
dc.title	Top-K nearest keyword search on large graphs	en_US
dc.type	Conference Proceeding
utslib.citation.volume	10	en_US
utslib.citation.volume	6	en_US
utslib.for	0806 Information Systems	en_US
utslib.for	0802 Computation Theory and Mathematics	en_US
utslib.for	0807 Library and Information Studies	en_US
dc.location.activity	Trento, Italy	en_US
pubs.embargo.period	Not known	en_US
pubs.organisational-group	/University of Technology Sydney
pubs.organisational-group	/University of Technology Sydney/Faculty of Engineering and Information Technology
pubs.organisational-group	/University of Technology Sydney/Strength - CAI - Centre for Artificial Intelligence
utslib.copyright.status	closed_access
pubs.issue	10	en_US
pubs.publication-status	Published	en_US
pubs.volume	6	en_US
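
As a point of reference for the k-NK query defined in the abstract above (dc.description.abstract), the following is a minimal baseline sketch: a plain Dijkstra expansion from the query node q that stops once k nodes carrying the keyword have been settled. It is not the paper's method, which relies on distance oracles, shortest path trees and a global storage technique that the abstract does not specify in detail. The names `build_undirected` and `k_nearest_keyword`, the adjacency-list layout, the sample graph, and the choice to report q itself when it carries the keyword are all assumptions made for illustration.

```python
import heapq
from typing import Dict, Hashable, Iterable, List, Set, Tuple

Node = Hashable
Graph = Dict[Node, List[Tuple[Node, float]]]


def build_undirected(edges: Iterable[Tuple[Node, Node, float]]) -> Graph:
    """Build an adjacency list for an undirected, weighted graph."""
    graph: Graph = {}
    for u, v, w in edges:
        graph.setdefault(u, []).append((v, w))
        graph.setdefault(v, []).append((u, w))
    return graph


def k_nearest_keyword(
    graph: Graph,
    keywords: Dict[Node, Set[str]],
    q: Node,
    keyword: str,
    k: int,
) -> List[Tuple[Node, float]]:
    """Return up to k (node, distance) pairs, nearest first, for nodes
    that carry `keyword`. Baseline only: no index, one Dijkstra per query.
    Assumption: q itself is reported (distance 0) if it carries the keyword.
    """
    if k <= 0:
        return []
    best = {q: 0.0}                # best known tentative distances
    heap = [(0.0, 0, q)]           # (distance, tie-breaker, node)
    counter = 1                    # tie-breaker avoids comparing node objects
    settled: Set[Node] = set()
    result: List[Tuple[Node, float]] = []
    while heap and len(result) < k:
        d, _, u = heapq.heappop(heap)
        if u in settled:
            continue               # stale heap entry
        settled.add(u)             # d is now the exact shortest distance to u
        if keyword in keywords.get(u, ()):
            result.append((u, d))
        for v, w in graph.get(u, ()):
            nd = d + w
            if nd < best.get(v, float("inf")):
                best[v] = nd
                heapq.heappush(heap, (nd, counter, v))
                counter += 1
    return result


# Hypothetical sample data, for illustration only.
sample_graph = build_undirected([
    ("a", "b", 1), ("a", "c", 4), ("b", "c", 2), ("c", "d", 1), ("b", "e", 7),
])
sample_keywords = {"b": {"bar"}, "c": {"cafe"}, "d": {"cafe"}, "e": {"cafe"}}
top2 = k_nearest_keyword(sample_graph, sample_keywords, "a", "cafe", 2)
```

Because Dijkstra settles nodes in non-decreasing distance order, the first k settled nodes that carry the keyword form an exact k-NK answer. The cost is that each query may explore a large part of the graph, which is the overhead that index-based approaches such as the one described in the abstract aim to avoid.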

Please use this identifier to cite or link to this item:

http://hdl.handle.net/10453/28936