Inverted Linear Quadtree: Efficient Top K Spatial Keyword Search

Publication Type:
Journal Article
Citation:
IEEE Transactions on Knowledge and Data Engineering, 2016, 28 (7), pp. 1706 - 1721
Issue Date:
2016-07-01
Full metadata record
Files in This Item:
Filename Description Size
07407420.pdfPublished Version2.26 MB
Adobe PDF
© 1989-2012 IEEE. With advances in geo-positioning technologies and geo-location services, there are a rapidly growing amount of spatio-Textual objects collected in many applications such as location based services and social networks, in which an object is described by its spatial location and a set of keywords (terms). Consequently, the study of spatial keyword search which explores both location and textual description of the objects has attracted great attention from the commercial organizations and research communities. In the paper, we study two fundamental problems in the spatial keyword queries: Top k spatial keyword search (TOPK-SK), and batch top k spatial keyword search (BTOPK-SK). Given a set of spatio-Textual objects, a query location and a set of query keywords, the TOPK-SK retrieves the closest k objects each of which contains all keywords in the query. BTOPK-SK is the batch processing of sets of TOPK-SK queries. Based on the inverted index and the linear quadtree, we propose a novel index structure, called inverted linear quadtree (IL-Quadtree), which is carefully designed to exploit both spatial and keyword based pruning techniques to effectively reduce the search space. An efficient algorithm is then developed to tackle top k spatial keyword search. To further enhance the filtering capability of the signature of linear quadtree, we propose a partition based method. In addition, to deal with BTOPK-SK, we design a new computing paradigm which partition the queries into groups based on both spatial proximity and the textual relevance between queries. We show that the IL-Quadtree technique can also efficiently support BTOPK-SK. Comprehensive experiments on real and synthetic data clearly demonstrate the efficiency of our methods.
Please use this identifier to cite or link to this item: