Handling query skew in large indexes: a view based approach

Huang, W; Yu, JX; Shang, Z

Publication Type: Journal Article
Citation: Frontiers of Computer Science, 2018, 12 (1), pp. 146 - 162
Issue Date: 2018-02-01

Closed Access

	Filename	Description	Size	Format
	Huang2018_Article_HandlingQuerySkewInLargeIndexe.pdf	Published Version	1.32 MB	Adobe PDF

This item is closed access and not available.

Full metadata record

Field	Value	Language
dc.contributor.author	Huang, W	en_US
dc.contributor.author	Yu, JX https://orcid.org/0000-0002-9738-827X	en_US
dc.contributor.author	Shang, Z	en_US
dc.date.issued	2018-02-01	en_US
dc.identifier.citation	Frontiers of Computer Science, 2018, 12 (1), pp. 146 - 162	en_US
dc.identifier.issn	2095-2228	en_US
dc.identifier.uri	http://hdl.handle.net/10453/132004
dc.description.abstract	© 2018, Higher Education Press and Springer-Verlag GmbH Germany. Indexing is one of the most important techniques to facilitate query processing over a multi-dimensional dataset. A commonly used strategy for such indexing is to keep the tree-structured index balanced. This strategy reduces query processing cost in the worst case, and can handle all different queries equally well. In other words, this strategy implicitly assumes that queries are issued uniformly, partly because the query distribution cannot be known in advance and changes over time in practice. A key issue we study in this work is whether it is best to rely fully on a balanced tree-structured index, in particular as datasets become larger and larger in the big data era. When a dataset becomes very large, it becomes unreasonable to assume that all data in any subspace are equally important and are uniformly accessed by all queries at the index level. Given the existence of query skew and the possible changes of query skew, in this paper we study how to handle such query skew and its changes at the index level, without sacrificing support for any possible query in a well-balanced tree index and without a high overhead. To tackle the issue, we propose the index-view at the index level, where an index-view is a short-cut in a balanced tree-structured index to access objects in the subspaces that are more frequently accessed, and we propose a new index-view-centric framework for query processing using index-views in a bottom-up manner. We study the index-view selection problem in both static and dynamic settings, and we confirm the effectiveness of our approach using large real and synthetic datasets.	en_US
dc.relation.ispartof	Frontiers of Computer Science	en_US
dc.relation.isbasedon	10.1007/s11704-016-5525-3	en_US
dc.title	Handling query skew in large indexes: a view based approach	en_US
dc.type	Journal Article
utslib.citation.volume	1	en_US
utslib.citation.volume	12	en_US
utslib.for	0806 Information Systems	en_US
pubs.embargo.period	Not known	en_US
pubs.organisational-group	/University of Technology Sydney
pubs.organisational-group	/University of Technology Sydney/Faculty of Engineering and Information Technology
utslib.copyright.status	closed_access
pubs.issue	1	en_US
pubs.publication-status	Published	en_US
pubs.volume	12	en_US

Abstract:

© 2018, Higher Education Press and Springer-Verlag GmbH Germany. Indexing is one of the most important techniques to facilitate query processing over a multi-dimensional dataset. A commonly used strategy for such indexing is to keep the tree-structured index balanced. This strategy reduces query processing cost in the worst case, and can handle all different queries equally well. In other words, this strategy implicitly assumes that queries are issued uniformly, partly because the query distribution cannot be known in advance and changes over time in practice. A key issue we study in this work is whether it is best to rely fully on a balanced tree-structured index, in particular as datasets become larger and larger in the big data era. When a dataset becomes very large, it becomes unreasonable to assume that all data in any subspace are equally important and are uniformly accessed by all queries at the index level. Given the existence of query skew and the possible changes of query skew, in this paper we study how to handle such query skew and its changes at the index level, without sacrificing support for any possible query in a well-balanced tree index and without a high overhead. To tackle the issue, we propose the index-view at the index level, where an index-view is a short-cut in a balanced tree-structured index to access objects in the subspaces that are more frequently accessed, and we propose a new index-view-centric framework for query processing using index-views in a bottom-up manner. We study the index-view selection problem in both static and dynamic settings, and we confirm the effectiveness of our approach using large real and synthetic datasets.
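
The short-cut idea in the abstract can be illustrated with a toy example. The sketch below is not the paper's algorithm: it assumes a one-dimensional balanced binary tree over distinct sorted keys, and the names `Node`, `build`, `search`, and `query` are hypothetical. An "index-view" is modelled simply as a direct reference to an internal node covering a frequently queried range; a query first tries the views and falls back to the root, so every query is still answerable.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Node:
    lo: int                             # smallest key stored under this node
    hi: int                             # largest key stored under this node
    keys: Optional[List[int]] = None    # populated only on leaves
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def build(keys: List[int], leaf_size: int = 2) -> Node:
    """Build a balanced binary tree over sorted, distinct keys (median split)."""
    if len(keys) <= leaf_size:
        return Node(keys[0], keys[-1], keys=list(keys))
    mid = len(keys) // 2
    left = build(keys[:mid], leaf_size)
    right = build(keys[mid:], leaf_size)
    return Node(left.lo, right.hi, left=left, right=right)


def search(node: Node, lo: int, hi: int, visited: List[int]) -> List[int]:
    """Range search starting at `node`; visited[0] counts the nodes touched."""
    visited[0] += 1
    if node.hi < lo or node.lo > hi:
        return []                       # no overlap with the query range
    if node.keys is not None:
        return [k for k in node.keys if lo <= k <= hi]
    return search(node.left, lo, hi, visited) + search(node.right, lo, hi, visited)


def query(root: Node, views: List[Node], lo: int, hi: int) -> Tuple[List[int], int]:
    """Answer [lo, hi] from a view (short-cut) if one covers it, else from the root."""
    visited = [0]
    for view in views:
        # With distinct sorted keys, a node whose key range contains the
        # query range holds every matching key, so the answer is complete.
        if view.lo <= lo and hi <= view.hi:
            return search(view, lo, hi, visited), visited[0]
    return search(root, lo, hi, visited), visited[0]


root = build(list(range(64)))
hot = root.right.right.right            # assumed "hot" subrange: keys 56..63
with_view = query(root, [hot], 58, 60)
without_view = query(root, [], 58, 60)
print(with_view, without_view)
```

In this toy setting both calls return the same keys, but the call that uses the view touches fewer nodes because it skips the upper levels of the tree. Which nodes deserve a view, and how to revise that choice as the query skew changes, is the selection problem the abstract refers to and is not modelled here.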

Please use this identifier to cite or link to this item:

http://hdl.handle.net/10453/132004