Efficient Sensitivity Analysis for Inequality Queries in Probabilistic Databases

Qin, B; Yu, JX

Publication Type: Journal Article
Citation: IEEE Transactions on Knowledge and Data Engineering, 2017, 29 (1), pp. 86-99
Issue Date: 2017-01-01

Closed Access

	Filename	Description	Size	Format
	07576647.pdf	Published Version	654.15 kB	Adobe PDF

This item is closed access and not available.

Full metadata record

Field	Value	Language
dc.contributor.author	Qin, B	en_US
dc.contributor.author	Yu, JX https://orcid.org/0000-0002-9738-827X	en_US
dc.date.issued	2017-01-01	en_US
dc.identifier.citation	IEEE Transactions on Knowledge and Data Engineering, 2017, 29 (1), pp. 86 - 99	en_US
dc.identifier.issn	1041-4347	en_US
dc.identifier.uri	http://hdl.handle.net/10453/126995
dc.relation.ispartof	IEEE Transactions on Knowledge and Data Engineering	en_US
dc.relation.isbasedon	10.1109/TKDE.2016.2613538	en_US
dc.subject.classification	Information Systems	en_US
dc.title	Efficient Sensitivity Analysis for Inequality Queries in Probabilistic Databases	en_US
dc.type	Journal Article
utslib.citation.volume	1	en_US
utslib.citation.volume	29	en_US
utslib.for	0804 Data Format	en_US
utslib.for	08 Information and Computing Sciences	en_US
pubs.embargo.period	Not known	en_US
pubs.organisational-group	/University of Technology Sydney
pubs.organisational-group	/University of Technology Sydney/Faculty of Engineering and Information Technology
utslib.copyright.status	closed_access
pubs.issue	1	en_US
pubs.publication-status	Published	en_US
pubs.volume	29	en_US

Abstract:

© 1989-2012 IEEE. In this paper, we study inequality query (IQ query) processing in tuple-independent probabilistic databases, where IQ queries can be categorized into IQ-path, IQ-tree, and IQ-graph queries. We focus on two related issues for IQ queries. The first is to compute their probabilities efficiently, motivated by the observation that the time complexity of the state-of-the-art algorithm for processing IQ-graph queries is high. The second is to perform their sensitivity analysis efficiently, which has not been studied before. Here, sensitivity analysis identifies the input tuples that have a high influence on the probability of an answer tuple. The influence of an input tuple is defined as the difference between the output probabilities obtained in two cases: one in which the tuple is assumed to exist and one in which it is assumed not to exist. We compile the inequality conditions of an IQ query q into a compilation tree T, which encodes the Shannon expansion order. Moreover, we split q into a set of subqueries, each containing only one inequality condition. Using the compilation tree and this decomposition, we introduce a dynamic programming algorithm called Dec that processes an IQ query q in time O(|Φ|), where Φ is the lineage of q. An IQ query can be processed by Dec if and only if its inequality conditions can be compiled into a compilation tree T in which the inequality conditions from any node to all of its child nodes are the same. We conduct extensive experiments on real and synthetic datasets to demonstrate the efficiency of our algorithm for computing the probabilities and influences of IQ queries.
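
The definition of influence in the abstract can be illustrated with a small sketch. This is not the Dec algorithm from the paper: it is a naive Shannon expansion over a lineage in positive DNF, with worst-case exponential running time, included only to make the two-case definition concrete. The names `probability` and `influence`, the tuple identifiers, and the example lineage are all assumptions chosen for illustration.

```python
from typing import Dict, FrozenSet, List

# Assumed representation: a lineage in positive DNF, i.e. a list of clauses,
# where each clause is the set of input-tuple ids that must all exist.
Lineage = List[FrozenSet[str]]


def probability(lineage: Lineage, probs: Dict[str, float]) -> float:
    """Probability that the lineage is true, by naive Shannon expansion.

    Assumes input tuples are independent (tuple-independent database).
    """
    if any(len(clause) == 0 for clause in lineage):
        return 1.0  # an empty clause is already satisfied
    if not lineage:
        return 0.0  # no clause left, so the formula is false
    x = min(lineage[0])  # deterministic choice of the next variable
    # Case x = true: x is satisfied, so drop it from every clause.
    when_true = [clause - {x} for clause in lineage]
    # Case x = false: every clause containing x can no longer hold.
    when_false = [clause for clause in lineage if x not in clause]
    return (probs[x] * probability(when_true, probs)
            + (1.0 - probs[x]) * probability(when_false, probs))


def influence(lineage: Lineage, probs: Dict[str, float], t: str) -> float:
    """Output probability with tuple t present minus that with t absent."""
    present = {**probs, t: 1.0}
    absent = {**probs, t: 0.0}
    return probability(lineage, present) - probability(lineage, absent)


# Hypothetical lineage (r1 AND s1) OR (r1 AND s2) OR (r2 AND s2).
lineage = [frozenset({"r1", "s1"}),
           frozenset({"r1", "s2"}),
           frozenset({"r2", "s2"})]
probs = {"r1": 0.5, "r2": 0.5, "s1": 0.5, "s2": 0.5}

print(probability(lineage, probs))      # 0.5
print(influence(lineage, probs, "r1"))  # 0.5
print(influence(lineage, probs, "s1"))  # 0.25
```

In this toy example, r1 appears in two of the three clauses and has twice the influence of s1, which appears in only one. Sensitivity analysis is meant to surface this kind of ranking.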

Please use this identifier to cite or link to this item:

http://hdl.handle.net/10453/126995