Probabilistic ranking over relations

Chang, L; Yu, JX; Qin, L; Lin, X

Publication Type: Conference Proceeding
Citation: Advances in Database Technology - EDBT 2010 - 13th International Conference on Extending Database Technology, Proceedings, 2010, pp. 477-488
Issue Date: 2010-05-19

Closed Access

	Filename	Description	Size	Format
	2013002381OK.pdf		662.42 kB	Adobe PDF

This item is closed access and not available.

Full metadata record

Field	Value	Language
dc.contributor.author	Chang, L	en_US
dc.contributor.author	Yu, JX https://orcid.org/0000-0002-9738-827X	en_US
dc.contributor.author	Qin, L https://orcid.org/0000-0001-6068-5062	en_US
dc.contributor.author	Lin, X	en_US
dc.date.issued	2010-05-19	en_US
dc.identifier.citation	Advances in Database Technology - EDBT 2010 - 13th International Conference on Extending Database Technology, Proceedings, 2010, pp. 477 - 488	en_US
dc.identifier.isbn	9781605589459	en_US
dc.identifier.uri	http://hdl.handle.net/10453/28970
dc.description.abstract	Probabilistic top-k ranking queries have been extensively studied because data can be uncertain in many real applications. A probabilistic top-k ranking query ranks objects by the interplay of score and probability, with an implicit assumption that the scores by which objects are ranked and the probabilities that the objects exist are stored in the same relation. We observe that, in general, scores and probabilities are likely to be stored in different relations, for example in column-oriented DBMSs and in data warehouses. In this paper we study probabilistic top-k ranking queries when scores and probabilities are stored in different relations. We focus on reducing the join cost in probabilistic top-k ranking. We investigate two probabilistic score functions, discuss the upper/lower bounds in random access and sequential access, and provide insights on the advantages and disadvantages of random/sequential access in terms of upper/lower bounds. We also propose random, sequential, and hybrid algorithms to conduct probabilistic top-k ranking. We conducted extensive performance studies using real and synthetic datasets, and report our findings in this paper. Copyright 2010 ACM.	en_US
dc.relation.ispartof	Advances in Database Technology - EDBT 2010 - 13th International Conference on Extending Database Technology, Proceedings	en_US
dc.relation.isbasedon	10.1145/1739041.1739099	en_US
dc.title	Probabilistic ranking over relations	en_US
dc.type	Conference Proceeding
utslib.for	0806 Information Systems	en_US
dc.location.activity	Lausanne, Switzerland	en_US
pubs.embargo.period	Not known	en_US
pubs.organisational-group	/University of Technology Sydney
pubs.organisational-group	/University of Technology Sydney/Faculty of Engineering and Information Technology
pubs.organisational-group	/University of Technology Sydney/Strength - CAI - Centre for Artificial Intelligence
utslib.copyright.status	closed_access
pubs.publication-status	Published	en_US

Abstract:

Probabilistic top-k ranking queries have been extensively studied because data can be uncertain in many real applications. A probabilistic top-k ranking query ranks objects by the interplay of score and probability, with an implicit assumption that the scores by which objects are ranked and the probabilities that the objects exist are stored in the same relation. We observe that, in general, scores and probabilities are likely to be stored in different relations, for example in column-oriented DBMSs and in data warehouses. In this paper we study probabilistic top-k ranking queries when scores and probabilities are stored in different relations. We focus on reducing the join cost in probabilistic top-k ranking. We investigate two probabilistic score functions, discuss the upper/lower bounds in random access and sequential access, and provide insights on the advantages and disadvantages of random/sequential access in terms of upper/lower bounds. We also propose random, sequential, and hybrid algorithms to conduct probabilistic top-k ranking. We conducted extensive performance studies using real and synthetic datasets, and report our findings in this paper. Copyright 2010 ACM.
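
To make the setting in the abstract concrete, the following is a minimal illustrative sketch, not the authors' algorithm. The abstract does not name the two probabilistic score functions, so this sketch assumes a simple expected score (score multiplied by existence probability) and non-negative scores. It scans a score relation sequentially in descending order, looks up each probability by random access in a separate relation, and stops once an upper bound shows that no unseen object can enter the top k. The function name `top_k_expected_score` and the dictionary-based relations are hypothetical.

```python
import heapq


def top_k_expected_score(scores, probs, k):
    """Return the top-k objects by expected score and the number of random accesses.

    scores: dict mapping object id -> non-negative score (the "score relation").
    probs:  dict mapping object id -> existence probability in [0, 1]
            (the separate "probability relation").
    Assumption for illustration only: ranking function is score * probability.
    """
    # Sequential access: read the score relation in descending score order.
    by_score = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

    heap = []  # min-heap of (expected_score, object_id), holds at most k entries
    random_accesses = 0

    for obj_id, score in by_score:
        # Upper bound: every remaining object has score <= current score and
        # probability <= 1, so its expected score is at most `score`.
        if len(heap) == k and heap[0][0] >= score:
            break  # nothing left can beat the current k-th best

        # Random access: join against the probability relation for this object.
        p = probs.get(obj_id, 0.0)
        random_accesses += 1

        expected = score * p
        if len(heap) < k:
            heapq.heappush(heap, (expected, obj_id))
        elif expected > heap[0][0]:
            heapq.heapreplace(heap, (expected, obj_id))

    ranked = sorted(heap, key=lambda t: t[0], reverse=True)
    return [(obj_id, e) for e, obj_id in ranked], random_accesses


if __name__ == "__main__":
    scores = {"a": 10.0, "b": 9.0, "c": 8.0, "d": 2.0, "e": 1.0}
    probs = {"a": 0.5, "b": 1.0, "c": 0.25, "d": 1.0, "e": 1.0}
    print(top_k_expected_score(scores, probs, 2))
```

In this toy input only three of the five objects are joined against the probability relation before the bound allows the scan to stop, which is the kind of join-cost saving the abstract refers to. A purely sequential or hybrid variant would need different bounds and is not shown here.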

Please use this identifier to cite or link to this item:

http://hdl.handle.net/10453/28970