Optimal spatial dominance: An effective search of nearest neighbor candidates

Wang, X; Zhang, Y; Zhang, W; Lin, X; Cheema, MA

Publication Type: Conference Proceeding
Citation: Proceedings of the ACM SIGMOD International Conference on Management of Data, 2015, 2015-May pp. 923 - 938
Issue Date: 2015-05-27

Closed Access

Filename	Description	Size	Format
2015_sigmod_SpatialDominance.pdf	Published version	457.25 kB	Adobe PDF

This item is closed access and not available.

Full metadata record

Field	Value	Language
dc.contributor.author	Wang, X	en_US
dc.contributor.author	Zhang, Y https://orcid.org/0000-0002-2674-1638	en_US
dc.contributor.author	Zhang, W	en_US
dc.contributor.author	Lin, X	en_US
dc.contributor.author	Cheema, MA	en_US
dc.date.issued	2015-05-27	en_US
dc.identifier.citation	Proceedings of the ACM SIGMOD International Conference on Management of Data, 2015, 2015-May pp. 923 - 938	en_US
dc.identifier.isbn	9781450327589	en_US
dc.identifier.issn	0730-8078	en_US
dc.identifier.uri	http://hdl.handle.net/10453/41649
dc.description.abstract	Copyright © 2015 ACM. In many domains such as computational geometry and database management, an object may be described by multiple instances (points). Then the distance (or similarity) between two objects is captured by the pair-wise distances among their instances. In the past, numerous nearest neighbor (NN) functions have been proposed to define the distance between objects with multiple instances and to identify the NN object. Nevertheless, considering that a user may not have a specific NN function in mind, it is desirable to provide her with a set of NN candidates. Ideally, the set of NN candidates must include every object that is NN for at least one of the NN functions and must exclude every nonpromising object. However, no one has studied the problem of NN candidates computation from this perspective. Although some of the existing works aim at returning a set of candidate objects, they do not focus on the NN functions while computing the candidate objects. As a result, they either fail to include an NN object w.r.t. some NN functions or include a large number of unnecessary objects that have no potential to be the NN regardless of the NN functions. Motivated by this, we classify the existing NN functions for objects with multiple instances into three families by characterizing their key features. Then, we advocate three spatial dominance operators to compute NN candidates where each operator is optimal w.r.t. different coverage of NN functions. Efficient algorithms are proposed for the dominance check and corresponding NN candidates computation. Extensive empirical study on real and synthetic datasets shows that our proposed operators can significantly reduce the number of NN candidates. The comprehensive performance evaluation demonstrates the efficiency of our computation techniques.	en_US
dc.relation.ispartof	Proceedings of the ACM SIGMOD International Conference on Management of Data	en_US
dc.relation.isbasedon	10.1145/2723372.2749442	en_US
dc.title	Optimal spatial dominance: An effective search of nearest neighbor candidates	en_US
dc.type	Conference Proceeding
utslib.citation.volume	2015-May	en_US
utslib.for	0806 Information Systems	en_US
utslib.for	080109 Pattern Recognition and Data Mining	en_US
pubs.embargo.period	Not known	en_US
pubs.organisational-group	/University of Technology Sydney
pubs.organisational-group	/University of Technology Sydney/Faculty of Engineering and Information Technology
pubs.organisational-group	/University of Technology Sydney/Faculty of Engineering and Information Technology/School of Computer Science
pubs.organisational-group	/University of Technology Sydney/Strength - CAI - Centre for Artificial Intelligence
utslib.copyright.status	closed_access
pubs.publication-status	Published	en_US
pubs.volume	2015-May	en_US

Abstract:

Copyright © 2015 ACM. In many domains such as computational geometry and database management, an object may be described by multiple instances (points). Then the distance (or similarity) between two objects is captured by the pair-wise distances among their instances. In the past, numerous nearest neighbor (NN) functions have been proposed to define the distance between objects with multiple instances and to identify the NN object. Nevertheless, considering that a user may not have a specific NN function in mind, it is desirable to provide her with a set of NN candidates. Ideally, the set of NN candidates must include every object that is NN for at least one of the NN functions and must exclude every nonpromising object. However, no one has studied the problem of NN candidates computation from this perspective. Although some of the existing works aim at returning a set of candidate objects, they do not focus on the NN functions while computing the candidate objects. As a result, they either fail to include an NN object w.r.t. some NN functions or include a large number of unnecessary objects that have no potential to be the NN regardless of the NN functions. Motivated by this, we classify the existing NN functions for objects with multiple instances into three families by characterizing their key features. Then, we advocate three spatial dominance operators to compute NN candidates where each operator is optimal w.r.t. different coverage of NN functions. Efficient algorithms are proposed for the dominance check and corresponding NN candidates computation. Extensive empirical study on real and synthetic datasets shows that our proposed operators can significantly reduce the number of NN candidates. The comprehensive performance evaluation demonstrates the efficiency of our computation techniques.
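
As a rough illustration of the candidate-set requirement stated in the abstract (include every object that is the NN under at least one NN function, exclude the rest), the following minimal Python sketch enumerates a few NN functions by brute force. It is not the paper's dominance operators or algorithms, which the abstract does not specify. The three aggregate functions (min, max, mean of pair-wise distances), the toy objects, and all names in the code are assumptions chosen for illustration only.

```python
from itertools import product
from math import dist  # Euclidean distance, Python 3.8+


def pairwise_distances(query, obj):
    """All pair-wise distances between instances of the query and an object."""
    return [dist(q, p) for q, p in product(query, obj)]


# Assumed, hand-picked NN functions; the paper covers whole families of them.
NN_FUNCTIONS = {
    "min": min,
    "max": max,
    "mean": lambda d: sum(d) / len(d),
}


def nn_candidates(query, objects, functions=NN_FUNCTIONS):
    """Names of objects that are the NN under at least one given function."""
    candidates = set()
    for func in functions.values():
        scores = {name: func(pairwise_distances(query, inst))
                  for name, inst in objects.items()}
        best = min(scores.values())
        # Keep every object tied for the smallest distance under this function.
        candidates.update(name for name, s in scores.items() if s == best)
    return candidates


# Hypothetical toy data: each object is a list of 2-D instances (points).
query = [(0.0, 0.0)]
objects = {
    "A": [(1.0, 0.0), (5.0, 0.0)],  # closest single instance
    "B": [(2.0, 0.0), (3.0, 0.0)],  # smallest max and mean distance
    "C": [(4.0, 0.0), (6.0, 0.0)],  # never the NN under these functions
}

print(sorted(nn_candidates(query, objects)))  # ['A', 'B']
```

In this toy data, object C is farther than B under all three functions, so it is never a candidate. Enumerating functions one by one only works for a small fixed set; the abstract's point is that dominance operators are needed to cover entire families of NN functions.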

Please use this identifier to cite or link to this item:

http://hdl.handle.net/10453/41649