Quantile-based KNN over multi-valued objects
- Publication Type:
- Conference Proceeding
- Proceedings - International Conference on Data Engineering, 2010, pp. 16 - 27
- Issue Date:
K Nearest Neighbor search has many applications including data mining, multi-media, image processing, and monitoring moving objects. In this paper, we study the problem of KNN over multi-valued objects. We aim to provide effective and efficient techniques to identify KNN sensitive to relative distributions of objects. We propose to use quantiles to summarize relative-distribution- sensitive K nearest neighbors. Given a query Q and a quantile φ ε (0, 1], we firstly study the problem of efficiently computing K nearest objects based on a φ-quantile distance (e.g. median distance) from each object to Q. The second problem is to retrieve the K nearest objects to Q based on overall distances in the "best population" (with a given size specified by φ-quantile) for each object. While the first problem can be solved in polynomial time, we show that the 2nd problem is NP-hard. A set of efficient, novel algorithms have been proposed to give an exact solution for the first problem and an approximate solution for the second problem with the approximation ratio 2. Extensive experiment demonstrates that our techniques are very efficient and effective. © 2010 IEEE.
Please use this identifier to cite or link to this item: