I-LSH: I/O efficient c-approximate nearest neighbor search in high-dimensional space

Publication Type:
Conference Proceeding
Proceedings - International Conference on Data Engineering, 2019, 2019-April pp. 1670 - 1673
Issue Date:
Filename Description Size
450.pdfAccepted Manuscript331.85 kB
Adobe PDF
Full metadata record
© 2019 IEEE. Nearest Neighbor search has been well solved in low-dimensional space, but is challenging in high-dimensional space due to the curse of dimensionality. As a trade-off between efficiency and result accuracy, a variety of c-approximate nearest neighbor (c-ANN) algorithms have been proposed to return a c-approximate NN with confident at least δ. We observe that existing c-ANN search algorithms have some limitations on I/O efficiency when their indexes are resided on the external memory, which is critical for handling large scale high-dimensional data. In this paper, we introduce an incremental search based c-ANN search algorithm, named I-LSH. Unlike the previous LSH methods, which expand the bucket width in an exponential way, I-LSH adopts a more natural search strategy to incrementally access the hash values of the objects. We provide rigorous theoretical analysis to underpin our incremental search strategy. Our comprehensive experiment results show that, compared with state-of-the-art I/O efficient c-ANN techniques, our algorithm can achieve much better I/O efficiency under the same theoretical guarantee.
Please use this identifier to cite or link to this item: