Locally private Jaccard similarity estimation

Publication Type:
Journal Article
Concurrency Computation, 2019, 31 (24)
Issue Date:
Filename Description Size
cpe.4889.pdfPublished Version1.08 MB
Adobe PDF
Full metadata record
© 2018 John Wiley & Sons, Ltd. Jaccard Similarity has been widely used to measure the distance between two sets (or preference profiles) owned by two different users. Yet, in the private data collection scenario, it requires the untrusted curator could only estimate an approximately accurate Jaccard similarity of the involved users but without being allowed to access their preference profiles. This paper aims to address the above requirements by considering the local differential privacy model. To achieve this, we initially focused on a particular hash technique, MinHash, which was originally invented to estimate the Jaccard similarity efficiently. We designed the PrivMin algorithm to achieve the perturbation of MinHash signature by adopting Exponential mechanism and build the Locally Differentially Private Jaccard Similarity Estimation (LDP-JSE) protocol for allowing the untrusted curator to approximately estimate Jaccard similarity. Theoretical and empirical results demonstrate that the proposed protocol can retain a highly acceptable utility of the estimated similarity as well as preserving privacy.
Please use this identifier to cite or link to this item: