LSH: The Distortion of Locality Sensitive Hashing

Alessandro Panconesi, Sapienza University of Rome

Friday, February 24, 2017 at 11:00 a.m.

Room 316 (CIT 3rd Floor)

Given a pairwise similarity notion between objects, locality sensitive hashing (LSH) aims to construct a hash function family over the universe of objects such that the probability two objects hash to the same value is their similarity. LSH is a powerful algorithmic tool for large-scale applications and much work has been done to understand LSHable similarities, i.e., similarities that admit an LSH.

In this paper we focus on similarities that are provably non-LSHable and propose a notion of distortion to capture the approximation of such a similarity by a similarity that is LSHable. We consider several well-known non-LSHable similarities and show tight upper and lower bounds on their distortion. We also experimentally show that our upper bounds translate to effective algorithms in practice.

Joint work with Flavio Chierichetti, Ravi Kumar and Erisa Terolli

Host: Eli Upfal