Lower bounds on locality sensitive hashing

Rajeev Motwani, Assaf Naor, Rina Panigrahy

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Given a metric space (X, d_X), c ≥ 1, r > 0, and p, q ∈ [0,1], a distribution over mappings ℋ: X → ℕ is called an (r, cr, p, q)-sensitive hash family if any two points in X at distance at most r are mapped by ℋ to the same value with probability at least p, and any two points at distance greater than cr are mapped by ℋ to the same value with probability at most q. This notion was introduced by Indyk and Motwani in 1998 as the basis for an efficient approximate nearest neighbor search algorithm, and has since been used extensively for this purpose. The performance of these algorithms is governed by the parameter ρ = log(1/p)/log(1/q), and constructing hash families with small ρ automatically yields improved nearest neighbor algorithms. Here we show that for X = ℓ₁ it is impossible to achieve ρ ≤ 1/(2c). This almost matches the construction of Indyk and Motwani, which achieves ρ ≤ 1/c.
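
As an illustration of the definition above, the following minimal sketch assumes the bit-sampling family on the Hamming cube {0,1}^d, where Hamming distance is the ℓ₁ distance restricted to the cube. The choice of family, the helper names, and the parameter values d, r, c are illustrative assumptions and are not taken from the paper. The sketch computes p, q, and ρ = log(1/p)/log(1/q) for this family and compares ρ with 1/c.

```python
import math
import random


def sample_hash(d, rng):
    """Draw one hash function from the (assumed) bit-sampling family:
    pick a coordinate i uniformly at random and map x to x[i]."""
    i = rng.randrange(d)
    return lambda x: x[i]


def collision_probability(dist, d):
    """Exact collision probability for two points of {0,1}^d at Hamming
    distance `dist`: the sampled coordinate must be one where they agree."""
    return 1.0 - dist / d


def rho(p, q):
    """The exponent rho = log(1/p) / log(1/q) from the abstract."""
    return math.log(1.0 / p) / math.log(1.0 / q)


def estimate_collision(x, y, trials, rng):
    """Monte Carlo estimate of Pr[h(x) == h(y)] over random h."""
    d = len(x)
    hits = 0
    for _ in range(trials):
        h = sample_hash(d, rng)
        hits += h(x) == h(y)
    return hits / trials


# Illustrative parameters (assumptions): dimension, radius, approximation factor.
d, r, c = 1000, 10, 2.0

p = collision_probability(r, d)      # points at distance r collide w.p. p
q = collision_probability(c * r, d)  # points at distance cr collide w.p. q
rho_value = rho(p, q)

# Empirical check of p on two concrete points at Hamming distance exactly r.
rng = random.Random(0)
x = [0] * d
y = [1] * r + [0] * (d - r)
empirical_p = estimate_collision(x, y, 20000, rng)

print(f"p = {p:.4f}, q = {q:.4f}, rho = {rho_value:.4f}, 1/c = {1 / c:.4f}")
```

For these parameters ρ comes out at about 0.497, just under 1/c = 0.5. In general, Bernoulli's inequality gives 1 - cr/d ≤ (1 - r/d)^c for c ≥ 1, so q ≤ p^c and hence ρ ≤ 1/c for this family whenever cr < d.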

Original language: English (US)
Title of host publication: Proceedings of the Twenty-Second Annual Symposium on Computational Geometry 2006, SCG'06
Pages: 154-157
Number of pages: 4
Volume: 2006
State: Published - 2006
Event: 22nd Annual Symposium on Computational Geometry 2006, SCG'06 - Sedona, AZ, United States
Duration: Jun 5 2006 → Jun 7 2006

Other

Other: 22nd Annual Symposium on Computational Geometry 2006, SCG'06
Country: United States
City: Sedona, AZ
Period: 6/5/06 → 6/7/06

Fingerprint

Hashing
Locality
Lower bound
Nearest Neighbor Search
Search Algorithm
Metric space
Nearest Neighbor
Family

Keywords

  • Locality Sensitive Hashing
  • Lower Bounds
  • Nearest Neighbor Search

ASJC Scopus subject areas

  • Software
  • Geometry and Topology
  • Safety, Risk, Reliability and Quality
  • Chemical Health and Safety

Cite this

Motwani, R., Naor, A., & Panigrahy, R. (2006). Lower bounds on locality sensitive hashing. In Proceedings of the Twenty-Second Annual Symposium on Computational Geometry 2006, SCG'06 (Vol. 2006, pp. 154-157)

@inproceedings{e102d2bc79ac40ab9c81b678c7358db6,
title = "Lower bounds on locality sensitive hashing",
abstract = "Given a metric space (X, d_X), c ≥ 1, r > 0, and p, q ∈ [0,1], a distribution over mappings ℋ: X → ℕ is called an (r, cr, p, q)-sensitive hash family if any two points in X at distance at most r are mapped by ℋ to the same value with probability at least p, and any two points at distance greater than cr are mapped by ℋ to the same value with probability at most q. This notion was introduced by Indyk and Motwani in 1998 as the basis for an efficient approximate nearest neighbor search algorithm, and has since been used extensively for this purpose. The performance of these algorithms is governed by the parameter ρ = log(1/p)/log(1/q), and constructing hash families with small ρ automatically yields improved nearest neighbor algorithms. Here we show that for X = ℓ₁ it is impossible to achieve ρ ≤ 1/(2c). This almost matches the construction of Indyk and Motwani, which achieves ρ ≤ 1/c.",
keywords = "Locality Sensitive Hashing, Lower Bounds, Nearest Neighbor Search",
author = "Rajeev Motwani and Assaf Naor and Rina Panigrahy",
year = "2006",
language = "English (US)",
isbn = "1595933409",
volume = "2006",
pages = "154--157",
booktitle = "Proceedings of the Twenty-Second Annual Symposium on Computational Geometry 2006, SCG'06",

}

TY - GEN

T1 - Lower bounds on locality sensitive hashing

AU - Motwani, Rajeev

AU - Naor, Assaf

AU - Panigrahy, Rina

PY - 2006

Y1 - 2006

N2 - Given a metric space (X, d_X), c ≥ 1, r > 0, and p, q ∈ [0,1], a distribution over mappings ℋ: X → ℕ is called an (r, cr, p, q)-sensitive hash family if any two points in X at distance at most r are mapped by ℋ to the same value with probability at least p, and any two points at distance greater than cr are mapped by ℋ to the same value with probability at most q. This notion was introduced by Indyk and Motwani in 1998 as the basis for an efficient approximate nearest neighbor search algorithm, and has since been used extensively for this purpose. The performance of these algorithms is governed by the parameter ρ = log(1/p)/log(1/q), and constructing hash families with small ρ automatically yields improved nearest neighbor algorithms. Here we show that for X = ℓ₁ it is impossible to achieve ρ ≤ 1/(2c). This almost matches the construction of Indyk and Motwani, which achieves ρ ≤ 1/c.

KW - Locality Sensitive Hashing

KW - Lower Bounds

KW - Nearest Neighbor Search

UR - http://www.scopus.com/inward/record.url?scp=33748088520&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=33748088520&partnerID=8YFLogxK

M3 - Conference contribution

SN - 1595933409

SN - 9781595933409

VL - 2006

SP - 154

EP - 157

BT - Proceedings of the Twenty-Second Annual Symposium on Computational Geometry 2006, SCG'06

ER -