On universal classes of extremely random constant-time hash functions

Research output: Contribution to journal › Article

Abstract

A family of functions F that map [0, m - 1] into [0, n - 1] is said to be k-wise independent if, for a randomly selected f ∈ F, the image of any tuple of k distinct points in [0, m - 1] is uniformly distributed in [0, n - 1]^k. This paper shows that for suitably fixed ε < 1 and any k < m^ε, there are families of k-wise independent functions that can be evaluated in constant time in the standard random access model of computation. The construction requires a storage array of m^δ random seeds for a suitable δ < 1. These seeds can be pseudorandom values precomputed from an initial O(k) random seeds. A simple adaptation yields n^ε-wise independent functions that require n^δ storage in many cases where m ≫ n. Lower bounds are presented to show that neither storage requirement can be materially reduced. Previous constructions of random functions having constant evaluation time and sublinear storage exhibited only a constant degree of independence. Unfortunately, the explicit randomized constructions, while requiring a constant number of operations, are far too slow for any practical application. However, nonconstructive existence arguments are given, which suggest that this factor might be eliminated. The problem of eliminating this factor is shown to be equivalent to a fundamental open question in graph theory. As a consequence of these constructions, many probabilistic algorithms, from traditional hashing to Ranade's emulation of common PRAM algorithms, can for the first time be shown to achieve, up to constant factors, their expected asymptotic performance for a programmable, albeit formal and currently impractical, model of computation. A research direction is now available that may eventually lead to implementations that are fast and provably sound.
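
To make the definition of k-wise independence concrete, the following is a minimal Python sketch of the classical polynomial family over a prime field: a uniformly random polynomial of degree less than k over Z_p. This is not the constant-time construction the abstract describes; evaluating it takes O(k) arithmetic operations, and it is shown only because it is small enough to check the definition exhaustively. The function names and the simplification m = n = p (p prime) are assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations, product


def make_poly_hash(coeffs, p):
    """Return f(x) = sum(coeffs[i] * x**i) mod p, evaluated by Horner's rule."""
    def f(x):
        acc = 0
        for c in reversed(coeffs):
            acc = (acc * x + c) % p
        return acc
    return f


def is_t_wise_uniform(k, p, t):
    """Exhaustively test whether the degree-< k polynomial family over Z_p
    maps every set of t distinct points to a uniform tuple in [0, p-1]^t.

    Hypothetical helper: it enumerates all p**k functions, so it is only
    usable for tiny p and k.
    """
    family = [make_poly_hash(c, p) for c in product(range(p), repeat=k)]
    for points in combinations(range(p), t):
        counts = Counter(tuple(f(x) for x in points) for f in family)
        # Uniform means all p**t image tuples occur, each equally often.
        if len(counts) != p ** t or len(set(counts.values())) != 1:
            return False
    return True


if __name__ == "__main__":
    print(is_t_wise_uniform(k=2, p=5, t=2))  # pairwise independence
    print(is_t_wise_uniform(k=2, p=5, t=3))  # 3-wise independence
```

With k coefficients the family is k-wise independent but not (k+1)-wise independent, since p^k functions cannot cover the p^(k+1) possible image tuples. Both the seed count and the evaluation time of this family grow linearly with k, which is the cost the paper's constant-time families avoid by using a larger precomputed seed array.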

Original language: English (US)
Pages (from-to): 505-543
Number of pages: 39
Journal: SIAM Journal on Computing
Volume: 33
Issue number: 3
State: Published - Jul 28 2004


Keywords

  • Hash functions
  • Hashing
  • Limited independence
  • Optimal speedup
  • PRAM emulation
  • Storage-time tradeoff
  • Universal hash functions

ASJC Scopus subject areas

  • Computer Science(all)
  • Mathematics(all)
