Spectral hashing

Yair Weiss, Antonio Torralba, Robert Fergus

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Semantic hashing[1] seeks compact binary codes of data-points so that the Hamming distance between codewords correlates with semantic similarity. In this paper, we show that the problem of finding a best code for a given dataset is closely related to the problem of graph partitioning and can be shown to be NP-hard. By relaxing the original problem, we obtain a spectral method whose solutions are simply a subset of thresholded eigenvectors of the graph Laplacian. By utilizing recent results on convergence of graph Laplacian eigenvectors to the Laplace-Beltrami eigenfunctions of manifolds, we show how to efficiently calculate the code of a novel data-point. Taken together, both learning the code and applying it to a novel point are extremely simple. Our experiments show that our codes outperform the state-of-the-art.
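The relaxed step mentioned in the abstract (thresholding eigenvectors of the graph Laplacian to obtain bits) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `spectral_codes` and `hamming`, the Gaussian affinity, the bandwidth `sigma`, and the threshold at zero are all assumptions made for illustration, and the sketch covers only the points it was given, not the eigenfunction-based coding of a novel data-point.

```python
import numpy as np

def spectral_codes(X, n_bits, sigma=1.0):
    """Illustrative in-sample codes: thresholded graph-Laplacian eigenvectors.

    Hypothetical helper, not the paper's algorithm; sigma is an assumed bandwidth.
    """
    # Pairwise squared Euclidean distances between all rows of X.
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * (X @ X.T), 0.0)
    # Gaussian affinity matrix W (assumed choice of similarity).
    W = np.exp(-d2 / (2.0 * sigma ** 2))
    # Unnormalized graph Laplacian L = D - W.
    L = np.diag(W.sum(axis=1)) - W
    # eigh returns eigenvalues in ascending order for a symmetric matrix.
    _, vecs = np.linalg.eigh(L)
    # Skip the trivial constant eigenvector, keep the next n_bits.
    Y = vecs[:, 1:n_bits + 1]
    # One bit per eigenvector, obtained by thresholding at zero.
    return (Y > 0).astype(np.uint8)

def hamming(a, b):
    """Number of positions where two codewords differ."""
    return int(np.count_nonzero(a != b))

# Two small, well-separated clusters of 2-D points (made-up data).
X = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
              [3.0, 3.0], [3.1, 3.0], [3.0, 3.1]])
codes = spectral_codes(X, n_bits=1)
print(hamming(codes[0], codes[1]), hamming(codes[0], codes[3]))
```

In this toy setting, points in the same cluster receive the same bit and points in different clusters differ, which is the sense in which Hamming distance is meant to track similarity.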

Original language: English (US)
Title of host publication: Advances in Neural Information Processing Systems 21 - Proceedings of the 2008 Conference
Pages: 1753-1760
Number of pages: 8
State: Published - 2009
Event: 22nd Annual Conference on Neural Information Processing Systems, NIPS 2008 - Vancouver, BC, Canada
Duration: Dec 8 2008 → Dec 11 2008

Fingerprint

Eigenvalues and eigenfunctions
Semantics
Hamming distance
Binary codes
Experiments

ASJC Scopus subject areas

  • Information Systems

Cite this

Weiss, Y., Torralba, A., & Fergus, R. (2009). Spectral hashing. In Advances in Neural Information Processing Systems 21 - Proceedings of the 2008 Conference (pp. 1753-1760)

@inproceedings{890d94598fa54dac85eae9aa824a0a7e,
title = "Spectral hashing",
abstract = "Semantic hashing[1] seeks compact binary codes of data-points so that the Hamming distance between codewords correlates with semantic similarity. In this paper, we show that the problem of finding a best code for a given dataset is closely related to the problem of graph partitioning and can be shown to be NP-hard. By relaxing the original problem, we obtain a spectral method whose solutions are simply a subset of thresholded eigenvectors of the graph Laplacian. By utilizing recent results on convergence of graph Laplacian eigenvectors to the Laplace-Beltrami eigenfunctions of manifolds, we show how to efficiently calculate the code of a novel data-point. Taken together, both learning the code and applying it to a novel point are extremely simple. Our experiments show that our codes outperform the state-of-the-art.",
author = "Yair Weiss and Antonio Torralba and Robert Fergus",
year = "2009",
language = "English (US)",
isbn = "9781605609492",
pages = "1753--1760",
booktitle = "Advances in Neural Information Processing Systems 21 - Proceedings of the 2008 Conference",

}

TY - GEN

T1 - Spectral hashing

AU - Weiss, Yair

AU - Torralba, Antonio

AU - Fergus, Robert

PY - 2009

Y1 - 2009

N2 - Semantic hashing[1] seeks compact binary codes of data-points so that the Hamming distance between codewords correlates with semantic similarity. In this paper, we show that the problem of finding a best code for a given dataset is closely related to the problem of graph partitioning and can be shown to be NP-hard. By relaxing the original problem, we obtain a spectral method whose solutions are simply a subset of thresholded eigenvectors of the graph Laplacian. By utilizing recent results on convergence of graph Laplacian eigenvectors to the Laplace-Beltrami eigenfunctions of manifolds, we show how to efficiently calculate the code of a novel data-point. Taken together, both learning the code and applying it to a novel point are extremely simple. Our experiments show that our codes outperform the state-of-the-art.

UR - http://www.scopus.com/inward/record.url?scp=84858779327&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84858779327&partnerID=8YFLogxK

M3 - Conference contribution

AN - SCOPUS:84858779327

SN - 9781605609492

SP - 1753

EP - 1760

BT - Advances in Neural Information Processing Systems 21 - Proceedings of the 2008 Conference

ER -