### Abstract

Semantic hashing [1] seeks compact binary codes of data points such that the Hamming distance between codewords correlates with semantic similarity. In this paper, we show that the problem of finding the best code for a given dataset is closely related to the problem of graph partitioning and can be shown to be NP-hard. By relaxing the original problem, we obtain a spectral method whose solutions are simply a subset of thresholded eigenvectors of the graph Laplacian. By utilizing recent results on the convergence of graph Laplacian eigenvectors to the Laplace-Beltrami eigenfunctions of manifolds, we show how to efficiently calculate the code of a novel data point. Taken together, both learning the code and applying it to a novel point are extremely simple. Our experiments show that our codes outperform the state of the art.
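The spectral relaxation described above can be sketched in a few lines: build a graph Laplacian over the training points, take its smallest non-trivial eigenvectors, and threshold them at zero to obtain binary codes. This is a minimal illustrative sketch, not the authors' implementation; the function name, the Gaussian affinity, and parameters such as `sigma` and `n_bits` are assumptions made for the example.

```python
import numpy as np

def spectral_codes(X, n_bits=4, sigma=1.0):
    """Toy sketch: binary codes from thresholded graph-Laplacian eigenvectors."""
    # Pairwise squared distances -> Gaussian affinity matrix W
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    W = np.exp(-d2 / (2 * sigma ** 2))
    # Unnormalized graph Laplacian L = D - W
    D = np.diag(W.sum(axis=1))
    L = D - W
    # Eigenvectors in ascending eigenvalue order; skip the trivial
    # constant eigenvector, then threshold at zero to get bits
    vals, vecs = np.linalg.eigh(L)
    codes = vecs[:, 1:n_bits + 1] > 0
    return codes.astype(np.uint8)

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 8))
codes = spectral_codes(X, n_bits=4)
print(codes.shape)  # (50, 4)
```

Hamming distance between two such codes is then just the number of differing bits, e.g. `np.count_nonzero(codes[0] != codes[1])`. Note that this sketch only encodes training points; extending the code to novel points is exactly the out-of-sample problem the paper addresses via Laplace-Beltrami eigenfunctions.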

Original language | English (US)
---|---
Title of host publication | Advances in Neural Information Processing Systems 21 - Proceedings of the 2008 Conference
Pages | 1753-1760
Number of pages | 8
State | Published - Dec 1 2009
Event | 22nd Annual Conference on Neural Information Processing Systems, NIPS 2008 - Vancouver, BC, Canada (Dec 8 2008 → Dec 11 2008)

### Publication series

Name | Advances in Neural Information Processing Systems 21 - Proceedings of the 2008 Conference
---|---

### Other

Other | 22nd Annual Conference on Neural Information Processing Systems, NIPS 2008
---|---
Country | Canada
City | Vancouver, BC
Period | 12/8/08 → 12/11/08


### ASJC Scopus subject areas

- Information Systems
