Sparse feature learning for deep belief networks

Marc'aurelio Ranzato, Y. Lan Boureau, Yann LeCun

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Unsupervised learning algorithms aim to discover the structure hidden in the data, and to learn representations that are more suitable as input to a supervised machine than the raw input. Many unsupervised methods are based on reconstructing the input from the representation, while constraining the representation to have certain desirable properties (e.g., low dimensionality, sparsity). Others are based on approximating the density of the input by stochastically reconstructing it from the representation. We describe a novel and efficient algorithm to learn sparse representations, and compare it theoretically and experimentally with a similar machine trained probabilistically, namely a Restricted Boltzmann Machine. We propose a simple criterion to compare and select different unsupervised machines based on the trade-off between the reconstruction error and the information content of the representation. We demonstrate this method by extracting features from a dataset of handwritten numerals and from a dataset of natural image patches. We show that by stacking multiple levels of such machines and training them sequentially, high-order dependencies among the observed input variables can be captured.
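The full paper, not this record, specifies the model. As a loose illustration of the kind of machine the abstract describes — an encoder-decoder trained to trade reconstruction error against code sparsity, with levels stacked greedily so each is trained on the codes of the one below — here is a minimal NumPy sketch. It is an assumption-laden stand-in, not the authors' algorithm: the L1 penalty, logistic encoder, linear decoder, SGD updates, and all names (SparseCoder, alpha, lr) are illustrative choices.

import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class SparseCoder:
    """One encoder-decoder level: logistic encoder, linear decoder,
    trained by SGD on reconstruction error plus an L1 code penalty
    (an illustrative stand-in for the paper's sparsity constraint)."""

    def __init__(self, n_in, n_code, alpha=0.1, lr=0.01):
        self.We = rng.normal(0.0, 0.01, (n_code, n_in))  # encoder weights
        self.Wd = rng.normal(0.0, 0.01, (n_in, n_code))  # decoder weights
        self.alpha = alpha  # sparsity weight: the error/information trade-off knob
        self.lr = lr

    def encode(self, x):
        return sigmoid(self.We @ x)

    def step(self, x):
        z = self.encode(x)             # sparse code
        err = self.Wd @ z - x          # reconstruction residual
        loss = 0.5 * err @ err + self.alpha * np.abs(z).sum()
        # Plain backprop through decoder and encoder.
        gWd = np.outer(err, z)
        dz = self.Wd.T @ err + self.alpha * np.sign(z)
        dpre = dz * z * (1.0 - z)      # through the logistic
        gWe = np.outer(dpre, x)
        self.Wd -= self.lr * gWd
        self.We -= self.lr * gWe
        return loss

# Greedy stacking, as the abstract describes: train level 1 on the
# raw inputs, then train level 2 on the level-1 codes.
X = rng.normal(size=(1000, 64))        # toy stand-in for image patches
level1 = SparseCoder(64, 128)
for _ in range(5):
    for x in X:
        level1.step(x)

codes = np.array([level1.encode(x) for x in X])
level2 = SparseCoder(128, 64)
for _ in range(5):
    for z in codes:
        level2.step(z)

Raising alpha yields sparser (lower-information) codes at the cost of higher reconstruction error, which mirrors the trade-off the abstract's model-selection criterion weighs.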

Original language: English (US)
Title of host publication: Advances in Neural Information Processing Systems 20 - Proceedings of the 2007 Conference
ISBN (print): 160560352X, 9781605603520
State: Published - 2009
Event: 21st Annual Conference on Neural Information Processing Systems, NIPS 2007 - Vancouver, BC, Canada
Duration: Dec 3, 2007 – Dec 6, 2007
Link to publication in Scopus: http://www.scopus.com/inward/record.url?scp=84858784801&partnerID=8YFLogxK

Other

Other: 21st Annual Conference on Neural Information Processing Systems, NIPS 2007
Country: Canada
City: Vancouver, BC
Period: 12/3/07 – 12/6/07

Fingerprint

  • Unsupervised learning
  • Bayesian networks
  • Learning algorithms

ASJC Scopus subject areas

  • Information Systems

Cite this

Ranzato, M., Boureau, Y. L., & LeCun, Y. (2009). Sparse feature learning for deep belief networks. In Advances in Neural Information Processing Systems 20 - Proceedings of the 2007 Conference.

@inproceedings{d30ae18aa47746fab3a4c440d483aa84,
title = "Sparse feature learning for deep belief networks",
author = "Marc'aurelio Ranzato and Boureau, {Y. Lan} and Yann LeCun",
year = "2009",
language = "English (US)",
isbn = "160560352X",
booktitle = "Advances in Neural Information Processing Systems 20 - Proceedings of the 2007 Conference",

}
