Unsupervised learning of invariant feature hierarchies with applications to object recognition

Marc'Aurelio Ranzato, Fu Jie Huang, Y. Lan Boureau, Yann LeCun

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

We present an unsupervised method for learning a hierarchy of sparse feature detectors that are invariant to small shifts and distortions. The resulting feature extractor consists of multiple convolution filters, followed by a feature-pooling layer that computes the max of each filter output within adjacent windows, and a point-wise sigmoid non-linearity. A second level of larger and more invariant features is obtained by training the same algorithm on patches of features from the first level. Training a supervised classifier on these features yields 0.64% error on MNIST, and 54% average recognition rate on Caltech 101 with 30 training samples per category. While the resulting architecture is similar to convolutional networks, the layer-wise unsupervised training procedure alleviates the over-parameterization problems that plague purely supervised learning procedures, and yields good performance with very few labeled training samples.
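The feature extractor described above (convolution filters, max pooling over adjacent windows, then a point-wise sigmoid) can be sketched with numpy as below. This is an illustrative reconstruction only: the input size, filter count, filter values, and pooling window are assumptions for the example, not the authors' trained parameters or code.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def feature_layer(image, filters, pool=2):
    """One stage: valid 2-D convolution per filter, max of each filter
    output within adjacent pool x pool windows, point-wise sigmoid."""
    fh, fw = filters.shape[1:]
    oh, ow = image.shape[0] - fh + 1, image.shape[1] - fw + 1
    maps = np.empty((filters.shape[0], oh, ow))
    # Valid cross-correlation of the image with each filter.
    for k, f in enumerate(filters):
        for i in range(oh):
            for j in range(ow):
                maps[k, i, j] = np.sum(image[i:i + fh, j:j + fw] * f)
    # Feature pooling: max within non-overlapping pool x pool windows.
    ph, pw = oh // pool, ow // pool
    pooled = maps[:, :ph * pool, :pw * pool].reshape(
        filters.shape[0], ph, pool, pw, pool).max(axis=(2, 4))
    return sigmoid(pooled)

rng = np.random.default_rng(0)
image = rng.standard_normal((28, 28))     # MNIST-sized input (assumed)
filters = rng.standard_normal((8, 5, 5))  # 8 hypothetical 5x5 filters
out = feature_layer(image, filters)
print(out.shape)  # (8, 12, 12)
```

A second, more invariant level is obtained by applying the same kind of layer again to patches of `out`, as the abstract describes.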

Original language: English (US)
Title of host publication: 2007 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, CVPR'07
DOI: 10.1109/CVPR.2007.383157
State: Published - 2007
Event: 2007 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, CVPR'07 - Minneapolis, MN, United States
Duration: Jun 17, 2007 – Jun 22, 2007

ASJC Scopus subject areas

  • Electrical and Electronic Engineering
  • Computer Vision and Pattern Recognition
  • Software
  • Control and Systems Engineering

Cite this

Ranzato, M. A., Huang, F. J., Boureau, Y. L., & LeCun, Y. (2007). Unsupervised learning of invariant feature hierarchies with applications to object recognition. In 2007 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, CVPR'07 (Article 4270182). https://doi.org/10.1109/CVPR.2007.383157

@inproceedings{a0e6bfb584ff44d2a6b2756c099dacff,
title = "Unsupervised learning of invariant feature hierarchies with applications to object recognition",
abstract = "We present an unsupervised method for learning a hierarchy of sparse feature detectors that are invariant to small shifts and distortions. The resulting feature extractor consists of multiple convolution filters, followed by a feature-pooling layer that computes the max of each filter output within adjacent windows, and a point-wise sigmoid non-linearity. A second level of larger and more invariant features is obtained by training the same algorithm on patches of features from the first level. Training a supervised classifier on these features yields 0.64\% error on MNIST, and 54\% average recognition rate on Caltech 101 with 30 training samples per category. While the resulting architecture is similar to convolutional networks, the layer-wise unsupervised training procedure alleviates the over-parameterization problems that plague purely supervised learning procedures, and yields good performance with very few labeled training samples.",
author = "Marc'Aurelio Ranzato and Huang, {Fu Jie} and Boureau, {Y. Lan} and Yann LeCun",
year = "2007",
doi = "10.1109/CVPR.2007.383157",
language = "English (US)",
isbn = "1424411807",
booktitle = "2007 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, CVPR'07",

}
