Learning convolutional feature hierarchies for visual recognition

Koray Kavukcuoglu, Pierre Sermanet, Y. Lan Boureau, Karol Gregor, Michaël Mathieu, Yann LeCun

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

We propose an unsupervised method for learning multi-stage hierarchies of sparse convolutional features. While sparse coding has become an increasingly popular method for learning visual features, it is most often trained at the patch level. Applying the resulting filters convolutionally results in highly redundant codes because overlapping patches are encoded in isolation. By training convolutionally over large image windows, our method reduces the redundancy between feature vectors at neighboring locations and improves the efficiency of the overall representation. In addition to a linear decoder that reconstructs the image from sparse features, our method trains an efficient feed-forward encoder that predicts quasi-sparse features from the input. While patch-based training rarely produces anything but oriented edge detectors, we show that convolutional training produces highly diverse filters, including center-surround filters, corner detectors, cross detectors, and oriented grating detectors. We show that using these filters in a multi-stage convolutional network architecture improves performance on a number of visual recognition and detection tasks.
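The abstract names three ingredients (a linear convolutional decoder, a sparsity constraint on the features, and a feed-forward encoder that predicts the features) but does not state the training objective. The following is a minimal NumPy sketch of one energy that combines those three terms, given only as an illustration. The function names, the tanh encoder nonlinearity, and the weights `lam` and `alpha` are assumptions made for this sketch and are not taken from this record.

```python
import numpy as np


def conv2d_full(z, d):
    """Full 2-D convolution of a feature map z with a filter d (NumPy only)."""
    zh, zw = z.shape
    dh, dw = d.shape
    out = np.zeros((zh + dh - 1, zw + dw - 1))
    for i in range(dh):
        for j in range(dw):
            # out[m, n] accumulates d[i, j] * z[m - i, n - j]
            out[i:i + zh, j:j + zw] += d[i, j] * z
    return out


def corr2d_valid(x, w):
    """Valid 2-D cross-correlation of an image x with a filter w."""
    xh, xw = x.shape
    wh, ww = w.shape
    oh, ow = xh - wh + 1, xw - ww + 1
    out = np.zeros((oh, ow))
    for i in range(wh):
        for j in range(ww):
            out += w[i, j] * x[i:i + oh, j:j + ow]
    return out


def energy(x, z, D, W, lam=0.1, alpha=1.0):
    """Hypothetical energy: reconstruction + L1 sparsity + encoder prediction.

    x: image, shape (H, W_img)
    z: feature maps, shape (K, H - s + 1, W_img - s + 1)
    D: decoder filters, shape (K, s, s)
    W: encoder filters, shape (K, s, s)
    lam, alpha: illustrative weights (assumed, not from the source)
    """
    # Linear decoder: the image is the sum of feature maps convolved with filters.
    recon = sum(conv2d_full(z[k], D[k]) for k in range(len(D)))
    # Feed-forward encoder: one filter bank plus an assumed tanh nonlinearity.
    pred = np.stack([np.tanh(corr2d_valid(x, W[k])) for k in range(len(W))])
    rec_term = 0.5 * np.sum((x - recon) ** 2)
    sparsity = lam * np.sum(np.abs(z))
    pred_term = alpha * np.sum((z - pred) ** 2)
    return rec_term + sparsity + pred_term


# Usage: evaluate the energy for a random image and random filters.
rng = np.random.default_rng(0)
image = rng.normal(size=(8, 8))
decoder = rng.normal(size=(4, 3, 3))
encoder = rng.normal(size=(4, 3, 3))
codes = np.zeros((4, 6, 6))
value = energy(image, codes, decoder, encoder)
```

Because the reconstruction term is computed over the whole window rather than per patch, each feature-map entry is shared by all the pixels it overlaps. This is one way to read the abstract's claim that convolutional training reduces redundancy between neighboring feature vectors.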

Original language: English (US)
Title of host publication: Advances in Neural Information Processing Systems 23: 24th Annual Conference on Neural Information Processing Systems 2010, NIPS 2010
State: Published - 2010
Event: 24th Annual Conference on Neural Information Processing Systems 2010, NIPS 2010 - Vancouver, BC, Canada
Duration: Dec 6 2010 → Dec 9 2010

Other

Other: 24th Annual Conference on Neural Information Processing Systems 2010, NIPS 2010
Country: Canada
City: Vancouver, BC
Period: 12/6/10 → 12/9/10

Fingerprint

Detectors
Network architecture

ASJC Scopus subject areas

  • Information Systems

Cite this

Kavukcuoglu, K., Sermanet, P., Boureau, Y. L., Gregor, K., Mathieu, M., & LeCun, Y. (2010). Learning convolutional feature hierarchies for visual recognition. In Advances in Neural Information Processing Systems 23: 24th Annual Conference on Neural Information Processing Systems 2010, NIPS 2010

@inproceedings{caab05845c9648148bb4346ef6f29016,
title = "Learning convolutional feature hierarchies for visual recognition",
abstract = "We propose an unsupervised method for learning multi-stage hierarchies of sparse convolutional features. While sparse coding has become an increasingly popular method for learning visual features, it is most often trained at the patch level. Applying the resulting filters convolutionally results in highly redundant codes because overlapping patches are encoded in isolation. By training convolutionally over large image windows, our method reduces the redundancy between feature vectors at neighboring locations and improves the efficiency of the overall representation. In addition to a linear decoder that reconstructs the image from sparse features, our method trains an efficient feed-forward encoder that predicts quasi-sparse features from the input. While patch-based training rarely produces anything but oriented edge detectors, we show that convolutional training produces highly diverse filters, including center-surround filters, corner detectors, cross detectors, and oriented grating detectors. We show that using these filters in a multi-stage convolutional network architecture improves performance on a number of visual recognition and detection tasks.",
author = "Koray Kavukcuoglu and Pierre Sermanet and Boureau, {Y. Lan} and Karol Gregor and Micha{\"e}l Mathieu and Yann LeCun",
year = "2010",
language = "English (US)",
isbn = "9781617823800",
booktitle = "Advances in Neural Information Processing Systems 23: 24th Annual Conference on Neural Information Processing Systems 2010, NIPS 2010",

}

TY - GEN

T1 - Learning convolutional feature hierarchies for visual recognition

AU - Kavukcuoglu, Koray

AU - Sermanet, Pierre

AU - Boureau, Y. Lan

AU - Gregor, Karol

AU - Mathieu, Michaël

AU - LeCun, Yann

PY - 2010

Y1 - 2010

AB - We propose an unsupervised method for learning multi-stage hierarchies of sparse convolutional features. While sparse coding has become an increasingly popular method for learning visual features, it is most often trained at the patch level. Applying the resulting filters convolutionally results in highly redundant codes because overlapping patches are encoded in isolation. By training convolutionally over large image windows, our method reduces the redundancy between feature vectors at neighboring locations and improves the efficiency of the overall representation. In addition to a linear decoder that reconstructs the image from sparse features, our method trains an efficient feed-forward encoder that predicts quasi-sparse features from the input. While patch-based training rarely produces anything but oriented edge detectors, we show that convolutional training produces highly diverse filters, including center-surround filters, corner detectors, cross detectors, and oriented grating detectors. We show that using these filters in a multi-stage convolutional network architecture improves performance on a number of visual recognition and detection tasks.

UR - http://www.scopus.com/inward/record.url?scp=84860604923&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84860604923&partnerID=8YFLogxK

M3 - Conference contribution

SN - 9781617823800

BT - Advances in Neural Information Processing Systems 23: 24th Annual Conference on Neural Information Processing Systems 2010, NIPS 2010

ER -