Learning invariant features through topographic filter maps

Koray Kavukcuoglu, Marc'Aurelio Ranzato, Robert Fergus, Yann LeCun

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Several recently-proposed architectures for high-performance object recognition are composed of two main stages: a feature extraction stage that extracts locally-invariant feature vectors from regularly spaced image patches, and a somewhat generic supervised classifier. The first stage is often composed of three main modules: (1) a bank of filters (often oriented edge detectors); (2) a non-linear transform, such as a point-wise squashing function, quantization, or normalization; (3) a spatial pooling operation which combines the outputs of similar filters over neighboring regions. We propose a method that automatically learns such feature extractors in an unsupervised fashion by simultaneously learning the filters and the pooling units that combine multiple filter outputs together. The method automatically generates topographic maps of similar filters that extract features of orientations, scales, and positions. These similar filters are pooled together, producing locally-invariant outputs. The learned feature descriptors give results comparable to SIFT on image recognition tasks for which SIFT is well suited, and better results than SIFT on tasks for which SIFT is less well suited.
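The three-module feature extraction stage described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the filters below are random rather than learned, tanh stands in for the squashing function, and the pooling rule (square root of the sum of squares over a wrap-around neighborhood on a 2-D grid of filters) is an assumption chosen for illustration. All function names are hypothetical.

```python
import math
import random


def apply_filter_bank(patch, filters):
    """Module 1: response of each filter = dot product with the flattened patch."""
    return [sum(w * x for w, x in zip(f, patch)) for f in filters]


def squash(responses):
    """Module 2: point-wise squashing non-linearity (tanh is an assumption)."""
    return [math.tanh(r) for r in responses]


def topographic_pool(responses, grid_h, grid_w, radius=1):
    """Module 3: pool filters that are neighbors on a grid_h x grid_w map.

    Each pooling unit returns sqrt(sum of squares) over a
    (2*radius+1) x (2*radius+1) neighborhood with wrap-around edges.
    The pooling rule is an illustrative assumption.
    """
    pooled = []
    for i in range(grid_h):
        for j in range(grid_w):
            total = 0.0
            for di in range(-radius, radius + 1):
                for dj in range(-radius, radius + 1):
                    k = ((i + di) % grid_h) * grid_w + (j + dj) % grid_w
                    total += responses[k] ** 2
            pooled.append(math.sqrt(total))
    return pooled


def extract_features(patch, filters, grid_h, grid_w):
    """Chain the three modules: filter bank, non-linearity, pooling."""
    responses = squash(apply_filter_bank(patch, filters))
    return topographic_pool(responses, grid_h, grid_w)


if __name__ == "__main__":
    rng = random.Random(0)
    # 12 random filters arranged on a 3 x 4 map, applied to a 4 x 4 patch
    # flattened to 16 values; a learned system would train these filters.
    filters = [[rng.gauss(0.0, 1.0) for _ in range(16)] for _ in range(12)]
    patch = [rng.random() for _ in range(16)]
    print(extract_features(patch, filters, 3, 4))
```

Because each pooled value depends only on the squared responses of a neighborhood of filters, it changes little when activity shifts between neighboring filters. If neighboring filters on the map respond to similar orientations, scales, or positions, that is the kind of behavior that would yield locally-invariant outputs.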

Original language: English (US)
Title of host publication: 2009 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops, CVPR Workshops 2009
Pages: 1605-1612
Number of pages: 8
DOIs: https://doi.org/10.1109/CVPRW.2009.5206545
State: Published - 2009
Event: 2009 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops, CVPR Workshops 2009 - Miami, FL, United States
Duration: Jun 20 2009 - Jun 25 2009

Other

Other: 2009 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops, CVPR Workshops 2009
Country: United States
City: Miami, FL
Period: 6/20/09 - 6/25/09

Fingerprint

Image recognition
Object recognition
Feature extraction
Classifiers
Detectors

ASJC Scopus subject areas

  • Computer Vision and Pattern Recognition
  • Biomedical Engineering

Cite this

Kavukcuoglu, K., Ranzato, MA., Fergus, R., & LeCun, Y. (2009). Learning invariant features through topographic filter maps. In 2009 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops, CVPR Workshops 2009 (pp. 1605-1612). [5206545] https://doi.org/10.1109/CVPRW.2009.5206545

@inproceedings{ec6282726eaa482a978c0682848e062d,
title = "Learning invariant features through topographic filter maps",
abstract = "Several recently-proposed architectures for high-performance object recognition are composed of two main stages: a feature extraction stage that extracts locally-invariant feature vectors from regularly spaced image patches, and a somewhat generic supervised classifier. The first stage is often composed of three main modules: (1) a bank of filters (often oriented edge detectors); (2) a non-linear transform, such as a point-wise squashing function, quantization, or normalization; (3) a spatial pooling operation which combines the outputs of similar filters over neighboring regions. We propose a method that automatically learns such feature extractors in an unsupervised fashion by simultaneously learning the filters and the pooling units that combine multiple filter outputs together. The method automatically generates topographic maps of similar filters that extract features of orientations, scales, and positions. These similar filters are pooled together, producing locally-invariant outputs. The learned feature descriptors give results comparable to SIFT on image recognition tasks for which SIFT is well suited, and better results than SIFT on tasks for which SIFT is less well suited.",
author = "Koray Kavukcuoglu and Marc'Aurelio Ranzato and Robert Fergus and Yann LeCun",
year = "2009",
doi = "10.1109/CVPRW.2009.5206545",
language = "English (US)",
isbn = "9781424439935",
pages = "1605--1612",
booktitle = "2009 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops, CVPR Workshops 2009",

}

TY - GEN

T1 - Learning invariant features through topographic filter maps

AU - Kavukcuoglu, Koray

AU - Ranzato, Marc'Aurelio

AU - Fergus, Robert

AU - LeCun, Yann

PY - 2009

Y1 - 2009

AB - Several recently-proposed architectures for high-performance object recognition are composed of two main stages: a feature extraction stage that extracts locally-invariant feature vectors from regularly spaced image patches, and a somewhat generic supervised classifier. The first stage is often composed of three main modules: (1) a bank of filters (often oriented edge detectors); (2) a non-linear transform, such as a point-wise squashing function, quantization, or normalization; (3) a spatial pooling operation which combines the outputs of similar filters over neighboring regions. We propose a method that automatically learns such feature extractors in an unsupervised fashion by simultaneously learning the filters and the pooling units that combine multiple filter outputs together. The method automatically generates topographic maps of similar filters that extract features of orientations, scales, and positions. These similar filters are pooled together, producing locally-invariant outputs. The learned feature descriptors give results comparable to SIFT on image recognition tasks for which SIFT is well suited, and better results than SIFT on tasks for which SIFT is less well suited.

UR - http://www.scopus.com/inward/record.url?scp=70450177775&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=70450177775&partnerID=8YFLogxK

U2 - 10.1109/CVPRW.2009.5206545

DO - 10.1109/CVPRW.2009.5206545

M3 - Conference contribution

SN - 9781424439935

SP - 1605

EP - 1612

BT - 2009 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops, CVPR Workshops 2009

ER -