Adaptive deconvolutional networks for mid and high level feature learning

Matthew D. Zeiler, Graham W. Taylor, Rob Fergus

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

We present a hierarchical model that learns image decompositions via alternating layers of convolutional sparse coding and max pooling. When trained on natural images, the layers of our model capture image information in a variety of forms: low-level edges, mid-level edge junctions, high-level object parts, and complete objects. To build our model, we rely on a novel inference scheme that ensures each layer reconstructs the input, rather than just the output of the layer directly beneath, as is common with existing hierarchical approaches. This makes it possible to learn multiple layers of representation, and we show models with 4 layers, trained on images from the Caltech-101 and Caltech-256 datasets. When combined with a standard classifier, features extracted from these models outperform SIFT, as well as representations from other feature learning methods.
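
The building block named in the abstract, convolutional sparse coding followed by max pooling, can be illustrated with a minimal single-layer sketch. This is not the authors' implementation, and several choices are assumptions made only for illustration: ISTA is used as a generic sparse-coding solver, the filters are random rather than learned, pooling is 2-D spatial and keeps the largest-magnitude entry per block, and every function name is hypothetical. The paper's central idea, having each layer reconstruct the input image rather than the layer beneath, is not reproduced here; the sketch only shows how recorded pooling locations (called "switches" below) let an unpooling step place values back where they came from.

```python
import numpy as np

def conv2d_valid(x, f):
    """Valid 2-D cross-correlation of x with filter f."""
    H, W = x.shape
    h, w = f.shape
    out = np.zeros((H - h + 1, W - w + 1))
    for i in range(h):
        for j in range(w):
            out += f[i, j] * x[i:i + H - h + 1, j:j + W - w + 1]
    return out

def conv2d_full(z, f):
    """Full 2-D convolution of feature map z with filter f."""
    Hz, Wz = z.shape
    h, w = f.shape
    out = np.zeros((Hz + h - 1, Wz + w - 1))
    for i in range(h):
        for j in range(w):
            out[i:i + Hz, j:j + Wz] += f[i, j] * z
    return out

def reconstruct(z, filters):
    """Image estimate: sum over feature maps of z_k convolved with f_k."""
    return sum(conv2d_full(zk, fk) for zk, fk in zip(z, filters))

def objective(x, z, filters, lam):
    """0.5 * squared reconstruction error plus an L1 penalty on the codes."""
    r = reconstruct(z, filters) - x
    return 0.5 * np.sum(r ** 2) + lam * np.sum(np.abs(z))

def ista(x, filters, lam=0.1, n_iter=50):
    """ISTA for the codes z (assumed solver, fixed filters)."""
    K, h, w = filters.shape
    z = np.zeros((K, x.shape[0] - h + 1, x.shape[1] - w + 1))
    # Conservative step: 1 / (upper bound on the Lipschitz constant).
    step = 1.0 / np.sum(np.abs(filters).sum(axis=(1, 2)) ** 2)
    for _ in range(n_iter):
        r = reconstruct(z, filters) - x
        # The adjoint of full convolution is valid cross-correlation.
        grad = np.stack([conv2d_valid(r, fk) for fk in filters])
        z = z - step * grad
        # Soft-thresholding (shrinkage) enforces sparsity.
        z = np.sign(z) * np.maximum(np.abs(z) - step * lam, 0.0)
    return z

def max_pool_with_switches(z, p=2):
    """Keep the largest-magnitude entry of each p x p block and its index."""
    Hp, Wp = z.shape[0] // p, z.shape[1] // p
    blocks = (z[:Hp * p, :Wp * p]
              .reshape(Hp, p, Wp, p)
              .transpose(0, 2, 1, 3)
              .reshape(Hp, Wp, p * p))
    switches = np.abs(blocks).argmax(axis=2)
    pooled = np.take_along_axis(blocks, switches[..., None], axis=2)[..., 0]
    return pooled, switches

def unpool(pooled, switches, p=2):
    """Place each pooled value back at its recorded location, zeros elsewhere."""
    Hp, Wp = pooled.shape
    blocks = np.zeros((Hp, Wp, p * p))
    np.put_along_axis(blocks, switches[..., None], pooled[..., None], axis=2)
    return (blocks.reshape(Hp, Wp, p, p)
                  .transpose(0, 2, 1, 3)
                  .reshape(Hp * p, Wp * p))

# Usage on random data (illustrative only; no real images or learned filters).
rng = np.random.default_rng(0)
image = rng.standard_normal((12, 12))
filters = rng.standard_normal((4, 3, 3))
codes = ista(image, filters, lam=0.1, n_iter=50)
pooled, switches = max_pool_with_switches(codes[0])
restored = unpool(pooled, switches)
print(codes.shape, pooled.shape, restored.shape)
```

Recording the switches is what makes the pooling step invertible up to the discarded entries, so a reconstruction can be traced from pooled features back toward pixel space.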

Original language: English (US)
Title of host publication: 2011 International Conference on Computer Vision, ICCV 2011
Pages: 2018-2025
Number of pages: 8
DOIs: https://doi.org/10.1109/ICCV.2011.6126474
State: Published - 2011
Event: 2011 IEEE International Conference on Computer Vision, ICCV 2011 - Barcelona, Spain
Duration: Nov 6 2011 to Nov 13 2011

Fingerprint

Classifiers
Decomposition

ASJC Scopus subject areas

  • Software
  • Computer Vision and Pattern Recognition

Cite this

Zeiler, M. D., Taylor, G. W., & Fergus, R. (2011). Adaptive deconvolutional networks for mid and high level feature learning. In 2011 International Conference on Computer Vision, ICCV 2011 (pp. 2018-2025). [6126474] https://doi.org/10.1109/ICCV.2011.6126474

@inproceedings{12c788ee148043d892b74b4d15d8c192,
title = "Adaptive deconvolutional networks for mid and high level feature learning",
author = "Zeiler, {Matthew D.} and Taylor, {Graham W.} and Fergus, Rob",
year = "2011",
doi = "10.1109/ICCV.2011.6126474",
language = "English (US)",
isbn = "9781457711015",
pages = "2018--2025",
booktitle = "2011 International Conference on Computer Vision, ICCV 2011",

}

TY - GEN
T1 - Adaptive deconvolutional networks for mid and high level feature learning
AU - Zeiler, Matthew D.
AU - Taylor, Graham W.
AU - Fergus, Rob
PY - 2011
Y1 - 2011
UR - http://www.scopus.com/inward/record.url?scp=84856686379&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84856686379&partnerID=8YFLogxK
U2 - 10.1109/ICCV.2011.6126474
DO - 10.1109/ICCV.2011.6126474
M3 - Conference contribution
AN - SCOPUS:84856686379
SN - 9781457711015
SP - 2018
EP - 2025
BT - 2011 International Conference on Computer Vision, ICCV 2011
ER -