Fast approximations to structured sparse coding and applications to object classification

Arthur Szlam, Karol Gregor, Yann LeCun

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

We describe a method for fast approximation of sparse coding. A given input vector is passed through a binary tree. Each leaf of the tree contains a subset of dictionary elements. The coefficients corresponding to these dictionary elements are allowed to be nonzero, and their values are calculated quickly by multiplication with a precomputed pseudoinverse. The tree parameters, the dictionary, and the subsets of the dictionary corresponding to each leaf are learned. In the process of describing this algorithm, we discuss the more general problem of learning the groups in group structured sparse modeling. We show that our method creates good sparse representations by using it in the object recognition framework of [1,2]. With our own fast implementation of the SIFT descriptor, the whole system runs at 20 frames per second on 321 x 481 images on a laptop with a quad-core CPU, while sacrificing very little accuracy on the Caltech 101, Caltech 256, and 15 Scenes benchmarks.
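The encoding step described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the hyperplane split at each tree node, the heap-ordered tree layout, and every name below (`build_leaf_pinvs`, `route`, `encode`, `W`, `b`, `groups`) are assumptions made for illustration, and the parameters are random rather than learned. It only shows the general shape of the idea: route the input to a leaf, then obtain the coefficients of that leaf's dictionary subset with one multiplication by a precomputed pseudoinverse.

```python
import numpy as np


def build_leaf_pinvs(D, groups):
    """Precompute the pseudoinverse of each leaf's sub-dictionary D[:, group]."""
    return [np.linalg.pinv(D[:, g]) for g in groups]


def route(x, W, b, depth):
    """Walk a complete binary tree stored in heap order and return the leaf index.

    Assumption: each internal node tests the sign of a hyperplane W[node] @ x + b[node].
    The abstract does not specify the form of the node decision.
    """
    node = 0
    for _ in range(depth):
        go_right = float(W[node] @ x + b[node]) > 0.0
        node = 2 * node + 1 + int(go_right)
    # Leaves occupy heap positions 2**depth - 1 .. 2**(depth + 1) - 2.
    return node - (2 ** depth - 1)


def encode(x, D, groups, pinvs, W, b, depth):
    """Return a sparse code z whose support is restricted to the selected leaf's group."""
    leaf = route(x, W, b, depth)
    z = np.zeros(D.shape[1])
    # Least-squares coefficients for the active atoms via one matrix-vector product.
    z[groups[leaf]] = pinvs[leaf] @ x
    return z, leaf


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, K, depth, k = 8, 16, 2, 3            # input dim, atoms, tree depth, atoms per leaf
    D = rng.standard_normal((n, K))         # random stand-in for a learned dictionary
    groups = [rng.choice(K, size=k, replace=False) for _ in range(2 ** depth)]
    pinvs = build_leaf_pinvs(D, groups)
    W = rng.standard_normal((2 ** depth - 1, n))   # random stand-in for learned splits
    b = np.zeros(2 ** depth - 1)
    x = rng.standard_normal(n)
    z, leaf = encode(x, D, groups, pinvs, W, b, depth)
    print("leaf:", leaf, "nonzeros:", np.flatnonzero(z))
    print("relative residual:", np.linalg.norm(x - D @ z) / np.linalg.norm(x))
```

In this sketch the per-input cost is `depth` dot products for routing plus one small matrix-vector product, which is the kind of saving over iterative sparse solvers that the abstract alludes to; how the tree, dictionary, and groups are learned is outside its scope.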

Original language: English (US)
Title of host publication: Computer Vision, ECCV 2012 - 12th European Conference on Computer Vision, Proceedings
Pages: 200-213
Number of pages: 14
Volume: 7576 LNCS
Edition: PART 5
DOIs: https://doi.org/10.1007/978-3-642-33715-4_15
State: Published - 2012
Event: 12th European Conference on Computer Vision, ECCV 2012 - Florence, Italy
Duration: Oct 7 2012 - Oct 13 2012

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Number: PART 5
Volume: 7576 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Other

Other: 12th European Conference on Computer Vision, ECCV 2012
Country: Italy
City: Florence
Period: 10/7/12 - 10/13/12

Fingerprint

Sparse Coding
Object Classification
Glossaries
Approximation
Leaves
Pseudo-inverse
Binary trees
Subset
Sparse Representation
Scale Invariant Feature Transform
Object recognition
Descriptors
Multiplication
Benchmark
Dictionary
Coefficient
Modeling

ASJC Scopus subject areas

  • Computer Science(all)
  • Theoretical Computer Science

Cite this

Szlam, A., Gregor, K., & LeCun, Y. (2012). Fast approximations to structured sparse coding and applications to object classification. In Computer Vision, ECCV 2012 - 12th European Conference on Computer Vision, Proceedings (PART 5 ed., Vol. 7576 LNCS, pp. 200-213). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 7576 LNCS, No. PART 5). https://doi.org/10.1007/978-3-642-33715-4_15

Fast approximations to structured sparse coding and applications to object classification. / Szlam, Arthur; Gregor, Karol; LeCun, Yann.

Computer Vision, ECCV 2012 - 12th European Conference on Computer Vision, Proceedings. Vol. 7576 LNCS PART 5. ed. 2012. p. 200-213 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 7576 LNCS, No. PART 5).

Szlam, A, Gregor, K & LeCun, Y 2012, Fast approximations to structured sparse coding and applications to object classification. in Computer Vision, ECCV 2012 - 12th European Conference on Computer Vision, Proceedings. PART 5 edn, vol. 7576 LNCS, Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), no. PART 5, vol. 7576 LNCS, pp. 200-213, 12th European Conference on Computer Vision, ECCV 2012, Florence, Italy, 10/7/12. https://doi.org/10.1007/978-3-642-33715-4_15

Szlam A, Gregor K, LeCun Y. Fast approximations to structured sparse coding and applications to object classification. In Computer Vision, ECCV 2012 - 12th European Conference on Computer Vision, Proceedings. PART 5 ed. Vol. 7576 LNCS. 2012. p. 200-213. (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); PART 5). https://doi.org/10.1007/978-3-642-33715-4_15

Szlam, Arthur ; Gregor, Karol ; LeCun, Yann. / Fast approximations to structured sparse coding and applications to object classification. Computer Vision, ECCV 2012 - 12th European Conference on Computer Vision, Proceedings. Vol. 7576 LNCS PART 5. ed. 2012. pp. 200-213 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); PART 5).

@inproceedings{7daba8c242614e3c85c7d5cbad42a33a,
title = "Fast approximations to structured sparse coding and applications to object classification",
abstract = "We describe a method for fast approximation of sparse coding. A given input vector is passed through a binary tree. Each leaf of the tree contains a subset of dictionary elements. The coefficients corresponding to these dictionary elements are allowed to be nonzero, and their values are calculated quickly by multiplication with a precomputed pseudoinverse. The tree parameters, the dictionary, and the subsets of the dictionary corresponding to each leaf are learned. In the process of describing this algorithm, we discuss the more general problem of learning the groups in group structured sparse modeling. We show that our method creates good sparse representations by using it in the object recognition framework of [1,2]. With our own fast implementation of the SIFT descriptor, the whole system runs at 20 frames per second on 321 x 481 images on a laptop with a quad-core CPU, while sacrificing very little accuracy on the Caltech 101, Caltech 256, and 15 Scenes benchmarks.",
author = "Arthur Szlam and Karol Gregor and Yann LeCun",
year = "2012",
doi = "10.1007/978-3-642-33715-4_15",
language = "English (US)",
isbn = "9783642337147",
volume = "7576 LNCS",
series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
number = "PART 5",
pages = "200--213",
booktitle = "Computer Vision, ECCV 2012 - 12th European Conference on Computer Vision, Proceedings",
edition = "PART 5",
}

TY - GEN

T1 - Fast approximations to structured sparse coding and applications to object classification

AU - Szlam, Arthur

AU - Gregor, Karol

AU - LeCun, Yann

PY - 2012

Y1 - 2012

N2 - We describe a method for fast approximation of sparse coding. A given input vector is passed through a binary tree. Each leaf of the tree contains a subset of dictionary elements. The coefficients corresponding to these dictionary elements are allowed to be nonzero, and their values are calculated quickly by multiplication with a precomputed pseudoinverse. The tree parameters, the dictionary, and the subsets of the dictionary corresponding to each leaf are learned. In the process of describing this algorithm, we discuss the more general problem of learning the groups in group structured sparse modeling. We show that our method creates good sparse representations by using it in the object recognition framework of [1,2]. With our own fast implementation of the SIFT descriptor, the whole system runs at 20 frames per second on 321 x 481 images on a laptop with a quad-core CPU, while sacrificing very little accuracy on the Caltech 101, Caltech 256, and 15 Scenes benchmarks.

UR - http://www.scopus.com/inward/record.url?scp=84867882953&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84867882953&partnerID=8YFLogxK

U2 - 10.1007/978-3-642-33715-4_15

DO - 10.1007/978-3-642-33715-4_15

M3 - Conference contribution

SN - 9783642337147

VL - 7576 LNCS

T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

SP - 200

EP - 213

BT - Computer Vision, ECCV 2012 - 12th European Conference on Computer Vision, Proceedings

ER -