Exploiting linear structure within convolutional networks for efficient evaluation

Emily Denton, Wojciech Zaremba, Joan Bruna, Yann LeCun, Rob Fergus

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

We present techniques for speeding up the test-time evaluation of large convolutional networks, designed for object recognition tasks. These models deliver impressive accuracy, but each image evaluation requires millions of floating point operations, making their deployment on smartphones and Internet-scale clusters problematic. The computation is dominated by the convolution operations in the lower layers of the model. We exploit the redundancy present within the convolutional filters to derive approximations that significantly reduce the required computation. Using large state-of-the-art models, we demonstrate speedups of convolutional layers on both CPU and GPU by a factor of 2×, while keeping the accuracy within 1% of the original model.
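
The speedups come from replacing a trained layer's filter bank with a low-rank factorization and convolving with the thin factors instead of the full bank. As a minimal sketch of that underlying idea (the paper derives several specific approximations; this snippet is not any one of them), the Python code below applies a truncated SVD to a randomly initialized filter bank. The shapes, the target rank, and the cost model are illustrative assumptions, not figures from the paper.

import numpy as np

# Hypothetical filter bank: F filters over C input channels, k x k spatial extent.
F, C, k = 96, 3, 7
rng = np.random.default_rng(0)
W = rng.standard_normal((F, C, k, k))

# Flatten each filter into a row and take a truncated SVD of the whole bank.
W_mat = W.reshape(F, C * k * k)                    # shape (F, C*k*k)
U, s, Vt = np.linalg.svd(W_mat, full_matrices=False)

rank = 16                                          # assumed target rank
W_approx = (U[:, :rank] * s[:rank]) @ Vt[:rank]    # best rank-16 reconstruction

# Relative Frobenius error of the approximated filter bank. A random bank like
# this one approximates poorly; the paper's premise is that trained filters are
# highly redundant, so small ranks suffice with little accuracy loss.
err = np.linalg.norm(W_mat - W_approx) / np.linalg.norm(W_mat)
print(f"relative error at rank {rank}: {err:.3f}")

# Cost per output location: convolving with the two factors in sequence costs
# roughly rank * (C*k*k + F) multiply-adds versus F * C*k*k for the full bank.
print(f"multiply-adds per location: {F * C * k * k} -> {rank * (C * k * k + F)}")

Convolving with the two thin factors is cheaper than convolving with the full bank whenever the rank is small relative to F and C·k², which is the kind of trade behind the 2× layer speedups the abstract reports.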

Original language: English (US)
Title of host publication: Advances in Neural Information Processing Systems
Publisher: Neural information processing systems foundation
Pages: 1269-1277
Number of pages: 9
Volume: 2
Edition: January
State: Published - 2014
Event: 28th Annual Conference on Neural Information Processing Systems 2014, NIPS 2014 - Montreal, Canada
Duration: Dec 8 2014 - Dec 13 2014

Other

Other: 28th Annual Conference on Neural Information Processing Systems 2014, NIPS 2014
Country: Canada
City: Montreal
Period: 12/8/14 - 12/13/14

Fingerprint

  • Smartphones
  • Object recognition
  • Convolution
  • Program processors
  • Redundancy
  • Internet
  • Graphics processing unit

ASJC Scopus subject areas

  • Computer Networks and Communications
  • Information Systems
  • Signal Processing

Cite this

APA

Denton, E., Zaremba, W., Bruna, J., LeCun, Y., & Fergus, R. (2014). Exploiting linear structure within convolutional networks for efficient evaluation. In Advances in Neural Information Processing Systems (January ed., Vol. 2, pp. 1269-1277). Neural information processing systems foundation.

Standard

Exploiting linear structure within convolutional networks for efficient evaluation. / Denton, Emily; Zaremba, Wojciech; Bruna, Joan; LeCun, Yann; Fergus, Rob.
Advances in Neural Information Processing Systems. Vol. 2, January ed. Neural information processing systems foundation, 2014. p. 1269-1277.

Harvard

Denton, E, Zaremba, W, Bruna, J, LeCun, Y & Fergus, R 2014, Exploiting linear structure within convolutional networks for efficient evaluation. in Advances in Neural Information Processing Systems. January edn, vol. 2, Neural information processing systems foundation, pp. 1269-1277, 28th Annual Conference on Neural Information Processing Systems 2014, NIPS 2014, Montreal, Canada, 12/8/14.

Vancouver

Denton E, Zaremba W, Bruna J, LeCun Y, Fergus R. Exploiting linear structure within convolutional networks for efficient evaluation. In Advances in Neural Information Processing Systems. January ed. Vol. 2. Neural information processing systems foundation. 2014. p. 1269-1277.

Author

Denton, Emily; Zaremba, Wojciech; Bruna, Joan; LeCun, Yann; Fergus, Rob. / Exploiting linear structure within convolutional networks for efficient evaluation. Advances in Neural Information Processing Systems. Vol. 2, January ed. Neural information processing systems foundation, 2014. pp. 1269-1277.

BIBTEX

@inproceedings{e572ecd9977c40c9a00f5c340522f490,
title = "Exploiting linear structure within convolutional networks for efficient evaluation",
abstract = "We present techniques for speeding up the test-time evaluation of large convolutional networks, designed for object recognition tasks. These models deliver impressive accuracy, but each image evaluation requires millions of floating point operations, making their deployment on smartphones and Internet-scale clusters problematic. The computation is dominated by the convolution operations in the lower layers of the model. We exploit the redundancy present within the convolutional filters to derive approximations that significantly reduce the required computation. Using large state-of-the-art models, we demonstrate speedups of convolutional layers on both CPU and GPU by a factor of 2×, while keeping the accuracy within 1{\%} of the original model.",
author = "Emily Denton and Wojciech Zaremba and Joan Bruna and Yann LeCun and Rob Fergus",
year = "2014",
language = "English (US)",
volume = "2",
pages = "1269--1277",
booktitle = "Advances in Neural Information Processing Systems",
publisher = "Neural information processing systems foundation",
edition = "January",
}

RIS

TY  - GEN
T1  - Exploiting linear structure within convolutional networks for efficient evaluation
AU  - Denton, Emily
AU  - Zaremba, Wojciech
AU  - Bruna, Joan
AU  - LeCun, Yann
AU  - Fergus, Rob
PY  - 2014
Y1  - 2014
N2  - We present techniques for speeding up the test-time evaluation of large convolutional networks, designed for object recognition tasks. These models deliver impressive accuracy, but each image evaluation requires millions of floating point operations, making their deployment on smartphones and Internet-scale clusters problematic. The computation is dominated by the convolution operations in the lower layers of the model. We exploit the redundancy present within the convolutional filters to derive approximations that significantly reduce the required computation. Using large state-of-the-art models, we demonstrate speedups of convolutional layers on both CPU and GPU by a factor of 2×, while keeping the accuracy within 1% of the original model.
AB  - We present techniques for speeding up the test-time evaluation of large convolutional networks, designed for object recognition tasks. These models deliver impressive accuracy, but each image evaluation requires millions of floating point operations, making their deployment on smartphones and Internet-scale clusters problematic. The computation is dominated by the convolution operations in the lower layers of the model. We exploit the redundancy present within the convolutional filters to derive approximations that significantly reduce the required computation. Using large state-of-the-art models, we demonstrate speedups of convolutional layers on both CPU and GPU by a factor of 2×, while keeping the accuracy within 1% of the original model.
UR  - http://www.scopus.com/inward/record.url?scp=84937896655&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=84937896655&partnerID=8YFLogxK
M3  - Conference contribution
VL  - 2
SP  - 1269
EP  - 1277
BT  - Advances in Neural Information Processing Systems
PB  - Neural information processing systems foundation
ER  -