Improving generalization performance using double backpropagation

Harris Drucker, Yann LeCun

Research output: Contribution to journal › Article

Abstract

In order to generalize from a training set to a test set, it is desirable that small changes in the input space of a pattern do not change the output components. This can be done by forcing this behavior as part of the training algorithm. This is done in double backpropagation by forming an energy function that is the sum of the normal energy term found in backpropagation and an additional term that is a function of the Jacobian. Significant improvement is shown with different architectures and different test sets, especially with architectures that had previously been shown to have very good performance when trained using backpropagation. It is shown that double backpropagation, as compared to backpropagation, creates weights that are smaller, thereby causing the output of the neurons to spend more time in the linear region.
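The abstract describes the double-backpropagation objective as the usual backpropagation energy plus an extra term built from the Jacobian of that energy with respect to the input pattern. The sketch below is a minimal, hypothetical JAX illustration of that idea, not the authors' original code: it penalizes the squared norm of the input gradient of a per-pattern squared-error loss, and the network architecture, parameter names, and penalty weight are assumptions made for the example.

```python
# Minimal sketch of the double-backpropagation idea (illustrative only).
import jax
import jax.numpy as jnp

def mlp(params, x):
    """Tiny feed-forward net with tanh units (assumed architecture)."""
    w1, b1, w2, b2 = params
    h = jnp.tanh(x @ w1 + b1)
    return jnp.tanh(h @ w2 + b2)

def task_loss(params, x, y):
    """Ordinary backpropagation energy: squared error for one pattern."""
    return 0.5 * jnp.sum((mlp(params, x) - y) ** 2)

def double_bp_loss(params, x, y, penalty_weight=0.1):
    """Sum of the usual energy and a penalty on its input gradient."""
    # Gradient of the per-pattern energy with respect to the *input* x.
    input_grad = jax.grad(task_loss, argnums=1)(params, x, y)
    # Penalizing its squared norm discourages sensitivity to small input
    # changes; differentiating this term with respect to the weights is
    # the second ("double") backward pass.
    return task_loss(params, x, y) + penalty_weight * jnp.sum(input_grad ** 2)

# Weight gradients come from the combined objective.
grad_fn = jax.grad(double_bp_loss)

# Toy usage with random parameters and one training pattern.
key = jax.random.PRNGKey(0)
k1, k2, k3 = jax.random.split(key, 3)
params = (
    jax.random.normal(k1, (4, 8)) * 0.1, jnp.zeros(8),
    jax.random.normal(k2, (8, 2)) * 0.1, jnp.zeros(2),
)
x = jax.random.normal(k3, (4,))
y = jnp.array([1.0, -1.0])
grads = grad_fn(params, x, y)  # gradients via double backpropagation
```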

Original language: English (US)
Pages (from-to): 991-997
Number of pages: 7
Journal: IEEE Transactions on Neural Networks
Volume: 3
Issue number: 6
DOIs: 10.1109/72.165600
State: Published - Nov 1992

ASJC Scopus subject areas

  • Artificial Intelligence
  • Computational Theory and Mathematics
  • Hardware and Architecture
  • Control and Systems Engineering
  • Electrical and Electronic Engineering
  • Theoretical Computer Science

Cite this

Improving generalization performance using double backpropagation. / Drucker, Harris; LeCun, Yann.

In: IEEE Transactions on Neural Networks, Vol. 3, No. 6, 11.1992, p. 991-997.

Research output: Contribution to journal › Article

@article{e9ce76219c05429fa1b14d28816919af,
title = "Improving generalization performance using double backpropagation",
abstract = "In order to generalize from a training set to a test set, it is desirable that small changes in the input space of a pattern do not change the output components. This can be done by forcing this behavior as part of the training algorithm. This is done in double backpropagation by forming an energy function that is the sum of the normal energy term found in backpropagation and an additional term that is a function of the Jacobian. Significant improvement is shown with different architectures and different test sets, especially with architectures that had previously been shown to have very good performance when trained using backpropagation. It is shown that double backpropagation, as compared to backpropagation, creates weights that are smaller, thereby causing the output of the neurons to spend more time in the linear region.",
author = "Harris Drucker and Yann LeCun",
year = "1992",
month = "11",
doi = "10.1109/72.165600",
language = "English (US)",
volume = "3",
pages = "991--997",
journal = "IEEE Transactions on Neural Networks",
issn = "1045-9227",
publisher = "IEEE Computational Intelligence Society",
number = "6",

}

TY - JOUR

T1 - Improving generalization performance using double backpropagation

AU - Drucker, Harris

AU - LeCun, Yann

PY - 1992/11

Y1 - 1992/11

N2 - In order to generalize from a training set to a test set, it is desirable that small changes in the input space of a pattern do not change the output components. This can be done by forcing this behavior as part of the training algorithm. This is done in double backpropagation by forming an energy function that is the sum of the normal energy term found in backpropagation and an additional term that is a function of the Jacobian. Significant improvement is shown with different architectures and different test sets, especially with architectures that had previously been shown to have very good performance when trained using backpropagation. It is shown that double backpropagation, as compared to backpropagation, creates weights that are smaller, thereby causing the output of the neurons to spend more time in the linear region.

AB - In order to generalize from a training set to a test set, it is desirable that small changes in the input space of a pattern do not change the output components. This can be done by forcing this behavior as part of the training algorithm. This is done in double backpropagation by forming an energy function that is the sum of the normal energy term found in backpropagation and an additional term that is a function of the Jacobian. Significant improvement is shown with different architectures and different test sets, especially with architectures that had previously been shown to have very good performance when trained using backpropagation. It is shown that double backpropagation, as compared to backpropagation, creates weights that are smaller, thereby causing the output of the neurons to spend more time in the linear region.

UR - http://www.scopus.com/inward/record.url?scp=0026953305&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=0026953305&partnerID=8YFLogxK

U2 - 10.1109/72.165600

DO - 10.1109/72.165600

M3 - Article

AN - SCOPUS:0026953305

VL - 3

SP - 991

EP - 997

JO - IEEE Transactions on Neural Networks

JF - IEEE Transactions on Neural Networks

SN - 1045-9227

IS - 6

ER -