Recurrent orthogonal networks and long-memory tasks

Mikael Henaff, Arthur Szlam, Yann LeCun

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Although RNNs have been shown to be powerful tools for processing sequential data, finding architectures or optimization strategies that allow them to model very long term dependencies is still an active area of research. In this work, we carefully analyze two synthetic datasets originally outlined in (Hochreiter & Schmidhuber, 1997) which are used to evaluate the ability of RNNs to store information over many time steps. We explicitly construct RNN solutions to these problems, and using these constructions, illuminate both the problems themselves and the way in which RNNs store different types of information in their hidden states. These constructions furthermore explain the success of recent methods that specify unitary initializations or constraints on the transition matrices.
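As context for the abstract's closing remark, the following is a minimal sketch (not the authors' code) of what an orthogonal initialization of an RNN transition matrix looks like: an orthogonal matrix preserves the norm of the hidden state under repeated application, which is why such initializations or constraints help on long-memory tasks. All names below are illustrative.

```python
import numpy as np

def orthogonal_init(n, seed=None):
    """Return an n x n orthogonal matrix via QR decomposition of a Gaussian matrix."""
    rng = np.random.default_rng(seed)
    q, r = np.linalg.qr(rng.standard_normal((n, n)))
    # Fix column signs so the result is a proper sample from the orthogonal group.
    return q * np.sign(np.diag(r))

n = 64
W = orthogonal_init(n, seed=0)
h = np.ones(n) / np.sqrt(n)  # unit-norm hidden state

# Applying the (linear part of the) recurrence many times leaves
# the hidden-state norm unchanged, up to floating-point error.
for _ in range(1000):
    h = W @ h
print(np.linalg.norm(h))
```

With a generic (non-orthogonal) transition matrix, the same loop would typically drive the norm toward zero or infinity; orthogonality removes that source of vanishing or exploding signal.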

Original language: English (US)
Title of host publication: 33rd International Conference on Machine Learning, ICML 2016
Publisher: International Machine Learning Society (IMLS)
Pages: 2978-2986
Number of pages: 9
Volume: 5
ISBN (Electronic): 9781510829008
State: Published - 2016
Event: 33rd International Conference on Machine Learning, ICML 2016 - New York City, United States
Duration: Jun 19, 2016 - Jun 24, 2016

Other

Other: 33rd International Conference on Machine Learning, ICML 2016
Country: United States
City: New York City
Period: 6/19/16 - 6/24/16


ASJC Scopus subject areas

  • Artificial Intelligence
  • Software
  • Computer Networks and Communications

Cite this

APA: Henaff, M., Szlam, A., & LeCun, Y. (2016). Recurrent orthogonal networks and long-memory tasks. In 33rd International Conference on Machine Learning, ICML 2016 (Vol. 5, pp. 2978-2986). International Machine Learning Society (IMLS).

Standard: Recurrent orthogonal networks and long-memory tasks. / Henaff, Mikael; Szlam, Arthur; LeCun, Yann. 33rd International Conference on Machine Learning, ICML 2016. Vol. 5. International Machine Learning Society (IMLS), 2016. p. 2978-2986.


Harvard: Henaff, M, Szlam, A & LeCun, Y 2016, Recurrent orthogonal networks and long-memory tasks. in 33rd International Conference on Machine Learning, ICML 2016. vol. 5, International Machine Learning Society (IMLS), pp. 2978-2986, 33rd International Conference on Machine Learning, ICML 2016, New York City, United States, 6/19/16.
Vancouver: Henaff M, Szlam A, LeCun Y. Recurrent orthogonal networks and long-memory tasks. In 33rd International Conference on Machine Learning, ICML 2016. Vol. 5. International Machine Learning Society (IMLS). 2016. p. 2978-2986.
Author: Henaff, Mikael; Szlam, Arthur; LeCun, Yann. / Recurrent orthogonal networks and long-memory tasks. 33rd International Conference on Machine Learning, ICML 2016. Vol. 5. International Machine Learning Society (IMLS), 2016. pp. 2978-2986.
@inproceedings{7d25bba3e1f240e99ba45f1de592332b,
title = "Recurrent orthogonal networks and long-memory tasks",
abstract = "Although RNNs have been shown to be powerful tools for processing sequential data, finding architectures or optimization strategies that allow them to model very long term dependencies is still an active area of research. In this work, we carefully analyze two synthetic datasets originally outlined in (Hochreiter & Schmidhuber, 1997) which are used to evaluate the ability of RNNs to store information over many time steps. We explicitly construct RNN solutions to these problems, and using these constructions, illuminate both the problems themselves and the way in which RNNs store different types of information in their hidden states. These constructions furthermore explain the success of recent methods that specify unitary initializations or constraints on the transition matrices.",
author = "Mikael Henaff and Arthur Szlam and Yann LeCun",
year = "2016",
language = "English (US)",
volume = "5",
pages = "2978--2986",
booktitle = "33rd International Conference on Machine Learning, ICML 2016",
publisher = "International Machine Learning Society (IMLS)",

}

TY - GEN

T1 - Recurrent orthogonal networks and long-memory tasks

AU - Henaff, Mikael

AU - Szlam, Arthur

AU - LeCun, Yann

PY - 2016

Y1 - 2016

N2 - Although RNNs have been shown to be powerful tools for processing sequential data, finding architectures or optimization strategies that allow them to model very long term dependencies is still an active area of research. In this work, we carefully analyze two synthetic datasets originally outlined in (Hochreiter & Schmidhuber, 1997) which are used to evaluate the ability of RNNs to store information over many time steps. We explicitly construct RNN solutions to these problems, and using these constructions, illuminate both the problems themselves and the way in which RNNs store different types of information in their hidden states. These constructions furthermore explain the success of recent methods that specify unitary initializations or constraints on the transition matrices.

AB - Although RNNs have been shown to be powerful tools for processing sequential data, finding architectures or optimization strategies that allow them to model very long term dependencies is still an active area of research. In this work, we carefully analyze two synthetic datasets originally outlined in (Hochreiter & Schmidhuber, 1997) which are used to evaluate the ability of RNNs to store information over many time steps. We explicitly construct RNN solutions to these problems, and using these constructions, illuminate both the problems themselves and the way in which RNNs store different types of information in their hidden states. These constructions furthermore explain the success of recent methods that specify unitary initializations or constraints on the transition matrices.

UR - http://www.scopus.com/inward/record.url?scp=84999048318&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84999048318&partnerID=8YFLogxK

M3 - Conference contribution

AN - SCOPUS:84999048318

VL - 5

SP - 2978

EP - 2986

BT - 33rd International Conference on Machine Learning, ICML 2016

PB - International Machine Learning Society (IMLS)

ER -