Generalization without systematicity: On the compositional skills of sequence-to-sequence recurrent networks

Brenden Lake, Marco Baroni

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Humans can understand and produce new utterances effortlessly, thanks to their compositional skills. Once a person learns the meaning of a new verb "dax," he or she can immediately understand the meaning of "dax twice" or "sing and dax." In this paper, we introduce the SCAN domain, consisting of a set of simple compositional navigation commands paired with the corresponding action sequences. We then test the zero-shot generalization capabilities of a variety of recurrent neural networks (RNNs) trained on SCAN with sequence-to-sequence methods. We find that RNNs can make successful zero-shot generalizations when the differences between training and test commands are small, so that they can apply "mix-and-match" strategies to solve the task. However, when generalization requires systematic compositional skills (as in the "dax" example above), RNNs fail spectacularly. We conclude with a proof-of-concept experiment in neural machine translation, suggesting that lack of systematicity might be partially responsible for neural networks' notorious training data thirst.
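
To make the SCAN setup concrete, the sketch below illustrates the kind of command-to-action mapping the abstract describes. It is an illustrative Python interpreter written for this summary, not the authors' released generator: the grammar is reduced to primitives, "twice"/"thrice", and "and"/"after", and the primitive "dax" with its "I_DAX" token is a hypothetical addition mirroring the example above.

# Minimal sketch of SCAN-style command -> action-sequence pairs.
# Simplified, assumed grammar; the full SCAN grammar is defined in the paper.
PRIMITIVES = {
    "walk": ["I_WALK"],
    "run": ["I_RUN"],
    "jump": ["I_JUMP"],
    "look": ["I_LOOK"],
    "dax": ["I_DAX"],  # hypothetical new verb, as in the abstract's example
}

def interpret(command: str) -> list[str]:
    """Map a compositional command to its action sequence."""
    if " and " in command:                       # "x and y": execute x, then y
        left, right = command.split(" and ", 1)
        return interpret(left) + interpret(right)
    if " after " in command:                     # "x after y": execute y, then x
        left, right = command.split(" after ", 1)
        return interpret(right) + interpret(left)
    if command.endswith(" twice"):               # repetition modifiers
        return interpret(command[:-len(" twice")]) * 2
    if command.endswith(" thrice"):
        return interpret(command[:-len(" thrice")]) * 3
    return PRIMITIVES[command]

# A systematic learner that acquires "dax" in isolation should handle its
# compositions zero-shot; the paper shows seq2seq RNNs largely do not.
assert interpret("dax twice") == ["I_DAX", "I_DAX"]
assert interpret("jump after walk twice") == ["I_WALK", "I_WALK", "I_JUMP"]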

Original language: English (US)
Title of host publication: 35th International Conference on Machine Learning, ICML 2018
Editors: Jennifer Dy, Andreas Krause
Publisher: International Machine Learning Society (IMLS)
Pages: 4487-4499
Number of pages: 13
Volume: 7
ISBN (Electronic): 9781510867963
State: Published - Jan 1 2018
Event: 35th International Conference on Machine Learning, ICML 2018 - Stockholm, Sweden
Duration: Jul 10 2018 - Jul 15 2018

ASJC Scopus subject areas

  • Computational Theory and Mathematics
  • Human-Computer Interaction
  • Software

Cite this

Lake, B., & Baroni, M. (2018). Generalization without systematicity: On the compositional skills of sequence-to-sequence recurrent networks. In J. Dy & A. Krause (Eds.), 35th International Conference on Machine Learning, ICML 2018 (Vol. 7, pp. 4487-4499). International Machine Learning Society (IMLS).

@inproceedings{886a37b5fc2f43449e4bca3b5557e3ae,
title = "Generalization without systematicity: On the compositional skills of sequence-to-sequence recurrent networks",
abstract = "Humans can understand and produce new utterances effortlessly, thanks to their compositional skills. Once a person learns the meaning of a new verb {"}dax,{"} he or she can immediately understand the meaning of {"}dax twice{"} or {"}sing and dax.{"} In this paper, we introduce the SCAN domain, consisting of a set of simple compositional navigation commands paired with the corresponding action sequences. We then test the zero-shot generalization capabilities of a variety of recurrent neural networks (RNNs) trained on SCAN with sequence-to-sequence methods. We find that RNNs can make successful zero-shot generalizations when the differences between training and test commands are small, so that they can apply {"}mix-and-match{"} strategies to solve the task. However, when generalization requires systematic compositional skills (as in the {"}dax{"} example above), RNNs fail spectacularly. We conclude with a proof-of-concept experiment in neural machine translation, suggesting that lack of systematicity might be partially responsible for neural networks' notorious training data thirst.",
author = "Brenden Lake and Marco Baroni",
year = "2018",
month = "1",
day = "1",
language = "English (US)",
volume = "7",
pages = "4487--4499",
editor = "Jennifer Dy and Andreas Krause",
booktitle = "35th International Conference on Machine Learning, ICML 2018",
publisher = "International Machine Learning Society (IMLS)",

}
