Dynamic neural Turing machine with continuous and discrete addressing schemes

Caglar Gulcehre, Sarath Chandar, Kyunghyun Cho, Yoshua Bengio

Research output: Contribution to journal › Article

Abstract

We extend the neural Turing machine (NTM) model into a dynamic neural Turing machine (D-NTM) by introducing trainable address vectors. This addressing scheme maintains two separate vectors for each memory cell: a content vector and an address vector. This allows the D-NTM to learn a wide variety of location-based addressing strategies, including both linear and nonlinear ones. We implement the D-NTM with both continuous and discrete read and write mechanisms. We investigate the mechanisms and effects of learning to read from and write into a memory through experiments on the Facebook bAbI tasks, using both a feedforward and a GRU controller. We provide extensive analysis of our model and compare different variations of neural Turing machines on this task. We show that our model outperforms long short-term memory and NTM variants. We provide further experimental results on the sequential pMNIST, Stanford Natural Language Inference, associative recall, and copy tasks.
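To make the addressing scheme concrete, here is a minimal NumPy sketch of the idea described in the abstract. It is not the authors' implementation: the variable names, dimensions, and the dot-product similarity are illustrative assumptions. It shows only how a trainable address vector sits next to each cell's content vector, and how the same key can drive continuous (softmax) or discrete (one-hot) addressing.

import numpy as np

# Illustrative sizes (assumptions, not from the paper).
n_cells, d_addr, d_content = 8, 4, 6

rng = np.random.default_rng(0)
address = rng.normal(size=(n_cells, d_addr))     # trainable address vectors, one per cell
content = rng.normal(size=(n_cells, d_content))  # content vectors (written by the controller)

def read_weights(key, discrete=False):
    # Match the key against the concatenation [address; content] of each cell,
    # then normalize: softmax for continuous addressing, one-hot argmax for discrete.
    memory = np.concatenate([address, content], axis=1)  # (n_cells, d_addr + d_content)
    scores = memory @ key                                # dot-product similarity (an assumption)
    probs = np.exp(scores - scores.max())
    probs /= probs.sum()
    if discrete:
        hard = np.zeros_like(probs)
        hard[probs.argmax()] = 1.0                       # read from exactly one cell
        return hard
    return probs                                         # soft weighting over all cells

key = rng.normal(size=d_addr + d_content)                # a controller-emitted key (hypothetical)
soft_read = read_weights(key) @ content                  # continuous read
hard_read = read_weights(key, discrete=True) @ content   # discrete read

Note that the one-hot branch is not differentiable, so discrete addressing needs a gradient estimator (for example, sampling-based methods) during training; this sketch covers only the forward addressing step.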

Original language: English (US)
Pages (from-to): 857-884
Number of pages: 28
Journal: Neural computation
Volume: 30
Issue number: 4
DOIs: 10.1162/NECO_a_01060
State: Published - Apr 1 2018

Fingerprint

  • Long-Term Memory
  • Short-Term Memory
  • Language
  • Learning
  • Turing Machine

ASJC Scopus subject areas

  • Arts and Humanities (miscellaneous)
  • Cognitive Neuroscience

Cite this

Dynamic neural Turing machine with continuous and discrete addressing schemes. / Gulcehre, Caglar; Chandar, Sarath; Cho, Kyunghyun; Bengio, Yoshua.

In: Neural computation, Vol. 30, No. 4, 01.04.2018, p. 857-884.


@article{a4794583c917445996419ec1d830b6fa,
title = "Dynamic neural Turing machine with continuous and discrete addressing schemes",
abstract = "We extend the neural Turing machine (NTM) model into a dynamic neural Turing machine (D-NTM) by introducing trainable address vectors. This addressing scheme maintains two separate vectors for each memory cell: a content vector and an address vector. This allows the D-NTM to learn a wide variety of location-based addressing strategies, including both linear and nonlinear ones. We implement the D-NTM with both continuous and discrete read and write mechanisms. We investigate the mechanisms and effects of learning to read from and write into a memory through experiments on the Facebook bAbI tasks, using both a feedforward and a GRU controller. We provide extensive analysis of our model and compare different variations of neural Turing machines on this task. We show that our model outperforms long short-term memory and NTM variants. We provide further experimental results on the sequential pMNIST, Stanford Natural Language Inference, associative recall, and copy tasks.",
author = "Caglar Gulcehre and Sarath Chandar and Kyunghyun Cho and Yoshua Bengio",
year = "2018",
month = "4",
day = "1",
doi = "10.1162/NECO_a_01060",
language = "English (US)",
volume = "30",
pages = "857--884",
journal = "Neural computation",
issn = "0899-7667",
number = "4",
}

TY - JOUR

T1 - Dynamic neural Turing machine with continuous and discrete addressing schemes

AU - Gulcehre, Caglar

AU - Chandar, Sarath

AU - Cho, Kyunghyun

AU - Bengio, Yoshua

PY - 2018/4/1

Y1 - 2018/4/1

N2 - We extend the neural Turing machine (NTM) model into a dynamic neural Turing machine (D-NTM) by introducing trainable address vectors. This addressing scheme maintains two separate vectors for each memory cell: a content vector and an address vector. This allows the D-NTM to learn a wide variety of location-based addressing strategies, including both linear and nonlinear ones. We implement the D-NTM with both continuous and discrete read and write mechanisms. We investigate the mechanisms and effects of learning to read from and write into a memory through experiments on the Facebook bAbI tasks, using both a feedforward and a GRU controller. We provide extensive analysis of our model and compare different variations of neural Turing machines on this task. We show that our model outperforms long short-term memory and NTM variants. We provide further experimental results on the sequential pMNIST, Stanford Natural Language Inference, associative recall, and copy tasks.

AB - We extend the neural Turing machine (NTM) model into a dynamic neural Turing machine (D-NTM) by introducing trainable address vectors. This addressing scheme maintains two separate vectors for each memory cell: a content vector and an address vector. This allows the D-NTM to learn a wide variety of location-based addressing strategies, including both linear and nonlinear ones. We implement the D-NTM with both continuous and discrete read and write mechanisms. We investigate the mechanisms and effects of learning to read from and write into a memory through experiments on the Facebook bAbI tasks, using both a feedforward and a GRU controller. We provide extensive analysis of our model and compare different variations of neural Turing machines on this task. We show that our model outperforms long short-term memory and NTM variants. We provide further experimental results on the sequential pMNIST, Stanford Natural Language Inference, associative recall, and copy tasks.

UR - http://www.scopus.com/inward/record.url?scp=85044322950&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85044322950&partnerID=8YFLogxK

U2 - 10.1162/NECO_a_01060

DO - 10.1162/NECO_a_01060

M3 - Article

VL - 30

SP - 857

EP - 884

JO - Neural computation

JF - Neural computation

SN - 0899-7667

IS - 4

ER -