Reward-based training of recurrent neural networks for cognitive and value-based tasks

H. Francis Song, Guangyu R. Yang, Xiao-Jing Wang

Research output: Contribution to journal › Article

Abstract

Trained neural network models, which exhibit features of neural activity recorded from behaving animals, may provide insights into the circuit mechanisms of cognitive functions through systematic analysis of network activity and connectivity. However, in contrast to the graded error signals commonly used to train networks through supervised learning, animals learn from reward feedback on definite actions through reinforcement learning. Reward maximization is particularly relevant when optimal behavior depends on an animal’s internal judgment of confidence or subjective preferences. Here, we implement reward-based training of recurrent neural networks in which a value network guides learning by using the activity of the decision network to predict future reward. We show that such models capture behavioral and electrophysiological findings from well-known experimental paradigms. Our work provides a unified framework for investigating diverse cognitive and value-based computations, and predicts a role for value representation that is essential for learning, but not executing, a task.
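The abstract's key architectural idea, a decision (policy) RNN whose activity is read by a separate value network that predicts future reward and thereby guides reward-based learning, can be illustrated with a short policy-gradient sketch. The PyTorch code below is a minimal, hypothetical rendering of that idea, not the paper's implementation: the network sizes, the per-time-step reward, the toy cue-matching task, and all names are assumptions made for illustration.

# A minimal sketch, in PyTorch, of the scheme described in the abstract: a
# decision (policy) RNN selects actions, and a separate value network reads the
# decision network's activity to predict future reward, serving as a baseline
# for REINFORCE-style policy-gradient learning. Network sizes, the per-time-step
# reward, and the toy cue-matching task are illustrative assumptions, not the
# paper's actual tasks or implementation details.
import torch
import torch.nn as nn

class DecisionNetwork(nn.Module):
    """Recurrent policy network: stimulus -> action logits (and hidden activity)."""
    def __init__(self, n_inputs=3, n_hidden=64, n_actions=3):
        super().__init__()
        self.rnn = nn.GRU(n_inputs, n_hidden, batch_first=True)
        self.readout = nn.Linear(n_hidden, n_actions)

    def forward(self, stimulus):
        activity, _ = self.rnn(stimulus)      # (batch, time, hidden)
        return self.readout(activity), activity

class ValueNetwork(nn.Module):
    """Predicts expected reward from the decision network's activity."""
    def __init__(self, n_hidden=64):
        super().__init__()
        self.readout = nn.Linear(n_hidden, 1)

    def forward(self, decision_activity):
        return self.readout(decision_activity).squeeze(-1)  # (batch, time)

def train_step(policy, value, optimizer, stimulus, reward_fn):
    """One REINFORCE update with the value prediction as a baseline."""
    logits, activity = policy(stimulus)
    dist = torch.distributions.Categorical(logits=logits)
    actions = dist.sample()                   # sampled action at each time step
    rewards = reward_fn(actions)              # (batch, time), defined by the task
    values = value(activity.detach())         # value net reads the policy's activity
    advantage = rewards - values              # reward minus predicted reward
    policy_loss = -(dist.log_prob(actions) * advantage.detach()).mean()
    value_loss = advantage.pow(2).mean()      # train the baseline to predict reward
    optimizer.zero_grad()
    (policy_loss + value_loss).backward()
    optimizer.step()
    return rewards.mean().item()

# Toy usage: reward the network for choosing the action indicated by the input cue.
policy, value = DecisionNetwork(), ValueNetwork()
optimizer = torch.optim.Adam(list(policy.parameters()) + list(value.parameters()), lr=1e-3)
cue = torch.randint(0, 3, (16,))                                   # one cue per trial
stimulus = nn.functional.one_hot(cue, 3).float().unsqueeze(1).repeat(1, 10, 1)
reward_fn = lambda actions: (actions == cue.unsqueeze(1)).float()  # 1 if action matches cue
for _ in range(100):
    train_step(policy, value, optimizer, stimulus, reward_fn)

Consistent with the abstract's prediction, the value network in this sketch only shapes the learning signal (as a baseline for the reward-prediction error); once training is complete, the decision network alone suffices to perform the task.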

Original language: English (US)
Article number: e21492
Journal: eLife
Volume: 6
DOI: 10.7554/eLife.21492
State: Published - Jan 13, 2017

Fingerprint

  • Recurrent neural networks
  • Reward
  • Animals
  • Learning
  • Supervised learning
  • Reinforcement learning
  • Neural Networks (Computer)
  • Cognition
  • Neural networks
  • Feedback
  • Networks (circuits)

ASJC Scopus subject areas

  • Neuroscience(all)
  • Medicine(all)
  • Immunology and Microbiology(all)
  • Biochemistry, Genetics and Molecular Biology(all)

Cite this

Reward-based training of recurrent neural networks for cognitive and value-based tasks. / Song, H. Francis; Yang, Guangyu R.; Wang, Xiao-Jing.

In: eLife, Vol. 6, e21492, 13.01.2017.

Research output: Contribution to journal › Article

@article{8011804696604572bf4fb8eace5cb90f,
title = "Reward-based training of recurrent neural networks for cognitive and value-based tasks",
abstract = "Trained neural network models, which exhibit features of neural activity recorded from behaving animals, may provide insights into the circuit mechanisms of cognitive functions through systematic analysis of network activity and connectivity. However, in contrast to the graded error signals commonly used to train networks through supervised learning, animals learn from reward feedback on definite actions through reinforcement learning. Reward maximization is particularly relevant when optimal behavior depends on an animal’s internal judgment of confidence or subjective preferences. Here, we implement reward-based training of recurrent neural networks in which a value network guides learning by using the activity of the decision network to predict future reward. We show that such models capture behavioral and electrophysiological findings from well-known experimental paradigms. Our work provides a unified framework for investigating diverse cognitive and value-based computations, and predicts a role for value representation that is essential for learning, but not executing, a task.",
author = "Song, {H. Francis} and Yang, {Guangyu R.} and Xiao-Jing Wang",
year = "2017",
month = "1",
day = "13",
doi = "10.7554/eLife.21492",
language = "English (US)",
volume = "6",
journal = "eLife",
issn = "2050-084X",
publisher = "eLife Sciences Publications",

}
