Reinforcement learning signals in the human striatum distinguish learners from nonlearners during reward-based decision making

Tom Schönberg, Nathaniel D. Daw, Daphna Joel, John P. O'Doherty

Research output: Contribution to journal › Article

Abstract

The computational framework of reinforcement learning has been used to advance our understanding of the neural mechanisms underlying reward learning and decision-making behavior. Humans are known to vary widely in their performance on decision-making tasks. Here, we used a simple four-armed bandit task in which subjects split almost evenly into two groups on the basis of their performance: those who learn to favor the optimal action and those who do not. Using models of reinforcement learning, we sought to determine the neural basis of these intrinsic differences in performance by scanning both groups with functional magnetic resonance imaging. We scanned 29 subjects while they performed the reward-based decision-making task. Our results suggest that these two groups differ markedly in the degree to which reinforcement learning signals in the striatum are engaged during task performance. While the learners showed robust prediction error signals in both the ventral and dorsal striatum during learning, the nonlearner group showed a marked absence of such signals. Moreover, the magnitude of prediction error signals in a region of dorsal striatum correlated significantly with a measure of behavioral performance across all subjects. These findings support a crucial role for prediction error signals, likely originating from dopaminergic midbrain neurons, in enabling the learning of action selection preferences on the basis of obtained rewards. Thus, spontaneously observed individual differences in decision-making performance demonstrate the suggested dependence of this type of learning on the functional integrity of the dopaminergic striatal system in humans.
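
The prediction error signal central to the abstract can be illustrated with a small simulation. The sketch below is not the authors' model: it assumes a basic delta-rule (Rescorla-Wagner style) value update with softmax action selection, and the function names, the learning rate `alpha`, the inverse temperature `beta`, and the reward probabilities are all hypothetical choices made for illustration.

```python
import math
import random


def softmax(values, beta):
    """Turn action values into choice probabilities (beta = inverse temperature)."""
    m = max(values)  # subtract the max for numerical stability
    exps = [math.exp(beta * (v - m)) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def run_bandit(reward_probs, n_trials=200, alpha=0.2, beta=5.0, seed=0):
    """Simulate a delta-rule learner on a multi-armed bandit.

    Returns the learned action values, the per-trial prediction errors,
    and the fraction of trials on which the best arm was chosen.
    All parameter defaults are illustrative assumptions.
    """
    rng = random.Random(seed)
    n_arms = len(reward_probs)
    q = [0.0] * n_arms  # expected reward for each arm
    best = max(range(n_arms), key=lambda a: reward_probs[a])
    errors = []
    n_best = 0
    for _ in range(n_trials):
        probs = softmax(q, beta)
        action = rng.choices(range(n_arms), weights=probs)[0]
        reward = 1.0 if rng.random() < reward_probs[action] else 0.0
        delta = reward - q[action]  # prediction error: obtained minus expected
        q[action] += alpha * delta  # move the expectation toward the outcome
        errors.append(delta)
        n_best += action == best
    return q, errors, n_best / n_trials


if __name__ == "__main__":
    # Hypothetical four-armed bandit; the last arm is the optimal action.
    values, deltas, frac_best = run_bandit([0.2, 0.3, 0.4, 0.8])
    print("learned values:", [round(v, 2) for v in values])
    print("fraction of optimal choices:", round(frac_best, 2))
```

In this toy setting, the fraction of optimal choices serves as a behavioral performance measure, and the `deltas` series is the kind of trial-by-trial signal that a model-based fMRI analysis would typically enter as a parametric regressor.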

Original language: English (US)
Pages (from-to): 12860-12867
Number of pages: 8
Journal: Journal of Neuroscience
Volume: 27
Issue number: 47
DOI: 10.1523/JNEUROSCI.2496-07.2007
State: Published - Nov 21 2007

Fingerprint

Reward
Decision Making
Learning
Corpus Striatum
Dopaminergic Neurons
Task Performance and Analysis
Mesencephalon
Reinforcement (Psychology)
Individuality
Magnetic Resonance Imaging

Keywords

  • Associative learning
  • Basal ganglia
  • Computational models
  • fMRI
  • Instrumental conditioning
  • Prediction errors

ASJC Scopus subject areas

  • Neuroscience (all)

Cite this

Reinforcement learning signals in the human striatum distinguish learners from nonlearners during reward-based decision making. / Schönberg, Tom; Daw, Nathaniel D.; Joel, Daphna; O'Doherty, John P.

In: Journal of Neuroscience, Vol. 27, No. 47, 21.11.2007, p. 12860-12867.

@article{244db2681ef5468689f8415942f0fa81,
title = "Reinforcement learning signals in the human striatum distinguish learners from nonlearners during reward-based decision making",
abstract = "The computational framework of reinforcement learning has been used to forward our understanding of the neural mechanisms underlying reward learning and decision-making behavior. It is known that humans vary widely in their performance in decision-making tasks. Here, we used a simple four-armed bandit task in which subjects are almost evenly split into two groups on the basis of their performance: those who do learn to favor choice of the optimal action and those who do not. Using models of reinforcement learning we sought to determine the neural basis of these intrinsic differences in performance by scanning both groups with functional magnetic resonance imaging. We scanned 29 subjects while they performed the reward-based decision-making task. Our results suggest that these two groups differ markedly in the degree to which reinforcement learning signals in the striatum are engaged during task performance. While the learners showed robust prediction error signals in both the ventral and dorsal striatum during learning, the nonlearner group showed a marked absence of such signals. Moreover, the magnitude of prediction error signals in a region of dorsal striatum correlated significantly with a measure of behavioral performance across all subjects. These findings support a crucial role of prediction error signals, likely originating from dopaminergic midbrain neurons, in enabling learning of action selection preferences on the basis of obtained rewards. Thus, spontaneously observed individual differences in decision making performance demonstrate the suggested dependence of this type of learning on the functional integrity of the dopaminergic striatal system in humans.",
keywords = "Associative learning, Basal ganglia, Computational models, fMRI, Instrumental conditioning, Prediction errors",
author = "Sch{\"o}nberg, Tom and Daw, {Nathaniel D.} and Joel, Daphna and O'Doherty, {John P.}",
year = "2007",
month = "11",
day = "21",
doi = "10.1523/JNEUROSCI.2496-07.2007",
language = "English (US)",
volume = "27",
pages = "12860--12867",
journal = "Journal of Neuroscience",
issn = "0270-6474",
publisher = "Society for Neuroscience",
number = "47",

}

TY - JOUR

T1 - Reinforcement learning signals in the human striatum distinguish learners from nonlearners during reward-based decision making

AU - Schönberg, Tom

AU - Daw, Nathaniel D.

AU - Joel, Daphna

AU - O'Doherty, John P.

PY - 2007/11/21

Y1 - 2007/11/21

AB - The computational framework of reinforcement learning has been used to forward our understanding of the neural mechanisms underlying reward learning and decision-making behavior. It is known that humans vary widely in their performance in decision-making tasks. Here, we used a simple four-armed bandit task in which subjects are almost evenly split into two groups on the basis of their performance: those who do learn to favor choice of the optimal action and those who do not. Using models of reinforcement learning we sought to determine the neural basis of these intrinsic differences in performance by scanning both groups with functional magnetic resonance imaging. We scanned 29 subjects while they performed the reward-based decision-making task. Our results suggest that these two groups differ markedly in the degree to which reinforcement learning signals in the striatum are engaged during task performance. While the learners showed robust prediction error signals in both the ventral and dorsal striatum during learning, the nonlearner group showed a marked absence of such signals. Moreover, the magnitude of prediction error signals in a region of dorsal striatum correlated significantly with a measure of behavioral performance across all subjects. These findings support a crucial role of prediction error signals, likely originating from dopaminergic midbrain neurons, in enabling learning of action selection preferences on the basis of obtained rewards. Thus, spontaneously observed individual differences in decision making performance demonstrate the suggested dependence of this type of learning on the functional integrity of the dopaminergic striatal system in humans.

KW - Associative learning

KW - Basal ganglia

KW - Computational models

KW - fMRI

KW - Instrumental conditioning

KW - Prediction errors

UR - http://www.scopus.com/inward/record.url?scp=36348966690&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=36348966690&partnerID=8YFLogxK

U2 - 10.1523/JNEUROSCI.2496-07.2007

DO - 10.1523/JNEUROSCI.2496-07.2007

M3 - Article

C2 - 18032658

AN - SCOPUS:36348966690

VL - 27

SP - 12860

EP - 12867

JO - Journal of Neuroscience

JF - Journal of Neuroscience

SN - 0270-6474

IS - 47

ER -