The ubiquity of model-based reinforcement learning

Bradley B. Doll, Dylan A. Simon, Nathaniel D. Daw

Research output: Contribution to journal › Review article

Abstract

The reward prediction error (RPE) theory of dopamine (DA) function has enjoyed great success in the neuroscience of learning and decision-making. This theory is derived from model-free reinforcement learning (RL), in which choices are made simply on the basis of previously realized rewards. Recently, attention has turned to correlates of more flexible, albeit computationally complex, model-based methods in the brain. These methods are distinguished from model-free learning by their evaluation of candidate actions using expected future outcomes according to a world model. Puzzlingly, signatures of these computations appear to be pervasive in the very same regions previously thought to support model-free learning. Here, we review recent behavioral and neural evidence about these two systems, in an attempt to reconcile their enigmatic cohabitation in the brain.
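The distinction the abstract draws can be illustrated with a minimal sketch. The two functions below contrast a model-free update, which adjusts a cached value by a reward prediction error after a reward is experienced, with model-based evaluation, which scores an action prospectively from a world model. The toy two-outcome setup, function names, and parameters are illustrative assumptions, not drawn from the article.

```python
# Illustrative sketch (assumed names and toy values, not the authors' code).

def model_free_update(q, state, action, reward, alpha=0.1):
    """Model-free (Q-learning-style) update: the reward prediction error
    (RPE) is the gap between the realized reward and the cached estimate.
    The theory reviewed above links this RPE signal to dopamine."""
    rpe = reward - q[(state, action)]
    q[(state, action)] += alpha * rpe
    return rpe

def model_based_value(model, state, action):
    """Model-based evaluation: score an action by its expected outcome
    under a world model (probability, reward pairs), with no need to have
    experienced the reward for this action before."""
    return sum(p * r for (p, r) in model[(state, action)])

# Toy world model: action 'a' in state 's' yields reward 1.0 with prob 0.8.
model = {("s", "a"): [(0.8, 1.0), (0.2, 0.0)]}
q = {("s", "a"): 0.0}

rpe = model_free_update(q, "s", "a", reward=1.0)  # learns from experience
mb = model_based_value(model, "s", "a")           # computed from the model
```

Note the asymmetry: the model-free estimate starts at zero and only moves toward the true value with repeated experience, whereas the model-based computation yields the expected value immediately, at the cost of requiring (and evaluating) a world model.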

Original language: English (US)
Pages (from-to): 1075-1081
Number of pages: 7
Journal: Current Opinion in Neurobiology
Volume: 22
Issue number: 6
DOIs
State: Published - Dec 1 2012

ASJC Scopus subject areas

  • Neuroscience (all)
