Mathematical Psychology

Model-Based vs. Model-Free

The distinction between model-based (goal-directed) and model-free (habitual) learning captures two fundamental modes of behavioral control that compete and cooperate in guiding decisions.

Daw, Niv, and Dayan (2005) formalized the psychological distinction between habitual and goal-directed behavior as a computational distinction between model-free and model-based reinforcement learning. Model-free learning caches action values through direct experience (like Q-learning), while model-based learning maintains an internal model of the environment and plans by simulating outcomes.

Two Systems

Model-Free vs. Model-Based

Model-free: Q(s,a) ← Q(s,a) + α·δ, where δ = r + γ·max_a' Q(s',a') − Q(s,a) (cached values, fast, inflexible)
Model-based: Q(s,a) = Σ_s' T(s'|s,a)·[R(s') + γ·max_a' Q(s',a')] (planned, slow, flexible)

Model-free ≈ habitual (dorsolateral striatum)
Model-based ≈ goal-directed (prefrontal cortex, caudate)
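
The two update rules can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation used in any of the cited papers: the three-state environment, the transition table T, the reward table R, and the function names are all hypothetical. It shows why cached values are inflexible: when the rewards change, the model-based value changes immediately, while the cached value stays put until the action is experienced again.

```python
from typing import Dict, List, Tuple

# Hypothetical toy environment (an assumption for illustration):
# state 0 offers two actions; states 1 and 2 are terminal.
QTable = List[List[float]]                              # q[s][a]
Transitions = Dict[Tuple[int, int], Dict[int, float]]   # T[(s, a)][s'] = probability


def model_free_update(q: QTable, s: int, a: int, r: float, s_next: int,
                      alpha: float = 0.1, gamma: float = 0.9) -> float:
    """Q-learning step: Q(s,a) <- Q(s,a) + alpha * delta. Returns delta."""
    delta = r + gamma * max(q[s_next]) - q[s][a]  # reward prediction error
    q[s][a] += alpha * delta
    return delta


def model_based_value(q: QTable, T: Transitions, R: Dict[int, float],
                      s: int, a: int, gamma: float = 0.9) -> float:
    """Q(s,a) = sum over s' of T(s'|s,a) * [R(s') + gamma * max_a' Q(s',a')]."""
    return sum(p * (R[s2] + gamma * max(q[s2])) for s2, p in T[(s, a)].items())


q: QTable = [[0.0, 0.0], [0.0], [0.0]]  # terminal states keep a single zero value
T: Transitions = {(0, 0): {1: 0.7, 2: 0.3}, (0, 1): {1: 0.3, 2: 0.7}}
R: Dict[int, float] = {0: 0.0, 1: 1.0, 2: 0.0}

# One rewarded experience of action 0 nudges the cached value upward.
delta = model_free_update(q, s=0, a=0, r=1.0, s_next=1)
cached = q[0][0]
planned_before = model_based_value(q, T, R, s=0, a=0)

# The rewards swap. Planning reflects the change at once; the cache does not.
R[1], R[2] = 0.0, 1.0
planned_after = model_based_value(q, T, R, s=0, a=0)
cached_after = q[0][0]
```

Here the planned value of action 0 drops as soon as R changes, whereas the cached value is only corrected by further direct experience.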

The Two-Step Task

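Daw et al. (2011) designed the two-step task to dissociate the two systems behaviorally. Each first-stage choice leads to one of two second-stage states, usually through a common transition and occasionally through a rare one. A model-free agent tends to repeat a first-stage action that was followed by reward, whether the transition was common or rare. A model-based agent takes the transition structure into account: after a reward that followed a rare transition, it is more likely to switch to the other first-stage action, because that action is the one that usually leads to the rewarded state. Most people show a mixture of both strategies, with the relative contribution of model-based planning varying with cognitive load, stress, and individual differences.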
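
The behavioral signature can be illustrated with a single trial. The sketch below is a simplification, not the model fitted by Daw et al. (2011): it assumes a common-transition probability of 0.7, treats each second-stage state as having a single value (no second-stage choice), lets the model-free learner update the chosen first-stage action directly from the final reward, and uses hypothetical function names and a learning rate of 0.5.

```python
from typing import List

COMMON = 0.7  # assumed probability of the common transition

# T[action][state]: action 0 commonly leads to state 0, action 1 to state 1.
T: List[List[float]] = [[COMMON, 1 - COMMON], [1 - COMMON, COMMON]]


def mf_first_stage_update(q_mf: List[float], action: int, reward: float,
                          alpha: float = 0.5) -> List[float]:
    """Model-free: credit the chosen action with the reward, ignoring the transition."""
    q = list(q_mf)
    q[action] += alpha * (reward - q[action])
    return q


def stage2_update(v: List[float], state: int, reward: float,
                  alpha: float = 0.5) -> List[float]:
    """Learn the value of the second-stage state that was actually visited."""
    v_new = list(v)
    v_new[state] += alpha * (reward - v_new[state])
    return v_new


def mb_first_stage_values(v: List[float]) -> List[float]:
    """Model-based: weight second-stage values by the known transition probabilities."""
    return [sum(T[a][s] * v[s] for s in (0, 1)) for a in (0, 1)]


# One trial: choose action 0, take the RARE transition to state 1, receive a reward.
action, state, reward = 0, 1, 1.0

q_mf = mf_first_stage_update([0.5, 0.5], action, reward)
q_mb = mb_first_stage_values(stage2_update([0.5, 0.5], state, reward))

mf_prefers = q_mf.index(max(q_mf))  # the action the model-free learner now favors
mb_prefers = q_mb.index(max(q_mb))  # the action the model-based learner now favors
```

After this rewarded rare transition the model-free learner favors repeating action 0, while the model-based learner favors switching to action 1. This interaction between reward and transition type is what the task uses to separate the two strategies.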

The model-based/model-free framework has been applied to understanding compulsive behavior in OCD (excessive habitual control), addiction (shift from goal-directed to habitual drug seeking), and the development of cognitive control across childhood and adolescence.

References

  1. Daw, N. D., Niv, Y., & Dayan, P. (2005). Uncertainty-based competition between prefrontal and dorsolateral striatal systems for behavioral control. Nature Neuroscience, 8(12), 1704–1711. https://doi.org/10.1038/nn1560
  2. Daw, N. D., Gershman, S. J., Seymour, B., Dayan, P., & Dolan, R. J. (2011). Model-based influences on humans’ choices and striatal prediction errors. Neuron, 69(6), 1204–1215. https://doi.org/10.1016/j.neuron.2011.02.027
  3. Gillan, C. M., Kosinski, M., Whelan, R., Phelps, E. A., & Daw, N. D. (2016). Characterizing a psychiatric symptom dimension related to deficits in goal-directed control. eLife, 5, e11305. https://doi.org/10.7554/eLife.11305
  4. Gläscher, J., Daw, N., Dayan, P., & O’Doherty, J. P. (2010). States versus rewards: Dissociable neural prediction error signals underlying model-based and model-free reinforcement learning. Neuron, 66(4), 585–595. https://doi.org/10.1016/j.neuron.2010.04.016
