An anatomical substrate of credit assignment in reinforcement learning

Jorgen Kornfeld
Michale S. Fee
Philipp Schubert
Winfried Denk
bioRxiv (2020)

Abstract

How is experience used to improve performance? In both biological and artificial systems,
the optimization of parameters that affect behavior requires a process that determines
whether a parameter affects the outcome and then modifies the parameter accordingly.
Central to the recent bloom of artificial intelligence has been the error-backpropagation
algorithm (Rumelhart, Hinton, and Williams 1986), which computationally retraces the signal
from the output to each synapse (weight) and allows a large number of parameters to be
optimized in parallel at high learning rates. Biological systems, however, lack an obvious
mechanism to retrace the signal path. Here we show, by combining high-throughput volume
electron microscopy (Denk and Horstmann 2004) and automated connectomic
analysis (Januszewski et al. 2018; Dorkenwald et al. 2017; Schubert et al. 2019), that the
synaptic architecture of songbird basal ganglia supports a form of local credit assessment
proposed in a model of songbird reinforcement learning (M. S. Fee and Goldberg 2011). We
show that three of this model’s major predictions hold true. First, cortical axons that encode
exploratory motor variability terminate predominantly on dendritic shafts of spiny neurons.
Second, cortical axons that encode timing seek out spines, which enable calcium-based
coincidence detection (R. Yuste and Denk 1995) and appear to be capable of creating and
storing eligibility traces (Yagishita et al. 2014). Third, synapse pairs that presynaptically share
a cortical timing axon and postsynaptically a medium spiny dendrite are substantially more
similar in size than expected, indicating a history of Hebbian plasticity (Bartol et al. 2015;
Kasthuri et al. 2015). Combined with numerical simulations, these data provide strong
evidence for a model of basal ganglia learning with a biologically plausible credit assignment
mechanism.
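
The local credit assessment summarized above can be illustrated with a toy three-factor learning rule. The following is a minimal sketch, not the simulation used in this work: the task, the function name train, and all parameters (n_steps, n_trials, lr, the squared-error reward, the running-average baseline) are assumptions introduced purely for illustration. Each time step of a motor sequence has one timing input, an exploratory variability input perturbs the output, the coincidence of the two forms an eligibility trace, and a global reward signal converts that trace into a weight change.

```python
import random


def train(n_steps=20, n_trials=2000, lr=0.05, seed=0):
    """Toy three-factor rule: timing x variability -> eligibility; reward gates plasticity.

    All names and parameters are hypothetical and chosen for illustration only.
    """
    rng = random.Random(seed)
    # Hypothetical desired motor output at each time step of the sequence.
    target = [rng.choice([-1.0, 1.0]) for _ in range(n_steps)]
    # One timing-input weight per time step (a stand-in for a spine synapse).
    w = [0.0] * n_steps
    baseline = 0.0  # running estimate of the expected reward
    history = []
    for _ in range(n_trials):
        # Exploratory variability input: a random perturbation at each time step.
        noise = [rng.choice([-1.0, 1.0]) for _ in range(n_steps)]
        # Motor output is the learned bias plus the exploratory perturbation.
        output = [w[t] + noise[t] for t in range(n_steps)]
        # A single global reward per trial; no per-synapse error is available.
        reward = -sum((output[t] - target[t]) ** 2 for t in range(n_steps)) / n_steps
        # Eligibility trace: coincidence of the timing input (active, i.e. 1.0,
        # only at its own step) with the variability input at that step.
        eligibility = [1.0 * noise[t] for t in range(n_steps)]
        # Third factor: reward relative to baseline gates the weight change.
        delta = reward - baseline
        for t in range(n_steps):
            w[t] += lr * delta * eligibility[t]
        baseline += 0.1 * (reward - baseline)
        history.append(reward)
    return w, target, history


if __name__ == "__main__":
    w, target, history = train()
    early = sum(history[:100]) / 100
    late = sum(history[-100:]) / 100
    print("mean reward, first 100 trials:", early)
    print("mean reward, last 100 trials:", late)
```

In this sketch each weight is updated using only quantities available at its own synapse (the timing input and the coincident variability input) plus one broadcast reward signal, so no backward retracing of the signal path is needed. Perturbations that happened to coincide with better-than-expected reward are reinforced at the time step where they occurred.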
