Connectionists: New article on Dopamine and STDP
Eugene M. Izhikevich
Eugene.Izhikevich at nsi.edu
Mon Jan 15 14:34:24 EST 2007
Modulation of spike-timing-dependent plasticity (STDP) by dopamine (DA)
can resolve many outstanding questions in cognitive neuroscience, in
particular the question of the neurobiological (spiking) mechanism of
credit assignment.
Details can be found in the article
E.M. Izhikevich (2007) Solving the Distal Reward Problem through Linkage
of STDP and Dopamine Signaling. Cerebral Cortex, 10.1093/cercor/bhl152,
available at
http://vesicle.nsi.edu/users/izhikevich/publications/dastdp.pdf
ABSTRACT:
Learning the associations between cues and rewards (classical or
Pavlovian conditioning) or between cues, actions, and rewards
(instrumental or operant conditioning) involves reinforcement of
neuronal activity by rewards or punishments. Typically, the reward
comes seconds after reward-predicting cues or reward-triggering actions,
creating an explanatory conundrum known in the behavioral literature as
the "distal reward problem" and in the reinforcement learning literature
as the "credit assignment problem". Indeed, how does the animal know
which of the many cues and actions preceding the reward should be
credited for the reward? In neural terms, in which sensory cues and
motor actions correspond to neuronal firings, how does the brain know
what firing patterns, out of an unlimited repertoire of all possible
patterns, are responsible for the reward if the patterns are no longer
there when the reward arrives? How does it know which spikes of which
neurons result in the reward if *many* neurons fire during the waiting
period to the reward? Finally, how does the common reinforcement signal
in the form of the neuromodulator dopamine (DA) influence the right
synapses at the right time, if DA is released globally to many synapses?
Here we show how the conundrum is resolved by a model network of
cortical spiking neurons with spike-timing-dependent plasticity (STDP)
modulated by dopamine (DA). Although STDP is triggered by
nearly coincident firing patterns on a millisecond time scale, the slow
kinetics of subsequent synaptic plasticity are sensitive to changes in
the extracellular DA concentration during a critical period of a few
seconds. Random firings during the waiting period to the reward do not
affect STDP, and hence make the network insensitive to the ongoing
activity --- the key feature that distinguishes our approach from
previous theoretical studies, which implicitly assume that the network
be quiet during the waiting period or that the patterns be preserved
until the reward arrives. This study emphasizes the importance of
precise firing patterns in brain dynamics.
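
The mechanism described in the abstract can be illustrated with a rough
single-synapse sketch in Python. This is not the network model from the
article. The variable names (c for a slowly decaying synaptic trace left
by an STDP event, d for extracellular DA above baseline, s for the
accumulated weight change), the exponential decays, the update rule
ds/dt = c * d, and all parameter values are assumptions chosen only for
illustration.

```python
def simulate(stdp_events, reward_time, tau_c=1.0, tau_d=0.2,
             da_pulse=0.5, dt=0.001, t_end=5.0):
    """Euler-integrate one hypothetical synapse and return its weight change.

    stdp_events: list of (time_s, amplitude) pairs; a positive amplitude
                 stands for a pre-before-post pairing, a negative one for
                 post-before-pre (assumed convention).
    reward_time: time in seconds of a global DA pulse, or None for no reward.
    """
    c = 0.0  # slow synaptic trace left by STDP events (assumed form)
    d = 0.0  # extracellular DA above baseline
    s = 0.0  # accumulated synaptic weight change
    events = {round(t / dt): a for t, a in stdp_events}
    reward_step = round(reward_time / dt) if reward_time is not None else -1
    for step in range(round(t_end / dt)):
        c += events.get(step, 0.0)   # STDP event marks the synapse
        if step == reward_step:
            d += da_pulse            # delayed reward releases DA globally
        s += c * d * dt              # weight changes only if trace and DA overlap
        c -= c / tau_c * dt          # trace decays over about a second
        d -= d / tau_d * dt          # DA is cleared quickly
    return s


# Pairing at 0.5 s, reward arriving 0.5 s or 2.5 s later.
early = simulate([(0.5, 1.0)], reward_time=1.0)
late = simulate([(0.5, 1.0)], reward_time=3.0)
# Same global reward, but this synapse saw no pairing.
unpaired = simulate([], reward_time=1.0)
# Pairing without any reward.
unrewarded = simulate([(0.5, 1.0)], reward_time=None)
print(early, late, unpaired, unrewarded)
```

In this sketch the DA pulse reaches every synapse, but only a synapse
that still carries a trace from an earlier pairing changes its weight,
and the change shrinks as the reward is delayed.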
--
Eugene M. Izhikevich, Ph.D., http://www.izhikevich.com
The Neurosciences Institute, Eugene.Izhikevich at nsi.edu
10640 John J. Hopkins Drive    tel: (858) 626-2063
San Diego, CA 92121, USA       fax: (858) 626-2099