paper: Hebbian synaptic update rule for reinforcement learning

Fri Nov 26 21:17:58 EST 1999

The following paper is available at
  http://csl.anu.edu.au/~bartlett/papers/BartlettBaxter-Nov99.ps.gz

Hebbian Synaptic Modifications in Spiking Neurons that Learn
Peter L. Bartlett and Jonathan Baxter
Australian National University

In this paper, we derive a new model of synaptic plasticity, based on
recent algorithms for reinforcement learning (in which an agent attempts
to learn appropriate actions to maximize its long-term average reward). 
We show that these direct reinforcement learning algorithms also give
locally optimal performance for the problem of reinforcement learning
with multiple agents, without any explicit communication between
agents.  By considering a network of spiking neurons as a collection of
agents attempting to maximize the long-term average of a reward signal,
we derive a synaptic update rule that is qualitatively similar to Hebb's
postulate.  This rule requires only simple computations, such as
addition and leaky integration, and involves only quantities that are
available in the vicinity of the synapse.  Furthermore, it leads to
synaptic connection strengths that give locally optimal values of the
long-term average reward.  The reinforcement learning paradigm is
sufficiently broad to encompass many learning problems that are solved
by the brain.  We illustrate, with simulations, that the approach is
effective for simple pattern classification and motor learning tasks.
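
To make the flavour of such a rule concrete, here is a minimal Python
sketch of a reward-modulated update that uses only addition and leaky
integration of locally available quantities.  It is an illustration
under stated assumptions, not the rule derived in the paper: the
logistic firing probability, the 0/1 reward, the two-pattern toy task,
and all names (StochasticNeuron, fire, reward, train_toy, beta, step)
are hypothetical choices made for this example.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class StochasticNeuron:
    """Toy stochastic binary neuron with one eligibility trace per synapse.

    Assumption: the firing probability is a logistic function of the
    weighted input sum.  The paper may use a different neuron model.
    """

    def __init__(self, n_inputs, beta=0.9, step=0.1, rng=None):
        self.w = [0.0] * n_inputs   # synaptic strengths
        self.z = [0.0] * n_inputs   # leaky-integrated eligibility traces
        self.beta = beta            # leak factor in [0, 1)
        self.step = step            # learning rate
        self.rng = rng if rng is not None else random.Random(0)

    def fire(self, u):
        """Sample a spike (1) or no spike (0) for input u and update traces."""
        p = sigmoid(sum(wi * ui for wi, ui in zip(self.w, u)))
        a = 1 if self.rng.random() < p else 0
        # Local, Hebbian-like term: presynaptic activity times
        # (postsynaptic spike minus firing probability), leaky-integrated.
        self.z = [self.beta * zi + (a - p) * ui for zi, ui in zip(self.z, u)]
        return a

    def reward(self, r):
        """Move each weight along its own trace, scaled by the global reward."""
        self.w = [wi + self.step * r * zi for wi, zi in zip(self.w, self.z)]


def train_toy(steps=2000, beta=0.0, step=0.5, seed=0):
    """Hypothetical task: spike for input [1, 0], stay silent for [0, 1]."""
    rng = random.Random(seed)
    neuron = StochasticNeuron(2, beta=beta, step=step, rng=rng)
    patterns = [([1.0, 0.0], 1), ([0.0, 1.0], 0)]
    for _ in range(steps):
        u, target = rng.choice(patterns)
        a = neuron.fire(u)
        # Reward is 1 for a correct response and 0 otherwise.
        neuron.reward(1.0 if a == target else 0.0)
    return neuron


if __name__ == "__main__":
    trained = train_toy()
    print("P(spike | [1, 0]) =", sigmoid(trained.w[0]))
    print("P(spike | [0, 1]) =", sigmoid(trained.w[1]))
```

In this toy task the reward arrives immediately, so the sketch trains
with beta = 0; a nonzero beta would let a delayed reward credit earlier
spikes through the decaying trace.  Each synapse uses only its own
input, the neuron's own spike, and a single broadcast reward value.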

-- 
Peter.

Peter Bartlett                       email: Peter.Bartlett at anu.edu.au
Machine Learning Group
Computer Sciences Laboratory                   Phone: +61 2 6279 8681
Research School of Information Sciences and Engineering
Australian National University                 Fax:   +61 2 6279 8645
Canberra, 0200 AUSTRALIA              http://csl.anu.edu.au/~bartlett