A tech report on transfer of solutions across multiple RL tasks
Dan Bernstein
bern at cs.umass.edu
Fri Apr 9 17:33:18 EDT 1999
Announcing a technical report on solving multiple RL tasks:
http://www-anw.cs.umass.edu/~bern/publications/reuse_tech.ps
--------------------------------------------------------------------------
Daniel S. Bernstein
Adaptive Networks Lab
Department of Computer Science
University of Massachusetts, Amherst
TR-1999-26
April, 1999
We consider the reuse of policies for previous MDPs in learning on a
new MDP, under the assumption that the vector of parameters of each
MDP is drawn from a fixed probability distribution. We use the
options framework, in which an option consists of a set of initiation
states, a policy, and a termination condition. We use an option
called a \emph{reuse option}, for which the set of initiation states
is the set of all states, the policy is a combination of policies from
the old MDPs, and the termination condition is based on the number of
time steps since the option was initiated. Given policies for $m$ of
the MDPs from the distribution, we construct reuse options from the
policies and compare performance on an $m+1$st MDP both with and
without various reuse options. We find that reuse options can speed
initial learning of the $m+1$st task. We also present a distribution
of MDPs for which reuse options can slow initial learning. We discuss
reasons for this and suggest other ways to design reuse options.
Keywords: reinforcement learning, Markov decision processes,
options, learning to learn
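
For readers who want a concrete picture of the reuse option described in the abstract, here is a minimal sketch in Python. It is an illustration only, not code from the report. The abstract says the policy is "a combination of policies from the old MDPs" and that termination depends on the number of time steps since initiation; it does not say which combination or which termination rule is used. The majority vote over tabular policies and the fixed duration below are therefore assumptions, and the names `ReuseOption`, `old_policies`, and `duration` are hypothetical.

```python
from collections import Counter


class ReuseOption:
    """Hypothetical sketch of a reuse option built from old tabular policies."""

    def __init__(self, old_policies, duration):
        # old_policies: list of dicts mapping state -> action, one per old MDP.
        self.old_policies = old_policies
        # Assumption: the option terminates after a fixed number of steps.
        self.duration = duration
        self.steps = 0

    def can_initiate(self, state):
        # The initiation set is the set of all states.
        return True

    def initiate(self):
        # Reset the step counter each time the option is started.
        self.steps = 0

    def act(self, state):
        # Assumed combination rule: take the action that most old policies
        # choose in this state (majority vote).
        votes = Counter(policy[state] for policy in self.old_policies)
        self.steps += 1
        return votes.most_common(1)[0][0]

    def terminated(self):
        # Termination depends only on the time steps since initiation.
        return self.steps >= self.duration


# Policies for m = 3 old MDPs over states 0 and 1 (made-up values).
old_policies = [
    {0: "left", 1: "right"},
    {0: "left", 1: "up"},
    {0: "down", 1: "up"},
]
option = ReuseOption(old_policies, duration=2)
option.initiate()
first_action = option.act(0)
second_action = option.act(1)
```

A learner on the new MDP could treat such an option as one extra choice alongside its primitive actions, which is one way to read the "with and without various reuse options" comparison in the abstract.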
----------------------------------------------------------------------------
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Daniel S. Bernstein               URL:   http://www-anw.cs.umass.edu/~bern
Department of Computer Science    EMAIL: bern at cs.umass.edu
University of Massachusetts       PHONE: (413)545-1596 [office]
Amherst, MA 01003
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~