Technical Report Available
Pierguido V.C. CAIRONI
caironi at elet.polimi.it
Tue Nov 18 08:00:25 EST 1997
Please accept my apologies if you receive multiple copies of this
message.
The following technical report is available on the web at the page:
http://www.elet.polimi.it/~caironi/listpub.html
or directly at:
ftp://www.elet.polimi.it/pub/data/Pierguido.Caironi/tr97_50.ps.gz
-----------------------------------------------------------------------
Gradient-Based Reinforcement Learning:
Learning Combinations of Control Policies
Pierguido V.C. Caironi
email: caironi at elet.polimi.it
Technical Report 97.50
Dipartimento di Elettronica e Informazione
Politecnico di Milano
Abstract
This report presents two innovative reinforcement learning
algorithms for continuous state-action environments:
Gradient REinforceMent LearnINg for Multiple control
policies (GREMLIN-M) and Gradient REinforceMent LearnINg
for Multiple and Single control policies (GREMLIN-MS).
The two algorithms learn optimal combinations of control
policies for autonomous agents. GREMLIN-M learns an optimal
combination of fixed base control policies. GREMLIN-MS
extends GREMLIN-M by enabling the agent to learn the base
control policies at the same time.
GREMLIN-M and GREMLIN-MS optimize a performance function
equal to the sum of the expected reinforcements in a sliding
temporal window of finite length. The optimization is carried
out through gradient ascent with respect to the parameter
values of the control functions. Although they are natural
extensions of existing supervised learning algorithms,
GREMLIN-M and GREMLIN-MS advance the state of the art in
reinforcement learning by taking into account the temporal
credit assignment problem for the on-line and real-time
combination of control policies.
Furthermore, GREMLIN-M and GREMLIN-MS lend themselves to a
motivational interpretation. That is, the combination
function resulting from learning may be seen as a
representation of the motivations to apply any single base
control policy in different environmental conditions.
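
As a rough illustration of the idea behind GREMLIN-M (learning how to
combine fixed base control policies by gradient ascent on a sum of
reinforcements over a finite window), here is a minimal sketch. It is not
the algorithm from the report. The 1-D environment, the two base policies,
the state-independent softmax weighting, the reward, and all names are
assumptions made for this example. The gradient is estimated by central
finite differences rather than derived analytically, the window is a single
fixed window starting from one initial state rather than a sliding one, and
the toy environment is deterministic, so the "expected" sum is just the
observed sum.

```python
import math

# Hypothetical toy setup (not from the report): 1-D state, goal at x = 0.
BASE_POLICIES = [
    lambda x: -x,    # fixed base policy 1: steer toward the origin
    lambda x: 0.5,   # fixed base policy 2: constant push to the right
]


def softmax(theta):
    """Map unconstrained parameters to positive weights that sum to 1."""
    m = max(theta)
    exps = [math.exp(t - m) for t in theta]
    total = sum(exps)
    return [e / total for e in exps]


def window_return(theta, x0=1.0, horizon=20, dt=0.1):
    """Sum of reinforcements over a finite window of `horizon` steps."""
    weights = softmax(theta)
    x, total = x0, 0.0
    for _ in range(horizon):
        # Control action: weighted combination of the fixed base policies.
        u = sum(w * pi(x) for w, pi in zip(weights, BASE_POLICIES))
        x = x + dt * u
        total += -x * x  # reinforcement: negative squared distance to goal
    return total


def numerical_gradient(f, theta, eps=1e-5):
    """Central finite-difference estimate of the gradient of f at theta."""
    grad = []
    for i in range(len(theta)):
        up = list(theta)
        up[i] += eps
        down = list(theta)
        down[i] -= eps
        grad.append((f(up) - f(down)) / (2 * eps))
    return grad


def train(theta, steps=200, lr=0.1):
    """Plain gradient ascent on the windowed sum of reinforcements."""
    for _ in range(steps):
        grad = numerical_gradient(window_return, theta)
        theta = [t + lr * g for t, g in zip(theta, grad)]
    return theta


theta0 = [0.0, 0.0]
theta = train(theta0)
weights = softmax(theta)
```

In this toy problem the learned weights shift toward the first base policy,
since it is the one that drives the state to the goal. Making the weights
depend on the state would move the sketch closer to the state-dependent
combination function described in the abstract.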
--
Name: Pierguido V. C. CAIRONI
Job: Ph.D. Student at the Politecnico di Milano - ITALY
e-mail: caironi at elet.polimi.it
Address: Politecnico di Milano - Dip. di Elettronica e Informazione
Piazza Leonardo da Vinci 32
20133 - Milano - ITALY
Tel: +39-2-23993622
Fax: +39-2-23993411
WWW: http://www.elet.polimi.it/~caironi