No subject
Sebastian Thrun
thrun at gmdzi.uucp
Mon Mar 18 23:39:54 EST 1991
Well, there is a new TR available in the neuroprose archive, which is more or
less an extended version of the NIPS paper I announced some weeks ago:
ON PLANNING AND EXPLORATION IN NON-DISCRETE WORLDS

Sebastian Thrun                        Knut Moeller
German National Research Center        Bonn University
for Computer Science                   Bonn, FRG
St. Augustin, FRG
The application of reinforcement learning to control problems has received
considerable attention in the last few years [Anderson86,Barto89,Sutton84].
In general, there are two approaches to solving reinforcement learning
problems, direct and indirect techniques, each with its own advantages and
disadvantages.
We present a system that combines both methods. Through interaction with an
unknown environment, a world model is progressively constructed using the
backpropagation algorithm. To optimize actions with respect to future
reinforcement, planning is applied in two steps: an experience network
proposes a plan, which is subsequently optimized by gradient descent with a
chain of model networks. While the system operates in a goal-oriented manner
due to the planning process, the experience network is trained. Its
accumulating experience is fed back into the planning process in the form of
initial plans, such that planning can be gradually reduced. In order to
ensure complete
system identification, a competence network is trained to predict the
accuracy of the model. This network enables purposeful exploration of the
world.
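
A minimal sketch of the plan-optimization step described above, for
illustration only and not taken from the TR. It assumes a toy, hand-written
linear world model (x' = x + a) in place of the trained model networks, a
scalar state, and a squared-error objective on the final state; the names
model_step, rollout and optimize_plan are hypothetical.

```python
# Toy illustration: refine an action plan by gradient descent through a
# chain of identical, differentiable model steps.
# Assumption: the "world model" is a hand-written linear map x' = x + a,
# standing in for a trained backpropagation network.

def model_step(state, action):
    """Hypothetical one-step world model: predicts the next state."""
    return state + action

def rollout(state, plan):
    """Chain the model over the whole plan and return the final state."""
    for action in plan:
        state = model_step(state, action)
    return state

def optimize_plan(state, plan, target, lr=0.1, steps=200):
    """Gradient descent on the plan to minimise (final_state - target)^2."""
    plan = list(plan)
    for _ in range(steps):
        error = rollout(state, plan) - target
        # For this linear chain, d(final_state)/d(action_t) = 1 for every t,
        # so the gradient of the squared error w.r.t. each action is 2*error.
        grad = 2.0 * error
        plan = [a - lr * grad for a in plan]
    return plan

# An all-zero initial plan plays the role of the experience network's proposal.
initial_plan = [0.0, 0.0, 0.0]
plan = optimize_plan(state=0.0, plan=initial_plan, target=1.5)
final_state = rollout(0.0, plan)
```

With a learned, nonlinear model the per-action gradients would instead be
obtained by backpropagating the error through the chained model networks;
the closed-form gradient here only holds for the assumed linear dynamics.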
The appropriateness of this approach to reinforcement learning is demonstrated
by three control experiments: a target tracking task, a robotics task, and a
pole balancing task.
Keywords: backpropagation, connectionist networks, control, exploration,
planning, pole balancing, reinforcement learning, robotics, neural networks,
and, and, and...
-------------------------------------------------------------------------
The TR can be retrieved by ftp:
unix> ftp cheops.cis.ohio-state.edu
Name: anonymous
Guest Login ok, send ident as password
Password: neuron
ftp> binary
ftp> cd pub
ftp> cd neuroprose
ftp> get thrun.plan-explor.ps.Z
ftp> bye
unix> uncompress thrun.plan-explor.ps
unix> lpr thrun.plan-explor.ps
-------------------------------------------------------------------------
If you have trouble retrieving the file by ftp, do not hesitate to contact me.
--- Sebastian Thrun
(st at gmdzi.uucp, st at gmdzi.gmd.de)
------- End of Forwarded Message