Subject: preprint of NIPS paper
From: bradtke@envy.cs.umass.edu
Date: Thu Jan 28 11:00:57 EST 1993
The following paper has been placed in the Neuroprose Archives.
FTP instructions are given below.
Reinforcement Learning Applied to Linear Quadratic Regulation
Steven J. Bradtke
Computer Science Department
University of Massachusetts
Amherst, MA 01003
bradtke at cs.umass.edu
Recent research on reinforcement learning has focused on algorithms
based on the principles of Dynamic Programming (DP). One of the most
promising areas of application for these algorithms is the control of
dynamical systems, and some impressive results have been achieved.
However, there are significant gaps between practice and theory. In
particular, there are no convergence proofs for problems with continuous
state and action spaces, or for systems involving non-linear function
approximators (such as multilayer perceptrons). This paper presents
research applying DP-based reinforcement learning theory to Linear
Quadratic Regulation (LQR), an important class of control problems
involving continuous state and action spaces and requiring a simple type of
non-linear function approximator. We describe an algorithm based on
Q-learning that is proven to converge to the optimal controller for a
large class of LQR problems. We also describe a slightly different
algorithm that is only locally convergent to the optimal
Q-function, demonstrating one of the possible pitfalls of using a
non-linear function approximator with DP-based learning.
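For readers unfamiliar with the setting, the sketch below illustrates the core idea rather than the paper's exact algorithm: for LQR the Q-function of a fixed linear policy is an exact quadratic in the joint state-action vector, so its parameter matrix H can be estimated from sampled transitions and the policy improved by minimizing the fitted quadratic over the action. The system matrices A, B, Q, R, the batch least-squares fit, and all other specifics here are illustrative assumptions standing in for the paper's estimation procedure.

    import numpy as np

    # Illustrative LQR problem (all values are assumptions, not from the paper):
    #   x_{t+1} = A x_t + B u_t,   cost c(x,u) = x'Qx + u'Ru
    A = np.array([[0.9, 0.1],
                  [0.0, 0.9]])
    B = np.array([[0.0],
                  [0.1]])
    Q = np.eye(2)
    R = np.array([[0.1]])
    n, m = B.shape

    K = np.zeros((m, n))   # initial policy u = -K x (stabilizing here, since A is stable)

    def features(x, u):
        # Quadratic features of z = [x; u]; theta holds the upper triangle of the
        # symmetric matrix H in Q_K(x, u) = z' H z.
        z = np.concatenate([x, u])
        return np.array([z[i] * z[j] * (1.0 if i == j else 2.0)
                         for i in range(len(z)) for j in range(i, len(z))])

    for it in range(10):                          # policy iteration
        # Policy evaluation: fit Q_K from the one-step Bellman identity
        #   Q_K(x, u) = c(x, u) + Q_K(x', -K x'),   x' = A x + B u,
        # using randomly sampled states and exploratory actions (the action noise
        # keeps the regression problem full rank).
        Phi, c = [], []
        for _ in range(200):
            x = np.random.randn(n)
            u = -K @ x + 0.1 * np.random.randn(m)
            x_next = A @ x + B @ u
            u_next = -K @ x_next
            Phi.append(features(x, u) - features(x_next, u_next))
            c.append(x @ Q @ x + u @ R @ u)
        theta, *_ = np.linalg.lstsq(np.array(Phi), np.array(c), rcond=None)

        # Unpack theta into the symmetric matrix H, partitioned over (x, u).
        H = np.zeros((n + m, n + m))
        k = 0
        for i in range(n + m):
            for j in range(i, n + m):
                H[i, j] = H[j, i] = theta[k]
                k += 1
        Hxu, Huu = H[:n, n:], H[n:, n:]

        # Policy improvement: argmin_u z' H z gives u = -inv(Huu) Hxu' x.
        K = np.linalg.solve(Huu, Hxu.T)

    print("learned feedback gain K:\n", K)

Because the quadratic features match the true form of the LQR Q-function exactly, this kind of policy iteration recovers the optimal feedback gain when started from a stabilizing policy; the paper's analysis concerns when such schemes provably converge and how they can fail with other parameter-estimation rules.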
-----------------------------------------------------
FTP INSTRUCTIONS
Either use "Getps bradtke.nips5.ps.Z", or do the following:
unix> ftp archive.cis.ohio-state.edu (or 128.146.8.52)
Name: anonymous
Password: neuron
ftp> cd pub/neuroprose
ftp> binary
ftp> get bradtke.nips5.ps.Z
ftp> quit
unix> uncompress bradtke.nips5.ps
unix> lpr -s bradtke.nips5.ps (or however you print PostScript)
Steve Bradtke