Paper on neuroprose archives
Steve Bradtke
bradtke at picard.gteds.gte.com
Tue Sep 13 08:43:00 EDT 1994
FTP-host: archive.cis.ohio-state.edu
FTP-file: pub/neuroprose/bradtke.rlforlq.ps.Z
Adaptive Linear Quadratic Control Using Policy Iteration (19 pages)
CMPSCI Technical Report 94-49
Steven J. Bradtke (1), B. Erik Ydstie (2), and Andrew G. Barto (1)
(1) Computer Science Department
University of Massachusetts
Amherst, MA 01003
(2) Department of Chemical Engineering
Carnegie Mellon University
Pittsburgh, PA 15213
bradtke at cs.umass.edu
ydstie at andrew.cmu.edu
barto at cs.umass.edu
Abstract
In this paper we present stability and convergence results for
Dynamic Programming-based reinforcement learning applied to Linear
Quadratic Regulation (LQR). The specific algorithm we analyze is based on
Q-learning, and it is proven to converge to the optimal controller provided
that the underlying system is controllable and a particular signal vector
is persistently excited. The performance of the algorithm is illustrated
by applying it to a model of a flexible beam.
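The policy-iteration structure the abstract refers to — evaluate the quadratic Q-function of the current linear policy, then improve the policy greedily — can be sketched as follows. This is a minimal, model-based illustration only: it computes the Q-function matrix H from known system matrices, whereas the paper's Q-learning algorithm estimates H online from observed transitions (via recursive least squares) without a model. The 2-state system below is an assumed toy example, not the flexible-beam model from the paper.

```python
import numpy as np

# Illustrative 2-state system and quadratic costs (assumed for this
# sketch; NOT the flexible-beam model analyzed in the paper).
A = np.array([[1.0, 0.1],
              [0.0, 1.0]])
B = np.array([[0.0],
              [0.1]])
Q = np.eye(2)           # state cost weight
R = np.array([[1.0]])   # control cost weight
n, m = B.shape

def dlyap(Acl, W, iters=500):
    """Fixed-point solve of P = W + Acl' P Acl (valid when Acl is stable)."""
    P = W.copy()
    for _ in range(iters):
        P = W + Acl.T @ P @ Acl
    return P

def q_matrix(K):
    """Q-function matrix H of the linear policy u = K x, so that
    Q_K(x, u) = [x; u]' H [x; u].  Computed here from known (A, B);
    the paper's algorithm instead estimates H directly from data."""
    Acl = A + B @ K
    P = dlyap(Acl, Q + K.T @ R @ K)   # cost-to-go matrix of policy K
    return np.block([[Q + A.T @ P @ A, A.T @ P @ B],
                     [B.T @ P @ A,     R + B.T @ P @ B]])

def policy_iteration(K, steps=20):
    """Alternate policy evaluation (compute H) with greedy improvement."""
    for _ in range(steps):
        H = q_matrix(K)
        # Greedy policy w.r.t. Q_K: u = argmin_u Q_K(x, u)
        K = -np.linalg.solve(H[n:, n:], H[n:, :n])
    return K

# Start from a stabilizing gain, as the convergence analysis requires.
K = policy_iteration(np.array([[-1.0, -2.0]]))
```

At convergence, K is a fixed point of the greedy-improvement step and the closed loop A + BK is stable; in the model-free setting of the paper, the same structure works provided the regression vector used to estimate H is persistently excited.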
Instructions for ftp retrieval of this paper are given below.
Please do not reply directly to this message.
FTP INSTRUCTIONS:
unix> ftp archive.cis.ohio-state.edu (or 128.146.8.52)
Name: anonymous
Password: <your e-mail address>
ftp> cd pub/neuroprose
ftp> binary
ftp> get bradtke.rlforlq.ps.Z
ftp> quit
unix> uncompress bradtke.rlforlq.ps.Z
Thanks to Jordan Pollack for maintaining this archive.
Steve Bradtke
=======================================================================
Steve Bradtke                 (813) 978-6285
GTE Data Services, DC F4M
One E. Telecom Parkway
Temple Terrace, FL 33637
Internet: bradtke@[138.83.42.66]@gte.com
bradtke at cs.umass.edu
=======================================================================