Paper on neuroprose archives
Steve Bradtke
bradtke at picard.gteds.gte.com
Tue Sep 13 08:43:00 EDT 1994
FTP-host: archive.cis.ohio-state.edu
FTP-file: pub/neuroprose/bradtke.rlforlq.ps.Z
Adaptive Linear Quadratic Control Using Policy Iteration (19 pages)
CMPSCI Technical Report 94-49
Steven J. Bradtke (1), B. Erik Ydstie (2), and Andrew G. Barto (1)
(1) Computer Science Department
University of Massachusetts
Amherst, MA 01003
(2) Department of Chemical Engineering
Carnegie Mellon University
Pittsburgh, PA 15213
bradtke at cs.umass.edu
ydstie at andrew.cmu.edu
barto at cs.umass.edu
Abstract
In this paper we present stability and convergence results for
Dynamic Programming-based reinforcement learning applied to Linear
Quadratic Regulation (LQR). The specific algorithm we analyze is based on
Q-learning, and it is proven to converge to the optimal controller provided
that the underlying system is controllable and a particular signal vector
is persistently excited. The performance of the algorithm is illustrated
by applying it to a model of a flexible beam.
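The policy-iteration structure the abstract refers to — evaluate the quadratic Q-function of the current linear policy, then improve the policy greedily — can be sketched as follows. This is a minimal, model-based illustration only: it computes the Q-function matrix H from known system matrices, whereas the paper's Q-learning algorithm estimates H online from observed transitions (via recursive least squares) without a model. The 2-state system below is an assumed toy example, not the flexible-beam model from the paper.

```python
import numpy as np

# Illustrative 2-state system and quadratic costs (assumed for this
# sketch; NOT the flexible-beam model analyzed in the paper).
A = np.array([[1.0, 0.1],
              [0.0, 1.0]])
B = np.array([[0.0],
              [0.1]])
Q = np.eye(2)           # state cost weight
R = np.array([[1.0]])   # control cost weight
n, m = B.shape

def dlyap(Acl, W, iters=500):
    """Fixed-point solve of P = W + Acl' P Acl (valid when Acl is stable)."""
    P = W.copy()
    for _ in range(iters):
        P = W + Acl.T @ P @ Acl
    return P

def q_matrix(K):
    """Q-function matrix H of the linear policy u = K x, so that
    Q_K(x, u) = [x; u]' H [x; u].  Computed here from known (A, B);
    the paper's algorithm instead estimates H directly from data."""
    Acl = A + B @ K
    P = dlyap(Acl, Q + K.T @ R @ K)   # cost-to-go matrix of policy K
    return np.block([[Q + A.T @ P @ A, A.T @ P @ B],
                     [B.T @ P @ A,     R + B.T @ P @ B]])

def policy_iteration(K, steps=20):
    """Alternate policy evaluation (compute H) with greedy improvement."""
    for _ in range(steps):
        H = q_matrix(K)
        # Greedy policy w.r.t. Q_K: u = argmin_u Q_K(x, u)
        K = -np.linalg.solve(H[n:, n:], H[n:, :n])
    return K

# Start from a stabilizing gain, as the convergence analysis requires.
K = policy_iteration(np.array([[-1.0, -2.0]]))
```

At convergence, K is a fixed point of the greedy-improvement step and the closed loop A + BK is stable; in the model-free setting of the paper, the same structure works provided the regression vector used to estimate H is persistently excited.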
Instructions for ftp retrieval of this paper are given below.
Please do not reply directly to this message.
FTP INSTRUCTIONS:
unix> ftp archive.cis.ohio-state.edu (or 128.146.8.52)
Name: anonymous
Password: <your e-mail address>
ftp> cd pub/neuroprose
ftp> binary
ftp> get bradtke.rlforlq.ps.Z
ftp> quit
unix> uncompress bradtke.rlforlq.ps.Z
Thanks to Jordan Pollack for maintaining this archive.
Steve Bradtke
=======================================================================
Steve Bradtke                 (813) 978-6285
GTE Data Services, DC F4M
One E. Telecom Parkway
Temple Terrace, FL 33637
Internet: bradtke@[138.83.42.66]@gte.com
bradtke at cs.umass.edu
=======================================================================