New RL paper and WWW interface to archive
Rich Sutton
sutton at gte.com
Fri May 26 18:26:12 EDT 1995
GENERALIZATION IN REINFORCEMENT LEARNING:
SUCCESSFUL EXAMPLES USING SPARSE COARSE CODING
Richard S. Sutton
submitted to NIPS'95
On large problems, reinforcement learning systems must use
parameterized function approximators such as neural networks in order to
generalize between similar situations and actions. In these cases there
are no strong theoretical results on the accuracy of convergence, and
computational results have been mixed. In particular, Boyan and Moore
reported at last year's meeting a series of negative results in
attempting to apply dynamic programming together with function approximation
to simple control problems with continuous state spaces. In this paper,
we present positive results for all the control tasks they attempted,
and for one that is significantly larger. The most important differences
are that we used sparse-coarse-coded function approximators (CMACs)
whereas they used mostly global function approximators, and that we
learned online whereas they learned offline. Boyan and Moore and
others have suggested that the problems they encountered could be solved
by using actual outcomes ("rollouts"), as in classical Monte Carlo
methods, and as in the TD(lambda) algorithm when lambda=1.
However, in our experiments this always resulted in substantially poorer
performance. We conclude that reinforcement learning can work robustly
in conjunction with function approximators, and that there is little
justification at present for avoiding the case of general lambda.
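As a rough illustration of the kind of function approximator the abstract refers to, here is a minimal sketch of sparse coarse coding (tile coding, as in a CMAC) with an online linear update. The tiling counts, state bounds, and step size below are illustrative assumptions, not values taken from the paper:

```python
# Sketch of sparse coarse coding (tile coding) for a 2-D continuous
# state in [0, 1)^2. Each of several offset tilings contributes exactly
# one active tile, so the feature vector is sparse and binary; the
# value estimate is linear in the weights of the active tiles.
# NUM_TILINGS, TILES_PER_DIM, and alpha are illustrative choices.
import numpy as np

NUM_TILINGS = 8       # number of overlapping, offset tilings
TILES_PER_DIM = 10    # grid resolution of each tiling
WEIGHTS = np.zeros(NUM_TILINGS * TILES_PER_DIM * TILES_PER_DIM)

def active_features(x, y):
    """Return the index of the one active tile in each tiling."""
    idx = []
    for t in range(NUM_TILINGS):
        # each tiling is shifted by a fraction of a tile width
        off = t / (NUM_TILINGS * TILES_PER_DIM)
        i = int((x + off) * TILES_PER_DIM) % TILES_PER_DIM
        j = int((y + off) * TILES_PER_DIM) % TILES_PER_DIM
        idx.append((t * TILES_PER_DIM + i) * TILES_PER_DIM + j)
    return idx

def value(x, y):
    """Linear value estimate: sum of the active tiles' weights."""
    return sum(WEIGHTS[k] for k in active_features(x, y))

def update(x, y, target, alpha=0.1):
    """Online update toward a target, step size split across tilings."""
    err = target - value(x, y)
    for k in active_features(x, y):
        WEIGHTS[k] += (alpha / NUM_TILINGS) * err
```

Because only a handful of weights are touched per update, learning is local: states that share tiles generalize to one another, while distant states do not, which is the "sparse coarse coded" property the abstract contrasts with global approximators.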
________________
The paper is available by ftp as
ftp://ftp.cs.umass.edu/pub/anw/pub/sutton/sutton-inprer.ps.gz
or via a new WWW interface to my small archive at
http://envy.cs.umass.edu/People/sutton/archive.html.
Please update any WWW pointers that still reference the old ftp archive.
More information about the Connectionists mailing list