Real Pole-Balancing

Wed Jan 20 07:13:24 EST 1993

Recently I posted an abstract for a technical report on real
pole-balancing [1]. Andy Barto has since pointed out that the abstract
gives the impression of condemning approximate dynamic programming
methods as tools for learning control.  This was not our intention.

The offending line is "This limits the usefulness of this kind of
learning controller to small problems which are likely to be better
controlled by other means. Before a learning controller can tackle
more difficult problems, a more powerful learning scheme has to be
found."

By "this kind of learning controller" we meant a learning controller
that requires a carefully designed state space decoder. Setting the
parameters of our controller was not straightforward: it took some
trial and error, helped by prior knowledge of the plant. By "more
difficult problems" we meant problems with even more parameters. It
seems reasonable to suggest that a better learning scheme would be
needed in such instances. But that is not to say that an improved
scheme making use of approximate dynamic programming techniques would
not be up to the job.

Andy Barto points out that better learning schemes have already been
produced.  The early ACE/ASE learning algorithm [2] was chosen for our
implementation for speed of execution in a real-time environment. It
might also be considered interesting as a base-line comparison, since
the ACE/ASE controller is relatively well-known.

Barto, Sutton and Anderson used Michie and Chambers' [3] state
representation, since this was the work on which they were improving.
They noted that the state representation is a critical part of the
algorithm, and that it should be adaptive.
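
To make concrete what a fixed "boxes"-style state space decoder
involves, here is a minimal sketch in Python. It is an illustration
only, not the code used in the report: the variable names, the
function box_index, and all of the bin edges are assumptions picked
for the example. Each of the four plant variables is quantised
against hand-chosen thresholds, and the four bin numbers are combined
into a single box index.

```python
from bisect import bisect_right

# Hypothetical bin edges, chosen purely for illustration. These are the
# hand-set decoder parameters that have to be tuned to the plant.
EDGES = {
    "x": [-0.8, 0.8],                        # cart position (m)
    "x_dot": [-0.5, 0.5],                    # cart velocity (m/s)
    "theta": [-6.0, -1.0, 0.0, 1.0, 6.0],    # pole angle (degrees)
    "theta_dot": [-50.0, 50.0],              # pole angular velocity (deg/s)
}
ORDER = ("x", "x_dot", "theta", "theta_dot")

# Total number of boxes is the product of the bin counts per variable.
NUM_BOXES = 1
for _name in ORDER:
    NUM_BOXES *= len(EDGES[_name]) + 1


def box_index(state):
    """Map a continuous state (dict of floats) to a single box index."""
    index = 0
    for name in ORDER:
        edges = EDGES[name]
        # Bin number: how many thresholds lie at or below the value.
        bin_number = bisect_right(edges, state[name])
        # Mixed-radix combination of the per-variable bin numbers.
        index = index * (len(edges) + 1) + bin_number
    return index
```

With these assumed edges, every threshold is a free parameter that
must be set by hand, which is the difficulty described above, and
adding a plant variable multiplies the number of boxes.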

A copy of the report is available by ftp from svr-ftp.eng.cam.ac.uk,
as reports/jervis_tr115.ps.Z.

References:

[1]

@techreport{Jervis92,
author	=	"T.T. Jervis and F. Fallside",
title	=	"Pole Balancing on a Real Rig using a Reinforcement Learning Controller",
year	=	"1992",
month	=	"December",
number	=	"CUED/F-INFENG/TR 115",
institution =	"Cambridge University Engineering Department",
address =	"Trumpington Street, Cambridge, England"}

[2]

@article{Barto83,
author	=	"A.G. Barto and R.S. Sutton and C.W. Anderson",
title	=	"Neuronlike Adaptive Elements That Can Solve Difficult Learning Control Problems",
year	=	"1983",
month	=	"September/October",
journal	=	"IEEE Transactions on Systems, Man and Cybernetics",
volume	=	"SMC-13",
pages	=	"834-846"}

[3]

@incollection{Michie68,
author	=	"D. Michie and R.A. Chambers",
title	=	"Boxes: An Experiment in Adaptive Control",
booktitle	=	"Machine Intelligence",
publisher	=	"Oliver and Boyd",
year	=	"1968",
volume	=	"2",
pages	=	"137-152",
editor	=	"E. Dale and D. Michie"}