Neuroevolution paper, software, and demo announcement
Faustino J. Gomez
inaki at cs.utexas.edu
Tue Nov 5 16:29:18 EST 2002
Dear Connectionists,
Enforced SubPopulations (ESP) version 3.0 is now available. ESP
is a method that uses cooperative coevolution to evolve recurrent
neural networks for difficult reinforcement learning tasks that require
memory. A paper describing the method (abstract below), source code,
and an animated demo of the double pole balancing task are all available at:
http://www.cs.utexas.edu/users/nn/pages/research/ne-methods.html#esp
--Faustino J. Gomez and Risto Miikkulainen
Paper:
-----------------------------------------------------------------------
ROBUST NON-LINEAR CONTROL THROUGH NEUROEVOLUTION
Faustino J. Gomez and Risto Miikkulainen
Department of Computer Sciences,
The University of Texas at Austin.
Technical Report TR-AI-02-292, October 2002.
http://www.cs.utexas.edu/users/nn/pages/publications/abstracts.html#gomez.tr02-292.ps.gz
Abstract:
Many complex control problems require sophisticated
solutions that are not amenable to traditional controller
design. Not only is it difficult to model real-world systems, but
often it is unclear what kind of behavior is required to solve
the task. Reinforcement learning (RL) approaches have
made progress by utilizing direct interaction with the task
environment, but have so far not scaled well to large state
spaces and environments that are not fully observable. In
recent years, neuroevolution, the artificial evolution of neural
networks, has had remarkable success in tasks that exhibit
these two properties, but, like RL methods, requires solutions
to be discovered in simulation and then transferred to the real
world. To ensure that transfer is possible, evolved controllers
need to be robust enough to cope with discrepancies
between these two settings. In this paper, we demonstrate
how a method called Enforced SubPopulations (ESP), for
evolving recurrent neural network controllers, can facilitate
this transfer. The method is first compared to a broad range
of reinforcement learning algorithms on very difficult versions
of the pole balancing problem that involve large (continuous,
high-dimensional) state spaces and hidden state. ESP is
shown to be significantly more efficient and powerful than the
other methods on these tasks. We then present a
model-based method that allows controllers evolved in a
learned model of the environment to successfully transfer to
the real world. We test the method on the most difficult
version of the pole balancing task, and show that the
appropriate use of noise during evolution can improve
transfer significantly by compensating for inaccuracy in the
model.
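
One concrete way to realize such noise during evolution is to perturb
the learned model's state predictions at every simulation step, so that
controllers cannot exploit the model's systematic errors. The sketch
below (in C++, matching the release) illustrates this general idea
only; the function names, the dummy model, and the placement of the
noise are assumptions, not the paper's exact protocol:

    #include <random>
    #include <vector>

    // Placeholder for the learned forward model: maps (state, action)
    // to a predicted next state. A real model would be, e.g., a network
    // trained on trajectories from the physical system.
    std::vector<double> modelStep(const std::vector<double>& state,
                                  double action) {
        std::vector<double> next(state);
        next[0] += 0.01 * action;  // dummy dynamics, illustration only
        return next;
    }

    // Add Gaussian noise to every predicted state variable, so evolved
    // controllers must remain robust to inaccuracy in the model.
    std::vector<double> noisyModelStep(const std::vector<double>& state,
                                       double action, double sigma,
                                       std::mt19937& rng) {
        std::vector<double> next = modelStep(state, action);
        std::normal_distribution<double> noise(0.0, sigma);
        for (double& x : next)
            x += noise(rng);
        return next;
    }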
Software:
-----------------------------------------------------------------------
ESP 3.0 C++ SOURCE CODE
http://www.cs.utexas.edu/users/nn/pages/software/abstracts.html#esp-cpp
Faustino J. Gomez
The ESP package contains source code implementing the Enforced SubPopulations
algorithm and the pole balancing domain. The source code is written in C++,
and is designed for easy extensibility to new tasks.
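
To give a feel for the algorithm, here is a minimal sketch of the core
ESP evaluation loop. The names and data layout are illustrative only
and do not match the actual classes in the package (see the
documentation below):

    #include <algorithm>
    #include <cstdlib>
    #include <functional>
    #include <vector>

    struct Neuron {
        std::vector<double> weights;  // input and output connection weights
        double fitness = 0.0;         // cumulative fitness over trials
        int trials = 0;               // networks this neuron appeared in
    };

    // One subpopulation per hidden unit: each neuron competes only for
    // its own position in the network.
    using SubPop = std::vector<Neuron>;

    void espGeneration(
        std::vector<SubPop>& subpops, int trialsPerGen,
        const std::function<double(const std::vector<Neuron*>&)>& evaluate) {
        for (int t = 0; t < trialsPerGen; ++t) {
            // Assemble a candidate network by drawing one neuron at
            // random from each subpopulation.
            std::vector<Neuron*> net;
            for (SubPop& sp : subpops)
                net.push_back(&sp[std::rand() % sp.size()]);
            // Credit the network's score back to every participant.
            double f = evaluate(net);
            for (Neuron* n : net) { n->fitness += f; ++n->trials; }
        }
        // Rank each subpopulation by average fitness; recombination and
        // mutation of the top-ranked neurons are omitted for brevity.
        for (SubPop& sp : subpops)
            std::sort(sp.begin(), sp.end(),
                      [](const Neuron& a, const Neuron& b) {
                          return a.fitness / std::max(a.trials, 1) >
                                 b.fitness / std::max(b.trials, 1);
                      });
    }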
HTML documentation for the code is available at:
http://www.cs.utexas.edu/users/inaki/espdoc/
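
Extending the package to a new task amounts to supplying a new
evaluation routine for candidate networks. The following is a
hypothetical sketch of what that might look like; the actual interface
in the package may differ, so consult the documentation above:

    class Network;  // evolved recurrent network, provided by the package

    class Task {
    public:
        virtual ~Task() {}
        // Run one episode with the candidate controller and return
        // its fitness.
        virtual double evaluate(Network& controller) = 0;
    };

    class MyNewTask : public Task {
    public:
        double evaluate(Network& controller) override {
            // Reset the simulator, feed observations into the network,
            // apply its outputs as control signals, and score the
            // resulting behavior.
            return 0.0;  // placeholder fitness
        }
    };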
Demo:
-----------------------------------------------------------------------
NON-MARKOV DOUBLE POLE BALANCING
http://www.cs.utexas.edu/users/nn/pages/research/espdemo
Faustino Gomez
The page contains links to movies (in AVI and QuickTime formats) showing the
evolution of controllers for the non-Markov double pole balancing
problem. The best controller from each generation is shown trying to
balance the system using only three of the six state variables (no
velocities).
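
In other words, the controller receives only the cart position and the
two pole angles; the velocities must be inferred internally through the
network's recurrent connections. A sketch of the observation masking
(the struct and names are illustrative, not the package's actual code):

    #include <array>

    struct FullState {
        double cartPos, cartVel;     // cart position and velocity
        double angle1,  angleVel1;   // long pole angle and velocity
        double angle2,  angleVel2;   // short pole angle and velocity
    };

    // Non-Markov observation: the three position/angle variables only.
    std::array<double, 3> observe(const FullState& s) {
        return { s.cartPos, s.angle1, s.angle2 };
    }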