Neuroevolution paper, software, and demo announcement
Faustino J. Gomez
inaki at cs.utexas.edu
Tue Nov 5 16:29:18 EST 2002
Dear Connectionists,
Enforced SubPopulations (ESP) version 3.0 is now available. ESP
is a method that uses cooperative coevolution to evolve recurrent
neural networks for difficult reinforcement learning tasks that require
memory. A paper describing the method (abstract below), source code,
and an animated demo of the double pole balancing task are all available at:
http://www.cs.utexas.edu/users/nn/pages/research/ne-methods.html#esp
--Faustino J. Gomez and Risto Miikkulainen
Paper:
-----------------------------------------------------------------------
ROBUST NON-LINEAR CONTROL THROUGH NEUROEVOLUTION
Faustino J. Gomez and Risto Miikkulainen
Department of Computer Sciences,
The University of Texas at Austin.
Technical Report TR-AI-02-292, October 2002.
http://www.cs.utexas.edu/users/nn/pages/publications/abstracts.html#gomez.tr02-292.ps.gz
Abstract:
Many complex control problems require sophisticated
solutions that are not amenable to traditional controller
design. Not only is it difficult to model real-world systems, but
often it is unclear what kind of behavior is required to solve
the task. Reinforcement learning (RL) approaches have
made progress by utilizing direct interaction with the task
environment, but have so far not scaled well to large state
spaces and environments that are not fully observable. In
recent years, neuroevolution, the artificial evolution of neural
networks, has had remarkable success in tasks that exhibit
these two properties, but, like RL methods, requires solutions
to be discovered in simulation and then transferred to the real
world. To ensure that transfer is possible, evolved controllers
need to be robust enough to cope with discrepancies
between these two settings. In this paper, we demonstrate
how a method called Enforced SubPopulations (ESP), for
evolving recurrent neural network controllers, can facilitate
this transfer. The method is first compared to a broad range
of reinforcement learning algorithms on very difficult versions
of the pole balancing problem that involve large (continuous,
high-dimensional) state spaces and hidden state. ESP is
shown to be significantly more efficient and powerful than the
other methods on these tasks. We then present a
model-based method that allows controllers evolved in a
learned model of the environment to successfully transfer to
the real world. We test the method on the most difficult
version of the pole balancing task, and show that the
appropriate use of noise during evolution can improve
transfer significantly by compensating for inaccuracy in the
model.
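
One concrete way to realize such noise during evolution is to perturb
the learned model's state predictions at every simulation step, so that
controllers cannot exploit the model's systematic errors. The sketch
below (in C++, matching the release) illustrates this general idea
only; the function names, the dummy model, and the placement of the
noise are assumptions, not the paper's exact protocol:

    #include <random>
    #include <vector>

    // Placeholder for the learned forward model: maps (state, action)
    // to a predicted next state. A real model would be, e.g., a network
    // trained on trajectories from the physical system.
    std::vector<double> modelStep(const std::vector<double>& state,
                                  double action) {
        std::vector<double> next(state);
        next[0] += 0.01 * action;  // dummy dynamics, illustration only
        return next;
    }

    // Add Gaussian noise to every predicted state variable, so evolved
    // controllers must remain robust to inaccuracy in the model.
    std::vector<double> noisyModelStep(const std::vector<double>& state,
                                       double action, double sigma,
                                       std::mt19937& rng) {
        std::vector<double> next = modelStep(state, action);
        std::normal_distribution<double> noise(0.0, sigma);
        for (double& x : next)
            x += noise(rng);
        return next;
    }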
Software:
-----------------------------------------------------------------------
ESP 3.0 C++ SOURCE CODE
http://www.cs.utexas.edu/users/nn/pages/software/abstracts.html#esp-cpp
Faustino J. Gomez
The ESP package contains source code implementing the Enforced SubPopulations
algorithm and the pole balancing domain. The source code is written in C++,
and is designed for easy extensibility to new tasks.
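
To give a feel for the algorithm, here is a minimal sketch of the core
ESP evaluation loop. The names and data layout are illustrative only
and do not match the actual classes in the package (see the
documentation below):

    #include <algorithm>
    #include <cstdlib>
    #include <functional>
    #include <vector>

    struct Neuron {
        std::vector<double> weights;  // input and output connection weights
        double fitness = 0.0;         // cumulative fitness over trials
        int trials = 0;               // networks this neuron appeared in
    };

    // One subpopulation per hidden unit: each neuron competes only for
    // its own position in the network.
    using SubPop = std::vector<Neuron>;

    void espGeneration(
        std::vector<SubPop>& subpops, int trialsPerGen,
        const std::function<double(const std::vector<Neuron*>&)>& evaluate) {
        for (int t = 0; t < trialsPerGen; ++t) {
            // Assemble a candidate network by drawing one neuron at
            // random from each subpopulation.
            std::vector<Neuron*> net;
            for (SubPop& sp : subpops)
                net.push_back(&sp[std::rand() % sp.size()]);
            // Credit the network's score back to every participant.
            double f = evaluate(net);
            for (Neuron* n : net) { n->fitness += f; ++n->trials; }
        }
        // Rank each subpopulation by average fitness; recombination and
        // mutation of the top-ranked neurons are omitted for brevity.
        for (SubPop& sp : subpops)
            std::sort(sp.begin(), sp.end(),
                      [](const Neuron& a, const Neuron& b) {
                          return a.fitness / std::max(a.trials, 1) >
                                 b.fitness / std::max(b.trials, 1);
                      });
    }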
HTML documentation for the code is available at:
http://www.cs.utexas.edu/users/inaki/espdoc/
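
Extending the package to a new task amounts to supplying a new
evaluation routine for candidate networks. The following is a
hypothetical sketch of what that might look like; the actual interface
in the package may differ, so consult the documentation above:

    class Network;  // evolved recurrent network, provided by the package

    class Task {
    public:
        virtual ~Task() {}
        // Run one episode with the candidate controller and return
        // its fitness.
        virtual double evaluate(Network& controller) = 0;
    };

    class MyNewTask : public Task {
    public:
        double evaluate(Network& controller) override {
            // Reset the simulator, feed observations into the network,
            // apply its outputs as control signals, and score the
            // resulting behavior.
            return 0.0;  // placeholder fitness
        }
    };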
Demo:
-----------------------------------------------------------------------
NON-MARKOV DOUBLE POLE BALANCING
http://www.cs.utexas.edu/users/nn/pages/research/espdemo
Faustino Gomez
The page contains links to movies (in AVI and QuickTime formats) showing the
evolution of controllers for the non-Markov double pole balancing
problem. The best controller from each generation is shown trying to
balance the system using only three of the six state variables (no
velocities).
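
In other words, the controller receives only the cart position and the
two pole angles; the velocities must be inferred internally through the
network's recurrent connections. A sketch of the observation masking
(the struct and names are illustrative, not the package's actual code):

    #include <array>

    struct FullState {
        double cartPos, cartVel;     // cart position and velocity
        double angle1,  angleVel1;   // long pole angle and velocity
        double angle2,  angleVel2;   // short pole angle and velocity
    };

    // Non-Markov observation: the three position/angle variables only.
    std::array<double, 3> observe(const FullState& s) {
        return { s.cartPos, s.angle1, s.angle2 };
    }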