TD-Gammon paper available in neuroprose

Tue Jun 1 20:07:38 EDT 1993

The following paper, which has been accepted for publication
in Neural Computation, has been placed in the neuroprose
archive at Ohio State. Instructions for retrieving the paper
by anonymous ftp are appended below.

---------------------------------------------------------------
   TD-Gammon, A Self-Teaching Backgammon Program,
          Achieves Master-Level Play

              Gerald Tesauro
     IBM Thomas J. Watson Research Center
               P. O. Box 704
         Yorktown Heights, NY 10598
          (tesauro at watson.ibm.com)

Abstract:
TD-Gammon is a neural network that is able to teach
itself to play backgammon solely by playing against
itself and learning from the results, based on the
TD(lambda) reinforcement learning algorithm (Sutton, 1988).
Despite starting from random initial weights (and hence
random initial strategy), TD-Gammon achieves a surprisingly
strong level of play.  With zero knowledge built in at the
start of learning (i.e. given only a ``raw'' description
of the board state), the network learns to play at a strong
intermediate level.  Furthermore, when a set of hand-crafted
features is added to the network's input representation, the
result is a truly staggering level of performance:
the latest version of TD-Gammon is now estimated to
play at a strong master level that is extremely close to that
of the world's best human players.
---------------------------------------------------------------
FTP INSTRUCTIONS

     unix% ftp archive.cis.ohio-state.edu (or 128.146.8.52)
     Name: anonymous
     Password: (use your e-mail address)
     ftp> cd pub/neuroprose
     ftp> binary
     ftp> get tesauro.tdgammon.ps.Z
     ftp> bye
     unix% uncompress tesauro.tdgammon.ps.Z
     unix% lpr tesauro.tdgammon.ps
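
The TD(lambda) update named in the abstract can be illustrated on a
much smaller problem. The following is a minimal sketch, not the
TD-Gammon program: it assumes a toy five-state random walk with a
lookup table of value estimates in place of a neural network and a
backgammon board. The function name, the step size alpha, the
trace-decay lam, and the episode count are all hypothetical choices
made for illustration.

```python
import random


def td_lambda_random_walk(episodes=10000, alpha=0.005, lam=0.7,
                          gamma=1.0, seed=0):
    """Tabular TD(lambda) with accumulating eligibility traces.

    Toy problem (an assumption for illustration, not backgammon):
    states 0..4 in a row, start in the middle, step left or right
    with equal probability. Falling off the left end pays 0,
    falling off the right end pays 1.
    """
    rng = random.Random(seed)
    n = 5
    values = [0.5] * n               # initial value estimates
    for _ in range(episodes):
        traces = [0.0] * n           # eligibility traces, reset per episode
        state = n // 2
        while True:
            next_state = state + rng.choice((-1, 1))
            if next_state < 0:       # left terminal: outcome 0
                reward, next_value, done = 0.0, 0.0, True
            elif next_state >= n:    # right terminal: outcome 1
                reward, next_value, done = 1.0, 0.0, True
            else:                    # ordinary step: no reward yet
                reward, next_value, done = 0.0, values[next_state], False

            # Temporal-difference error between successive predictions.
            delta = reward + gamma * next_value - values[state]

            # Mark the current state as eligible, then credit every
            # recently visited state in proportion to its trace.
            traces[state] += 1.0
            for s in range(n):
                values[s] += alpha * delta * traces[s]
                traces[s] *= gamma * lam

            if done:
                break
            state = next_state
    return values


if __name__ == "__main__":
    for s, v in enumerate(td_lambda_random_walk()):
        print(f"state {s}: estimated value {v:.3f}")
```

In this toy setting the true probability of finishing on the right
from state s is (s + 1) / 6, so the learned values should rise from
left to right. In a self-play setting the table would be replaced by
a function approximator evaluating board positions, which this sketch
does not attempt.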