TD-Gammon paper available in neuroprose
Gerald Tesauro
tesauro at watson.ibm.com
Tue Jun 1 20:07:38 EDT 1993
The following paper, which has been accepted for publication
in Neural Computation, has been placed in the neuroprose
archive at Ohio State. Instructions for retrieving the paper
by anonymous ftp are appended below.
---------------------------------------------------------------
TD-Gammon, A Self-Teaching Backgammon Program,
Achieves Master-Level Play
Gerald Tesauro
IBM Thomas J. Watson Research Center
P. O. Box 704
Yorktown Heights, NY 10598
(tesauro at watson.ibm.com)
Abstract:
TD-Gammon is a neural network that is able to teach
itself to play backgammon solely by playing against
itself and learning from the results, based on the
TD(lambda) reinforcement learning algorithm (Sutton, 1988).
Despite starting from random initial weights (and hence
random initial strategy), TD-Gammon achieves a surprisingly
strong level of play. With zero knowledge built in at the
start of learning (i.e. given only a ``raw'' description
of the board state), the network learns to play at a strong
intermediate level. Furthermore, when a set of hand-crafted
features is added to the network's input representation, the
result is a truly staggering level of performance:
the latest version of TD-Gammon is now estimated to
play at a strong master level that is extremely close to the
world's best human players.
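
For readers unfamiliar with TD(lambda), the weight update the abstract
refers to can be sketched roughly as follows. This is only an
illustration under stated assumptions, not the paper's implementation:
it uses a linear value estimate instead of a neural network, applies no
discounting, and all names (td_lambda_episode, alpha, lam) are
hypothetical.

```python
def td_lambda_episode(weights, states, outcome, alpha=0.1, lam=0.7):
    """Apply one game's worth of TD(lambda) updates to a linear predictor.

    weights: list of floats (an updated copy is returned)
    states:  feature vectors x_1..x_T observed during one game
    outcome: final result z observed when the game ends
    All names here are illustrative assumptions, not taken from the paper.
    """
    w = list(weights)
    trace = [0.0] * len(w)  # eligibility trace: decayed sum of past gradients
    for t, x in enumerate(states):
        pred = sum(wi * xi for wi, xi in zip(w, x))
        # For a linear predictor the gradient with respect to w is just x.
        trace = [lam * e + xi for e, xi in zip(trace, x)]
        if t + 1 < len(states):
            # Non-terminal step: the target is the next state's prediction.
            target = sum(wi * xi for wi, xi in zip(w, states[t + 1]))
        else:
            # Terminal step: the target is the actual game outcome.
            target = outcome
        delta = target - pred  # temporal-difference error
        w = [wi + alpha * delta * e for wi, e in zip(w, trace)]
    return w
```

With lam = 0 only the most recent state is adjusted by each error;
larger values of lam spread the same error back over earlier states in
the game.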
---------------------------------------------------------------
FTP INSTRUCTIONS
unix% ftp archive.cis.ohio-state.edu (or 128.146.8.52)
Name: anonymous
Password: (use your e-mail address)
ftp> cd pub/neuroprose
ftp> binary
ftp> get tesauro.tdgammon.ps.Z
ftp> bye
unix% uncompress tesauro.tdgammon.ps.Z
unix% lpr tesauro.tdgammon.ps