Technical report available on reinforcement learning and chess

Wed Nov 26 06:34:36 EST 1997

Technical Report Available
--------------------------

Title
-----
KnightCap: A chess program that learns by combining TD($\lambda$) with
minimax search. 

Authors
-------
Jonathan Baxter, Andrew Tridgell and Lex Weaver.

Department of Systems Engineering and Department of Computer Science,
Australian National University.

Abstract
--------
In this paper we present TDLeaf($\lambda$), a variation on the
TD($\lambda$) algorithm that enables it to be used in conjunction with
minimax search. We present some experiments in which our chess
program, ``KnightCap,'' used TDLeaf($\lambda$) to learn its evaluation
function while playing on the Free Internet Chess Server (FICS,
fics.onenet.net). KnightCap improved from a rating of 1650 to 2100 in
just 308 games and 3 days of play (equivalent to a human improving
from mediocre to expert). A more recent version of KnightCap
is currently playing on the "Non-Free" Internet Chess Server
(ICC, chessclub.com) with a rating of around 2500. We discuss some of the
reasons for this success and also the relationship between our results
and Tesauro's results in backgammon.
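
For readers unfamiliar with the idea, here is a minimal sketch of a
TDLeaf($\lambda$)-style weight update over one game. It is an illustration
only, not KnightCap's code. It assumes a linear evaluation function, assumes
the feature vector of the principal-variation leaf found by minimax search
has already been computed for each position, and treats the game result as
the value of the final position. The names tdleaf_update, leaf_features,
alpha and lam are hypothetical.

```python
def tdleaf_update(weights, leaf_features, final_outcome, alpha=0.1, lam=0.7):
    """Return updated weights after one game (illustrative sketch only).

    leaf_features[t] is the feature vector of the principal-variation leaf
    reached by minimax search from the t-th position of the game. The search
    itself is assumed to have been done elsewhere.
    """
    def evaluate(features):
        # Assumption: linear evaluation, so its gradient is the feature vector.
        return sum(w * f for w, f in zip(weights, features))

    # Leaf evaluations for each position, followed by the game result.
    values = [evaluate(f) for f in leaf_features] + [final_outcome]

    # Temporal differences between successive leaf evaluations.
    diffs = [values[t + 1] - values[t] for t in range(len(leaf_features))]

    new_weights = list(weights)
    for t, features in enumerate(leaf_features):
        # Lambda-discounted sum of the temporal differences from step t onward.
        discounted = sum(lam ** (j - t) * diffs[j] for j in range(t, len(diffs)))
        for i, f in enumerate(features):
            new_weights[i] += alpha * f * discounted
    return new_weights


if __name__ == "__main__":
    # Toy game: two positions, two features, ending in a win (outcome 1.0).
    print(tdleaf_update([0.0, 0.0], [[1.0, 0.0], [0.0, 1.0]], 1.0,
                        alpha=0.5, lam=0.5))
```

The only difference from a plain TD($\lambda$) sketch is where the features
come from: they describe the leaf of the search's principal variation rather
than the position actually on the board.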

Download Instructions
---------------------
You can ftp the paper directly from
ftp://syseng.anu.edu.au/~jon/publish/papers/knightcap.tar.gz

If you want to learn more about KnightCap, check out
http://syseng.anu.edu.au/lsg and follow the knightcap link. From there
you can retrieve the paper, download the latest source code for
KnightCap, and watch a version of KnightCap ("KnightC") playing on ICC
with our chess applet.