Technical report available on reinforcement learning and chess
Jonathan Baxter
Jon.Baxter at
Wed Nov 26 06:34:36 EST 1997
Technical Report Available
KnightCap: A chess program that learns by combining TD($\lambda$) with
minimax search.
Jonathan Baxter, Andrew Tridgell and Lex Weaver.
Department of Systems Engineering and Department of Computer Science,
Australian National University.
In this paper we present TDLeaf($\lambda$), a variation on the
TD($\lambda$) algorithm that enables it to be used in conjunction with
minimax search. We present some experiments in which our chess
program, ``KnightCap,'' used TDLeaf($\lambda$) to learn its evaluation
function while playing on the Free Internet Chess Server (FICS). It improved from a 1650 rating to a 2100 rating in
just 308 games and 3 days of play (equivalent to improving from
mediocre to expert for a human). A more recent version of KnightCap
is currently playing on the "Non-Free" Internet Chess Server
(ICC) with a rating of around 2500. We discuss some of the
reasons for this success and also the relationship between our results
and Tesauro's results in backgammon.
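
As a rough illustration of the idea in the abstract, the following is a minimal sketch of a TDLeaf($\lambda$)-style weight update, not KnightCap's actual code. It assumes a tanh-squashed linear evaluation function and assumes that the minimax search has already been run, so that each position in the game is represented by the feature vector of its principal-variation leaf. The names evaluate, tdleaf_update, leaf_features, alpha and lam are hypothetical and chosen only for this example.

```python
import math


def evaluate(weights, features):
    """Squashed linear evaluation of one position (an assumed form)."""
    return math.tanh(sum(w * f for w, f in zip(weights, features)))


def tdleaf_update(weights, leaf_features, outcome, alpha=0.1, lam=0.7):
    """Return updated weights after one finished game.

    leaf_features[t] is the feature vector of the leaf reached by the
    principal variation of the search from the t-th position.
    outcome is the final game result, scaled to [-1, 1].
    """
    values = [evaluate(weights, f) for f in leaf_features]
    # Each value is compared with the next one; the last value is
    # compared with the true outcome of the game.
    targets = values[1:] + [outcome]
    diffs = [targets[t] - values[t] for t in range(len(values))]

    new_weights = list(weights)
    for t, feats in enumerate(leaf_features):
        # Lambda-discounted sum of the temporal differences from t onward.
        credit = sum(lam ** (j - t) * diffs[j] for j in range(t, len(diffs)))
        # Gradient of tanh(w . f) with respect to w is (1 - value^2) * f.
        grad_scale = 1.0 - values[t] ** 2
        for i, f in enumerate(feats):
            new_weights[i] += alpha * grad_scale * f * credit
    return new_weights
```

The only difference from plain TD($\lambda$) in this sketch is where the feature vectors come from: they describe the leaves of the search rather than the positions actually reached on the board, so the evaluation function is trained on the positions it is really asked to score during play.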
Download Instructions
You can ftp the paper directly from
If you want to learn more about KnightCap, check out and follow the knightcap link. There you can
retrieve the paper and the latest source code for KnightCap,
and watch a version of KnightCap ("KnightC") playing on ICC with our
chess applet.