TR: Comparing Value-Function Estimation Algorithms in Undiscounted Problems
Csaba Szepesvari
szepes at mindmaker.hu
Thu Nov 18 13:32:57 EST 1999
Dear Colleagues,
The following technical report is available at
http://victoria.mindmaker.hu/~szepes/papers/slowql-tr99-02.ps.gz
All comments are welcome.
Best wishes,
Csaba Szepesvari
----------------------------------------------------------------
Comparing Value-Function Estimation Algorithms in Undiscounted Problems
TR99-02, Mindmaker Ltd., Budapest 1121, Konkoly Th. M. u. 29-33
Ferenc Beleznay, Tamas Grobler and Csaba Szepesvari
We compare scaling properties of several value-function estimation
algorithms. In particular, we prove that Q-learning can scale
exponentially slowly with the number of states. We identify the
reasons for the slow convergence and show that both TD($\lambda$) and
learning with a fixed learning-rate enjoy rather fast convergence,
just like the model-based method.
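
The contrast drawn in the abstract between decaying and fixed learning
rates can be illustrated with a small sketch. Everything in it is an
assumption made for illustration and is not taken from the report: the
deterministic chain, the reward of 1 on leaving the last state, the
1/k and 0.5 step sizes, and the name estimate_chain_values. With a
single action per state, Q-learning reduces to the tabular TD(0)
update used here.

```python
def estimate_chain_values(n_states, n_sweeps, step_size):
    """Tabular TD(0) on a deterministic, undiscounted chain (toy example).

    State s moves to s + 1; leaving the last state ends the episode with
    reward 1, so every state has true value 1. step_size(k) returns the
    learning rate used on the k-th visit to a state (k starts at 1).
    """
    values = [0.0] * n_states
    for k in range(1, n_sweeps + 1):
        alpha = step_size(k)
        # One episode: visit the states in order, starting from state 0.
        for s in range(n_states):
            last = s == n_states - 1
            reward = 1.0 if last else 0.0
            next_value = 0.0 if last else values[s + 1]
            values[s] += alpha * (reward + next_value - values[s])
    return values


# Hypothetical settings: 10 states, 200 episodes.
decaying = estimate_chain_values(10, 200, lambda k: 1.0 / k)
fixed = estimate_chain_values(10, 200, lambda k: 0.5)
print("start-state estimate, 1/k rate:  ", round(decaying[0], 3))
print("start-state estimate, fixed rate:", round(fixed[0], 3))
```

In this toy setting the 1/k schedule leaves the start-state estimate
far below its true value of 1 after 200 episodes, while the fixed rate
has essentially converged. The report itself should be consulted for
the actual constructions and proofs.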