TR: Comparing Value-Function Estimation Algorithms in Undiscounted Problems

Thu Nov 18 13:32:57 EST 1999

Dear Colleagues,

The following technical report is available at

 http://victoria.mindmaker.hu/~szepes/papers/slowql-tr99-02.ps.gz

All comments are welcome.

 Best wishes,

  Csaba Szepesvari

----------------------------------------------------------------
Comparing Value-Function Estimation Algorithms in Undiscounted Problems
TR99-02, Mindmaker Ltd., Budapest 1121, Konkoly Th. M. u. 29-33
Ferenc Beleznay, Tamas Grobler and Csaba Szepesvari

We compare the scaling properties of several value-function estimation
algorithms. In particular, we prove that the convergence of Q-learning
can slow down exponentially with the number of states. We identify the
reasons for the slow convergence and show that both TD($\lambda$) and
learning with a fixed learning rate enjoy rather fast convergence, just
like the model-based method.
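
As a rough illustration of the learning-rate effect mentioned in the
abstract, the following Python sketch runs undiscounted value-estimation
sweeps on a toy deterministic chain. With a single action per state, the
Q-learning update reduces to the TD(0)-style update used here. The
chain, the synchronous sweeps, the function name evaluate_chain, the
step-size schedules (1/t and 0.5) and the constants are all assumptions
made for illustration; they are not the construction or the experiments
of the report.

```python
def evaluate_chain(n_states, n_sweeps, step_size):
    """Synchronous TD(0)-style sweeps on a toy chain (illustrative assumption).

    State s moves deterministically to s + 1 with reward 0; the last state
    moves to a terminal state with reward 1. There is no discounting, so
    the true value of every state is 1.
    """
    values = [0.0] * n_states
    for t in range(1, n_sweeps + 1):
        alpha = step_size(t)   # learning rate used in sweep t
        old = values[:]        # synchronous update: read only old values
        for s in range(n_states):
            if s == n_states - 1:
                target = 1.0           # reward 1, terminal value 0
            else:
                target = old[s + 1]    # reward 0, undiscounted successor value
            values[s] = old[s] + alpha * (target - old[s])
    return values


# Hypothetical constants, chosen only to keep the example small.
N_STATES, N_SWEEPS = 10, 200

decaying = evaluate_chain(N_STATES, N_SWEEPS, lambda t: 1.0 / t)
fixed = evaluate_chain(N_STATES, N_SWEEPS, lambda t: 0.5)

# Error of the estimate at the first state (true value is 1).
err_decaying = abs(1.0 - decaying[0])
err_fixed = abs(1.0 - fixed[0])

print("error at first state, 1/t learning rate:  ", err_decaying)
print("error at first state, fixed learning rate:", err_fixed)
```

Under these assumptions, the 1/t schedule leaves the estimate at the
first state far from its true value after 200 sweeps, while the fixed
learning rate is essentially converged. Increasing N_STATES makes the
gap more pronounced.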