TR: Comparing Value-Function Estimation Algorithms in Undiscounted Problems
    Csaba Szepesvari 
    szepes at mindmaker.hu
       
    Thu Nov 18 13:32:57 EST 1999
    
    
  
Dear Colleagues,
The following technical report is available at
 http://victoria.mindmaker.hu/~szepes/papers/slowql-tr99-02.ps.gz
All comments are welcome.
 Best wishes,
  Csaba Szepesvari
----------------------------------------------------------------
Comparing Value-Function Estimation Algorithms in Undiscounted Problems
TR99-02, Mindmaker Ltd., Budapest 1121, Konkoly Th. M. u. 29-33
Ferenc Beleznay, Tamas Grobler and Csaba Szepesvari
We compare the scaling properties of several value-function estimation
algorithms. In particular, we prove that Q-learning can scale
exponentially slowly with the number of states. We identify the reasons
for the slow convergence and show that both TD($\lambda$) and learning
with a fixed learning rate enjoy rather fast convergence, just like the
model-based method.
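
For readers who want a feel for the claim, here is a minimal sketch in
Python. It is an illustration only and is not taken from the report: the
deterministic chain, the reward of 1 per step, the synchronous update
order, the 1/t schedule, and the name chain_error are all assumptions
made for this example. With a single action per state, the Q-learning
update reduces to a plain value update, so the sketch compares a
decaying 1/t learning rate with a fixed learning rate of 0.5.

```python
def chain_error(n_states, n_steps, step_size):
    """Return the max absolute value error after synchronous updates.

    Hypothetical setup (not from the report): state i moves
    deterministically to state i + 1 with reward 1, and state n_states
    is terminal with value 0. There is no discounting. With one action
    per state, the Q-learning update reduces to
        V[i] <- (1 - a) * V[i] + a * (1 + V[i + 1]).
    """
    values = [0.0] * (n_states + 1)  # last entry is the terminal state
    for t in range(1, n_steps + 1):
        a = step_size(t)
        old = values[:]  # synchronous sweep: all updates use old values
        for i in range(n_states):
            values[i] = (1 - a) * old[i] + a * (1.0 + old[i + 1])
    # True undiscounted value of state i is the number of steps to the end.
    true_values = [float(n_states - i) for i in range(n_states)]
    return max(abs(v - tv) for v, tv in zip(values, true_values))


# Same chain and same number of sweeps, two learning-rate schedules.
decaying_error = chain_error(10, 1000, lambda t: 1.0 / t)
fixed_error = chain_error(10, 1000, lambda t: 0.5)
print("1/t learning rate, max error:  ", decaying_error)
print("fixed learning rate, max error:", fixed_error)
```

In this toy setting, information has to travel one state at a time from
the end of the chain, and a 1/t schedule passes it along more and more
reluctantly, while a fixed rate keeps passing it at a constant pace.
Increasing n_states makes the gap between the two schedules more
pronounced.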
    
    