Hierarchical Reinforcement Learning (tech report)

Sun Aug 24 14:42:36 EDT 1997

The following technical report is available in gzipped PostScript
format from:

ftp://ftp.cs.orst.edu/pub/tgd/papers/tr-maxq.ps.gz

             Hierarchical Reinforcement Learning with the
                  MAXQ Value Function Decomposition

                         Thomas G. Dietterich
                    Department of Computer Science
                       Oregon State University
                         Corvallis, OR 97331

                               Abstract

   This paper describes the MAXQ method for hierarchical reinforcement
   learning, which is based on a hierarchical decomposition of the value
   function, and derives conditions under which the MAXQ decomposition can
   represent the optimal value function.  We show that for certain
   execution models, the MAXQ decomposition will produce better policies
   than Feudal Q-learning.
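
For readers who want a concrete picture of what a hierarchical
decomposition of the value function can look like, here is a minimal
Python sketch. It assumes the recursive form Q(p, s, a) = V(a, s) +
C(p, s, a), where V(a, s) is the value of executing child a from state s
and C(p, s, a) is the expected reward for completing parent task p after
a finishes. The task names, the single state "s0", and every number
below are hypothetical and chosen only for illustration; the report
itself should be consulted for the exact definitions and for the
learning algorithm, which this sketch does not attempt to show.

```python
from typing import Dict, List, Tuple

# Hypothetical task hierarchy: composite tasks map to their child tasks.
# Any task that is not a key here is treated as a primitive action.
CHILDREN: Dict[str, List[str]] = {
    "root": ["navigate", "wait"],
    "navigate": ["left", "right"],
}

# Made-up expected one-step rewards V(a, s) for primitive actions.
PRIMITIVE_V: Dict[Tuple[str, str], float] = {
    ("left", "s0"): -1.0,
    ("right", "s0"): -1.0,
    ("wait", "s0"): -2.0,
}

# Made-up completion values C(parent, state, child).
COMPLETION: Dict[Tuple[str, str, str], float] = {
    ("navigate", "s0", "left"): 0.0,
    ("navigate", "s0", "right"): -3.0,
    ("root", "s0", "navigate"): 10.0,
    ("root", "s0", "wait"): 5.0,
}


def q_value(parent: str, state: str, child: str) -> float:
    """Q(parent, state, child) = V(child, state) + C(parent, state, child)."""
    return value(child, state) + COMPLETION.get((parent, state, child), 0.0)


def value(task: str, state: str) -> float:
    """V(task, state): table lookup for primitives, max over children otherwise."""
    if task not in CHILDREN:
        return PRIMITIVE_V.get((task, state), 0.0)
    return max(q_value(task, state, child) for child in CHILDREN[task])


def greedy_child(task: str, state: str) -> str:
    """Child of a composite task with the highest decomposed Q value."""
    return max(CHILDREN[task], key=lambda child: q_value(task, state, child))


if __name__ == "__main__":
    print(value("root", "s0"), greedy_child("root", "s0"))
```

The point of the sketch is only that the value of the root task is never
stored directly: it is reassembled from the small tables attached to
each subtask, so a subtask's table can in principle be shared wherever
that subtask appears in the hierarchy.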