Hierarchical Reinforcement Learning (tech report)
Tom Dietterich
tgd at CS.ORST.EDU
Sun Aug 24 14:42:36 EDT 1997
The following technical report is available in gzipped PostScript
format from:
ftp://ftp.cs.orst.edu/pub/tgd/papers/tr-maxq.ps.gz
Hierarchical Reinforcement Learning with the
MAXQ Value Function Decomposition
Thomas G. Dietterich
Department of Computer Science
Oregon State University
Corvallis, OR 97331
Abstract
This paper describes the MAXQ method for hierarchical reinforcement
learning based on a hierarchical decomposition of the value function
and derives conditions under which the MAXQ decomposition can
represent the optimal value function. We show that for certain
execution models, the MAXQ decomposition will produce better policies
than Feudal Q-learning.