Technical Report Announcement: Reinforcement Learning with Temporal Abstraction
Rich Sutton
rich at cs.umass.edu
Mon Jul 20 11:54:13 EDT 1998
We are pleased to announce the public availability of the following
technical report:
Between MDPs and semi-MDPs:
Learning, planning, and representing knowledge at multiple temporal scales.
by Richard S. Sutton, Doina Precup, and Satinder Singh
Learning, planning, and representing knowledge at multiple levels
of temporal abstraction are key challenges for Artificial Intelligence.
In this paper we develop an approach to these problems based on the
mathematical framework of reinforcement learning and Markov
decision processes (MDPs). We extend the usual notion of action to
include {\it options}---whole courses of behavior that may be temporally
extended, stochastic, and contingent on events. Examples of options
include picking up an object, going to lunch, and traveling to a distant
city, as well as primitive actions such as muscle twitches and joint
torques. Options may be given a priori, learned by experience, or
both. They may be used interchangeably with actions in a variety of
planning and learning methods. The theory of semi-Markov decision
processes (SMDPs) can be applied to model the consequences of
options and as a basis for planning and learning methods using them.
In this paper we develop these connections, building on prior work by
Bradtke and Duff (1995), Parr (in prep.) and others. Our main novel
results concern the interface between the MDP and SMDP levels of
analysis. We show how a set of options can be altered by changing
only their termination conditions to improve over SMDP methods
with no additional cost. We also introduce {\it intra-option}
temporal-difference methods that are able to learn from fragments of
an option's execution. Finally, we propose a notion of subgoal which
can be used to improve the options themselves. Overall, we argue that
options and their models provide hitherto missing aspects of a
powerful, clear, and expressive framework for representing and
organizing knowledge.
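
As a rough illustration of the idea of an option described above, here is a
minimal Python sketch. It is not code from the report. It assumes an option
can be represented by three pieces: where it may start, how it behaves, and
when it terminates. The corridor world, the names Option, can_start, policy,
term_prob, step, primitive, run_option and go_to_goal, and the discount
factor are all hypothetical choices made for this example only.

```python
import random
from dataclasses import dataclass
from typing import Callable, Optional, Tuple

State = int
Action = int  # -1 = step left, +1 = step right (hypothetical corridor world)
GOAL = 5      # assumed right-hand end of the corridor


@dataclass
class Option:
    """Illustrative container for a temporally extended course of behavior."""
    name: str
    can_start: Callable[[State], bool]   # states where the option may begin
    policy: Callable[[State], Action]    # action chosen while the option runs
    term_prob: Callable[[State], float]  # probability of stopping in a state


def step(state: State, action: Action) -> Tuple[State, float]:
    """Assumed dynamics: states 0..GOAL, reward 1.0 on first reaching GOAL."""
    nxt = max(0, min(GOAL, state + action))
    reward = 1.0 if (nxt == GOAL and state != GOAL) else 0.0
    return nxt, reward


def primitive(action: Action) -> Option:
    """Wrap a primitive action as an option that always stops after one step."""
    return Option(f"a{action:+d}", lambda s: True, lambda s: action, lambda s: 1.0)


def run_option(opt: Option, state: State, gamma: float = 0.9,
               rng: Optional[random.Random] = None) -> Tuple[float, State, int]:
    """Execute opt from state until it terminates.

    Returns (discounted return, final state, number of steps taken), which is
    the kind of summary an SMDP-style method would use for one option.
    """
    rng = rng or random.Random(0)
    if not opt.can_start(state):
        raise ValueError(f"option {opt.name} cannot start in state {state}")
    total, discount, k = 0.0, 1.0, 0
    while True:
        state, reward = step(state, opt.policy(state))
        total += discount * reward
        discount *= gamma
        k += 1
        # Termination is stochastic in general; here it is checked per step.
        if rng.random() < opt.term_prob(state):
            return total, state, k


# A multi-step option: keep moving right and stop only at GOAL.
go_to_goal = Option("go-to-goal",
                    can_start=lambda s: s < GOAL,
                    policy=lambda s: +1,
                    term_prob=lambda s: 1.0 if s == GOAL else 0.0)
```

Under these assumptions, run_option(go_to_goal, 2) takes three steps and ends
in state 5, while run_option(primitive(+1), 2) ends after a single step in
state 3. Both go through the same interface, which is one way to read the
claim that options may be used interchangeably with actions.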
ftp://ftp.cs.umass.edu/pub/anw/pub/sutton/SPS-98.ps.gz
39 pages, 1.8 MBytes.