On-Line, Interactive RL Tutorial
Mance E. Harmon
harmonme at aa.wpafb.af.mil
Mon Jan 13 16:06:16 EST 1997
Reinforcement Learning: An On-Line, Interactive Tutorial
by Mance E. Harmon
http://eureka1.aa.wpafb.af.mil/rltutorial
(hardcopy length: 19 Pages)
Scope of Tutorial
The purpose of this tutorial is to provide an introduction to reinforcement
learning (RL) at a level easily understood by students and researchers in a
wide range of disciplines. The intent is not to present a rigorous
mathematical discussion that requires a great deal of effort on the part of
the reader, but rather to present a conceptual framework that might serve as
an introduction to a more rigorous study of RL. The tutorial covers the
fundamental principles and techniques used to solve RL problems, and the most
popular RL algorithms are demonstrated interactively using WebSim, a
Java-based simulation development environment. Section 1 presents an overview
of RL and provides a simple example to develop intuition for the underlying
dynamic programming mechanism. In Section 2 the parts of a reinforcement
learning problem are discussed. These include the environment, reinforcement
function, and value function. Section 3 gives a description of the most widely
used reinforcement learning algorithms. These include TD(lambda) and both the
residual and direct forms of value iteration, Q-learning, and advantage
learning. In Section 4 some of the ancillary issues in RL are briefly
discussed, such as choosing an exploration strategy and an appropriate
discount factor. The conclusion is given in Section 5. Finally, Section 6 is a
glossary of commonly used terms, followed by references in Section 7 and a
bibliography of RL applications in Section 8. It is assumed that the reader
has some knowledge of learning algorithms that rely on gradient descent (such
as the backpropagation of errors algorithm).
Mance Harmon
harmonme at aa.wpafb.af.mil