On-Line, Interactive RL Tutorial

Mon Jan 13 16:06:16 EST 1997

Reinforcement Learning: An On-Line, Interactive Tutorial
by Mance E. Harmon

http://eureka1.aa.wpafb.af.mil/rltutorial
(hardcopy length: 19 pages)

Scope of Tutorial

The purpose of this tutorial is to provide an introduction to reinforcement 
learning (RL) at a level easily understood by students and researchers in a 
wide range of disciplines. The intent is not to present a rigorous 
mathematical discussion that requires a great deal of effort on the part of 
the reader, but rather to present a conceptual framework that might serve as 
an introduction to a more rigorous study of RL. The tutorial presents the 
fundamental principles and techniques used to solve RL problems, and it 
demonstrates the most popular RL algorithms interactively using WebSim, a 
Java-based simulation development environment. Section 1 presents an overview 
of RL and provides a simple example to develop intuition for the underlying 
dynamic programming mechanism. In Section 2 the parts of a reinforcement 
learning problem are discussed. These include the environment, reinforcement 
function, and value function. Section 3 gives a description of the most widely 
used reinforcement learning algorithms. These include TD(lambda) and both the 
residual and direct forms of value iteration, Q-learning, and advantage 
learning. In Section 4 some of the ancillary issues in RL are briefly 
discussed, such as choosing an exploration strategy and an appropriate 
discount factor. The conclusion is given in Section 5. Finally, Section 6 is a 
glossary of commonly used terms, followed by references in Section 7 and a 
bibliography of RL applications in Section 8. It is assumed that the reader 
has some knowledge of learning algorithms that rely on gradient descent (such 
as the backpropagation of errors algorithm). 

Mance Harmon
harmonme at aa.wpafb.af.mil