Tech Reports from CBCL at MIT

Mon Aug 16 12:37:02 EDT 1993

The following technical reports from the Center for Biological
and Computational Learning at M.I.T. are now available via 
anonymous ftp.

--------------
:CBCL Paper #83/AI Memo #1440
:author Michael I. Jordan and Robert A. Jacobs
:title Hierarchical Mixtures of Experts and the EM Algorithm
:date August 1993
:pages 29

We present a tree-structured architecture for supervised learning.  
The statistical model underlying the architecture is a hierarchical 
mixture model in which both the mixture coefficients and the mixture 
components are generalized linear models (GLIMs).  Learning is treated 
as a maximum likelihood problem; in particular, we present an Expectation-
Maximization (EM) algorithm for adjusting the parameters of the architecture.
We also develop an on-line learning algorithm in which the parameters are 
updated incrementally.  Comparative simulation results are presented in 
the robot dynamics domain.
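
As a rough illustration of the kind of model described above, the 
following Python sketch computes the mean output of a two-level 
hierarchical mixture.  It assumes softmax (multinomial logit) gates and 
linear experts with scalar outputs, which is one possible choice of 
GLIMs; the function and argument names are made up for this example, 
and the EM fitting procedure from the report is not implemented here.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(w, x):
    """Inner product of two equal-length lists."""
    return sum(wi * xi for wi, xi in zip(w, x))


def hme_predict(x, top_gate, low_gates, experts):
    """Mean output of a two-level hierarchical mixture of linear experts.

    top_gate:  one weight vector per top-level branch i
    low_gates: low_gates[i] holds one weight vector per expert under branch i
    experts:   experts[i][j] is the weight vector of expert (i, j)
    All names and shapes are assumptions made for this sketch.
    """
    g_top = softmax([dot(v, x) for v in top_gate])
    y = 0.0
    for i, g_i in enumerate(g_top):
        g_low = softmax([dot(v, x) for v in low_gates[i]])
        for j, g_ij in enumerate(g_low):
            # Each expert's output is weighted by the product of the
            # gate probabilities along the path from the root to it.
            y += g_i * g_ij * dot(experts[i][j], x)
    return y
```

With all gate weights set to zero the gates are uniform, so the 
prediction reduces to the plain average of the expert outputs.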

--------------
:CBCL Paper #84/AI Memo #1441
:author Tommi Jaakkola, Michael I. Jordan and Satinder P. Singh
:title On the Convergence of Stochastic Iterative Dynamic Programming Algorithms
:date August 1993
:pages 13

Recent developments in the area of reinforcement learning have yielded 
a number of new algorithms for the prediction and control of Markovian 
environments.  These algorithms, including the TD(lambda) algorithm of 
Sutton (1988) and the Q-learning algorithm of Watkins (1989), can be 
motivated heuristically as approximations to dynamic programming (DP).  
In this paper we provide a rigorous proof of convergence of these DP-based 
learning algorithms by relating them to the powerful techniques of 
stochastic approximation theory via a new convergence theorem.  The 
theorem establishes a general class of convergent algorithms to which 
both TD(lambda) and Q-learning belong.
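
For readers unfamiliar with Q-learning, a single tabular update step 
can be sketched in Python as below.  This is the standard textbook form 
of the update, not code from the report; the table layout (a list of 
per-state lists of action values) and all names are assumptions made 
for this example, and the step-size conditions required by the 
convergence theorem are not encoded.

```python
def q_learning_update(q, state, action, reward, next_state, alpha, gamma):
    """Apply one tabular Q-learning update in place.

    q is a list indexed by state, each entry a list of action values.
    alpha is the step size and gamma the discount factor.
    Returns the temporal-difference error used for the update.
    """
    # Value of the best action available in the successor state.
    best_next = max(q[next_state])
    td_error = reward + gamma * best_next - q[state][action]
    q[state][action] += alpha * td_error
    return td_error
```

The returned error is the difference between the one-step lookahead 
target and the current estimate, which is the quantity the stochastic 
iteration drives toward zero on average.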

============================

How to get a copy of the above reports:

The files are in compressed PostScript format and are named by their 
AI memo number, e.g., the Jordan and Jacobs paper is named 
AIM-1440.ps.Z.  

Here is the procedure for ftp-ing:

unix> ftp ftp.ai.mit.edu   (log in as anonymous)
ftp>  cd ai-pubs/publications/1993
ftp>  binary
ftp>  get AIM-number.ps.Z
ftp>  quit
unix> zcat AIM-number.ps.Z | lpr
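
For those who would rather script the transfer, the session above can 
be sketched in Python roughly as follows.  The host and directory are 
the ones given above; the function names are made up for this example, 
and the sketch assumes the server still accepts anonymous logins.  It 
only downloads the compressed file and does not uncompress or print it.

```python
from ftplib import FTP

HOST = "ftp.ai.mit.edu"                   # host named in the procedure above
DIRECTORY = "ai-pubs/publications/1993"   # directory named in the procedure above


def memo_filename(memo_number):
    """Build the file name for an AI memo number, e.g. 1440 -> AIM-1440.ps.Z."""
    return "AIM-%d.ps.Z" % memo_number


def fetch_memo(memo_number, host=HOST, directory=DIRECTORY):
    """Download one memo by anonymous ftp into the current directory.

    Hypothetical helper: it assumes the host is reachable and the file exists.
    """
    name = memo_filename(memo_number)
    with FTP(host) as ftp:
        ftp.login()                # no arguments means an anonymous login
        ftp.cwd(directory)
        with open(name, "wb") as out:
            # retrbinary transfers in binary mode, like "binary" then "get"
            ftp.retrbinary("RETR " + name, out.write)
    return name
```

For example, fetch_memo(1440) would request the Jordan and Jacobs 
paper under the name AIM-1440.ps.Z.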

I will periodically update the above list as new titles become
available.

Best wishes,

Reza Shadmehr
Center for Biological and Computational Learning
M.I.T.
Cambridge, MA 02139