Tech Reports from CBCL at MIT

Reza Shadmehr reza at ai.mit.edu
Mon Aug 16 12:37:02 EDT 1993


The following technical reports from the Center for Biological
and Computational Learning at M.I.T. are now available via 
anonymous ftp.

--------------
:CBCL Paper #83/AI Memo #1440
:author Michael I. Jordan and Robert A. Jacobs
:title Hierarchical Mixtures of Experts and the EM Algorithm
:date August 1993
:pages 29

We present a tree-structured architecture for supervised learning.  
The statistical model underlying the architecture is a hierarchical 
mixture model in which both the mixture coefficients and the mixture 
components are generalized linear models (GLIMs).  Learning is treated 
as a maximum likelihood problem; in particular, we present an Expectation-
Maximization (EM) algorithm for adjusting the parameters of the architecture.
We also develop an on-line learning algorithm in which the parameters are 
updated incrementally.  Comparative simulation results are presented in 
the robot dynamics domain.
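
As a rough illustration of the architecture described in this abstract, 
here is a minimal Python sketch of the predictive mean of a two-level 
hierarchical mixture.  It assumes softmax gating networks and linear 
experts (the identity-link case of a generalized linear model).  The names 
hme_predict, top_gate, sub_gates and experts are invented for this 
example and do not come from the memo, and no EM fitting is shown.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of real scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(w, x):
    """Inner product of two equal-length lists."""
    return sum(wi * xi for wi, xi in zip(w, x))

def hme_predict(x, top_gate, sub_gates, experts):
    """Predictive mean of a two-level hierarchical mixture of experts.

    top_gate:  one weight vector per top-level branch
    sub_gates: sub_gates[i] holds one weight vector per expert under branch i
    experts:   experts[i][j] is the weight vector of linear expert (i, j)
    (all three names are hypothetical, chosen for this sketch)
    """
    g_top = softmax([dot(v, x) for v in top_gate])
    y = 0.0
    for i, g_i in enumerate(g_top):
        g_sub = softmax([dot(v, x) for v in sub_gates[i]])
        for j, g_ij in enumerate(g_sub):
            # Each expert's output is weighted by the product of the
            # gating probabilities along its path from the root.
            y += g_i * g_ij * dot(experts[i][j], x)
    return y

# Usage: with all-zero gating weights every expert gets weight 1/4.
x = [1.0, 1.0]
zero = [0.0, 0.0]
experts = [[[1.0, 0.0], [2.0, 0.0]], [[3.0, 0.0], [4.0, 0.0]]]
y = hme_predict(x, [zero, zero], [[zero, zero], [zero, zero]], experts)
```

With uniform gating the prediction is simply the average of the four 
expert outputs; non-zero gating weights let the mixing proportions vary 
with the input x.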

--------------
:CBCL Paper #84/AI Memo #1441
:author Tommi Jaakkola, Michael I. Jordan and Satinder P. Singh
:title On the Convergence of Stochastic Iterative Dynamic Programming Algorithms
:date August 1993
:pages 13

Recent developments in the area of reinforcement learning have yielded 
a number of new algorithms for the prediction and control of Markovian 
environments.  These algorithms, including the TD(lambda) algorithm of 
Sutton (1988) and the Q-learning algorithm of Watkins (1989), can be 
motivated heuristically as approximations to dynamic programming (DP).  
In this paper we provide a rigorous proof of convergence of these DP-based 
learning algorithms by relating them to the powerful techniques of 
stochastic approximation theory via a new convergence theorem.  The 
theorem establishes a general class of convergent algorithms to which 
both TD(lambda) and Q-learning belong.
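
For readers unfamiliar with the algorithms named in this abstract, the 
following is a minimal Python sketch of the tabular Q-learning update 
with a 1/n step size per state-action pair, a schedule whose step sizes 
sum to infinity while their squares have a finite sum.  The toy 
two-state environment and the names step_size, q_update and run_toy are 
assumptions made for this example; they are not taken from the memo.

```python
import random

def step_size(n_visits):
    """1/n schedule: the step sizes sum to infinity, their squares do not."""
    return 1.0 / n_visits

def q_update(Q, s, a, r, s_next, alpha, gamma):
    """One Q-learning backup on a table Q[state][action], done in place."""
    target = r + gamma * max(Q[s_next])
    Q[s][a] += alpha * (target - Q[s][a])

def run_toy(n_steps=5000, gamma=0.9, seed=0):
    """Random exploration of a made-up 2-state, 2-action deterministic MDP:
    taking action a moves to state a, and moving to state 1 pays reward 1."""
    rng = random.Random(seed)
    Q = [[0.0, 0.0], [0.0, 0.0]]
    visits = [[0, 0], [0, 0]]
    s = 0
    for _ in range(n_steps):
        a = rng.randrange(2)            # uniformly random action
        s_next, r = a, float(a == 1)    # hypothetical dynamics and reward
        visits[s][a] += 1
        q_update(Q, s, a, r, s_next, step_size(visits[s][a]), gamma)
        s = s_next
    return Q
```

For this toy problem with gamma = 0.9 the fixed point of the update is 
Q(s,1) = 10 and Q(s,0) = 9 in both states.  A run of a few thousand steps 
may still be some distance from those values, so the sketch is meant to 
show the form of the update rather than a practical step-size choice.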

============================

How to get a copy of the above reports:

The files are in compressed postscript format and are named by their 
AI memo number, e.g., the Jordan and Jacobs paper is named 
AIM-1440.ps.Z.  

Here is the procedure for ftp-ing:

unix> ftp ftp.ai.mit.edu   (log in as anonymous)
ftp>  cd ai-pubs/publications/1993
ftp>  binary
ftp>  get AIM-number.ps.Z
ftp>  quit
unix> zcat AIM-number.ps.Z | lpr
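
The same retrieval can be scripted.  The following is a minimal Python 
sketch using the standard ftplib module; it assumes the host and 
directory given above are reachable, which may not hold at the time of 
reading.  The function names memo_filename and fetch_memo are made up 
for this example.

```python
from ftplib import FTP

HOST = "ftp.ai.mit.edu"                   # host from the procedure above
DIRECTORY = "ai-pubs/publications/1993"   # directory from the procedure above

def memo_filename(memo_number):
    """Build the file name used for an AI memo, e.g. AIM-1440.ps.Z."""
    return f"AIM-{memo_number}.ps.Z"

def fetch_memo(memo_number, host=HOST, directory=DIRECTORY):
    """Download one memo over anonymous FTP into the current directory."""
    name = memo_filename(memo_number)
    with FTP(host) as ftp:
        ftp.login()                       # no arguments means anonymous login
        ftp.cwd(directory)
        with open(name, "wb") as out:
            # retrbinary transfers in binary mode, like the "binary" command
            ftp.retrbinary(f"RETR {name}", out.write)
    return name
```

Calling fetch_memo(1440) would request the Jordan and Jacobs paper.  The 
downloaded file is still compressed, so it needs zcat or uncompress 
before printing, as in the last step of the manual procedure.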


I will periodically update the above list as new titles become
available.


Best wishes,

Reza Shadmehr
Center for Biological and Computational Learning
M. I. T.
Cambridge, MA 02139




