Tech Reports from CBCL at MIT
Reza Shadmehr
reza at ai.mit.edu
Mon Aug 16 12:37:02 EDT 1993
The following technical reports from the Center for Biological
and Computational Learning at M.I.T. are now available via
anonymous ftp.
--------------
:CBCL Paper #83/AI Memo #1440
:author Michael I. Jordan and Robert A. Jacobs
:title Hierarchical Mixtures of Experts and the EM Algorithm
:date August 1993
:pages 29
We present a tree-structured architecture for supervised learning.
The statistical model underlying the architecture is a hierarchical
mixture model in which both the mixture coefficients and the mixture
components are generalized linear models (GLIM's). Learning is treated
as a maximum likelihood problem; in particular, we present an Expectation-
Maximization (EM) algorithm for adjusting the parameters of the architecture.
We also develop an on-line learning algorithm in which the parameters are
updated incrementally. Comparative simulation results are presented in
the robot dynamics domain.
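
As a rough illustration of the E-step in such an architecture, here is a minimal sketch for a two-level tree. It is not taken from the memo: it assumes Gaussian experts with unit variance and linear means, softmax gating networks, and a scalar output, and the names (e_step, top_gate, nested_gates, experts) are hypothetical. The M-step, which would refit each gate and expert with these posteriors as weights, is omitted.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def gaussian(y, mean, sigma=1.0):
    """Density of a scalar Gaussian (unit variance is an assumption)."""
    z = (y - mean) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def e_step(x, y, top_gate, nested_gates, experts):
    """Posterior responsibilities for one (x, y) pair in a two-level tree.

    top_gate[i]        : weight vector of the top-level gate for branch i
    nested_gates[i][j] : weight vector of the lower gate for expert j in branch i
    experts[i][j]      : weight vector giving the linear mean of expert (i, j)
    """
    def dot(w):
        return sum(wk * xk for wk, xk in zip(w, x))

    g = softmax([dot(v) for v in top_gate])      # prior over branches
    h_nested, branch_lik = [], []
    for i, gates in enumerate(nested_gates):
        g_cond = softmax([dot(v) for v in gates])  # prior over experts in branch i
        joint = [g_cond[j] * gaussian(y, dot(experts[i][j]))
                 for j in range(len(gates))]
        total = sum(joint)
        branch_lik.append(total)
        h_nested.append([p / total for p in joint])  # posterior within branch i

    weighted = [g[i] * branch_lik[i] for i in range(len(g))]
    evidence = sum(weighted)
    h_top = [w / evidence for w in weighted]         # posterior over branches
    return h_top, h_nested
```

With uniform gates, the branch and expert whose linear prediction lies closest to the observed y receive most of the posterior weight.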
--------------
:CBCL Paper #84/AI Memo #1441
:author Tommi Jaakkola, Michael I. Jordan and Satinder P. Singh
:title On the Convergence of Stochastic Iterative Dynamic Programming Algorithms
:date August 1993
:pages 13
Recent developments in the area of reinforcement learning have yielded
a number of new algorithms for the prediction and control of Markovian
environments. These algorithms, including the TD(lambda) algorithm of
Sutton (1988) and the Q-learning algorithm of Watkins (1989), can be
motivated heuristically as approximations to dynamic programming (DP).
In this paper we provide a rigorous proof of convergence of these DP-based
learning algorithms by relating them to the powerful techniques of
stochastic approximation theory via a new convergence theorem. The
theorem establishes a general class of convergent algorithms to which
both TD(lambda) and Q-learning belong.
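
To make the setting concrete, here is a small sketch of tabular Q-learning with a decaying step size. It is an illustration only, not the memo's proof or its experiments: the two-state chain, the noisy reward, the discount factor of 0.5, and the step size n^(-0.7) (whose sum diverges while the sum of its squares converges) are all assumptions chosen for the example.

```python
import random


def q_learning(episodes=5000, gamma=0.5, seed=0):
    """Tabular Q-learning on a hypothetical chain: states 0 and 1, terminal state 2.

    Action 1 moves right, action 0 moves left (staying put at state 0).
    Reaching the terminal state pays a reward with mean 1; all other
    transitions pay a reward with mean 0.
    """
    rng = random.Random(seed)
    n_states, n_actions, terminal = 2, 2, 2
    q = [[0.0] * n_actions for _ in range(n_states)]
    visits = [[0] * n_actions for _ in range(n_states)]

    for _ in range(episodes):
        s = 0
        while s != terminal:
            a = rng.randrange(n_actions)  # uniform random exploration
            s_next = s + 1 if a == 1 else max(s - 1, 0)
            # Zero-mean noise makes each update a noisy sample of the DP backup.
            r = (1.0 if s_next == terminal else 0.0) + rng.uniform(-0.1, 0.1)

            visits[s][a] += 1
            alpha = visits[s][a] ** -0.7  # decaying step size per (state, action)
            future = 0.0 if s_next == terminal else max(q[s_next])
            q[s][a] += alpha * (r + gamma * future - q[s][a])
            s = s_next
    return q
```

For this chain the optimal values are Q(1, right) = 1, Q(0, right) = 0.5, and Q(0, left) = Q(1, left) = 0.25, and the learned table should settle close to them.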
============================
How to get a copy of the above reports:
The files are in compressed postscript format and are named by their
AI memo number, e.g., the Jordan and Jacobs paper is named
AIM-1440.ps.Z.
Here is the procedure for ftp-ing:
unix> ftp ftp.ai.mit.edu (log in as anonymous)
ftp> cd ai-pubs/publications/1993
ftp> binary
ftp> get AIM-number.ps.Z
ftp> quit
unix> zcat AIM-number.ps.Z | lpr
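
The same retrieval can be scripted. The sketch below uses Python's standard ftplib module with the host and directory given above; the function names memo_filename and fetch_memo are hypothetical, and it assumes the server still accepts anonymous logins. It only downloads the file; uncompressing the .Z file is still left to zcat.

```python
from ftplib import FTP

HOST = "ftp.ai.mit.edu"
DIRECTORY = "ai-pubs/publications/1993"


def memo_filename(memo_number):
    """Build the file name for an AI memo, e.g. 1440 -> AIM-1440.ps.Z."""
    return f"AIM-{memo_number}.ps.Z"


def fetch_memo(memo_number, host=HOST, directory=DIRECTORY):
    """Download one memo in binary mode and return the local file name."""
    name = memo_filename(memo_number)
    with FTP(host) as ftp:
        ftp.login()          # no arguments means an anonymous login
        ftp.cwd(directory)
        with open(name, "wb") as out:
            ftp.retrbinary(f"RETR {name}", out.write)
    return name
```

For example, fetch_memo(1440) would request the Jordan and Jacobs paper.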
I will periodically update the above list as new titles become
available.
Best wishes,
Reza Shadmehr
Center for Biological and Computational Learning
M. I. T.
Cambridge, MA 02139