preprints available

Lawrence K. Saul lsaul at research.att.com
Tue Dec 22 14:22:55 EST 1998


The following preprints are available at http://www.research.att.com/~lsaul.

==============================================================================

         ATTRACTOR DYNAMICS IN FEEDFORWARD NEURAL NETWORKS
   
                   L. Saul and M. Jordan

We study probabilistic generative models parameterized by
feedforward neural networks.  An attractor dynamics for probabilistic
inference in these models is derived from a mean field approximation
for large, layered sigmoidal networks.  Fixed points of the dynamics
correspond to solutions of the mean field equations, which relate the
statistics of each unit to those of its Markov blanket.  We establish
global convergence of the dynamics by providing a Lyapunov function
and show that the dynamics generate the signals required for
unsupervised learning.  Our results for feedforward networks provide a
counterpart to those of Cohen-Grossberg and Hopfield for symmetric
networks.
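The damped fixed-point iteration behind such mean field dynamics can be
illustrated with a toy two-layer sigmoid belief network.  This sketch is
not the paper's derivation: the network sizes, random weights, and the
simplified update (top-down prior plus bottom-up prediction error,
dropping the weak-coupling correction terms of the full mean field
equations) are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))

# Toy sigmoid belief network: m hidden causes generate n visible units.
m, n = 4, 8
W = rng.normal(0.0, 1.0, size=(n, m))   # generative weights, hidden -> visible
b = rng.normal(0.0, 1.0, size=m)        # hidden biases
v = rng.integers(0, 2, size=n)          # one observed binary visible vector

# Damped fixed-point iteration on the hidden unit means mu.
# Each unit's mean is driven by its Markov blanket: its prior bias
# (parents) plus feedback from the visible units it helps predict (children).
mu = np.full(m, 0.5)
eta = 0.2                               # damping step size
for t in range(2000):
    pred = sigmoid(W @ mu)              # predicted visible means
    field = b + W.T @ (v - pred)        # prior + bottom-up prediction error
    new_mu = (1 - eta) * mu + eta * sigmoid(field)
    if np.max(np.abs(new_mu - mu)) < 1e-8:
        mu = new_mu
        break
    mu = new_mu

print(mu)                               # converged hidden-unit means in (0, 1)
```

At the fixed point, each mean satisfies the (simplified) mean field
equation relating it to the statistics of its neighbors, which is the
role fixed points play in the abstract above.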

==============================================================================

   MARKOV PROCESSES ON CURVES FOR AUTOMATIC SPEECH RECOGNITION

                    L. Saul and M. Rahim

We investigate a probabilistic framework for automatic speech
recognition based on the intrinsic geometric properties of curves.  In
particular, we analyze the setting in which two variables---one
continuous (X), one discrete (S)---evolve jointly in time.  We suppose
that the vector X traces out a smooth multidimensional curve and that
the variable S evolves stochastically as a function of the arc length
traversed along this curve.  Since arc length does not depend on the
rate at which a curve is traversed, this gives rise to a family of
Markov processes whose predictions, Pr[S|X], are invariant to
nonlinear warpings of time.  We describe the use of such models, known
as Markov processes on curves (MPCs), for automatic speech
recognition, where X are acoustic feature trajectories and S are
phonetic transcriptions.  On two tasks---recognizing New Jersey town
names and connected alpha-digits---we find that MPCs yield lower word
error rates than comparably trained hidden Markov models.
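The invariance property underlying MPCs---that arc length does not
depend on the rate at which a curve is traversed---can be checked
numerically.  The curve and the time warping below are made-up examples,
not data from the paper.

```python
import numpy as np

def cumulative_arclength(X):
    """Cumulative arc length along a sampled curve X of shape (T, d)."""
    steps = np.linalg.norm(np.diff(X, axis=0), axis=1)
    return np.concatenate([[0.0], np.cumsum(steps)])

# The same smooth 2-D curve, traced at two different speeds.
t_uniform = np.linspace(0.0, 1.0, 2001)
t_warped = t_uniform ** 2            # a nonlinear warping of time
curve = lambda t: np.stack([np.cos(2 * np.pi * t), np.sin(np.pi * t)], axis=1)

L1 = cumulative_arclength(curve(t_uniform))[-1]
L2 = cumulative_arclength(curve(t_warped))[-1]
print(L1, L2)    # total arc length agrees despite the warping
```

Because the discrete arc length sums the same geometric path in both
cases, the two totals agree up to discretization error---which is why a
process driven by arc length yields predictions Pr[S|X] invariant to how
fast the trajectory X is traversed.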

==============================================================================
