TR Announcement

Wed Jun 30 18:45:48 EDT 1999

The following technical report is available online at 
http://cogsci.ucsd.edu  (follow links to Tech Reports & Software )
Physical copies are also available (see the site for information).

Modeling Path Distributions Using Partially Observable Diffusion
Networks: A Monte-Carlo Approach.

	                     Paul Mineiro
                      Department of Cognitive Science
                     University of California San Diego

                         Javier R. Movellan 
                     Department of Cognitive Science
	                           &
                     Institute for Neural Computation
                     University of California San Diego

                            Ruth J. Williams	
                        Department of Mathematics
	                           &
                     Institute for Neural Computation
                     University of California San Diego

  Hidden Markov models have been more successful than recurrent neural
  networks for problems involving temporal sequences, e.g., speech
  recognition.  One possible reason for this is that 
  recurrent neural networks are being used in ways that do not handle
  temporal uncertainty well. In this paper we present a framework for
  learning, recognition and stochastic filtering of temporal sequences
  based on a probabilistic version of continuous recurrent neural
  networks. We call these networks diffusion (neural) networks for
  they are based on stochastic diffusion processes defined by adding
  Brownian motion to the standard recurrent neural network
  dynamics. The goal is to combine the versatility of recurrent neural
  networks with the power of probabilistic techniques.  We focus on
  the problem of learning to approximate a desired probability
  distribution of sequences.  Once a distribution of sequences has
  been learned, well known techniques can be applied for the generation,
  recognition and stochastic filtering of new sequences.  We present
  an adaptive importance sampling scheme for estimation of
  log-likelihood gradients.  This allows the use of iterative
  optimization techniques, like gradient descent and the EM algorithm,
  to train diffusion networks.  We present results for an automatic
  visual speech recognition task in which diffusion networks provide
  excellent performance when compared to hidden Markov models.