TR on HMM induction available

Thu Jan 27 21:35:32 EST 1994

The following technical report is now available from ICSI.
FTP instructions appear at the end of this message.

Note that this report is a greatly expanded and revised follow-up to our
paper in last year's NIPS volume.  It replaces report TR-93-003, which was
cited in that paper but never released because we decided to include
substantial new material instead.  We regret any confusion or inconvenience
this may have caused.

Andreas Stolcke
Stephen Omohundro

--------------------------------------------------------------------------

Best-first Model Merging for Hidden Markov Model Induction

Andreas Stolcke and Stephen M. Omohundro
TR-94-003
January 1994

Abstract:

This report describes a new technique for inducing the structure of
Hidden Markov Models from data, based on the general `model
merging' strategy (Omohundro 1992). The process begins with a
maximum likelihood HMM that directly encodes the training
data. Successively more general models are produced by merging HMM
states. A Bayesian posterior probability criterion is used to
determine which states to merge and when to stop generalizing. The
procedure may be considered a heuristic search for the HMM structure
with the highest posterior probability.  We discuss a variety of
possible priors for HMMs, as well as a number of approximations that
improve the computational efficiency of the algorithm.
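
As a rough illustration of the merging loop described above, here is a
minimal Python sketch.  It is not the implementation from the report: the
prior (a fixed penalty per distinct transition and emission), the
count-based likelihood, and all function names are assumptions made for
this example only.

```python
import math
from collections import Counter
from itertools import combinations

START, END = "START", "END"  # hypothetical non-emitting boundary states


def initial_model(samples):
    """Maximum-likelihood model: one private chain of states per sample."""
    trans, emit = Counter(), Counter()
    next_id = 0
    for sample in samples:
        prev = START
        for symbol in sample:
            state, next_id = next_id, next_id + 1
            trans[(prev, state)] += 1
            emit[(state, symbol)] += 1
            prev = state
        trans[(prev, END)] += 1
    return trans, emit


def merge(model, keep, drop):
    """Merge state `drop` into state `keep` by pooling their counts."""
    def rename(s):
        return keep if s == drop else s

    new_trans, new_emit = Counter(), Counter()
    for (src, dst), c in model[0].items():
        new_trans[(rename(src), rename(dst))] += c
    for (state, symbol), c in model[1].items():
        new_emit[(rename(state), symbol)] += c
    return new_trans, new_emit


def log_likelihood(counts):
    """Sum of c * log(c / total), with totals taken per source state."""
    totals = Counter()
    for (src, _), c in counts.items():
        totals[src] += c
    return sum(c * math.log(c / totals[src]) for (src, _), c in counts.items())


def log_posterior(model, penalty=1.0):
    """Assumed prior: a fixed penalty per distinct transition and emission."""
    trans, emit = model
    log_prior = -penalty * (len(trans) + len(emit))
    return log_prior + log_likelihood(trans) + log_likelihood(emit)


def states_of(model):
    """Emitting states, i.e. everything except START and END."""
    return sorted({state for state, _ in model[1]})


def best_first_merge(samples, penalty=1.0):
    """Repeatedly apply the best-scoring merge; stop when the score drops."""
    model = initial_model(samples)
    score = log_posterior(model, penalty)
    while True:
        pairs = combinations(states_of(model), 2)
        candidates = [merge(model, a, b) for a, b in pairs]
        if not candidates:
            return model
        best = max(candidates, key=lambda m: log_posterior(m, penalty))
        best_score = log_posterior(best, penalty)
        if best_score < score:
            return model
        model, score = best, best_score
```

For example, best_first_merge(["ab", "abab", "ababab"]) starts from a
12-state model and keeps merging for as long as the best available merge
does not lower the score.  Scoring every pair of states at every step is
the simplest choice here and is not meant to be efficient.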

We studied three applications to evaluate the procedure. The first
compares the merging algorithm with the standard Baum-Welch approach
in inducing simple finite-state languages from small, positive-only
training samples. We found that the merging procedure is more robust
and accurate, particularly with a small amount of training data.  The
second application uses labelled speech data from the TIMIT database
to build compact, multiple-pronunciation word models that can be used
in speech recognition.  Finally, we describe how the algorithm was
incorporated in an operational speech understanding system, where it
is combined with neural network acoustic likelihood estimators to
improve performance over single-pronunciation word models.

--------------------------------------------------------------------------

Instructions for retrieving ICSI technical reports via ftp

Replace YEAR and tr-XX-YYY with the appropriate year and TR number.
If your name server cannot resolve ftp.icsi.berkeley.edu,
use the IP address 128.32.201.55 instead.

	unix% ftp ftp.icsi.berkeley.edu
	Name (ftp.icsi.berkeley.edu:): anonymous
	Password: your_name@your_machine
	ftp> cd /pub/techreports/YEAR
	ftp> binary
	ftp> get tr-XX-YYY.ps.Z
	ftp> quit
	unix% uncompress tr-XX-YYY.ps.Z
	unix% lpr -Pyour_printer tr-XX-YYY.ps
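
The retrieval steps above could also be scripted.  The following is a
minimal sketch using Python's standard ftplib module, assuming the server
is reachable and accepts anonymous logins.  The helper names are
hypothetical, and the sketch assumes that YEAR is the four-digit year and
that XX in tr-XX-YYY is its last two digits (as in TR-94-003).

```python
from ftplib import FTP

HOST = "ftp.icsi.berkeley.edu"


def report_location(year, number):
    """Return (directory, filename) for a report such as TR-94-003.

    Assumption: the directory uses the four-digit year and the file name
    uses the two-digit year followed by a three-digit report number.
    """
    filename = f"tr-{year % 100:02d}-{number:03d}.ps.Z"
    return f"/pub/techreports/{year}", filename


def fetch_report(year, number, email="your_name@your_machine"):
    """Download one compressed PostScript report via anonymous FTP."""
    directory, filename = report_location(year, number)
    with FTP(HOST) as ftp:
        ftp.login(user="anonymous", passwd=email)
        ftp.cwd(directory)
        # retrbinary transfers in binary mode, like the `binary` command.
        with open(filename, "wb") as out:
            ftp.retrbinary(f"RETR {filename}", out.write)
    return filename
```

Calling fetch_report(1994, 3) would then attempt to download
tr-94-003.ps.Z into the current directory; the file still has to be
uncompressed before printing.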

All files in this archive can also be obtained through an e-mail interface
if direct ftp is not available.  Send mail containing the line
`send help' to ftpmail@ICSI.Berkeley.EDU for instructions.

As a last resort, hard copies may be ordered for a small fee.
Send mail to info@ICSI.Berkeley.EDU for more information.