Thesis available on using SOM and LVQ for HMMs
Mikko Kurimo
mikkok at marconi.hut.fi
Mon Nov 10 08:58:10 EST 1997
The following Dr.Tech. thesis is available at
http://www.cis.hut.fi/~mikkok/thesis/ (WWW home page)
http://www.cis.hut.fi/~mikkok/thesis/book/ (output of latex2html)
http://www.cis.hut.fi/~mikkok/intro.ps.gz (compressed postscript, 188K)
http://www.cis.hut.fi/~mikkok/intro.ps (postscript, 57 pages, 712K)
The articles that belong to the thesis can be accessed through the page
http://www.cis.hut.fi/~mikkok/thesis/publications.html
---------------------------------------------------------------
Using Self-Organizing Maps
and Learning Vector Quantization
for Mixture Density Hidden Markov Models
Mikko Kurimo
Helsinki University of Technology
Neural Networks Research Centre
P.O.Box 2200, FIN-02015 HUT, Finland
Email: Mikko.Kurimo at hut.fi
Abstract
--------
This work presents experiments on recognizing pattern sequences
using hidden Markov models (HMMs).
The pattern sequences in the experiments are computed from
speech signals and the recognition task is to decode the
corresponding phoneme sequences.
Training the HMMs of the phonemes from the collected
speech samples is a difficult task because of the natural
variation in speech.
Two neural computing paradigms, the Self-Organizing Map (SOM)
and Learning Vector Quantization (LVQ), are used in the
experiments to improve the recognition performance of the models.
An HMM consists of sequential states
which are trained to model the feature changes
in the signal produced during the modeled process.
The output densities applied in this work are mixtures
of Gaussian density functions.
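To make the notion of a mixture output density concrete, here is a minimal
Python sketch (illustrative only, not code from the thesis) that evaluates
a weighted sum of diagonal-covariance Gaussians for one HMM state:

```python
import numpy as np

def mixture_density(x, weights, means, variances):
    """Output density b(x) of one HMM state as a weighted sum of
    diagonal-covariance Gaussian components (illustrative sketch)."""
    x = np.asarray(x, dtype=float)
    density = 0.0
    for w, mu, var in zip(weights, means, variances):
        mu = np.asarray(mu, dtype=float)
        var = np.asarray(var, dtype=float)
        # Normalization constant of a diagonal-covariance Gaussian.
        norm = np.prod(1.0 / np.sqrt(2.0 * np.pi * var))
        density += w * norm * np.exp(-0.5 * np.sum((x - mu) ** 2 / var))
    return density
```

The mixture weights are assumed to sum to one; real systems typically
work with log densities for numerical stability.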
SOMs are applied to initialize and train the mixtures
to give a smooth and faithful representation of the feature
vector space defined by the corresponding training samples.
The SOM maps similar feature vectors to nearby units,
which is here exploited in experiments to improve
the recognition speed of the system.
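The topology-preserving property mentioned above can be seen in a toy SOM
training loop. The following Python sketch assumes a rectangular grid with
a Gaussian neighborhood function; the grid size and the learning-rate and
radius schedules are arbitrary choices for illustration, not those used in
the thesis:

```python
import numpy as np

def som_train(data, grid_shape=(6, 6), dim=2, iters=500, seed=0):
    """Minimal SOM training sketch: similar inputs come to be mapped
    to nearby grid units (illustrative toy, not thesis code)."""
    rng = np.random.default_rng(seed)
    rows, cols = grid_shape
    weights = rng.standard_normal((rows, cols, dim))
    # Grid coordinates of each unit, used for the neighborhood function.
    coords = np.stack(np.meshgrid(np.arange(rows), np.arange(cols),
                                  indexing="ij"), axis=-1).astype(float)
    for t in range(iters):
        x = data[rng.integers(len(data))]
        # Best-matching unit: the unit whose weight vector is closest to x.
        dist = np.linalg.norm(weights - x, axis=-1)
        bmu = np.unravel_index(np.argmin(dist), dist.shape)
        # Learning rate and neighborhood radius shrink over time.
        alpha = 0.5 * (1.0 - t / iters)
        sigma = max(1e-3, 2.0 * (1.0 - t / iters))
        h = np.exp(-np.sum((coords - np.array(bmu)) ** 2, axis=-1)
                   / (2.0 * sigma ** 2))
        # Pull the BMU and its grid neighbors toward the input.
        weights += alpha * h[..., None] * (x - weights)
    return weights
```

Because whole neighborhoods are updated together, nearby units end up with
similar weight vectors, which is what permits restricting the density
computations to a neighborhood of the best-matching unit.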
LVQ provides simple but efficient stochastic learning
algorithms to improve the classification accuracy in
pattern recognition problems.
Here, LVQ is applied to develop an iterative training
method for mixture density HMMs,
which increases both the modeling accuracy of the states
and the discrimination between the models of different phonemes.
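For orientation, the basic LVQ1 update rule can be sketched as follows
(an illustrative toy in Python; the function name and parameters are
assumptions, not the training method developed in the thesis):

```python
import numpy as np

def lvq1_step(codebook, labels, x, y, alpha=0.05):
    """One LVQ1 update: move the nearest codebook vector toward the
    sample x if their classes match, away from it otherwise."""
    i = int(np.argmin(np.linalg.norm(codebook - x, axis=1)))
    sign = 1.0 if labels[i] == y else -1.0
    codebook[i] += sign * alpha * (x - codebook[i])
    return i
```

Repeated over the training samples with a decreasing alpha, this sharpens
the class boundaries between codebook vectors, which is the discriminative
effect exploited here for the mixture density HMMs.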
Experiments are also made with LVQ-based corrective tuning
methods for the mixture density HMMs,
which aim at improving the models by learning from
the observed recognition errors in the training samples.
The suggested HMM training methods are tested using
the Finnish speech database collected in the
Neural Networks Research Centre
at the Helsinki University of Technology.
Statistically significant improvements compared to the best
conventional HMM training methods are obtained using
the speaker dependent
but vocabulary independent phoneme models.
The decrease in the average number of phoneme recognition
errors for the tested speakers has been around 10 percent
in the applied test material.