NEW BOOK: Applications of NNs to speech understanding
Ralf Kompe
kompe at fb.sony.de
Mon Oct 20 13:29:29 EDT 1997
(Sorry if you receive this message more than once.)
To whom it may concern:
The following book, which describes the application of neural
networks to real-world data, is now available:
Ralf Kompe
Prosody in Speech Understanding Systems
Lecture Notes in Artificial Intelligence, Vol. 1307
Subseries of Lecture Notes in Computer Science
Springer
Berlin, New York
1997
(370 pages)
ISBN 3-540-63580-7
------------------
ABSTRACT
Prosody covers acoustic phenomena of speech which are not
specific to phonemes. These are mainly intonation, indicators for
phrase boundaries, and accentuation. This information can support
the intelligibility of speech and sometimes even disambiguate
the meaning.
The aim of this book is to describe algorithms developed by the
author for the use of prosodic information on many levels of
speech understanding such as syntax, semantics, dialog, and
translation. An implementation of these algorithms has been
successfully integrated into the speech-to-speech translation
system Verbmobil and into the dialog system Evar. This is the
first time that prosody has been used in a fully operational
speech understanding and translation system. The Verbmobil
prototype system has been publicly demonstrated at several
conferences and industrial fairs.
The emphasis of the book is on improving the parsing of
spontaneous speech with the help of prosodic clause boundary
information. Prosody reduces the parse time of word hypotheses
graphs by 92% and the number of parse trees by 96%. This is
achieved by integrating several knowledge sources, such as
probabilities for prosodic events computed by neural networks and
n-grams, in an A*-search for the optimal parse. Without prosody,
the automatic interpretation of spontaneous speech would be
infeasible.
The book gives a comprehensive review of the mathematical and
computational background of the algorithms and statistical models
useful for the integration of prosody in speech understanding.
It also shows unconventional applications of hidden Markov
models, stochastic language models, and neural networks. The
latter, for example, are used not only for several classification
tasks but also for the inverse filtering of speech signals. The
book also explains in detail the acoustic-prosodic phenomena of
speech and their functional role in communication. In contrast to
many other reports, it gives many examples taken from real
human-human dialogs; many of these are supported by speech
signals accessible over the WWW. The use of prosodic information
relies on the robust extraction of relevant features from
digitized speech signals, on adequate labeling of large speech
databases for training classifiers, and on the detection of
prosodic events; the methods used in Verbmobil and Evar are also
summarized in this book. Furthermore, an overview of these
state-of-the-art speech understanding systems is given.
------------------
The book has been awarded the "Dissertation Prize" of the
German Institutes for Artificial Intelligence.
Sincerely yours,
Ralf Kompe
__________________________________________________________________________
Ralf Kompe
Sony International (Europe) GmbH . . o o O O
European Research and Development Stuttgart (ERDS) . . o o O
Advanced Developments . . o o O
Stuttgarter Str. 106 . . o o O O
D-70736 Fellbach . . o o O O O
Germany . . o o O O O
. . o o O O
Phone: +49-711-5858-366
Fax: +49-711-58-31-85
E-mail: kompe at fb.sony.de
__________________________________________________________________________