[Olympus announcements 1]: ACL 2008 Tutorial: Building Practical Spoken Dialog Systems
Antoine Raux
antoine at cs.cmu.edu
Wed Mar 12 22:13:33 EDT 2008
Title: Building Practical Spoken Dialog Systems
Abstract:
This tutorial will give a practical description of the free software
Carnegie Mellon Olympus 2 Spoken Dialog Architecture. Building real
working dialog systems that are robust enough for the general public to
use is difficult. Most frequently, the functionality of the
conversations is severely limited - down to simple question-answer
pairs. While off-the-shelf toolkits help the development of such simple
systems, they do not support more advanced, natural dialogs, nor do they
offer the transparency and flexibility required by computational
linguistics researchers. Olympus 2, by contrast, offers a complete dialog
system with automatic speech recognition (Sphinx) and synthesis (SAPI,
Festival) and has been used, along with previous versions of Olympus,
for teaching and research at Carnegie Mellon and elsewhere for some 5
years. Overall, a dozen dialog systems have been built using various
versions of Olympus, handling tasks ranging from providing bus schedule
information to guiding users through maintenance procedures for complex
machinery to managing personal calendars. In addition to simplifying
the development of dialog systems, Olympus provides a transparent
platform for teaching and conducting research on all aspects of dialog
systems, including speech recognition and synthesis, natural language
understanding and generation, and dialog and interaction management.
The tutorial will give a brief introduction to spoken dialog systems
before going into detail about how to create your own dialog system
within Olympus 2, using the Let's Go bus information system as an
example. Further, we will provide guidelines on how to use an actual
deployed spoken dialog system such as Let's Go to validate research
results in the real world. As a possible testbed for such research, we
will describe Let's Go Lab, which provides access to both the Let's Go
system and its genuine user population for research experiments.
Attendees will receive a CD with the latest version of the Olympus 2
architecture, along with several tutorials and example systems.
Outline:
* Introduction
* Overview of current spoken dialog system architectures
* Description of the Olympus2 dialog architecture
* How to build an Olympus2 dialog system (text I/O)
-break-
* Expanding an Olympus2 system to use speech - a true spoken dialog system
* Discussion of installation requirements and practical system-building
issues, including:
- telephony
- system backend
- ASR (re)training / (re)tuning
- improving synthesis output
- dialog strategies & parameters
- monitoring / logging
* Using Olympus2 for research and applications
- Let's Go Lab: a test platform for dialog systems with real users
* Final summary
Presenter Bios:
Antoine Raux
Language Technologies Institute
Carnegie Mellon University
http://www.cs.cmu.edu/~antoine/
email: antoine at cs.cmu.edu
Antoine Raux is a PhD student at the Language Technologies Institute at
Carnegie Mellon University. He has conducted research and
published more than 15 reviewed papers on several aspects of dialog
systems, including speech recognition, speech synthesis, dialog and
interaction management, and system building. His teaching experience
includes two teaching assistantships in natural language-related
graduate courses, as well as the ongoing design of online tutorials for
the Olympus architecture.
Brian Langner
Language Technologies Institute
Carnegie Mellon University
http://www.cs.cmu.edu/~blangner/
email: blangner at cs.cmu.edu
Brian Langner is a PhD student at the Language Technologies Institute at
Carnegie Mellon University. He has conducted research and
published more than 12 reviewed papers on speech synthesis, natural
language generation, and spoken dialog systems. He has six semesters of
experience as a teaching assistant for graduate and undergraduate
computing- or natural-language-related courses, including some course
design, in addition to continuing work on the Olympus architecture
tutorials.
Dr. Alan W Black
Language Technologies Institute
Carnegie Mellon University
http://www.cs.cmu.edu/~awb/
email: awb at cs.cmu.edu
Alan W Black is an Associate Research Professor in the Language
Technologies Institute at Carnegie Mellon University. He previously
worked at the University of Edinburgh, and before that at ATR in Japan.
He received his PhD in Computational Linguistics from Edinburgh
University in 1993. He is one of the principal authors of the Festival
Speech Synthesis System. In addition to speech synthesis, he also works
on two-way speech-to-speech translation systems and telephone-based
spoken dialog systems. He has also served on the IEEE Speech Technical
Committee (2003-2006), is on the editorial board of Speech
Communication, and is a board member of ISCA. He teaches a number of
graduate and undergraduate courses and has taught a number of short-term
tutorials on speech synthesis, speech technology, and rapid support
for new languages.
Dr. Maxine Eskenazi
Language Technologies Institute
Carnegie Mellon University
http://www.cs.cmu.edu/~max/
email: max at cs.cmu.edu
Maxine Eskenazi is on the faculty of the Language Technologies Institute
at Carnegie Mellon University. She has a BA from Carnegie Mellon
University in French and Education and a These de Troisieme Cycle from
the Universite de Paris 11 in Computer Science. She has extensive
publications on the use of automatic speech processing for spoken dialog
systems and on the use of language technologies for computer-assisted
language learning. She is the Principal Investigator on the NSF Let's Go
project.