From zedavid at l2f.inesc-id.pt  Tue Jun  8 06:55:31 2010
From: zedavid at l2f.inesc-id.pt (José David Lopes)
Date: Tue, 08 Jun 2010 11:55:31 +0100
Subject: [Olympus developers 233]:  Problems with VAD
Message-ID: <4C0E21A3.9030601@l2f.inesc-id.pt>

I'm working on a dialogue system with a different ASR module, and I'm
experiencing some problems with the VAD. The module seems very
sensitive: it triggers very easily, and thus forwards data coming from
the microphone to the engines even when it is not speech. I was
wondering if I could convert our own GMM models for VAD to the Sphinx 3
format. Is there any tool available to convert a standard format (for
instance HTK) to the Sphinx 3 format? I could also train GMMs using our
own Olympus data. Has anyone done this before?

Best,
Jose David


From Alex.Rudnicky at cs.cmu.edu  Tue Jun  8 10:31:19 2010
From: Alex.Rudnicky at cs.cmu.edu (Alex Rudnicky)
Date: Tue, 8 Jun 2010 10:31:19 -0400
Subject: [Olympus developers 234]: Re: Problems with VAD
In-Reply-To: <4C0E21A3.9030601@l2f.inesc-id.pt>
References: <4C0E21A3.9030601@l2f.inesc-id.pt>
Message-ID: <9C0D1A9F38D23E4290347EE31C22B0AF03DE3B8E@e2k3.srv.cs.cmu.edu>

Jose,

There are actually three different end-pointers in Olympus; these are selectable in (I believe) the configuration for the Audio server. You might try experimenting with alternate ones. Also, it might be worth a try to modify the end-pointing sensitivity parameters.

Alex


From zedavid at l2f.inesc-id.pt  Wed Jun  9 12:09:07 2010
From: zedavid at l2f.inesc-id.pt (José David Lopes)
Date: Wed, 09 Jun 2010 17:09:07 +0100
Subject: [Olympus developers 235]: Re: Problems with VAD
In-Reply-To: <9C0D1A9F38D23E4290347EE31C22B0AF03DE3B8E@e2k3.srv.cs.cmu.edu>
References: <4C0E21A3.9030601@l2f.inesc-id.pt>
	<9C0D1A9F38D23E4290347EE31C22B0AF03DE3B8E@e2k3.srv.cs.cmu.edu>
Message-ID: <4C0FBCA3.2050501@l2f.inesc-id.pt>

Alex,

Thanks for your quick answer!

What do you mean by end-pointers? Is it the run mode configuration? I've 
been looking at the documentation and it is not clear to me what these 
end-pointers are.

Jose


From mrmarge at cs.cmu.edu  Wed Jun  9 17:46:47 2010
From: mrmarge at cs.cmu.edu (Matthew Marge)
Date: Wed, 9 Jun 2010 17:46:47 -0400
Subject: [Olympus developers 236]: Re: Problems with VAD
In-Reply-To: <4C0FBCA3.2050501@l2f.inesc-id.pt>
References: <4C0E21A3.9030601@l2f.inesc-id.pt>
	<9C0D1A9F38D23E4290347EE31C22B0AF03DE3B8E@e2k3.srv.cs.cmu.edu> 
	<4C0FBCA3.2050501@l2f.inesc-id.pt>
Message-ID: <AANLkTikTn_5JkQmx0w17c_hh1B-IY6uIPH54tQIFmfWl@mail.gmail.com>

Hi Jose,

This provides a good definition of endpointing as it relates to spoken
dialogue systems:

> Endpointing is the process of determining the beginning and the end of
> speech within an incoming sample stream. This takes into account whether or
> not energy is detected at speech frequencies, the duration of the detected
> sound and pitch extraction. The pitch is tracked to help recognise vowels as
> indications of speech and to filter out background noise events, such as a
> door closing. The endpointer can be tuned to avoid false barge-in, or
> false-triggers, i.e. cutting off the prompt when the user did not speak.
> This can be caused by background noise or prompt echo. The endpointer can
> also be tuned to avoid missing leading speech. Syllables, leading
> consonants, or in the worst case, everything a caller who speaks quietly
> says might be missed. Significant advances have been made in recent years in
> echo cancellation and endpointing techniques and these have resulted in
> better barge-in performance, and leading speech recognition companies are
> recommending the use of barge-in wherever possible.

http://spotlight.ccir.ed.ac.uk/public_documents/Dialogue_design_guide/barge_in_requirements.htm
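
To make the definition above concrete, here is a minimal sketch of an
energy-based endpointer in Python. It is not the Olympus end-pointer and
does not reflect its configuration; the function names, the frame length,
the threshold and the frame counts are all assumptions chosen for
illustration. It covers only the energy and duration criteria from the
quoted text (pitch tracking is left out): short loud events such as a
door closing are rejected by a minimum-duration rule, and a hangover
period keeps brief pauses inside one utterance.

```python
import math


def frame_energy_db(frame):
    """Return the mean energy of one frame of samples, in decibels."""
    mean_square = sum(s * s for s in frame) / len(frame)
    return 10.0 * math.log10(mean_square + 1e-12)  # offset avoids log(0)


def endpoint(samples, frame_len=160, threshold_db=-30.0,
             min_speech_frames=5, hangover_frames=8):
    """Return a list of (start, end) sample ranges judged to be speech.

    All parameter values are illustrative assumptions, not Olympus
    defaults. Samples are expected as floats in the range [-1.0, 1.0].
    """
    segments = []
    run_start = None  # first loud frame of the current candidate segment
    last_loud = None  # most recent loud frame in that segment

    def close_run():
        # Keep the candidate only if it lasted long enough to be speech.
        if last_loud - run_start + 1 >= min_speech_frames:
            segments.append((run_start * frame_len,
                             (last_loud + 1) * frame_len))

    for i in range(len(samples) // frame_len):
        frame = samples[i * frame_len:(i + 1) * frame_len]
        if frame_energy_db(frame) >= threshold_db:
            if run_start is None:
                run_start = i
            last_loud = i
        elif run_start is not None and i - last_loud > hangover_frames:
            # The silence outlasted the hangover, so the segment has ended.
            close_run()
            run_start = None

    if run_start is not None:
        close_run()  # the signal ended while a segment was still open
    return segments


# Synthetic 16 kHz test signal: silence, a 20 ms click, silence,
# a 200 ms tone standing in for speech, silence.
silence = [0.0] * 1600
click = [0.5] * 320
tone = [0.5 * math.sin(2 * math.pi * 200 * n / 16000) for n in range(3200)]
print(endpoint(silence + click + silence + tone + silence))
# prints [(3520, 6720)]
```

In this sketch, raising threshold_db or min_speech_frames makes the
detector less prone to false triggers, at the cost of possibly clipping
quiet or very short leading speech, which is the trade-off the quoted
text describes.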

Cheers,
Matt


From zedavid at l2f.inesc-id.pt  Fri Jun 11 06:26:58 2010
From: zedavid at l2f.inesc-id.pt (José David Lopes)
Date: Fri, 11 Jun 2010 11:26:58 +0100
Subject: [Olympus developers 237]: Re: Problems with VAD
In-Reply-To: <AANLkTikTn_5JkQmx0w17c_hh1B-IY6uIPH54tQIFmfWl@mail.gmail.com>
References: <4C0E21A3.9030601@l2f.inesc-id.pt>
	<9C0D1A9F38D23E4290347EE31C22B0AF03DE3B8E@e2k3.srv.cs.cmu.edu>
	<4C0FBCA3.2050501@l2f.inesc-id.pt>
	<AANLkTikTn_5JkQmx0w17c_hh1B-IY6uIPH54tQIFmfWl@mail.gmail.com>
Message-ID: <4C120F72.6000403@l2f.inesc-id.pt>

Hi Matt,

Thanks for your answer.
However, what I did not understand was where to modify the configuration
file in order to change the end-pointing mode.

Best,
Jose


From lang at cs.rochester.edu  Wed Jun 30 12:07:28 2010
From: lang at cs.rochester.edu (lang@cs.rochester.edu)
Date: Wed, 30 Jun 2010 12:07:28 -0400 (EDT)
Subject: [Olympus developers 238]:  a few questions
Message-ID: <2374.128.151.24.207.1277914048.squirrel@www.cs.rochester.edu>

Hello!

My name is Katherine (Kate) Lang and I am a graduate student at the
University of Rochester.  I am currently trying to adapt the Olympus
system so that I can study the RavenClaw part.  However, I am running
into a couple of problems.

I have been working with tutorial 1. On my computer, I answer both the
origin and destination prompts, but then there is an internal error.  Is
this common?  Is there a fix for it?

I also attempted to write my own dialogue system (which happens to be
my goal for the summer) based on the tutorial 1 code, but I cannot get it
to work.  I know this is asking a lot, but the code is very short: could
you please take a look at it and/or tell me about common mistakes that I
should look out for?

The problem is that the Welcome inform agent does not print anything (I
am using the tty option) when I have the request agent uncommented.
If the request agent is commented out, then both of my inform agents work
perfectly. I also tried something similar with a copy of RoomLine, but got
the same results, so I do not believe that this is related to the
internal error that I mentioned before.

Thank you very much for your time,
Kate Lang


-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: DialogTask.cpp
Url: http://mailman.srv.cs.cmu.edu/pipermail/olympus-developers/attachments/20100630/a921161d/DialogTask.ksh

From tkharris at gmail.com  Wed Jun 30 17:49:38 2010
From: tkharris at gmail.com (Thomas Harris)
Date: Wed, 30 Jun 2010 17:49:38 -0400
Subject: [Olympus developers 239]: Re: a few questions
In-Reply-To: <2374.128.151.24.207.1277914048.squirrel@www.cs.rochester.edu>
References: <2374.128.151.24.207.1277914048.squirrel@www.cs.rochester.edu>
Message-ID: <AANLkTinLo1c_jGdxFMM6zSyo-g3zSBxxEEYF8uGbaJ9z@mail.gmail.com>

Hello Kate,

I'm not familiar with the "internal error". Can you give more details
about it?

I noticed in the source file that you sent that a variable appears to be
declared as num_total_ppl and then used as total_num_ppl. This could
be the root of the error.

Thanks,
-Thomas
