From zedavid at l2f.inesc-id.pt Tue Jun 8 06:55:31 2010
From: zedavid at l2f.inesc-id.pt (José David Lopes)
Date: Tue, 08 Jun 2010 11:55:31 +0100
Subject: [Olympus developers 233]: Problems with VAD
Message-ID: <4C0E21A3.9030601@l2f.inesc-id.pt>

I'm working on a dialogue system with a different ASR module. I'm
experiencing some problems with the VAD. This module seems very
sensitive: it triggers very easily and so forwards the data coming
from the microphone to the engines even when it is not speech. I was
wondering if I could convert our own GMM models for VAD to the Sphinx 3
format. Is there any tool available to convert a standard format (for
instance HTK) to the Sphinx 3 format? I could also train GMMs using our
own Olympus data. Has anyone done this before?

Best,
Jose David

From Alex.Rudnicky at cs.cmu.edu Tue Jun 8 10:31:19 2010
From: Alex.Rudnicky at cs.cmu.edu (Alex Rudnicky)
Date: Tue, 8 Jun 2010 10:31:19 -0400
Subject: [Olympus developers 234]: Re: Problems with VAD
In-Reply-To: <4C0E21A3.9030601@l2f.inesc-id.pt>
References: <4C0E21A3.9030601@l2f.inesc-id.pt>
Message-ID: <9C0D1A9F38D23E4290347EE31C22B0AF03DE3B8E@e2k3.srv.cs.cmu.edu>

Jose,

There are actually three different end-pointers in Olympus; these are
selectable in (I believe) the configuration for the Audio server. You
might try experimenting with alternate ones. Also, it might be worth
trying to modify the end-pointing sensitivity parameters.

Alex

From zedavid at l2f.inesc-id.pt Wed Jun 9 12:09:07 2010
From: zedavid at l2f.inesc-id.pt (José David Lopes)
Date: Wed, 09 Jun 2010 17:09:07 +0100
Subject: [Olympus developers 235]: Re: Problems with VAD
In-Reply-To: <9C0D1A9F38D23E4290347EE31C22B0AF03DE3B8E@e2k3.srv.cs.cmu.edu>
References: <4C0E21A3.9030601@l2f.inesc-id.pt> <9C0D1A9F38D23E4290347EE31C22B0AF03DE3B8E@e2k3.srv.cs.cmu.edu>
Message-ID: <4C0FBCA3.2050501@l2f.inesc-id.pt>

Alex,

Thanks for your quick answer!

What do you mean by end-pointers? Is it the run mode configuration? I've
been looking at the documentation and it is not clear to me what these
end-pointers are.

Jose

From mrmarge at cs.cmu.edu Wed Jun 9 17:46:47 2010
From: mrmarge at cs.cmu.edu (Matthew Marge)
Date: Wed, 9 Jun 2010 17:46:47 -0400
Subject: [Olympus developers 236]: Re: Problems with VAD
In-Reply-To: <4C0FBCA3.2050501@l2f.inesc-id.pt>
References: <4C0E21A3.9030601@l2f.inesc-id.pt> <9C0D1A9F38D23E4290347EE31C22B0AF03DE3B8E@e2k3.srv.cs.cmu.edu> <4C0FBCA3.2050501@l2f.inesc-id.pt>
Message-ID:

Hi Jose,

This provides a good definition of endpointing as it relates to spoken
dialogue systems:

    Endpointing is the process of determining the beginning and the
    end of speech within an incoming sample stream. This takes into
    account whether or not energy is detected at speech frequencies,
    the duration of the detected sound and pitch extraction. The pitch
    is tracked to help recognise vowels as indications of speech and
    to filter out background noise events, such as a door closing. The
    endpointer can be tuned to avoid false barge-in, or
    false-triggers, i.e. cutting off the prompt when the user did not
    speak. This can be caused by background noise or prompt echo. The
    endpointer can also be tuned to avoid missing leading speech.
    Syllables, leading consonants, or in the worst case, everything a
    caller who speaks quietly says might be missed. Significant
    advances have been made in recent years in echo cancellation and
    endpointing techniques and these have resulted in better barge-in
    performance, and leading speech recognition companies are
    recommending the use of barge-in wherever possible.
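
To make the energy and duration criteria in the definition above concrete, here is a minimal sketch of a threshold-based end-pointer. It is not the Olympus end-pointer, and it leaves out pitch tracking and any GMM-based classification. The function names `frame_energies` and `find_endpoints` and the parameters `threshold`, `min_speech_frames`, and `hangover_frames` are assumptions made up for this illustration.

```python
from typing import List, Sequence, Tuple


def frame_energies(samples: Sequence[float], frame_len: int) -> List[float]:
    """Mean squared amplitude of each complete, non-overlapping frame."""
    n_frames = len(samples) // frame_len
    return [
        sum(x * x for x in samples[i * frame_len:(i + 1) * frame_len]) / frame_len
        for i in range(n_frames)
    ]


def find_endpoints(
    energies: Sequence[float],
    threshold: float,
    min_speech_frames: int,
    hangover_frames: int,
) -> List[Tuple[int, int]]:
    """Return (start, end) frame indices of detected segments, end exclusive.

    A segment opens at the first frame whose energy exceeds `threshold`
    and closes once `hangover_frames` consecutive quiet frames follow.
    Segments with fewer than `min_speech_frames` loud frames are dropped,
    which suppresses short noise events such as a door closing.
    """
    segments = []
    start = None      # index of the first loud frame of the open segment
    loud = 0          # number of loud frames seen in the open segment
    quiet_run = 0     # consecutive quiet frames since the last loud frame
    for i, energy in enumerate(energies):
        if energy > threshold:
            if start is None:
                start, loud = i, 0
            loud += 1
            quiet_run = 0
        elif start is not None:
            quiet_run += 1
            if quiet_run >= hangover_frames:
                if loud >= min_speech_frames:
                    segments.append((start, i - quiet_run + 1))
                start, quiet_run = None, 0
    # Close a segment that is still open when the stream ends.
    if start is not None and loud >= min_speech_frames:
        segments.append((start, len(energies) - quiet_run))
    return segments


# Hypothetical frame energies: a one-frame click, then a longer utterance.
energies = [0, 0, 5, 0, 0, 0, 5, 5, 5, 5, 0, 5, 5, 0, 0, 0]
print(find_endpoints(energies, threshold=1, min_speech_frames=3, hangover_frames=2))
# prints [(6, 13)]
```

In this sketch, sensitivity is controlled by `threshold` and `min_speech_frames`. With `min_speech_frames=1` the click at frame 2 is also reported as a segment, which is the kind of false trigger described at the start of this thread.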
http://spotlight.ccir.ed.ac.uk/public_documents/Dialogue_design_guide/barge_in_requirements.htm

Cheers,
Matt

-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mailman.srv.cs.cmu.edu/pipermail/olympus-developers/attachments/20100609/a8d18b15/attachment-0001.html

From zedavid at l2f.inesc-id.pt Fri Jun 11 06:26:58 2010
From: zedavid at l2f.inesc-id.pt (José David Lopes)
Date: Fri, 11 Jun 2010 11:26:58 +0100
Subject: [Olympus developers 237]: Re: Problems with VAD
In-Reply-To:
References: <4C0E21A3.9030601@l2f.inesc-id.pt> <9C0D1A9F38D23E4290347EE31C22B0AF03DE3B8E@e2k3.srv.cs.cmu.edu> <4C0FBCA3.2050501@l2f.inesc-id.pt>
Message-ID: <4C120F72.6000403@l2f.inesc-id.pt>

Hi Matt,

Thanks for your answer. However, what I did not understand was where to
modify the configuration file in order to change the end-pointing mode.

Best,
Jose

-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mailman.srv.cs.cmu.edu/pipermail/olympus-developers/attachments/20100611/91cba56e/attachment.html

From lang at cs.rochester.edu Wed Jun 30 12:07:28 2010
From: lang at cs.rochester.edu (lang@cs.rochester.edu)
Date: Wed, 30 Jun 2010 12:07:28 -0400 (EDT)
Subject: [Olympus developers 238]: a few questions
Message-ID: <2374.128.151.24.207.1277914048.squirrel@www.cs.rochester.edu>

Hello!

My name is Katherine (Kate) Lang and I am a graduate student at the
University of Rochester. I am currently trying to adapt the Olympus
system so that I can study the RavenClaw part. However, I am running into
a couple of problems.

I have been working with tutorial 1. On my computer I answer both the
origin and destination places, but then there is an internal error. Is
this common? Is there a fix for it?

Lastly, I attempted to write my own dialogue system (which happens to be
my goal for the summer) based on the tutorial 1 code, but I cannot get it
to work. I know this is asking a lot, but the code is very short: could
you please take a look at it and/or tell me about common mistakes that I
should look out for?

The problem is that the Welcome inform agent does not print anything (I
am using the tty option) when the request agent is uncommented. If the
request agent is commented out, then both of my inform agents work
perfectly. I also tried something similar with a copy of RoomLine, but got
the same results, so I do not believe that this is the internal error that
I mentioned before.

Thank you very much for your time,
Kate Lang

-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: DialogTask.cpp
Url: http://mailman.srv.cs.cmu.edu/pipermail/olympus-developers/attachments/20100630/a921161d/DialogTask.ksh

From tkharris at gmail.com Wed Jun 30 17:49:38 2010
From: tkharris at gmail.com (Thomas Harris)
Date: Wed, 30 Jun 2010 17:49:38 -0400
Subject: [Olympus developers 239]: Re: a few questions
In-Reply-To: <2374.128.151.24.207.1277914048.squirrel@www.cs.rochester.edu>
References: <2374.128.151.24.207.1277914048.squirrel@www.cs.rochester.edu>
Message-ID:

Hello Kate,

I'm not familiar with the "internal error". Can you give more details
about it? I noticed in the source file that you sent that a variable
appears to be declared as num_total_ppl and then used as total_num_ppl.
Maybe this is the root of the error.

Thanks,
-Thomas

-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mailman.srv.cs.cmu.edu/pipermail/olympus-developers/attachments/20100630/6ad34371/attachment.html
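
The kind of mismatch described above (a name declared as num_total_ppl but used as total_num_ppl) can be caught mechanically. The sketch below is only an illustration under stated assumptions: it is not part of Olympus or RavenClaw, the helper `find_undeclared` is hypothetical, and it assumes you have already collected the declared and used names from the source file into two lists by hand. The name `welcome_done` is invented for the example.

```python
from typing import Dict, Iterable, List


def find_undeclared(declared: Iterable[str], used: Iterable[str]) -> Dict[str, List[str]]:
    """Map each used-but-undeclared name to declared names that resemble it.

    Two names "resemble" each other here when they consist of the same
    underscore-separated words in a different order.
    """
    declared_set = set(declared)
    report = {}
    for name in used:
        if name in declared_set:
            continue  # declared and used consistently, nothing to report
        words = sorted(name.split("_"))
        report[name] = sorted(
            d for d in declared_set if sorted(d.split("_")) == words
        )
    return report


# Names from the thread, plus one invented name (welcome_done).
declared = ["num_total_ppl", "welcome_done"]
used = ["total_num_ppl", "welcome_done"]
print(find_undeclared(declared, used))
# prints {'total_num_ppl': ['num_total_ppl']}
```

An empty result means every used name was declared. A name mapped to an empty list was never declared under any reordering of its words, so it is more likely missing than mistyped.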