<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">

<html>

<head>

  <meta content="text/html; charset=ISO-8859-1"

 http-equiv="Content-Type">

</head>

<body bgcolor="#ffffff" text="#000000">

Hi Matt,<br>

<br>

Thanks for your answer.<br>

However, what I still don't understand is where to modify the configuration file

in order to change the end-pointing mode.<br>

<br>

Best,<br>

Jose<br>

<br>

On 09-06-2010 22:46, Matthew Marge wrote:

<blockquote

 cite="mid:AANLkTikTn_5JkQmx0w17c_hh1B-IY6uIPH54tQIFmfWl@mail.gmail.com"

 type="cite">Hi Jose,<br>

  <br>

This provides a good definition of endpointing as it relates to spoken

dialogue systems:<br>

  <br>

  <blockquote

 style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;"

 class="gmail_quote">

    <table width="617" border="0" cellpadding="0" cellspacing="0"

 height="173">

      <tbody>


        <tr>

          <td colspan="11" valign="top" width="615" height="215"><font

 face="Verdana, Arial, Helvetica, sans-serif" size="2">Endpointing is

the process of determining the beginning and the end of speech within

an incoming sample stream. This takes into account whether or not

energy is detected at speech frequencies, the duration of the detected

sound and pitch extraction. The pitch is tracked to help recognise

vowels as indications of speech and to filter out background noise

events, such as a door closing. The endpointer can be tuned to avoid

false barge-in, or false-triggers, i.e. cutting off the prompt when the

user did not speak. This can be caused by background noise or prompt

echo. The endpointer can also be tuned to avoid missing leading speech.

Syllables, leading consonants, or in the worst case, everything a

caller who speaks quietly says might be missed. Significant advances

have been made in recent years in echo cancellation and endpointing

techniques and these have resulted in better barge-in performance, and

leading speech recognition companies are recommending the use of

barge-in wherever possible. </font></td>

        </tr>

      </tbody>

    </table>

  </blockquote>

  <a moz-do-not-send="true"

 href="http://spotlight.ccir.ed.ac.uk/public_documents/Dialogue_design_guide/barge_in_requirements.htm">http://spotlight.ccir.ed.ac.uk/public_documents/Dialogue_design_guide/barge_in_requirements.htm</a><br>

  <br>

Cheers,<br>

Matt<br>

  <br>

  <div class="gmail_quote">2010/6/9 Jos&eacute; David Lopes <span dir="ltr">&lt;<a

 moz-do-not-send="true" href="mailto:zedavid@l2f.inesc-id.pt">zedavid@l2f.inesc-id.pt</a>&gt;</span><br>

  <blockquote class="gmail_quote"

 style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">Alex,<br>

    <br>

Thanks for your quick answer!<br>

    <br>

What do you mean by end-pointers? Is it the run mode configuration?

I've been looking at the documentation and it is not clear to me what

these end-pointers are.<br>

    <br>

Jose<br>

    <br>

On 08-06-2010 15:31, Alex Rudnicky wrote:

    <div>

    <div class="h5"><br>

    <blockquote class="gmail_quote"

 style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">

Jose,<br>

      <br>

There are actually three different end-pointers in Olympus; these are

selectable in (I believe) the configuration for the Audio server. You

might try experimenting with alternate ones. It might also be worth

adjusting the end-pointing sensitivity parameters.<br>

      <br>

Alex<br>

      <br>

      <br>

-----Original Message-----<br>

From: <a moz-do-not-send="true"

 href="mailto:olympus-developers-bounces@mailman.srv.cs.cmu.edu"

 target="_blank">olympus-developers-bounces@mailman.srv.cs.cmu.edu</a>

[mailto:<a moz-do-not-send="true"

 href="mailto:olympus-developers-bounces@mailman.srv.cs.cmu.edu"

 target="_blank">olympus-developers-bounces@mailman.srv.cs.cmu.edu</a>]

On Behalf Of Jos&eacute; David Lopes<br>

Sent: Tuesday, June 08, 2010 6:56 AM<br>

To: <a moz-do-not-send="true"

 href="mailto:olympus-developers@cs.cmu.edu" target="_blank">olympus-developers@cs.cmu.edu</a><br>

Subject: [Olympus developers 233]: Problems with VAD<br>

      <br>

I'm working on a dialogue system with a different ASR module. I'm<br>

experiencing some problems with the VAD. This module seems very<br>

sensitive, triggering very easily, and thus forwarding the data coming<br>

from the microphone to the engines even if it is not speech. I was<br>

wondering if I could convert our own GMM models for VAD to the Sphinx 3<br>

format. Is there any tool available to convert a standard format (for<br>

instance HTK) to Sphinx 3 format? I could also train GMMs using our own<br>

Olympus data. Has anyone done this before?<br>

      <br>

Best,<br>

Jose David<br>

&nbsp; <br>

    </blockquote>

    <br>

    </div>

    </div>

  </blockquote>

  </div>

  <br>

</blockquote>

<br>

</body>

</html>