Hi Antoine,<br><br>Yes, that was/is a problem, and I tried something like this. But the more fundamental problem is that the ps_seg_t* segment iterator you get from pocketsphinx doesn&#39;t correctly implement ps_seg_prob when the segment iterator comes from the hypothesis iterator, even though it works fine if you get the segment iterator from the best_hyp function (or whatever that&#39;s called). I&#39;ve sent David the code segment that illustrates this bug. I don&#39;t know of any workaround. For the most part we&#39;ve gotten multiple hypotheses by running multiple recognizers, I guess.<br>

<br>Thanks,<br>-Thomas<br><br><div class="gmail_quote">On Tue, Apr 13, 2010 at 11:58 AM, Antoine Raux <span dir="ltr">&lt;<a href="mailto:antoine.raux@gmail.com">antoine.raux@gmail.com</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin: 0pt 0pt 0pt 0.8ex; border-left: 1px solid rgb(204, 204, 204); padding-left: 1ex;">

Hi all,<br>

<br>

What exactly is the confidence computation problem? Is it that we cannot compute the LM backoff type-based word confidence (see hyp_conf_slm in PocketsphinxEngine&#39;s main.cpp)?<br>

If that is the problem, one way to fix this might be to modify hyp_conf_slm to accept a ps_seg_t as an argument (instead of always getting seg_iter from ps_seg_iter):<br>

<br>

float* hyp_conf_slm (bool useFixedScore = false, ps_seg_t *seg_iter = NULL)<br>

{<br>

   const int MAX_TYPE_SIZE = 4096;<br>

   int32 score, type[MAX_TYPE_SIZE];<br>

   int32 k = 0;<br>

<br>

   // (antoine) no seg_iter was given, get the top segment iterator from ps<br>

   if (seg_iter == NULL)<br>

       seg_iter = ps_seg_iter(psd, &amp;score);<br>

<br>

   type[k++] = 3;                          // use the trigram dummy for first word<br>

<br>

   if (seg_iter != NULL) {<br>

       while (seg_iter = ps_seg_next(seg_iter)) {<br>

           if (k >= MAX_TYPE_SIZE - 2) return NULL; // leave room for the two trailing dummies<br>

<br>

           int32 lscr, ascr;<br>

           ps_seg_prob(seg_iter, &amp;ascr, &amp;lscr, &amp;type[k++]);<br>

       }<br>

   }<br>

   type[k++] = 3; // (tk) dummy trigram after utterance<br>

   type[k++] = 3; // (tk) sometimes there&#39;s no end token, in which case<br>

                      // the last one was for the end token and this one is the dummy<br>

<br>

   // (antoine) allocate the array of confidence scores<br>

   float* conf = (float*)malloc(k*sizeof(float));<br>

<br>

   for (int32 i = 1; i &lt; k-2; i++) {<br>

       if(!useFixedScore) {<br>

           int32 t = type[i-1] + type[i] + ((type[i+1] + type[i+2])&lt;&lt;1); // (tk) wtf?<br>

           conf[i-1] = (float)((double)(t-6)/12.0);<br>

       } else {<br>

           conf[i-1] = 0.7f;<br>

       }<br>

   }<br>

<br>

   return conf;<br>

}<br>
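<br>
To make the weighting in the loop above concrete, here is a minimal self-contained sketch of the same arithmetic. The helper name backoff_confidence is hypothetical (it is not part of PocketsphinxEngine), and it assumes the backoff types returned through ps_seg_prob are n-gram orders between 1 and 3, which appears to be the case for a trigram LM but has not been checked against the engine here.<br>

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper, for illustration only.
// `padded` is assumed to be laid out the way hyp_conf_slm fills `type`:
// one leading dummy 3, the per-word backoff types, then two trailing dummy 3s.
// Each word's score combines the previous and current types (weight 1 each)
// with the next two types (weight 2 each), then rescales the sum to [0, 1].
std::vector<float> backoff_confidence(const std::vector<int>& padded) {
    std::vector<float> conf;
    if (padded.size() < 4) return conf;  // need at least one word plus the three pads
    for (std::size_t i = 1; i + 2 < padded.size(); ++i) {
        int t = padded[i - 1] + padded[i] + 2 * (padded[i + 1] + padded[i + 2]);
        // With types in 1..3 the weighted sum t lies in 6..18,
        // so (t - 6) / 12 lies in 0..1.
        conf.push_back(static_cast<float>((t - 6) / 12.0));
    }
    return conf;
}
```

Under that assumption, a word surrounded entirely by trigram hits scores 1.0 and one surrounded entirely by unigram backoffs scores 0.0, with the two following words counting double.<br>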

<br>

Then further down, you can modify the third version of fillPartialHypStruct by just adding the argument when it calls hyp_conf_slm:<br>

<br>

// [2008-02-19] (antoine): this function takes a partial hypothesis and a reference to a<br>

//                        THypStruct and fills in the hyp struct<br>

void fillPartialHypStruct(ps_seg_t* curr_seg_iter, THypStruct* phs, int fromNBest) {<br>

<br>

   Log(STD_STREAM, &quot;Filling partial hyp struct\n&quot;);<br>

<br>

   size_t h_len, ch_len;<br>

   int n_words = 0, n_validwords, has_oov;<br>

   char tmp[16384];<br>

   float *lm_conf = NULL;<br>

<br>

   // Fill in confidence values for words in result and build filtered hypothesis<br>

   if (slm)<br>

       lm_conf = hyp_conf_slm(false, curr_seg_iter);  // useFixedScore is the first parameter<br>

   else<br>

       lm_conf = hyp_conf_slm(true, curr_seg_iter);<br>

<br>

(...)<br>
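<br>
A note on argument order when calling hyp_conf_slm: since useFixedScore is declared before seg_iter, a call that passes only a pointer would bind it to the bool parameter, because C++ converts pointers to bool implicitly. The sketch below illustrates this with stand-in names (Seg and which_args are hypothetical and only mirror the parameter order; they are not PocketSphinx types).<br>

```cpp
#include <cstddef>

struct Seg {};  // stand-in for ps_seg_t, for illustration only

// Same parameter order as hyp_conf_slm: flag first, iterator second.
// Instead of computing confidences, returns a code describing what the
// callee received: +1 if useFixedScore is true, +2 if seg_iter is non-NULL.
int which_args(bool useFixedScore = false, Seg* seg_iter = NULL) {
    return (useFixedScore ? 1 : 0) + (seg_iter != NULL ? 2 : 0);
}
```

Calling which_args(p) with a non-NULL Seg* p yields 1 (the pointer was taken as the flag and no iterator arrived), whereas which_args(false, p) yields 2. So the iterator has to be passed as the second argument.<br>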

<br>

I don&#39;t really have any setup to test this but if someone who has could give it a shot and post the result to the mailing list...<br>

Now it might be that I misunderstood what the problem was altogether (in which case I apologize for the spam)...<br>

<br>

On a side note, the big commented out block in getHypStructs (as sent by Blaise) is from my Cactus code (which I had sent to Blaise as an example), so it&#39;s irrelevant to Olympus and should be deleted (for clarity&#39;s sake).<br>


<br>

antoine<br>

<br>

Blaise Thomson wrote:<br>

<blockquote class="gmail_quote" style="margin: 0pt 0pt 0pt 0.8ex; border-left: 1px solid rgb(204, 204, 204); padding-left: 1ex;">

Hi Thomas / Alan,<br>

<br>

I&#39;ve now got some preliminary N-best list code to work with PocketSphinx. With the help of some example code from Antoine I&#39;ve modified the pocketsphinx engine to produce a 1-best list for partial recognition results but an N-best list upon completion. I&#39;ve also modified the AudioServer to be able to receive multiple N-best lists from each of the recognizers (the number for each decoder is specified by an optional &quot;:N&quot; after the decoder definition in the config file). In case this may be something you want to include in future versions of Olympus I&#39;ve attached my modified files.<br>


<br>

Note, however, that the code still doesn&#39;t produce any confidence score information for the N-best list. For this reason we will still probably be unable to use Olympus for our version of the LetsGo! system. If the PocketSphinx bugs you mentioned are fixed any time soon or if anyone finds out how to get confidence scores with the N-best list would you please let us know?<br>


<br>

Many thanks,<br>

Blaise<br>

<br>

<br>

<br>

Thomas Harris wrote:<br>

<blockquote class="gmail_quote" style="margin: 0pt 0pt 0pt 0.8ex; border-left: 1px solid rgb(204, 204, 204); padding-left: 1ex;"><div class="im">

Hi Blaise,<br>

<br>

Thanks for looking into this. I hope we can include your bugfixes. I&#39;ve been looking into this as well, and there&#39;s a more fundamental issue. It seems like you can&#39;t get word confidence metrics from the PocketSphinx segment iterators when you&#39;ve gotten the segment iterators from the n_best hypothesis iterator. It smells like a PocketSphinx bug, but I haven&#39;t seen any reference implementation of PocketSphinx that makes use of those confidence metrics in an n_best setting, so I&#39;m not sure that it isn&#39;t a problem with how the PocketSphinx API is used. Until that issue is resolved n_best lists won&#39;t work in Olympus; too many downstream processes depend on those confidence metrics.<br>


<br>

Thanks,<br>

-Thomas<br>

<br></div><div><div></div><div class="h5">

On Wed, Mar 24, 2010 at 4:39 AM, Blaise Thomson &lt;<a href="mailto:brmt2@cam.ac.uk" target="_blank">brmt2@cam.ac.uk</a>&gt; wrote:<br>

<br>

    Dear Olympus developers,<br>

<br>

    I am trying to get the Olympus LetsGo! system to provide an N-best<br>

    list of speech recognition hypotheses. I found the -n_best switch<br>

    which can be passed to the PocketSphinxEngine which is supposed to<br>

    enable this but when I set the switch to anything other than 0 the<br>

    system crashes immediately on any audio input. I remember you said<br>

    that the system had been built to provide N-best lists so I was<br>

    wondering if you could give any advice on why it is not working.<br>

    Do you have a working N-best list system that you could send me to<br>

    see how things are configured?<br>

<br>

    In trying to solve the problem I took a look at the<br>

    PocketSphinxEngine source code and have noticed some possible<br>

    memory access bugs which may be contributing to this. These were<br>

    related to the way the iHypsGenerated variable was used. I&#39;ve<br>

    fixed these and can send them if you would like (I tried attaching<br>

    them but the mailing list won&#39;t let me). The resulting code still<br>

    crashes but at a later stage. After the fix, the log file<br>

    generates a WARNING: &quot;ngram_search.c&quot;, line 1000:. I don&#39;t know if<br>

    this might be the cause of the problem. There is also a<br>

    possibility that I simply have to add a configuration variable to<br>

    PocketSphinx itself. At the moment I have only used the n_best<br>

    switch on PocketSphinxEngine.<br>

<br>

    Please do let me know if you have any ideas of how to get this<br>

    working or who else to contact.<br>

<br>

    Thanks for all your help,<br>

<br>

    Blaise<br>

<br>

<br>

<br>

</div></div></blockquote>

<br>

</blockquote>

<br>

</blockquote></div><br>