Good idea, that sounds doable. But we&#39;re also not getting acoustic model scores. I don&#39;t know for sure: do later stages of Olympus (like Helios) depend on word-level acoustic scores?<br><br>-Thomas<br><br><div class="gmail_quote">

On Tue, Apr 13, 2010 at 1:12 PM, Antoine Raux <span dir="ltr">&lt;<a href="mailto:antoine.raux@gmail.com">antoine.raux@gmail.com</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin: 0pt 0pt 0pt 0.8ex; border-left: 1px solid rgb(204, 204, 204); padding-left: 1ex;">

Actually, rather than a bunch of ifs as I wrote, the (still temporary) solution is to get the ngram_model_t object from ps, and then use the sphinxbase functions (such as ngram_tg_score) to compute the backoff type (which is exactly what ps does at decoding time).<br>
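To make the idea of a "backoff type" concrete, here is a minimal self-contained sketch. It assumes the type is simply the order of the longest n-gram the model holds for a word in its two-word history. ToyLM and backoff_type are hypothetical names for illustration only, not the sphinxbase ngram_model_t API.<br>

```cpp
#include <set>
#include <string>
#include <tuple>
#include <utility>

// Toy stand-in for an n-gram language model (hypothetical; this is NOT
// the sphinxbase ngram_model_t). It only records which bigrams and
// trigrams are known; every word is assumed to have a unigram entry.
struct ToyLM {
    std::set<std::pair<std::string, std::string>> bigrams;
    std::set<std::tuple<std::string, std::string, std::string>> trigrams;
};

// Returns the order of the longest n-gram ending in w3 that the model
// contains: 3 = trigram hit, 2 = backed off to bigram, 1 = unigram.
int backoff_type(const ToyLM& lm, const std::string& w1,
                 const std::string& w2, const std::string& w3) {
    if (lm.trigrams.count(std::make_tuple(w1, w2, w3))) return 3;
    if (lm.bigrams.count(std::make_pair(w2, w3))) return 2;
    return 1;
}
```

A real implementation would ask the decoder's language model for this value instead of keeping its own tables.<br>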


<br>

antoine<br>

<br>

Thomas Harris wrote:<br>

<blockquote class="gmail_quote" style="margin: 0pt 0pt 0pt 0.8ex; border-left: 1px solid rgb(204, 204, 204); padding-left: 1ex;"><div class="im">

Hi Antoine,<br>

<br>

Yes, that was/is a problem and I tried something like this. But the more fundamental problem is that the ps_seg_t* segment iterator you get from pocketsphinx doesn&#39;t correctly implement ps_seg_prob when it comes from the hypothesis iterator, even though it works fine when you get it from the best_hyp function (or whatever that&#39;s called). I&#39;ve sent David the code segment that illustrates this bug. I don&#39;t know of any workaround. For the most part we&#39;ve gotten multiple hypotheses by running multiple recognizers, I guess.<br>


<br>

Thanks,<br>

-Thomas<br>

<br></div><div><div></div><div class="h5">

On Tue, Apr 13, 2010 at 11:58 AM, Antoine Raux &lt;<a href="mailto:antoine.raux@gmail.com" target="_blank">antoine.raux@gmail.com</a>&gt; wrote:<br>


<br>

    Hi all,<br>

<br>

    What exactly is the confidence computation problem? Is it that we<br>

    cannot compute the LM backoff type-based word confidence (see<br>

    hyp_conf_slm in PocketsphinxEngine&#39;s main.cpp)?<br>

    If that is the problem, one way to fix this might be to modify<br>

    hyp_conf_slm to accept a ps_seg_t as an argument (instead of<br>

    always getting seg_iter from ps_seg_iter):<br>

<br>

    float* hyp_conf_slm (bool useFixedScore = false, ps_seg_t *seg_iter = NULL)<br>

    {<br>

      const int MAX_TYPE_SIZE = 4096;<br>

      int32 score, type[MAX_TYPE_SIZE];<br>

      int32 k = 0;<br>

<br>

      // (antoine) no seg_iter was given, get the top segment iterator from ps<br>

      if (seg_iter == NULL)<br>

          seg_iter = ps_seg_iter(psd, &amp;score);<br>

<br>

      type[k++] = 3;                      // use the trigram dummy for first word<br>

<br>

      if (seg_iter != NULL) {<br>

          while (seg_iter = ps_seg_next(seg_iter)) {<br>

              if (k &gt;= MAX_TYPE_SIZE - 2) return NULL; // leave room for the two trailing dummies<br>

<br>

              int32 lscr, ascr;<br>

              ps_seg_prob(seg_iter, &amp;ascr, &amp;lscr, &amp;type[k++]);<br>

          }<br>

      }<br>

      type[k++] = 3; // (tk) dummy trigram after utterance<br>

      type[k++] = 3; // (tk) sometimes there&#39;s no end token, in which case<br>

                         // the last one was for the end token and this one is the dummy<br>

<br>

      // (antoine) allocate the array of confidence scores<br>

      float* conf = (float*)malloc(k*sizeof(float));<br>

<br>

      for (int32 i = 1; i &lt; k-2; i++) {<br>

          if(!useFixedScore) {<br>

              int32 t = type[i-1] + type[i] + ((type[i+1] + type[i+2])&lt;&lt;1); // (tk) wtf?<br>

              conf[i-1] = (float)((double)(t-6)/12.0);<br>

          } else {<br>

              conf[i-1] = 0.7f;<br>

          }<br>

      }<br>

<br>

      return conf;<br>

    }<br>
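The arithmetic in the loop above can be isolated into a small standalone sketch. It assumes backoff types take values 1 (unigram), 2 (bigram) or 3 (trigram), which the dummy value 3 suggests but the code does not state. Under that assumption t ranges from 6 to 18, so (t-6)/12 maps each word to a confidence in [0, 1], with the two following words weighted twice as heavily as the current and previous ones. The name backoff_confidence is hypothetical.<br>

```cpp
#include <cstddef>
#include <vector>

// Word confidence from LM backoff types, mirroring the arithmetic of
// hyp_conf_slm. `types` holds one backoff type per word, padded with one
// dummy entry at the front and two at the back (as in the original).
// Assumption: each type is 1, 2 or 3.
std::vector<float> backoff_confidence(const std::vector<int>& types) {
    std::vector<float> conf;
    if (types.size() < 4) return conf;  // padding only, no real words
    for (std::size_t i = 1; i + 2 < types.size(); ++i) {
        // previous + current word, plus the next two words counted twice
        int t = types[i - 1] + types[i] + 2 * (types[i + 1] + types[i + 2]);
        conf.push_back(static_cast<float>((t - 6) / 12.0));
    }
    return conf;
}
```

With every type equal to 3 each word scores 1.0, and with every type equal to 1 each word scores 0.0.<br>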

<br>

    Then further down, you can modify the third version of<br>

    fillPartialHypStruct by just adding the argument when it calls<br>

    hyp_conf_slm:<br>

<br>

    // [2008-02-19] (antoine): this function takes a partial hypothesis and a reference to a<br>

    //                        THypStruct and fills in the hyp struct<br>

    void fillPartialHypStruct(ps_seg_t* curr_seg_iter, THypStruct* phs, int fromNBest) {<br>

<br>

      Log(STD_STREAM, &quot;Filling partial hyp struct\n&quot;);<br>

<br>

      size_t h_len, ch_len;<br>

      int n_words = 0, n_validwords, has_oov;<br>

      char tmp[16384];<br>

      float *lm_conf = NULL;<br>

<br>

      // Fill in confidence values for words in result and build filtered hypothesis<br>

      // useFixedScore is the first parameter of hyp_conf_slm, so pass it explicitly<br>

      if (slm)<br>

          lm_conf = hyp_conf_slm(false, curr_seg_iter);<br>

      else<br>

          lm_conf = hyp_conf_slm(true, curr_seg_iter);<br>

<br>

    (...)<br>

<br>

    I don&#39;t really have any setup to test this but if someone who has<br>

    could give it a shot and post the result to the mailing list...<br>

    Now it might be that I misunderstood what the problem was<br>

    altogether (in which case I apologize for the spam)...<br>

<br>

    On a side note, the big commented out block in getHypStructs (as<br>

    sent by Blaise) is from my Cactus code (which I had sent to Blaise<br>

    as an example), so it&#39;s irrelevant to Olympus and should be<br>

    deleted (for clarity&#39;s sake).<br>

<br>

    antoine<br>

<br>

    Blaise Thomson wrote:<br>

<br>

        Hi Thomas / Alan,<br>

<br>

        I&#39;ve now got some preliminary N-best list code to work with<br>

        PocketSphinx. With the help of some example code from Antoine<br>

        I&#39;ve modified the pocketsphinx engine to produce a 1-best list<br>

        for partial recognition results but an N-best list upon<br>

        completion. I&#39;ve also modified the AudioServer to be able to<br>

        receive multiple N-best lists from each of the recognizers (the<br>

        number for each decoder specified by an optional &quot;:N&quot; after<br>

        the decoder definition in the config file). In case this may<br>

        be something you want to include in future versions of Olympus<br>

        I&#39;ve attached my modified files.<br>

<br>

        Note, however, that the code still doesn&#39;t produce any<br>

        confidence score information for the N-best list. For this<br>

        reason we will still probably be unable to use Olympus for our<br>

        version of the LetsGo! system. If the PocketSphinx bugs you<br>

        mentioned are fixed any time soon, or if anyone finds out how<br>

        to get confidence scores with the N-best list, would you please<br>

        let us know?<br>

<br>

        Many thanks,<br>

        Blaise<br>

<br>

<br>

<br>

        Thomas Harris wrote:<br>

<br>

            Hi Blaise,<br>

<br>

            Thanks for looking into this. I hope we can include your<br>

            bugfixes. I&#39;ve been looking into this as well, and there&#39;s<br>

            a more fundamental issue. It seems like you can&#39;t get word<br>

            confidence metrics from the PocketSphinx segment iterators<br>

            when you&#39;ve gotten the segment iterators from the n_best<br>

            hypothesis iterator. It smells like a PocketSphinx bug,<br>

            but I haven&#39;t seen any reference implementation of<br>

            PocketSphinx that makes use of those confidence metrics in<br>

            an n_best setting, so I&#39;m not sure that it isn&#39;t a problem<br>

            with how the PocketSphinx api is used. Until that issue is<br>

            resolved, n_best lists won&#39;t work in Olympus: too many<br>

            downstream processes depend on those confidence metrics.<br>

<br>

            Thanks,<br>

            -Thomas<br>

<br>

            On Wed, Mar 24, 2010 at 4:39 AM, Blaise Thomson<br>

            &lt;<a href="mailto:brmt2@cam.ac.uk" target="_blank">brmt2@cam.ac.uk</a> &lt;mailto:<a href="mailto:brmt2@cam.ac.uk" target="_blank">brmt2@cam.ac.uk</a>&gt;<br></div></div><div><div></div><div class="h5">

            &lt;mailto:<a href="mailto:brmt2@cam.ac.uk" target="_blank">brmt2@cam.ac.uk</a> &lt;mailto:<a href="mailto:brmt2@cam.ac.uk" target="_blank">brmt2@cam.ac.uk</a>&gt;&gt;&gt; wrote:<br>

<br>

               Dear Olympus developers,<br>

<br>

               I am trying to get the Olympus LetsGo! system to provide an N-best<br>

               list of speech recognition hypotheses. I found the -n_best switch,<br>

               which can be passed to the PocketSphinxEngine and is supposed to<br>

               enable this, but when I set the switch to anything other than 0 the<br>

               system crashes immediately on any audio input. I remember you said<br>

               that the system had been built to provide N-best lists, so I was<br>

               wondering if you could give any advice on why it is not working.<br>

               Do you have a working N-best list system that you could send me to<br>

               see how things are configured?<br>

<br>

               In trying to solve the problem I took a look at the<br>

               PocketSphinxEngine source code and have noticed some possible<br>

               memory access bugs which may be contributing to this. These were<br>

               related to the way the iHypsGenerated variable was used. I&#39;ve<br>

               fixed these and can send them if you would like (I tried attaching<br>

               them but the mailing list won&#39;t let me). The resulting code still<br>

               crashes but at a later stage. After the fix, the log file<br>

               generates a WARNING: &quot;ngram_search.c&quot;, line 1000:. I don&#39;t know if<br>

               this might be the cause of the problem. There is also a<br>

               possibility that I simply have to add a configuration variable to<br>

               PocketSphinx itself. At the moment I have only used the n_best<br>

               switch on PocketSphinxEngine.<br>

<br>

               Please do let me know if you have any ideas of how to<br>

            get this<br>

               working or who else to contact.<br>

<br>

               Thanks for all your help,<br>

<br>

               Blaise<br>

<br>

<br>

<br>

<br>

<br>

<br>

</div></div></blockquote>

<br>

</blockquote></div><br>