[Olympus developers 218]: Re: N-best lists for PocketSphinx / Olympus

Antoine Raux antoine.raux at gmail.com
Tue Apr 13 11:58:14 EDT 2010


Hi all,

What exactly is the confidence computation problem? Is it that we cannot 
compute the LM backoff type-based word confidence (see hyp_conf_slm in 
PocketsphinxEngine's main.cpp)?
If that is the problem, one way to fix this might be to modify 
hyp_conf_slm to accept a ps_seg_t as an argument (instead of always 
getting seg_iter from ps_seg_iter):

float* hyp_conf_slm (bool useFixedScore = false, ps_seg_t *seg_iter = NULL)
{
    const int MAX_TYPE_SIZE = 4096;
    int32 score, type[MAX_TYPE_SIZE];
    int32 k = 0;

    // (antoine) no seg_iter was given, get the top segment iterator from ps
    if (seg_iter == NULL)
        seg_iter = ps_seg_iter(psd, &score);

    type[k++] = 3;                      // use the trigram dummy for first word

    if (seg_iter != NULL) {
        while ((seg_iter = ps_seg_next(seg_iter)) != NULL) {
            // leave room for the two trailing dummy entries added below
            if (k >= MAX_TYPE_SIZE - 2) return NULL;

            int32 lscr, ascr;
            ps_seg_prob(seg_iter, &ascr, &lscr, &type[k++]);
        }
    }
    type[k++] = 3; // (tk) dummy trigram after utterance
    type[k++] = 3; // (tk) sometimes there's no end token, in which case
                   // the last one was for the end token and this one is the dummy

    // (antoine) allocate the array of confidence scores
    float* conf = (float*)malloc(k*sizeof(float));

    for (int32 i = 1; i < k-2; i++) {
        if(!useFixedScore) {
            int32 t = type[i-1] + type[i] + ((type[i+1] + type[i+2])<<1); // (tk) wtf?
            conf[i-1] = (float)((double)(t-6)/12.0);
        } else {
            conf[i-1] = 0.7f;
        }
    }

    return conf;
}
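
As an aside, the weighting in the loop above can be checked in isolation. What follows is only a minimal sketch: backoffConfidence is a hypothetical helper (it is not in main.cpp), and it assumes that every backoff type reported by ps_seg_prob is 1, 2 or 3 (unigram, bigram, trigram). Under that assumption t ranges from 6 to 18, so (t-6)/12 lands in [0, 1].

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper, not part of PocketsphinxEngine: applies the same
// weighting as hyp_conf_slm to an already padded vector of backoff types.
std::vector<float> backoffConfidence(const std::vector<int>& type) {
    // Assumption: every entry is 1 (unigram), 2 (bigram) or 3 (trigram).
    for (std::size_t j = 0; j < type.size(); ++j)
        assert(type[j] >= 1 && type[j] <= 3);

    std::vector<float> conf;
    // Same bounds as the original loop: i runs from 1 to k-3 inclusive.
    for (std::size_t i = 1; i + 2 < type.size(); ++i) {
        // The two following entries are counted twice.
        int t = type[i - 1] + type[i] + ((type[i + 1] + type[i + 2]) << 1);
        // t is between 6 and 18, so the result is between 0 and 1.
        conf.push_back(static_cast<float>((t - 6) / 12.0));
    }
    return conf;
}
```

With these assumptions, a padded sequence made only of trigram entries gives a confidence of 1.0 for every word, and one made only of unigram entries gives 0.0.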

Then further down, you can modify the third version of 
fillPartialHypStruct by just adding the argument when it calls hyp_conf_slm:

// [2008-02-19] (antoine): this function takes a partial hypothesis and a
//                        reference to a THypStruct and fills in the hyp struct
void fillPartialHypStruct(ps_seg_t* curr_seg_iter, THypStruct* phs, int fromNBest) {

    Log(STD_STREAM, "Filling partial hyp struct\n");

    size_t h_len, ch_len;
    int n_words = 0, n_validwords, has_oov;
    char tmp[16384];
    float *lm_conf = NULL;

    // Fill in confidence values for words in result and build filtered hypothesis
    // (the iterator is the second parameter of hyp_conf_slm, after useFixedScore)
    if (slm)
        lm_conf = hyp_conf_slm(false, curr_seg_iter);
    else
        lm_conf = hyp_conf_slm(true, curr_seg_iter);

(...)

I don't really have any setup to test this but if someone who has could 
give it a shot and post the result to the mailing list...
Now it might be that I misunderstood what the problem was altogether (in 
which case I apologize for the spam)...

On a side note, the big commented out block in getHypStructs (as sent by 
Blaise) is from my Cactus code (which I had sent to Blaise as an 
example), so it's irrelevant to Olympus and should be deleted (for 
clarity's sake).

antoine

Blaise Thomson wrote:
> Hi Thomas / Alan,
>
> I've now got some preliminary N-best list code to work with 
> PocketSphinx. With the help of some example code from Antoine I've 
> modified the pocketsphinx engine to produce a 1-best list for partial 
> recognition results but an N-best list upon completion. I've also 
> modified the AudioServer to be able to receive multiple N-best lists 
> from each of the recognizers (the number for each decoder specified by 
> an optional ":N" after the decoder definition in the config file). In 
> case this may be something you want to include in future versions of 
> Olympus I've attached my modified files.
>
> Note, however, that the code still doesn't produce any confidence 
> score information for the N-best list. For this reason we will still 
> probably be unable to use Olympus for our version of the LetsGo! 
> system. If the PocketSphinx bugs you mentioned are fixed any time soon 
> or if anyone finds out how to get confidence scores with the N-best 
> list would you please let us know?
>
> Many thanks,
> Blaise
>
>
>
> Thomas Harris wrote:
>> Hi Blaise,
>>
>> Thanks for looking into this. I hope we can include your bugfixes. 
>> I've been looking into this as well, and there's a more fundamental 
>> issue. It seems like you can't get word confidence metrics from the 
>> PocketSphinx segment iterators when you've gotten the segment 
>> iterators from the n_best hypothesis iterator. It smells like a 
>> PocketSphinx bug, but I haven't seen any reference implementation of 
>> PocketSphinx that makes use of those confidence metrics in an n_best 
>> setting, so I'm not sure that it isn't a problem with how the 
>> PocketSphinx API is used. Until that issue is resolved n_best lists 
>> won't work in Olympus; too many downstream processes depend on those 
>> confidence metrics.
>>
>> Thanks,
>> -Thomas
>>
>> On Wed, Mar 24, 2010 at 4:39 AM, Blaise Thomson <brmt2 at cam.ac.uk> wrote:
>>
>>     Dear Olympus developers,
>>
>>     I am trying to get the Olympus LetsGo! system to provide an N-best
>>     list of speech recognition hypotheses. I found the -n_best switch
>>     which can be passed to the PocketSphinxEngine which is supposed to
>>     enable this but when I set the switch to anything other than 0 the
>>     system crashes immediately on any audio input. I remember you said
>>     that the system had been built to provide N-best lists so I was
>>     wondering if you could give any advice on why it is not working.
>>     Do you have a working N-best list system that you could send me to
>>     see how things are configured?
>>
>>     In trying to solve the problem I took a look at the
>>     PocketSphinxEngine source code and have noticed some possible
>>     memory access bugs which may be contributing to this. These were
>>     related to the way the iHypsGenerated variable was used. I've
>>     fixed these and can send them if you would like (I tried attaching
>>     them but the mailing list won't let me). The resulting code still
>>     crashes but at a later stage. After the fix, the log file
>>     generates a WARNING: "ngram_search.c", line 1000:. I don't know if
>>     this might be the cause of the problem. There is also a
>>     possibility that I simply have to add a configuration variable to
>>     PocketSphinx itself. At the moment I have only used the n_best
>>     switch on PocketSphinxEngine.
>>
>>     Please do let me know if you have any ideas of how to get this
>>     working or who else to contact.
>>
>>     Thanks for all your help,
>>
>>     Blaise
>>
>>
>>
>


