[Olympus developers 222]: Re: N-best lists for PocketSphinx / Olympus
Thomas Harris
tkharris at gmail.com
Tue Apr 13 13:18:22 EDT 2010
Good idea. That sounds doable. But we're also not getting acoustic model
scores. I don't know for sure: do later stages of Olympus (like Helios)
depend on word-level acoustic scores?
-Thomas
On Tue, Apr 13, 2010 at 1:12 PM, Antoine Raux <antoine.raux at gmail.com> wrote:
> Actually, rather than a bunch of ifs as I wrote, the (still temporary)
> solution is to get the ngram_model_t object from ps, and then use the
> sphinxbase functions (such as ngram_tg_score) to compute the backoff type
> (which is exactly what ps does at decoding time).
>
> antoine
>
> Thomas Harris wrote:
>
>> Hi Antoine,
>>
>> Yes, that was/is a problem and I tried something like this. But an even more
>> fundamental problem is that the ps_seg_t* segment iterator that you
>> get from pocketsphinx doesn't correctly implement ps_seg_prob when the
>> segment iterator comes from the hypothesis iterator, even though it works
>> fine if you get the segment iterator from the best_hyp function (or whatever
>> that's called). I've sent David the code segment that illustrates this bug.
>> I don't know that there's any kind of workaround. For the most part we've
>> gotten multiple hypotheses by running multiple recognizers, I guess.
>>
>> Thanks,
>> -Thomas
>>
>> On Tue, Apr 13, 2010 at 11:58 AM, Antoine Raux <antoine.raux at gmail.com> wrote:
>>
>> Hi all,
>>
>> What exactly is the confidence computation problem? Is it that we
>> cannot compute the LM backoff type-based word confidence (see
>> hyp_conf_slm in PocketsphinxEngine's main.cpp)?
>> If that is the problem, one way to fix this might be to modify
>> hyp_conf_slm to accept a ps_seg_t as an argument (instead of
>> always getting seg_iter from ps_seg_iter):
>>
>> float* hyp_conf_slm (bool useFixedScore = false, ps_seg_t *seg_iter = NULL)
>> {
>>     const int MAX_TYPE_SIZE = 4096;
>>     int32 score, type[MAX_TYPE_SIZE];
>>     int32 k = 0;
>>
>>     // (antoine) no seg_iter was given, get the top segment iterator from ps
>>     if (seg_iter == NULL)
>>         seg_iter = ps_seg_iter(psd, &score);
>>
>>     type[k++] = 3; // use the trigram dummy for first word
>>
>>     if (seg_iter != NULL) {
>>         while ((seg_iter = ps_seg_next(seg_iter)) != NULL) {
>>             // leave room for the two trailing dummy entries added below
>>             if (k >= MAX_TYPE_SIZE - 2) return NULL;
>>
>>             int32 lscr, ascr;
>>             ps_seg_prob(seg_iter, &ascr, &lscr, &type[k++]);
>>         }
>>     }
>>     type[k++] = 3; // (tk) dummy trigram after utterance
>>     type[k++] = 3; // (tk) sometimes there's no end token, in which case
>>                    // the last one was for the end token and this one is the dummy
>>
>>     // (antoine) allocate the array of confidence scores
>>     float* conf = (float*)malloc(k*sizeof(float));
>>
>>     for (int32 i = 1; i < k-2; i++) {
>>         if (!useFixedScore) {
>>             int32 t = type[i-1] + type[i] + ((type[i+1] + type[i+2])<<1); // (tk) wtf?
>>             conf[i-1] = (float)((double)(t-6)/12.0);
>>         } else {
>>             conf[i-1] = 0.7f;
>>         }
>>     }
>>
>>     return conf;
>> }
>>
>> Then further down, you can modify the third version of
>> fillPartialHypStruct by just adding the argument when it calls
>> hyp_conf_slm:
>>
>> // [2008-02-19] (antoine): this function takes a partial hypothesis and a reference to a
>> // THypStruct and fills in the hyp struct
>> void fillPartialHypStruct(ps_seg_t* curr_seg_iter, THypStruct* phs, int fromNBest) {
>>
>>     Log(STD_STREAM, "Filling partial hyp struct\n");
>>
>>     size_t h_len, ch_len;
>>     int n_words = 0, n_validwords, has_oov;
>>     char tmp[16384];
>>     float *lm_conf = NULL;
>>
>>     // Fill in confidence values for words in result and build filtered hypothesis
>>     // (useFixedScore is the first parameter, so the iterator goes second)
>>     if (slm)
>>         lm_conf = hyp_conf_slm(false, curr_seg_iter);
>>     else
>>         lm_conf = hyp_conf_slm(true, curr_seg_iter);
>>
>> (...)
>>
>> I don't really have any setup to test this but if someone who has
>> could give it a shot and post the result to the mailing list...
>> Now it might be that I misunderstood what the problem was
>> altogether (in which case I apologize for the spam)...
>>
>> On a side note, the big commented out block in getHypStructs (as
>> sent by Blaise) is from my Cactus code (which I had sent to Blaise
>> as an example), so it's irrelevant to Olympus and should be
>> deleted (for clarity's sake).
>>
>> antoine
>>
>> Blaise Thomson wrote:
>>
>> Hi Thomas / Alan,
>>
>> I've now got some preliminary N-best list code to work with
>> PocketSphinx. With the help of some example code from Antoine
>> I've modified the pocketsphinx engine to produce a 1-best list
>> for partial recognition results but an N-best list upon
>> completion. I've also modified the AudioServer to be able to
>> receive multiple N-best lists from each of the recognizers (the
>> number for each decoder specified by an optional ":N" after
>> the decoder definition in the config file). In case this may
>> be something you want to include in future versions of Olympus
>> I've attached my modified files.
>>
>> Note, however, that the code still doesn't produce any
>> confidence score information for the N-best list. For this
>> reason we will still probably be unable to use Olympus for our
>> version of the LetsGo! system. If the PocketSphinx bugs you
>> mentioned are fixed any time soon or if anyone finds out how
>> to get confidence scores with the N-best list would you please
>> let us know?
>>
>> Many thanks,
>> Blaise
>>
>>
>>
>> Thomas Harris wrote:
>>
>> Hi Blaise,
>>
>> Thanks for looking into this. I hope we can include your
>> bugfixes. I've been looking into this as well, and there's
>> a more fundamental issue. It seems like you can't get word
>> confidence metrics from the PocketSphinx segment iterators
>> when you've gotten the segment iterators from the n_best
>> hypothesis iterator. It smells like a PocketSphinx bug,
>> but I haven't seen any reference implementation of
>> PocketSphinx that makes use of those confidence metrics in
>> an n_best setting, so I'm not sure that it isn't a problem
>> with how the PocketSphinx API is used. Until that issue is
>> resolved, n_best lists won't work in Olympus: too many
>> downstream processes depend on those confidence metrics.
>>
>> Thanks,
>> -Thomas
>>
>> On Wed, Mar 24, 2010 at 4:39 AM, Blaise Thomson <brmt2 at cam.ac.uk> wrote:
>>
>> Dear Olympus developers,
>>
>> I am trying to get the Olympus LetsGo! system to provide an N-best
>> list of speech recognition hypotheses. I found the -n_best switch
>> which can be passed to the PocketSphinxEngine, which is supposed to
>> enable this, but when I set the switch to anything other than 0 the
>> system crashes immediately on any audio input. I remember you said
>> that the system had been built to provide N-best lists, so I was
>> wondering if you could give any advice on why it is not working.
>> Do you have a working N-best list system that you could send me to
>> see how things are configured?
>>
>> In trying to solve the problem I took a look at the
>> PocketSphinxEngine source code and have noticed some possible
>> memory access bugs which may be contributing to this. These were
>> related to the way the iHypsGenerated variable was used. I've
>> fixed these and can send them if you would like (I tried attaching
>> them but the mailing list won't let me). The resulting code still
>> crashes but at a later stage. After the fix, the log file
>> generates a WARNING: "ngram_search.c", line 1000:. I don't know if
>> this might be the cause of the problem. There is also a
>> possibility that I simply have to add a configuration variable to
>> PocketSphinx itself. At the moment I have only used the n_best
>> switch on PocketSphinxEngine.
>>
>> Please do let me know if you have any ideas of how to get this
>> working or who else to contact.
>>
>> Thanks for all your help,
>>
>> Blaise
>>
>