Good idea. That sounds doable. But we're also not getting acoustic model scores. I don't know for sure: do later stages of Olympus (like Helios) depend on word-level acoustic scores?<br><br>-Thomas<br><br><div class="gmail_quote">
On Tue, Apr 13, 2010 at 1:12 PM, Antoine Raux <span dir="ltr"><<a href="mailto:antoine.raux@gmail.com">antoine.raux@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin: 0pt 0pt 0pt 0.8ex; border-left: 1px solid rgb(204, 204, 204); padding-left: 1ex;">
Actually, rather than a bunch of ifs as I wrote, the (still temporary) solution is to get the ngram_model_t object from ps, and then use the sphinxbase functions (such as ngram_tg_score) to compute the backoff type (which is exactly what ps does at decoding time).<br>
<br>
antoine<br>
<br>
Thomas Harris wrote:<br>
<blockquote class="gmail_quote" style="margin: 0pt 0pt 0pt 0.8ex; border-left: 1px solid rgb(204, 204, 204); padding-left: 1ex;"><div class="im">
Hi Antoine,<br>
<br>
Yes, that was/is a problem and I tried something like this. But an even more fundamental problem is that the ps_seg_t* segment iterator that you get from pocketsphinx doesn't correctly implement ps_seg_prob when the segment iterator comes from the hypothesis iterator, even though it works fine if you get the segment iterator from the best_hyp function (or whatever that's called). I've sent David the code segment that illustrates this bug. I don't know that there's any kind of workaround. For the most part we've gotten multiple hypotheses by running multiple recognizers, I guess.<br>
<br>
Thanks,<br>
-Thomas<br>
<br></div><div><div></div><div class="h5">
On Tue, Apr 13, 2010 at 11:58 AM, Antoine Raux <<a href="mailto:antoine.raux@gmail.com" target="_blank">antoine.raux@gmail.com</a>> wrote:<br>
<br>
Hi all,<br>
<br>
What exactly is the confidence computation problem? Is it that we<br>
cannot compute the LM backoff type-based word confidence (see<br>
hyp_conf_slm in PocketsphinxEngine's main.cpp)?<br>
If that is the problem, one way to fix this might be to modify<br>
hyp_conf_slm to accept a ps_seg_t as an argument (instead of<br>
always getting seg_iter from ps_seg_iter):<br>
<br>
float* hyp_conf_slm (bool useFixedScore = false, ps_seg_t *seg_iter = NULL)<br>
{<br>
const int MAX_TYPE_SIZE = 4096;<br>
int32 score, type[MAX_TYPE_SIZE];<br>
int32 k = 0;<br>
<br>
// (antoine) no seg_iter was given, get the top segment iterator from ps<br>
if (seg_iter == NULL)<br>
seg_iter = ps_seg_iter(psd, &score);<br>
<br>
type[k++] = 3; // use the trigram dummy for first word<br>
<br>
if (seg_iter != NULL) {<br>
while ((seg_iter = ps_seg_next(seg_iter)) != NULL) {<br>
// leave room for the two dummy entries appended after the loop<br>
if (k >= MAX_TYPE_SIZE - 2) return NULL;<br>
<br>
int32 lscr, ascr;<br>
ps_seg_prob(seg_iter, &ascr, &lscr, &type[k++]);<br>
}<br>
}<br>
type[k++] = 3; // (tk) dummy trigram after utterance<br>
type[k++] = 3; // (tk) sometimes there's no end token, in which case<br>
// the last one was for the end token and this one is the dummy<br>
<br>
// (antoine) allocate the array of confidence scores<br>
float* conf = (float*)malloc(k*sizeof(float));<br>
<br>
for (int32 i = 1; i < k-2; i++) {<br>
if(!useFixedScore) {<br>
// weighted sum of backoff types: previous and current word count once,<br>
// the next two words count twice; 6..18 is rescaled to 0..1<br>
int32 t = type[i-1] + type[i] + ((type[i+1] + type[i+2])<<1); // (tk) wtf?<br>
conf[i-1] = (float)((double)(t-6)/12.0);<br>
} else {<br>
conf[i-1] = 0.7f;<br>
}<br>
}<br>
<br>
return conf;<br>
}<br>
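<br>
For reference, the weighting in the loop above can be checked in isolation. The following is a minimal standalone sketch, not code from PocketsphinxEngine: backoff_confidence is a hypothetical helper name, and it assumes every backoff type is 1, 2 or 3 and that the input is already padded with one leading and two trailing dummy 3s, as hyp_conf_slm does. Under those assumptions the weighted sum ranges from 6 to 18, so each confidence lands between 0 and 1.<br>

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of PocketsphinxEngine): applies the same
// weighting as hyp_conf_slm to a plain vector of backoff types.
// `types` must already contain one leading and two trailing dummy 3s.
std::vector<float> backoff_confidence(const std::vector<int>& types) {
    std::vector<float> conf;
    if (types.size() < 4) return conf;  // padding only, no real words

    for (int t : types) {
        // Assumption of this sketch: types are n-gram orders 1..3.
        assert(t >= 1 && t <= 3);
    }

    // One confidence per real word: indices 1 .. size-3 inclusive.
    for (std::size_t i = 1; i + 2 < types.size(); ++i) {
        // Previous and current word count once, the next two count twice.
        int t = types[i - 1] + types[i] + 2 * (types[i + 1] + types[i + 2]);
        // t lies in [6, 18]; rescale to [0, 1].
        conf.push_back(static_cast<float>((t - 6) / 12.0));
    }
    return conf;
}
```

Calling backoff_confidence on a padded vector in which every entry is 3 gives 1.0 for each word, while a run of unigram backoffs pulls the neighbouring values down.<br>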
<br>
Then further down, you can modify the third version of<br>
fillPartialHypStruct by just adding the argument when it calls<br>
hyp_conf_slm:<br>
<br>
// [2008-02-19] (antoine): this function takes a partial hypothesis and a<br>
// reference to a THypStruct and fills in the hyp struct<br>
void fillPartialHypStruct(ps_seg_t* curr_seg_iter, THypStruct* phs, int fromNBest) {<br>
<br>
Log(STD_STREAM, "Filling partial hyp struct\n");<br>
<br>
size_t h_len, ch_len;<br>
int n_words = 0, n_validwords, has_oov;<br>
char tmp[16384];<br>
float *lm_conf = NULL;<br>
<br>
// Fill in confidence values for words in result and build filtered hypothesis<br>
// (useFixedScore comes first in hyp_conf_slm's signature, then the iterator)<br>
if (slm)<br>
lm_conf = hyp_conf_slm(false, curr_seg_iter);<br>
else<br>
lm_conf = hyp_conf_slm(true, curr_seg_iter);<br>
<br>
(...)<br>
<br>
I don't really have any setup to test this but if someone who has<br>
could give it a shot and post the result to the mailing list...<br>
Now it might be that I misunderstood what the problem was<br>
altogether (in which case I apologize for the spam)...<br>
<br>
On a side note, the big commented out block in getHypStructs (as<br>
sent by Blaise) is from my Cactus code (which I had sent to Blaise<br>
as an example), so it's irrelevant to Olympus and should be<br>
deleted (for clarity's sake).<br>
<br>
antoine<br>
<br>
Blaise Thomson wrote:<br>
<br>
Hi Thomas / Alan,<br>
<br>
I've now got some preliminary N-best list code to work with<br>
PocketSphinx. With the help of some example code from Antoine<br>
I've modified the pocketsphinx engine to produce a 1-best list<br>
for partial recognition results but an N-best list upon<br>
completion. I've also modified the AudioServer to be able to<br>
receive multiple N-best lists from each of the recognizers (the<br>
number for each decoder is specified by an optional ":N" after<br>
the decoder definition in the config file). In case this may<br>
be something you want to include in future versions of Olympus<br>
I've attached my modified files.<br>
<br>
Note, however, that the code still doesn't produce any<br>
confidence score information for the N-best list. For this<br>
reason we will still probably be unable to use Olympus for our<br>
version of the LetsGo! system. If the PocketSphinx bugs you<br>
mentioned are fixed any time soon or if anyone finds out how<br>
to get confidence scores with the N-best list would you please<br>
let us know?<br>
<br>
Many thanks,<br>
Blaise<br>
<br>
<br>
<br>
Thomas Harris wrote:<br>
<br>
Hi Blaise,<br>
<br>
Thanks for looking into this. I hope we can include your<br>
bugfixes. I've been looking into this as well, and there's<br>
a more fundamental issue. It seems like you can't get word<br>
confidence metrics from the PocketSphinx segment iterators<br>
when you've gotten the segment iterators from the n_best<br>
hypothesis iterator. It smells like a PocketSphinx bug,<br>
but I haven't seen any reference implementation of<br>
PocketSphinx that makes use of those confidence metrics in<br>
an n_best setting, so I'm not sure that it isn't a problem<br>
with how the PocketSphinx API is used. Until that issue is<br>
resolved, n_best lists won't work in Olympus; too many<br>
downstream processes depend on those confidence metrics.<br>
<br>
Thanks,<br>
-Thomas<br>
<br>
On Wed, Mar 24, 2010 at 4:39 AM, Blaise Thomson<br>
<<a href="mailto:brmt2@cam.ac.uk" target="_blank">brmt2@cam.ac.uk</a> <mailto:<a href="mailto:brmt2@cam.ac.uk" target="_blank">brmt2@cam.ac.uk</a>><br></div></div><div><div></div><div class="h5">
<mailto:<a href="mailto:brmt2@cam.ac.uk" target="_blank">brmt2@cam.ac.uk</a> <mailto:<a href="mailto:brmt2@cam.ac.uk" target="_blank">brmt2@cam.ac.uk</a>>>> wrote:<br>
<br>
Dear Olympus developers,<br>
<br>
I am trying to get the Olympus LetsGo! system to provide an N-best<br>
list of speech recognition hypotheses. I found the -n_best switch,<br>
which can be passed to the PocketSphinxEngine and is supposed to<br>
enable this, but when I set the switch to anything other than 0 the<br>
system crashes immediately on any audio input. I remember you said<br>
that the system had been built to provide N-best lists, so I was<br>
wondering if you could give any advice on why it is not working.<br>
Do you have a working N-best list system that you could send me to<br>
see how things are configured?<br>
<br>
In trying to solve the problem I took a look at the<br>
PocketSphinxEngine source code and noticed some possible<br>
memory access bugs which may be contributing to this. These were<br>
related to the way the iHypsGenerated variable was used. I've<br>
fixed these and can send them if you would like (I tried attaching<br>
them but the mailing list won't let me). The resulting code still<br>
crashes, but at a later stage. After the fix, the log file<br>
generates a WARNING: "ngram_search.c", line 1000:. I don't know if<br>
this might be the cause of the problem. There is also a<br>
possibility that I simply have to add a configuration variable to<br>
PocketSphinx itself. At the moment I have only used the n_best<br>
switch on PocketSphinxEngine.<br>
<br>
Please do let me know if you have any ideas of how to get this<br>
working or who else to contact.<br>
<br>
Thanks for all your help,<br>
<br>
Blaise<br>
<br>
<br>
<br>
<br>
<br>
<br>
</div></div></blockquote>
<br>
</blockquote></div><br>