Closed
Bug 1193183
Opened 9 years ago
Closed 9 years ago
Correctly implement SpeechRecognitionAlternative::confidence
Categories
(Core :: Web Speech, defect)
Tracking
RESOLVED
FIXED
FxOS-S5 (21Aug)
| | Tracking | Status |
|---|---|---|
| firefox43 | --- | fixed |
People
(Reporter: kdavis, Assigned: kdavis)
Details
(Whiteboard: [webspeechapi][vaani][systemsfe])
Attachments
(1 file)
The current implementation of SpeechRecognitionAlternative::confidence always indicates a value of 100 for all final results. This does not reflect the value returned from pocketsphinx. The value of SpeechRecognitionAlternative::confidence should reflect the value returned from the pocketsphinx method ps_get_hyp().
Assignee: nobody → kdavis
Whiteboard: [webspeechapi][vaani][systemsfe]
(In reply to kdavis from comment #0)
> The value of SpeechRecognitionAlternative::confidence should reflect
> the value returned from the pocketsphinx method ps_get_hyp().

It appears that the score returned here is not supported in pocketsphinx:
http://cmusphinx.sourceforge.net/wiki/faq#qcan_pocketsphinx_reject_out-of-grammar_words_and_noises

Hence, it looks as if the best path is to use ps_get_prob().
Part 1 of 1: Correctly implement SpeechRecognitionAlternative::confidence using ps_get_prob()

Part 1 of 1 for this bug.

Before this patch, SpeechRecognitionAlternative::confidence was always set to 100 for all final SpeechEvents. This was incorrect for at least two reasons:

1. According to the spec [1], confidence should lie in [0,1].
2. The value of confidence was not derived from pocketsphinx's decoding.

This patch fixes this by obtaining the log posterior probability of the recognition from ps_get_prob(). It then converts this to a probability and uses that probability as the confidence.

A slight technical detail is that, according to its documentation, ps_get_prob() requires two conditions to be true in order to work:

1. The -bestpath option must be enabled.
2. The result it is being used for must not be partial.

To satisfy these two requirements, this commit:

1. Enables the -bestpath option.
2. Only calls ps_get_prob() on final results.

An additional bonus should follow from this commit. Up until now we did not require our results to be final. Hence, we sometimes incorrectly interpreted partial results as final results. After this commit, we only treat final results as final. This should increase the accuracy of our recognition, as we will ignore results we are uncertain about.

The try run for this patch is here:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=b7e6d89f064a

[1] https://dvcs.w3.org/hg/speech-api/raw-file/tip/webspeechapi.html#dfn-confidence
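For readers without the attachment open, the conversion described in this comment can be sketched roughly as follows. This is a minimal, self-contained illustration and not the code from the patch: the `Hypothesis` struct, the helper names, and the log base of 1.0001 are all assumptions made for the example. The real code would take the log posterior from ps_get_prob() and use the decoder's own log-math settings rather than a hard-coded base.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>

// Hypothetical stand-in for a decoding result; not a pocketsphinx type.
struct Hypothesis {
  bool isFinal;     // true only for a final (non-partial) result
  int32_t logProb;  // integer log posterior, as ps_get_prob() is described to return
};

// Assumed log base, for illustration only. The real base is whatever the
// decoder's log-math configuration uses.
constexpr double kAssumedLogBase = 1.0001;

// Convert an integer log posterior into a probability clamped to [0, 1].
double LogProbToConfidence(int32_t logProb, double base = kAssumedLogBase) {
  const double probability = std::pow(base, static_cast<double>(logProb));
  return std::min(1.0, std::max(0.0, probability));
}

// Compute a confidence only for final results. Returns false, leaving
// *confidence untouched, when the result is partial.
bool ConfidenceFor(const Hypothesis& hyp, double* confidence) {
  if (!hyp.isFinal) {
    return false;
  }
  *confidence = LogProbToConfidence(hyp.logProb);
  return true;
}
```

Under these assumptions, a log posterior of 0 maps to a confidence of exactly 1, any negative log posterior maps to a value strictly between 0 and 1, and partial results yield no confidence at all.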
Attachment #8646359 -
Flags: review?(bugs)
Comment 3•9 years ago
Just a note: ps_get_prob() in pocketsphinx works only for language models, which is not what we are using in the Mozilla API so far. For grammars, this part is not properly implemented yet, and the pocketsphinx codebase should be fixed to return this correctly.
(In reply to Andre Natal from comment #3)
> Just a note: ps_get_prob() on pocketsphinx works only for Language Models,
> that isn't what we are using on Mozilla API so far.
>
> For grammars, this part is not properly implemented yet and pocketsphinx
> codebase should be fixed to return this correctly.

I agree. Even so, ps_get_prob() is the correct method to obtain the logprob from. Thus, in future, when we use a language model or when grammar confidence is supported, the correct logprob will be returned from ps_get_prob() and our code will work without change.

As to the current implementation of ps_get_prob() for grammar-based recognition, it returns 0, which leads to a confidence of 1. So the confidence is within the correct range [0,1], in contrast to the previously hard-coded 100.

So this code is as "correct as possible" with the current pocketsphinx implementation and is as "future proof as is possible" with the current pocketsphinx implementation.
Updated•9 years ago
Target Milestone: --- → FxOS-S5 (21Aug)
Comment 5•9 years ago
I wonder what float64 is. Is there really such a type in C/C++?
Comment 6•9 years ago
Comment on attachment 8646359
Part 1 of 1: Correctly implement SpeechRecognitionAlternative::confidence using ps_get_prob()

Actually, any reason why Andre shouldn't review this? (This is, after all, more about the pocketsphinx backend than the web-facing API.)
Attachment #8646359 -
Flags: review?(bugs) → review?(anatal)
Updated•9 years ago
Attachment #8646359 -
Flags: review?(anatal) → review+
Comment 7•9 years ago
This is perfect, we just need to check that comment from Olli about float64.
(In reply to Andre Natal from comment #7)
> This is perfect, we just need to check that comment from Olli about float64.

The type actually comes from sphinxbase. The file media/sphinxbase/sphinxbase/prim_type.h defines float64 to be a double.
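To make the point concrete, here is a small sketch, assuming the definition in prim_type.h amounts to a plain typedef of double. The declaration below is reconstructed from the description in this comment rather than copied from the header, and `ToDouble` is a hypothetical helper added only for illustration.

```cpp
#include <type_traits>

// Assumed shape of the sphinxbase definition described above.
typedef double float64;

// If float64 is a plain alias, it is the same type as double, not a new one.
static_assert(std::is_same<float64, double>::value,
              "float64 is expected to be an alias for double");

// Hypothetical helper: a float64 value passes to double with no conversion.
inline double ToDouble(float64 value) { return value; }
```

If that assumption holds, values of type float64 can be handed to code expecting double without a cast.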
Comment 9•9 years ago
Yeah, I think the float64 usage here is fine. It just isn't a real C++ type, which is why I wondered what it was.
Keywords: checkin-needed
Comment 10•9 years ago
https://hg.mozilla.org/integration/mozilla-inbound/rev/9734ce792065
Keywords: checkin-needed