Prep for Bug 1178326: Make speech recognition services language dependent and remove assumption of a single service

RESOLVED FIXED in Firefox 42

Status

Core
Web Speech
RESOLVED FIXED
3 years ago
3 years ago

People

(Reporter: kdavis, Assigned: kdavis)

Tracking

(Blocks: 1 bug)

unspecified
mozilla42
Points:
---
Dependency tree / graph
Bug Flags:
in-testsuite +

Firefox Tracking Flags

(firefox42 fixed)

Details

(Whiteboard: [webspeechapi][vaani][systemsfe])

Attachments

(1 attachment, 2 obsolete attachments)

(Assignee)

Description

3 years ago
Currently the WebSpeech API's code makes several assumptions that will be
false once we allow for dynamically changing speech recognition languages.

The core assumption the code makes is that there is a single global SpeechRecognition engine that has the "correct" language set.

As a result of this assumption it reaches several conclusions that will not
be true once we allow for dynamically changing speech recognition languages:

1. On each call to SpeechGrammarList::addFromString() it is possible to
   validate the passed grammar using the global SpeechRecognition engine.
   This is false once multiple engines and languages are allowed as each
   engine may have a different notion of "valid grammar".
2. The global SpeechRecognition engine is language independent. This is
   false as the global engine works only for a single language. In the future
   a global engine will exist for each language.
3. SpeechRecognition::grammars is set implicitly once one creates a
   SpeechGrammarList instance, as there is only one global SpeechRecognition.
   Once there can be multiple SpeechRecognition instances present, one
   needs to explicitly assign SpeechRecognition::grammars to a given
   SpeechGrammarList instance.
4. SpeechRecognition::lang need never be set, as there is only a single
   language, en-US. Once multiple languages are allowed, the value of
   this variable will determine the SpeechRecognition instance used.
(Assignee)

Updated

3 years ago
Summary: Prep for dynamically changing speech recognition languages: Make speech recognition services language dependent and remove assumption of a single service → Prep for Bug 1178326: Make speech recognition services language dependent and remove assumption of a single service
(Assignee)

Updated

3 years ago
Assignee: nobody → kdavis
Blocks: 1178326
(Assignee)

Updated

3 years ago
Whiteboard: [webspeechapi][vaani][systemsfe]
(Assignee)

Updated

3 years ago
Blocks: 1185234
(Assignee)

Comment 1

3 years ago
Bug 1185234 is blocked only because both bugs edit the same file
and this patch must be applied first.
(Assignee)

Comment 2

3 years ago
Created attachment 8636502 [details] [diff] [review]
Part 1 of 1: Made speech recognition services language dependent and removed assumption of a single service

Part 1 of 1: Made speech recognition services language dependent and removed
assumption of a single service

REZA: This will break your code!!! Please take a look at how this changes things
in the examples below. This break, unfortunately, is required to properly
support multiple languages.

In preparation for Bug 1178326 the current WebSpeech API needs to be modified.
The core modification of this patch consists of removing the assumption that
there exists a single speech recognition engine. After this patch the assumption
is that there exists a single speech recognition engine *per language*.

The patch consists of various changes related to removing this assumption

1. The name of the default speech recognition service isn't "pocketsphinx" but
"pocketsphinx-en-US". This is in preparation for the support of multiple
languages where each language has an appropriately named service.

2. SpeechGrammarList::AddFromString no longer validates and sets the grammar
on *the* speech recognition engine as there may be multiple engines present each
with different notions of "valid" grammar.

3. The SpeechRecognition::grammars member must be explicitly assigned to as
there may be multiple instances of SpeechRecognition present each with a
different language, and the assignment can no longer happen implicitly.

4. The member SpeechRecognition::lang must be explicitly set or the lang is
derived from the lang attribute of the enclosing HTML root element. If
SpeechRecognition::lang is not set and the lang attribute of the enclosing HTML
root element is not set, an error will occur on calls to
SpeechRecognition::start().

5. Validation of a SpeechGrammarList only occurs upon calls to
SpeechRecognition::start() as this is the only time at which all of the required
elements (language and SpeechGrammarList) can be assumed to be correctly
set by the user. Anytime before a call to SpeechRecognition::start(), the user
may be in the process of assigning to the SpeechRecognition attributes.


Practically these items have the effect that old code of the form

var sr = new SpeechRecognition();
var sgl = new SpeechGrammarList();
sgl.addFromString(
"#JSGF V1.0; grammar test; public <simple> =   how to recognize speech |" +
                                              "how to wreck a nice beach ;", 1);
sr.start();

is no longer valid as the SpeechRecognition has no associated SpeechGrammarList
or language, assuming that enclosing HTML root has no lang set.

If the enclosing HTML root has no lang set

<html xmlns="http://www.w3.org/1999/xhtml">
...
</html>

one must explicitly set SpeechRecognition::grammars and SpeechRecognition::lang

var sr = new SpeechRecognition();
sr.lang = "en-US";
var sgl = new SpeechGrammarList();
sgl.addFromString(
"#JSGF V1.0; grammar test; public <simple> =   how to recognize speech |" +
                                              "how to wreck a nice beach ;", 1);
sr.grammars = sgl;
sr.start(); // Validation of sr.grammars occurs here

If the enclosing HTML root has a lang set

<html xmlns="http://www.w3.org/1999/xhtml" lang="en-US">
...
</html>

one must only explicitly set SpeechRecognition::grammars

var sr = new SpeechRecognition();
var sgl = new SpeechGrammarList();
sgl.addFromString(
"#JSGF V1.0; grammar test; public <simple> =   how to recognize speech |" +
                                              "how to wreck a nice beach ;", 1);
sr.grammars = sgl;
sr.start(); // Validation of sr.grammars occurs here

as the language will be automatically picked up from the HTML root.
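The fallback described above can be modelled with a small helper; this is an illustrative sketch only (effectiveLang is a hypothetical name, not a Gecko or Web Speech API function):

```javascript
// Hypothetical helper modelling the language fallback described above:
// use SpeechRecognition.lang when set, otherwise the lang attribute of
// the enclosing HTML root element, otherwise no language at all (in
// which case start() will error, per the patch description).
function effectiveLang(srLang, htmlRootLang) {
  if (srLang) {
    return srLang;       // an explicit sr.lang always wins
  }
  if (htmlRootLang) {
    return htmlRootLang; // picked up from <html lang="...">
  }
  return null;           // neither set: start() reports an error
}
```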


Addendum:
In making this patch the behaviour of SpeechRecognition::start() had to change
with respect to having mRecognitionService null. 

Previously, as there was a single service, if mRecognitionService was null this
was an error in gecko and a warning was issued but SpeechRecognition::start()
didn’t throw.

Now, as the user's language choice causes mRecognitionService to be set to point
to the appropriate service, if it's null because the user chose an unsupported
language, start() throws to tell the user they messed up.

This change unfortunately exposed a problem in test_audio_capture_error.html
which is also fixed in this patch. (On non-B2G platforms mRecognitionService
was always null as the fake backend was never used and on non-B2G platforms
pocketsphinx was also not there. This caused the new code to throw. Now the
fake backend is used everywhere and it doesn’t throw.)
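Given the new throwing behaviour of start(), page code may want to guard the call. The following is a hedged sketch using a toy stand-in class; ModelSpeechRecognition and SUPPORTED are illustrative names, not real Gecko or Web Speech API identifiers:

```javascript
// Toy model of the new start() behaviour: when no recognition service
// exists for the chosen language, start() throws instead of merely
// logging a warning as before.
const SUPPORTED = new Set(["en-US"]); // assumed set of installed services

class ModelSpeechRecognition {
  constructor() {
    this.lang = "";
    this.grammars = null;
  }
  start() {
    // Mirrors the addendum: a null mRecognitionService (unsupported or
    // unset language) now surfaces as a thrown error.
    if (!SUPPORTED.has(this.lang)) {
      throw new Error("no recognition service for lang: " + this.lang);
    }
    return "started";
  }
}
```

Callers can then wrap start() in try/catch to detect an unsupported language early.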
Attachment #8636502 - Flags: review?(bugs)
Attachment #8636502 - Flags: review?(anatal)
(Assignee)

Comment 3

3 years ago
Almost forgot, the try is running here https://treeherder.mozilla.org/#/jobs?repo=try&revision=59b406dd1ea1
(Assignee)

Comment 4

3 years ago
Any progress on reviews for this bug?

Comment 5

3 years ago
review is coming. Sorry, was on vacation and my review load is what it is.
(Assignee)

Comment 6

3 years ago
(In reply to Olli Pettay [:smaug] from comment #5)
> review is coming. Sorry, was on vacation and my review load is what it is.

No problem.

As there are two reviewers and no review, I was just a little worried that
it was, for whatever reason, lost and not queued anywhere.

Comment 7

3 years ago
Comment on attachment 8636502 [details] [diff] [review]
Part 1 of 1: Made speech recognition services language dependent and removed assumption of a single service

>+GetSpeechRecognitionService(const nsAString& aLang)
> {
>   nsAutoCString speechRecognitionServiceCID;
> 
>@@ -70,19 +74,23 @@ GetSpeechRecognitionService()
>   Preferences::GetCString(PREFERENCE_DEFAULT_RECOGNITION_SERVICE);
>   nsAutoCString speechRecognitionService;
> 
>-  if (!prefValue.get() || prefValue.IsEmpty()) {
>-    speechRecognitionService = DEFAULT_RECOGNITION_SERVICE;
>-  } else {
>+  if (!prefValue.IsEmpty()) {
>     speechRecognitionService = prefValue;
>+  } else if (!aLang.IsEmpty()) {
>+    speechRecognitionService =
>+      NS_LITERAL_CSTRING(DEFAULT_RECOGNITION_SERVICE_PREFIX) +
>+      NS_ConvertUTF16toUTF8(aLang);
>+  } else {
>+    speechRecognitionService = DEFAULT_RECOGNITION_SERVICE;
>   }

I would first check if (!aLang.IsEmpty()) ...
then the pref based value, and then DEFAULT_RECOGNITION_SERVICE.
Otherwise we always pick en-US on b2g, since the patch sets the
pocketsphinx-en-US pref.
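The ordering suggested here can be sketched outside of Gecko as follows; pickService and the constant values are illustrative assumptions, not the actual C++:

```javascript
// Sketch of the suggested selection order: language-derived service name
// first, then the pref value, then the hard-coded default.
const DEFAULT_RECOGNITION_SERVICE_PREFIX = "pocketsphinx-"; // assumed
const DEFAULT_RECOGNITION_SERVICE = "pocketsphinx-en-US";   // assumed

function pickService(lang, prefValue) {
  if (lang) {
    return DEFAULT_RECOGNITION_SERVICE_PREFIX + lang;
  }
  if (prefValue) {
    return prefValue;
  }
  return DEFAULT_RECOGNITION_SERVICE;
}
```

With this order a b2g pref of "pocketsphinx-en-US" no longer shadows a page that asked for another language.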




> 
>-  if (!SpeechRecognition::mTestConfig.mFakeRecognitionService){
>+  if (SpeechRecognition::mTestConfig.mFakeRecognitionService){

while you're here, want to add the missing space between ) and {

> SpeechRecognition::GetGrammars(ErrorResult& aRv) const
> {
>-  aRv.Throw(NS_ERROR_NOT_IMPLEMENTED);
>-  return nullptr;
>+  nsRefPtr<SpeechGrammarList> speechGrammarList = mSpeechGrammarList;
>+  return speechGrammarList.forget();
> }
Per WebIDL .grammars is never null, but nothing guarantees mSpeechGrammarList isn't null.
So, this would crash if you access speechRecognition.grammars before anyone has set any grammars.
The spec is vague here, but could we do something like
if (!mSpeechGrammarList) {
  mSpeechGrammarList = new SpeechGrammarList(GetParentObject());
}
nsRefPtr<SpeechGrammarList> speechGrammarList = mSpeechGrammarList;
return speechGrammarList.forget();
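The lazy-initialization pattern suggested above has a direct JS analogue; a sketch, with ModelRecognition as a hypothetical stand-in for the C++ class:

```javascript
// JS analogue of the suggested C++ getter: create an empty grammar list
// on first access so .grammars never returns null, and cache it so
// repeated reads return the same instance.
class ModelRecognition {
  #grammarList = null;
  get grammars() {
    if (this.#grammarList === null) {
      this.#grammarList = []; // stand-in for new SpeechGrammarList(...)
    }
    return this.#grammarList;
  }
  set grammars(list) {
    this.#grammarList = list;
  }
}
```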


> SpeechRecognition::SetGrammars(SpeechGrammarList& aArg, ErrorResult& aRv)
> {
>-  aRv.Throw(NS_ERROR_NOT_IMPLEMENTED);
>-  return;
>+  mSpeechGrammarList = &aArg;
> }
And since neither the getter nor the setter can throw anymore, you'll want to remove [Throws] from the .webidl and then remove the ErrorResult param.



> 
> void
> SpeechRecognition::GetLang(nsString& aRetVal, ErrorResult& aRv) const
> {
>-  aRv.Throw(NS_ERROR_NOT_IMPLEMENTED);
>+  aRetVal = mLang;
>   return;
> }
> 
> void
> SpeechRecognition::SetLang(const nsAString& aArg, ErrorResult& aRv)
> {
>-  aRv.Throw(NS_ERROR_NOT_IMPLEMENTED);
>+  mLang = aArg;
>   return;
> }
ditto here about [Throws] and ErrorResult
>+bool SpeechRecognition::SetRecognitionService(ErrorResult& aRv) {
Nit,
bool
SpeechRecognition::SetRecognitionService(ErrorResult& aRv)
{


>+bool SpeechRecognition::ValidateAndSetGrammarList(ErrorResult& aRv) {
ditto



>+  if (nullptr == mSpeechGrammarList) {
just 
if (!mSpeechGrammarList)

>+  for (uint32_t count = 0; count < grammarListLength; ++count) {
>+    nsRefPtr<SpeechGrammar> speechGrammar = mSpeechGrammarList->Item(count,aRv);
space after ,


>+    if (NS_OK != mRecognitionService->ValidateAndSetGrammarList(speechGrammar.get(), nullptr)) {
if (NS_FAILED(mRecognitionService->ValidateAndSetGrammarList(speechGrammar.get(), nullptr)))

>@@ -249,6 +250,10 @@ private:
>   nsCOMPtr<nsITimer> mSpeechDetectionTimer;
>   bool mAborted;
> 
>+  nsString mLang;
>+
>+  nsRefPtr<SpeechGrammarList> mSpeechGrammarList;
We need to cycle collect this one.
Which means... interesting, SpeechRecognition doesn't add anything to cycle collection yet.
Looks like mDOMStream member variable is also cycle collectable and should have been traversed/unlinked.
So (this is perhaps new to you, if you haven't dealt with cycle collection in other patches yet.)
You add
  NS_DECL_CYCLE_COLLECTION_CLASS_INHERITED(SpeechRecognition, DOMEventTargetHelper)
right after NS_DECL_ISUPPORTS_INHERITED in SpeechRecognition.h

Then in .cpp
NS_IMPL_CYCLE_COLLECTION_INHERITED(SpeechRecognition, DOMEventTargetHelper, mDOMStream, mSpeechGrammarList)

That will add those two member variables to cycle collection, so that they get traversed and unlinked properly.

You need to also change
NS_INTERFACE_MAP_BEGIN(SpeechRecognition)
to
NS_INTERFACE_MAP_BEGIN_CYCLE_COLLECTION_INHERITED(SpeechRecognition)


If you're curious about cycle collection
http://blog.kylehuey.com/post/27564411715/cycle-collection is a good blog post, and
http://mxr.mozilla.org/mozilla-central/source/xpcom/base/nsCycleCollector.cpp#8 has some more technical information.
Attachment #8636502 - Flags: review?(bugs)
Attachment #8636502 - Flags: review?(anatal)
Attachment #8636502 - Flags: review-
(Assignee)

Comment 8

3 years ago
(In reply to Olli Pettay [:smaug] from comment #7)
> Comment on attachment 8636502 [details] [diff] [review]
> Part 1 of 1: Made speech recognition services language dependent and removed
> assumption of a single service
> 
> >+GetSpeechRecognitionService(const nsAString& aLang)
> > {
> >   nsAutoCString speechRecognitionServiceCID;
> > 
> >@@ -70,19 +74,23 @@ GetSpeechRecognitionService()
> >   Preferences::GetCString(PREFERENCE_DEFAULT_RECOGNITION_SERVICE);
> >   nsAutoCString speechRecognitionService;
> > 
> >-  if (!prefValue.get() || prefValue.IsEmpty()) {
> >-    speechRecognitionService = DEFAULT_RECOGNITION_SERVICE;
> >-  } else {
> >+  if (!prefValue.IsEmpty()) {
> >     speechRecognitionService = prefValue;
> >+  } else if (!aLang.IsEmpty()) {
> >+    speechRecognitionService =
> >+      NS_LITERAL_CSTRING(DEFAULT_RECOGNITION_SERVICE_PREFIX) +
> >+      NS_ConvertUTF16toUTF8(aLang);
> >+  } else {
> >+    speechRecognitionService = DEFAULT_RECOGNITION_SERVICE;
> >   }
> 
> I would first check if (!aLang.IsEmpty()) ...
> then the pref based value, and then DEFAULT_RECOGNITION_SERVICE.
> Otherwise we always pick en-US on b2g, since the patch sets 
> pocketsphinx-en-US pref.
> 


Right, don't know how I missed that.


> > 
> >-  if (!SpeechRecognition::mTestConfig.mFakeRecognitionService){
> >+  if (SpeechRecognition::mTestConfig.mFakeRecognitionService){
> 
> while you're here, want to add the missing space between ) and {

OK
 
> > SpeechRecognition::GetGrammars(ErrorResult& aRv) const
> > {
> >-  aRv.Throw(NS_ERROR_NOT_IMPLEMENTED);
> >-  return nullptr;
> >+  nsRefPtr<SpeechGrammarList> speechGrammarList = mSpeechGrammarList;
> >+  return speechGrammarList.forget();
> > }
> Per WebIDL .grammars is never null, but nothing guarantees
> mSpeechGrammarList isn't null.
> So, this would crash if you access speechRecognition.grammars before anyone
> has set any grammars.

Just so I learn, how would this cause a crash? Is this a problem with
nsRefPtr not allowing nullptr? Or is it a problem with the WebIDL machinery?

If mSpeechGrammarList is nullptr, shouldn't it cause an "undefined", or at
worst "null", JavaScript value to be returned if someone asked for it?

> The spec is vague here, but could we do something like
> if (!mSpeechGrammarList) {
>   mSpeechGrammarList = new SpeechGrammarList(GetParentObject());
> }
> nsRefPtr<SpeechGrammarList> speechGrammarList = mSpeechGrammarList;
> return speechGrammarList.forget();

In light of the above having to explicitly new up a SpeechGrammarList if
mSpeechGrammarList is nullptr seems like the wrong thing to do, as it does
not reflect the fact that the member SpeechRecognition.grammars has been
declared but not assigned to and thus should be "undefined" when
accessing it from JavaScript.

> 
> > SpeechRecognition::SetGrammars(SpeechGrammarList& aArg, ErrorResult& aRv)
> > {
> >-  aRv.Throw(NS_ERROR_NOT_IMPLEMENTED);
> >-  return;
> >+  mSpeechGrammarList = &aArg;
> > }
> And since the getter or setter can't throw anymore, want to remove [Throws]
> from the .webidl and then remove ErrorResult param.
> 

Ok.

> > 
> > void
> > SpeechRecognition::GetLang(nsString& aRetVal, ErrorResult& aRv) const
> > {
> >-  aRv.Throw(NS_ERROR_NOT_IMPLEMENTED);
> >+  aRetVal = mLang;
> >   return;
> > }
> > 
> > void
> > SpeechRecognition::SetLang(const nsAString& aArg, ErrorResult& aRv)
> > {
> >-  aRv.Throw(NS_ERROR_NOT_IMPLEMENTED);
> >+  mLang = aArg;
> >   return;
> > }
> ditto here about [Throws] and ErrorResult

OK.

> >+bool SpeechRecognition::SetRecognitionService(ErrorResult& aRv) {
> Nit,
> bool
> SpeechRecognition::SetRecognitionService(ErrorResult& aRv)
> {
> 

OK

> >+bool SpeechRecognition::ValidateAndSetGrammarList(ErrorResult& aRv) {
> ditto

OK 

> 
> >+  if (nullptr == mSpeechGrammarList) {
> just 
> if (!mSpeechGrammarList)

OK
 
> >+  for (uint32_t count = 0; count < grammarListLength; ++count) {
> >+    nsRefPtr<SpeechGrammar> speechGrammar = mSpeechGrammarList->Item(count,aRv);
> space after ,

OK

> >+    if (NS_OK != mRecognitionService->ValidateAndSetGrammarList(speechGrammar.get(), nullptr)) {
> if
> (NS_FAILED(mRecognitionService->ValidateAndSetGrammarList(speechGrammar.
> get(), nullptr)))

OK
 
> >@@ -249,6 +250,10 @@ private:
> >   nsCOMPtr<nsITimer> mSpeechDetectionTimer;
> >   bool mAborted;
> > 
> >+  nsString mLang;
> >+
> >+  nsRefPtr<SpeechGrammarList> mSpeechGrammarList;
> We need to cycle collect this one.
> Which means... interesting, SpeechRequest doesn't add anything to cycle
> collection yet.
> Looks like mDOMStream member variable is also cycle collectable and should
> have been traversed/unlinked.
> So (this is perhaps new to you, if you haven't dealt with cycle collection
> in other patches yet.)
> You add
>   NS_DECL_CYCLE_COLLECTION_CLASS_INHERITED(SpeechRecognition,
> DOMEventTargetHelper)
> right after NS_DECL_ISUPPORTS_INHERITED in SpeechRecognition.h
> 
> Then in .cpp
> NS_IMPL_CYCLE_COLLECTION_INHERITED(SpeechRecognition, DOMEventTargetHelper,
> mDOMStream, mSpeechGrammarList)
> 
> That will add those two member variables to cycle collection, so that they
> get traversed and unlinked properly.

OK
 
> You need to also change
> NS_INTERFACE_MAP_BEGIN(SpeechRecognition)
> to
> NS_INTERFACE_MAP_BEGIN_CYCLE_COLLECTION_INHERITED(SpeechRecognition)

OK 

> If you're curious about cycle collection
> http://blog.kylehuey.com/post/27564411715/cycle-collection is a good blog
> post, and
> http://mxr.mozilla.org/mozilla-central/source/xpcom/base/nsCycleCollector.
> cpp#8 has some more technical information.

I'll take a look
(Assignee)

Comment 9

3 years ago
(In reply to Olli Pettay [:smaug] from comment #7)
> Comment on attachment 8636502 [details] [diff] [review]
> Part 1 of 1: Made speech recognition services language dependent and removed
> assumption of a single service
> 
> > SpeechRecognition::GetGrammars(ErrorResult& aRv) const
> > {
> >-  aRv.Throw(NS_ERROR_NOT_IMPLEMENTED);
> >-  return nullptr;
> >+  nsRefPtr<SpeechGrammarList> speechGrammarList = mSpeechGrammarList;
> >+  return speechGrammarList.forget();
> > }
> Per WebIDL .grammars is never null, but nothing guarantees
> mSpeechGrammarList isn't null.
> So, this would crash if you access speechRecognition.grammars before anyone
> has set any grammars.
> The spec is vague here, but could we do something like
> if (!mSpeechGrammarList) {
>   mSpeechGrammarList = new SpeechGrammarList(GetParentObject());
> }
> nsRefPtr<SpeechGrammarList> speechGrammarList = mSpeechGrammarList;
> return speechGrammarList.forget();
> 

Confirmed crash, but returning an empty SpeechGrammarList also seems incorrect
as it does not reflect the fact that the member SpeechRecognition.grammars has
been declared but not assigned to and thus should be "undefined" when accessing
it from JavaScript.

Is there any way I can check if mSpeechGrammarList is nullptr and then return
some special C++ value that will be translated into "undefined" in the JS
layer? Or, at worst, a "null" JS value?
(Assignee)

Comment 10

3 years ago
(In reply to kdavis from comment #8)
> (In reply to Olli Pettay [:smaug] from comment #7)
> > Comment on attachment 8636502 [details] [diff] [review]
> > Part 1 of 1: Made speech recognition services language dependent and removed
> > assumption of a single service
> > 
> > >+GetSpeechRecognitionService(const nsAString& aLang)
> > > {
> > >   nsAutoCString speechRecognitionServiceCID;
> > > 
> > >@@ -70,19 +74,23 @@ GetSpeechRecognitionService()
> > >   Preferences::GetCString(PREFERENCE_DEFAULT_RECOGNITION_SERVICE);
> > >   nsAutoCString speechRecognitionService;
> > > 
> > >-  if (!prefValue.get() || prefValue.IsEmpty()) {
> > >-    speechRecognitionService = DEFAULT_RECOGNITION_SERVICE;
> > >-  } else {
> > >+  if (!prefValue.IsEmpty()) {
> > >     speechRecognitionService = prefValue;
> > >+  } else if (!aLang.IsEmpty()) {
> > >+    speechRecognitionService =
> > >+      NS_LITERAL_CSTRING(DEFAULT_RECOGNITION_SERVICE_PREFIX) +
> > >+      NS_ConvertUTF16toUTF8(aLang);
> > >+  } else {
> > >+    speechRecognitionService = DEFAULT_RECOGNITION_SERVICE;
> > >   }
> > 
> > I would first check if (!aLang.IsEmpty()) ...
> > then the pref based value, and then DEFAULT_RECOGNITION_SERVICE.
> > Otherwise we always pick en-US on b2g, since the patch sets 
> > pocketsphinx-en-US pref.
> > 
> 
> 
> Right, don't know how I missed that.
> 

Now I see how I missed it. I was testing on desktop where the pref is not set.
(Assignee)

Comment 11

3 years ago
(In reply to kdavis from comment #10)
> (In reply to kdavis from comment #8)
> > (In reply to Olli Pettay [:smaug] from comment #7)
> > > Comment on attachment 8636502 [details] [diff] [review]
> > > Part 1 of 1: Made speech recognition services language dependent and removed
> > > assumption of a single service
> > > 
> > > >+GetSpeechRecognitionService(const nsAString& aLang)
> > > > {
> > > >   nsAutoCString speechRecognitionServiceCID;
> > > > 
> > > >@@ -70,19 +74,23 @@ GetSpeechRecognitionService()
> > > >   Preferences::GetCString(PREFERENCE_DEFAULT_RECOGNITION_SERVICE);
> > > >   nsAutoCString speechRecognitionService;
> > > > 
> > > >-  if (!prefValue.get() || prefValue.IsEmpty()) {
> > > >-    speechRecognitionService = DEFAULT_RECOGNITION_SERVICE;
> > > >-  } else {
> > > >+  if (!prefValue.IsEmpty()) {
> > > >     speechRecognitionService = prefValue;
> > > >+  } else if (!aLang.IsEmpty()) {
> > > >+    speechRecognitionService =
> > > >+      NS_LITERAL_CSTRING(DEFAULT_RECOGNITION_SERVICE_PREFIX) +
> > > >+      NS_ConvertUTF16toUTF8(aLang);
> > > >+  } else {
> > > >+    speechRecognitionService = DEFAULT_RECOGNITION_SERVICE;
> > > >   }
> > > 
> > > I would first check if (!aLang.IsEmpty()) ...
> > > then the pref based value, and then DEFAULT_RECOGNITION_SERVICE.
> > > Otherwise we always pick en-US on b2g, since the patch sets 
> > > pocketsphinx-en-US pref.
> > > 
> > 
> > 
> > Right, don't know how I missed that.
> > 
> 
> Now I see how I missed it. I was testing on desktop where the pref is not set.

Now that I think about it more I am wondering if it wouldn't be best to remove the default
value of the pref PREFERENCE_DEFAULT_RECOGNITION_SERVICE from the B2G build and leave the
code as is. Then...

1. When testing, setting PREFERENCE_DEFAULT_RECOGNITION_SERVICE could override everything.
2. If SpeechRecognition.lang is set in JS, it is used.
3. If SpeechRecognition.lang is not set in JS but the HTML lang attribute is set, it is used.
4. If none of the above are set, the default DEFAULT_RECOGNITION_SERVICE is used.

To me this seems more natural. You have the ability to force things with the preference value,
the ability to work within the normal functionality setting SpeechRecognition.lang and/or the
HTML lang attribute, and also, if all else fails, you default to DEFAULT_RECOGNITION_SERVICE.
(Assignee)

Comment 12

3 years ago
Created attachment 8641500 [details] [diff] [review]
Part 1 of 1: Made speech recognition services language dependent and removed assumption of a single service.

Part 1 of 1: Made speech recognition services language dependent and removed assumption of a single service.

Part 1 of 1 of this bug.

This patch obsoletes the previous patch and makes several changes
relative to the previous patch:

1. It removes the default value of the B2G preference
   PREFERENCE_DEFAULT_RECOGNITION_SERVICE in accord
   with Comment 11
2. Fixes formatting in accord with Comment 7
3. Adds cycle collection code in accord with Comment 7
4. Removes throw specifier from SpeechRecognition.grammars
   in accord with Comment 7
5. Removes throw specifier from SpeechRecognition.lang
   in accord with Comment 7
6. Makes SpeechRecognition.grammars nullable to overcome
   the problems of Comment 7 and Comment 9

The try for this patch is here https://treeherder.mozilla.org/#/jobs?repo=try&revision=aa797f44a12a

The full interdiff between the previous patch and this
patch is as follows:

diff -u b/b2g/app/b2g.js b/b2g/app/b2g.js
--- b/b2g/app/b2g.js
+++ b/b2g/app/b2g.js
@@ -1016,7 +1016,6 @@
 
 // Enable Web Speech recognition API
 pref("media.webspeech.recognition.enable", true);
-pref("media.webspeech.service.default", "pocketsphinx-en-US");
 
 // Downloads API
 pref("dom.mozDownloads.enabled", true);
diff -u b/dom/media/webspeech/recognition/SpeechRecognition.cpp b/dom/media/webspeech/recognition/SpeechRecognition.cpp
--- b/dom/media/webspeech/recognition/SpeechRecognition.cpp
+++ b/dom/media/webspeech/recognition/SpeechRecognition.cpp
@@ -84,7 +84,7 @@
     speechRecognitionService = DEFAULT_RECOGNITION_SERVICE;
   }
 
-  if (SpeechRecognition::mTestConfig.mFakeRecognitionService){
+  if (SpeechRecognition::mTestConfig.mFakeRecognitionService) {
     speechRecognitionServiceCID =
       NS_SPEECH_RECOGNITION_SERVICE_CONTRACTID_PREFIX "fake";
   } else {
@@ -99,7 +99,9 @@
   return recognitionService.forget();
 }
 
-NS_INTERFACE_MAP_BEGIN(SpeechRecognition)
+NS_IMPL_CYCLE_COLLECTION_INHERITED(SpeechRecognition, DOMEventTargetHelper, mDOMStream, mSpeechGrammarList)
+
+NS_INTERFACE_MAP_BEGIN_CYCLE_COLLECTION_INHERITED(SpeechRecognition)
   NS_INTERFACE_MAP_ENTRY(nsIObserver)
 NS_INTERFACE_MAP_END_INHERITING(DOMEventTargetHelper)
 
@@ -628,30 +630,28 @@
 }
 
 already_AddRefed<SpeechGrammarList>
-SpeechRecognition::GetGrammars(ErrorResult& aRv) const
+SpeechRecognition::GetGrammars() const
 {
   nsRefPtr<SpeechGrammarList> speechGrammarList = mSpeechGrammarList;
   return speechGrammarList.forget();
 }
 
 void
-SpeechRecognition::SetGrammars(SpeechGrammarList& aArg, ErrorResult& aRv)
+SpeechRecognition::SetGrammars(SpeechGrammarList* aArg)
 {
-  mSpeechGrammarList = &aArg;
+  mSpeechGrammarList = aArg;
 }
 
 void
-SpeechRecognition::GetLang(nsString& aRetVal, ErrorResult& aRv) const
+SpeechRecognition::GetLang(nsString& aRetVal) const
 {
   aRetVal = mLang;
-  return;
 }
 
 void
-SpeechRecognition::SetLang(const nsAString& aArg, ErrorResult& aRv)
+SpeechRecognition::SetLang(const nsAString& aArg)
 {
   mLang = aArg;
-  return;
 }
 
 bool
@@ -750,7 +750,9 @@
   NS_DispatchToMainThread(event);
 }
 
-bool SpeechRecognition::SetRecognitionService(ErrorResult& aRv) {
+bool
+SpeechRecognition::SetRecognitionService(ErrorResult& aRv)
+{
   // See: https://dvcs.w3.org/hg/speech-api/raw-file/tip/webspeechapi.html#dfn-lang
   if (!mLang.IsEmpty()) {
     mRecognitionService = GetSpeechRecognitionService(mLang);
@@ -791,8 +793,10 @@
   return true;
 }
 
-bool SpeechRecognition::ValidateAndSetGrammarList(ErrorResult& aRv) {
-  if (nullptr == mSpeechGrammarList) {
+bool
+SpeechRecognition::ValidateAndSetGrammarList(ErrorResult& aRv)
+{
+  if (!mSpeechGrammarList) {
     aRv.Throw(NS_ERROR_DOM_INVALID_STATE_ERR);
     return false;
   }
@@ -804,11 +808,11 @@
   }
 
   for (uint32_t count = 0; count < grammarListLength; ++count) {
-    nsRefPtr<SpeechGrammar> speechGrammar = mSpeechGrammarList->Item(count,aRv);
+    nsRefPtr<SpeechGrammar> speechGrammar = mSpeechGrammarList->Item(count, aRv);
     if (aRv.Failed()) {
       return false;
     }
-    if (NS_OK != mRecognitionService->ValidateAndSetGrammarList(speechGrammar.get(), nullptr)) {
+    if (NS_FAILED(mRecognitionService->ValidateAndSetGrammarList(speechGrammar.get(), nullptr))) {
       aRv.Throw(NS_ERROR_DOM_INVALID_STATE_ERR);
       return false;
     }
diff -u b/dom/media/webspeech/recognition/SpeechRecognition.h b/dom/media/webspeech/recognition/SpeechRecognition.h
--- b/dom/media/webspeech/recognition/SpeechRecognition.h
+++ b/dom/media/webspeech/recognition/SpeechRecognition.h
@@ -58,6 +58,7 @@
   explicit SpeechRecognition(nsPIDOMWindow* aOwnerWindow);
 
   NS_DECL_ISUPPORTS_INHERITED
+  NS_DECL_CYCLE_COLLECTION_CLASS_INHERITED(SpeechRecognition, DOMEventTargetHelper)
 
   NS_DECL_NSIOBSERVER
 
@@ -70,13 +71,13 @@
   static already_AddRefed<SpeechRecognition>
   Constructor(const GlobalObject& aGlobal, ErrorResult& aRv);
 
-  already_AddRefed<SpeechGrammarList> GetGrammars(ErrorResult& aRv) const;
+  already_AddRefed<SpeechGrammarList> GetGrammars() const;
 
-  void SetGrammars(mozilla::dom::SpeechGrammarList& aArg, ErrorResult& aRv);
+  void SetGrammars(mozilla::dom::SpeechGrammarList* aArg);
 
-  void GetLang(nsString& aRetVal, ErrorResult& aRv) const;
+  void GetLang(nsString& aRetVal) const;
 
-  void SetLang(const nsAString& aArg, ErrorResult& aRv);
+  void SetLang(const nsAString& aArg);
 
   bool GetContinuous(ErrorResult& aRv) const;
 
only in patch2:
unchanged:
--- a/dom/webidl/SpeechRecognition.webidl
+++ b/dom/webidl/SpeechRecognition.webidl
@@ -10,19 +10,17 @@
  * liability, trademark and document use rules apply.
  */
 
 [Constructor,
  Pref="media.webspeech.recognition.enable",
  Func="SpeechRecognition::IsAuthorized"]
 interface SpeechRecognition : EventTarget {
     // recognition parameters
-    [Throws]
-    attribute SpeechGrammarList grammars;
-    [Throws]
+    attribute SpeechGrammarList? grammars;
     attribute DOMString lang;
     [Throws]
     attribute boolean continuous;
     [Throws]
     attribute boolean interimResults;
     [Throws]
     attribute unsigned long maxAlternatives;
     [Throws]
Attachment #8636502 - Attachment is obsolete: true
Attachment #8641500 - Flags: review?(bugs)
Comment on attachment 8641500 [details] [diff] [review]
Part 1 of 1: Made speech recognition services language dependent and removed assumption of a single service.


>-  if (!prefValue.get() || prefValue.IsEmpty()) {
>-    speechRecognitionService = DEFAULT_RECOGNITION_SERVICE;
>-  } else {
>+  if (!prefValue.IsEmpty()) {
>     speechRecognitionService = prefValue;
>+  } else if (!aLang.IsEmpty()) {
>+    speechRecognitionService =
>+      NS_LITERAL_CSTRING(DEFAULT_RECOGNITION_SERVICE_PREFIX) +
>+      NS_ConvertUTF16toUTF8(aLang);
>+  } else {
>+    speechRecognitionService = DEFAULT_RECOGNITION_SERVICE;
>   }
> 
>-  if (!SpeechRecognition::mTestConfig.mFakeRecognitionService){
>+  if (SpeechRecognition::mTestConfig.mFakeRecognitionService) {
>     speechRecognitionServiceCID =
>-      NS_LITERAL_CSTRING(NS_SPEECH_RECOGNITION_SERVICE_CONTRACTID_PREFIX) +
>-      speechRecognitionService;
>+      NS_SPEECH_RECOGNITION_SERVICE_CONTRACTID_PREFIX "fake";
>   } else {
>     speechRecognitionServiceCID =
>-      NS_SPEECH_RECOGNITION_SERVICE_CONTRACTID_PREFIX "fake";
>+      NS_LITERAL_CSTRING(NS_SPEECH_RECOGNITION_SERVICE_CONTRACTID_PREFIX) +
>+      speechRecognitionService;
>   }
What about the localization case in Firefox? If we enable pocketsphinx there, I could imagine the Preferences UI having some way to change the default recognition lang or something.
So I would use the following order
1. lang
2. pref
3. default (which is en-US)


>+  nsCOMPtr<nsPIDOMWindow> window = GetOwner();
>+  if(!window) {
>+    aRv.Throw(NS_ERROR_DOM_INVALID_STATE_ERR);
>+    return false;
>+  }
>+  nsCOMPtr<nsIDocument> document = window->GetDoc();
You actually want GetExtantDoc() here.

> interface SpeechRecognition : EventTarget {
>     // recognition parameters
>-    [Throws]
>-    attribute SpeechGrammarList grammars;
>-    [Throws]
>+    attribute SpeechGrammarList? grammars;
You made .grammars nullable. That does make sense to me, but that isn't what the spec says.
So, file a spec bug - just so that the issue is recorded somewhere. I wonder how to get the spec updated these days.
Or change GetGrammars to be what I said in comment 8; either way.
(Returning undefined here would be wrong per the WebIDL spec; null is OK.)


The crash when you return null where the .webidl doesn't expect null is largely because our WebIDL bindings are heavily optimized; any extra checks or additional code paths would make things slower.
Attachment #8641500 - Flags: review?(bugs) → review+
(In reply to kdavis from comment #11)
> 1. When testing, setting PREFERENCE_DEFAULT_RECOGNITION_SERVICE could
> override everything.
> 2. If SpeechRecognition.lang is set in JS, it is used.
> 3. If SpeechRecognition.lang is not set in JS but the HTML lang attribute is
> set, it is used.
> 4. If none of the above are set, the default DEFAULT_RECOGNITION_SERVICE is
> used.
> 
> To me this seems more natural. You have the ability to force things with the
> preference value,
> the ability to work within the normal functionality setting
> SpeechRecognition.lang and/or the
> HTML lang attribute, and also, if all else fails, you default to
> DEFAULT_RECOGNITION_SERVICE.

In other words, I disagree with this. Why should the pref override .lang?

Hmm, we hardcode the recognition service here anyway.
Shouldn't we always use the value from "media.webspeech.service.default", and have a separate pref for the default language?
(Assignee)

Comment 15

3 years ago
(In reply to Olli Pettay [:smaug] from comment #13)
> Comment on attachment 8641500 [details] [diff] [review]
> Part 1 of 1: Made speech recognition services language dependent and removed
> assumption of a single service.
> 
> 
> >-  if (!prefValue.get() || prefValue.IsEmpty()) {
> >-    speechRecognitionService = DEFAULT_RECOGNITION_SERVICE;
> >-  } else {
> >+  if (!prefValue.IsEmpty()) {
> >     speechRecognitionService = prefValue;
> >+  } else if (!aLang.IsEmpty()) {
> >+    speechRecognitionService =
> >+      NS_LITERAL_CSTRING(DEFAULT_RECOGNITION_SERVICE_PREFIX) +
> >+      NS_ConvertUTF16toUTF8(aLang);
> >+  } else {
> >+    speechRecognitionService = DEFAULT_RECOGNITION_SERVICE;
> >   }
> > 
> >-  if (!SpeechRecognition::mTestConfig.mFakeRecognitionService){
> >+  if (SpeechRecognition::mTestConfig.mFakeRecognitionService) {
> >     speechRecognitionServiceCID =
> >-      NS_LITERAL_CSTRING(NS_SPEECH_RECOGNITION_SERVICE_CONTRACTID_PREFIX) +
> >-      speechRecognitionService;
> >+      NS_SPEECH_RECOGNITION_SERVICE_CONTRACTID_PREFIX "fake";
> >   } else {
> >     speechRecognitionServiceCID =
> >-      NS_SPEECH_RECOGNITION_SERVICE_CONTRACTID_PREFIX "fake";
> >+      NS_LITERAL_CSTRING(NS_SPEECH_RECOGNITION_SERVICE_CONTRACTID_PREFIX) +
> >+      speechRecognitionService;
> >   }
> What about localization case in Firefox. If we enable pocketsphinx there, I
> could imagine Preferences UI to have some way to change the default
> recognition lang or something.
> So I would use the following order
> 1. lang
> 2. pref
> 3. default (which is en-US)

I'm fine with this order, with a pref for the default.

But, the pref will come in a future patch.

> 
> 
> >+  nsCOMPtr<nsPIDOMWindow> window = GetOwner();
> >+  if(!window) {
> >+    aRv.Throw(NS_ERROR_DOM_INVALID_STATE_ERR);
> >+    return false;
> >+  }
> >+  nsCOMPtr<nsIDocument> document = window->GetDoc();
> You actually want GetExtantDoc() here.

OK.

> 
> > interface SpeechRecognition : EventTarget {
> >     // recognition parameters
> >-    [Throws]
> >-    attribute SpeechGrammarList grammars;
> >-    [Throws]
> >+    attribute SpeechGrammarList? grammars;
> You made .grammars nullable. That does make sense to me, but that isn't what
> the spec says.
> So, file a spec bug - just that the issue is recorded somewhere. I wonder
> how to get the spec updated these days.
> Or change GetGrammar to be what I said in comment 8. either way.
> (About returning undefined here would be wrong per Webidl spec. null is ok.)

To adhere to the spec, I'll add code as in comment 8 with a non-nullable grammars attribute.
 
> And the crash if you return null when .webidl doesn't expect null is largely
> because our webidl bindings are
> super optimized, where any extra checks or more possible code paths make
> things slower.

Understood.
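The resolution agreed on here can be sketched with plain C++ stand-ins (not the real Gecko classes, which use nsRefPtr/already_AddRefed): initialize the member to an empty list in the constructor, so a non-nullable getter can never return null.

```cpp
#include <memory>

// Stand-in for dom::SpeechGrammarList; the real class holds grammar data.
struct SpeechGrammarList {};

class SpeechRecognition {
public:
  // Initialize grammars to an empty list so it is non-null by construction.
  SpeechRecognition()
    : mSpeechGrammarList(std::make_shared<SpeechGrammarList>()) {}

  // Mirrors the non-nullable `attribute SpeechGrammarList grammars`:
  // always returns a live object, never null.
  std::shared_ptr<SpeechGrammarList> Grammars() const {
    return mSpeechGrammarList;
  }

  // Setter takes a reference, so callers cannot store null either.
  void SetGrammars(SpeechGrammarList& aList) {
    mSpeechGrammarList = std::shared_ptr<SpeechGrammarList>(
        &aList, [](SpeechGrammarList*) {});  // non-owning in this sketch
  }

private:
  std::shared_ptr<SpeechGrammarList> mSpeechGrammarList;
};
```

This keeps the WebIDL attribute non-nullable, matching the spec, while avoiding the null-return crash discussed above.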
(Assignee)

Comment 16

3 years ago
(In reply to Olli Pettay [:smaug] from comment #14)
> (In reply to kdavis from comment #11)
> > 1. When testing, setting PREFERENCE_DEFAULT_RECOGNITION_SERVICE could
> > override everything.
> > 2. If SpeechRecognition.lang is set in JS, it is used.
> > 3. If SpeechRecognition.lang is not set in JS but the HTML lang attribute is
> > set, it is used.
> > 4. If none of the above are set, the default DEFAULT_RECOGNITION_SERVICE is
> > used.
> > 
> > To me this seems more natural. You have the ability to force things with the
> > preference value,
> > the ability to work within the normal functionality setting
> > SpeechRecognition.lang and/or the
> > HTML lang attribute, and also, if all else fails, you default to
> > DEFAULT_RECOGNITION_SERVICE.
> 
> In other words, I disagree with this. Why should the pref override .lang?
> 
> Hmm, we hardcode recognition service here anyway.
> Shouldn't we always use the value from "media.webspeech.service.default",
> and then have a separate pref for the default language

I'm fine with the order you suggest

1. lang
2. pref
3. default

with a separate pref for default, to be added in another commit.
(Assignee)

Comment 17

3 years ago
Created attachment 8642200 [details] [diff] [review]
Part 1 of 1: Made speech recognition services language dependent and removed assumption of a single service

Part 1 of 1 of this bug.

This patch modifies the previous patch with the changes suggested in comment 13.

In particular this patch, relative to the previous patch:

1. Changes the fallback order for the service name to aLang, pref, and default.
2. Makes SpeechRecognition.grammars non-nullable.
3. Initializes SpeechRecognition.grammars to an empty SpeechGrammarList.
4. Gets the window document with GetExtantDoc() instead of GetDoc().

The try run is here: https://treeherder.mozilla.org/#/jobs?repo=try&revision=2ee1a6097272

The interdiff between this patch and the last is

diff -u b/dom/media/webspeech/recognition/SpeechRecognition.cpp b/dom/media/webspeech/recognition/SpeechRecognition.cpp
--- b/dom/media/webspeech/recognition/SpeechRecognition.cpp
+++ b/dom/media/webspeech/recognition/SpeechRecognition.cpp
@@ -74,12 +74,12 @@
   Preferences::GetCString(PREFERENCE_DEFAULT_RECOGNITION_SERVICE);
   nsAutoCString speechRecognitionService;
 
-  if (!prefValue.IsEmpty()) {
-    speechRecognitionService = prefValue;
-  } else if (!aLang.IsEmpty()) {
+  if (!aLang.IsEmpty()) {
     speechRecognitionService =
       NS_LITERAL_CSTRING(DEFAULT_RECOGNITION_SERVICE_PREFIX) +
       NS_ConvertUTF16toUTF8(aLang);
+  } else if (!prefValue.IsEmpty()) {
+    speechRecognitionService = prefValue;
   } else {
     speechRecognitionService = DEFAULT_RECOGNITION_SERVICE;
   }
@@ -115,7 +115,7 @@
   , mEndpointer(kSAMPLE_RATE)
   , mAudioSamplesPerChunk(mEndpointer.FrameSize())
   , mSpeechDetectionTimer(do_CreateInstance(NS_TIMER_CONTRACTID))
-  , mSpeechGrammarList(nullptr)
+  , mSpeechGrammarList(new SpeechGrammarList(GetParentObject()))
 {
   SR_LOG("created SpeechRecognition");
 
@@ -630,16 +630,16 @@
 }
 
 already_AddRefed<SpeechGrammarList>
-SpeechRecognition::GetGrammars() const
+SpeechRecognition::Grammars() const
 {
   nsRefPtr<SpeechGrammarList> speechGrammarList = mSpeechGrammarList;
   return speechGrammarList.forget();
 }
 
 void
-SpeechRecognition::SetGrammars(SpeechGrammarList* aArg)
+SpeechRecognition::SetGrammars(SpeechGrammarList& aArg)
 {
-  mSpeechGrammarList = aArg;
+  mSpeechGrammarList = &aArg;
 }
 
 void
@@ -770,7 +770,7 @@
     aRv.Throw(NS_ERROR_DOM_INVALID_STATE_ERR);
     return false;
   }
-  nsCOMPtr<nsIDocument> document = window->GetDoc();
+  nsCOMPtr<nsIDocument> document = window->GetExtantDoc();
   if(!document) {
     aRv.Throw(NS_ERROR_DOM_INVALID_STATE_ERR);
     return false;
diff -u b/dom/media/webspeech/recognition/SpeechRecognition.h b/dom/media/webspeech/recognition/SpeechRecognition.h
--- b/dom/media/webspeech/recognition/SpeechRecognition.h
+++ b/dom/media/webspeech/recognition/SpeechRecognition.h
@@ -71,9 +71,9 @@
   static already_AddRefed<SpeechRecognition>
   Constructor(const GlobalObject& aGlobal, ErrorResult& aRv);
 
-  already_AddRefed<SpeechGrammarList> GetGrammars() const;
+  already_AddRefed<SpeechGrammarList> Grammars() const;
 
-  void SetGrammars(mozilla::dom::SpeechGrammarList* aArg);
+  void SetGrammars(mozilla::dom::SpeechGrammarList& aArg);
 
   void GetLang(nsString& aRetVal) const;
 
diff -u b/dom/webidl/SpeechRecognition.webidl b/dom/webidl/SpeechRecognition.webidl
--- b/dom/webidl/SpeechRecognition.webidl
+++ b/dom/webidl/SpeechRecognition.webidl
@@ -15,7 +15,7 @@
  Func="SpeechRecognition::IsAuthorized"]
 interface SpeechRecognition : EventTarget {
     // recognition parameters
-    attribute SpeechGrammarList? grammars;
+    attribute SpeechGrammarList grammars;
     attribute DOMString lang;
     [Throws]
     attribute boolean continuous;
Attachment #8641500 - Attachment is obsolete: true
Attachment #8642200 - Flags: review+
(Assignee)

Updated

3 years ago
Keywords: checkin-needed
https://hg.mozilla.org/mozilla-central/rev/ee7df762c195

FWIW, you probably don't need "Part 1 of 1" in the commit message :)
https://developer.mozilla.org/en-US/docs/Mercurial/Using_Mercurial#Commit_Message_Conventions
Status: NEW → RESOLVED
Last Resolved: 3 years ago
status-firefox42: --- → fixed
Flags: in-testsuite+
Resolution: --- → FIXED
Target Milestone: --- → mozilla42
You need to log in before you can comment on or make changes to this bug.