Closed Bug 650295 Opened 13 years ago Closed 11 years ago

Add support for speech input

Categories

(Core :: General, enhancement)


Tracking


RESOLVED FIXED
mozilla22

People

(Reporter: icecold, Assigned: ggp)

References

(Blocks 1 open bug)

Details

(Keywords: dev-doc-needed)

Attachments

(2 files, 12 obsolete files)

124.39 KB, patch
ggp
: review+
johns
: checkin+
Details | Diff | Splinter Review
193.16 KB, patch
ggp
: review+
johns
: checkin+
Details | Diff | Splinter Review
User-Agent:       Mozilla/5.0 (Windows NT 6.1; rv:6.0a1) Gecko/20110414 Firefox/6.0a1
Build Identifier: 

"With this API, developers can give web apps the ability to transcribe your voice to text. When a web page uses this feature, you simply click on an icon and then speak into your computer’s microphone. The recorded audio is sent to speech servers for transcription, after which the text is typed out for you."

Google implemented support in Chrome and made it available on Google Translate. (You need the Chrome Canary build.)

Reproducible: Always
Severity: normal → enhancement
I know this is something people have been talking about, but I couldn't find an existing bug, so confirming.
Status: UNCONFIRMED → NEW
Ever confirmed: true
This is probably better addressed to the people at Google who somehow think that this would be a good addition to the HTML specs...

How exactly would this interact (or interfere) with the speech recognition that is already present on a user's machine?

On Windows, all versions since XP have sophisticated built-in speech recognition capabilities (capable of adapting and training to each person's individual voice and diction) that already allow users to dictate text into any location where keyboard input is accepted (including form inputs in any browser).  And although other operating systems, to my knowledge, do not have built-in dictation like Windows does, there is third-party software that does the same thing.

Also, if the lack of popularity of the recognition feature in Windows is any indication, I don't think this is a feature that will have legs in the desktop world (mobile is probably a very different story, of course).
http://lists.w3.org/Archives/Public/public-xg-htmlspeech/2011Feb/att-0020/api-draft.html is just one *proposed* API, and it has severe problems, so it should not be implemented as such.
I have proposed a different API, and Microsoft has proposed a few APIs.
Summary: Add support for HTML speech input API → Add support for speech input and tts output API
Assignee: nobody → roshanvid
Status: NEW → ASSIGNED
OS: Windows 7 → All
Hardware: x86 → All
Version: unspecified → Trunk
Be on the lookout for duplicates of this from users unable to "Search Out Loud" in Firefox now that this is live =S
Reference: http://www.google.com/insidesearch/voicesearch-chrome.html
Summary: Add support for speech input and tts output API → Add support for speech input and tts output API (used for voice search)
Summary: Add support for speech input and tts output API (used for voice search) → Add support for speech input and tts output API
Blocks: webrtc
Updating summary. Note TTS already filed as bug 525444.
Summary: Add support for speech input and tts output API → Add support for speech input
What is the status on this bug?

I have been using speech input in Chrome to great effect for voice commands, searching data without typing, and game input.

However, I'd like to see an improved (and more open) implementation for Gecko.  We should be able to specify the server the speech input gets sent to for processing.  Being locked into Google's speech-to-text servers won't cut it, especially since their Translate API previously went down and came back as a paid service.
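For illustration, the draft Speech API has a hook for exactly this: a serviceURI attribute on SpeechRecognition. A minimal sketch, assuming the attribute name from the draft (the endpoint URL is made up, and whether a given engine honors the attribute is implementation-defined):

// Point recognition at a custom speech-to-text server.
var recognition = new SpeechRecognition();
recognition.serviceURI = "https://speech.example.org/recognize"; // hypothetical endpoint
recognition.start();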

Google's server-side code also doesn't seem to handle background noise very well, and there's no way for us to fix this due to its closed-source nature.

Further, the way it's implemented in Chromium is especially cumbersome for game input.  Games are keyboard-driven applications, and forcing the user to click a microphone icon with the mouse every time is a hassle and poor design.  I therefore propose that pages be able to "optionally" request permission to process speech input automatically.  Here's the bug I filed at Chromium:

http://code.google.com/p/chromium/issues/detail?id=95523
Depends on: 746536
I have also been using speech input in Chrome with great effectiveness. I use forums all the time, and it would be wonderful if text input boxes could accept speech input in Firefox.

Here's an example website that shows how to allow speech input using Chrome.

http://www.mybloggertricks.com/2012/05/add-free-speech-recognition-to-your.html

Please add this HTML5 feature to Firefox.
http://www.mybloggertricks.com/2012/05/add-free-speech-recognition-to-your.html describes
a Chrome-only, non-standard (and to-be-removed) way to add speech input.
It has nothing to do with the HTML(5) specification.
Thank you for the clarification, Olli Pettay. I was under the impression it was an HTML5 feature.

At any rate it would be wonderful to see a similar implementation with Firefox as it's a great feature.
Apparently the Chrome implementation of speech input uses HTML5 features.

http://webdesign.about.com/od/html5tutorials/qt/Collect-Speech-Input-With-Html5-On-Google-Chrome.htm
As I said, it has nothing to do with HTML5, or with any specification in fact.
The API described in http://webdesign.about.com/od/html5tutorials/qt/Collect-Speech-Input-With-Html5-On-Google-Chrome.htm is Google's proprietary API, and will hopefully be changed to
something closer to http://www.w3.org/2005/Incubator/htmlspeech/XGR-htmlspeech-20111206/
Thank you once again Olli. 

I just hope that Firefox will have speech-to-text available sometime soon. Being able to fill out text input forms like the one I'm typing in now using speech would be very helpful.
I think this bug should cover this spec:
http://dvcs.w3.org/hg/speech-api/raw-file/tip/speechapi.html

In addition to the above spec, I think it should be well integrated with MediaStream objects. See: http://lists.w3.org/Archives/Public/public-speech-api/2012Jun/0072.html

As for speech engines and implementations, I think it should be engine-agnostic, meaning many potential engines should be supported; some examples:
1. Platform services (SAPI on windows, and similar on Linux, Mac, and Android).
2. Remote services, like Nuance or Google (translate).
3. Local JS and native implementations (how and if they would be bundled with Firefox is debatable).

The implementation "glue" should be available in chrome-access JS, so a JS object would just need to implement an interface, and register the service object with a privileged call.
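For context, here's a minimal sketch of how a page would drive the API in that spec. The constructor, attributes, and event names are taken from the draft; the result shape shown is my reading of it, and an implementation may gate the whole thing behind a pref:

// Minimal use of the draft SpeechRecognition API.
var recognition = new SpeechRecognition();
recognition.lang = "en-US";
recognition.maxAlternatives = 3;

recognition.onresult = function(event) {
  // Each result holds one or more alternatives, ranked by confidence.
  var best = event.results[0][0];
  console.log("Heard: " + best.transcript +
              " (confidence " + best.confidence + ")");
};

recognition.onerror = function(event) {
  console.log("Recognition error: " + event.error);
};

// Prompts for microphone access, then streams audio to the
// configured recognition service.
recognition.start();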
I think Nikhil did implement something along these lines.
(In reply to Eitan Isaacson [:eeejay] from comment #14)
> I think this bug should cover this spec:
> http://dvcs.w3.org/hg/speech-api/raw-file/tip/speechapi.html
> 
> In addition to the above spec, I think it should be well integrated with
> MediaStream objects. See:
> http://lists.w3.org/Archives/Public/public-speech-api/2012Jun/0072.html
> 
> As for speech engines and implementations, I think it should be agnostic.
> Meaning, many potential engines should be supported, some examples:
> 1. Platform services (SAPI on windows, and similar on Linux, Mac, and
> Android).
> 2. Remote services, like Nuance or Google (translate).
> 3. Local JS and native implementations (how and if they would be bundled
> with Firefox is debatable).

Work on this is already underway.

> 
> The implementation "glue" should be available in chrome-access JS, so a JS
> object would just need to implement an interface, and register the service
> object with a privileged call.

No immediate plan for exposing to JS since much of the underlying API is not
directly exposed to JS and we don't expect there to be a ton of speech-to-text providers.
(In reply to Fabrice Desré [:fabrice] from comment #15)
> I think Nikhil did implement something along these lines.

Oops! wrong bug, I meant to post this in bug #525444. Nikhil, is your work public somewhere?
(In reply to Nikhil Marathe from comment #16)
> (In reply to Eitan Isaacson [:eeejay] from comment #14)
> > 
> > The implementation "glue" should be available in chrome-access JS, so a JS
> > object would just need to implement an interface, and register the service
> > object with a privileged call.
> 
> No immediate plan for exposing to JS since much of the underlying API is
> not directly exposed to JS and we don't expect there to be a ton of
> speech-to-text providers.

This makes sense, for now. On the TTS front, I think supporting more than one engine is important, since each engine has different strengths and varying language support, and the user can have different preferences. This may be true one day for speech recognition too, but for now there is not much out there.
> Oops! wrong bug, I meant to post this in bug #525444. Nikhil, is your work public somewhere?

Parts can be shared immediately.  Other parts, not so much -- we started using (for prototyping) Google's service, which we may or may not be able to share right now (I am not sure if Chromium uses it or just Chrome).  nsm, can you investigate this?
Hm, Chromium uses[1] the HTTP API, yes, and there are lots of examples of how to use it on the Web, but I couldn't find anything official from Google saying whether the API is supposed to be used by the public. The closest thing I found was this message in the Google Product Forums:

http://productforums.google.com/forum/#!topic/chrome/U02jzA32IGc
>The documentation and terms of service for this HTTP service are not made publicly
>available because this service is not intended for public use.
>The Google Speech HTTP API is intended for use from Chrome and should not be used
>by third parties without explicit permission from Google.
>
>Third parties who want to access this API can request access by emailing a summary
>of their intended application, together with a business case for why Google should
>support their use of this service, to "google-speech-api" at our company's domain
>(google.com).

[1] http://src.chromium.org/viewvc/chrome/trunk/src/content/browser/speech/google_one_shot_remote_engine.cc?view=markup
Assignee: roshanvid → ggoncalves
This patch implements the state machine that handles recording audio, creating the backend recognition service (through a factory mechanism), detecting the end of speech, and issuing the events and errors associated with the results of the recognition.

The DOM objects associated with the recognition (result list, alternatives, errors, and events) are also implemented.
Attachment #712684 - Flags: review?(bugs)
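As a rough illustration of what this state machine produces from a page's point of view, a test or demo could log the session lifecycle like this (event names per the draft spec; the exact ordering is an assumption):

// Log the DOM events fired over one recognition session.
var sr = new SpeechRecognition();
["start", "audiostart", "speechstart", "speechend",
 "audioend", "result", "error", "end"].forEach(function(name) {
  sr.addEventListener(name, function() {
    console.log("SpeechRecognition event: " + name);
  });
});
sr.start();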
This patch implements tests for the finite state machine introduced in the previous patch. It uses the nsIObserver interface implemented by the main SpeechRecognition object to request artificial state machine events (simulating, for example, a response from the recognition service), and then checks whether the expected DOM events are raised.
Attachment #712689 - Flags: review?(bugs)
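A hypothetical sketch of that pattern from a mochitest. The observer topic the implementation listens on (SPEECH_RECOGNITION_TEST_EVENT_REQUEST_TOPIC) isn't spelled out in the excerpts here, so the string below is a placeholder; the fake event name is taken from the patch:

// Drive the state machine with a faked event, then assert on the DOM
// event it should eventually produce.
const FAKE_EVENT_TOPIC = "speech-recognition-test:request-event"; // placeholder

var sr = new SpeechRecognition();
sr.onstart = function() {
  ok(true, "state machine reacted to the faked EVENT_START");
  SimpleTest.finish();
};

var obs = SpecialPowers.Cc["@mozilla.org/observer-service;1"]
                       .getService(SpecialPowers.Ci.nsIObserverService);
obs.notifyObservers(null, FAKE_EVENT_TOPIC, "EVENT_START");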
Comment on attachment 712684 [details] [diff] [review]
Bug 650295 - Implement the main state machine for Speech Recognition


>+// The atoms below are used by webspeech
>+// but already defined by other APIs
>+// GK_ATOM(onerror, "onerror")
>+// GK_ATOM(onend, "onend")
No need to add this kind of comment
>+// This is a helper class which enables Web Speech to be enabled or disabled
>+// as whole.  Individual Web Speech object classes should inherit from this.
>+class EnableWebSpeechCheck {
Nit, { should go to the next line

>+SpeechGrammar::SpeechGrammar(nsISupports *parent) : mParent(parent)
nsISupport* aParent

>+SpeechGrammar::GetSrc(nsString& retval) const
aRetVal

>+{
>+  NS_ABORT_IF_FALSE(0, "not implemented");
>+}
>+
>+void
>+SpeechGrammar::SetSrc(const nsAString& arg)
>+{
>+  NS_ABORT_IF_FALSE(0, "not implemented");
>+}
>+
>+float
>+SpeechGrammar::Weight() const
>+{
>+  NS_ABORT_IF_FALSE(0, "not implemented");
>+  return 0;
>+}
>+
>+void
>+SpeechGrammar::SetWeight(float arg)
>+{
>+  NS_ABORT_IF_FALSE(0, "not implemented");
>+}
Better to throw for now. NS_ABORT is quite strong.

>+public:
>+  SpeechGrammar(nsISupports *parent);
nsISupports* aParent


>+  static SpeechGrammar* Constructor(const GlobalObject& aGlobal, ErrorResult& aRv);
>+
>+  void GetSrc(nsString& retval) const;
aRetVal


>+
>+  void SetSrc(const nsAString& arg);
aArg


>+  void SetWeight(float arg);
aArg


>+private:
>+  nsCOMPtr<nsISupports> mParent;
You need to add mParent to cycle collection. So in .cpp file
NS_IMPL_CYCLE_COLLECTION_WRAPPERCACHE_0(SpeechGrammar) ->
NS_IMPL_CYCLE_COLLECTION_WRAPPERCACHE_1(SpeechGrammar, mParent)

>+
>+NS_IMPL_CYCLE_COLLECTION_WRAPPERCACHE_0(SpeechGrammarList)
Also this should traverse object's mParent


>+already_AddRefed<SpeechGrammar>
>+SpeechGrammarList::Item(uint32_t index)
aIndex


>+SpeechGrammarList::AddFromURI(const nsAString &src, const Optional< float >& weight)
const nsAString& aSrc, const Optional<float>& aWeight


>+SpeechGrammarList::AddFromString(const nsAString& string, const Optional< float >& weight)
const nsAString& aString, const Optional<float>& aWeight


>+{
>+  JS_ASSERT(0);
No JS_* stuff unless absolutely needed. We have MOZ_ASSERT.
Also, assert is perhaps too strong for now. Better to throw, or just return something dummy for now?

>+already_AddRefed<SpeechGrammar>
>+SpeechGrammarList::IndexedGetter(uint32_t index, bool& present)
aIndex, aPresent

>+public:
>+  SpeechGrammarList(nsISupports *parent);
nsISupports* aParent

>+  // Mark this as resultNotAddRefed to return raw pointers
Leftover from the binding example generator ?

>+  already_AddRefed<SpeechGrammar> Item(uint32_t index);
aIndex

>+
>+  void AddFromURI(const nsAString& src, const Optional< float >& weight);
>+
>+  void AddFromString(const nsAString& string, const Optional< float >& weight);
>+
>+  already_AddRefed<SpeechGrammar> IndexedGetter(uint32_t index, bool& present);
Check the coding style of parameters.
If the example binding codegen uses wrong coding style, file a bug to get that fixed.


>+GetSpeechRecognitionLog()
>+{
>+  static PRLogModuleInfo *sLog;
>+  if (!sLog)
>+    sLog = PR_NewLogModule("SpeechRecognition");
if (expr) {
  stmt;
}



>+SpeechRecognition::SpeechRecognition()
>+  : mProcessingEvent(false)
>+  , mEndpointer(new Endpointer(kSAMPLE_RATE))
>+  , mSpeechDetectionTimer(do_CreateInstance(NS_TIMER_CONTRACTID))
>+{
>+  SR_LOG("created SpeechRecognition");
>+  SetIsDOMBinding();
>+  nsCOMPtr<nsIObserverService> obs = services::GetObserverService();
>+  obs->AddObserver(this, "xpcom-shutdown", false);
>+  obs->AddObserver(this, "getUserMedia:response:allow", false);
>+  obs->AddObserver(this, "getUserMedia:response:deny", false);
So these AddObserver calls mean that all the SpeechRecognition objects are kept alive
until xpcom-shutdown.



>+already_AddRefed<SpeechRecognition>
>+SpeechRecognition::Constructor(const GlobalObject& aGlobal, ErrorResult& aRv)
>+{
>+  nsCOMPtr<nsPIDOMWindow> win = do_QueryInterface(aGlobal.Get());
>+  if (!win) {
>+    aRv.Throw(NS_ERROR_FAILURE);
>+  }
>+  MOZ_ASSERT(win->IsInnerWindow());
>+  SpeechRecognition* object = new SpeechRecognition();
>+  NS_ADDREF(object);
>+  object->BindToOwner(win);
>+  return object;
>+}
nsRefPtr<SpeechRecognition> object = new SpeechRecognition();
object->BindToOwner(win);
return object.forget();



>+SpeechRecognition::ProcessEvent(SpeechEvent *event)
>+{
>+  SR_LOG("Processing event %d", event->mType);
>+  NS_ASSERTION(NS_IsMainThread(), "Should be running on main thread");
Not sure how useful this assertion is. If a cycle collectable object is used on a non-main thread,
we'll crash soon and hard.

>+MediaOperationRunnable *
>+SpeechRecognition::MakeMediaOperationRunnable(MediaOperation aType)
>+{
>+  nsIThread *mt;
>+
>+  NS_GetMainThread(&mt);
>+  nsCOMPtr<nsPIDOMWindow> window = do_QueryInterface(GetOwner());
Why do you need do_QueryInterface? GetOwner() returns nsPIDOMWindow.

>+
>+  nsRefPtr<GetUserMediaCallbackMediaStreamListener> listener(
>+      new GetUserMediaCallbackMediaStreamListener(mt, window->WindowID()));
2 space indentation please.


>+uint64_t SpeechRecognition::ProcessAudioSegment(AudioSegment *segment) {
AudioSegment* aSegment
and { goes to the next line

>+SpeechRecognition::StartedAudioCapture(SpeechEvent *event)
SpeechEvent* aEvent


>+{
>+  nsCOMPtr<nsIObserverService> obs = services::GetObserverService();
>+  obs->NotifyObservers(nullptr, "webspeech:audio-capture-started", nullptr);
>+  DispatchTrustedEvent(NS_LITERAL_STRING("start"));
>+  DispatchTrustedEvent(NS_LITERAL_STRING("audiostart"));
Is this a safe point to dispatch events? Remember that event listeners can call stop() or
navigate the page to a new location and what not.
Could the events be dispatched asynchronously?
(need to check what the spec says)

>+SpeechRecognition::DetectSpeech(SpeechEvent *event)
SpeechEvent* aEvent


>+SpeechRecognition::WaitForSpeechEnd(SpeechEvent *event)
SpeechEvent* aEvent


>+SpeechRecognition::NotifyFinalResult(SpeechEvent *event)
ditto and same also elsewhere

>+SpeechRecognition::AbortSilently(SpeechEvent *event)
>+{
>+  if (mRecognitionService)
>+    mRecognitionService->Abort();
if (expr) {
  stmt;
}



>+SpeechRecognition::SelectDevice()
>+{
>+  SR_LOG("Selecting device");
>+  MediaEngine *engine = MediaManager::Get()->GetBackend();
>+
>+  nsTArray<nsRefPtr<MediaEngineAudioSource> > audioSources;
>+  engine->EnumerateAudioDevices(&audioSources);
>+
>+  if (audioSources.Length() <= 0)
>+    return NS_ERROR_FAILURE;
Maybe
NS_ENSURE_STATE(audioSources.Length() > 0);
And .Length() sure can't return < 0, since it is unsigned


>+
>+  for (uint32_t i = 0; i < audioSources.Length(); i++) {
++i


>+SpeechRecognition::StartRecording()
>+{
>+  if (!mAudioSource)
>+    return NS_ERROR_FAILURE;
NS_ENSURE_STATE(mAudioSource);


>+SpeechRecognition::Observe(nsISupports *aSubject, const char *aTopic, const PRUnichar * aData)
>+{
>+  NS_ASSERTION(NS_IsMainThread(), "Observer invoked off the main thread");
>+  nsCOMPtr<nsIObserverService> obs = services::GetObserverService();
>+
>+  if (!strcmp(aTopic, "xpcom-shutdown")) {
>+    // TODO: cleanup?
>+    return NS_OK;
>+  } else if (!strcmp(aTopic, "getUserMedia:response:allow")) {
>+    nsCOMPtr<nsIMediaDevice> device;
>+    if (aSubject) {
>+      device = do_QueryInterface(aSubject);
>+      if (device) {
>+        nsString type;
>+        device->GetType(type);
>+        if (type.EqualsLiteral("audio")) {
>+          nsString name;
>+          device->GetName(name);
Do you really need nsString in these two cases?
nsAutoString would be better since it does stack allocation by default


>+          SR_LOG("User provided device %s", NS_ConvertUTF16toUTF8(name).get());
>+          MediaDevice *dev = static_cast<MediaDevice*>(device.get());
MediaDevice* dev = ...

>+// TODO Mark this as resultNotAddRefed to return raw pointers
>+already_AddRefed<SpeechGrammarList>
>+SpeechRecognition::Grammars() const
>+{
>+	NS_ABORT_IF_FALSE(0, "not implemented");
>+	return 0;
>+}
>+
>+void
>+SpeechRecognition::SetGrammars(mozilla::dom::SpeechGrammarList& arg)
>+{
>+	NS_ABORT_IF_FALSE(0, "not implemented");
>+}
>+
>+void
>+SpeechRecognition::GetLang(nsString& retval) const
>+{
>+	NS_ABORT_IF_FALSE(0, "not implemented");
>+}
>+
>+void
>+SpeechRecognition::SetLang(const nsAString& arg)
>+{
>+	NS_ABORT_IF_FALSE(0, "not implemented");
>+}
>+
>+bool
>+SpeechRecognition::Continuous() const
>+{
>+	NS_ABORT_IF_FALSE(0, "not implemented");
>+	return false;
>+}
>+
>+void
>+SpeechRecognition::SetContinuous(bool arg)
>+{
>+	NS_ABORT_IF_FALSE(0, "not implemented");
>+}
>+
>+bool
>+SpeechRecognition::InterimResults() const
>+{
>+	NS_ABORT_IF_FALSE(0, "not implemented");
>+	return false;
>+}
>+
>+void
>+SpeechRecognition::SetInterimResults(bool arg)
>+{
>+	NS_ABORT_IF_FALSE(0, "not implemented");
>+}
>+
>+uint32_t
>+SpeechRecognition::MaxAlternatives() const
>+{
>+	NS_ABORT_IF_FALSE(0, "not implemented");
>+	return 0;
>+}
>+
>+void
>+SpeechRecognition::SetMaxAlternatives(uint32_t arg)
>+{
>+	NS_ABORT_IF_FALSE(0, "not implemented");
>+}
>+
>+void
>+SpeechRecognition::GetServiceURI(nsString& retval) const
>+{
>+	NS_ABORT_IF_FALSE(0, "not implemented");
>+}
>+
>+void
>+SpeechRecognition::SetServiceURI(const nsAString& arg)
>+{
>+	NS_ABORT_IF_FALSE(0, "not implemented");
>+}
Some \t characters in the code above


>+SpeechRecognition::getGUMData()
GetGUMData();

>+{
>+  nsresult rv;
>+  nsCOMPtr<nsIUUIDGenerator> uuidgen =
>+    do_GetService("@mozilla.org/uuid-generator;1", &rv);
>+  if (NS_FAILED(rv))
>+    return EmptyString();
if (expr) {
  stmt;
}

>+class SpeechEvent : public nsRunnable
>+{
>+public:
>+  SpeechEvent(SpeechRecognition *recognition, SpeechRecognition::EventType type)
>+  : mAudioSegment(0)
>+  , mRecognitionResultList(0)
>+  , mError(0)
>+  , mRecognition(recognition)
>+  , mType(type)
>+  {
>+  }
>+
>+  ~SpeechEvent();
>+
>+  NS_IMETHOD Run();
>+  AudioSegment *mAudioSegment;
>+  nsRefPtr<SpeechRecognitionResultList> mRecognitionResultList; // TODO: make this a session being passed which also has index and stuff
>+  nsCOMPtr<nsIDOMSpeechRecognitionError> mError;
>+
>+  friend class SpeechRecognition;
>+private:
>+  SpeechRecognition *mRecognition;
This is super-unsafe, once we stop leaking all the SpeechRecognition objects.
nsRefPtr<SpeechRecognition> ?


>+NS_IMPL_CYCLE_COLLECTION_WRAPPERCACHE_0(SpeechRecognitionResult)
_1 and mParent


>+SpeechRecognitionResultList::SpeechRecognitionResultList(SpeechRecognition *parent)
* aParent

>+SpeechStreamListener::NotifyQueuedTrackChanges(MediaStreamGraph *aGraph, TrackID aID,
>+                                TrackRate aTrackRate,
>+                                TrackTicks aTrackOffset,
>+                                uint32_t aTrackEvents,
>+                                const MediaSegment& aQueuedMedia)
Odd indentation


>+++ b/content/media/webspeech/recognition/energy_endpointer.h
>@@ -0,0 +1,156 @@
>+// Copyright (c) 2012 The Chromium Authors. All rights reserved.
>+// Use of this source code is governed by a BSD-style license that can be
>+// found in the LICENSE file.
We have other Chromium code in the tree too, but I wonder if we should keep such code in
certain places. You could perhaps ask cjones or bsmedberg or gerv.
Attachment #712684 - Flags: review?(bugs) → review-
Comment on attachment 712689 [details] [diff] [review]
Bug 650295 - Tests for Speech Recognition

Shouldn't we have fake events also in non-debug builds?
Might be better to enable that stuff based on some pref.

Coding style for if-else is

if (expr) {
  stmt1;
} else {
  stmt2;
}


I need to go through this more properly once I've re-reviewed the updated version of the first patch.
Attachment #712689 - Flags: review?(bugs)
Fixed coding style violations.

Known issue: SpeechEvents carry an nsRefPtr to SpeechRecognition, which is initialized off the main thread. Since SpeechRecognition is not thread-safe, this causes a crash.
Attachment #712684 - Attachment is obsolete: true
Due to the lack of a recognition service, the tests will fail to run. The next iteration of this patch should contain a FakeSpeechRecognitionService that allows the tests to be used.
Attachment #712689 - Attachment is obsolete: true
Attachment #713705 - Attachment is obsolete: true
Attachment #717310 - Flags: review?(bugs)
This new version contains only small (mostly cosmetic) changes compared to the previous one. Once again, the test browser_success_with_recognition_service.js is not expected to pass.
Attachment #713706 - Attachment is obsolete: true
Attachment #717312 - Flags: review?(bugs)
Comment on attachment 717310 [details] [diff] [review]
Bug 650295 - Implement the main state machine for Speech Recognition


>+class SpeechGrammar MOZ_FINAL : public nsISupports,
>+                                public nsWrapperCache,
>+                                public EnableWebSpeechCheck
>+{
>+public:
>+  SpeechGrammar(nsISupports* aParent);
>+  ~SpeechGrammar();
>+
>+  NS_DECL_CYCLE_COLLECTING_ISUPPORTS
>+  NS_DECL_CYCLE_COLLECTION_SCRIPT_HOLDER_CLASS(SpeechGrammar)
>+
>+  // TODO: return something sensible here, and change the return type
>+  nsISupports* GetParentObject() const;
You can remove the TODO comment, right?
>+class SpeechGrammarList MOZ_FINAL : public nsISupports,
>+                                    public nsWrapperCache,
>+                                    public EnableWebSpeechCheck
>+{
>+public:
>+  SpeechGrammarList(nsISupports* aParent);
>+  ~SpeechGrammarList();
>+
>+  NS_DECL_CYCLE_COLLECTING_ISUPPORTS
>+  NS_DECL_CYCLE_COLLECTION_SCRIPT_HOLDER_CLASS(SpeechGrammarList)
>+
>+  SpeechGrammarList* Constructor(const GlobalObject& aGlobal, ErrorResult& aRv);
>+
>+  // TODO: return something sensible here, and change the return type
>+  nsISupports* GetParentObject() const;
ditto

>+NS_INTERFACE_MAP_BEGIN(SpeechRecognition)
>+  NS_INTERFACE_MAP_ENTRY(nsIObserver)
>+  NS_INTERFACE_MAP_ENTRY_AMBIGUOUS(nsISupports, nsIObserver)
>+NS_INTERFACE_MAP_END_INHERITING(nsDOMEventTargetHelper)
nsDOMEventTargetHelper gives the right nsISupports, so remove
NS_INTERFACE_MAP_ENTRY_AMBIGUOUS(nsISupports, nsIObserver)
(I believe it would even be a security bug to return the wrong nsISupports)


>+SpeechRecognition::~SpeechRecognition()
>+{
>+  mSpeechDetectionTimer = nullptr;
There shouldn't be need for this. nsCOMPtr/nsRefPtr release automatically when the object owning them is deleted.


>+SpeechRecognition::StartedAudioCapture(SpeechEvent* aEvent)
>+{
>+  DispatchTrustedEvent(NS_LITERAL_STRING("start"));
>+  DispatchTrustedEvent(NS_LITERAL_STRING("audiostart"));
>+
>+  mEndpointer->SetEnvironmentEstimationMode();
>+  mEstimationSamples += ProcessAudioSegment(aEvent->mAudioSegment);
Should we dispatch both events after mEndpointer->SetEnvironmentEstimationMode() and mEstimationSamples += ProcessAudioSegment(aEvent->mAudioSegment); ?



>+SpeechRecognition::AbortSilently(SpeechEvent* aEvent)
>+{
>+  if (mRecognitionService) {
>+    mRecognitionService->Abort();
>+  }
>+
>+  if (STATE_BETWEEN(STATE_ESTIMATING, STATE_RECOGNIZING)) {
>+    StopRecording();
>+  }
>+
>+  DispatchTrustedEvent(NS_LITERAL_STRING("end"));
>+  return Reset();
Should we call Reset before dispatching end? The event listener for "end" might
try to re-start recognition or something like that.



>+SpeechRecognition::FSMState
>+SpeechRecognition::AbortError(SpeechEvent* aEvent)
>+{
>+  AbortSilently(aEvent);
>+  NotifyError(aEvent);
>+  return Reset();
>+}
Same here


>+SpeechRecognition::NotifyError(SpeechEvent* aEvent)
>+{
>+  nsCOMPtr<nsIDOMEvent> domEvent = do_QueryInterface(aEvent->mError);
>+  NS_ASSERTION(domEvent, "SpeechEvent::mError not a valid error event");
Not a very useful assertion. We crash anyway on the next line if mError isn't a DOM event.


>+SpeechRecognition::Observe(nsISupports* aSubject, const char* aTopic,
>+  const PRUnichar* aData)
align the last parameter under the other parameters

>+SpeechRecognition::SetGrammars(mozilla::dom::SpeechGrammarList& aArg,
>+  ErrorResult& aRv)
ditto

>+SpeechRecognition::Start(ErrorResult& aRv)
>+{
>+  if (!STATE_EQUALS(STATE_IDLE)) {
>+    aRv.Throw(NS_ERROR_DOM_INVALID_STATE_ERR);
>+    return;
>+  }
>+
>+  const char* speechRecognitionServiceCID =
>+    NS_SPEECH_RECOGNITION_SERVICE_CONTRACTID_PREFIX "google";
So we need to make this more generic. Add a pref for the service name?
"media.webspeech.service", "google"
(Btw, do we need a separate contractid for the simple Google service and for the continuous recognition service?)





>+
>+  nsresult rv;
>+  mRecognitionService = do_GetService(speechRecognitionServiceCID, &rv);
>+  NS_ASSERTION(mRecognitionService.get(),
>+               "failed to instantiate recognition service");
And if you read the pref, you then need to null-check here instead of doing the assertion.

>+SpeechRecognition::DispatchError(EventType aErrorType, int aErrorCode,
>+  const nsAString& aMessage)
Align the last parameter

>+{
>+  NS_ASSERTION(aErrorType == EVENT_RECOGNITIONSERVICE_ERROR ||
>+               aErrorType == EVENT_AUDIO_ERROR, "Invalid error type!");
Could you add MOZ_ASSERT to check that this is called only on main thread.



>+  SpeechEvent* event = new SpeechEvent(this, aErrorType);
nsRefPtr<SpeechEvent> so that you don't leak if NS_DispatchToMainThread fails
(it may fail during shutdown).



>+void
>+SpeechRecognition::FeedAudioData(already_AddRefed<SharedBuffer> aSamples,
>+  unsigned int aDuration,
>+  MediaStreamListener* aProvider)
Align params

>+{
>+  NS_ASSERTION(!NS_IsMainThread(),
>+               "FeedAudioData should not be called in the main thread");
>+
>+  AudioSegment* segment = new AudioSegment();
>+
>+  nsAutoTArray<const int16_t*, 1> channels;
>+  channels.AppendElement(static_cast<const int16_t*>(aSamples.get()->Data()));
>+  segment->AppendFrames(aSamples, channels, aDuration);
>+
>+  SpeechEvent* event = new SpeechEvent(this, EVENT_AUDIO_DATA);
nsRefPtr<SpeechEvent>

>+
>+// TODO: ownership model for this
So the ownership is now clear?


>+
>+  // Mark this as resultNotAddRefed to return raw pointers
>+  already_AddRefed<SpeechRecognitionAlternative> Item(uint32_t aIndex);
Remove the //.... comment


>+class SpeechRecognitionResultList MOZ_FINAL : public nsISupports,
>+                                              public nsWrapperCache,
>+                                              public EnableWebSpeechCheck
>+{
>+public:
>+  SpeechRecognitionResultList(SpeechRecognition* aParent);
>+  ~SpeechRecognitionResultList();
>+
>+  NS_DECL_CYCLE_COLLECTING_ISUPPORTS
>+  NS_DECL_CYCLE_COLLECTION_SCRIPT_HOLDER_CLASS(SpeechRecognitionResultList)
>+
>+  // TODO: return something sensible here, and change the return type
>+  nsISupports* GetParentObject() const;
Remove TODO comment

>+
>+  virtual JSObject* WrapObject(JSContext* aCx, JSObject* aScope,
>+                               bool* aTriedToWrap);
>+
>+  uint32_t Length() const;
>+
>+  // Mark this as resultNotAddRefed to return raw pointers
>+  already_AddRefed<SpeechRecognitionResult> Item(uint32_t aIndex);
>+
>+  already_AddRefed<SpeechRecognitionResult> IndexedGetter(uint32_t aIndex, bool& aPresent);
>+
>+  nsTArray<nsRefPtr<SpeechRecognitionResult>> mItems;
You need to cycle collect mItems
>+
>+// TODO should this be in the DOM namespace?
dom namespace is ok.


In general looking really good so far :)
Attachment #717310 - Flags: review?(bugs) → review-
Comment on attachment 717312 [details] [diff] [review]
Bug 650295 - Tests for Speech Recognition

I think for testing purposes we want mTestConfig to be available even in opt builds, but perhaps use
a pref for it.


> SpeechRecognition::StopRecording()
> {
>+#ifdef DEBUG
>+  if (mTestConfig.fakeFSMEvents) {
>+    NS_ENSURE_STATE(mDOMStream.get() == nullptr);
>+  }
>+#endif
Could you explain this?
Also, it could be NS_ENSURE_STATE(!mDOMStream);



>@@ -433,20 +451,72 @@
> 
>   if (!strcmp(aTopic, NS_TIMER_CALLBACK_TOPIC) &&
>       STATE_BETWEEN(STATE_IDLE, STATE_WAITING_FOR_SPEECH)) {
> 
>     DispatchError(SpeechRecognition::EVENT_AUDIO_ERROR,
>                   nsIDOMSpeechRecognitionError::NO_SPEECH,
>                   NS_LITERAL_STRING("No speech detected (timeout)"));
>   }
>+#ifdef DEBUG
>+  else if (mTestConfig.fakeFSMEvents &&
>+          !strcmp(aTopic, SPEECH_RECOGNITION_TEST_EVENT_REQUEST_TOPIC)) {
>+    ProcessTestEventRequest(nsDependentString(aData));
>+  }
>+#endif




>+SpeechRecognition::ProcessTestEventRequest(const nsAString& aEventName)
>+{
>+  if (aEventName.EqualsLiteral("EVENT_START")) {
>+    ErrorResult err;
>+    Start(err);
>+    StartRecording(nullptr);
>+  } else if (aEventName.EqualsLiteral("EVENT_STOP")) {
>+    Stop();
>+  } else if (aEventName.EqualsLiteral("EVENT_ABORT")) {
>+    Abort();
>+  } else if (aEventName.EqualsLiteral("EVENT_AUDIO_ERROR")) {
>+    DispatchError(SpeechRecognition::EVENT_AUDIO_ERROR,
>+                  nsIDOMSpeechRecognitionError::AUDIO_CAPTURE, // TODO different codes?
>+                  NS_LITERAL_STRING("AUDIO_ERROR test event"));
>+  } else if (aEventName.EqualsLiteral("EVENT_AUDIO_DATA")) {
>+    const char* mockAudioDataFilename = "browser/content/media/webspeech/recognition/test/hello.wav";
Hmm, could we read the file name from the fake service? And one could set the file name on the fake service.
It would make it easier to write certain automatic tests. And perhaps the fake service could have a method
which takes a url and gives back the data. It might need to read the data synchronously, which is rather bad, but
then, the service is for testing only.

>+    SR_LOG("Using mock audio file '%s'\n", mockAudioDataFilename);
>+
>+    FILE* mockAudioDataFile = fopen(mockAudioDataFilename, "rb");
>+    NS_ASSERTION(mockAudioDataFile, "failed to open mock audio data file");
>+
>+    while (true) {
>+      nsRefPtr<SharedBuffer> samples(SharedBuffer::Create(160 * sizeof(int16_t)));
>+      int16_t* to = static_cast<int16_t*>(samples->Data());
>+      unsigned int read = fread(to, sizeof(int16_t), 160, mockAudioDataFile);
>+      if (read == 0) break;
>+
>+      /* FIXME pass FakeSpeechStreamListener when we have it */
>+      FeedAudioData(samples.forget(), read, nullptr);
>+    }
If a url were taken as a param, then the data reading should use necko APIs.
Or, hmm, perhaps sync XHR would work.


>+FakeSpeechRecognitionService::Initialize(dom::SpeechRecognition* aSpeechRecognition)
>+{
>+    mRecognition = aSpeechRecognition;
>+    nsCOMPtr<nsIObserverService> obs = services::GetObserverService();
>+    obs->AddObserver(this, SPEECH_RECOGNITION_TEST_EVENT_REQUEST_TOPIC, false);
>+    return NS_OK;
>+}
2 space indentation here and elsewhere

>+  } else if (eventName.EqualsLiteral("EVENT_RECOGNITIONSERVICE_FINAL_RESULT")) {
>+    SpeechEvent* event = new SpeechEvent(mRecognition, SpeechRecognition::EVENT_RECOGNITIONSERVICE_FINAL_RESULT);
nsRefPtr<SpeechEvent>


>+FakeSpeechRecognitionService::BuildMockResultList()
>+{
>+  SpeechRecognitionResultList *resultList = new SpeechRecognitionResultList(mRecognition);
* goes with the type. Same elsewhere
Attachment #717312 - Flags: review?(bugs) → review-
Attachment #717310 - Attachment is obsolete: true
Attachment #721943 - Flags: review?(bugs)
In this iteration, the tests were converted to plain mochitest files. We can now specify the input audio file for tests in JS, using an <audio> element's mozCaptureStreamUntilEnded() method to get a stream that mocks microphone input.
Attachment #717312 - Attachment is obsolete: true
Attachment #721945 - Flags: review?(bugs)
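A sketch of that technique follows; mozCaptureStreamUntilEnded() is an existing Gecko API, but how the captured stream is handed to the recognizer is an assumption for illustration:

// Capture a prerecorded file as a MediaStream to stand in for the
// microphone during tests.
var audio = new Audio("hello.wav");
var stream = audio.mozCaptureStreamUntilEnded();
audio.play();

// Hypothetical: point the recognition session at the captured stream
// instead of live getUserMedia input.
var sr = new SpeechRecognition();
sr.start(stream);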
Refreshing the patch with some small fixes and moz.build conversion.
Attachment #721943 - Attachment is obsolete: true
Attachment #721943 - Flags: review?(bugs)
Attachment #722596 - Flags: review?(bugs)
Attachment #721945 - Attachment is obsolete: true
Attachment #721945 - Flags: review?(bugs)
Attachment #722597 - Flags: review?(bugs)
Comment on attachment 722596 [details] [diff] [review]
Bug 650295 - Implement the main state machine for Speech Recognition

>+SpeechGrammarList::AddFromURI(const nsAString& aSrc,
>+  const Optional<float>& aWeight, ErrorResult& aRv)
Could you align the params below const nsAString& aSrc

>+SpeechGrammarList::AddFromString(const nsAString& aString,
>+  const Optional<float>& aWeight, ErrorResult& aRv)
Similar thing here


>+SpeechGrammarList::IndexedGetter(uint32_t aIndex, bool& aPresent,
>+  ErrorResult& aRv)
And here

>+  virtual JSObject* WrapObject(JSContext* aCx, JSObject* aScope,
>+                               bool* aTriedToWrap);
Note, the last param has been removed from WrapObject


Could you make sure that PrefControlled works ok when the pref is false.
None of the interfaces should be visible in the global scope in that case.


Need to still review SpeechRecognition and SpeechStreamListener
Comment on attachment 722596 [details] [diff] [review]
Bug 650295 - Implement the main state machine for Speech Recognition


>+  SR_LOG("created SpeechRecognition");
>+  SetIsDOMBinding();
>+  mEndpointer->set_speech_input_complete_silence_length(500000);
>+  mEndpointer->set_long_speech_input_complete_silence_length(1000000);
>+  mEndpointer->set_long_speech_length(3*1000000);
space before and after *
Might be good to have prefs for all these values.
So, something like 
mEndpointer->set_speech_input_complete_silence_length(Preferences::GetInt("media.webspeech.silence_length", 500000));
etc.

>+
>+  nsresult rv = mSpeechDetectionTimer->Init(this, kSPEECH_DETECTION_TIMEOUT_MS,
>+                                            nsITimer::TYPE_ONE_SHOT);
>+  NS_ENSURE_SUCCESS(rv, rv);
>+
>+  return NS_OK;
>+}
Just
return mSpeechDetectionTimer->Init(this, kSPEECH_DETECTION_TIMEOUT_MS, nsITimer::TYPE_ONE_SHOT);


>+  nsCOMPtr<nsIDOMEvent> domEvent;
>+  NS_NewDOMSpeechRecognitionError(getter_AddRefs(domEvent), nullptr, nullptr);
>+
>+  nsCOMPtr<nsIDOMSpeechRecognitionError> srError = do_QueryInterface(domEvent);
Oh, the interface should be nsIDOMSpeechRecognitionErrorEvent 
I filed https://www.w3.org/Bugs/Public/show_bug.cgi?id=21272

No need to change code before that bug is fixed though



>+  nsAutoPtr<Endpointer> mEndpointer;
Hmm, couldn't this be just Endpointer mEndpointer.


>+  nsRefPtr<SharedBuffer> samples(SharedBuffer::Create(aChunk.mDuration
>+                                                      * 1 // channel
>+                                                      * sizeof(int16_t)));
* should be in the previous line in both cases

Someone familiar with our media APIs should take a look too.
Attachment #722596 - Flags: review?(bugs) → review+
Comment on attachment 722597 [details] [diff] [review]
Bug 650295 - Tests for Speech Recognition

> SpeechRecognition::SpeechRecognition()
>   : mProcessingEvent(false)
>   , mEndpointer(new Endpointer(kSAMPLE_RATE))
>+  , mAudioSamplesPerChunk(mEndpointer->FrameSize())
>   , mSpeechDetectionTimer(do_CreateInstance(NS_TIMER_CONTRACTID))
> {
>   SR_LOG("created SpeechRecognition");
>   SetIsDOMBinding();
>+
>+  Preferences::AddBoolVarCache(&mTestConfig.enableTests, TEST_PREFERENCE_ENABLE);
>+  if (mTestConfig.enableTests) {
>+    nsCOMPtr<nsIObserverService> obs = services::GetObserverService();
>+    obs->AddObserver(this, SPEECH_RECOGNITION_TEST_EVENT_REQUEST_TOPIC, false);
>+    obs->AddObserver(this, SPEECH_RECOGNITION_TEST_END_TOPIC, false);
>+    Preferences::AddBoolVarCache(&mTestConfig.fakeFSMEvents, TEST_PREFERENCE_FAKE_FSM_EVENTS);
>+    Preferences::AddBoolVarCache(&mTestConfig.fakeRecognitionService, TEST_PREFERENCE_FAKE_RECOGNITION_SERVICE);
We don't want to add these bool var caches each time a SpeechRecognition object is created, only once.


>-uint64_t
>+unsigned int
> SpeechRecognition::ProcessAudioSegment(AudioSegment* aSegment)
> {
>   AudioSegment::ChunkIterator iterator(*aSegment);
>-  uint64_t samples = 0;
>+  unsigned int samples = 0;
Why?
I'd really prefer an explicit int type.


>+ * Buffer audio samples into mAudioSamplesBuffer until aBufferSize.
>+ * Updates mBufferedSamples and returns the number of samples that were buffered.
>+ */
>+unsigned int
>+SpeechRecognition::FillSamplesBuffer(const int16_t* aSamples,
>+                                     unsigned int aSampleCount)
>+{
unsigned int?

(need to get these patches landed, disabled by default. Easier to then hack this stuff some more)
Attachment #722597 - Flags: review?(bugs) → review+
Try run for 39466d8f1c53 is complete.
Detailed breakdown of the results available here:
    https://tbpl.mozilla.org/?tree=Try&rev=39466d8f1c53
Results (out of 18 total builds):
    failure: 18
Builds (or logs if builds failed) available at:
http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/nsm.nikhil@gmail.com-39466d8f1c53
Try run for 727cb5c33e69 is complete.
Detailed breakdown of the results available here:
    https://tbpl.mozilla.org/?tree=Try&rev=727cb5c33e69
Results (out of 233 total builds):
    exception: 8
    success: 78
    warnings: 60
    failure: 87
Builds (or logs if builds failed) available at:
http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/nsm.nikhil@gmail.com-727cb5c33e69
Try run for b0594e7aa914 is complete.
Detailed breakdown of the results available here:
    https://tbpl.mozilla.org/?tree=Try&rev=b0594e7aa914
Results (out of 23 total builds):
    exception: 22
    failure: 1
Builds (or logs if builds failed) available at:
http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/nsm.nikhil@gmail.com-b0594e7aa914
Depends on: 852196
Updating the patches. Several small changes since the last version.
Attachment #722596 - Attachment is obsolete: true
Attachment #722597 - Attachment is obsolete: true
Attachment #729278 - Flags: review?(bugs)
Attachment #729280 - Flags: review?(bugs)
Comment on attachment 729280 [details] [diff] [review]
Bug 650295 - Tests for Speech Recognition

Nit, void Init() {
'{' goes to its own line.
Attachment #729280 - Flags: review?(bugs) → review+
Attachment #729278 - Flags: review?(bugs) → review+
Refreshing with licensing resolved and smaug's last comments. Keeping the r+ from smaug.
Attachment #729278 - Attachment is obsolete: true
Attachment #730307 - Flags: review+
Attachment #729280 - Attachment is obsolete: true
Attachment #730308 - Flags: review+
https://hg.mozilla.org/mozilla-central/rev/dfd7768f8f48
https://hg.mozilla.org/mozilla-central/rev/c5e0abff4496
Status: ASSIGNED → RESOLVED
Closed: 11 years ago
Flags: in-testsuite+
Resolution: --- → FIXED
Target Milestone: --- → mozilla22
Depends on: 855704
Blocks: 855728
This seems to be missing review from a build peer.
Depends on: 855735
No longer blocks: 855728
Depends on: 855728
Why would this need a review from a build peer?
If I missed something, my bad, but this passed on try.
relnote-firefox: --- → ?
No relnote yet. This won't be enabled before we have a backend.
QA Contact: simona.marcu
Depends on: 861538
Depends on: 861590
(In reply to Oskar Ivanić (:icecold) from comment #0)
> User-Agent:       Mozilla/5.0 (Windows NT 6.1; rv:6.0a1) Gecko/20110414
> Firefox/6.0a1
> Build Identifier: 
> 
> "With this API, developers can give web apps the ability to transcribe your
> voice to text. When a web page uses this feature, you simply click on an
> icon and then speak into your computer’s microphone. The recorded audio is
> sent to speech servers for transcription, after which the text is typed out
> for you."
> 
> Google implemented support in Chrome and made it available on Google
> Translate. (You need the Chrome Canary build.)
> 
> Reproducible: Always

So when will the Google speech API be available in Firefox?
Hi All,

This bug's status says it is resolved.
In which release version has it been resolved? Please update.

Thanks,
Mahesh
Hello Guilherme!

So is this media.webspeech.recognition.enable;false and media.webspeech.synth.enabled;false on Mozilla/5.0 (Windows NT 6.1; WOW64; rv:44.0) Gecko/20100101 Firefox/44.0 ID:20151019030227 CSet: 1a157155a4fe0074b3d03b54fe9e466472c2cd56 ?

How do I go about verifying this bug?

All demos out there on the web seem to depend on x-webkit-speech

Thanks!
Alex
Flags: needinfo?(guilherme.p.gonc+bmo)
Flags: needinfo?(guilherme.p.gonc+bmo)
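One way to sanity-check the landing, assuming the media.webspeech.recognition.enable pref mentioned above gates the DOM interface (flip it in about:config first):

// With the pref true, the constructor should be exposed on window;
// with it false, it should be absent.
if (typeof window.SpeechRecognition === "function") {
  console.log("SpeechRecognition is exposed:", new SpeechRecognition());
} else {
  console.log("SpeechRecognition not available (pref off or no backend).");
}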
Blocks: 1685416