Closed Bug 1003464 Opened 11 years ago Closed 9 years ago

Text to Speech API on Firefox for Linux

Categories

(Core :: DOM: Core & HTML, defect)

x86_64
Linux
defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla44
Tracking Status
firefox44 --- fixed

People

(Reporter: yash.girdhar, Assigned: eeejay)

References

Details

(Keywords: dev-doc-complete)

Attachments

(1 file, 1 obsolete file)

As a part of Web Speech API (https://dvcs.w3.org/hg/speech-api/raw-file/tip/speechapi.html), implement Text to Speech API on firefox for Linnux. Plan is to use speechd, as the gnome-speech has not been active for a time now.
So the API has been implemented already, but backend is missing.
Status: UNCONFIRMED → NEW
Ever confirmed: true
Assignee: nobody → eitan
Depends on: 1207806
Comment on attachment 8665201 [details] [diff] [review] Support Web Speech API synthesis via speech-dispatcher kdavis said he could take look. (sorry, my review load is a bit high)
Attachment #8665201 - Flags: review?(bugs) → review?(kdavis)
Comment on attachment 8665201 [details] [diff] [review] Support Web Speech API synthesis via speech-dispatcher Review of attachment 8665201 [details] [diff] [review]: ----------------------------------------------------------------- Overall it's fine, but I have a few nits, noted throughout. However, there is at least one bug in SpeechDispatcherService::Speak where rate is set. If this is bug fixed, along with some of the nits, it's a review+ ::: configure.in @@ +4785,5 @@ > +if test "$MOZ_ENABLE_GTK" -o "$MOZ_ENABLE_QT" > +then > + MOZ_SYNTH_SPEECHD=1 > + > + MOZ_ARG_DISABLE_BOOL(dbus, The name here 'dbus' duplicates the name of the dbus support bool above. A unique name should be given to this boolean. ::: dom/media/webspeech/synth/speechd/SpeechDispatcherService.cpp @@ +5,5 @@ > + * file, You can obtain one at http://mozilla.org/MPL/2.0/. */ > + > +#include "nsISupports.h" > +#include "nsPrintfCString.h" > +#include "SpeechDispatcherService.h" Should come before all other includes[http://mzl.la/1Fv89o8] @@ +14,5 @@ > +#include "nsReadableUtils.h" > + > +#include "mozilla/dom/nsSynthVoiceRegistry.h" > +#include "mozilla/dom/nsSpeechTask.h" > +#include "mozilla/Preferences.h" Includes should be in alphabetic order[http://mzl.la/1Fv89o8] @@ +59,5 @@ > + SPDCallback callback_pause; > + SPDCallback callback_resume; > + SPDCallbackIM callback_im; > + > + /* partial, more private fields in structure */ If this is to have private fields, why is it a struct and not a class? @@ +134,5 @@ > + > + // Voice name > + nsString mName; > + > + // Voice language, in BCB-47 syntax Should be BCP-47 @@ +248,5 @@ > + case SPD_EVENT_END: > + mTask->DispatchEnd((TimeStamp::Now() - mStartTime).ToSeconds(), 0); > + remove = true; > + break; > + Can you explicitly indicate here that SPD_EVENT_INDEX_MARK is not supported? Something as simple as: case SPD_EVENT_INDEX_MARK: // Not yet supported break; @@ +352,5 @@ > + > + spd_set_notification_on(mSpeechdClient, SPD_BEGIN); > + spd_set_notification_on(mSpeechdClient, SPD_END); > + spd_set_notification_on(mSpeechdClient, SPD_CANCEL); > + spd_set_notification_on(mSpeechdClient, SPD_RESUME); I assume no pause due to reasons explained in SpeechDispatcherCallback::OnPause. @@ +442,5 @@ > +SpeechDispatcherService::Speak(const nsAString& aText, const nsAString& aUri, > + float aVolume, float aRate, float aPitch, > + nsISpeechTask* aTask) > +{ > + NS_ENSURE_TRUE(mInitialized, NS_ERROR_NOT_AVAILABLE); Should use NS_WARN_IF instead of NS_ENSURE_* [http://mzl.la/1NWF2eJ] @@ +467,5 @@ > + int rate = 0; > + > + if (aRate > 1) { > + rate = static_cast<int>((aRate - 1) * 10); > + } else if (aRate < 0) { This should be (aRate < 1) as according to the comments above (aRate < 0) never occurs as "We provide a rate of 0.1 to 10 with 1 being default." @@ +468,5 @@ > + > + if (aRate > 1) { > + rate = static_cast<int>((aRate - 1) * 10); > + } else if (aRate < 0) { > + rate = static_cast<int>((aRate * 100) - 110); This also can't be correct as aRate = 1 should map to rate = 0. It doesn't. It maps to -10. There is a discontinuity in the mapping as [assuming the (aRate < 0)=>(aRate < 1) switch above is made] aRate = 1+epsilon maps to rate = 0+epsilon but aRate = 1-epsilon maps to rate = -10+epsilon. (Where of course epsilon is small and positive.) Actually this should be: rate = static_cast<int>((aRate - 1) * (100/0.9)); @@ +477,5 @@ > + // We provide a pitch of 0 to 2 with 1 being the default. > + // speech-dispatcher expects -100 to 100 with 0 being default. > + spd_set_voice_pitch(mSpeechdClient, static_cast<int>((aPitch - 1) * 100)); > + > + // The last three parameters doesn't matter for an indirect service Type-o "don't matter" ::: dom/media/webspeech/synth/speechd/SpeechDispatcherService.h @@ +11,5 @@ > +#include "nsIThread.h" > +#include "nsISpeechService.h" > +#include "nsRefPtrHashtable.h" > +#include "nsTArray.h" > +#include "mozilla/StaticPtr.h" Include "mozilla/StaticPtr.h" should be sorted alphabetically[http://mzl.la/1Fv89o8]
Attachment #8665201 - Flags: review?(kdavis) → review+
(In reply to kdavis from comment #4) > Comment on attachment 8665201 [details] [diff] [review] > Support Web Speech API synthesis via speech-dispatcher > > Review of attachment 8665201 [details] [diff] [review]: > ----------------------------------------------------------------- > > Overall it's fine, but I have a few nits, noted throughout. > > However, there is at least one bug in SpeechDispatcherService::Speak where > rate is set. > If this is bug fixed, along with some of the nits, it's a review+ > > ::: configure.in > @@ +4785,5 @@ > > +if test "$MOZ_ENABLE_GTK" -o "$MOZ_ENABLE_QT" > > +then > > + MOZ_SYNTH_SPEECHD=1 > > + > > + MOZ_ARG_DISABLE_BOOL(dbus, > > The name here 'dbus' duplicates the name of the dbus support bool above. A > unique name should be given to this boolean. > Oops. > ::: dom/media/webspeech/synth/speechd/SpeechDispatcherService.cpp > @@ +5,5 @@ > > + * file, You can obtain one at http://mozilla.org/MPL/2.0/. */ > > + > > +#include "nsISupports.h" > > +#include "nsPrintfCString.h" > > +#include "SpeechDispatcherService.h" > > Should come before all other includes[http://mzl.la/1Fv89o8] Done. > > @@ +14,5 @@ > > +#include "nsReadableUtils.h" > > + > > +#include "mozilla/dom/nsSynthVoiceRegistry.h" > > +#include "mozilla/dom/nsSpeechTask.h" > > +#include "mozilla/Preferences.h" > > Includes should be in alphabetic order[http://mzl.la/1Fv89o8] > Done. > @@ +59,5 @@ > > + SPDCallback callback_pause; > > + SPDCallback callback_resume; > > + SPDCallbackIM callback_im; > > + > > + /* partial, more private fields in structure */ > > If this is to have private fields, why is it a struct and not a class? > It is a C struct in the libspeechd library. When I say private here, I mean that there is a comment in the speechd headers that says that the rest of the fields are private: http://git.freebsoft.org/?p=speechd.git;a=blob;f=src/api/c/libspeechd.h;h=050c9f59cd5be1039d73d7ea2c289d11290420bf;hb=HEAD#l90 I'll make the comment clearer. > @@ +134,5 @@ > > + > > + // Voice name > > + nsString mName; > > + > > + // Voice language, in BCB-47 syntax > > Should be BCP-47 > Yup. > @@ +248,5 @@ > > + case SPD_EVENT_END: > > + mTask->DispatchEnd((TimeStamp::Now() - mStartTime).ToSeconds(), 0); > > + remove = true; > > + break; > > + > > Can you explicitly indicate here that SPD_EVENT_INDEX_MARK is not supported? > Something as simple as: > > case SPD_EVENT_INDEX_MARK: > // Not yet supported > break; > Done. > @@ +352,5 @@ > > + > > + spd_set_notification_on(mSpeechdClient, SPD_BEGIN); > > + spd_set_notification_on(mSpeechdClient, SPD_END); > > + spd_set_notification_on(mSpeechdClient, SPD_CANCEL); > > + spd_set_notification_on(mSpeechdClient, SPD_RESUME); > > I assume no pause due to reasons explained in > SpeechDispatcherCallback::OnPause. > Yeah, removed that. > @@ +442,5 @@ > > +SpeechDispatcherService::Speak(const nsAString& aText, const nsAString& aUri, > > + float aVolume, float aRate, float aPitch, > > + nsISpeechTask* aTask) > > +{ > > + NS_ENSURE_TRUE(mInitialized, NS_ERROR_NOT_AVAILABLE); > > Should use NS_WARN_IF instead of NS_ENSURE_* [http://mzl.la/1NWF2eJ] > Done, thanks. > @@ +467,5 @@ > > + int rate = 0; > > + > > + if (aRate > 1) { > > + rate = static_cast<int>((aRate - 1) * 10); > > + } else if (aRate < 0) { > > This should be (aRate < 1) as according to the comments above (aRate < 0) > never occurs as "We provide a rate of 0.1 to 10 with 1 being default." > Oops! > @@ +468,5 @@ > > + > > + if (aRate > 1) { > > + rate = static_cast<int>((aRate - 1) * 10); > > + } else if (aRate < 0) { > > + rate = static_cast<int>((aRate * 100) - 110); > > This also can't be correct as aRate = 1 should map to rate = 0. It doesn't. > It maps to -10. > > There is a discontinuity in the mapping as [assuming the (aRate < 0)=>(aRate > < 1) switch above is made] aRate = 1+epsilon maps to rate = 0+epsilon but > aRate = 1-epsilon maps to rate = -10+epsilon. (Where of course epsilon is > small and positive.) > > Actually this should be: > > rate = static_cast<int>((aRate - 1) * (100/0.9)); > Yes! Thank you! I was sloppy there. The new block looks like this: if (aRate > 1) { rate = static_cast<int>((aRate - 1) * 10); } else if (aRate <= 1) { rate = static_cast<int>((aRate - 1) * (100/0.9)); } > @@ +477,5 @@ > > + // We provide a pitch of 0 to 2 with 1 being the default. > > + // speech-dispatcher expects -100 to 100 with 0 being default. > > + spd_set_voice_pitch(mSpeechdClient, static_cast<int>((aPitch - 1) * 100)); > > + > > + // The last three parameters doesn't matter for an indirect service > > Type-o "don't matter" > Yup. > ::: dom/media/webspeech/synth/speechd/SpeechDispatcherService.h > @@ +11,5 @@ > > +#include "nsIThread.h" > > +#include "nsISpeechService.h" > > +#include "nsRefPtrHashtable.h" > > +#include "nsTArray.h" > > +#include "mozilla/StaticPtr.h" > > Include "mozilla/StaticPtr.h" should be sorted > alphabetically[http://mzl.la/1Fv89o8] Done.
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla44
Depends on: 1227848
This has now been documented — see all the goodness at https://developer.mozilla.org/en-US/docs/Web/API/Web_Speech_API! A technical review would be really nice, although I appreciate that it is a fairly big API, so we might want to share the work between the different engineers involved.
Depends on: 1229489
Depends on: 1252732
Component: DOM → DOM: Core & HTML
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: