Closed Bug 904687 Opened 11 years ago Closed 11 years ago

Introduce Pico speech synthesis service

Categories

(Core :: DOM: Core & HTML, defect)

All
Gonk (Firefox OS)
defect
Not set
normal

Tracking

()

VERIFIED FIXED
mozilla27
Tracking Status
firefox26 --- fixed
firefox27 --- fixed
b2g-v1.2 --- fixed

People

(Reporter: eeejay, Assigned: eeejay)

References

Details

Attachments

(1 file, 3 obsolete files)

We now have pico installed in the gonk layer in b2g (bug #863174). We need a service for it that will use it.
Attached patch A WIP (obsolete) — Splinter Review
Attachment #798430 - Flags: review?(bugs)
Comment on attachment 798430 [details] [diff] [review]
Introduce Svox Pico speech synthesis service

One thing I noticed while browsing the patch:
>+    // We are almost always in the main thread here, but it is not gaurenteed.

Nit: "guaranteed".
Comment on attachment 798430 [details] [diff] [review]
Introduce Svox Pico speech synthesis service

build changes need review from a peer.
Attachment #798430 - Flags: review?(gps)
Comment on attachment 798430 [details] [diff] [review]
Introduce Svox Pico speech synthesis service


>+// Pico API constants
>+
>+#define PICO_MEM_SIZE 2500000 // size of memory allocated for engine and voice
>+
>+#define PICO_RETSTRINGSIZE 200  // max length of returned strings
>+
>+#define PICO_MAX_OUTBUF_SIZE 128 // max size of synth data buffer
The comments could be a bit better and I wonder if we should have (optional) prefs for this, and
perhaps just default to these values if the prefs aren't set.
Especially PICO_MEM_SIZE is something which might need tweaking and using a pref makes it easier to test
various values.

>+static class PicoApi
Hmm, static class. Why?

>+{
>+public:
>+
>+  PicoApi() : mInitialized(false) {}
>+
>+  bool Init()
>+  {
>+    if (mInitialized) {
>+      return true;
>+    }
>+
>+    void* handle = dlopen("libttspico.so", RTLD_LAZY);
Would it be possible to try to open the library also on OSX and windows, if it is installed?
Could you confirm bsmedberg or someone that this kind of dlopen/dlsym usage is ok.

>+
>+  pico_Status (* pico_initialize)(void*, uint32_t, pico_System*);
>+  pico_Status (* pico_terminate)(pico_System*);
>+  pico_Status (* pico_getSystemStatusMessage)(
>+    pico_System, pico_Status, pico_Retstring);
>+
>+  pico_Status (* pico_loadResource)(pico_System, const char*, pico_Resource*);
>+  pico_Status (* pico_unloadResource)(pico_System, pico_Resource*);
>+  pico_Status (* pico_getResourceName)(
>+    pico_System, pico_Resource, pico_Retstring);
>+  pico_Status (* pico_createVoiceDefinition)(pico_System, const char*);
>+  pico_Status (* pico_addResourceToVoiceDefinition)(
>+    pico_System, const char*, const char*);
>+  pico_Status (* pico_releaseVoiceDefinition)(pico_System, const char*);
>+  pico_Status (* pico_newEngine)(pico_System, const char*, pico_Engine*);
>+  pico_Status (* pico_disposeEngine)(pico_System, pico_Engine*);
>+
>+  pico_Status (* pico_resetEngine)(pico_Engine, int32_t);
>+  pico_Status (* pico_putTextUtf8)(
>+    pico_Engine, const char*, const int16_t, int16_t*);
>+  pico_Status (* pico_getData)(
>+    pico_Engine, void*, const int16_t, int16_t*, int16_t*);

Why these don't have 'm' prefix


>+#define PICO_ENSURE_SUCCESS(_funcName, _status)                           \
Maybe PICO_ENSURE_SUCCESS_VOID

>+#define PICO_ENSURE_SUCCESS_RV(_funcName, _status, _rv)                   \
And drop _RV from this.
That would be more compatible NS_ macros
>+  ~PicoCallbackRunnable()
>+  {
>+    // We are almost always in the main thread here, but it is not gaurenteed.
>+    nsCOMPtr<nsIThread> mainThread;
>+    NS_GetMainThread(getter_AddRefs(mainThread));
>+    NS_ProxyRelease(mainThread, mService, false);
Use nsMainThreadPtrHandle for mService?



>+
>+  nsISpeechTask* mTask;
What keeps mTask alive while this runnable is being processed?


>+  // Add SSML markup for pitch and rate
>+  nsPrintfCString markedUpText(
>+    "<pitch level=\"%0.0f\"><speed level=\"%0.0f\">%s</speed></pitch>",
>+    std::min(std::max(50.0f, mPitch * 100), 200.0f),
>+    std::min(std::max(20.0f, mRate * 100), 500.0f),
>+    mText.get());
We don't need namespaces or anything?


>+
>+  const char* text = markedUpText.get();
>+  size_t buffer_size = 512, buffer_offset = 0;
>+  nsRefPtr<SharedBuffer> buffer = SharedBuffer::Create(buffer_size);
>+  int16_t text_offset = 0, bytes_recv = 0, bytes_sent = 0, out_data_type = 0;
>+  int16_t text_remaining = strlen(text) + 1;
markedUpText.Length()


>+
>+  // Run this loop while this is the current task
>+  while (IsCurrentTask()) {
pico doesn't have any callback based way to read data

>+      if (mFirstData) {
>+        task->Setup(mCallback, 1, 16000, 2);
Could you #define 1, 16000 and 2 somewhere

>+      }
>+
>+      return task->SendAudioNative(
>+        mBufferSize ? (short*)mBuffer->Data() : nullptr, mBufferSize / 2);
C++ casting please
>+struct VoiceTraverserData {
{ to the next line

>+    if (GetVoiceFileLanguage(leafName, lang)) {
>+      nsAutoString uri;
>+      uri.AssignLiteral("urn:moz-tts:pico:");
>+      uri.Append(lang);
>+
>+      bool found = false;
>+      PicoVoice* voice = mVoices.GetWeak(uri, &found);
>+
>+      if (!found) {
>+        voice = new PicoVoice(lang);
>+        mVoices.Put(uri, voice);
>+      }
>+
>+      if (StringEndsWith(leafName, NS_LITERAL_CSTRING("_ta.bin"))) {
>+        rv = voiceFile->GetPersistentDescriptor(voice->mTaFile);
>+        MOZ_ASSERT(NS_SUCCEEDED(rv));
>+      }
>+
>+      if (StringEndsWith(leafName, NS_LITERAL_CSTRING("_sg.bin"))) {
>+        rv = voiceFile->GetPersistentDescriptor(voice->mSgFile);
>+        MOZ_ASSERT(NS_SUCCEEDED(rv));
>+      }
This stuff needs some explanation. _ta.bin? _sg.bin?

>+
>+  nsISpeechTask* mCurrentTask;
Is it guaranteed that this won't ever point to a deleted object?


This patch needs probably few iterations.
Attachment #798430 - Flags: review?(bugs) → review-
(In reply to Olli Pettay [:smaug] from comment #5)
> Comment on attachment 798430 [details] [diff] [review]
> Introduce Svox Pico speech synthesis service
> 
> 
> >+// Pico API constants
> >+
> >+#define PICO_MEM_SIZE 2500000 // size of memory allocated for engine and voice
> >+
> >+#define PICO_RETSTRINGSIZE 200  // max length of returned strings
> >+
> >+#define PICO_MAX_OUTBUF_SIZE 128 // max size of synth data buffer
> The comments could be a bit better and I wonder if we should have (optional)
> prefs for this, and
> perhaps just default to these values if the prefs aren't set.
> Especially PICO_MEM_SIZE is something which might need tweaking and using a
> pref makes it easier to test
> various values.
> 

I agree on the comments. About the values, the PICO_MEM_SIZE simply needs to be enough for an instantiated engine and a single voice. In Android's adapter, it is 2500000. Same with the PICO_OUTBUF_SIZE, which is 128 bytes. You could see it here:
http://git.mozilla.org/?p=external/caf/platform/external/svox.git;a=blob;f=pico/tts/com_svox_picottsengine.cpp;h=23709642ce4c7bb8724cd10271f5e97aaa7107fe;hb=ics_strawberry#l54
http://git.mozilla.org/?p=external/caf/platform/external/svox.git;a=blob;f=pico/tts/com_svox_picottsengine.cpp;h=23709642ce4c7bb8724cd10271f5e97aaa7107fe;hb=ics_strawberry#l69

> >+static class PicoApi
> Hmm, static class. Why?
> 

My inspiration has been AndroidGraphicBuffer.cpp/GLFunctions.

> >+{
> >+public:
> >+
> >+  PicoApi() : mInitialized(false) {}
> >+
> >+  bool Init()
> >+  {
> >+    if (mInitialized) {
> >+      return true;
> >+    }
> >+
> >+    void* handle = dlopen("libttspico.so", RTLD_LAZY);
> Would it be possible to try to open the library also on OSX and windows, if
> it is installed?
> Could you confirm bsmedberg or someone that this kind of dlopen/dlsym usage
> is ok.
> 

I guess it would be nice to support that. Have not tried. Assume the shard lib extensions are different on those platforms. Adding needinfo flag for bsmedberg.

> >+
> >+  pico_Status (* pico_initialize)(void*, uint32_t, pico_System*);
> >+  pico_Status (* pico_terminate)(pico_System*);
> >+  pico_Status (* pico_getSystemStatusMessage)(
> >+    pico_System, pico_Status, pico_Retstring);
> >+
> >+  pico_Status (* pico_loadResource)(pico_System, const char*, pico_Resource*);
> >+  pico_Status (* pico_unloadResource)(pico_System, pico_Resource*);
> >+  pico_Status (* pico_getResourceName)(
> >+    pico_System, pico_Resource, pico_Retstring);
> >+  pico_Status (* pico_createVoiceDefinition)(pico_System, const char*);
> >+  pico_Status (* pico_addResourceToVoiceDefinition)(
> >+    pico_System, const char*, const char*);
> >+  pico_Status (* pico_releaseVoiceDefinition)(pico_System, const char*);
> >+  pico_Status (* pico_newEngine)(pico_System, const char*, pico_Engine*);
> >+  pico_Status (* pico_disposeEngine)(pico_System, pico_Engine*);
> >+
> >+  pico_Status (* pico_resetEngine)(pico_Engine, int32_t);
> >+  pico_Status (* pico_putTextUtf8)(
> >+    pico_Engine, const char*, const int16_t, int16_t*);
> >+  pico_Status (* pico_getData)(
> >+    pico_Engine, void*, const int16_t, int16_t*, int16_t*);
> 
> Why these don't have 'm' prefix
> 

Wherever I have seen previous uses of dlsyms it has been without an 'm' prefix.

> 
> >+#define PICO_ENSURE_SUCCESS(_funcName, _status)                           \
> Maybe PICO_ENSURE_SUCCESS_VOID
> 
> >+#define PICO_ENSURE_SUCCESS_RV(_funcName, _status, _rv)                   \
> And drop _RV from this.
> That would be more compatible NS_ macros

Sounds good.

> >+  ~PicoCallbackRunnable()
> >+  {
> >+    // We are almost always in the main thread here, but it is not gaurenteed.
> >+    nsCOMPtr<nsIThread> mainThread;
> >+    NS_GetMainThread(getter_AddRefs(mainThread));
> >+    NS_ProxyRelease(mainThread, mService, false);
> Use nsMainThreadPtrHandle for mService?
> 

Awesome, good to know.

> 
> 
> >+
> >+  nsISpeechTask* mTask;
> What keeps mTask alive while this runnable is being processed?
> 

We compare to make sure nsPicoService and the runnable have the same task. If they are not, this one is canceled/dead and the runnable stops.

> 
> >+  // Add SSML markup for pitch and rate
> >+  nsPrintfCString markedUpText(
> >+    "<pitch level=\"%0.0f\"><speed level=\"%0.0f\">%s</speed></pitch>",
> >+    std::min(std::max(50.0f, mPitch * 100), 200.0f),
> >+    std::min(std::max(20.0f, mRate * 100), 500.0f),
> >+    mText.get());
> We don't need namespaces or anything?
> 
> 

You are saying add the namespace on top, or that we already have it? I'll try both.

> >+
> >+  const char* text = markedUpText.get();
> >+  size_t buffer_size = 512, buffer_offset = 0;
> >+  nsRefPtr<SharedBuffer> buffer = SharedBuffer::Create(buffer_size);
> >+  int16_t text_offset = 0, bytes_recv = 0, bytes_sent = 0, out_data_type = 0;
> >+  int16_t text_remaining = strlen(text) + 1;
> markedUpText.Length()
> 

Ok

> 
> >+
> >+  // Run this loop while this is the current task
> >+  while (IsCurrentTask()) {
> pico doesn't have any callback based way to read data
> 

Nope.

> >+      if (mFirstData) {
> >+        task->Setup(mCallback, 1, 16000, 2);
> Could you #define 1, 16000 and 2 somewhere
> 

Sure.

> >+      }
> >+
> >+      return task->SendAudioNative(
> >+        mBufferSize ? (short*)mBuffer->Data() : nullptr, mBufferSize / 2);
> C++ casting please
> >+struct VoiceTraverserData {
> { to the next line
> 

Ok

> >+    if (GetVoiceFileLanguage(leafName, lang)) {
> >+      nsAutoString uri;
> >+      uri.AssignLiteral("urn:moz-tts:pico:");
> >+      uri.Append(lang);
> >+
> >+      bool found = false;
> >+      PicoVoice* voice = mVoices.GetWeak(uri, &found);
> >+
> >+      if (!found) {
> >+        voice = new PicoVoice(lang);
> >+        mVoices.Put(uri, voice);
> >+      }
> >+
> >+      if (StringEndsWith(leafName, NS_LITERAL_CSTRING("_ta.bin"))) {
> >+        rv = voiceFile->GetPersistentDescriptor(voice->mTaFile);
> >+        MOZ_ASSERT(NS_SUCCEEDED(rv));
> >+      }
> >+
> >+      if (StringEndsWith(leafName, NS_LITERAL_CSTRING("_sg.bin"))) {
> >+        rv = voiceFile->GetPersistentDescriptor(voice->mSgFile);
> >+        MOZ_ASSERT(NS_SUCCEEDED(rv));
> >+      }
> This stuff needs some explanation. _ta.bin? _sg.bin?
> 

Ok,

> >+
> >+  nsISpeechTask* mCurrentTask;
> Is it guaranteed that this won't ever point to a deleted object?
> 

Holding a strong reference could potentially leak the window. The only way it would be prematurely deleted is if it were cancelled, so we remove the pointer in OnCancelled in the callback. I guess I could make mCurrentTask officially a weak reference...

> 
> This patch needs probably few iterations.

Of course, are you ok with not adding prefs for the defines above? And the sPicoApi naming scheme?
Flags: needinfo?(benjamin)
> > >+static class PicoApi
> > Hmm, static class. Why?
> > 
> 
> My inspiration has been AndroidGraphicBuffer.cpp/GLFunctions.
Well, it is not clear to me what static class in C++ even does.
All the comments I've found say that "In C++, a 'static class' has no meaning."

> We compare to make sure nsPicoService and the runnable have the same task.
> If they are not, this one is canceled/dead and the runnable stops.
Then the raw member variables certainly need few comments

> 
> You are saying add the namespace on top, or that we already have it? I'll
> try both.
Trying to say that SSML elements probably should be in some namespace


> > 
> > This patch needs probably few iterations.
> 
> Of course, are you ok with not adding prefs for the defines above? And the
> sPicoApi naming scheme?
I guess so, but please add some more comments to those defines to explain why the value is what it is.
Is there special hardware support for pico, or is it entirely in software? Should we consider doing this via emscripten rather than compiled code?
Flags: needinfo?(benjamin)
(In reply to Benjamin Smedberg  [:bsmedberg] from comment #8)
> Should we consider doing this via emscripten rather than compiled code?

We did the driver in JS at first, but it had this terrible stuttering you can hear in my Firefox OS accessibility demo video: http://www.youtube.com/watch?v=gNWcCCddSm8&feature=youtube_gdata_player

With the patch written in C++, this latency and stuttering is completely gone. So I would recommend against it.
"it" being the implementation in JS or emscripten.
Is there a bug or newsgroup history I can read about the prior emscripten effort? Also, what is the urgency of this patch/what project does this bug block? If we can fix some code or platform issues and get this running via emscripten, that is strongly preferable to introducing binary attack points.

At this point I recommend making this #ifdef MOZ_B2G only and don't worry about DLL loading issues on mac or windows; we would definitely need to avoid dll injection attacks on Windows by using an absolute path of some form or another.
Flags: needinfo?(eitan)
(In reply to Benjamin Smedberg  [:bsmedberg] from comment #11)
> Is there a bug or newsgroup history I can read about the prior emscripten

iirc what there is in bug 525444

> effort? Also, what is the urgency of this patch/what project does this bug

Well, we need to do something like this to make screen readers and some other assistive technology type apps work on b2g, so reasonably high I'd say.

> block? If we can fix some code or platform issues and get this running via
> emscripten, that is strongly preferable to introducing binary attack points.

It comes down to tts just being fairly compute heavy and latency being really important, so in the long run its probably possible, but its not clear to me that its a big deal given we have and are adding lots of C++ for all sorts of other things...

> At this point I recommend making this #ifdef MOZ_B2G only and don't worry

that's certainly the easiest thing to do for now

> about DLL loading issues on mac or windows; we would definitely need to
> avoid dll injection attacks on Windows by using an absolute path of some
> form or another.

unrelated, but I'm curious how such an attack makes any sense, surely the dynamic linker won't load dlls the user doesn't implicitly trust?
(In reply to Benjamin Smedberg  [:bsmedberg] from comment #11)
> Is there a bug or newsgroup history I can read about the prior emscripten
> effort?

I didn't do an emscripten version, but I used js-ctypes. The problem was variable latencies between the main thread and the javascript worker that was pumping audio samples. Some messages would be delayed by hundreds of milliseconds (yes, we tried transferable buffers). So an emscripten version would have the same problem, because we would probably delegate it to a worker as well. Plus the usual js overhead.

The other, less technical issue, is that we succeeded in getting pico into gonk by default. So it would be sad to see that reversed or unused :P
Flags: needinfo?(eitan)
I would assume emscripten version would use quite a bit more memory.
There has been work elsewhere to convert various stuff from JS to C++ because of memory usage.
Comment on attachment 798430 [details] [diff] [review]
Introduce Svox Pico speech synthesis service

Review of attachment 798430 [details] [diff] [review]:
-----------------------------------------------------------------

Build changes are almost there.

::: configure.in
@@ +7692,5 @@
>  dnl ========================================================
> +dnl = Enable Pico Speech Synthesis (Gonk usually)
> +dnl ========================================================
> +MOZ_ARG_ENABLE_BOOL(synth-pico,
> +[  --enable-b2g-bt      Set compile flags necessary for compiling Pico Web Speech API ],

b2g-bt??

::: content/media/webspeech/synth/Makefile.in
@@ +10,5 @@
>  include $(topsrcdir)/dom/dom-config.mk
>  
>  VPATH += \
>    $(srcdir)/ipc \
> +  $(srcdir)/services \

Why do you use VPATH here? We don't like to use VPATH because it is evil.

::: content/media/webspeech/synth/pico/Makefile.in
@@ +11,5 @@
> +
> +include $(DEPTH)/config/autoconf.mk
> +include $(topsrcdir)/dom/dom-config.mk
> +
> +EXPORT_LIBRARY  = 1

EXPORT_LIBRARY is now supported in moz.build. Don't define it here.
Attachment #798430 - Flags: review?(gps) → feedback+
(In reply to Olli Pettay [:smaug] from comment #5)
> >+  ~PicoCallbackRunnable()
> >+  {
> >+    // We are almost always in the main thread here, but it is not gaurenteed.
> >+    nsCOMPtr<nsIThread> mainThread;
> >+    NS_GetMainThread(getter_AddRefs(mainThread));
> >+    NS_ProxyRelease(mainThread, mService, false);
> Use nsMainThreadPtrHandle for mService?
> 
> 

This actually won't work because it forbids us from dereferencing off the main thread.

> 
> >+
> >+  nsISpeechTask* mTask;
> What keeps mTask alive while this runnable is being processed?
> 
> 
> >+  // Add SSML markup for pitch and rate
> >+  nsPrintfCString markedUpText(
> >+    "<pitch level=\"%0.0f\"><speed level=\"%0.0f\">%s</speed></pitch>",
> >+    std::min(std::max(50.0f, mPitch * 100), 200.0f),
> >+    std::min(std::max(20.0f, mRate * 100), 500.0f),
> >+    mText.get());
> We don't need namespaces or anything?

No, pico doesn't use an XML parser, it is a simple tokenizer. So I don't think there is much point here for name spaces.
Fixed what I could from Olli's feedback, wasn't all possible (like the nsMainThreadPtrHandle),
and changed build stuff as per Gregory's feedback
Attachment #797166 - Attachment is obsolete: true
Attachment #798430 - Attachment is obsolete: true
Attachment #801008 - Flags: review?(gps)
Attachment #801008 - Flags: review?(bugs)
Comment on attachment 801008 [details] [diff] [review]
Introduce Svox Pico speech synthesis service

Review of attachment 801008 [details] [diff] [review]:
-----------------------------------------------------------------

::: content/media/webspeech/synth/pico/Makefile.in
@@ +4,5 @@
> +
> +DEPTH            := @DEPTH@
> +topsrcdir        := @top_srcdir@
> +srcdir           := @srcdir@
> +VPATH            := @srcdir@

You don't need this boilerplate any more.

@@ +5,5 @@
> +DEPTH            := @DEPTH@
> +topsrcdir        := @top_srcdir@
> +srcdir           := @srcdir@
> +VPATH            := @srcdir@
> +FAIL_ON_WARNINGS := 1

FAIL_ON_WARNINGS should be defined in moz.build files.

::: layout/build/Makefile.in
@@ +204,5 @@
> +ifdef MOZ_SYNTH_PICO
> +SHARED_LIBRARY_LIBS += \
> +  $(DEPTH)/content/media/webspeech/synth/pico/$(LIB_PREFIX)synthpico.$(LIB_SUFFIX) \
> +  $(NULL)
> +endif

SHARED_LIBRARY_LIBS is part of moz.build now.
Attachment #801008 - Flags: review?(gps) → feedback+
Comment on attachment 801008 [details] [diff] [review]
Introduce Svox Pico speech synthesis service


>+
>+#define PICO_ENSURE_SUCCESS(_funcName, _status, _rv)                   \
>+  if (_status < 0) {                                                      \
Nit, the first \ isn't aligned properly


>+class PicoVoice
>+{
>+public:
>+
>+  PicoVoice(const nsAutoString& aLanguage)
nsAutoString? You want nsAString&


>+  ~PicoCallbackRunnable()
>+  {
>+    // We are almost always in the main thread here, but it is not gaurenteed.
>+    nsCOMPtr<nsIThread> mainThread;
>+    NS_GetMainThread(getter_AddRefs(mainThread));
>+    nsPicoService* service = nullptr;
>+    mService.swap(service);
>+    NS_ProxyRelease(mainThread, service, false);
Couldn't you just pass mService.forget().get() to NS_ProxyRelease. No need for temporary service variable.

>+  }
>+
>+  NS_DECL_ISUPPORTS
This is wrong. nsRunnable has already refcnt and QI. Use _INHERITED

>+
>+  // Add SSML markup for pitch and rate
Could you still add some comment why namespace isn't needed

>+  if (langPath.IsEmpty()) {
>+    langPath.AssignLiteral("/system/tts/lang_pico");
Could you #define this somewhere

>+
>+  nsISpeechTask* mCurrentTask;
This should be Atomic<nsISpeechTask*>
Comment on attachment 801008 [details] [diff] [review]
Introduce Svox Pico speech synthesis service


>+nsPicoService::nsPicoService()
>+  : mInitialized(false)
>+  , mVoicesMonitor("nsPicoService::mVoices")
>+  , mCurrentTask(nullptr)
>+  , mPicoSystem(nullptr)
>+  , mPicoEngine(nullptr)
>+  , mSgResource(nullptr)
>+  , mTaResource(nullptr)
>+  , mPicoMemArea(nullptr)
>+{
>+  nsCOMPtr<nsIRunnable> event = NS_NewRunnableMethod(this, &nsPicoService::Init);
>+  DebugOnly<nsresult> rv = NS_DispatchToCurrentThread(event);
>+  MOZ_ASSERT(NS_SUCCEEDED(rv));
>+}
We need to make sure PicoService is initialized off-main-thread since it does plenty of IO
(although it is perhaps just once per process).
I don't see anything making the guarantee.
Attachment #801008 - Flags: review?(bugs) → review-
- moz.build updates.
- Use nsISupports thread-safe decleration macros.
- Also with PicoVoice, changed to thread-safe add/remove ref macro. We are now creating them off main thread.
- Do initialization off main thread, but go back to main thread to register the voices.
Attachment #804810 - Flags: review?(bugs)
Comment on attachment 804810 [details] [diff] [review]
Introduce Svox Pico speech synthesis service

>+static const mozilla::Module::CategoryEntry kCategories[] = {
>+  { "app-startup", "Pico Speech Synth", "service," PICOSERVICE_CONTRACTID },
>+  { NULL }

I wonder if the service should be instantiated on startup only if the user has set some pref,
and otherwise lazily. Perhaps worth a followup bug.

>+  const char* text = markedUpText.get();
>+  size_t buffer_size = 512, buffer_offset = 0;
>+  nsRefPtr<SharedBuffer> buffer = SharedBuffer::Create(buffer_size);
Why you create buffer here, but use it way later? I'd prefer creating it when first needed.
>+    do {
>+      // Run this loop while the result of getData is STEP_BUSY, when it finishes
>+      // synthesizing audio for the given text, it returns STEP_IDLE. We then
>+      // break to the outer loop and feed more text, if there is any left.
>+      if (!IsCurrentTask()) {
>+        // If the task has changed, quit.
>+        break;
>+      }
>+
>+      if (buffer_size - buffer_offset < PICO_MAX_OUTBUF_SIZE) {
>+        // The next audio data retrieved may be bigger than our buffer,
>+        // so send the data and flush the buffer.
This could use a bit better comment. Hard to understand this way when buffer_size is 512 and PICO_MAX_OUTBUF_SIZE is 128

>+
>+  PicoVoice* mCurrentVoice;
Could you add a getter to which makes access happens only on non-main-thread
Attachment #804810 - Flags: review?(bugs) → review+
(In reply to Olli Pettay [:smaug] from comment #22)
> Comment on attachment 804810 [details] [diff] [review]
> Introduce Svox Pico speech synthesis service
> 
> >+static const mozilla::Module::CategoryEntry kCategories[] = {
> >+  { "app-startup", "Pico Speech Synth", "service," PICOSERVICE_CONTRACTID },
> >+  { NULL }
> 
> I wonder if the service should be instantiated on startup only if the user
> has set some pref,
> and otherwise lazily. Perhaps worth a followup bug.
> 

That would be very nice to have by default. Having this on startup is only necessary because getVoices() is synchronous. If we fix the spec[1] it could be done lazily all the time.

https://www.w3.org/Bugs/Public/show_bug.cgi?id=22003

> >+  const char* text = markedUpText.get();
> >+  size_t buffer_size = 512, buffer_offset = 0;
> >+  nsRefPtr<SharedBuffer> buffer = SharedBuffer::Create(buffer_size);
> Why you create buffer here, but use it way later? I'd prefer creating it
> when first needed.

The decleration needs to be outside the `while (IsCurrentTask())` scope to span multiple loop iterations. I could set it to null, and only create it before pico_getData. Is that what you mean?

> >+    do {
> >+      // Run this loop while the result of getData is STEP_BUSY, when it finishes
> >+      // synthesizing audio for the given text, it returns STEP_IDLE. We then
> >+      // break to the outer loop and feed more text, if there is any left.
> >+      if (!IsCurrentTask()) {
> >+        // If the task has changed, quit.
> >+        break;
> >+      }
> >+
> >+      if (buffer_size - buffer_offset < PICO_MAX_OUTBUF_SIZE) {
> >+        // The next audio data retrieved may be bigger than our buffer,
> >+        // so send the data and flush the buffer.
> This could use a bit better comment. Hard to understand this way when
> buffer_size is 512 and PICO_MAX_OUTBUF_SIZE is 128
> 

I will change the language to chunk and buffer. You get the data from pico in chunks, and the buffer must be large enough to contain the max size of a chunk.

> >+
> >+  PicoVoice* mCurrentVoice;
> Could you add a getter to which makes access happens only on non-main-thread

Sounds great.
(In reply to Eitan Isaacson [:eeejay] from comment #23)
> The decleration needs to be outside the `while (IsCurrentTask())` scope to
> span multiple loop iterations. I could set it to null, and only create it
> before pico_getData. Is that what you mean?
Yes. Though, perhaps the current setup isn't too bad either.
Would be nice to uplift this and friends to 1.2...
https://hg.mozilla.org/integration/mozilla-inbound/rev/ac9d3818894c

I missed yesterday's merge since inbound was closed.
Assignee: nobody → eitan
OS: Linux → Gonk (Firefox OS)
Hardware: x86_64 → All
https://hg.mozilla.org/mozilla-central/rev/ac9d3818894c
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla27
Comment on attachment 804810 [details] [diff] [review]
Introduce Svox Pico speech synthesis service

[Approval Request Comment]
Bug caused by (feature/regressing bug #): Unfinished dependency of bug 863174.
User impact if declined: Firefox OS app developers would not be able to use the screen reader on a FF OS device to test their apps for accessibility.
Testing completed (on m-c, etc.): Yes. I've confirmed that this builds and flashes to a Firefox OS device and can be used by the screen reader when building from Mozilla-Central.
Risk to taking this patch (and alternatives if risky): None known, this is simply a service used by the screen reader when activated in Developer settings. The synthesizer itself is in Gonk.
String or IDL/UUID changes made by this patch: None.
Attachment #804810 - Flags: approval-mozilla-aurora?
Verified that this builds and flashes to a Firefox OS device without any additional pieces and that the synthesizer works fine when used from the screen reader.
Status: RESOLVED → VERIFIED
Attachment #804810 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Attachment #801008 - Attachment is obsolete: true
Depends on: 918989
Component: DOM → DOM: Core & HTML
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: