Closed Bug 1541290 Opened 5 years ago Closed 5 years ago

Crash on Web Speech API, (Speech Recognition portion) when feeding audio from the microphone

Categories

(Core :: Audio/Video: MediaStreamGraph, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla68
Tracking Status
firefox68 --- fixed

People

(Reporter: anatal, Assigned: achronop)

References

Details

Attachments

(2 files, 1 obsolete file)

STR:

1 - Build Firefox from trunk. I used commit b2cbe5a
2 - Once it's built, start it with ./mach run
3 - Go to about:config and turn on the following prefs:

  • media.webspeech.recognition.enable
  • media.webspeech.recognition.force_enable
  • media.webspeech.test.fake_fsm_events
  • media.webspeech.test.fake_recognition_service

4 - Navigate to any website which calls the Web Speech API and feeds audio from the microphone. Use this for example, and click: Start WSA microphone testing. The crash doesn't happen when feeding a DomMediaStream loaded with audio frames
5 - Let it capture more than 10 seconds of audio and the crash should happen. I managed to reproduce it on both OSX [1] as Linux [2]

[1]
Assertion failure: !IsStarted() && OnCubebOperationThread(), at /Users/anatal/projects/mozilla/gecko-dev/dom/media/GraphDriver.cpp:684
#01: mozilla::AudioCallbackDriver::Init()[/Users/anatal/projects/mozilla/obj-x86_64-apple-darwin18.2.0/dist/NightlyDebug.app/Contents/MacOS/XUL +0x2ffbed1]
#02: mozilla::AsyncCubebTask::Run()[/Users/anatal/projects/mozilla/obj-x86_64-apple-darwin18.2.0/dist/NightlyDebug.app/Contents/MacOS/XUL +0x2ffb7c7]
#03: nsThreadPool::Run()[/Users/anatal/projects/mozilla/obj-x86_64-apple-darwin18.2.0/dist/NightlyDebug.app/Contents/MacOS/XUL +0x158b12]
#04: non-virtual thunk to nsThreadPool::Run()[/Users/anatal/projects/mozilla/obj-x86_64-apple-darwin18.2.0/dist/NightlyDebug.app/Contents/MacOS/XUL +0x158e8d]
#05: nsThread::ProcessNextEvent(bool, bool*)[/Users/anatal/projects/mozilla/obj-x86_64-apple-darwin18.2.0/dist/NightlyDebug.app/Contents/MacOS/XUL +0x151e24]
#06: NS_ProcessNextEvent(nsIThread*, bool)[/Users/anatal/projects/mozilla/obj-x86_64-apple-darwin18.2.0/dist/NightlyDebug.app/Contents/MacOS/XUL +0x155ea8]
#07: mozilla::ipc::MessagePumpForNonMainThreads::Run(base::MessagePump::Delegate*)[/Users/anatal/projects/mozilla/obj-x86_64-apple-darwin18.2.0/dist/NightlyDebug.app/Contents/MacOS/XUL +0x846ced]
#08: MessageLoop::Run()[/Users/anatal/projects/mozilla/obj-x86_64-apple-darwin18.2.0/dist/NightlyDebug.app/Contents/MacOS/XUL +0x7e1a15]
#09: nsThread::ThreadFunc(void*)[/Users/anatal/projects/mozilla/obj-x86_64-apple-darwin18.2.0/dist/NightlyDebug.app/Contents/MacOS/XUL +0x14e534]
#10: _pt_root[/Users/anatal/projects/mozilla/obj-x86_64-apple-darwin18.2.0/dist/NightlyDebug.app/Contents/MacOS/libnss3.dylib +0x16f490]
#11: _pthread_body[/usr/lib/system/libsystem_pthread.dylib +0x3305]

[2]
[Child 15665, AudioIPC0] WARNING: No audio tracks, but full-duplex audio is enabled!!!!!: file /home/andre/projects/gecko-dev-speech/dom/media/MediaStreamGraph.cpp, line 334
Assertion failure: !IsStarted() && OnCubebOperationThread(), at /home/andre/projects/gecko-dev-speech/dom/media/GraphDriver.cpp:684
#01: ???[/home/andre/projects/obj-x86_64-pc-linux-gnu/dist/bin/libxul.so +0x3a0129d]
#02: ???[/home/andre/projects/obj-x86_64-pc-linux-gnu/dist/bin/libxul.so +0x3a00cf8]
#03: ???[/home/andre/projects/obj-x86_64-pc-linux-gnu/dist/bin/libxul.so +0x17706fc]
#04: ???[/home/andre/projects/obj-x86_64-pc-linux-gnu/dist/bin/libxul.so +0x17707b7]
#05: ???[/home/andre/projects/obj-x86_64-pc-linux-gnu/dist/bin/libxul.so +0x176ccc7]
#06: ???[/home/andre/projects/obj-x86_64-pc-linux-gnu/dist/bin/libxul.so +0x176ebca]
#07: ???[/home/andre/projects/obj-x86_64-pc-linux-gnu/dist/bin/libxul.so +0x1d442ae]
#08: ???[/home/andre/projects/obj-x86_64-pc-linux-gnu/dist/bin/libxul.so +0x1cd71b1]
#09: ???[/home/andre/projects/obj-x86_64-pc-linux-gnu/dist/bin/libxul.so +0x1cd710a]
#10: ???[/home/andre/projects/obj-x86_64-pc-linux-gnu/dist/bin/libxul.so +0x176a95a]
#11: ???[/home/andre/projects/obj-x86_64-pc-linux-gnu/dist/bin/libnspr4.so +0x2a704]
#12: ???[/lib/x86_64-linux-gnu/libpthread.so.0 +0x76db]
#13: clone[/lib/x86_64-linux-gnu/libc.so.6 +0x12188f]
#14: ??? (???:???)

Blocks: 1248897
Flags: needinfo?(apehrson)

I just got this crash at least a minute after the recognition have had ended.

This is because on a started AudioCallbackDriver we get an AudioCallbackDriver::Revive from here:

#0 0x00007f23b7f823f5 in mozilla::AsyncCubebTask::AsyncCubebTask(mozilla::AudioCallbackDriver*, mozilla::AsyncCubebOperation) (this=0x7f23a4727180, aDriver=0x7f23a738a390, aOperation=mozilla::INIT)
at /home/achronop/myrepos/mozilla/firefox/dom/media/GraphDriver.cpp:418
#1 0x00007f23b7f83824 in mozilla::AudioCallbackDriver::Revive() (this=0x7f23a738a390) at /home/achronop/myrepos/mozilla/firefox/dom/media/GraphDriver.cpp:718
#2 0x00007f23b805ea2a in mozilla::MediaStreamGraphImpl::RunInStableState(bool) (this=0x7f23a479b800, aSourceIsMSG=false) at /home/achronop/myrepos/mozilla/firefox/dom/media/MediaStreamGraph.cpp:1703
#3 0x00007f23b8068ff2 in mozilla::(anonymous namespace)::MediaStreamGraphStableStateRunnable::Run() (this=0x7f23a47273c0) at /home/achronop/myrepos/mozilla/firefox/dom/media/MediaStreamGraph.cpp:1586
#4 0x00007f23b5c028b0 in mozilla::CycleCollectedJSContext::ProcessStableStateQueue() (this=0x7f23aa320000) at /home/achronop/myrepos/mozilla/firefox/xpcom/base/CycleCollectedJSContext.cpp:382
#5 0x00007f23b5c036c7 in mozilla::CycleCollectedJSContext::AfterProcessTask(unsigned int) (this=0x7f23aa320000, aRecursionDepth=<optimized out>)
at /home/achronop/myrepos/mozilla/firefox/xpcom/base/CycleCollectedJSContext.cpp:441

Looking at our code I realize that the way it is written, Revive of a started AudioCallbackDriver is not supported/expected. This is because the AudioCallbackDriver::mSarted member is set once at AudioCallbackDriver::StartStream and is never cleared, maybe on AudioCallbackDriver::Stop, so we always hit this assert on Revive. Since we expect that this method will restart the same driver, this looks like a bug.

Paul, can you comment on what is expected here? Is the assert wrong, or is state wrong? You probably know this code best.

Flags: needinfo?(apehrson) → needinfo?(padenot)

Hmm I did a patch to fix the assert problem and now I am hitting a new assert inside cycle collection which looks strange:

#0 0x00007f069f5f9570 in __GI___nanosleep (requested_time=requested_time@entry=0x7fff56a83880, remaining=remaining@entry=0x7fff56a83880) at ../sysdeps/unix/sysv/linux/nanosleep.c:27
#1 0x00007f069f5f947a in sleep (seconds=0) at ../sysdeps/posix/sleep.c:55
#2 0x00007f069037f81e in ah_crap_handler(int) (signum=11) at /home/achronop/myrepos/mozilla/firefox/toolkit/xre/nsSigHandlers.cpp:95
#3 0x00007f069037f8f7 in child_ah_crap_handler(int) (signum=11) at /home/achronop/myrepos/mozilla/firefox/toolkit/xre/nsSigHandlers.cpp:105
#4 0x00007f0691258077 in WasmTrapHandler(int, siginfo_t*, void*) (signum=11, info=0x7fff56a83ab0, context=<optimized out>) at /home/achronop/myrepos/mozilla/firefox/js/src/wasm/WasmSignalHandlers.cpp:962
#5 0x00007f06a03ec1c0 in <signal handler called> () at /lib64/libpthread.so.0
#6 0x00007f068c4d9477 in mozilla::CycleCollectedJSContext::CleanupIDBTransactions(unsigned int) (this=<optimized out>, aRecursionDepth=2)
at /home/achronop/myrepos/mozilla/firefox/xpcom/base/CycleCollectedJSContext.cpp:391
#7 0x00007f068c4d9f4c in mozilla::CycleCollectedJSContext::PerformMicroTaskCheckPoint(bool) (this=0x7f0681819000, aForce=false)
at /home/achronop/myrepos/mozilla/firefox/xpcom/base/CycleCollectedJSContext.cpp:543
#8 0x00007f068c4da2cc in mozilla::CycleCollectedJSContext::BeforeProcessTask(bool) (this=0x7f069f8e7660 <_IO_2_1_stderr
>, aMightBlock=<optimized out>)
at /home/achronop/myrepos/mozilla/firefox/xpcom/base/CycleCollectedJSContext.cpp:422
#9 0x00007f068c5a7626 in nsThread::ProcessNextEvent(bool, bool*) (this=0x7f069f12b6d0, aMayWait=true, aResult=0x7fff56a845b7) at /home/achronop/myrepos/mozilla/firefox/xpcom/threads/nsThread.cpp:1087
#10 0x00007f068c5a9c68 in NS_ProcessNextEvent(nsIThread*, bool) (aThread=0x7f069f8e7660 <_IO_2_1_stderr
>, aMayWait=true) at /home/achronop/myrepos/mozilla/firefox/xpcom/threads/nsThreadUtils.cpp:486
#11 0x00007f068c5abc16 in mozilla::SpinEventLoopUntil<(mozilla::ProcessFailureBehavior)1, nsThreadPool::Dispatch(already_AddRefed<nsIRunnable>, unsigned int)::$_6>(nsThreadPool::Dispatch(already_AddRefed<nsIRunnable>, unsigned int)::$6&&, nsIThread*) (aPredicate=..., aThread=<optimized out>) at /home/achronop/myrepos/mozilla/firefox/obj-firefox-debug-opt/dist/include/nsThreadUtils.h:348
#12 0x00007f068c5abc16 in nsThreadPool::Dispatch(already_AddRefed<nsIRunnable>, unsigned int) (this=<optimized out>, aEvent=..., aFlags=<optimized out>)
at /home/achronop/myrepos/mozilla/firefox/xpcom/threads/nsThreadPool.cpp:285
#13 0x00007f068c599573 in mozilla::SharedThreadPool::Dispatch(already_AddRefed<nsIRunnable>, unsigned int) (this=<optimized out>, event=..., flags=1)
at /home/achronop/myrepos/mozilla/firefox/obj-firefox-debug-opt/dist/include/mozilla/SharedThreadPool.h:71
#14 0x00007f068c552b10 in nsIEventTarget::Dispatch(nsIRunnable*, unsigned int) (this=0x7f067f460cc0, aEvent=<optimized out>, aFlags=0)
at /home/achronop/myrepos/mozilla/firefox/obj-firefox-debug-opt/dist/include/nsIEventTarget.h:37
#15 0x00007f068e85f22a in mozilla::AudioCallbackDriver::Revive() (this=0x7f06818c5090) at /home/achronop/myrepos/mozilla/firefox/dom/media/GraphDriver.cpp:721
#16 0x00007f068e93a45a in mozilla::MediaStreamGraphImpl::RunInStableState(bool) (this=0x7f066beaa000, aSourceIsMSG=false) at /home/achronop/myrepos/mozilla/firefox/dom/media/MediaStreamGraph.cpp:1703
#17 0x00007f068e944a22 in mozilla::(anonymous namespace)::MediaStreamGraphStableStateRunnable::Run() (this=0x7f066cc4e600) at /home/achronop/myrepos/mozilla/firefox/dom/media/MediaStreamGraph.cpp:1586
#18 0x00007f068c4d9540 in mozilla::CycleCollectedJSContext::ProcessStableStateQueue() (this=0x7f0681819000) at /home/achronop/myrepos/mozilla/firefox/xpcom/base/CycleCollectedJSContext.cpp:382
#19 0x00007f068c4da357 in mozilla::CycleCollectedJSContext::AfterProcessTask(unsigned int) (this=0x7f0681819000, aRecursionDepth=<optimized out>)
at /home/achronop/myrepos/mozilla/firefox/xpcom/base/CycleCollectedJSContext.cpp:441
#20 0x00007f068d01e49c in XPCJSContext::AfterProcessTask(unsigned int) (this=0x7f0681819000, aNewRecursionDepth=1) at /home/achronop/myrepos/mozilla/firefox/js/xpconnect/src/XPCJSContext.cpp:1273
#21 0x00007f068c5a814d in nsThread::ProcessNextEvent(bool, bool*) (this=0x7f069f12b6d0, aMayWait=<optimized out>, aResult=0x7fff56a84f47)
at /home/achronop/myrepos/mozilla/firefox/xpcom/threads/nsThread.cpp:1240
#22 0x00007f068c5a9c68 in NS_ProcessNextEvent(nsIThread*, bool) (aThread=0x7f069f8e7660 <_IO_2_1_stderr
>, aMayWait=true) at /home/achronop/myrepos/mozilla/firefox/xpcom/threads/nsThreadUtils.cpp:486
#23 0x00007f068cb9f679 in mozilla::ipc::MessagePump::Run(base::MessagePump::Delegate*) (this=0x7f069f19d420, aDelegate=0x7fff56a85180) at /home/achronop/myrepos/mozilla/firefox/ipc/glue/MessagePump.cpp:110
#24 0x00007f068cb327b7 in MessageLoop::RunInternal() (this=0x7fff56a85180) at /home/achronop/myrepos/mozilla/firefox/ipc/chromium/src/base/message_loop.cc:315
#25 0x00007f068cb32710 in MessageLoop::RunHandler() (this=0x7fff56a85180) at /home/achronop/myrepos/mozilla/firefox/ipc/chromium/src/base/message_loop.cc:308
#26 0x00007f068cb32710 in MessageLoop::Run() (this=0x7fff56a85180) at /home/achronop/myrepos/mozilla/firefox/ipc/chromium/src/base/message_loop.cc:290
#27 0x00007f068f104f83 in nsBaseAppShell::Run() (this=0x7f06818c60b0) at /home/achronop/myrepos/mozilla/firefox/widget/nsBaseAppShell.cpp:137
#28 0x00007f069037c213 in XRE_RunAppShell() () at /home/achronop/myrepos/mozilla/firefox/toolkit/xre/nsEmbedFunctions.cpp:919
#29 0x00007f068cb9fc96 in mozilla::ipc::MessagePumpForChildProcess::Run(base::MessagePump::Delegate*) (this=0x7f069f19d420, aDelegate=0x7fff56a85180)
at /home/achronop/myrepos/mozilla/firefox/ipc/glue/MessagePump.cpp:238
#30 0x00007f068cb327b7 in MessageLoop::RunInternal() (this=0x7fff56a85180) at /home/achronop/myrepos/mozilla/firefox/ipc/chromium/src/base/message_loop.cc:315
#31 0x00007f068cb32710 in MessageLoop::RunHandler() (this=0x7fff56a85180) at /home/achronop/myrepos/mozilla/firefox/ipc/chromium/src/base/message_loop.cc:308
#32 0x00007f068cb32710 in MessageLoop::Run() (this=0x7fff56a85180) at /home/achronop/myrepos/mozilla/firefox/ipc/chromium/src/base/message_loop.cc:290
#33 0x00007f069037bbf8 in XRE_InitChildProcess(int, char**, XREChildData const*) (aArgc=<optimized out>, aArgv=<optimized out>, aChildData=<optimized out>)
at /home/achronop/myrepos/mozilla/firefox/toolkit/xre/nsEmbedFunctions.cpp:757
#34 0x000055ab1c4c8a97 in content_process_main(mozilla::Bootstrap*, int, char**) (bootstrap=0x7f069f1296c0, argc=<optimized out>, argv=0x7fff56a865a8)
at /home/achronop/myrepos/mozilla/firefox/browser/app/../../ipc/contentproc/plugin-container.cpp:56
#35 0x000055ab1c4c8cf1 in main(int, char**, char**) (argc=16, argv=0x7fff56a865a8, envp=<optimized out>) at /home/achronop/myrepos/mozilla/firefox/browser/app/nsBrowserApp.cpp:263

It's spinning the event loop in stable state. That is not allowed. How does a nsThreadPool::Dispatch end up spinning the event loop? There's not a sync dispatch in this code is it? That's the only path I can see.

Flags: needinfo?(achronop)

The patch I've applied to overcome the Revive problem (comment 3) does a sync dispatch. I am creating a new patch to change that.

Flags: needinfo?(achronop)

The SyncRunnable version works without a crash. However, that was the Revive issue. The question remains on Paul, if the problem is in the assert, or we are in a wrong state. By testing it manually I don't see a problem, given that the assert is fixed.

AsyncCubebTask dispatch method provided an option for sync dispatch. This type of dispatch was allowing the spawing of a loop event. This patch replaces the sync dispatch with SyncRunnable. In addition to that the Revive method of AudioCallbackDriver was wrong because it was re-initializing an already initialized driver and did not stop the drained stream. The new sync method is used to fix the above.

Make AsyncCubebTask accept a series of operations and execute them one after the other in the same thread. This is what is really needed in the Revive method and not a sync dispatch even though sync dispatch works well.

To summarize the discussion I've had with padenot and achronop over the last few days:

We have three options to take here:

  1. Refactor MediaStreamGraph to remove the need for GraphDriver::Revive, and remove GraphDriver::Revive completely
  2. Fix GraphDriver::Revive so it doesn't crash, and so that it works as intended. This will need test coverage since we cannot trigger a Revive accurately from JS.
  3. Fix SpeechRecognition so it doesn't cause a Revive. This will ignore the Revive problems and leave us vulnerable as other things can cause a Revive (like well-timed js).

My thoughts on these are:

  1. Most desired long-term fix as it simplifies things. Has some unknown implications since we need to guarantee there are no races. Might not be such a quick fix.
  2. Work already done by Alex here, just needs test coverage so we don't regress Revive again. Shouldn't be too hard following the pattern for cubeb mocking in [1].
  3. Just pushes the problem to somewhere else. We might want to do this in addition to 2) though, to avoid the number of Revives.

I'd say we should do 2), including a gtest, as it is the proper fix with the least time-risk. Alex has volunteered on IRC to write that test, so I'm assigning him this bug. If someone disagrees, let me know.

[1] https://searchfox.org/mozilla-central/rev/0376cbf447efa16922c550da3bfd783b916e35d3/dom/media/gtest/TestAudioDeviceEnumerator.cpp

Assignee: nobody → achronop
Attachment #9056842 - Attachment description: Bug 1541290 - Replace sync dispatch with sync runnable to avoid spawning the event loop. → Bug 1541290 - Restart driver properly on revive method.
Attachment #9056844 - Attachment is obsolete: true

CloseAudioInut method posts a message, to the graph thread, in order to close the input asynchonously. When CloseAudioInput method was being executed from a thread other than the main thread, a runnable would be posted to main thread in order to post the async message from there. That was a risky path because when the graph was shutting down there were no guarantee that the close-input message would reach the graph thread before destroy takes place. By limiting CloseAudioInput to main thread it is ensured that the close-input message will be executed before destroy.

Blocks: 1545079
See Also: → 1543619
Pushed by achronopoulos@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/510178f0dc32
Restart driver properly on revive method. r=padenot
https://hg.mozilla.org/integration/autoland/rev/f75aabcf81d0
Close audio input from main thread to ensure that the message is not executed on a closed graph. r=pehrsons
Status: NEW → RESOLVED
Closed: 5 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla68

Clearing NI.

Flags: needinfo?(padenot)
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: