Closed
Bug 1228788
Opened 9 years ago
Closed 9 years ago
mozGetUserMedia — FATAL ERROR: AsyncShutdown timeout (Media shutdown: blocking on media thread)
Categories
(Core :: WebRTC: Audio/Video, defect, P2)
Core
WebRTC: Audio/Video
Tracking
()
RESOLVED
FIXED
mozilla45
backlog | webrtc/webaudio+ |
People
(Reporter: jruderman, Assigned: gcp)
References
Details
(5 keywords)
Attachments
(3 files)
1. Create a fresh profile with prefs.js:
user_pref("media.navigator.permission.disabled", true);
user_pref("browser.tabs.remote.autostart", false);
user_pref("browser.tabs.remote.autostart.1", false);
user_pref("browser.tabs.remote.autostart.2", false);
2. Load the testcase.
3. ⌘Q as soon as the orange recording icon appears in the address bar.
Result:
30% of the time, shutdown succeeds but everything leaks
70% of the time, shutdown stalls and eventually aborts with the following message:
FATAL ERROR: AsyncShutdown timeout in profile-before-change Conditions: [{"name":"Media shutdown: blocking on media thread","state":"(none)","filename":"dom/media/MediaManager.cpp","lineNumber":1600,"stack":"Media shutdown"}] At least one completion condition failed to complete within a reasonable amount of time. Causing a crash to ensure that we do not leave the user with an unresponsive process draining resources.
Reporter | ||
Comment 1•9 years ago
|
||
The C++ stack mostly says it's calling JS, and the JS stack mostly says it's in onTimeout (AsyncShutdown.jsm:948).
Comment 2•9 years ago
|
||
I assume the component is wrong here?
Comment 3•9 years ago
|
||
Seems like it.
Assignee: mozilla → nobody
Component: AutoConfig (Mission Control Desktop) → WebRTC: Audio/Video
Comment 4•9 years ago
|
||
This may well be an issue with the Camera IPC channel getting nuked before it had a response. We need mediamanager:4,camerasparent:4,cameraschild:4 logs.
backlog: --- → webrtc/webaudio+
Rank: 25
Priority: -- → P2
Comment 5•9 years ago
|
||
I've reproduced it. The content process is hung with main-thread waiting for ShutdownTask to come back from the MediaManager thread, but the MediaManager thread is in a MediaOperationTask stuck in DispatchToParent waiting for a reply that never comes:
MediaManager (34)
> #0 0x00007fff9406ef5e in __psynch_cvwait ()
> #1 0x00007fff9b0b673d in _pthread_cond_wait ()
> #2 0x00000001003f506e in PR_WaitCondVar at /Users/Jan/moz/mozilla-central/nsprpub/pr/src/pthreads/ptsynch.c:396
> #3 0x0000000100824107 in mozilla::CondVar::Wait(unsigned int) at /Users/Jan/moz/mozilla-central/xpcom/glue/BlockingResourceBase.cpp:501
> #4 0x000000010077ec72 in mozilla::Monitor::Wait(unsigned int) at /Users/Jan/moz/mozilla-central/obj-x86_64-apple-darwin12.2.1-debug/xpcom/io/../../dist/include/mozilla/Monitor.h:40
> #5 0x000000010077ebde in mozilla::MonitorAutoLock::Wait(unsigned int) at /Users/Jan/moz/mozilla-central/obj-x86_64-apple-darwin12.2.1-debug/xpcom/io/../../dist/include/mozilla/Monitor.h:88
> #6 0x0000000103e61732 in mozilla::camera::CamerasChild::DispatchToParent(nsIRunnable*, mozilla::MonitorAutoLock&) at /Users/Jan/moz/mozilla-central/dom/media/systemservices/CamerasChild.cpp:225
bool
CamerasChild::DispatchToParent(nsIRunnable* aRunnable,
MonitorAutoLock& aMonitor)
{
{
OffTheBooksMutexAutoLock lock(CamerasSingleton::Mutex());
CamerasSingleton::Thread()->Dispatch(aRunnable, NS_DISPATCH_NORMAL);
}
// We can't see if the send worked, so we need to be able to bail
// out on shutdown (when it failed and we won't get a reply).
if (!mIPCIsAlive) {
return false;
}
// Guard against spurious wakeups.
mReceivedReply = false;
// Wait for a reply
do {
-> aMonitor.Wait();
} while (!mReceivedReply && mIPCIsAlive);
if (!mReplySuccess) {
return false;
}
return true;
}
> #7 0x0000000103e62714 in mozilla::camera::CamerasChild::ReleaseCaptureDevice(mozilla::camera::CaptureEngine, int) at /Users/Jan/moz/mozilla-central/dom/media/systemservices/CamerasChild.cpp:476
MonitorAutoLock monitor(mReplyMonitor);
->if (!DispatchToParent(runnable, monitor)) {
return -1;
}
> #8 0x0000000103e62621 in mozilla::camera::ReleaseCaptureDevice(mozilla::camera::CaptureEngine, int) at /Users/Jan/moz/mozilla-central/dom/media/systemservices/CamerasChild.cpp:459
> #9 0x0000000103f008ac in mozilla::MediaEngineRemoteVideoSource::Deallocate() at /Users/Jan/moz/mozilla-central/dom/media/webrtc/MediaEngineRemoteVideoSource.cpp:148
> #10 0x0000000103d08843 in mozilla::MediaOperationTask::Run() at /Users/Jan/moz/mozilla-central/dom/media/MediaManager.cpp:331
case MEDIA_STOP:
case MEDIA_STOP_TRACK:
{
NS_ASSERTION(!NS_IsMainThread(), "Never call on main thread");
if (mAudioDevice) {
mAudioDevice->GetSource()->Stop(source, kAudioTrack);
mAudioDevice->GetSource()->Deallocate();
}
if (mVideoDevice) {
mVideoDevice->GetSource()->Stop(source, kVideoTrack);
-> mVideoDevice->GetSource()->Deallocate();
}
> #11 0x0000000100e60fa0 in MessageLoop::RunTask(Task*) at /Users/Jan/moz/mozilla-central/ipc/chromium/src/base/message_loop.cc:364
> #12 0x0000000100e6151f in MessageLoop::DeferOrRunPendingTask(MessageLoop::PendingTask const&) at /Users/Jan/moz/mozilla-central/ipc/chromium/src/base/message_loop.cc:372
> #13 0x0000000100e61744 in MessageLoop::DoWork() at /Users/Jan/moz/mozilla-central/ipc/chromium/src/base/message_loop.cc:459
> #14 0x0000000100f37bae in mozilla::ipc::DoWorkRunnable::Run() at /Users/Jan/moz/mozilla-central/ipc/glue/MessagePump.cpp:220
> #15 0x00000001007c9acd in nsThread::ProcessNextEvent(bool, bool*) at /Users/Jan/moz/mozilla-central/xpcom/threads/nsThread.cpp:964
> #16 0x00000001008555a7 in NS_ProcessNextEvent(nsIThread*, bool) at /Users/Jan/moz/mozilla-central/xpcom/glue/nsThreadUtils.cpp:297
> #17 0x0000000100f38450 in mozilla::ipc::MessagePumpForNonMainThreads::Run(base::MessagePump::Delegate*) at /Users/Jan/moz/mozilla-central/ipc/glue/MessagePump.cpp:355
> #18 0x0000000100e60e85 in MessageLoop::RunInternal() at /Users/Jan/moz/mozilla-central/ipc/chromium/src/base/message_loop.cc:234
> #19 0x0000000100e60d95 in MessageLoop::RunHandler() at /Users/Jan/moz/mozilla-central/ipc/chromium/src/base/message_loop.cc:227
> #20 0x0000000100e60d3d in MessageLoop::Run() at /Users/Jan/moz/mozilla-central/ipc/chromium/src/base/message_loop.cc:201
> #21 0x0000000100e862f9 in base::Thread::ThreadMain() at /Users/Jan/moz/mozilla-central/ipc/chromium/src/base/thread.cc:172
> #22 0x0000000100e8753c in ThreadFunc(void*) at /Users/Jan/moz/mozilla-central/ipc/chromium/src/base/platform_thread_posix.cc:39
> #23 0x00007fff9b0b59b1 in _pthread_body ()
> #24 0x00007fff9b0b592e in _pthread_start ()
> #25 0x00007fff9b0b3385 in thread_start ()
Oddly, mIPCIsAlive appears to be true, but it doesn't matter since the CamerasChild::mReplyMonitor never lets up.
Updated•9 years ago
|
Flags: needinfo?(gpascutto)
Assignee | ||
Comment 6•9 years ago
|
||
I can reproduce this. Oddly, from a quick read of the CamerasChild/CamerasParent logs it doesn't appear to be an actual Shutdown problem, as there's no sign of any Shutdown sequence. Investigating further.
Flags: needinfo?(gpascutto)
Assignee | ||
Updated•9 years ago
|
Assignee: nobody → gpascutto
Assignee | ||
Comment 7•9 years ago
|
||
Looking at the logic, the question whether to try to reply to the child or not is predicated on IsShuttingDown(), which (among others) checks mWebRTCIsAlive. But mWebRTCIsAlive says something about whether the WebRTC stack is alive, not the IPC channels. I'm adding logging to verify but I suspect the aforementioned might be the issue here.
Assignee | ||
Comment 8•9 years ago
|
||
Nope, it's not. The Mac capture code is plain hanging in its destructor.
Thread 48:: VideoCapture
0 libsystem_kernel.dylib 0x00007fff8b4c5f5e __psynch_cvwait + 10
1 libsystem_pthread.dylib 0x00007fff970af73d _pthread_cond_wait + 767
2 com.apple.Foundation 0x00007fff971ad048 -[NSCondition wait] + 240
3 com.apple.Foundation 0x00007fff971ac01a -[NSObject(NSThreadPerformAdditions) performSelector:onThread:withObject:waitUntilDone:modes:] + 935
4 com.apple.Foundation 0x00007fff97261abc -[NSObject(NSThreadPerformAdditions) performSelectorOnMainThread:withObject:waitUntilDone:modes:] + 87
5 com.apple.QTKit 0x00007fff8a8483aa -[QTCaptureSession _rebuildGraph] + 45
6 com.apple.Foundation 0x00007fff971be0cf NSKeyValueNotifyObserver + 379
7 com.apple.Foundation 0x00007fff971bd93f NSKeyValueDidChange + 457
8 com.apple.Foundation 0x00007fff9718a17c -[NSObject(NSKeyValueObserverNotification) didChangeValueForKey:] + 118
9 com.apple.Foundation 0x00007fff971be0cf NSKeyValueNotifyObserver + 379
10 com.apple.Foundation 0x00007fff971bd93f NSKeyValueDidChange + 457
11 com.apple.Foundation 0x00007fff9718a17c -[NSObject(NSKeyValueObserverNotification) didChangeValueForKey:] + 118
12 XUL 0x0000000103ee6adc -[VideoCaptureMacQTKitObjC dealloc] + 124 (video_capture_qtkit_objc.mm:43)
13 XUL 0x0000000103ee5bad webrtc::videocapturemodule::VideoCaptureMacQTKit::~VideoCaptureMacQTKit() + 141 (video_capture_qtkit.mm:68)
14 XUL 0x0000000103ee781e webrtc::RefCountImpl<webrtc::videocapturemodule::VideoCaptureMacQTKit>::~RefCountImpl() + 14 (ref_count.h:44)
15 XUL 0x0000000103ee787c webrtc::RefCountImpl<webrtc::videocapturemodule::VideoCaptureMacQTKit>::Release() + 44 (ref_count.h:73)
16 XUL 0x0000000103e48134 webrtc::ViECapturer::~ViECapturer() + 180 (vie_capturer.cc:112)
17 XUL 0x0000000103e4849e webrtc::ViECapturer::~ViECapturer() + 14 (vie_capturer.cc:100)
18 XUL 0x0000000103e66bab webrtc::ViEInputManager::DestroyCaptureDevice(int) + 123 (vie_input_manager.cc:252)
19 XUL 0x0000000103e467ad webrtc::ViECaptureImpl::ReleaseCaptureDevice(int) + 189 (vie_capture_impl.cc:134)
20 XUL 0x00000001032bbb25 mozilla::camera::CamerasParent::RecvReleaseCaptureDevice(int const&, int const&)::$_16::operator()() const + 37 (CamerasParent.cpp:703)
21 XUL 0x00000001032bba9d mozilla::media::LambdaRunnable<mozilla::camera::CamerasParent::RecvReleaseCaptureDevice(int const&, int const&)::$_16>::Run() +
I think this is a bug I ran into during development, but for which we removed the workaround in bug 1209987 because the issue no longer reproduced for me after the threading changes there.
Assignee | ||
Comment 9•9 years ago
|
||
The webrtc update does break the old workaround of doing that free on the main thread.
# Fatal error in /Users/morbo/hg/mozilla-central/media/webrtc/trunk/webrtc/system_wrappers/source/thread_posix.cc, line 107
# Check failed: thread_checker_.CalledOnValidThread()
Assignee | ||
Comment 10•9 years ago
|
||
Looks like we hit a similar issue before in bug 837539.
Comment 11•9 years ago
|
||
(In reply to Gian-Carlo Pascutto [:gcp] from comment #6)
> Oddly, from a quick read of the CamerasChild/CamerasParent logs it doesn't appear to be an actual Shutdown
> problem, as there's no sign of any Shutdown sequence.
Well, I closed the last non-about: content-tab, which as I understand it (just from empirical observation really), is effectively a "shutdown" in the content process, but not in the parent process?
Assignee | ||
Comment 12•9 years ago
|
||
Sure, but I mean that the bug isn't in the code trying to tear down IPC channels or with handling the state where one of the ends has already disappeared. It's much higher up, when we try to free up the device. It's locking up before MediaManager::mBackend::Shutdown() gets called.
Reporter | ||
Comment 13•9 years ago
|
||
FWIW, I originally reported this without e10s, and you might find it easier to debug in that configuration.
Assignee | ||
Comment 14•9 years ago
|
||
This seems to work for me. I tried variations of the fix for bug 837539 until something worked. Please verify if it fixes the hang for you.
Attachment #8694301 -
Flags: review?(rjesup)
Attachment #8694301 -
Flags: feedback?(jruderman)
Comment 15•9 years ago
|
||
Comment on attachment 8694301 [details] [diff] [review]
Force QT device release to happen on the main thread
Review of attachment 8694301 [details] [diff] [review]:
-----------------------------------------------------------------
Yeah, we've seen stuff like this before with Mac capture code...
Attachment #8694301 -
Flags: review?(rjesup) → review+
Assignee | ||
Comment 16•9 years ago
|
||
Assignee | ||
Comment 17•9 years ago
|
||
https://hg.mozilla.org/integration/mozilla-inbound/rev/cd3ee5f0e8f63480cd1493aaf6c2b452acb8be02
Bug 1228788 - Force QT device release to happen on the main thread. r=jesup
Comment 18•9 years ago
|
||
bugherder |
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla45
Comment 19•9 years ago
|
||
I'm still seeing this crash using the following steps:
1. Launch Firefox [1] with a clean profile.
2. Click on Hello button and launch a conversation.
3. Update Firefox.
[1]
* 44.0RC build 1 updated to the latest 44.0RC build 3
** bp-62f9324a-def4-437c-b4ad-6b9332160204
* 44.0RC build 3 updated to 45 beta 2 (on 'beta-cdntest' update channel)
** bp-e72ed27b-ec5c-4c6f-8b59-e1be62160204
This issue is also reproducible when applying updates from a Nightly build before this fix (E.g. 45.0a1 from 2015-12-01) to the latest 47.0a1 (bp-5a08939d-3e8d-4523-a69d-1ca5f2160204), but it *does not reproduce* with a Nightly build after this fix (E.g. 45.0a1 from 2015-12-05).
Affected platforms: Mac OS X 10.9.5 and Mac OS X 10.11.1
Terminal output when the crash occurs:
*************************
JavaScript error: resource://gre/modules/PerformanceStats.jsm, line 492: Error: forget() called twice
WARNING: At least one completion condition is taking too long to complete. Conditions: [{"name":"Media shutdown: blocking on media thread","state":"(none)","filename":"/builds/slave/m-cen-m64-ntly-000000000000000/build/src/dom/media/MediaManager.cpp","lineNumber":1600,"stack":"Media shutdown"}] Barrier: profile-before-change
JavaScript error: chrome://loop/content/modules/MozLoopPushHandler.jsm, line 73: NS_ERROR_NOT_AVAILABLE: Component returned failure code: 0x80040111 (NS_ERROR_NOT_AVAILABLE) [nsIWebSocketChannel.asyncOpen]
FATAL ERROR: AsyncShutdown timeout in profile-before-change Conditions: [{"name":"Media shutdown: blocking on media thread","state":"(none)","filename":"/builds/slave/m-cen-m64-ntly-000000000000000/build/src/dom/media/MediaManager.cpp","lineNumber":1600,"stack":"Media shutdown"}] At least one completion condition failed to complete within a reasonable amount of time. Causing a crash to ensure that we do not leave the user with an unresponsive process draining resources.
[Parent 3254] ###!!! ABORT: file /builds/slave/m-cen-m64-ntly-000000000000000/build/src/dom/media/MediaManager.cpp, line 1600
As far as I can see the above crashes are related with this bug report. Any ideas?
status-firefox44:
--- → affected
Flags: needinfo?(gpascutto)
Assignee | ||
Comment 20•9 years ago
|
||
1) The fix went in Firefox 45, not Firefox 44.
2) There's no reason why the fix here would depend on whether Nightly was updated or not.
3) There's 2 months of other changes since the fix was committed.
You might be seeing another bug, I might've not fixed it but just made it more intermittent, I have no idea.
Flags: needinfo?(gpascutto)
Comment 21•9 years ago
|
||
(In reply to Gian-Carlo Pascutto [:gcp] from comment #20)
> 1) The fix went in Firefox 45, not Firefox 44.
Yes, indeed; the fix is not in 44 and that's why I was pointing to the fact that some users might hit this crash on Mac OS X when applying updates from 44 to 45, under the mentioned circumstances.
> 2) There's no reason why the fix here would depend on whether Nightly was
> updated or not.
Are you referring to the fact that applying an update to nightly without the fix crashes and no crash with a nightly with the fix is not related with this report?
> 3) There's 2 months of other changes since the fix was committed.
> You might be seeing another bug, I might've not fixed it but just made it
> more intermittent, I have no idea.
Logged bug 1245863. Thanks for your input!
Assignee | ||
Comment 22•9 years ago
|
||
(In reply to Alexandra Lucinet, QA Mentor [:adalucinet] from comment #21)
> Are you referring to the fact that applying an update to nightly without the
> fix crashes and no crash with a nightly with the fix is not related with
> this report?
Wait, I think I may have completely misunderstood what you said wrt. updating.
Are you saying that:
a) Firefox versions before the fix hang when you do <some action related to shutdown like updating>.
b) Firefox versions after the fix do not hang.
If so, that's 100% the expected behavior. There was a shutdown hang and this bug fixed it.
Comment 24•9 years ago
|
||
Gian-Carlo -- When can this bug happen? Can it only happen in shutdown when the video camera is already running/capturing? If that's correct, it seems that the likelihood of a Release user hitting this bug is fairly low. (Just trying to get a feel for how Fx44 users are possibly impacted by this bug.) Thanks.
Flags: needinfo?(gpascutto)
Assignee | ||
Comment 25•9 years ago
|
||
(In reply to Maire Reavy [:mreavy] (Plz ni?:mreavy) from comment #24)
> Gian-Carlo -- When can this bug happen? Can it only happen in shutdown when
> the video camera is already running/capturing? If that's correct, it seems
> that the likelihood of a Release user hitting this bug is fairly low. (Just
> trying to get a feel for how Fx44 users are possibly impacted by this bug.)
> Thanks.
That's correct (and only on Mac OS X).
Flags: needinfo?(gpascutto)
You need to log in
before you can comment on or make changes to this bug.
Description
•