Closed Bug 1317234 Opened 8 years ago Closed 8 years ago

audiounit_stream_init() sometimes gets stuck forever on OSX 10.10

Categories

(Core :: Audio/Video: cubeb, defect, P1)

Tracking

RESOLVED FIXED
mozilla53
Tracking Status
firefox50 --- unaffected
firefox51 --- unaffected
firefox52 --- fixed
firefox53 --- fixed

People

(Reporter: jwwang, Assigned: achronop)

https://hg.mozilla.org/try/rev/c3464b8468e77479d2cdb271f5f51e9cafdaa920 MediaDecoderStateMachine::DumpDebugInfo() is called to dump debugging info when a mochitest times out. In this try push I make an assertion fail if the printing task does not finish within 2s, so that the stack trace gets dumped. https://treeherder.mozilla.org/logviewer.html#?job_id=31001231&repo=try#L11521 https://archive.mozilla.org/pub/firefox/try-builds/jwwang@mozilla.com-7ce53692ec7b0e1a136b4b652cdf7989cc010580/try-macosx64-debug/try_yosemite_r7-debug_test-mochitest-media-e10s-bm106-tests1-macosx-build1310.txt.gz The try logs show that the MDSM task queue thread (thread 49) is stuck in setup_audiounit_stream(), so the printing task is never executed. It looks like a deadlock, as thread 45 shows a similar stack.
Flags: needinfo?(kinetik)
https://treeherder.mozilla.org/#/jobs?repo=try&revision=7ce53692ec7b0e1a136b4b652cdf7989cc010580 Note that this is not easy to reproduce. I usually have to re-trigger at least 50 times to hit the issue.
Thread 50 is stuck in `AudioComponentInstanceNew`, and thread 49 is stuck in `AudioUnitInitialize`. Should we try to serialize those calls using a global mutex?
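A minimal sketch of the global-mutex idea suggested above, assuming a portable pthread mutex and a placeholder in place of the real CoreAudio calls. `g_init_lock`, `fake_unit_setup`, `stream_init_thread` and `run_parallel_inits` are hypothetical names for illustration and are not part of cubeb. Note that a later comment in this bug reports that a plain mutex in stream_init was not sufficient, because the second thread involved is controlled by the framework.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical process-wide lock; not part of cubeb. */
static pthread_mutex_t g_init_lock = PTHREAD_MUTEX_INITIALIZER;

static int g_inside = 0;     /* threads currently inside the guarded section */
static int g_max_inside = 0; /* highest value g_inside ever reached */
static int g_completed = 0;  /* number of finished setups */

/* Placeholder for the AudioComponentInstanceNew + AudioUnitInitialize
 * sequence. It only records how many threads are in here at once. */
static void fake_unit_setup(void)
{
  g_inside++;
  if (g_inside > g_max_inside) {
    g_max_inside = g_inside;
  }
  g_inside--;
  g_completed++;
}

/* Each stream init takes the global lock around the whole setup, so the
 * setup calls of different threads can never overlap. */
static void * stream_init_thread(void * arg)
{
  (void)arg;
  pthread_mutex_lock(&g_init_lock);
  fake_unit_setup();
  pthread_mutex_unlock(&g_init_lock);
  return NULL;
}

/* Runs nthreads inits in parallel and returns the largest number of threads
 * that were ever observed inside the guarded section. */
static int run_parallel_inits(int nthreads)
{
  pthread_t threads[64];
  assert(nthreads > 0 && nthreads <= 64);
  g_inside = 0;
  g_max_inside = 0;
  g_completed = 0;
  for (int i = 0; i < nthreads; i++) {
    int rc = pthread_create(&threads[i], NULL, stream_init_thread, NULL);
    assert(rc == 0);
    (void)rc;
  }
  for (int i = 0; i < nthreads; i++) {
    pthread_join(threads[i], NULL);
  }
  return g_max_inside;
}
```

Calling `run_parallel_inits(50)` mirrors the 50-thread stress test mentioned in this bug. The lock only serializes callers that go through `stream_init_thread`; it cannot order work that the framework schedules on its own threads.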
Threads 45, 49 and 50 are all blocked on the same mutex (CoreAudio!HALB_Mutex::Lock()), which is deep inside the CoreAudio framework. I have tested stream_init in parallel with 50 threads for another bug where we had a similar issue, and it works. This looks to me like a CoreAudio issue. I will try to find the other bug to compare the stacks.
See Also: → 1291745
The stack is pretty close. Again, AudioComponentInstanceNew is involved. In that case the hang occurred after a long sleep and we thought it was related to sleep/wake-up, but it seems it is not. The bug is linked in the "See Also" field.
I wonder if we're hitting something like https://chromium.googlesource.com/chromium/src/+/8412641875bff466b062126ac1382fba7b0f799e, although from a quick look, we only ever query the buffer size and never actually change it.
Flags: needinfo?(kinetik)
Rank: 15
Priority: -- → P1
According to this thread, any Get/SetProperty call can hurt: https://lists.freedesktop.org/archives/gstreamer-bugs/2016-January/166862.html
After a bunch of try pushes I found the following. The deadlock happens between the stream_init thread and a second thread that attaches the system notification handlers. The latter is controlled only by the framework, so simply adding a mutex in stream_init does not solve the issue. In the try run mentioned in Comment 1, that second thread is thread 45. A second finding is that working with CFRunLoop has become deprecated and the advice is to use libdispatch instead. The good news is that dispatch queues look cleaner and more intuitive, so this might be an improvement. My plan is to find a way to send all critical operations synchronously to the audio dispatch queue in order to eliminate the possibility of a deadlock. Using dispatch_sync is a step in that direction. Any comments are welcome.
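To illustrate the plan of funneling every critical operation through one queue and waiting for it to finish, here is a rough sketch. It is not the cubeb patch and it does not use libdispatch: it is a pthread stand-in for a serial queue plus a `dispatch_sync`-style call, written that way so it also builds off macOS. `serial_queue`, `queue_start`, `queue_run_sync`, `queue_stop` and `increment_task` are hypothetical names.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

typedef void (*task_fn)(void * ctx);

/* Hypothetical one-slot serial queue: a single worker thread runs tasks
 * one at a time, in submission order. */
typedef struct {
  pthread_mutex_t lock;
  pthread_cond_t cond;
  pthread_t worker;
  task_fn fn;              /* pending or running task, NULL when the slot is free */
  void * ctx;
  unsigned long submitted; /* tasks handed to the queue so far */
  unsigned long completed; /* tasks the worker has finished */
  int stop;
} serial_queue;

static void * worker_main(void * arg)
{
  serial_queue * q = arg;
  pthread_mutex_lock(&q->lock);
  for (;;) {
    while (q->fn == NULL && !q->stop) {
      pthread_cond_wait(&q->cond, &q->lock);
    }
    if (q->fn == NULL) {
      break; /* stop requested and nothing pending */
    }
    task_fn fn = q->fn;
    void * ctx = q->ctx;
    pthread_mutex_unlock(&q->lock);
    fn(ctx); /* run the task without holding the queue lock */
    pthread_mutex_lock(&q->lock);
    q->fn = NULL;
    q->completed++;
    pthread_cond_broadcast(&q->cond);
  }
  pthread_mutex_unlock(&q->lock);
  return NULL;
}

static void queue_start(serial_queue * q)
{
  pthread_mutex_init(&q->lock, NULL);
  pthread_cond_init(&q->cond, NULL);
  q->fn = NULL;
  q->ctx = NULL;
  q->submitted = 0;
  q->completed = 0;
  q->stop = 0;
  int rc = pthread_create(&q->worker, NULL, worker_main, q);
  assert(rc == 0);
  (void)rc;
}

/* Submits one task and blocks until the worker has run it. */
static void queue_run_sync(serial_queue * q, task_fn fn, void * ctx)
{
  assert(fn != NULL);
  pthread_mutex_lock(&q->lock);
  while (q->fn != NULL) { /* wait for the slot to become free */
    pthread_cond_wait(&q->cond, &q->lock);
  }
  q->fn = fn;
  q->ctx = ctx;
  unsigned long ticket = ++q->submitted;
  pthread_cond_broadcast(&q->cond);
  while (q->completed < ticket) { /* wait for our own task to finish */
    pthread_cond_wait(&q->cond, &q->lock);
  }
  pthread_mutex_unlock(&q->lock);
}

static void queue_stop(serial_queue * q)
{
  pthread_mutex_lock(&q->lock);
  q->stop = 1;
  pthread_cond_broadcast(&q->cond);
  pthread_mutex_unlock(&q->lock);
  pthread_join(q->worker, NULL);
  pthread_mutex_destroy(&q->lock);
  pthread_cond_destroy(&q->cond);
}

/* Placeholder for a critical audio operation. */
static void increment_task(void * ctx)
{
  (*(int *)ctx)++;
}
```

A caller would run `queue_start(&q)` once, wrap each critical call as `queue_run_sync(&q, increment_task, &counter)`, and call `queue_stop(&q)` at shutdown. As with a synchronous dispatch onto the queue you are already running on, calling `queue_run_sync` from inside a task on the same queue would block forever, so tasks must not re-enter the queue.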
Assignee: nobody → achronop
Any updates here, Alex? This is not going to be fun to carry on 52 for the duration of the next ESR cycle.
Flags: needinfo?(achronop)
The patch is up and waiting for review in the cubeb repo: https://github.com/kinetiknz/cubeb/pull/208. It is taking some time because of Xmas leave.
Flags: needinfo?(achronop)
Depends on: 1331869
OrangeFactor is looking good since bug 1331869 landed.
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla53
IIUC, this was fixed on 52 by https://hg.mozilla.org/releases/mozilla-beta/rev/dc169ee114d4. Please correct me if that's mistaken.
Ryan, that's correct.