Shutdown deadlock involving cubeb
Categories
(Core :: Audio/Video: cubeb, defect)
People
(Reporter: glandium, Assigned: glandium)
Attachments
(2 files: text/plain, 148.43 KB; text/x-phabricator-request, 48 bytes, RyanVM: approval-mozilla-esr91+)
For some reason, when building instrumented builds for PGO with the upcoming rustc 1.56, this shutdown deadlock happens 100% of the time on automation... unless using an interactive task (which makes for fun debugging). I wasn't able to reproduce it locally.
Attached is the crash part of the log, taken once the minidump processing knew about system libraries (otherwise the stack traces are almost useless). The parts that I think are relevant:
Thread 0 tid 506
0 libpthread.so.0!__GI___pthread_timedjoin_ex [pthread_join_common.c : 89 + 0x25]
rax = 0xfffffffffffffe00 rdx = 0x0000000000000282
rcx = 0x00007feae89f1d2d rbx = 0x00007feab54fe700
rsi = 0x0000000000000000 rdi = 0x00007feab54fe9d0
rbp = 0x0000000000000000 rsp = 0x00007ffd9cfa0ff0
r8 = 0x00000000000000ca r9 = 0x00007feab54fe9d0
r10 = 0x0000000000000000 r11 = 0x0000000000000246
r12 = 0x0000000000000000 r13 = 0x00007ffd9cfa1000
r14 = 0x0000000000000000 r15 = 0x00007feae7840310
rip = 0x00007feae89f1d2d
Found by: given as instruction pointer in context
1 libxul.so!<audioipc::core::CoreThread as core::ops::drop::Drop>::drop [core.rs:58db01c6d5f8d7629d8601dd4e08ca2ded93868e : 34 + 0x33]
rbx = 0x0000000000000001 rbp = 0x00007ffd9cfa1130
rsp = 0x00007ffd9cfa1060 r12 = 0x00007fead1a66590
r13 = 0x00007fead1a66530 r14 = 0x00007ffd9cfa1078
r15 = 0x00007feae7840310 rip = 0x00007fead8bd7d73
Found by: call frame info
2 libxul.so!audioipc_server_stop [lib.rs:58db01c6d5f8d7629d8601dd4e08ca2ded93868e : 169 + 0xd]
rbx = 0x00007feab5769980 rbp = 0x00007ffd9cfa1160
rsp = 0x00007ffd9cfa1140 r12 = 0x00007fead1a66590
r13 = 0x00007fead1a66530 r14 = 0x00007fead5643d10
r15 = 0x00007feae7840310 rip = 0x00007fead8c0e662
Found by: call frame info
3 libxul.so!mozilla::CubebUtils::ShutdownLibrary() [CubebUtils.cpp:58db01c6d5f8d7629d8601dd4e08ca2ded93868e : 669 + 0x19]
rbx = 0x0000000000000000 rbp = 0x00007ffd9cfa1180
rsp = 0x00007ffd9cfa1170 r12 = 0x00007fead1a66590
r13 = 0x00007fead1a66530 r14 = 0x00007fead5643d10
r15 = 0x00007feae7840310 rip = 0x00007fead49e6766
Found by: call frame info
4 libxul.so!nsLayoutStatics::Shutdown() [nsLayoutStatics.cpp:58db01c6d5f8d7629d8601dd4e08ca2ded93868e : 362 + 0x5]
rbx = 0x00007feac6027870 rbp = 0x00007ffd9cfa11a0
rsp = 0x00007ffd9cfa1190 r12 = 0x00007fead1a66590
r13 = 0x00007fead1a66530 r14 = 0x00007fead5643d10
r15 = 0x00007feae7840310 rip = 0x00007fead60b017b
Found by: call frame info
and
Thread 39 tid 642
0 libpthread.so.0!__pthread_cond_wait [pthread_cond_wait.c : 655 + 0xfb]
rax = 0xfffffffffffffe00 rdx = 0x0000000000000000
rcx = 0x00007feae89f69f3 rbx = 0x00007feaa66e39a0
rsi = 0x0000000000000080 rdi = 0x00007feaa66e39c8
rbp = 0x00007feaa66e39c4 rsp = 0x00007feab54fcc90
r8 = 0x0000000000000000 r9 = 0x0000000000000000
r10 = 0x0000000000000000 r11 = 0x0000000000000246
r12 = 0x00007feaa66e39c8 r13 = 0x0000000000000000
r14 = 0x00007feaa663f9d0 r15 = 0x0000000000000014
rip = 0x00007feae89f69f3
Found by: given as instruction pointer in context
1 libpulse.so.0!pa_threaded_mainloop_wait [thread-mainloop.c : 215 + 0x5]
rbx = 0x00007feaa695bf40 rbp = 0x00007feab54fd260
rsp = 0x00007feab54fcd60 r12 = 0x00007feaa5ce24c0
r13 = 0x00007feaacefda00 r14 = 0x00007feadd394618
r15 = 0x00007feaa695bf40 rip = 0x00007feabe9e2a68
Found by: call frame info
2 libxul.so!<cubeb_pulse::backend::stream::PulseStream as cubeb_backend::traits::StreamOps>::stop [stream.rs:58db01c6d5f8d7629d8601dd4e08ca2ded93868e : 641 + 0x30]
rbx = 0x00007feabe9e2a40 rbp = 0x00007feab54fd260
rsp = 0x00007feab54fcd70 r12 = 0x00007feaa5ce24c0
r13 = 0x00007feaacefda00 r14 = 0x00007feadd394618
r15 = 0x00007feaa695bf40 rip = 0x00007fead8d44801
Found by: call frame info
3 libxul.so!cubeb_backend::capi::capi_stream_stop [capi.rs:58db01c6d5f8d7629d8601dd4e08ca2ded93868e : 186 + 0x5]
rbx = 0x00007feab570a220 rbp = 0x00007feab54fd270
rsp = 0x00007feab54fd270 r12 = 0x00007feab54fd6c0
r13 = 0x00007feab54fd740 r14 = 0x00007feab54fd528
r15 = 0x00007feab5759200 rip = 0x00007fead8d42851
Found by: call frame info
4 libxul.so!audioipc_server::server::CubebServer::process_msg [server.rs:58db01c6d5f8d7629d8601dd4e08ca2ded93868e : 506 + 0x60]
rbx = 0x00007feab570a220 rbp = 0x00007feab54fd510
rsp = 0x00007feab54fd280 r12 = 0x00007feab54fd6c0
r13 = 0x00007feab54fd740 r14 = 0x00007feab54fd528
r15 = 0x00007feab5759200 rip = 0x00007fead8c2b575
Found by: call frame info
5 libxul.so!<audioipc_server::server::CubebServer as audioipc::rpc::server::Server>::process [server.rs:58db01c6d5f8d7629d8601dd4e08ca2ded93868e : 373 + 0x159]
rbx = 0x00007feab570a208 rbp = 0x00007feab54fd5e0
rsp = 0x00007feab54fd520 r12 = 0x00007feab54fd6c0
r13 = 0x00007feab570a210 r14 = 0x00007feab54fd740
r15 = 0x00007feab5759200 rip = 0x00007fead8c282c9
Found by: call frame info
6 libxul.so!<audioipc::rpc::driver::Driver<T> as futures::future::Future>::poll [driver.rs:58db01c6d5f8d7629d8601dd4e08ca2ded93868e : 132 + 0x5bc]
rbx = 0x00007feab5759200 rbp = 0x00007feab54fd8c0
rsp = 0x00007feab54fd5f0 r12 = 0x00007feab54fd740
r13 = 0x00007feab57592f8 r14 = 0x00007feab54fd758
r15 = 0x000000000000000d rip = 0x00007fead8c1fb2a
Found by: call frame info
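The two stacks above suggest the shape of the hang: thread 0 is joining the audioipc core thread from `audioipc_server_stop`, while thread 39 sits in `pa_threaded_mainloop_wait` inside `PulseStream::stop`, waiting for a wakeup. Below is a minimal Rust sketch of that shape, not the cubeb or audioipc code. The names `Stream`, `notify_done` and `shutdown` are hypothetical, and a timeout is added purely so the example terminates; with an untimed wait, as in the traces, a wakeup that never arrives would block the join forever.

```rust
use std::sync::{Arc, Condvar, Mutex};
use std::thread;
use std::time::Duration;

/// Hypothetical stand-in for a stream whose `stop` blocks until a
/// callback reports completion.
pub struct Stream {
    done: Mutex<bool>,
    signal: Condvar,
}

impl Stream {
    pub fn new() -> Arc<Self> {
        Arc::new(Stream {
            done: Mutex::new(false),
            signal: Condvar::new(),
        })
    }

    /// Called by the (hypothetical) callback side when the stop completes.
    pub fn notify_done(&self) {
        *self.done.lock().unwrap() = true;
        self.signal.notify_all();
    }

    /// Waits for completion, giving up after `timeout`.
    /// Returns true if the completion was observed.
    pub fn stop(&self, timeout: Duration) -> bool {
        let guard = self.done.lock().unwrap();
        let (guard, _timed_out) = self
            .signal
            .wait_timeout_while(guard, timeout, |done| !*done)
            .unwrap();
        *guard
    }
}

/// Runs `stop` on a worker thread and joins it, as a shutdown path would.
/// The join can only return once the worker's wait has ended.
pub fn shutdown(stream: Arc<Stream>, timeout: Duration) -> bool {
    let worker = thread::spawn(move || stream.stop(timeout));
    worker.join().unwrap()
}
```

If `notify_done` is never called, `shutdown` returns `false` only because of the timeout; that missing notification is the situation in which the joining thread would otherwise stay stuck.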
Assignee
Comment 2•4 years ago
Assignee
Comment 3•4 years ago
Comment 5•4 years ago
bugherder
Comment 6•3 years ago
Mike,
I think you need to backport this one in Debian to fix https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=998108 .
Seems this already is in beta. Would it be worth uplifting to release? Other distros (openSUSE, Arch) shipping Rust 1.56/LLVM 13 are affected as well.
Assignee
Comment 7•3 years ago
I'll wait for confirmation that it fixes it.
Comment 8•3 years ago
(In reply to Mike Hommey [:glandium] from comment #7)
I'll wait for confirmation that it fixes it.
It seems such confirmation has been given in:
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=998108#212
Thanks :-)
Assignee
Comment 9•3 years ago
Matthew, can you pull https://github.com/glandium/cubeb-pulse-rs/tree/drain_timer-esr91 and push it as a new branch on https://github.com/mozilla/cubeb-pulse-rs ?
Comment 10•3 years ago
Assignee
Comment 11•3 years ago
Comment on attachment 9246239 [details]
Bug 1735905 - Upgrade cubeb-pulse to fix a race condition that can lead to shutdown deadlock.
ESR Uplift Approval Request
- If this is not a sec:{high,crit} bug, please state case for ESR consideration: Rust unsafe violation leading some versions of the Rust compiler to generate unexpected code that can deadlock (in more situations than originally stated; note: this does not affect the mozilla.org builds)
- User impact if declined: See above.
- Fix Landed on Version: 95
- Risk to taking this patch: Low
- Why is the change risky/not risky? (and alternatives if risky): Fix is rather straightforward and well understood, replacing a pointer with an atomic pointer.
- String or UUID changes made by this patch: N/A
Note: the revision in phabricator was updated with a backport for esr91 based on the branch from comment 10.
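To illustrate the kind of change described in the request above (replacing a pointer with an atomic pointer), here is a minimal Rust sketch. It is not the cubeb-pulse patch: `Timer`, `StreamState` and its methods are hypothetical names, and the only assumption is that a raw pointer field is written by one thread and read by another.

```rust
use std::ptr;
use std::sync::atomic::{AtomicPtr, Ordering};

/// Hypothetical timer handle owned by some event loop.
pub struct Timer {
    pub id: u32,
}

/// Hypothetical state shared between a control thread and a callback thread.
pub struct StreamState {
    // Before: a plain `*mut Timer` field written by one thread and read by
    // another without synchronization is a data race, which is undefined
    // behavior, so the compiler may generate code that misbehaves.
    // After: every access goes through an atomic operation.
    timer: AtomicPtr<Timer>,
}

impl StreamState {
    pub fn new() -> Self {
        StreamState {
            timer: AtomicPtr::new(ptr::null_mut()),
        }
    }

    /// Publishes a timer and returns the previously stored pointer
    /// (null if there was none).
    pub fn set_timer(&self, timer: *mut Timer) -> *mut Timer {
        self.timer.swap(timer, Ordering::SeqCst)
    }

    /// Clears the slot and hands the stored pointer back to the caller,
    /// so only one caller can ever receive a given non-null pointer.
    pub fn take_timer(&self) -> *mut Timer {
        self.timer.swap(ptr::null_mut(), Ordering::SeqCst)
    }

    /// Reports whether a timer is currently registered.
    pub fn has_timer(&self) -> bool {
        !self.timer.load(Ordering::SeqCst).is_null()
    }
}
```

`SeqCst` is used here only because it is the simplest ordering to reason about; whether a weaker ordering would suffice depends on the surrounding code.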
Comment 13•3 years ago
Comment on attachment 9246239 [details]
Bug 1735905 - Upgrade cubeb-pulse to fix a race condition that can lead to shutdown deadlock.
Approved for 91.4esr.
Comment 14•3 years ago
bugherder uplift
Comment 17•3 years ago
Is this fix going to be backported to Firefox 94.x? The release notes for 94.0.2 have not been published yet:
We’re still preparing the notes for this release, and will post them here when they are ready. Please check back later.
Comment 18•3 years ago
The 94.0.2 source archive is already available, but unfortunately does not contain the fix.
firefox-94.0.2$ md5sum third_party/rust/cubeb-pulse/src/backend/stream.rs
4335bd496eeb9a1e576cae5ba635197e third_party/rust/cubeb-pulse/src/backend/stream.rs
This is the same version as in 94.0.
What could have been done, to get the fix into 94.0.2?
Comment 19•3 years ago
[Tracking Requested - why for this release]:
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=998108#241 and https://bugzilla.opensuse.org/show_bug.cgi?id=1192067 backported the fix.
Wouldn't it be better to backport the fix upstream (here) to serve all downstream builds equally?
(Paul Menzel from comment #18)
What could have been done, to get the fix into 94.0.2?
Comment 20•3 years ago
Too late to get this into 94.0.2. Given that Fx95 goes to RC next week, it's pretty unlikely we'll ship another Fx94 in the mean time.
Comment 21•3 years ago
(In reply to Ryan VanderMeulen [:RyanVM] from comment #20)
Too late to get this into 94.0.2. Given that Fx95 goes to RC next week, it's pretty unlikely we'll ship another Fx94 in the mean time.
So that it works better next time, what should have been done to get this into 94.0.2?
Comment 22•3 years ago
Asking for it around the time of comment 11, when the ESR branch and uplift request were made but no equivalent request was made for release.