Closed Bug 1622931 Opened 1 year ago Closed 1 month ago

Crash in [@ trunc | std::sys_common::backtrace::__rust_begin_short_backtrace<T>]

Categories

(Core :: Audio/Video: cubeb, defect, P1)

75 Branch
x86
Windows
defect

Tracking

()

RESOLVED FIXED
Tracking Status
firefox-esr68 --- unaffected
firefox-esr78 --- disabled
firefox74 --- unaffected
firefox75 + disabled
firefox76 --- disabled
firefox77 + disabled
firefox78 --- disabled

People

(Reporter: philipp, Assigned: kinetik)

References

(Regression)

Details

(Keywords: crash, regression)

Crash Data

This bug is for crash report bp-980997a7-b889-4882-8f53-38b2d0200316.

Top 10 frames of crashing thread:

0 xul.dll RustMozCrash mozglue/static/rust/wrappers.cpp:16
1 xul.dll mozglue_static::panic_hook mozglue/static/rust/lib.rs:89
2 xul.dll core::ops::function::Fn::call<fn src/libcore/ops/function.rs:232
3 xul.dll std::panicking::rust_panic_with_hook src/libstd/panicking.rs:475
4 xul.dll std::panicking::begin_panic_handler src/libstd/panicking.rs:375
5 xul.dll core::panicking::panic_fmt src/libcore/panicking.rs:84
6 xul.dll trunc 
7 xul.dll std::sys_common::backtrace::__rust_begin_short_backtrace<closure-0,  src/libstd/sys_common/backtrace.rs:136
8 xul.dll core::ops::function::FnOnce::call_once<closure-0,  src/libcore/ops/function.rs:232
9 xul.dll alloc::boxed::{{impl}}::call_once< src/liballoc/boxed.rs:1022

This crash signature started appearing in Firefox 75, from users running 32-bit builds on Windows. Most of the crashing URLs are YouTube videos.

Based on the stack, I am unsure which component this bug belongs in.

Seems like an audioIPC crash.

Component: General → Audio/Video: cubeb
Priority: -- → P2

Could be a duplicate of bug 1621360, but the crash reason for this is different: "unexpected execution error".

Assignee: nobody → kinetik

The panic message "unexpected execution error" most likely comes from tokio, specifically current_thread::Runtime::block_on (which calls into_inner on the error returned from the blocked-on future and expects a value), which is called from AudioIPC's core::spawn_thread. AudioIPC is blocking on the receiver of a oneshot::channel used for signalling shutdown, so it might mean that something unexpected happened to the sender side of the channel. Not sure what would cause this yet, will continue investigating.
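To make the shutdown pattern described above concrete, here is a minimal std-only analogy. It is not AudioIPC's code and does not use tokio: `spawn_worker` is a hypothetical helper, and `std::sync::mpsc` stands in for the oneshot channel. It only shows that a thread blocked on a shutdown receiver sees an error if the sender goes away without signalling.

```rust
use std::sync::mpsc::{self, RecvError, Sender};
use std::thread::{self, JoinHandle};

/// Hypothetical helper (an assumption, not AudioIPC's spawn_thread):
/// spawns a worker that blocks until shutdown is signalled.
/// The worker returns Ok(()) on a clean signal and Err(RecvError)
/// if the sender was dropped without sending.
fn spawn_worker() -> (Sender<()>, JoinHandle<Result<(), RecvError>>) {
    let (shutdown_tx, shutdown_rx) = mpsc::channel::<()>();
    // recv() blocks the worker thread until a message arrives
    // or every sender has been dropped.
    let handle = thread::spawn(move || shutdown_rx.recv());
    (shutdown_tx, handle)
}
```

Sending `()` on the returned sender lets the worker exit with `Ok(())`. Dropping the sender instead makes the worker return an error, which is the kind of "sender side" problem speculated about above.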

Priority: P2 → P1
See Also: → 1621360
Regressed by: 1432303
Depends on: 1623793
Blocks: 1623798

Bug 1623793 should get rid of this for 75.

Severity: normal → critical

Bug 1623798 left 32-bit disabled due to this crash, so marking as disabled in 76.

Matthew, what's the plan for 77 here? AFAICT this is set to be enabled again in beta next week?

Flags: needinfo?(kinetik)

I'm making progress on the crashes, but no fix yet. I'll make 32-bit nightly-only in bug 1634658 until I have fixes.

Flags: needinfo?(kinetik)

Summarizing what I know so far:

  • there are two related crashes under this signature, one in the server's "AudioIPC Callback RPC" thread and one in the client's "AudioIPC Client RPC" thread
  • both crashes are the same expect("unexpected execution error") inside tokio under spawn_thread's call to Runtime::block_on, caused by an unknown internal error
    • current suspect is a failing GetQueuedCompletionStatusEx call in miow
  • "available page file" is < 42MB for every crash with this signature in the last month
    • most are < 10MB
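As a rough illustration of why an internal executor error surfaces as this particular panic, here is a std-only sketch. `BlockError` and `unwrap_block_error` are hypothetical stand-ins written for this example, not tokio's actual types. The assumption is that the error wrapper holds `Some(e)` when the future itself failed and `None` when the executor failed internally, so `expect` on the `None` case panics via `core::option::expect_failed`, matching frame 07 in the stack below.

```rust
use std::panic;

/// Hypothetical stand-in for an executor's block_on error (an assumption):
/// `inner` is Some(e) if the future failed, None on an internal error.
struct BlockError<E> {
    inner: Option<E>,
}

impl<E> BlockError<E> {
    fn into_inner(self) -> Option<E> {
        self.inner
    }
}

/// Mirrors the pattern under discussion: panics if there is no future error.
fn unwrap_block_error<E>(err: BlockError<E>) -> E {
    err.into_inner().expect("unexpected execution error")
}

/// Triggers the internal-error case and returns the panic message.
/// Note: this relies on unwinding. Firefox builds with panic=abort,
/// so there the same panic terminates the process instead.
fn panic_message_for_internal_error() -> Option<String> {
    let result = panic::catch_unwind(|| {
        unwrap_block_error(BlockError::<()> { inner: None })
    });
    match result {
        Ok(_) => None,
        Err(payload) => payload
            .downcast_ref::<&str>()
            .map(|s| s.to_string())
            .or_else(|| payload.downcast_ref::<String>().cloned()),
    }
}
```

When the wrapper carries a real future error, `unwrap_block_error` simply returns it. Only the "no inner error" case panics.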

For crash 4fe6d133-db30-477f-bd10-b3c8f0200430, the complete stack is:

 # ChildEBP RetAddr  
00 (Inline) -------- xul!MOZ_Crash+0xb [/builds/worker/workspace/obj-build/dist/include/mozilla/Assertions.h @ 332] 
01 17aff56c 59e28512 xul!RustMozCrash+0x11 [/builds/worker/checkouts/gecko/mozglue/static/rust/wrappers.cpp @ 16] 
02 17aff998 59e2848a xul!mozglue_static::panic_hook+0x82 [/builds/worker/checkouts/gecko/mozglue/static/rust/lib.rs @ 71] 
03 17aff9a0 59f11eaf xul!core::ops::function::Fn::call<fn(core::panic::PanicInfo*),(core::panic::PanicInfo*)>+0xa [/rustc/f3e1a954d2ead4e2fc197c7da7d71e6c61bad196\src\libcore\ops\function.rs @ 72] 
04 17affa50 59f16bcc xul!std::panicking::rust_panic_with_hook+0x14f [/rustc/f3e1a954d2ead4e2fc197c7da7d71e6c61bad196\/src\libstd\panicking.rs @ 475] 
05 17affa98 59c57e50 xul!std::panicking::begin_panic_handler+0x4c [/rustc/f3e1a954d2ead4e2fc197c7da7d71e6c61bad196\/src\libstd\panicking.rs @ 375] 
06 17affaac 59c5d4ed xul!core::panicking::panic_fmt+0x20 [/rustc/f3e1a954d2ead4e2fc197c7da7d71e6c61bad196\/src\libcore\panicking.rs @ 84] 
07 17affad8 59bd3a94 xul!core::option::expect_failed+0x4d [/rustc/f3e1a954d2ead4e2fc197c7da7d71e6c61bad196\/src\libcore\option.rs @ 1188] 
08 17affe3c 59bd1615 xul!audioipc::core::spawn_thread::{{closure}}<str*,closure-1,closure-2>+0x2474 [/builds/worker/checkouts/gecko/media/audioipc/audioipc/src/core.rs @ 69] 
09 17affe64 59bd159b xul!std::sys_common::backtrace::__rust_begin_short_backtrace<closure-0,()>+0x35 [/rustc/f3e1a954d2ead4e2fc197c7da7d71e6c61bad196\src\libstd\sys_common\backtrace.rs @ 137] 
0a (Inline) -------- xul!std::thread::{{impl}}::spawn_unchecked::{{closure}}::{{closure}}+0x30 [/rustc/f3e1a954d2ead4e2fc197c7da7d71e6c61bad196\src\libstd\thread\mod.rs @ 469] 
0b (Inline) -------- xul!std::panic::{{impl}}::call_once+0x30 [/rustc/f3e1a954d2ead4e2fc197c7da7d71e6c61bad196\src\libstd\panic.rs @ 318] 
0c (Inline) -------- xul!std::panicking::try::do_call+0x30 [/rustc/f3e1a954d2ead4e2fc197c7da7d71e6c61bad196\src\libstd\panicking.rs @ 292] 
0d (Inline) -------- xul!panic_abort::__rust_maybe_catch_panic+0x30 [/rustc/f3e1a954d2ead4e2fc197c7da7d71e6c61bad196\/src\libpanic_abort\lib.rs @ 28] 
0e (Inline) -------- xul!std::panicking::try+0x30 [/rustc/f3e1a954d2ead4e2fc197c7da7d71e6c61bad196\src\libstd\panicking.rs @ 270] 
0f (Inline) -------- xul!std::panic::catch_unwind+0x30 [/rustc/f3e1a954d2ead4e2fc197c7da7d71e6c61bad196\src\libstd\panic.rs @ 394] 
10 (Inline) -------- xul!std::thread::{{impl}}::spawn_unchecked::{{closure}}+0x50 [/rustc/f3e1a954d2ead4e2fc197c7da7d71e6c61bad196\src\libstd\thread\mod.rs @ 468] 
11 17affe94 59f1f04b xul!core::ops::function::FnOnce::call_once<closure-0,()>+0x5b [/rustc/f3e1a954d2ead4e2fc197c7da7d71e6c61bad196\src\libcore\ops\function.rs @ 232] 
12 17affefc 59f20d75 xul!alloc::boxed::{{impl}}::call_once<(),FnOnce<()>>+0x3b [/rustc/f3e1a954d2ead4e2fc197c7da7d71e6c61bad196\src\liballoc\boxed.rs @ 1022] 
13 (Inline) -------- xul!alloc::boxed::{{impl}}::call_once+0xe [/rustc/f3e1a954d2ead4e2fc197c7da7d71e6c61bad196\src\liballoc\boxed.rs @ 1022] 
14 (Inline) -------- xul!std::sys_common::thread::start_thread+0x53 [/rustc/f3e1a954d2ead4e2fc197c7da7d71e6c61bad196\/src\libstd\sys_common\thread.rs @ 13] 
15 17afff08 748d38f4 xul!std::sys::windows::thread::{{impl}}::new::thread_start+0x55 [/rustc/f3e1a954d2ead4e2fc197c7da7d71e6c61bad196\/src\libstd\sys\windows\thread.rs @ 51] 
16 17afff1c 772b5de3 kernel32!BaseThreadInitThunk+0x24
17 17afff64 772b5dae ntdll!__RtlUserThreadStart+0x2f
18 17afff74 00000000 ntdll!_RtlUserThreadStart+0x1b

Frame 8 is the frame of interest, but it's fairly difficult to make sense of in minidumps as it's an ~11kB compiled function with an 868-byte stack frame. As far as I can tell (trusting Ghidra's control flow analysis), expect_failed was preceded by a call to tokio_reactor::Reactor::turn. Internally, turn may call mio::Poll::poll, which eventually makes the assumed-to-be-failing GetQueuedCompletionStatusEx reachable via a call to miow::CompletionPort::get_many, e.g. (captured from an unrelated crash report):

Frame 	Module 	Signature 	Source
0 	ntdll.dll 	KiFastSystemCallRet 	
1 	ntdll.dll 	NtRemoveIoCompletionEx 	
2 	KERNELBASE.dll 	GetQueuedCompletionStatusEx 	
3 	xul.dll 	miow::iocp::CompletionPort::get_many(mut slice<miow::iocp::CompletionStatus>*, core::option::Option<core::time::Duration>) 	third_party/rust/miow-0.2.1/src/iocp.rs:146
4 	xul.dll 	mio::poll::Poll::poll1(mio::poll::Events*, core::option::Option<core::time::Duration>, bool) 	third_party/rust/mio/src/poll.rs:1139
5 	xul.dll 	tokio_reactor::Reactor::turn(core::option::Option<core::time::Duration>) 	third_party/rust/tokio-reactor/src/lib.rs:327
6 	xul.dll 	tokio::runtime::current_thread::runtime::Runtime::block_on<futures::sync::oneshot::Receiver<()>>(futures::sync::oneshot::Receiver<()>) 	third_party/rust/tokio-0.1.11/src/runtime/current_thread/runtime.rs:182
7 	xul.dll 	std::sys_common::backtrace::__rust_begin_short_backtrace<closure-0, ()>(audioipc::core::spawn_thread::closure-0) 	../f3e1a954d2ead4e2fc197c7da7d71e6c61bad196/src/libstd/sys_common/backtrace.rs:136
8 	xul.dll 	core::ops::function::FnOnce::call_once<closure-0, ()>(std::thread::{{impl}}::spawn_unchecked::closure-0*) 	../f3e1a954d2ead4e2fc197c7da7d71e6c61bad196/src/libcore/ops/function.rs:232
9 	xul.dll 	alloc::boxed::{{impl}}::call_once<(), FnOnce<()>>() 	../f3e1a954d2ead4e2fc197c7da7d71e6c61bad196/src/liballoc/boxed.rs:1022
10 	xul.dll 	std::sys::windows::thread::{{impl}}::new::thread_start() 	../f3e1a954d2ead4e2fc197c7da7d71e6c61bad196//src/libstd/sys/windows/thread.rs:51
11 	kernel32.dll 	BaseThreadInitThunk 	
12 	ntdll.dll 	__RtlUserThreadStart 	
13 	ntdll.dll 	_RtlUserThreadStart 	

Currently trying to reproduce locally in a VM with a low page file, and continuing analysis of dumps. Also looking into locally forking miow to add additional debugging around GetQueuedCompletionStatusEx.
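One possible shape for that extra debugging, as a sketch only: `annotate` is a hypothetical helper, not part of miow, and it assumes the failing call is surfaced as a `std::io::Result`. It attaches the call name and the raw OS error code to the error so that the failure reason is visible further up the stack.

```rust
use std::io;

/// Hypothetical helper (not a miow API): wraps a failed io::Result with
/// the name of the call and the raw OS error code, if one is available.
fn annotate<T>(label: &str, result: io::Result<T>) -> io::Result<T> {
    result.map_err(|e| {
        let code = e.raw_os_error();
        // Keep the original ErrorKind so existing callers that match
        // on it keep working. Only the message is extended.
        io::Error::new(
            e.kind(),
            format!("{} failed (os error {:?}): {}", label, code, e),
        )
    })
}
```

A fork could wrap the result of the completion port call in `annotate("GetQueuedCompletionStatusEx", ...)` before returning it. Successful results pass through unchanged.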

bug 1634658 will disable this for 77, and since this is now nightly-only I'll mark it fix-optional for 78.

These are current, open bugs with a Severity of critical. The Severity of these bugs is being changed to S2 to be consistent with the May 4 2020 Severity definitions.

Please let Release Management know if these bugs are still S2.

Severity: critical → S2

I believe this was resolved by bug 1644522 landing. Bug 1663553 re-enables the AudioIPC feature on 32-bit Windows for early beta - I'll continue to monitor the crash rate.

Status: NEW → RESOLVED
Closed: 1 month ago
Resolution: --- → FIXED