Crash in [@ `anonymous namespace'::wasapi_stream_render_loop]
Categories
(Core :: Audio/Video: cubeb, defect, P2)
Tracking
()
People
(Reporter: gsvelto, Assigned: kinetik)
References
Details
(Keywords: crash)
Crash Data
Attachments
(1 file)
This bug is for crash report bp-354eb52b-7449-48d6-a8fc-33ca60190922.
Top 5 frames of crashing thread:
0 xul.dll static unsigned int `anonymous namespace'::wasapi_stream_render_loop media/libcubeb/src/cubeb_wasapi.cpp:1189
1 ucrtbase.dll thread_start<unsigned int , 1>
2 kernel32.dll BaseThreadInitThunk
3 mozglue.dll static void patched_BaseThreadInitThunk mozglue/dllservices/WindowsDllBlocklist.cpp:562
4 ntdll.dll RtlUserThreadStart
This is not a new bug as I can find reports going back months though there's been an increase in September. While 64-bit crashes seem to be hitting the 0xffffffffffffffff
address most 32-bit ones have the memory poison pattern as the crashing address so this might be an use-after-free bug on those platforms.
Assignee | ||
Comment 1•5 years ago
|
||
The recent crashes all point to the emergency_bailout check before we enter WaitForMultipleObjects, assuming the crash stacks are reliable in this case. I can't immediately find a way for the render thread's on-stack copy of emergency_bailout to be anything but a valid pointer (or nullptr if new is fallible). I can't see any recent changes to cubeb_wasapi.cpp that'd explain a spike in Firefox 69, either.
Assignee | ||
Comment 2•5 years ago
|
||
If the WaitForSingleObject waiting on the render thread to exit in stop_and_join_render_thread returns anything other than the explicitly handled results WAIT_TIMEOUT or WAIT_FAILED, we assume it was successful (WAIT_OBJECT_0) and will delete the emergency_bailout pointer. The only other documented result for WFSO is WAIT_ABANDONED, which I don't think applies in this case, so I don't think this is the cause but I don't have any better ideas so far.
Assignee | ||
Comment 3•5 years ago
|
||
https://social.msdn.microsoft.com/Forums/vstudio/en-US/6455b373-6975-493c-9676-d4e8da1fc70d/how-to-check-if-a-thread-is-suspended?forum=vcgeneral#a0f6a57f-a748-4517-8a7d-5ad89ff7b336 suggests WFSO can return WAIT_ABANDONED for thread handle waits if the target thread is suspended. So, even if this doesn't explain the crash, we should probably fix the code to handle this case.
Reporter | ||
Comment 4•5 years ago
|
||
Since the access shows a memory poison pattern it's likely that the cubeb_stream
object has already been freed when wasapi_stream_render_loop()
is called. Without knowing the code in detail that smells like a race to me. Is it possible for stop_and_join_render_thread()
to fail so that the thread keeps running? That would explain what's happening here as the next call to wasapi_stream_render_loop()
would be running with a dead cubeb_stream
object.
Comment 5•5 years ago
|
||
Yes, this is why there is the emergency bailout boolean, but comment 2 is onto something maybe.
It seems like WAIT_ABANDONED
is only for Mutex, though, as Matthew says.
Comment 6•5 years ago
|
||
Assignee | ||
Comment 7•5 years ago
|
||
Comment 3 doesn't seem to be true, at least on Windows 10 (1903). I tried suspending the render thread programmatically (via SuspendThread(stm->thread)
) and externally (via Process Explorer) - WFSO returns WAIT_TIMEOUT after the specified 5s timeout. Same result for 0 and INFINITE timeouts.
Assignee | ||
Comment 8•5 years ago
|
||
(In reply to Paul Adenot (:padenot) from comment #6)
dmajor sat with me today and found that this probably wasn't the best counter measure: if
stm
was freed in between creating the thread and the thread actually starting, this would read a poisoning patter and thus not exiting immediately here.
This is the only issue I can find after spending more time staring at this.
In the binaries I looked at, wasapi_stream_render_loop uses rbx
to hold emergency_bailout
. In every crash report I looked at, the actual value of rbx
was either 0xe5e5e5e5e5e5e5e5 or 0xe5e5e5e5e5e50000. The only way for that to happen is as you describe above, i.e. we call wasapi_stream_start/wasapi_stream_destroy (hit the 5s WFSO timeout and trigger bailout) all before wasapi_stream_render_loop had a chance to copy stm->emergency_bailout
to a local variable.
It seems unusual that wasapi_stream_render_loop would be stalled for >5s, so I don't have an explanation for why that would've started happening more frequently in September.
Assignee | ||
Comment 9•5 years ago
|
||
Assignee | ||
Comment 10•5 years ago
|
||
This landed via the libcubeb resync in bug 1575883.
Updated•5 years ago
|
Description
•