Crash in [@ IPCError-browser | ShutDownKill | NtAlpcSendWaitReceivePort]
Categories
(Core :: Widget: Win32, defect, P3)
Tracking
()
People
(Reporter: pascalc, Unassigned)
Details
(Keywords: crash)
Crash Data
This bug is for crash report bp-b3fc3bd6-a805-4f68-817c-3f7500200209.
Top 10 frames of crashing thread:
0 ntdll.dll NtAlpcSendWaitReceivePort
1 rpcrt4.dll long LRPC_BASE_CCALL::DoSendReceive
2 audioses.dll AEWMILOG_DROP
3 rpcrt4.dll void Ndr64UDTSimpleTypeMarshall1
4 rpcrt4.dll void Ndr64SupplementMarshall
5 rpcrt4.dll virtual long LRPC_CCALL::SendReceive
6 rpcrt4.dll virtual void* LRPC_CCALL::`scalar deleting destructor'
7 rpcrt4.dll I_RpcSendReceive
8 rpcrt4.dll NdrSendReceive
9 audioses.dll AEWMILOG_DROP
Comment 1•5 years ago
Nightly-only hung content processes at shutdown. The vast majority of the stacks include this frame:
The other crashes also seem to point to audio-related operations. Could they just be slow?
Comment 3•5 years ago
I did another pass over the crash reports and established a couple of things:
- The content processes aren't hung, they're just being slow. Most would shut down correctly if given enough time.
- Destroying the IAudioSessionControl object is slow, so we try to do it on a background thread
- However in these stacks we're actually releasing the object on the main thread, which means we failed spawning the other thread and got here
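The pattern described in the list above can be sketched roughly as follows. This is not the code in AudioSession.cpp; `SlowObject`, `ReleaseOffThread`, `Spawner` and `DefaultSpawn` are hypothetical names made up for illustration, and plain `std::thread` stands in for whatever thread primitive the real code uses. It only shows the general shape: hand the slow destruction to a background thread, and fall back to destroying on the calling thread if the thread cannot be created.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <functional>
#include <memory>
#include <system_error>
#include <thread>
#include <utility>

// Hypothetical stand-in for an object whose destruction is slow
// (the role IAudioSessionControl plays in this bug).
struct SlowObject {
  std::atomic<bool>* destroyed;
  explicit SlowObject(std::atomic<bool>* flag) : destroyed(flag) {}
  ~SlowObject() {
    // Simulate a slow teardown.
    std::this_thread::sleep_for(std::chrono::milliseconds(20));
    destroyed->store(true);
  }
};

enum class ReleasedOn { Background, CallingThread };

using Task = std::function<void()>;
// Injectable thread factory (an assumption of this sketch) so the
// failure path can be exercised without exhausting system resources.
using Spawner = std::function<std::thread(Task)>;

inline std::thread DefaultSpawn(Task task) {
  return std::thread(std::move(task));
}

// Takes ownership of obj and tries to destroy it on a background thread.
// If thread creation throws std::system_error, the object is destroyed on
// the calling thread instead, which is the slow path seen in these stacks.
// `worker` must not already hold a running thread; the caller joins it.
ReleasedOn ReleaseOffThread(std::unique_ptr<SlowObject> obj,
                            std::thread& worker,
                            const Spawner& spawn = DefaultSpawn) {
  SlowObject* raw = obj.release();
  try {
    worker = spawn([raw] { delete raw; });
    return ReleasedOn::Background;
  } catch (const std::system_error&) {
    delete raw;  // Fallback: blocks the caller for the whole teardown.
    return ReleasedOn::CallingThread;
  }
}
```

A caller would keep the `std::thread` around and join it before the process exits. In the fallback case the calling thread pays the full cost of the teardown, which would match a main-thread stack that is slow rather than hung.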
David, since you wrote this code, does my analysis seem correct to you? Do you have any ideas why thread creation might fail and we're stuck on the main thread?
Comment 4•5 years ago
:gsvelto, the behavior you are talking about was introduced in bug 1419488. It works the way you suggest, but it's limited to Windows 7 [1], which means it can't be responsible for most of what we are seeing here (nearly all crashes are Win 10).
From here it gets complicated. The crash with CDeviceEnumerator::UnregisterEndpointNotificationCallback -- the one that came in when bug 1614585 was duplicated to this one -- very much mirrors what we saw in bug 1419488. And that crash is currently 100% in Windows 10, going back 6 months. So there may be two things there: (1) it's not actually a dupe of this and (2) we should extend the code in [1] to work in Windows 10, not just Windows 7, as the hang now shows up there too.
That would still leave the crash in comment 0. The crash is in system code and doesn't seem to be giving any really useful data. It happens in all versions of Windows. Crash-stats says that many of them are actually startup crashes (I don't know how much to believe this). The crash is a core system-RPC-related operation and it looks like some of them have some audio stuff on the stack but most don't. The ones that look audio-related seem to be in shutdown/hang behavior (again, I'm not certain of this). So I took a look at the Windows 7 crashes under that signature [2]. I've only checked a dozen or so but none were clearly audio related and most seemed very much not (most, but not all, seem to be in graphics). From all that, I'm thinking that extending the Win 7 audio fix above to the rest of Windows may also fix the audio-related crashes under this signature, but not fix the crash in its entirety since it probably has many causes.
I'm going to un-dupe bug 1614585 and extend the win 7 fix to the rest of Windows there. I'll leave this bug open because I don't know how best to deal with it.
[1] https://searchfox.org/mozilla-central/rev/c1e3d3edd4a9b784971555dc74a5de23d768b2e1/widget/windows/AudioSession.cpp#281
[2] https://crash-stats.mozilla.org/signature/?platform_pretty_version=~7&signature=IPCError-browser%20%7C%20ShutDownKill%20%7C%20NtAlpcSendWaitReceivePort&date=%3E%3D2019-08-21T18%3A42%3A00.000Z&date=%3C2020-02-21T18%3A42%3A00.000Z&_columns=date&_columns=version&_columns=build_id&_columns=reason&_columns=address&_columns=install_time&_columns=startup_crash&_columns=platform_pretty_version&_sort=-date&page=1
Comment 5•5 years ago
FYI: There are actually Windows 7 crashes with the CDeviceEnumerator::UnregisterEndpointNotificationCallback crash in that list -- because they are from builds that predate the fix from bug 1419488, which went in in version 62.
Comment 6•5 years ago
Hello!
Per https://crash-stats.mozilla.org/signature/?product=Firefox&signature=IPCError-browser%20%7C%20ShutDownKill%20%7C%20NtAlpcSendWaitReceivePort versions 73 and 75 are also affected.
Thanks!
Alex
Comment 7•5 years ago
Thanks for the very detailed explanation David! I've opened a bunch of crash reports under the signature that was added in comment 0 and most of them have the IAudioSessionControl destruction on the stack... but not all of them. That's why I originally duped bug 1614585 against this one: they seemed the same, but apparently there are at least two different stacks under this signature.
I will remove the second signature so we keep the bugs separate. My guess is that once you've landed the fix for bug 1614585 the crash volume here will go down and we'll be left with only the non-audio stacks.
Comment 8•5 years ago
Bug 1617283 might be factoring into this.
Comment 9•5 years ago
Ah yes, that'd be nice! Seems like this issue is going away soon.
Comment 10•5 years ago
The priority flag is not set for this bug.
:jimm, could you have a look please?
For more information, please visit auto_nag documentation.
Comment 11•5 years ago
After the fix for bug 1614585 landed, the volume here dropped almost to zero, with all the audio-related crashes going away. The few remaining reports have stack traces that are all over the place, so I'm marking this fixed and adding the signature to the "slow" ones.