Crash in [@ IPCError-browser | ShutDownKill | NtYieldExecution]
Categories
(Core :: XPCOM, defect, P3)
Tracking
()
People
(Reporter: pascalc, Unassigned)
References
(Blocks 1 open bug)
Details
(Keywords: crash)
Crash Data
This bug is for crash report bp-d90d2a48-9fa2-41a9-800d-3df1c0200210.
Top 10 frames of crashing thread:
0 ntdll.dll NtYieldExecution
1 user32.dll PeekMessageW
2 xul.dll SingleNativeEventPump::OnProcessNextEvent widget/windows/nsAppShell.cpp:140
3 xul.dll nsThread::ProcessNextEvent xpcom/threads/nsThread.cpp:1124
4 xul.dll NS_ProcessNextEvent xpcom/threads/nsThreadUtils.cpp:486
5 xul.dll mozilla::ipc::MessagePump::Run ipc/glue/MessagePump.cpp:87
6 xul.dll MessageLoop::RunHandler ipc/chromium/src/base/message_loop.cc:308
7 xul.dll MessageLoop::Run ipc/chromium/src/base/message_loop.cc:290
8 xul.dll nsBaseAppShell::Run widget/nsBaseAppShell.cpp:137
9 xul.dll nsAppShell::Run widget/windows/nsAppShell.cpp:406
Comment 1•5 years ago
This signature shows up only on nightly, which is odd because I don't see why it shouldn't show up in release builds. What makes it worrisome is that if you look at the crashes you'll see that all the threads are stopped, waiting. The content processes here didn't even begin shutting down; they're just sitting idle.
Comment 2 (Reporter)•5 years ago
Tracking given the volume, let's see if the crashes get into 74 beta.
Comment 4 (Reporter)•5 years ago
Restricted to nightly, no crashes in 74 beta so the release is unaffected.
Comment 5•5 years ago
Bugbug thinks this bug is a regression, but please revert this change in case of error.
Comment 6•5 years ago
Nathan, the crash volume seems to be high. Could you set the priority flag for this bug?
Comment hidden (obsolete)
Comment 8•5 years ago
I am apparently dumb, because what's getting changed in ntdll.dll is our hooking of functions using WindowsDllInterceptor, and the code getting changed is completely unrelated to where the crashes are. Thanks to dmajor for educating me.
So I have no idea what's going on with these content processes, unless we're not going through their proper shutdown sequence due to some nightly-only thing. I don't suppose we're able to correlate these with parent process crashes (or maybe the parent process is shutting down cleanly...), gsvelto?
Comment 9•5 years ago
Looking at the IPCShutdownState annotation, roughly 30% of the crashes have it set to SendFinishShutdown (sent), so they were just being slow; the process finished shutting down right after we captured the minidump and before we killed the process.
The remaining ones don't have the annotation set at all, which indicates that they haven't received the shutdown message yet. This can happen if the process was never scheduled between when we sent the shutdown IPC message and when we decided to kill it. That window is 5 seconds, which is a pretty long time, but it could just indicate slowness.
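For anyone wanting to repeat this triage, here is a rough sketch of the bucketing described above. The annotation name and the "SendFinishShutdown (sent)" value are taken from this comment; the dict-per-report layout, the function names, and the sample data are assumptions made up for illustration and do not reflect any real crash-stats API.

```python
from collections import Counter

# Value quoted in this comment; all other names here are hypothetical.
FINISHED_VALUE = "SendFinishShutdown (sent)"


def classify_shutdown_kill(annotations):
    """Bucket one ShutDownKill report by its IPCShutdownState annotation.

    `annotations` is assumed to be a plain dict of annotation name -> value.
    """
    state = annotations.get("IPCShutdownState")
    if state is None:
        # No annotation: the process likely never saw the shutdown message.
        return "never_received_shutdown"
    if state == FINISHED_VALUE:
        # Shutdown completed; the process was merely slow.
        return "slow_but_finished"
    return "other"


def summarize(reports):
    """Count reports per bucket."""
    return Counter(classify_shutdown_kill(r) for r in reports)


# Made-up sample data, not real crash reports.
sample_reports = [
    {"IPCShutdownState": FINISHED_VALUE},
    {},
    {},
]
summary = summarize(sample_reports)
```

With real data, the share in the `slow_but_finished` bucket would correspond to the roughly 30% figure mentioned above.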
Which made me remember something important on Windows: since bug 1366356 processes that were not running a foreground tab had their CPU priority demoted. Since we're killing these processes they're certainly not running a foreground tab. A way to speed them up might be to raise their priority again just before we send them the shutdown IPC message. I'll file a bug for that.
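To make the proposed ordering concrete, here is a toy model of "raise priority, then send the shutdown message". Everything in it is hypothetical: `FakeContentProcess`, its methods, and the priority labels are stand-ins invented for this sketch, not Gecko or Windows APIs, and the real change would live in the parent-process shutdown path.

```python
class FakeContentProcess:
    """Hypothetical stand-in for a content process handle (not a Gecko API)."""

    def __init__(self):
        # Background processes start out demoted, as described for bug 1366356.
        self.priority = "background"
        self.log = []

    def set_priority(self, level):
        self.priority = level
        self.log.append(("set_priority", level))

    def send_shutdown(self):
        # Record which priority the process had when the message was sent.
        self.log.append(("send_shutdown", self.priority))


def shutdown_with_priority_boost(proc):
    """Restore normal priority first so the process has a better chance of
    being scheduled before the kill timeout expires, then request shutdown."""
    proc.set_priority("normal")
    proc.send_shutdown()


proc = FakeContentProcess()
shutdown_with_priority_boost(proc)
```

The point is only the ordering: the shutdown message is sent after the priority has been restored, never before.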
Comment 10•5 years ago
Considering the "regressionwindow-wanted" tag, I could try to find the regression if some steps to reproduce are provided.
Please NI me if any working STR are obtained. Thanks.
Comment 11•5 years ago
This is not a new issue, so I removed the regression-related flags.
Comment 13•5 years ago
I believe comment 7 is obsolete: comment 8 explains it. I've hidden the comment to avoid future confusion.
Comment 14•5 years ago
This is one of the top overall Nightly crashes at the moment. Given the resolution of bug 1619676, is there anything else we can do to mitigate this issue?
Comment 15•5 years ago
This is just a "content process being slow" issue so I'm dup'ing against bug 1279293 which is where this belongs.