Crash in [@ shutdownhang | NtGetContextThread]
Categories
(Toolkit :: Crash Reporting, defect)
Tracking
()
| Tracking | Status | |
|---|---|---|
| firefox-esr60 | --- | unaffected |
| firefox66 | --- | unaffected |
| firefox67 | --- | fixed |
| firefox68 | --- | fixed |
People
(Reporter: calixte, Assigned: gsvelto)
Details
(Keywords: crash, regression)
Crash Data
Attachments
(2 files)
|
47 bytes,
text/x-phabricator-request
|
pascalc
:
approval-mozilla-beta+
|
Details | Review |
|
4.69 KB,
patch
|
Details | Diff | Splinter Review |
This bug is for crash report bp-0800ebfb-4dc3-4cec-88d6-ef1770190320.
Top 10 frames of crashing thread:
0 ntdll.dll NtGetContextThread
1 kernelbase.dll GetThreadContext
2 dbgcore.dll Win32LiveSystemProvider::GetThreadContextEx
3 dbgcore.dll NtWin32LiveSystemProvider::GetThreadContextEx
4 dbgcore.dll GenAllocateThreadObject
5 dbgcore.dll GenGetProcessInfo
6 dbgcore.dll MiniDumpProvideDump
7 dbgcore.dll MiniDumpWriteDump
8 xul.dll bool google_breakpad::ExceptionHandler::WriteMinidumpWithExceptionForProcess toolkit/crashreporter/breakpad-client/windows/handler/exception_handler.cc:950
9 xul.dll google_breakpad::ExceptionHandler::WriteMinidumpForChild toolkit/crashreporter/breakpad-client/windows/handler/exception_handler.cc:814
There are 4 crashes starting with buildid 20190319095054.
The pushlog for this build is:
https://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=fe798624cda0&tochange=880331515823
It appears that it could be related to patch for bug 1533842.
:gsvelto, could you investigate please ?
Updated•6 years ago
|
| Assignee | ||
Comment 1•6 years ago
|
||
This is very odd, it looks as if bug 1498942 has come back but the change in bug 1533842 should not have changed the behavior at all. It's late now but I'll check again tomorrow first thing in the morning.
| Assignee | ||
Comment 2•6 years ago
|
||
Investigating this issue turned out to be more complex than I had anticipated. As it turns out my fix for bug 1498942 worked purely by chance which explains quite a few things. First of all since we were not initializing the mShuttingDown variable when creating a ContentParent object the variable ended up holding garbage, but since we use a non-zero poison pattern for freed memory the variable evaluated to true most of the time. This in turn prevented us from taking minidumps of hung content processes even when we weren't shutting down which is bad because it must have artificially lowered our content process hang rate on crash-stats.
That being said since we weren't taking minidumps of content process anymore on shutdown we didn't hit this particular crash. When I fixed the variable initialization the patch started behaving like it should except that there was a second issue in the way: we set the mShuttingDown variable inside the profile-before-change/xpcom-shutdown observer but - depending on the state of the content process - there seem to be a race that causes the observer not to be called and so we never set mShuttingDown. I must decouple the shutdown detection from the rest of the ContentParent logic because it's the only way to prevent this issue from happening.
| Assignee | ||
Comment 3•6 years ago
|
||
I've identified the race between the process being killed and receiving profile-before-change, the fix should be straightforward.
| Assignee | ||
Updated•6 years ago
|
| Assignee | ||
Comment 4•6 years ago
|
||
This code was already present but it had two flaws: the first one was that the
variable used to prevent minidumps from being taken when shutting down was not
being initialized which caused it to be true most of the time and thus
prevented captures even in valid scenarios. The second was that we relied on
the "profile-before-change" change event to detect shutdown but this often
happens after we've already started closing tabs and shutting down processes
thus the minidump generation would often race against it.
Comment 6•6 years ago
|
||
| bugherder | ||
| Assignee | ||
Comment 7•6 years ago
|
||
I don't see any new crashes after buildid 20190329220047 so this should be fixed for good.
| Assignee | ||
Comment 8•6 years ago
|
||
Comment on attachment 9053236 [details]
Bug 1536850 - Properly detect when the browser is shutting down to avoid taking content process minidumps too late
Beta/Release Uplift Approval Request
- Feature/Bug causing the regression: Bug 1498942
- User impact if declined: Crash reports for hung content processes are never sent
- Is this code covered by automated tests?: No
- Has the fix been verified in Nightly?: Yes
- Needs manual test from QE?: No
- If yes, steps to reproduce:
- List of other uplifts needed: None
- Risk to taking this patch: Low
- Why is the change risky/not risky? (and alternatives if risky): The changes uses the proper interface to detect shutdown and the patch is small
- String changes made/needed: None
Comment 9•6 years ago
|
||
(In reply to Gabriele Svelto [:gsvelto] from comment #8)
- List of other uplifts needed: None
I believe we'll want to re-land bug 1533842 if this gets approved?
Comment 10•6 years ago
|
||
Comment on attachment 9053236 [details]
Bug 1536850 - Properly detect when the browser is shutting down to avoid taking content process minidumps too late
Shutdown crash fix, approved for 67 beta 10, thanks!
Comment 11•6 years ago
|
||
(In reply to Ryan VanderMeulen [:RyanVM] from comment #9)
(In reply to Gabriele Svelto [:gsvelto] from comment #8)
- List of other uplifts needed: None
I believe we'll want to re-land bug 1533842 if this gets approved?
Gabriele, should we reland bug 1533842? Thanks
| Assignee | ||
Comment 12•6 years ago
|
||
No, the patch here replaces bug 1533842 entirely.
Updated•6 years ago
|
Comment 13•6 years ago
•
|
||
Tried to uplift to beta, failed at builds/worker/workspace/build/src/dom/ipc/ContentParent.cpp
Push with failure: https://treeherder.mozilla.org/#/jobs?repo=mozilla-beta&resultStatus=testfailed%2Cbusted%2Cexception&revision=11ddfc9bd642f08e487fb76d23b7626f0b179e58
Failure log: https://treeherder.mozilla.org/logviewer.html#/jobs?job_id=239550295&repo=mozilla-beta&lineNumber=23084
[task 2019-04-11T03:06:34.423Z] 03:06:34 INFO - make[4]: Entering directory '/builds/worker/workspace/build/src/obj-firefox/dom/ipc'
[task 2019-04-11T03:06:34.423Z] 03:06:34 INFO - /builds/worker/workspace/build/src/clang/bin/clang++ -o Unified_cpp_dom_ipc0.o -c -I/builds/worker/workspace/build/src/obj-firefox/dist/stl_wrappers -I/builds/worker/workspace/build/src/obj-firefox/dist/system_wrappers -include /builds/worker/workspace/build/src/config/gcc_hidden.h -DDEBUG=1 -DOS_POSIX=1 -DOS_LINUX=1 '-DBIN_SUFFIX=""' -DMOZ_TOOLKIT_SEARCH -DSTATIC_EXPORTABLE_JS_API -DMOZ_HAS_MOZGLUE -DMOZILLA_INTERNAL_API -DIMPL_LIBXUL -I/builds/worker/workspace/build/src/dom/ipc -I/builds/worker/workspace/build/src/obj-firefox/dom/ipc -I/builds/worker/workspace/build/src/obj-firefox/ipc/ipdl/ipdlheaders -I/builds/worker/workspace/build/src/ipc/chromium/src -I/builds/worker/workspace/build/src/ipc/glue -I/builds/worker/workspace/build/src/caps -I/builds/worker/workspace/build/src/chrome -I/builds/worker/workspace/build/src/docshell/base -I/builds/worker/workspace/build/src/dom/base -I/builds/worker/workspace/build/src/dom/bindings -I/builds/worker/workspace/build/src/dom/events -I/builds/worker/workspace/build/src/dom/filesystem -I/builds/worker/workspace/build/src/dom/geolocation -I/builds/worker/workspace/build/src/dom/media/webspeech/synth/ipc -I/builds/worker/workspace/build/src/dom/security -I/builds/worker/workspace/build/src/dom/storage -I/builds/worker/workspace/build/src/extensions/cookie -I/builds/worker/workspace/build/src/extensions/spellcheck/src -I/builds/worker/workspace/build/src/gfx/2d -I/builds/worker/workspace/build/src/hal/sandbox -I/builds/worker/workspace/build/src/js/xpconnect/loader -I/builds/worker/workspace/build/src/js/xpconnect/src -I/builds/worker/workspace/build/src/layout/base -I/builds/worker/workspace/build/src/media/webrtc -I/builds/worker/workspace/build/src/netwerk/base -I/builds/worker/workspace/build/src/netwerk/protocol/http -I/builds/worker/workspace/build/src/toolkit/components/printingui/ipc -I/builds/worker/workspace/build/src/toolkit/crashreporter -I/builds/worker/workspace/build/src/toolkit/xre -I/builds/worker/workspace/build/src/uriloader/exthandler -I/builds/worker/workspace/build/src/widget -I/builds/worker/workspace/build/src/xpcom/base -I/builds/worker/workspace/build/src/xpcom/threads -I/builds/worker/workspace/build/src/modules/libjar -I/builds/worker/workspace/build/src/obj-firefox/dist/include -I/builds/worker/workspace/build/src/obj-firefox/dist/include/nspr -I/builds/worker/workspace/build/src/obj-firefox/dist/include/nss -fPIC -DMOZILLA_CLIENT -include /builds/worker/workspace/build/src/obj-firefox/mozilla-config.h -Qunused-arguments -U_FORTIFY_SOURCE -fno-common -Qunused-arguments -Wall -Wbitfield-enum-conversion -Wempty-body -Wignored-qualifiers -Woverloaded-virtual -Wpointer-arith -Wshadow-field-in-constructor-modified -Wsign-compare -Wtype-limits -Wunreachable-code -Wunreachable-code-return -Wwrite-strings -Wno-invalid-offsetof -Wclass-varargs -Wfloat-overflow-conversion -Wfloat-zero-conversion -Wloop-analysis -Wc++1z-compat -Wc++2a-compat -Wcomma -Wimplicit-fallthrough -Werror=non-literal-null-conversion -Wstring-conversion -Wtautological-overlap-compare -Wtautological-unsigned-enum-zero-compare -Wtautological-unsigned-zero-compare -Wno-inline-new-delete -Wno-error=deprecated-declarations -Wno-error=array-bounds -Wno-error=return-std-move -Wno-error=atomic-alignment -Wformat -Wformat-security -Wno-gnu-zero-variadic-macro-arguments -Wno-unknown-warning-option -Wno-return-type-c-linkage -D_GLIBCXX_USE_CXX11_ABI=0 -fno-sized-deallocation -fsanitize=address -fcrash-diagnostics-dir=/builds/worker/artifacts -U_FORTIFY_SOURCE -fno-common -fno-exceptions -fno-strict-aliasing -fno-rtti -ffunction-sections -fdata-sections -fno-exceptions -fno-math-errno -pthread -pipe -g -Xclang -load -Xclang /builds/worker/workspace/build/src/obj-firefox/build/clang-plugin/libclang-plugin.so -Xclang -add-plugin -Xclang moz-check -O1 -fno-omit-frame-pointer -funwind-tables -Werror -I/builds/worker/workspace/build/src/widget/gtk/compat-gtk3 -pthread -I/usr/include/gtk-3.0 -I/usr/include/atk-1.0 -I/usr/include/at-spi2-atk/2.0 -I/usr/include/pango-1.0 -I/usr/include/gio-unix-2.0/ -I/usr/include/cairo -I/usr/include/gdk-pixbuf-2.0 -I/usr/include/glib-2.0 -I/usr/lib/x86_64-linux-gnu/glib-2.0/include -I/usr/include/harfbuzz -I/usr/include/freetype2 -I/usr/include/pixman-1 -I/usr/include/libpng12 -I/usr/include/gtk-3.0/unix-print -Wno-error=shadow -MD -MP -MF .deps/Unified_cpp_dom_ipc0.o.pp /builds/worker/workspace/build/src/obj-firefox/dom/ipc/Unified_cpp_dom_ipc0.cpp
[task 2019-04-11T03:06:34.423Z] 03:06:34 INFO - In file included from /builds/worker/workspace/build/src/obj-firefox/dom/ipc/Unified_cpp_dom_ipc0.cpp:56:
[task 2019-04-11T03:06:34.423Z] 03:06:34 ERROR - /builds/worker/workspace/build/src/dom/ipc/ContentParent.cpp:3380:54: error: too few arguments to function call, single argument 'aShuttingDown' was not specified
[task 2019-04-11T03:06:34.423Z] 03:06:34 INFO - if (mCrashReporter && !appStartup->GetShuttingDown() &&
[task 2019-04-11T03:06:34.423Z] 03:06:34 INFO - ~~~~~~~~~~~~~~~~~~~~~~~~~~~ ^
[task 2019-04-11T03:06:34.423Z] 03:06:34 INFO - /builds/worker/workspace/build/src/obj-firefox/dist/include/nsIAppStartup.h:78:25: note: 'GetShuttingDown' declared here
[task 2019-04-11T03:06:34.423Z] 03:06:34 INFO - JS_HAZ_CAN_RUN_SCRIPT NS_IMETHOD GetShuttingDown(bool *aShuttingDown) = 0;
[task 2019-04-11T03:06:34.423Z] 03:06:34 INFO - ^
[task 2019-04-11T03:06:34.423Z] 03:06:34 INFO - /builds/worker/workspace/build/src/obj-firefox/dist/include/nscore.h:134:20: note: expanded from macro 'NS_IMETHOD'
[task 2019-04-11T03:06:34.423Z] 03:06:34 INFO - #define NS_IMETHOD NS_IMETHOD(nsresult)
[task 2019-04-11T03:06:34.423Z] 03:06:34 INFO - ^
[task 2019-04-11T03:06:34.423Z] 03:06:34 INFO - /builds/worker/workspace/build/src/obj-firefox/dist/include/nscore.h:125:29: note: expanded from macro 'NS_IMETHOD_'
[task 2019-04-11T03:06:34.423Z] 03:06:34 INFO - # define NS_IMETHOD_(type) virtual type
[task 2019-04-11T03:06:34.423Z] 03:06:34 INFO - ^
[task 2019-04-11T03:06:34.423Z] 03:06:34 INFO - 1 error generated.
[task 2019-04-11T03:06:34.423Z] 03:06:34 INFO - /builds/worker/workspace/build/src/config/rules.mk:805: recipe for target 'Unified_cpp_dom_ipc0.o' failed
[task 2019-04-11T03:06:34.423Z] 03:06:34 ERROR - make[4]: *** [Unified_cpp_dom_ipc0.o] Error 1
[task 2019-04-11T03:06:34.423Z] 03:06:34 INFO - make[4]: Leaving directory '/builds/worker/workspace/build/src/obj-firefox/dom/ipc'
[task 2019-04-11T03:06:34.423Z] 03:06:34 INFO - make[4]: *** Waiting for unfinished jobs....
[task 2019-04-11T03:06:34.423Z] 03:06:34 INFO - make[4]: Entering directory '/builds/worker/workspace/build/src/obj-firefox/dom/worklet'
[task 2019-04-11T03:06:34.423Z] 03:06:34 INFO - mkdir -p '.deps/'
| Assignee | ||
Comment 14•6 years ago
|
||
(In reply to Daniel Varga [:dvarga] from comment #13)
[task 2019-04-11T03:06:34.423Z] 03:06:34 INFO - In file included from /builds/worker/workspace/build/src/obj-firefox/dom/ipc/Unified_cpp_dom_ipc0.cpp:56:
[task 2019-04-11T03:06:34.423Z] 03:06:34 ERROR - /builds/worker/workspace/build/src/dom/ipc/ContentParent.cpp:3380:54: error: too few arguments to function call, single argument 'aShuttingDown' was not specified
[task 2019-04-11T03:06:34.423Z] 03:06:34 INFO - if (mCrashReporter && !appStartup->GetShuttingDown() &&
[task 2019-04-11T03:06:34.423Z] 03:06:34 INFO - ~~~~~~~~~~~~~~~~~~~~~~~~~~~ ^
Ugh, there's a problem with my original patch I hadn't spotted... and more importantly this error is present in a few other places in the code which fooled me into thinking I was doing the right thing. GetShuttingDown() will always return true here which is not what we wanted. Ignore this patch for now, I'll file a follow-up.
| Assignee | ||
Comment 15•6 years ago
|
||
OK, scratch part of my previous comment: the patch is working on central, it's failing on beta because it implicitly relied on bug 1523609 (which changed how GetShuttingDown() behaves by adding an inlined-version which does what you'd expect it to do). I'll write a beta-specific patch and attach it here.
| Assignee | ||
Comment 16•6 years ago
|
||
Comment 17•6 years ago
|
||
| bugherder uplift | ||
Updated•6 years ago
|
Description
•