Crash in PLDHashTable::Add | PLDHashTable::Add | mozilla::dom::WindowGlobalParent::Init

RESOLVED FIXED in Firefox 66

Status

()

defect
P2
critical
RESOLVED FIXED
8 months ago
5 months ago

People

(Reporter: gsvelto, Assigned: Nika)

Tracking

({crash, regression})

unspecified
mozilla67
Unspecified
Android
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(Fission Milestone:M1, firefox-esr60 unaffected, firefox65+ wontfix, firefox66+ fixed, firefox67+ fixed)

Details

(crash signature)

Attachments

(4 attachments)

This bug was filed from the Socorro interface and is
report bp-1eb26f44-3ea3-4fd5-8e86-644660181224.
=============================================================

Top 10 frames of crashing thread:

0 libxul.so PLDHashTable::Add xpcom/ds/PLDHashTable.h:327
1 libxul.so PLDHashTable::Add xpcom/ds/PLDHashTable.cpp:572
2 libxul.so mozilla::dom::WindowGlobalParent::Init dom/ipc/WindowGlobalParent.cpp:68
3 libxul.so mozilla::dom::TabParent::RecvPWindowGlobalConstructor dom/ipc/TabParent.cpp:995
4 libxul.so mozilla::ipc::PInProcessParent::OnMessageReceived ipc/ipdl/PInProcessParent.cpp:157
5 libxul.so mozilla::ipc::MessageChannel::DispatchAsyncMessage ipc/glue/MessageChannel.cpp:2159
6 libxul.so mozilla::ipc::MessageChannel::DispatchMessage ipc/glue/MessageChannel.cpp:2086
7 libxul.so mozilla::ipc::MessageChannel::MessageTask::Run ipc/glue/MessageChannel.cpp:1966
8 libxul.so nsThread::ProcessNextEvent xpcom/threads/nsThread.cpp:1157
9 libxul.so NS_ProcessNextEvent xpcom/threads/nsThreadUtils.cpp:468

=============================================================

Fennec-specific, not sure if this is the right component though since there's involved.
Flags: needinfo?(nika)
Interesting... It seems like Fennec is creating a WindowGlobal in a situation without a BrowsingContext created for it. That's quite weird and unfortunate.

In DEBUG mode, this assertion would be being hit: https://hg.mozilla.org/releases/mozilla-beta/annotate/dfe78db5e4ae6de6052b22e0e2d7ced375a0110c/dom/ipc/WindowGlobalParent.cpp#l61

I'm imagining that the source of the issue here is that the message is being asynchronously sent, but the BrowsingContext has already been destroyed before it is received, causing this failure. We may be able to get away with a `Send__delete__` call here.
Flags: needinfo?(nika)
Priority: -- → P2

This is showing up pretty high in the Fx65 crash data. Nika, can you please take another look?

Flags: needinfo?(nika)

Hmm - so looking at this more, it seems like this is occurring due to in-process browsing context destruction not following ordering when occurring in-process. In effect, something like this is probably happening:

  1. nsGlobalWindowInner is created for some content, which creates a WindowGlobalChild & dispatches msg to create WindowGlobalParent.
  2. DocShell is destroyed, destroying BrowsingContext, and WindowGlobalChild. Sends delete msg to destroy WindowGlobalParent.
  3. WindowGlobalParent ctor is received, but BrowsingContext is already gone, due to being destroyed synchronously.

The best option here is a bit complex. The goal here is to have the cross-process and in-process case be as similar as possible, but this change in behaviour with BrowsingContext state reveals how, in some cases, that doesn't happen. We're racing a synchronous update to shared state with an async call which expects that to be processed before additional work occurs.

One option here would be to, when sending the BrowsingContextId in-process, take a reference to the BrowsingContext itself to make sure it is still alive when reaching the other side. This would probably look like having an IPDLParamTraits implementation directly on BrowsingContext which has this special behaviour.

ni? farre for any thoughts

Flags: needinfo?(nika) → needinfo?(afarre)

As we discussed on IRC, some kind of stablestate?

Flags: needinfo?(afarre)

Nika, we have till 18 Mar to make it into 66. Is it possible to land a fix by then?

Flags: needinfo?(nika)

I have some work-in-progress patches which implement parts of what :farre and I talked about yesterday. I'll post it here when I get it done.

Flags: needinfo?(nika)

This patch takes the approach of taking a reference, so that we can land
it into the tree more quickly & fix issues we have.

This isn't a final solution by any means, we should also do something
along the lines of the StableState approach, but it should be sufficient
for now.

Depends on D19177

This patch changes the logic such that we use the new direct
BrowsingContext ParamTraits implementation when possible, and avoids
doing manual lookups.

Depends on D19178

Assignee: nobody → nika
Pushed by nlayzell@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/62e70b24ecb8
Part 1: Expose whether a cross-process channel is in use, r=mccr8
https://hg.mozilla.org/integration/autoland/rev/e4bc2ad58acf
Part 2: Allow serializing BrowsingContext over IPC, r=farre
https://hg.mozilla.org/integration/autoland/rev/324d25d3f933
Part 3: Directly pass BrowsingContext over IPC when possible, r=farre
Fission Milestone: --- → M1

Is this something you're intending for beta uplift?

Flags: needinfo?(nika)

I'll make a minimal version which should fix the crash for beta uplift.

Flags: needinfo?(nika)

This patch is intended for the beta branch, to mitigate the crashing
case.

Comment on attachment 9045374 [details]
Bug 1516240 - Hold a BrowsingContext reference when constructing in-process actors in BETA,

Beta/Release Uplift Approval Request

Feature/Bug causing the regression

Bug 1500948

User impact if declined

Occasional crashes for in-process actors

Is this code covered by automated tests?

Yes

Has the fix been verified in Nightly?

No

Needs manual test from QE?

No

If yes, steps to reproduce

List of other uplifts needed

None

Risk to taking this patch

Low

Why is the change risky/not risky? (and alternatives if risky)

Simply adds an extra reference in this case to delay destruction of valuable objects.

String changes made/needed

None

Attachment #9045374 - Flags: approval-mozilla-beta?

Comment on attachment 9045374 [details]
Bug 1516240 - Hold a BrowsingContext reference when constructing in-process actors in BETA,

Avoid a top crash. Ok for uplift for beta 11.

Attachment #9045374 - Flags: approval-mozilla-beta? → approval-mozilla-beta+

Backed out changeset for causing build bustages on WindowGlobalParent.

Push with failures: https://treeherder.mozilla.org/#/jobs?repo=mozilla-beta&resultStatus=testfailed%2Cbusted%2Cexception&revision=3f1501ac45d0630bddba90054d27ed96c60a7f6f&selectedJob=230105613

Failure log: https://treeherder.mozilla.org/logviewer.html#/jobs?job_id=230105613&repo=mozilla-beta&lineNumber=34962

Backout link: https://hg.mozilla.org/releases/mozilla-beta/rev/3e124918dc6827e285d5aac82d046086cf2b0ea5

[task 2019-02-23T21:54:25.625Z] 21:54:25 INFO - /builds/worker/workspace/build/src/clang/bin/clang++ --target=i686-linux-gnu -o Unified_cpp_dom_ipc1.o -c -I/builds/worker/workspace/build/src/obj-firefox/dist/stl_wrappers -I/builds/worker/workspace/build/src/obj-firefox/dist/system_wrappers -include /builds/worker/workspace/build/src/config/gcc_hidden.h -DDEBUG=1 -DOS_POSIX=1 -DOS_LINUX=1 '-DBIN_SUFFIX=""' -DMOZ_TOOLKIT_SEARCH -DSTATIC_EXPORTABLE_JS_API -DMOZ_HAS_MOZGLUE -DMOZILLA_INTERNAL_API -DIMPL_LIBXUL -I/builds/worker/workspace/build/src/dom/ipc -I/builds/worker/workspace/build/src/obj-firefox/dom/ipc -I/builds/worker/workspace/build/src/obj-firefox/ipc/ipdl/_ipdlheaders -I/builds/worker/workspace/build/src/ipc/chromium/src -I/builds/worker/workspace/build/src/ipc/glue -I/builds/worker/workspace/build/src/caps -I/builds/worker/workspace/build/src/chrome -I/builds/worker/workspace/build/src/docshell/base -I/builds/worker/workspace/build/src/dom/base -I/builds/worker/workspace/build/src/dom/bindings -I/builds/worker/workspace/build/src/dom/events -I/builds/worker/workspace/build/src/dom/filesystem -I/builds/worker/workspace/build/src/dom/geolocation -I/builds/worker/workspace/build/src/dom/media/webspeech/synth/ipc -I/builds/worker/workspace/build/src/dom/security -I/builds/worker/workspace/build/src/extensions/cookie -I/builds/worker/workspace/build/src/extensions/spellcheck/src -I/builds/worker/workspace/build/src/gfx/2d -I/builds/worker/workspace/build/src/hal/sandbox -I/builds/worker/workspace/build/src/js/xpconnect/loader -I/builds/worker/workspace/build/src/js/xpconnect/src -I/builds/worker/workspace/build/src/layout/base -I/builds/worker/workspace/build/src/media/webrtc -I/builds/worker/workspace/build/src/netwerk/base -I/builds/worker/workspace/build/src/netwerk/protocol/http -I/builds/worker/workspace/build/src/toolkit/components/printingui/ipc -I/builds/worker/workspace/build/src/toolkit/crashreporter -I/builds/worker/workspace/build/src/toolkit/xre -I/builds/worker/workspace/build/src/uriloader/exthandler -I/builds/worker/workspace/build/src/widget -I/builds/worker/workspace/build/src/xpcom/base -I/builds/worker/workspace/build/src/xpcom/threads -I/builds/worker/workspace/build/src/modules/libjar -I/builds/worker/workspace/build/src/obj-firefox/dist/include -I/builds/worker/workspace/build/src/obj-firefox/dist/include/nspr -I/builds/worker/workspace/build/src/obj-firefox/dist/include/nss -fPIC -DMOZILLA_CLIENT -include /builds/worker/workspace/build/src/obj-firefox/mozilla-config.h -Qunused-arguments -U_FORTIFY_SOURCE -D_FORTIFY_SOURCE=2 -fstack-protector-strong -Qunused-arguments -Wall -Wbitfield-enum-conversion -Wempty-body -Wignored-qualifiers -Woverloaded-virtual -Wpointer-arith -Wshadow-field-in-constructor-modified -Wsign-compare -Wtype-limits -Wunreachable-code -Wunreachable-code-return -Wwrite-strings -Wno-invalid-offsetof -Wclass-varargs -Wfloat-overflow-conversion -Wfloat-zero-conversion -Wloop-analysis -Wc++1z-compat -Wc++2a-compat -Wcomma -Wimplicit-fallthrough -Werror=non-literal-null-conversion -Wstring-conversion -Wtautological-overlap-compare -Wtautological-unsigned-enum-zero-compare -Wtautological-unsigned-zero-compare -Wno-inline-new-delete -Wno-error=deprecated-declarations -Wno-error=array-bounds -Wno-error=return-std-move -Wno-error=atomic-alignment -Wformat -Wformat-security -Wno-gnu-zero-variadic-macro-arguments -Wno-unknown-warning-option -Wno-return-type-c-linkage -D_GLIBCXX_USE_CXX11_ABI=0 -fno-sized-deallocation -fcrash-diagnostics-dir=/builds/worker/artifacts -march=pentium-m -msse -msse2 -mfpmath=sse -U_FORTIFY_SOURCE -D_FORTIFY_SOURCE=2 -fstack-protector-strong -fno-exceptions -fno-strict-aliasing -fno-rtti -ffunction-sections -fdata-sections -fno-exceptions -fno-math-errno -pthread -pipe -g -Xclang -load -Xclang /builds/worker/workspace/build/src/obj-firefox/build/clang-plugin/libclang-plugin.so -Xclang -add-plugin -Xclang moz-check -Os -fno-omit-frame-pointer -funwind-tables -Werror -I/builds/worker/workspace/build/src/widget/gtk/compat-gtk3 -pthread -I/usr/include/gtk-3.0 -I/usr/include/atk-1.0 -I/usr/include/at-spi2-atk/2.0 -I/usr/include/pango-1.0 -I/usr/include/gio-unix-2.0/ -I/usr/include/cairo -I/usr/include/gdk-pixbuf-2.0 -I/usr/include/glib-2.0 -I/usr/lib/i386-linux-gnu/glib-2.0/include -I/usr/include/harfbuzz -I/usr/include/freetype2 -I/usr/include/pixman-1 -I/usr/include/libpng12 -I/usr/include/gtk-3.0/unix-print -Wno-error=shadow -MD -MP -MF .deps/Unified_cpp_dom_ipc1.o.pp /builds/worker/workspace/build/src/obj-firefox/dom/ipc/Unified_cpp_dom_ipc1.cpp
[task 2019-02-23T21:54:25.628Z] 21:54:25 INFO - In file included from /builds/worker/workspace/build/src/obj-firefox/dom/ipc/Unified_cpp_dom_ipc1.cpp:92:
[task 2019-02-23T21:54:25.629Z] 21:54:25 ERROR - /builds/worker/workspace/build/src/dom/ipc/WindowGlobalParent.cpp:86:12: error: template argument for template type parameter must be a type
[task 2019-02-23T21:54:25.630Z] 21:54:25 INFO - RefPtr<BrowsingContext> dummy = dont_AddRef(mBrowsingContext);
[task 2019-02-23T21:54:25.631Z] 21:54:25 INFO - ^~~~~~~~~~~~~~~
[task 2019-02-23T21:54:25.632Z] 21:54:25 INFO - /builds/worker/workspace/build/src/obj-firefox/dist/include/mozilla/layers/DoubleTapToZoom.h:12:17: note: template parameter is declared here
[task 2019-02-23T21:54:25.633Z] 21:54:25 INFO - template <class T>
[task 2019-02-23T21:54:25.634Z] 21:54:25 INFO - ^
[task 2019-02-23T21:54:25.635Z] 21:54:25 INFO - In file included from /builds/worker/workspace/build/src/obj-firefox/dom/ipc/Unified_cpp_dom_ipc1.cpp:92:
[task 2019-02-23T21:54:25.636Z] 21:54:25 ERROR - /builds/worker/workspace/build/src/dom/ipc/WindowGlobalParent.cpp:86:37: error: no matching function for call to 'dont_AddRef'
[task 2019-02-23T21:54:25.637Z] 21:54:25 INFO - RefPtr<BrowsingContext> dummy = dont_AddRef(mBrowsingContext);
[task 2019-02-23T21:54:25.638Z] 21:54:25 INFO - ^~~~~~~~~~~
[task 2019-02-23T21:54:25.639Z] 21:54:25 INFO - /builds/worker/workspace/build/src/obj-firefox/dist/include/nsCOMPtr.h:115:28: note: candidate template ignored: could not match 'T ' against 'RefPtr<mozilla::dom::ChromeBrowsingContext>'
[task 2019-02-23T21:54:25.640Z] 21:54:25 INFO - inline already_AddRefed<T> dont_AddRef(T
aRawPtr) {
[task 2019-02-23T21:54:25.642Z] 21:54:25 INFO - ^
[task 2019-02-23T21:54:25.642Z] 21:54:25 INFO - /builds/worker/workspace/build/src/obj-firefox/dist/include/nsCOMPtr.h:120:30: note: candidate template ignored: could not match 'already_AddRefed' against 'RefPtr'
[task 2019-02-23T21:54:25.642Z] 21:54:25 INFO - inline already_AddRefed<T>&& dont_AddRef(
[task 2019-02-23T21:54:25.642Z] 21:54:25 INFO - ^
[task 2019-02-23T21:54:25.642Z] 21:54:25 INFO - 2 errors generated.
[task 2019-02-23T21:54:25.643Z] 21:54:25 INFO - /builds/worker/workspace/build/src/config/rules.mk:1106: recipe for target 'Unified_cpp_dom_ipc1.o' failed
[task 2019-02-23T21:54:25.643Z] 21:54:25 ERROR - make[4]: *** [Unified_cpp_dom_ipc1.o] Error 1
[task 2019-02-23T21:54:25.643Z] 21:54:25 INFO - make[4]: Leaving directory '/builds/worker/workspace/build/src/obj-firefox/dom/ipc'
[task 2019-02-23T21:54:25.643Z] 21:54:25 INFO - make[4]: *** Waiting for unfinished jobs....

Flags: needinfo?(nika)

Patch updated to fix that mistake - builds successfully locally.

Status: RESOLVED → REOPENED
Flags: needinfo?(nika) → needinfo?(aryx.bugmail)
Resolution: FIXED → ---

central didn't get backed out, back to "RESOLVED FIXED" (status-firefox66 tracks it for beta).

Status: REOPENED → RESOLVED
Closed: 6 months ago6 months ago
Flags: needinfo?(aryx.bugmail)
Resolution: --- → FIXED
Component: DOM → DOM: Core & HTML
Depends on: 1539959
You need to log in before you can comment on or make changes to this bug.