Crash in [@ nsWrapperCache::GetWrapper]
Categories
(Core :: DOM: Core & HTML, defect)
Tracking
Release | Tracking | Status
---|---|---
firefox-esr68 | --- | unaffected
firefox70 | --- | unaffected
firefox71 | --- | unaffected
firefox72 | blocking | fixed
People
(Reporter: marcia, Assigned: emilio)
References
(Blocks 1 open bug)
Details
(Keywords: crash, regression)
Crash Data
Attachments
(1 file)
This bug is for crash report bp-c7fcf926-a6aa-4195-a35c-8c1af0191116.
Seen while looking at nightly crashes - started spiking in 72 in 20191114214957: https://mzl.la/2CYJf1Z. 165 crashes/47 installs so far. Noticeable spike in the build from 11/20. No particular pattern to the URLs, and the comments are not really useful.
Possible regression range based on build ID: https://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=2f19e7b646e0a52fa855b75c868f0a3f3a990ad3&tochange=88db9bea4580df16dc444668f8c2cddbb3414318
Top 7 frames of crashing thread:
0 xul.dll nsWrapperCache::GetWrapper dom/base/nsWrapperCacheInlines.h:27
1 xul.dll static bool mozilla::dom::Window_Binding::getWindowGlobalChild dom/bindings/WindowBinding.cpp:7409
2 xul.dll mozilla::dom::binding_detail::GenericMethod<mozilla::dom::binding_detail::MaybeCrossOriginObjectThisPolicy, mozilla::dom::binding_detail::ThrowExceptions> dom/bindings/BindingUtils.cpp:3154
3 xul.dll js::InternalCallOrConstruct js/src/vm/Interpreter.cpp:548
4 xul.dll js::jit::DoCallFallback js/src/jit/BaselineIC.cpp:2940
5 @0x312b97b3f9e
6 xul.dll trunc
Comment 1•5 years ago
I have had this happen to me today and have had 4 crashes in the last 40 minutes. There are no definitive steps to reproduce for me.
Comment 2•5 years ago
Looking at that regression range, I think bz and emilio have modified code that is at least vaguely related to this. ni? to them to see if there's something obvious, or if something in that regression range jumps out to them that doesn't jump out to me.
Comment 3•5 years ago
> Noticeable spike in the build from 11/20.
That's because, as of the fix for bug 1371390, these crashes have started to become visible on the Mac :-)
Comment 4•5 years ago
I've just turned off updates on nightly based on the crash volume in the latest build.
Comment 5•5 years ago
(Following up on comment 3)
Actually probably not. These crashes are visible on the Mac for the first time in the 20191120094758 mozilla-central nightly. But the spike on all platforms (currently 691) is much greater than the spike on the Mac (138).
Comment 6•5 years ago
I'm having this problem with Firefox nightly 20191120094758 on Ubuntu Linux 19.04. I wasn't having this problem until I updated to the latest nightly.
I can reproduce it at will by opening any Google doc. As soon as the Google doc finishes loading, the tab crashes.
Here's one of my crash reports:
https://crash-stats.mozilla.org/report/index/b3f3dfc0-d2fc-4daf-be3d-9e8d40191120
Comment 7•5 years ago
Nightly updates are now rolled back to 20191119215132 and re-enabled.
Comment 8•5 years ago (Assignee)
(In reply to Will Kahn-Greene [:willkg] ET needinfo? me from comment #6)
> I'm having this problem with Firefox nightly 20191120094758 on Ubuntu Linux 19.04. I wasn't having this problem until I updated to the latest nightly.
> I can reproduce it at will by opening any Google doc. As soon as the Google doc finishes loading, the tab crashes.
I couldn't repro on a simple google doc (or at all so far) on a mozilla-central build. Do you have any add-ons installed or such that could help me repro?
So far the stacks point to a null windowGlobalChild(), so it is more likely that some of the fission changes triggered this than my patches or bz's patches.
I think this should be nullable: https://searchfox.org/mozilla-central/rev/652014ca1183c56bc5f04daf01af180d4e50a91c/dom/webidl/Window.webidl#552
But probably existing code didn't trigger it.
Looking a bit into it.
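To make the suspected failure mode concrete, here's a minimal stand-alone sketch (stand-in types only, not the real Gecko classes or the generated WindowBinding.cpp code): if the getter can return null but the binding treats the attribute as non-nullable, the very next step is to ask the returned object for its cached wrapper, which is the nsWrapperCache::GetWrapper frame in the reports.

```cpp
#include <cstdio>

// Stand-in for a wrapper-cached DOM object (the real one is WindowGlobalChild,
// which inherits nsWrapperCache).
struct FakeWindowGlobalChild {
  void* mWrapper = nullptr;
  void* GetWrapper() const { return mWrapper; }  // crashes if called on null
};

// Stand-in for an inner window whose FreeInnerObjects() has already run.
struct FakeInnerWindow {
  FakeWindowGlobalChild* mWindowGlobalChild = nullptr;
  FakeWindowGlobalChild* GetWindowGlobalChild() const { return mWindowGlobalChild; }
};

int main() {
  FakeInnerWindow win;  // torn-down window: the child pointer is already null
  FakeWindowGlobalChild* child = win.GetWindowGlobalChild();

  // Non-nullable binding behaviour (what the crash reports suggest):
  //   void* wrapper = child->GetWrapper();  // null dereference when child is null
  //
  // Nullable binding behaviour (what the WebIDL change would allow):
  if (!child) {
    std::puts("getWindowGlobalChild() is null; hand null back to the caller instead of crashing");
    return 0;
  }
  std::printf("wrapper: %p\n", child->GetWrapper());
  return 0;
}
```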
Comment 9•5 years ago (Assignee)
It can be null after FreeInnerObjects.
Though it looks a bit spooky for chrome code to poke at a window in that state?
We should at least figure out which change made this such a high-volume crash.
This should unblock nightly updates for now... I can add an assertion when called from DOM bindings if you want, so that we catch this on debug builds at least?
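A rough sketch of that assertion idea (stand-in types again; Gecko would spell the check with MOZ_ASSERT rather than assert()): release builds keep returning null gracefully, while debug builds make "something poked a torn-down window" loud.

```cpp
#include <cassert>

struct FakeWindowGlobalChild {};

struct FakeInnerWindow {
  FakeWindowGlobalChild* mWindowGlobalChild = nullptr;

  // Teardown clears the pointer, so later getter calls legitimately see null.
  void FreeInnerObjects() { mWindowGlobalChild = nullptr; }

  // Sketch of "assert when called from DOM bindings": in debug builds this
  // fires if someone pokes a torn-down window; with NDEBUG defined the assert
  // compiles away and the caller simply gets null back.
  FakeWindowGlobalChild* GetWindowGlobalChildForBindings() {
    assert(mWindowGlobalChild &&
           "getWindowGlobalChild() called after FreeInnerObjects()");
    return mWindowGlobalChild;
  }
};

int main() {
  FakeWindowGlobalChild child;
  FakeInnerWindow win;
  win.mWindowGlobalChild = &child;

  win.FreeInnerObjects();
  // Debug build: the assertion fires here. Release build: result is null.
  FakeWindowGlobalChild* result = win.GetWindowGlobalChildForBindings();
  (void)result;
  return 0;
}
```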
Comment 10•5 years ago (Assignee)
Is that regression range correct? It seems to be from the 14th of November or so, so quite a while ago.
Comment 11•5 years ago (Assignee)
Bugs that have recently introduced calls to getWindowGlobalChild():
- Bug 1595143
- Bug 1595154 (backed out)
- Bug 1577498
Of course the issue could be in other patches too... Bug 1577498 seems like it could be the culprit?
Comment 12•5 years ago (Reporter)
(In reply to Emilio Cobos Álvarez (:emilio) from comment #10)
> Is that regression range correct? It seems to be from the 14th of November or so, so quite a while ago.
There were one-off crashes starting on 11/14, but the large spike just presented itself in 20191120094758. So that means it could be something that landed in https://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=fdd07df83c87f12725f4b97c80e644fd11673977&tochange=79821df172391d2d9ab224951b36bd8856df0fb1 (changes between the 19th and 20th).
Comment 13•5 years ago
> I think this should be nullable: https://searchfox.org/mozilla-central/rev/652014ca1183c56bc5f04daf01af180d4e50a91c/dom/webidl/Window.webidl#552
Very much so, given how it's handled in FreeInnerObjects.
> Though it looks a bit spooky for chrome code to poke at a window in that state?
It's not really that hard. As an example, if you are poking a window from an iframe that was removed from the DOM, it's in that state.
I wish we got JS stacks from crash-stats; something like https://crash-stats.mozilla.org/report/index/b3f3dfc0-d2fc-4daf-be3d-9e8d40191120 would be legible with those. As it stands, what I can see is that we are firing DOMContentLoaded, which queues a Promise resolution or rejection, which then runs the Promise handler, which then calls PrecompiledScript::ExecuteInGlobal (and note that at this point, if the global is the window involved, it could already have had its FreeInnerObjects called by some other DOMContentLoaded listener!), and then the chrome script pokes at getWindowGlobalChild().
> Bug 1577498 seems like it could be the culprit?
Well, it's definitely in the regression range from comment 12.
So, we can definitely make getWindowGlobalChild() nullable, but then consumers need to check for null and deal with it, or at least check that they are OK with exceptions. Now luckily there just aren't that many consumers...
Tomislav, can any of the consumers added in bug 1577498 run at a point after the window's docshell has been torn down, do you know?
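For illustration, the ordering described above can be modelled with a toy listener queue (stand-in types only; the real sequencing goes through promise reaction jobs and ExecuteInGlobal, not a plain vector of callbacks): the first listener tears the window down, and the second one, standing in for the chrome script, has to cope with a null getWindowGlobalChild().

```cpp
#include <cstdio>
#include <functional>
#include <vector>

struct FakeWindowGlobalChild {};

struct FakeInnerWindow {
  FakeWindowGlobalChild child;
  FakeWindowGlobalChild* mWindowGlobalChild = &child;
  void FreeInnerObjects() { mWindowGlobalChild = nullptr; }
  FakeWindowGlobalChild* GetWindowGlobalChild() { return mWindowGlobalChild; }
};

int main() {
  FakeInnerWindow win;
  std::vector<std::function<void()>> domContentLoadedListeners;

  // Listener 1: some other DOMContentLoaded handler tears the window down
  // (e.g. by removing the iframe), which is what runs FreeInnerObjects() for real.
  domContentLoadedListeners.push_back([&] { win.FreeInnerObjects(); });

  // Listener 2: stands in for the promise-driven chrome script that runs later
  // against the same, now torn-down, window.
  domContentLoadedListeners.push_back([&] {
    if (FakeWindowGlobalChild* wgc = win.GetWindowGlobalChild()) {
      (void)wgc;
      std::puts("window still alive: proceed");
    } else {
      // With a nullable attribute, this is the branch consumers must handle.
      std::puts("window already torn down: bail out instead of crashing");
    }
  });

  for (auto& listener : domContentLoadedListeners) {
    listener();  // "DOMContentLoaded fires"
  }
  return 0;
}
```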
Comment 14 hidden (off-topic)
Comment 15•5 years ago
Yeah, this most definitely looks like it's getting triggered from bug 1577498. PrecompiledScript::ExecuteInGlobal is how we run content scripts, and often behind the DOMContentLoaded event.
I'll dig in some more, but I believe returning null in that case would be fine, and even more expected from my point of view.
Comment 16•5 years ago
Comment 17•5 years ago
So yes, it turns out this is executing lazily after we're already in the content script, the first time they try to access the browser API exposed inside the sandbox. Since we're already in extension code by that point, it's very possible that the docShell was very much alive when we decided to inject the content script, and got killed in the meantime.
If getWindowGlobalChild() starts returning null at that point, the lazy getter will just throw, and likely abort the whole content script (or at least neuter it, since we'll never expose the API endpoint). This is almost certainly a better outcome than what would currently happen in that case (before bug 1577498).
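A tiny sketch of that outcome, assuming a lazy "build the browser API on first access" step that needs a live window global (stand-in code, not the actual extension machinery): when the getter hands back null, construction throws and the content script simply never gets its endpoint.

```cpp
#include <cstdio>
#include <stdexcept>

struct FakeWindowGlobalChild {};

// Stand-in for "the window the content script runs against".
struct FakeWindow {
  FakeWindowGlobalChild* mWindowGlobalChild = nullptr;  // already torn down
  FakeWindowGlobalChild* GetWindowGlobalChild() { return mWindowGlobalChild; }
};

// Stand-in for the lazily-built `browser` endpoint: only constructed on first
// access, and construction needs a live WindowGlobalChild.
struct FakeBrowserApi {
  explicit FakeBrowserApi(FakeWindowGlobalChild* aChild) {
    if (!aChild) {
      throw std::runtime_error("window global is gone; cannot build the API");
    }
  }
};

int main() {
  FakeWindow window;  // the docShell died between injection and first access
  try {
    FakeBrowserApi api(window.GetWindowGlobalChild());  // the "lazy getter" runs here
    std::puts("browser API exposed to the content script");
  } catch (const std::exception& e) {
    // The content script never gets its endpoint: the "neutered" case above.
    std::printf("content script neutered: %s\n", e.what());
  }
  return 0;
}
```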
(In reply to Michal Kluka from comment #14)
> This page will always crash for me https://www.postoj.sk/48952/za-ciarou?fbclid=IwAR2vQ_E8gmBejfAr7zOfnUZy0THId26tkjFQgxCcQ_Xe1WY6Ln6alwSZMrw
Hey Michal, sorry for marking your comment "off-topic", it's too long and was making the discussion hard to follow, and I don't know how else to collapse it.
Comment 18•5 years ago
bugherder
Comment 19•5 years ago
For what it's worth, these crashes are still happening in the current mozilla-central nightly (20191120234543), though at a much lower volume (and so far only on Windows):
Comment 20•5 years ago
That's really odd. That nightly seems to be built from rev 66531295716a76515be6e24774d788dc4ed8ecdf, which should have this fix...
Comment 21•5 years ago
As best I can tell, the patch for this bug (https://phabricator.services.mozilla.com/D53967) landed first in the 20191120215217 mozilla-central nightly, whose rev is e76dbab2aea8354660281221c1aa08356107881c:
Since it landed there've been a number of these crashes, though not nearly so many as with the 20191120094758 nightly:
Interestingly, the vast majority are on "6.1.7601 Service Pack 1". And only one isn't on Windows (it's on macOS 10.15.1).
Comment 22•5 years ago
There are a number of crashes with AsyncShutdownTimeout in the signature, most (all?) of which also (like this bug's crashes) have PromiseReactionJob() on the stack, and are also disproportionately on Windows. Possibly related?
Comment 23•5 years ago
Pretty much all the crashes on builds after this fix have a null install date, so I'd take them with a grain of salt. Is it possible they're actually crashes from the broken build but we're making up some of the metadata on submission?
Comment 24•5 years ago
(In reply to Julien Cristau [:jcristau] from comment #23)
> Pretty much all the crashes on builds after this fix have a null install date, so I'd take them with a grain of salt. Is it possible they're actually crashes from the broken build but we're making up some of the metadata on submission?
Yeah, if they're missing the install time then they're orphaned crashes and the metadata is unreliable (we synthesize the bare minimum from the installation that's submitting them). Orphaned crashes are sent in batches when they're found, so that might be why you're seeing a spike; it should go away quickly.
Comment 25•5 years ago
You're right, Gabriele. These crashes are completely gone as of the 20191123094742 trunk nightly.
Comment 27•5 years ago
Please specify a root cause for this bug. See :tmaity for more information.