defineLazyPreferenceGetter ends up running unlinked code, triggering cache->PreservingWrapper() assertions
Categories
(Core :: XPConnect, defect, P2)
Tracking
()
| Tracking | Status | |
|---|---|---|
| firefox68 | --- | affected |
People
(Reporter: bzbarsky, Unassigned)
References
(Blocks 1 open bug)
Details
Attachments
(2 obsolete files)
Is still haven't decided which component this should be in, or even which product...
bdahl has been running into fatal assertions on try with MOZ_BROWSER_XHTML=1, due to the following sequence of event:
- browser/components/downloads/content/indicator.js runs in some window, calls defineLazyPreferenceGetter for the "browser.download.autohideButton" preference.
- This adds a pref listener that implements nsIWeakReference in JS.
- The window in question is closed.
- The cycle collector runs, sees nothing has any more references to that window (the weak ref is not a ref because it's not keeping the JS object involved alive!) and unlinks the window, the document, etc. The document is still accessible from the window because it's [StoreInSlot] and we don't clear out the slot, for various reasons.
- Before GC gets to run and collect the window, the preference changes and the pref observe is called.
- The pref observer calls the
_placeholdergetter defined in indicator.js and that tries to access properties on the document and hits the assertion when it discovers and expando object and no preserved wrapper (unlinking unpreserved).
MOZ_BROWSER_XHTML=1 matters only because XULDocument is not a DOM proxy (no named getter) and hence doesn't hit the proxy expando code at all.
If we cleared out the slot on the window, this code would stop asserting but would throw from the _placeholder getter instead, which may not be great either.
Ideally we would prevent calling from the pref service into this stuff somehow. Failing that, maybe indicator.js can listen for window teardown and remove its pref observer or no-op the callback function?
Comment 1•2 years ago
|
||
Late-night braindump which may or may not help:
I'm wondering if we should just move lazy prefs onto a separate jsm-style (or webidl) thing and stop supporting defining lazy pref getters on windows (or objects tied to windows). That and downloads code needs to stop importing the world when nothing is going on.
In the comments on https://phabricator.services.mozilla.com/D15449 I suggested perhaps in addition to AppConstants.jsm we want something like AppState.jsm where we track things like lazy prefs and "did thing X happen" (e.g. have there been any downloads this session) to avoid having to import additional modules that we don't (yet) need if X hasn't happened / some pref hasn't been flipped. I've been distracted with other things but if this sounds like a sane plan to people I can reprio it.
I've seen this conundrum in a few other places, e.g. pockets code that doesn't need to do much except in the comparatively rare case where it's disabled via policy or the about:config pref, which currently has to load Pocket's own jsms just to work that out. Would be cheaper if that type of state was in a central place.
Comment 2•2 years ago
|
||
(In reply to :Gijs (he/him) from comment #1)
I'm wondering if we should just move lazy prefs onto a separate jsm-style (or webidl) thing and stop supporting defining lazy pref getters on windows (or objects tied to windows).
I have plans to replace the current lazy preference helper with a native implementation, but that unfortunately doesn't help with this case, since it uses a change callback (which I'd never intended the lazy preference getter code to support to begin with...). It'll still have to use something approximating the current JS code if it wants the change callback.
Comment 3•2 years ago
|
||
I think this is a more particular manifestation of the general problem that weak references to cycle collected objects can cause dead objects to come back alive if we touch them at the wrong time.
It seems like the sanest thing here to do would be to keep track of weak references that participate in garbage cycles, and unlink them when unlinking everything else in those cycles. I feel like there should be a reasonably elegant way to do that, but my knowledge of the cycle collector doesn't run that deep.
| Reporter | ||
Comment 4•2 years ago
|
||
When we unlink an XPCWrappedJS we already call ClearWeakReferences() on it, looks like. So we're presumably not unlinking that in this case. Presumably because it's not actually participating in a garbage cycle, just pointing to it? My understanding of how the CC interacts with the GC here has atrophied, from it's already-poor state... In particular, I don't understand how the cycle collector ends up seeing JS-to-native edges and what it does with unlinking after that.
Olli, is there an explanation of that somewhere, as a sidenote?
Comment 5•2 years ago
|
||
Right, in this case by participating in a garbage cycle, I mean pointing into it. Essentially, I'm thinking about treating the weak reference to the JS object as a non-black root, and when there are no paths from black roots to the object it points to, and there are paths from that object to another object that we want to unlink as part of a garbage cycle, we'd unlink the weak reference too.
As for how the CC interacts with the GC, it mostly doesn't. It knows how to trace JS edges using their trace hooks, and it knows how to determine if an object one of those edges points to is a cycle collectable native object by the normal means. It only breaks cycles by unlinking the native objects that are part of them, though, and then leaves the GC to cleanup the then unreachable JS objects.
Comment 6•2 years ago
|
||
Using snow white objects or unlinked objects isn't that big problem as such, but does hint about issues elsewhere.
We do traverse XPCWrappedJSs during CC, and do trace them as non-black roots during GC.
And unlinking clears weak references.
But if I interpret the issue right, it is that we return early from
https://searchfox.org/mozilla-central/rev/69ace9da347adcc4a33c6fa3d8e074759b91068c/js/xpconnect/src/XPCWrappedJS.cpp#112-116 but then things like
https://searchfox.org/mozilla-central/rev/69ace9da347adcc4a33c6fa3d8e074759b91068c/js/xpconnect/src/XPCWrappedJS.cpp#118-121 keep wrappedJS object alive (it doesn't get unlinked). And unlinking is needed to clear weak references
https://searchfox.org/mozilla-central/rev/69ace9da347adcc4a33c6fa3d8e074759b91068c/js/xpconnect/src/XPCWrappedJS.cpp#487,506
Is there some easy way to reproduce this locally?
(Why proxy code doesn't recreate the (preserved) wrapper?)
Comment 7•2 years ago
|
||
(In reply to Olli Pettay [:smaug] from comment #6)
Is there some easy way to reproduce this locally?
I've only been able to reproduce on try (browser-chrome) and I think the same for bz.
Comment 8•2 years ago
|
||
ok, which particular test or on which platforms?
Updated•2 years ago
|
Comment 9•2 years ago
|
||
The test varies and it happens on all platforms. It pretty consistently happens on bc13 browser/base/content/test/general/browser_documentnavigation.js on linux 64 debug. You'll need to set MOZ_BROWSER_XHTML=1.
Example try run with a number of failures:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=022d846cd862ebdfc30fd9033e2290173a893cbf
Comment 10•2 years ago
|
||
(In reply to Olli Pettay [:smaug] from comment #6)
Using snow white objects or unlinked objects isn't that big problem as such, but does hint about issues elsewhere.
That may be true in principal, but in practice touching an object after it's been unlinked (or its 0 refcount hook has fired) causes problems. Sometimes quite subtle problems...
I'm not sure if this is related or not, but when I run this code in the browser console:
{
const PREF = "foo.pref";
Services.prefs.clearUserPref(PREF);
let obj = {
QueryInterface: ChromeUtils.generateQI(["nsISupportsWeakReference"]),
variant: Cc["@mozilla.org/variant;1"].createInstance(Ci.nsIWritableVariant),
observe() {
console.log("Hello.", this.variant, this.variant && this.variant.QueryInterface(Ci.nsIVariant));
},
};
obj.variant.setAsISupports(obj);
Services.prefs.addObserver(PREF, obj, true);
obj = null;
Cu.forceCC();
Services.prefs.setBoolPref(PREF, true);
}
the observer object never gets cleaned up (and the variant never gets unlinked), no matter how what combinations of forceGC() and forceCC() I do. If I leave out the cyclic reference through the variant (which should support cycle collection), the observer goes away after GC.
That suggests that we're either not traversing or not unlinking this reference cycle correctly, but I don't know whether or not it has anything to do with the weak reference.
Comment 11•2 years ago
|
||
(In reply to Boris Zbarsky [:bzbarsky, bz on IRC] from comment #4)
Presumably because it's not actually participating in a garbage cycle, just pointing to it?
On further thought, this actually must be participating in a garbage cycle. This bug happens when the CC unlinks the window. The preference observer is being kept alive by the window global's expando object, it has a path back to the window via (at least) its closure over the document object, and it's holding alive its XPCWrappedJS, which has a cyclic reference back to the JS object, so as far as the CC is concerned it has to be part of the same garbage cycle, and should be unlinked.
Comment 12•2 years ago
|
||
There's certainly an argument for just calling ClearWeakReferences() on all nsISupports cycle collected objects that QI to nsISupportsWeakReference. Though I see now that it isn't actually a method on that interface so we couldn't do it that easily.
| Reporter | ||
Comment 13•2 years ago
|
||
and it knows how to determine if an object one of those edges points to is a cycle collectable native object by the normal means.
Right, this is the part that I could not figure out where this is happening, if anywhere...
But if I interpret the issue right, it is that we return early from
Ah, because the weakptr thing is not really holding a ref to us, so the only ref is our self-ref while the JS object is alive, and we take that early return?
keep wrappedJS object alive (it doesn't get unlinked)
Meaning that us not running that code keeps it alive during the CC, in the sense that it has a ref that the CC doesn't know about so the CC considers it live?
Yes, this seems right.
(Why proxy code doesn't recreate the (preserved) wrapper?)
Recreate in what sense? One failure mode here is this sequence of events:
- Things get unlinked, wrapper (reflector) that has expandos gets unpreserved.
- The object is brought back to life.
- Someone reads an expando property from it.
- The someone hands the object over to C++, nothing is holding on to the reflector anymore.
- The reflector is GCed.
- The "someone" grabs the object from C++ again (with new reflector), the expando is gone.
For non-proxy objects (for which we don't even have an assert right now), where would we preserve the wrapper in all this? I don't see a place to do it.
For proxies, we could re-preserve the wrapper during step 3, if we discover we have an expando object but are not preserved. But even that would fail if step 3 happens from an IC (see GetPropIRGenerator::tryAttachDOMProxyExpando) because that basically reimplements the relevant codepath bit in jitcode except obviously without the assert in question and without the ability to preserve wrappers.
and it's holding alive its XPCWrappedJS, which has a cyclic reference back to the JS object
The JS object does not have an edge to the XPCWrappedJS from the CC's point of view. The XPCWrappedJS has a self-edge, but it (and the edge from the XPCWrappedJS to the JS object and in fact all of the XPCWrappedJS outgoing edges) are treated specially when the XPCWrappedJS refcount is 1; that's the first link in comment 6. So in fact if the XPCWrappedJS refcount is 1 it does not end up in a cycle because it has no outgoing edges as far as the CC is concerned. Then if the JS object is GCed, we go ahead and kill the XPCWrappedJS in JSObject2WrappedJSMap::UpdateWeakPointersAfterGC.
Comment 14•2 years ago
|
||
FWIW I tried applying Andrew's patch in Bug 1535403 to browser.xhtml and still see the same assertions: https://treeherder.mozilla.org/#/jobs?repo=try&revision=1cbc93e0798574b5d4e26f43e3053036833bc3af&selectedJob=241244828. Not sure if it was expected to fix it but figured I'd try.
| Reporter | ||
Comment 15•2 years ago
|
||
Yeah, I would not expect that patch to help here...
What would help is if we could figure out a way to represent the conceptual ref from the JSObject to its XPCWrappedJS that's dealt with by the XPCWrappedJS self-ref to the cycle collector. Then the CC would know that the XPCWrappedJS is in a cycle and would actually unlink it while unlinking that cycle.
This is why I was asking how the CC currently detects/represents JS-to-C++ edges.
Comment 16•2 years ago
|
||
Yeah, I split out the weak ref stuff into a different patch, which I can upload tomorrow. I'm not sure if the weak ref part of those patches actually was helping the problem I was trying to fix, so I left it out for now. And they don't seem to fix all of the weak ref to unlinked inner window issues I was seeing.
(In reply to Boris Zbarsky [:bzbarsky, bz on IRC] from comment #15)
This is why I was asking how the CC currently detects/represents JS-to-C++ edges.
JS objects are traversed by the CC with CycleCollectedJSRuntime::TraverseGCThing().
The way things work now is that we never unlink JS to C++ edges in the CC. Instead, we wait for the GC to run and finalize the JS object, which then releases the C++ object.
Comment 17•2 years ago
|
||
Here's my WIP patch that drops a few instances of weak references to inner and outer windows. I don't know if it will help this.
| Reporter | ||
Comment 18•2 years ago
|
||
It wouldn't, afaict. The weak ref in this bug is to some random object that then entrains the window indirectly.
I was poking at my suggestion in comment 15 earlier today, and discovered a few things:
-
The same JSObject can actually correspond to multiple nsXPCWrappedJS instances (at most one "root" nsXPCWrappedJS and an arbitrary number of non-"root" ones) and hence conceptually has many outgoing edges to C++ things. These edges can coexist with the edges CycleCollectedJSRuntime::NoteGCThingXPCOMChildren already knows about, since any JSObject can end up as an XPCWrappedJS, afaict.
-
At the moment we don't have a JSObject-to-nsXPCWrappedJS map for anything other than "root" nsXPCWrappedJS values. So to implement comment 15 we'd need to create such a map, and due to point #1 it would need to be one-to-many.
-
All the existing knowledge about nsXPCWrappedJS is on XPCJSRuntime, of course, so getting there from NoteGCThingXPCOMChildren would involve a virtual function call or something. But on the other hand, that knowledge is not helpful anyway due to point #2. So maybe we can just add a new map of some sort on CycleCollectedJSRuntime.
Comment 19•2 years ago
|
||
Temporary workaround for preserving wrapper assertions in torn-down
window.
Comment 20•2 years ago
|
||
Workaround looks good on try https://treeherder.mozilla.org/#/jobs?repo=try&revision=7281e3326c3684b2fde312a3410a5d7742692456
Updated•2 years ago
|
The priority flag is not set for this bug.
:neha, could you have a look please?
For more information, please visit auto_nag documentation.
Updated•2 years ago
|
Updated•2 years ago
|
Comment 22•2 years ago
|
||
Comment on attachment 9061538 [details]
Bug 1543537 - Remove download button's lazy pref getter on unload. r=bzbarsky
Revision D29294 was moved to bug 1548026. Setting attachment 9061538 [details] to obsolete.
Comment 23•2 years ago
|
||
As per discussion in #content this doesn't need to block release of browser.xhtml, since any behavior this condition is leading to is already present with XUL documents (we just don't assert for it). So as long as we are green post Bug 1548026 this can be removed from the blocker list.
Comment 24•11 months ago
|
||
So uh, this just bit me in bug 1634032.
Ideally we would prevent calling from the pref service into this stuff somehow.
I think this would be ideal. If this isn't possible,
If we cleared out the slot on the window, this code would stop asserting but would throw from the _placeholder getter instead, which may not be great either.
would help at least make things understandable to frontend people ("this is being called and there's no document anymore" has the benefit of being (a) more understandable than internal JS engine assertion failures and (b) running on opt builds which the majority of frontend folks use because JS debugging with the browser toolbox is too slow on debug ones, so we'd actually see the stuff we were breaking).
I'm afraid the discussions of GC/CC in the earlier comments are over my head. Andrew, do you know what our practical options are here?
Comment 25•10 months ago
|
||
Sorry I haven't answered this needinfo yet. I keep reading over the bug, but I haven't figured out what it means.
Comment 26•2 months ago
|
||
Sorry for ignoring this bug for so long. kmag figured out that this seems to be the underlying cause of bug 1652531, because in one of the assert stacks there was some pref callback stuff.
Anyways, thanks to that I guess I understand what the general issue here in a way that I didn't before. I think I was getting hung up on the store in slot stuff, which I think is not the root cause of this. I worked around it for a particular lazy pref callback in bug 1699614, but obviously it is bad to have to keep doing that.
Ideally, I'd be able to reproduce this locally so I can poke around with a debugger and understand specifically what is happening, but maybe I have some kind of theory.
Maybe there's some set of steps like:
- You have an XPCWJS and create a weak reference to it.
- The XPCWJS's refcount goes from 2 to 1, not via cycle collection. I think this can happen when we're creating a WJS for the pref observer, in the weak case. At this point, the WJS no longer roots its JS object with regards to the CC or GC. It also roots itself in the CC, so it won't be unlinked.
- The CC runs. Nothing is holding the pref callback alive any more, so nothing is holding the window or document alive, so everything (except the WJS) can get unlinked.
- Now, the pref observer runs. The WJS is still alive, so the weak reference can hand out a new strong reference to it, which means the WJS starts to root the JS object again, and basically turns back into a regular WJS again.
- I think because it is a regular WJS now, methods can just be called on it, but it means we can reach unlinked stuff.
Anyways, this seems like a fundamentally bad design. There's a comment in XPCWrappedJS.cpp that says: "There is an extra AddRef to support weak references to wrappers that are subject to finalization." So, this is actually deliberately supported. We should probably figure out how to not support that, and just hope that it won't break things.
Updated•2 months ago
|
Updated•2 months ago
|
Comment 27•2 months ago
|
||
The current behavior dates to bug 66950. I think the idea is that if you have a weak ref to a JS object via an XPCWrappedJS, the weak ref should not go away until the JS object goes away, which seems reasonable. If we only had a GC and not a CC, then this would be fine.
(I think the current design doesn't really work when there's a chain of WJS, because only the root WJS checks if there's a weak reference. That isn't going to be applicable to the weak observer case, because a QI to nsIObserver then a QI to nsISupportsWeakReference will result in only a single WJS, due to special handling of the latter interface in DelegatedQueryInterface.)
As Boris said in comment 15, I think the correct approach here is to represent the strong reference from the JS object to the WJS in the cycle collector somehow, instead of this IsSubjectToFinalization/self edge traverse hack that is there currently. This is similar to the support for weak references in the CC, where you need to store a strong edge outside of the object, but right now we only support JS objects as targets so some additional work will be needed.
We probably also want to get rid of the rigamarole where we do AddWrappedJSRoot/RemoveFromRootSet depending on the refcount. Instead, we'd just always root the JS object, until we are unlinked. This means that a CC is required to free a WJS object, but that's how regular preserved DOM reflectors work so I think that's fine. If we always root the JS object, then all WJS will be added to the CC graph, so we won't need a separate set of all JS that are wrapped by a WJS as mentioned in point 2 in comment 18.
Comment 18 talks about supporting the extra edge by modifying NoteGCThingXPCOMChildren. It looks like that is only called on objects. Can you have a XPCWJS to a function? I think to support that in this way we'd have to do a hashtable lookup on every JS GC thing we add to the CC graph which doesn't seem great.
Description
•