Closed Bug 1543537 Opened 5 years ago Closed 2 years ago

defineLazyPreferenceGetter ends up running unlinked code, triggering cache->PreservingWrapper() assertions

Categories

(Core :: XPConnect, defect, P2)

defect

Tracking

()

RESOLVED FIXED
107 Branch
Tracking Status
firefox-esr102 --- wontfix
firefox105 --- wontfix
firefox106 --- wontfix
firefox107 --- fixed

People

(Reporter: bzbarsky, Assigned: mccr8)

References

(Blocks 1 open bug)

Details

Attachments

(3 files, 5 obsolete files)

Is still haven't decided which component this should be in, or even which product...

bdahl has been running into fatal assertions on try with MOZ_BROWSER_XHTML=1, due to the following sequence of event:

  1. browser/components/downloads/content/indicator.js runs in some window, calls defineLazyPreferenceGetter for the "browser.download.autohideButton" preference.
  2. This adds a pref listener that implements nsIWeakReference in JS.
  3. The window in question is closed.
  4. The cycle collector runs, sees nothing has any more references to that window (the weak ref is not a ref because it's not keeping the JS object involved alive!) and unlinks the window, the document, etc. The document is still accessible from the window because it's [StoreInSlot] and we don't clear out the slot, for various reasons.
  5. Before GC gets to run and collect the window, the preference changes and the pref observe is called.
  6. The pref observer calls the _placeholder getter defined in indicator.js and that tries to access properties on the document and hits the assertion when it discovers and expando object and no preserved wrapper (unlinking unpreserved).

MOZ_BROWSER_XHTML=1 matters only because XULDocument is not a DOM proxy (no named getter) and hence doesn't hit the proxy expando code at all.

If we cleared out the slot on the window, this code would stop asserting but would throw from the _placeholder getter instead, which may not be great either.

Ideally we would prevent calling from the pref service into this stuff somehow. Failing that, maybe indicator.js can listen for window teardown and remove its pref observer or no-op the callback function?

Late-night braindump which may or may not help:

I'm wondering if we should just move lazy prefs onto a separate jsm-style (or webidl) thing and stop supporting defining lazy pref getters on windows (or objects tied to windows). That and downloads code needs to stop importing the world when nothing is going on.

In the comments on https://phabricator.services.mozilla.com/D15449 I suggested perhaps in addition to AppConstants.jsm we want something like AppState.jsm where we track things like lazy prefs and "did thing X happen" (e.g. have there been any downloads this session) to avoid having to import additional modules that we don't (yet) need if X hasn't happened / some pref hasn't been flipped. I've been distracted with other things but if this sounds like a sane plan to people I can reprio it.

I've seen this conundrum in a few other places, e.g. pockets code that doesn't need to do much except in the comparatively rare case where it's disabled via policy or the about:config pref, which currently has to load Pocket's own jsms just to work that out. Would be cheaper if that type of state was in a central place.

See Also: → 1516525

(In reply to :Gijs (he/him) from comment #1)

I'm wondering if we should just move lazy prefs onto a separate jsm-style (or webidl) thing and stop supporting defining lazy pref getters on windows (or objects tied to windows).

I have plans to replace the current lazy preference helper with a native implementation, but that unfortunately doesn't help with this case, since it uses a change callback (which I'd never intended the lazy preference getter code to support to begin with...). It'll still have to use something approximating the current JS code if it wants the change callback.

I think this is a more particular manifestation of the general problem that weak references to cycle collected objects can cause dead objects to come back alive if we touch them at the wrong time.

It seems like the sanest thing here to do would be to keep track of weak references that participate in garbage cycles, and unlink them when unlinking everything else in those cycles. I feel like there should be a reasonably elegant way to do that, but my knowledge of the cycle collector doesn't run that deep.

When we unlink an XPCWrappedJS we already call ClearWeakReferences() on it, looks like. So we're presumably not unlinking that in this case. Presumably because it's not actually participating in a garbage cycle, just pointing to it? My understanding of how the CC interacts with the GC here has atrophied, from it's already-poor state... In particular, I don't understand how the cycle collector ends up seeing JS-to-native edges and what it does with unlinking after that.

Olli, is there an explanation of that somewhere, as a sidenote?

Flags: needinfo?(bugs)

Right, in this case by participating in a garbage cycle, I mean pointing into it. Essentially, I'm thinking about treating the weak reference to the JS object as a non-black root, and when there are no paths from black roots to the object it points to, and there are paths from that object to another object that we want to unlink as part of a garbage cycle, we'd unlink the weak reference too.

As for how the CC interacts with the GC, it mostly doesn't. It knows how to trace JS edges using their trace hooks, and it knows how to determine if an object one of those edges points to is a cycle collectable native object by the normal means. It only breaks cycles by unlinking the native objects that are part of them, though, and then leaves the GC to cleanup the then unreachable JS objects.

Using snow white objects or unlinked objects isn't that big problem as such, but does hint about issues elsewhere.

We do traverse XPCWrappedJSs during CC, and do trace them as non-black roots during GC.
And unlinking clears weak references.

But if I interpret the issue right, it is that we return early from
https://searchfox.org/mozilla-central/rev/69ace9da347adcc4a33c6fa3d8e074759b91068c/js/xpconnect/src/XPCWrappedJS.cpp#112-116 but then things like
https://searchfox.org/mozilla-central/rev/69ace9da347adcc4a33c6fa3d8e074759b91068c/js/xpconnect/src/XPCWrappedJS.cpp#118-121 keep wrappedJS object alive (it doesn't get unlinked). And unlinking is needed to clear weak references
https://searchfox.org/mozilla-central/rev/69ace9da347adcc4a33c6fa3d8e074759b91068c/js/xpconnect/src/XPCWrappedJS.cpp#487,506

Is there some easy way to reproduce this locally?

(Why proxy code doesn't recreate the (preserved) wrapper?)

Flags: needinfo?(bugs)

(In reply to Olli Pettay [:smaug] from comment #6)

Is there some easy way to reproduce this locally?

I've only been able to reproduce on try (browser-chrome) and I think the same for bz.

ok, which particular test or on which platforms?

Component: General → XPConnect

The test varies and it happens on all platforms. It pretty consistently happens on bc13 browser/base/content/test/general/browser_documentnavigation.js on linux 64 debug. You'll need to set MOZ_BROWSER_XHTML=1.

Example try run with a number of failures:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=022d846cd862ebdfc30fd9033e2290173a893cbf

(In reply to Olli Pettay [:smaug] from comment #6)

Using snow white objects or unlinked objects isn't that big problem as such, but does hint about issues elsewhere.

That may be true in principal, but in practice touching an object after it's been unlinked (or its 0 refcount hook has fired) causes problems. Sometimes quite subtle problems...

I'm not sure if this is related or not, but when I run this code in the browser console:

{
  const PREF = "foo.pref";
  Services.prefs.clearUserPref(PREF);
  let obj = {
    QueryInterface: ChromeUtils.generateQI(["nsISupportsWeakReference"]),
    variant: Cc["@mozilla.org/variant;1"].createInstance(Ci.nsIWritableVariant),
    observe() {
      console.log("Hello.", this.variant, this.variant && this.variant.QueryInterface(Ci.nsIVariant));
    },
  };
  obj.variant.setAsISupports(obj);
  Services.prefs.addObserver(PREF, obj, true);
  obj = null;
  Cu.forceCC();
  Services.prefs.setBoolPref(PREF, true);
}

the observer object never gets cleaned up (and the variant never gets unlinked), no matter how what combinations of forceGC() and forceCC() I do. If I leave out the cyclic reference through the variant (which should support cycle collection), the observer goes away after GC.

That suggests that we're either not traversing or not unlinking this reference cycle correctly, but I don't know whether or not it has anything to do with the weak reference.

(In reply to Boris Zbarsky [:bzbarsky, bz on IRC] from comment #4)

Presumably because it's not actually participating in a garbage cycle, just pointing to it?

On further thought, this actually must be participating in a garbage cycle. This bug happens when the CC unlinks the window. The preference observer is being kept alive by the window global's expando object, it has a path back to the window via (at least) its closure over the document object, and it's holding alive its XPCWrappedJS, which has a cyclic reference back to the JS object, so as far as the CC is concerned it has to be part of the same garbage cycle, and should be unlinked.

There's certainly an argument for just calling ClearWeakReferences() on all nsISupports cycle collected objects that QI to nsISupportsWeakReference. Though I see now that it isn't actually a method on that interface so we couldn't do it that easily.

and it knows how to determine if an object one of those edges points to is a cycle collectable native object by the normal means.

Right, this is the part that I could not figure out where this is happening, if anywhere...

But if I interpret the issue right, it is that we return early from

Ah, because the weakptr thing is not really holding a ref to us, so the only ref is our self-ref while the JS object is alive, and we take that early return?

keep wrappedJS object alive (it doesn't get unlinked)

Meaning that us not running that code keeps it alive during the CC, in the sense that it has a ref that the CC doesn't know about so the CC considers it live?

Yes, this seems right.

(Why proxy code doesn't recreate the (preserved) wrapper?)

Recreate in what sense? One failure mode here is this sequence of events:

  1. Things get unlinked, wrapper (reflector) that has expandos gets unpreserved.
  2. The object is brought back to life.
  3. Someone reads an expando property from it.
  4. The someone hands the object over to C++, nothing is holding on to the reflector anymore.
  5. The reflector is GCed.
  6. The "someone" grabs the object from C++ again (with new reflector), the expando is gone.

For non-proxy objects (for which we don't even have an assert right now), where would we preserve the wrapper in all this? I don't see a place to do it.

For proxies, we could re-preserve the wrapper during step 3, if we discover we have an expando object but are not preserved. But even that would fail if step 3 happens from an IC (see GetPropIRGenerator::tryAttachDOMProxyExpando) because that basically reimplements the relevant codepath bit in jitcode except obviously without the assert in question and without the ability to preserve wrappers.

and it's holding alive its XPCWrappedJS, which has a cyclic reference back to the JS object

The JS object does not have an edge to the XPCWrappedJS from the CC's point of view. The XPCWrappedJS has a self-edge, but it (and the edge from the XPCWrappedJS to the JS object and in fact all of the XPCWrappedJS outgoing edges) are treated specially when the XPCWrappedJS refcount is 1; that's the first link in comment 6. So in fact if the XPCWrappedJS refcount is 1 it does not end up in a cycle because it has no outgoing edges as far as the CC is concerned. Then if the JS object is GCed, we go ahead and kill the XPCWrappedJS in JSObject2WrappedJSMap::UpdateWeakPointersAfterGC.

See Also: → 1545441

FWIW I tried applying Andrew's patch in Bug 1535403 to browser.xhtml and still see the same assertions: https://treeherder.mozilla.org/#/jobs?repo=try&revision=1cbc93e0798574b5d4e26f43e3053036833bc3af&selectedJob=241244828. Not sure if it was expected to fix it but figured I'd try.

Yeah, I would not expect that patch to help here...

What would help is if we could figure out a way to represent the conceptual ref from the JSObject to its XPCWrappedJS that's dealt with by the XPCWrappedJS self-ref to the cycle collector. Then the CC would know that the XPCWrappedJS is in a cycle and would actually unlink it while unlinking that cycle.

This is why I was asking how the CC currently detects/represents JS-to-C++ edges.

Yeah, I split out the weak ref stuff into a different patch, which I can upload tomorrow. I'm not sure if the weak ref part of those patches actually was helping the problem I was trying to fix, so I left it out for now. And they don't seem to fix all of the weak ref to unlinked inner window issues I was seeing.

(In reply to Boris Zbarsky [:bzbarsky, bz on IRC] from comment #15)

This is why I was asking how the CC currently detects/represents JS-to-C++ edges.

JS objects are traversed by the CC with CycleCollectedJSRuntime::TraverseGCThing().

The way things work now is that we never unlink JS to C++ edges in the CC. Instead, we wait for the GC to run and finalize the JS object, which then releases the C++ object.

Here's my WIP patch that drops a few instances of weak references to inner and outer windows. I don't know if it will help this.

It wouldn't, afaict. The weak ref in this bug is to some random object that then entrains the window indirectly.

I was poking at my suggestion in comment 15 earlier today, and discovered a few things:

  1. The same JSObject can actually correspond to multiple nsXPCWrappedJS instances (at most one "root" nsXPCWrappedJS and an arbitrary number of non-"root" ones) and hence conceptually has many outgoing edges to C++ things. These edges can coexist with the edges CycleCollectedJSRuntime::NoteGCThingXPCOMChildren already knows about, since any JSObject can end up as an XPCWrappedJS, afaict.

  2. At the moment we don't have a JSObject-to-nsXPCWrappedJS map for anything other than "root" nsXPCWrappedJS values. So to implement comment 15 we'd need to create such a map, and due to point #1 it would need to be one-to-many.

  3. All the existing knowledge about nsXPCWrappedJS is on XPCJSRuntime, of course, so getting there from NoteGCThingXPCOMChildren would involve a virtual function call or something. But on the other hand, that knowledge is not helpful anyway due to point #2. So maybe we can just add a new map of some sort on CycleCollectedJSRuntime.

Temporary workaround for preserving wrapper assertions in torn-down
window.

Attachment #9059899 - Attachment is obsolete: true

The priority flag is not set for this bug.
:neha, could you have a look please?

For more information, please visit auto_nag documentation.

Flags: needinfo?(nkochar)
Assignee: nobody → bdahl
Status: NEW → ASSIGNED
Flags: needinfo?(nkochar)
Priority: -- → P2
See Also: → 1548026
Assignee: bdahl → nobody
Status: ASSIGNED → NEW

Comment on attachment 9061538 [details]
Bug 1543537 - Remove download button's lazy pref getter on unload. r=bzbarsky

Revision D29294 was moved to bug 1548026. Setting attachment 9061538 [details] to obsolete.

Attachment #9061538 - Attachment is obsolete: true

As per discussion in #content this doesn't need to block release of browser.xhtml, since any behavior this condition is leading to is already present with XUL documents (we just don't assert for it). So as long as we are green post Bug 1548026 this can be removed from the blocker list.

No longer blocks: 1533881
See Also: → 1533881

So uh, this just bit me in bug 1634032.

Ideally we would prevent calling from the pref service into this stuff somehow.

I think this would be ideal. If this isn't possible,

If we cleared out the slot on the window, this code would stop asserting but would throw from the _placeholder getter instead, which may not be great either.

would help at least make things understandable to frontend people ("this is being called and there's no document anymore" has the benefit of being (a) more understandable than internal JS engine assertion failures and (b) running on opt builds which the majority of frontend folks use because JS debugging with the browser toolbox is too slow on debug ones, so we'd actually see the stuff we were breaking).

I'm afraid the discussions of GC/CC in the earlier comments are over my head. Andrew, do you know what our practical options are here?

Flags: needinfo?(continuation)
See Also: → 1634032

Sorry I haven't answered this needinfo yet. I keep reading over the bug, but I haven't figured out what it means.

See Also: → 1652531

Sorry for ignoring this bug for so long. kmag figured out that this seems to be the underlying cause of bug 1652531, because in one of the assert stacks there was some pref callback stuff.

Anyways, thanks to that I guess I understand what the general issue here in a way that I didn't before. I think I was getting hung up on the store in slot stuff, which I think is not the root cause of this. I worked around it for a particular lazy pref callback in bug 1699614, but obviously it is bad to have to keep doing that.

Ideally, I'd be able to reproduce this locally so I can poke around with a debugger and understand specifically what is happening, but maybe I have some kind of theory.

Maybe there's some set of steps like:

  1. You have an XPCWJS and create a weak reference to it.
  2. The XPCWJS's refcount goes from 2 to 1, not via cycle collection. I think this can happen when we're creating a WJS for the pref observer, in the weak case. At this point, the WJS no longer roots its JS object with regards to the CC or GC. It also roots itself in the CC, so it won't be unlinked.
  3. The CC runs. Nothing is holding the pref callback alive any more, so nothing is holding the window or document alive, so everything (except the WJS) can get unlinked.
  4. Now, the pref observer runs. The WJS is still alive, so the weak reference can hand out a new strong reference to it, which means the WJS starts to root the JS object again, and basically turns back into a regular WJS again.
  5. I think because it is a regular WJS now, methods can just be called on it, but it means we can reach unlinked stuff.

Anyways, this seems like a fundamentally bad design. There's a comment in XPCWrappedJS.cpp that says: "There is an extra AddRef to support weak references to wrappers that are subject to finalization." So, this is actually deliberately supported. We should probably figure out how to not support that, and just hope that it won't break things.

Summary: indicator.js ends up running code in torn-down (cycle-collected, unlinked) windows, triggering cache->PreservingWrapper() assertions → defineLazyPreferenceGetter ends up running unlinked code, triggering cache->PreservingWrapper() assertions
Flags: needinfo?(continuation)
See Also: → 947049
Blocks: 1652531
See Also: 1652531

The current behavior dates to bug 66950. I think the idea is that if you have a weak ref to a JS object via an XPCWrappedJS, the weak ref should not go away until the JS object goes away, which seems reasonable. If we only had a GC and not a CC, then this would be fine.

(I think the current design doesn't really work when there's a chain of WJS, because only the root WJS checks if there's a weak reference. That isn't going to be applicable to the weak observer case, because a QI to nsIObserver then a QI to nsISupportsWeakReference will result in only a single WJS, due to special handling of the latter interface in DelegatedQueryInterface.)

As Boris said in comment 15, I think the correct approach here is to represent the strong reference from the JS object to the WJS in the cycle collector somehow, instead of this IsSubjectToFinalization/self edge traverse hack that is there currently. This is similar to the support for weak references in the CC, where you need to store a strong edge outside of the object, but right now we only support JS objects as targets so some additional work will be needed.

We probably also want to get rid of the rigamarole where we do AddWrappedJSRoot/RemoveFromRootSet depending on the refcount. Instead, we'd just always root the JS object, until we are unlinked. This means that a CC is required to free a WJS object, but that's how regular preserved DOM reflectors work so I think that's fine. If we always root the JS object, then all WJS will be added to the CC graph, so we won't need a separate set of all JS that are wrapped by a WJS as mentioned in point 2 in comment 18.

Comment 18 talks about supporting the extra edge by modifying NoteGCThingXPCOMChildren. It looks like that is only called on objects. Can you have a XPCWJS to a function? I think to support that in this way we'd have to do a hashtable lookup on every JS GC thing we add to the CC graph which doesn't seem great.

See Also: → 1694438
See Also: → 1779427
See Also: → 1789952
Blocks: 947049
See Also: 947049
Depends on: 1790102

I started looking at this again because it seems like it has popped up again as an issue for Firefox devs.

The prototype patch adds a NoteWeakMapping to the CC traversal callback. This lets us add arbitrary edges to the CC graph during traversal.

The main thing the patch does is remove all of the weird IsSubjectToFinalization stuff. XPCWJS now uses regular refcounting. They also always keep their JS object alive.

Now, only root wrappers have the special extra reference, because it only effectively did anything for those before anyways. This extra reference gets cleared in Unlink, because root WJS can only be freed after a CC.

What the old IsSubjectToFinalization state was doing before was that it made the JS object keep the WJS alive (instead of vice versa) if there were no other references to the WJS, and the WJS had a weak reference. The idea is that the weak ref shouldn't go away unless the underlying JS object goes away. This is the same idea as wrapper preservation.

Anyways, to try to do that same thing in a less bizarre way, I added a TraverseWeakEdge() call to the WJS traverse from the JS object to the WJS, if there's a weak reference. If there isn't we do a "self edge" and thus effectively ignore the extra reference. Unfortunately, this makes the patch leak some WJS (and other things). I don't know if this is a bug in my patch, or if we're currently relying on the crummy behavior that drops things too aggressively (the thing this bug is trying to address) to prevent leaks.

My patch makes it so there's a cycle between the WJS and the JS object, instead of a one way ownership, but I think that because we have the CC it shouldn't make a difference. I'll try to find some time to debug the leak my patch is causing.

I figured out what the bug in my patch was: a root wrapper needs to always traverse the self edge, so that our magical extra weak edge does not keep the WJS alive due to the refcount, because we're not actually accounting for that edge when we determine refcount. Instead, we rely on the weak edge liveness propagation. This at least makes a few Mochitests run without failing.

I should really write some detailed tests akin to what kmag came up with in comment 10. I also need to make sure we don't leak WJS until shutdown. I can probably do that by using the JS object of a WJS as a key in a weak map and then using the magic chrome only function to check if there are weak map entries.

I did some investigation of kmag's test case in comment 10. I fancied up the test case a bit using exactGC(), but the callback was still not going away as expected. It appears that this is due to the variant being touched by the pref callback, because I looked at a log and saw that the variant XPCWN reflector was being marked black while the observer JS object was being marked grey. I don't know why it takes so long for the XPCWN to go away, but I'm going to assume this is an issue with XPCWNs or variants and ignore this, as I'm already deep enough down a rabbit hole. I should file a bug about it...

See Also: → 1790649
Attachment #9294157 - Attachment is obsolete: true

I wrote some tests. The first one doesn't fail with the current setup (which it should to catch the issue in this bug), so I'll have to work on it.

When I was messing around with the tests, I found a bug in my previous patch. It only showed up when I had CanSkip optimizations disabled for reasons I don't remember. Anyways, we need to always visit the self edge for the refcount, even if there is a weak ref, because the weak edge hack I use doesn't count as a refcount, so it'll end up rooting the WJS which is bad. The second problem with my prior patch was that the self edge on root XPCWJS without weak references meant we couldn't skip them, but there are a lot of them in the parent process, so this would slow down the CC.

In my new patch, instead of using nsSupportsWeakReference to implement the weak reference, I implement my own weak reference with a key difference: it is actually a strong reference. Then I can eliminate the NS_ADDREF_THIS() in the XPCWJS ctor (used both in the current implementation and my prior patch) because the "weak" reference adds a refcount when needed. This means that CanSkip() remains correct, because we only hit the "true" part of the code when the JS object is black, and we only have the extra reference when the JS object keeps alive the WJS, so we're not going to skip the WJS when it could be freed.

(In reply to Andrew McCreight [:mccr8] from comment #27)

It looks like that is only called on objects. Can you have a XPCWJS to a function?

You can, but I don't know if you can make it a weak reference. We generate them when you pass a function to something expecting a [function] interface. Maybe with some effort you could add a QueryInterface to the function and make it support nsISupportsWeakReference. I'm not sure.

I managed to write a test that demonstrates the "access an unlinked object" issue with the current implementation of weak references to XPCWJS.

The sequence of steps required is:

  1. Create a WJS where:
  • The JS object it wraps is the only reference to some cycle collected object ("the canary").
  • The WJS has a weak reference, but no other references ("subject to finalization").
  • There is a JS/C++ cycle that involves the JS object, but not the WJS.
  1. Drop all other references to the JS/C++ cycle so it becomes garbage.
  2. Run the GC.
  • The JS object is held alive only by C++, so it gets marked gray. This is necessary for the CC to consider it garbage.
  • The WJS stays alive because the JS object is alive.
  1. Run the cycle collector.
  • CanSkip for the WJS returns true, because it is subject to finalization. Therefore it won't be added to the CC graph, so it won't be unlinked, and thus the weak reference won't be cleared.
  • The WJS doesn't hold the JS object alive, so the cycle with the JS object gets unlinked, and thus the canary held alive by the JS object is unlinked.
  1. Via the weak reference to the WJS, you can access the JS object, which in turn lets you access the canary.

(If the GC is run again, the weak reference will be cleared by the subject-to-finalization mechanism, and the state will be consistent again.)

In my test, I use an nsArrayCC (created via Cc["@mozilla.org/array;1"].createInstance(Ci.nsIMutableArray)) as the canary. The unlink for this class removes all elements of the array, so if the array starts out non-empty you can easily check if it was unlinked without crashing by seeing if the length is zero or not.

For nsXPCWrappedJS, we need to add an edge to the cycle collector graph
from the JS object to its WJS, even though the JS object does not actually
contain a pointer to the WJS object. We can (mis)use the weak mapping
interface for this purpose. The only difference is that we need it during
traversal instead of while adding roots. The API is fairly specialized to
the specific use case we need.

The major complexity of the lifetime of nsXPCWrappedJS is that when there is a weak
reference to it, the nsXPCWrappedJS must be kept alive until its JS object goes
away. This is analogous to wrapper preservation in nsWrapperCache, except that
we can't modify the JS objects being wrapped.

The existing mechanism centers around a property IsSubjectToFinalization
which relies on a combination of a complex refcounting scheme and the cooperation
of the garbage collector. Unfortunately, the cycle collector doesn't really
understand it, so it can end up unlinking things that are still reachable via
weak references, leading to crashes.

My patch eliminates the IsSubjectToFinalization scheme entirely, changing over
to standard refcounting and removing the GC involvement. Instead, I create a new
class SeverableSingletonStrongReference, which is like a weak reference, except
that it has a strong reference to the WJS. This means we keep alive any WJS with
a "weak" reference until we CC. The CC can then (mis)use the weak map mechanism to
add a synthetic reference from the JS object to the WJS.

Attachment #9294767 - Attachment is obsolete: true

There's a lot of orange in my try push, and some of it looks like it could be related, so I'll have to investigate.

It seems like maybe some tests depend on a GC alone freeing WJS in the "subject to finalization" state (eg where the only reference is a weak reference). I tried modifying my SeverableSingletonStrongReference approach to support this, but the end result was not really any better than the current state.

Then I went back and made a more minimal version of the patch that leaves that alone and just adds proper handling of subject-to-finalization WJS to the CC. That's actually a nice and small patch. The problem there is that since bug 1790102, WJS not being actively used are only suspected because they are JS holders. CycleCollectedJSRuntime::TraverseNativeRoots() doesn't add roots to the graph if they don't hold alive JS objects, which makes sense, but unfortunately "subject to finalization" WJS don't hold any strong references to JS objects. I've had a few ideas to add them back (iterate over the WJS tables in all compartments, make Trace behave differently in the CC vs GC) that I haven't been able to make work, so maybe I'll have to go back to there being a linked list of WJS that I can scan. I'll probably use a regular LinkedList.

The cycle collector needs to be able to see nsXPCWrappedJS that are held
alive only from JS in order to unlink them. This patch adds a linked
list of WJS that are subject to finalization, and suspects those WJS in
the CC. Wrapped JS are normally suspected via the JS holder mechanism,
but that skips things, like WJS subject to finalization, that aren't holding
alive JS objects.

nsXPCWrappedJS use a special extra refcount to support weak references. When
a WJS is held alive only by this refcount, instead of holding its JS object
alive, the JS object holds it alive. This patch adds support for properly
cycle collecting WJS in this state. First, it makes it so the CC will traverse
these WJS, so that it has a chance to collect them. Secondly, it represents
the reference from the JS object to the WJS via the NoteWeakMapping API.
This lets us represent a strong reference outside of the actual object.

This also adds some basic tests for the lifetime of WJS with weak references.
The first two tests pass with and without this patch. This patch makes the
third test pass.

Attachment #9295000 - Attachment is obsolete: true
Assignee: nobody → continuation
Status: NEW → ASSIGNED
Severity: normal → S2
See Also: → 1762665
Pushed by amccreight@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/350066b48584
part 1 - Add a NoteWeakMapping method to traversal callback. r=kmag
https://hg.mozilla.org/integration/autoland/rev/f40d36a84871
part 2 - Maintain and suspect a list of WJS that are subject to finalization. r=kmag
https://hg.mozilla.org/integration/autoland/rev/e48a707bcfa0
part 3 - Cycle collect subject-to-finalization nsXPCWrappedJS. r=kmag
Status: ASSIGNED → RESOLVED
Closed: 2 years ago
Resolution: --- → FIXED
Target Milestone: --- → 107 Branch
Blocks: 1792457

This bug is extremely old, but the only real manifestation seems to be in tests that flip preferences for features written in chrome JS, and this is a bit of a scary change, so we should let it ride the trains unless it becomes an issue on another branch, or some kind of crash goes away that we notice.

Blocks: 1405521
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: