1782936 - Using WeakRef in add-on content-script to reference content loses reference too quickly

jelmer

Reporter

Description

•

2 years ago

Steps to reproduce:

When creating a WeakRef to a DOM element currently on the page from the context of an add-on's content-script.

I've got a minimal example at https://github.com/jelmervdl/minimal-weakref-addon. But the gist of it is:

Content script:

var ref;

function test() {
    const body = document.wrappedJSObject.querySelector('body');

    ref = new WeakRef(body);
    console.log('[content-script] created ref to', ref.deref());

    document.body.addEventListener('click', e => {
        console.log('[content-script] ref is', ref.deref());
    });
}

test();

Then go to a simple static web page in one tab.

Actual results:

When clicking on the page initially, "document.body" is printed to the console. But after switching to a new tab, or navigating a bit in another tab, going back to this tab and clicking the document prints "undefined" to the console.

Expected results:

I'd expect it to maintain the reference as long as that page exists.

Use case (because why would you do this?!) is that I want to maintain a map of all the original content that my add-on replaces. There is a MutationObserver that keeps my internal copy of the page up-to-date. As the page changes, some of the elements in that structure will also be retained by the page itself (although they're not currently in the DOM tree) but most of them are just thrown away. I'd like to be able to retain the ones that the page cares about, but not prevent the page from releasing the elements it doesn't care about.

BugBot [:suhaib / :marco/ :calixte]

Comment 1

•

2 years ago

The Bugbug bot thinks this bug should belong to the 'Core::JavaScript Engine' component, and is moving the bug to that component. Please correct in case you think the bot is wrong.

Component: Untriaged → JavaScript Engine

Product: Firefox → Core

Matthew Gaudet (he/him) [:mgaudet]

Comment 2

•

2 years ago

Hey Jon: any thoughts on how WeakRef retention works with content scripts? I have no real intuition here at the moment!

Status: UNCONFIRMED → RESOLVED

Closed: 2 years ago

Flags: needinfo?(jcoppeard)

Resolution: --- → FIXED

Matthew Gaudet (he/him) [:mgaudet]

Updated

•

2 years ago

Severity: -- → S3

Component: JavaScript Engine → JavaScript: GC

Priority: -- → P3

Jon Coppeard (:jonco)

Comment 3

•

2 years ago

I'd expect this to work. I don't know too much about how content scripts are implemented specifically though.

Status: RESOLVED → REOPENED

Ever confirmed: true

Resolution: FIXED → ---

Matthew Gaudet (he/him) [:mgaudet]

Comment 4

•

2 years ago

(Oops. No idea how I closed that one)

I too would expect this to work. The reproduction is very clear and easy. I did a bit of further investigation

I don't believe the ref is to an x-ray: console.log("body is wrapped? ", 'wrappedJSObject' in body); prints false.
I tried making a ref to an x-ray, by dropping the unwrap of the document. The ref still is cleared.
Just clicking on the page 3 times seems sufficient for the weak ref to get cleared.

Would it help if I could get a pernosco recording of this Jon?

Steve Fink [:sfink] [:s:]

Assignee

Comment 5

•

2 years ago

Yes, I think a Pernosco recording could be useful here.

Andrew McCreight [:mccr8]

Comment 6

•

2 years ago

Maybe Kris might have a guess about what is going wrong? We've had lots of nasty issues about lifetimes involving addon or chrome code and DOM stuff before.

Flags: needinfo?(kmaglione+bmo)

Matthew Gaudet (he/him) [:mgaudet]

Comment 7

•

2 years ago

Attached file WIP: Bug 1782936 - Weakref in extension loss testcase (obsolete) — Details

Matthew Gaudet (he/him) [:mgaudet]

Comment 8

•

2 years ago

•

Edited

Test case is attached, and here is a Pernosco recording of the failure

Kris Maglione [:kmag]

Comment 9

•

2 years ago

I'm assuming it has something to do with the CCW. Does the WeakRef stay alive if you hold a strong ref to the object somewhere else? And does it keep working as a WeakMap key longer than as a WeakRef?

Flags: needinfo?(kmaglione+bmo)

Steve Fink [:sfink] [:s:]

Assignee

Comment 10

•

2 years ago

I should probably try to understand how all this stuff works.

What I see in that Pernosco recording is that the HTMLBodyElement target is properly kept alive via wrapper preservation, but the WeakRef object is not. I would think it would be kept alive as part of the closure.

Details from the pernosco session:

The WeakRef is created with an HTMLBodyElement target, which is wrapper-preserved as a result of the WeakRef. The WRAPPER_PRESERVED_BIT is only cleared at shutdown. A GC while the test is running clears the target.

During that GC it calls TraceGrayJS, which marks the target. At the time the WeakRef's target is cleared, the target is marked. It's clearing it because the WeakRef is found to be dead during sweeping. Or rather, the WeakRef itself is marked gray, and its CCW wrapper is unmarked. I'm a bit confused about the compartments involved here, but at any rate I don't understand why a gray WeakRef would clear its target.

jonco: We create a WeakRef 0x...d0100 in compartment 1 with a target HTMLDocumentElement 0x...d00b0 in compartment 2. The WeakRef is wrapped with a CCW 0x...c6318 in the target compartment 2. (The HTMLDocumentElement also has a CCW wrapper 0x...c62a8 but it's only used to pass in the target at the very beginning, so doesn't come into play at all later.)

So compartment 2 must be the content document, and compartment 1 is the... addon? Whatever the test code is running in.

The WeakRef CCW 0x...c6318 is found to be dead. Which makes sense, I guess, since nothing in the content compartment would ever reference it. So the GC sees an unmarked WeakRef CCW that wraps a gray WeakRef, which has a cross-compartment 'target' edge pointing to the black HTMLDocumentElement. FinalizationObservers::traceWeakWeakRefVector sees the dead WeakRef CCW (with TraceWeakEdge) and clears out its target in updateForRemovedWeakRef. That's the part that seems fishy—just because a CCW to a WeakRef is dead, why should the WeakRef itself get its target cleared?

Steve Fink [:sfink] [:s:]

Assignee

Comment 11

•

2 years ago

(In reply to Kris Maglione [:kmag] from comment #9)

I'm assuming it has something to do with the CCW. Does the WeakRef stay alive if you hold a strong ref to the object somewhere else? And does it keep working as a WeakMap key longer than as a WeakRef?

I don't entirely understand the original use case either, but with my limited understanding it sounds to me like a classic WeakMap scenario. And in my opinion WeakMap should always be used in preference to WeakRef when possible, for robustness and forwards compatibility among other things. WeakRef semantics bad, WeakMap semantics good.

(We still need to fix this bug, of course.)

Kris Maglione [:kmag]

Comment 12

•

2 years ago

(In reply to Steve Fink [:sfink] [:s:] from comment #11)

I don't entirely understand the original use case either, but with my limited understanding it sounds to me like a classic WeakMap scenario. And in my opinion WeakMap should always be used in preference to WeakRef when possible, for robustness and forwards compatibility among other things. WeakRef semantics bad, WeakMap semantics good.

(We still need to fix this bug, of course.)

Yes, I agree that WeakMaps (or WeakSets) are generally preferable, and that this needs to be fixed either way. But I was mostly interested in whether WeakMap keys with a reference to the object were still being preserved after the WeakRef lost its target, because that would point to something possibly being different in the wrapper preservation semantics. I think we would have heard about this before if it was also broken for WeakMaps, though.

But I was assuming that we would create a WeakRef to the CCW for the body element. Now it sounds like we unwrap the element, create a WeakRef to the unwrapped element, and then return a CCW to that. Which I guess makes sense. We do similar things for other typed arrays.

(In reply to Steve Fink [:sfink] [:s:] from comment #10)

So compartment 2 must be the content document, and compartment 1 is the... addon? Whatever the test code is running in.

Compartment 1 would be the Sandbox for the add-on content script, with an expanded principal for the add-on origin and the content page origin.

The WeakRef CCW 0x...c6318 is found to be dead. Which makes sense, I guess, since nothing in the content compartment would ever reference it.

But the CCW isn't in the content compartment, it's in the Sandbox compartment, right? So we shouldn't mark it dead when sweeping the content compartment, right? And in the Sandbox compartment, the "click" listener closure should reference it. Or am I misunderstanding something?

Andrew McCreight [:mccr8]

Comment 13

•

2 years ago

Do weak refs deal with all of that annoying delegate stuff? I'd assumed so since they were made more recently, but this does sound like it is in the ballpark of those annoying delegate issues for DOM wrappers we had to deal with for weak maps (which took like 5 years to get them all fixed...).

Steve Fink [:sfink] [:s:]

Assignee

Comment 14

•

2 years ago

(In reply to Kris Maglione [:kmag] from comment #12)

But I was assuming that we would create a WeakRef to the CCW for the body element. Now it sounds like we unwrap the element, create a WeakRef to the unwrapped element, and then return a CCW to that. Which I guess makes sense. We do similar things for other typed arrays.

No, that's not what's happening here. You are correct about typed arrays, that's exactly what they do in order to make the typed array same-compartment with the ArrayBuffer. But here, we allow the WeakRef → target edge to be cross-compartment. Internally, we create a CCW to the WeakRef that gets added to a table in the unwrapped target's compartment, so that when the target dies we can look it up and find the WeakRefs to clear. But the WeakRef returned to the caller is not a CCW; it's the actual WeakRef.

(In reply to Steve Fink [:sfink] [:s:] from comment #10)

So compartment 2 must be the content document, and compartment 1 is the... addon? Whatever the test code is running in.

Compartment 1 would be the Sandbox for the add-on content script, with an expanded principal for the add-on origin and the content page origin.

Ok, thanks.

The WeakRef CCW 0x...c6318 is found to be dead. Which makes sense, I guess, since nothing in the content compartment would ever reference it.

But the CCW isn't in the content compartment, it's in the Sandbox compartment, right? So we shouldn't mark it dead when sweeping the content compartment, right? And in the Sandbox compartment, the "click" listener closure should reference it. Or am I misunderstanding something?

Right, this was the tricky part described above. The WeakRef CCW is in the content compartment. In the Sandbox compartment, there's an actual WeakRef object (alongside a CCW wrapper to the document) storing a direct pointer to the unwrapped document in the content compartment. Thus, the listener closure references the actual WeakRef object.

I talked to Jon about this. It seems like what's going on here is either too little or too much wrapper preservation. The code is creating wrappers because things are cross-compartment, but in this case they are not cross-Zone. If the target were in a different Zone, this would all work. So we can either do something like wrapper preservation for the WeakRef CCW, or we could say that the WeakRef CCW isn't necessary since the target → WeakRef table is on a Zone basis and so we should just use the actual WeakRef for the table instead of creating a useless CCW. I'll try the latter.

Steve Fink [:sfink] [:s:]

Assignee

Comment 15

•

2 years ago

(In reply to Andrew McCreight [:mccr8] from comment #13)

Do weak refs deal with all of that annoying delegate stuff? I'd assumed so since they were made more recently, but this does sound like it is in the ballpark of those annoying delegate issues for DOM wrappers we had to deal with for weak maps (which took like 5 years to get them all fixed...).

It's basically the same stuff, but implemented with a collection of explicit tables rather than magic rules about marking. I need to think about it, but I think they may have different constraints and so can't share the same mechanism? I'm not at all sure about that. Magic marking rules might work for both, but it was a pain to get those right.

Flags: needinfo?(jcoppeard)

Steve Fink [:sfink] [:s:]

Assignee

Comment 16

•

2 years ago

Minimal test case:

var target = {};

var g = newGlobal({sameZoneAs: target}); // same zone, different compartment
g.obj = target;
g.eval(`
  let wr = new WeakRef(obj);
  assertEq(wr.deref() === obj, true, "before gc, target is found");
  gc();
  assertEq(wr.deref() === obj, true, "after gc, target is found"); // assertion fails
`);

Kris Maglione [:kmag]

Comment 17

•

2 years ago

(In reply to Steve Fink [:sfink] [:s:] from comment #14)

No, that's not what's happening here. You are correct about typed arrays, that's exactly what they do in order to make the typed array same-compartment with the ArrayBuffer. But here, we allow the WeakRef → target edge to be cross-compartment. Internally, we create a CCW to the WeakRef that gets added to a table in the unwrapped target's compartment, so that when the target dies we can look it up and find the WeakRefs to clear. But the WeakRef returned to the caller is not a CCW; it's the actual WeakRef.

OK, thanks. It makes sense now.

I talked to Jon about this. It seems like what's going on here is either too little or too much wrapper preservation. The code is creating wrappers because things are cross-compartment, but in this case they are not cross-Zone. If the target were in a different Zone, this would all work.

Oh, right, I hadn't considered that. This is one of the few cases where we still have same-zone CCWs since we implemented same-compartment Realms.

Steve Fink [:sfink] [:s:]

Assignee

Comment 18

•

2 years ago

Attached file Bug 1782936 - fix early finalization of same-zone, cross-compartment WeakRefs — Details

Phabricator Automation

Updated

•

2 years ago

Assignee: nobody → sphink

Steve Fink [:sfink] [:s:]

Assignee

Updated

•

2 years ago

Status: REOPENED → ASSIGNED

Pulsebot

Comment 19

•

2 years ago

Pushed by sfink@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/f4d1ad042658
fix early finalization of same-zone, cross-compartment WeakRefs r=jonco

Sandor Molnar[:smolnar]

Comment 20

•

2 years ago

bugherder

https://hg.mozilla.org/mozilla-central/rev/f4d1ad042658

Status: ASSIGNED → RESOLVED

Closed: 2 years ago → 2 years ago

status-firefox105: --- → fixed

Resolution: --- → FIXED

Target Milestone: --- → 105 Branch

Phabricator Automation

Updated

•

2 years ago

Attachment #9289130 - Attachment is obsolete: true

WIP: Bug 1782936 - Weakref in extension loss testcase 2 years ago Matthew Gaudet (he/him) [:mgaudet] 48 bytes, text/x-phabricator-request		Details \| Review
Bug 1782936 - fix early finalization of same-zone, cross-compartment WeakRefs 2 years ago Steve Fink [:sfink] [:s:] 48 bytes, text/x-phabricator-request		Details \| Review