Closed Bug 1520075 Opened 1 year ago Closed 1 year ago

Using `allFrames: true` for a ChildActor keeps frame content windows alive

Categories

(Toolkit :: Async Tooling, defect, P2)

defect

Tracking

()

RESOLVED FIXED
mozilla67
Fission Milestone M1
Tracking Status
firefox66 --- wontfix
firefox67 --- fixed

People

(Reporter: MattN, Assigned: Felipe)

References

Details

Attachments

(2 files, 1 obsolete file)

Many tests fail saying that the contentWindow is held alive when I add an allFrames:true ChildActor. Attached is a minimal test case that can be tested with

./mach test dom/abort/tests/test_event_listener_leaks.html

A debug build isn't needed for this kind of leak test.

This is blocking 2 conversions which need allFrames for dealing with form fields.

Kris, I made some quick attempts to do more cleanup in ActorManagerChild.jsm but I must have still missed something or the issue is maybe not in that file. Can you take a look at this please?

Flags: needinfo?(kmaglione+bmo)

So I investigated what is causing this leak. It's not related to the instantiation of the actor at all, as in MattN's example the actor never gets instantiated. It is actually the SingletonDispatcher that causes it, due to the "unload" listener from here:
https://searchfox.org/mozilla-central/rev/c07aaf12f13037c1f5a343d31f8291549e57373f/toolkit/modules/ActorManagerChild.jsm#59

I added a removeEventListener call in the cleanup function, and it did improve the situation, bringing the test failures from 2 to 1. But it still fails in this case:
https://searchfox.org/mozilla-central/rev/c07aaf12f13037c1f5a343d31f8291549e57373f/dom/events/test/event_leak_utils.js#36-39

It looks like the unload listener in the mm is never called in that case.

I changed the add/remove listener from the mm to the window, and it fixed it. I'm not sure if that's a good solution. Kris, what do you think?
I was hoping to get rid of the window listeners (the pageshow/pagehide) to help with bug 1520383.

Flags: needinfo?(kmaglione+bmo)

(Renewing the needinfo to kmag due to the new information posted)

Flags: needinfo?(kmaglione+bmo)

note: i was using this version of the test: dom/abort/tests/test_event_listener_leaks.html

Ok, I found what what the real cause of the problem is. The fix that I mentioned above only works if the Actor doesn't have any specified event listeners (as in the minimal testcase posted), because it was just wallpapering over the problem by tying the listener to the window lifetime. But if we add an event listener, it comes back again.

The real issue is that the MozDocumentObserver is called twice [1] during this testcase [2], one in the expected _eventListenerLeakStep, and once again for the same window when .open() is called. But only one unload is triggered during the testcase.

Kris, is this a bug in the MozDocumentObserver or somewhere else?

[1] https://searchfox.org/mozilla-central/rev/c07aaf12f13037c1f5a343d31f8291549e57373f/toolkit/modules/ActorManagerChild.jsm#293-295
[2] https://searchfox.org/mozilla-central/rev/c07aaf12f13037c1f5a343d31f8291549e57373f/dom/events/test/event_leak_utils.js#33-39

Fission Milestone: --- → M1

Felipe, since this is targeting Fission M1 (Feb end), we need an assignee on the bug. I'm wondering if that will be you or MattN as both of you have contributed here. :)

Flags: needinfo?(felipc)

I'll be the assignee... but I'm stuck on it and waiting for kmag's feedback

Assignee: nobody → felipc
Status: NEW → ASSIGNED
Flags: needinfo?(felipc)
Whiteboard: [2/14] per comment 6, waiting on kmag's inputs

So I spent some time debugging this today and found a bunch of weird behavior which I didn't know how to explain well, so let me write it down what I found:

The main trigger of the problem is calling frame.contentDocument.open() [1], which ends up replacing the inner window related to that frame (and that's what the test is exactly trying to catch, so.. good test!)

Whenever that .open() calls happen, ExtensionPolicyService will end up calling ActorManagerChild's onNewDocument again (through ExtensionPolicyService::CheckDocument). Now, this new window passed to it is not the same window as before (I guess because they are new WindowProxyHolder objects?), but they have the same outerWindowID.

So what I tried to do is to not create a new SingletonDispatcher when this second window gets created.. But that was no luck because what leaks is the first window.

I then tried to add unload listeners directly to both, to make sure clean-up happens.. But none of the unload handlers are called before this test perform its checks. So, no matter how strong of a clean-up we do, it's no help to fix the leak.

Anyways, after going through all of that I found a workaround, which I think will be an improvement in any case.. I replaced the window reference that we kept in the dispatcher to a weak reference, and then the problem is gone!

I'm not sure if there should be some more specific fix, but this one works and will help avoid other possible leaks, and I'm not sure how much more time we should spend on these actors before we start replacing them with the Window Actors, so I think this one will do.

[1] https://searchfox.org/mozilla-central/rev/c07aaf12f13037c1f5a343d31f8291549e57373f/dom/events/test/event_leak_utils.js#33-39

Flags: needinfo?(kmaglione+bmo)

Sorry, I meant to comment on this the last time I started looking into it.

I haven't been able to reproduce this locally, but given that it's triggered by observers fired for new inner windows created by document.open(), I'm pretty sure it's going to be fixed by the document.open() patches Boris is working on, and which I think are close to being landable.

I suppose we could try to workaround it in the mean time, but I don't know how useful that would be.

To trigger the leak you need to apply this patch and then run

mach test dom/abort/tests/test_event_listener_leaks.html

bz, could you do a quick test with and without your patches from bug 1489308 to see if this test is fixed by your changes?

Attachment #9036482 - Attachment is obsolete: true
Flags: needinfo?(bzbarsky)

To trigger the leak you need to apply this patch and then run

When I do that, I don't get a leak even without my patches for bug 1489308. Running a debug build on Linux built from hg rev fc01e86f7e80.

Flags: needinfo?(bzbarsky)

Hmm that's weird. I don't know what's different on my setup but I was able to reproduce it locally and also on try with a recent m-c (0e9d5fd6ff74): https://treeherder.mozilla.org/#/jobs?repo=try&revision=3766c063949e06e6bae03948b29c85c0f72965e6

I tried importing bz's patches with moz-phab but it fails with "Non linear dependency detected. Unable to patch the stack."

In any case, would anyone be opposed to taking this patch in the meantime, to unblock other work? I believe the patch is an improvement anyways as it will reduce the chance of other leaks in the future.

Once bz's document.open() work lands I can remove this patch locally and report back to see if the leak would be gone (and back it out if anyone wants it)

Pushed by fgomes@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/dca3c3efb015
Store the window reference on the child actors as a weak reference. r=mconley
Priority: -- → P2
Whiteboard: [2/14] per comment 6, waiting on kmag's inputs → [2/22] patch pushed
Status: ASSIGNED → RESOLVED
Closed: 1 year ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla67

(In reply to :Felipe Gomes (needinfo me!) from comment #12)

In any case, would anyone be opposed to taking this patch in the meantime, to unblock other work? I believe the patch is an improvement anyways as it will reduce the chance of other leaks in the future.

I'd really rather not, or at least back it out once Boris' patches land, if we do. This shouldn't be necessary, and just using a weak reference for the window means that our tests won't actually notice the (admitted probably smaller) leaks that happen if we get this logic wrong.

Whiteboard: [2/22] patch pushed
You need to log in before you can comment on or make changes to this bug.