Closed Bug 1610193 Opened 2 months ago Closed 2 months ago

Memory increases with each reload on youtube

Categories

(Core :: JavaScript: GC, defect, P1)

Version: 72 Branch
Hardware: Desktop
OS: All
Type: defect
Tracking


VERIFIED FIXED
mozilla75
Tracking Status
firefox-esr68 --- unaffected
firefox72 --- wontfix
firefox73 + wontfix
firefox74 + verified
firefox75 --- verified

People

(Reporter: georgefb899, Assigned: sfink)

References

(Regression)

Details

(Keywords: memory-footprint, regression, Whiteboard: [MemShrink:P1])

Attachments

(3 files)

Attached file memory-report.json.gz

User Agent: Mozilla/5.0 (X11; Linux x86_64; rv:72.0) Gecko/20100101 Firefox/72.0

Steps to reproduce:

  1. Open https://www.youtube.com/
  2. Reload ~15 times
    Tested in safe mode

Actual results:

The process for the YouTube tab uses 1GB of memory, and it keeps increasing the more I reload the page.
I was using an extension that auto-reloads tabs, and after about 10 hours it used ~10GB of memory.

Expected results:

Memory usage should not keep growing with each reload.

See Also: → 1610228
Component: Untriaged → Memory Allocator
Product: Firefox → Core
Component: Memory Allocator → Untriaged
Product: Core → Firefox

Hello, I've managed to reproduce the issue on Firefox 72.0.2, Firefox Beta 73.0b8 and 74.0a1 (2020-01-21).

Status: UNCONFIRMED → NEW
Ever confirmed: true

Setting component to Core → Performance; if you feel this is incorrect, please set the correct component.

Component: Untriaged → Performance
Product: Firefox → Core

I can reproduce the issue on Nightly 74.0a1 as well as Firefox 72 on Windows 10.
In my case,
Good build: memory usage drops back below 500MB immediately after every reload.
Bad build: the memory does not appear to be discarded, so memory usage exceeds 1GB by the 10th reload.

Regression window:
https://hg.mozilla.org/integration/autoland/pushloghtml?fromchange=3a763059a684969df0246d673943ec1d5ff8996b&tochange=5decb743461a478278ee42d9c54c24b939db9d59

Suspect:
5decb743461a478278ee42d9c54c24b939db9d59 Steve Fink — Bug 1593399 - Rework how mark colors are handled in weakmap marking r=jonco

Component: Performance → JavaScript: GC
OS: Unspecified → All
Regressed by: 1593399
Hardware: Unspecified → Desktop
Has Regression Range: --- → yes
Has STR: --- → yes
Flags: needinfo?(sphink)
Keywords: regression
Whiteboard: [MemShrink:P1]

Jon, is this likely to be the same thing as the other similar bugs we've had floating around recently?

Flags: needinfo?(sphink) → needinfo?(jcoppeard)

Oh, wait. I didn't notice the connection to bug 1593399. I'll need to look into this, then.

Flags: needinfo?(jcoppeard) → needinfo?(sphink)

This is probably something we should try to fix before Fx73 ships.

Priority: -- → P1

Reproduces easily, either using the youtube.com front page or a video.

The increase in memory is spread over a lot of different objects on the page: function objects, DOM objects, layout data. I'm guessing the entire old pages are being retained.

The leak goes away if I comment out the part of WeakMap marking where a key's delegate marks the key, which was the main intended change of the patch. This was a correctness fix, so I don't want to just revert that part. But it is also incorrect for this memory to be retained.

One guess: the delegate's zone is not being collected, so it is always considered marked by the WeakMap code. Adding some printouts... yes, this case is definitely happening. (Which doesn't prove that it's the cause, but it looks promising.) The printouts show two such delegates per reload of the front page. Early on, the key was white and the delegate black, but later reloads always showed a gray key and a black delegate (which is the exact situation modified by bug 1593399.)

It seems we really want to be collecting the delegate zone in this case.

The brute-force approach would be to remember which zones contain delegates for which other zones, and automatically add the transitive closure of those to the set of zones to be collected. That seems a little harsh, though -- if you ever use an object from another zone as a WeakMap key, then you'll always have to collect that zone.

A more targeted approach would be to be aggressive about collecting the zones of unloaded pages. If we're reloading, then presumably we know at some point that the zone is no longer active. Can we schedule it for collection on every GC after that point, or at least on the next GC (which would just be a regular scheduling)? Asking jonco because he's looked at similar things recently, and smaug since he'll probably know.
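For illustration only, here is a rough sketch of what the brute-force scheduling could look like, using hypothetical stand-in types rather than the real GC structures (js::Zone and the actual scheduling machinery are much richer than this):

    #include <map>
    #include <set>
    #include <vector>

    // Hypothetical stand-in for a GC zone.
    struct Zone {};

    // Hypothetical edge table: for each zone, the zones holding delegates
    // of WeakMap keys that live in that zone.
    using DelegateZoneMap = std::map<Zone*, std::set<Zone*>>;

    // Expand the set of zones scheduled for collection with the transitive
    // closure of delegate-zone edges, so a collected key zone always drags
    // its delegate zones along with it.
    void AddDelegateZones(std::set<Zone*>& scheduled,
                          const DelegateZoneMap& delegateZones) {
      std::vector<Zone*> worklist(scheduled.begin(), scheduled.end());
      while (!worklist.empty()) {
        Zone* zone = worklist.back();
        worklist.pop_back();
        auto it = delegateZones.find(zone);
        if (it == delegateZones.end()) {
          continue;
        }
        for (Zone* delegateZone : it->second) {
          if (scheduled.insert(delegateZone).second) {
            worklist.push_back(delegateZone);  // newly added; follow its edges too
          }
        }
      }
    }

The downside described above is visible in the sketch: once an edge exists, the delegate zone gets scheduled every time the key zone is, whether or not anything in it is actually garbage.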

Flags: needinfo?(jcoppeard)
Flags: needinfo?(bugs)

(In reply to Steve Fink [:sfink] [:s:] from comment #8)
As discussed in person, I think it could help if we mark the key gray if the delegate is gray but in an uncollected zone (as opposed to black as we do now). I tried this but hit assertions because this tries to mark things gray in a zone that is marking black only. Possibly we could skip marking until the zone is marking black or gray but I haven't tried that yet.

> The brute-force approach would be to remember which zones contain delegates for which other zones, and automatically add the transitive closure of those to the set of zones to be collected. That seems a little harsh, though

Yes, that seems like we could end up collecting all the zones all the time. But I don't have a good view of how often we get edges between zones. This might work out.

> If we're reloading, then presumably we know at some point that the zone is no longer active. Can we schedule it for collection on every GC after that point

Yes, that's a possibility, but we might end up always collecting that zone if it really is still alive.

> or at least on the next GC

That's what happens now, but it can happen that something is still holding the zone alive at this point.

Another approach is for the GC to automatically notice when zones are dead and collect them, which is what bug 1591371 is about.

Flags: needinfo?(jcoppeard)

(In reply to Jon Coppeard (:jonco) from comment #9)

> (In reply to Steve Fink [:sfink] [:s:] from comment #8)
> As discussed in person, I think it could help if we mark the key gray if the delegate is gray but in an uncollected zone (as opposed to black as we do now). I tried this but hit assertions because this tries to mark things gray in a zone that is marking black only. Possibly we could skip marking until the zone is marking black or gray but I haven't tried that yet.

With my current test (reloading the toplevel youtube.com page over crappy hotel wifi), I'm seeing a white (unmarked) delegate in an uncollected zone, for a gray key.

The key is a CrossOriginObjectWrapper around an nsOuterWindowProxy.

Note that if I mark the delegate the same color as the key (assuming the delegate is in an uncollected zone), then the leak goes away. But this reintroduces the original bug -- if the delegate is reachable from a black root, we run the risk of cycle collecting the gray key. Which honestly wouldn't be the worst thing -- the worst that would happen is that we would lose the value associated with the key in this map (or any other WeakMap). But it really isn't correct.

Flags: needinfo?(sphink)
Flags: needinfo?(bugs)

Steve or Jon, can you formally take this bug so it doesn't stay unassigned?

Assignee: nobody → sphink

I've put a patch up in bug 1591371 to collect zones that we notice are dead. Maybe this will help.

See Also: → 1612705

(In reply to Jon Coppeard (:jonco) from comment #12)

I've put a patch up in bug 1591371 to collect zones that we notice are dead. Maybe this will help.

It might. I've been looking more into my rr recording of a sample run, and the delegate zone is never collected after it is first created. The GC that collects the key zone (where it looks at the delegate and treats it as black) is scheduled because something from the zone is in the CC's purple buffer.

The zone scheduling seems to be a red herring. It doesn't matter whether the delegate zone is being collected, because the delegate (the outer window) is fully live.

If the delegate is marked (either really or because its zone is not being collected so we don't know whether it's live), then the key needs to be preserved -- but it only needs to be marked as much as the least marked of the delegate and map. If one of those is gray, then the key only needs to be gray. (If either map or delegate is part of a garbage cycle, then there's no need to keep the entry.)

It's a lot like the current logic for marking the value: the value must be marked (at least) min(map color, key color). This is saying that the key must be marked at least min(map color, delegate color). It's two ephemeron relationships in one weakmap entry, thanks to our compartment stuff.
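As an illustration of that rule (hypothetical names and types, not the actual SpiderMonkey marking code):

    #include <algorithm>

    // Hypothetical mark colors, ordered so that "less marked" compares lower.
    enum class CellColor { White = 0, Gray = 1, Black = 2 };

    // The usual ephemeron rule: the value must be marked at least
    // min(map color, key color), since the entry is only as live as
    // both the map and the key.
    CellColor RequiredValueColor(CellColor mapColor, CellColor keyColor) {
      return std::min(mapColor, keyColor);
    }

    // The second ephemeron relationship: the key must be marked at least
    // min(map color, delegate color). A live delegate only keeps the key
    // alive to the extent that the map itself is live.
    CellColor RequiredKeyColor(CellColor mapColor, CellColor delegateColor) {
      return std::min(mapColor, delegateColor);
    }

The regressed behavior effectively let the delegate's color alone determine the key's color, so a black delegate forced the key of a gray map to black, which appears to be what kept the old pages' objects alive across reloads.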

I'll upload a patch that fixes this. It's unclear to me whether this fully fixes the problem here, though. I tested by reloading a youtube video repeatedly. It started at 200MB. After 5 reloads, it was up to 600MB. After 2 more, it had dropped to 330MB or so. Clicking on "Minimize Memory Usage" dropped it down to around 200MB. But it seems correct and worth having no matter what, so I'll put it up for review before looking further.

Ok, my confidence has increased that this patch fixes the bug.

unpatched:
1st load => 2xxMB
7 reloads => 758MB
12 reloads => 1048MB

patched:
1st load => 367MB
7 reloads => 296MB
12 reloads => 429MB
15 reloads => 452MB

Pushed by sfink@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/92dd7149719c
Do not proxy-preserve weakmap keys with a darker color than the map itself r=jonco
Pushed by csabou@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/50f41784af85
followup - update test for fixed behavior

Comment on attachment 9125132 [details]
Bug 1610193 - Do not proxy-preserve weakmap keys with a darker color than the map itself

Beta/Release Uplift Approval Request

  • User impact if declined: Pretty massive memory leaks when reloading many pages (youtube, facebook, at least)
  • Is this code covered by automated tests?: Yes
  • Has the fix been verified in Nightly?: No
  • Needs manual test from QE?: Yes
  • If yes, steps to reproduce: My usual test scenario is to go to any video on youtube, open up a new window with about:memory and measure memory usage. The relevant number will be for the first content process listed. Then reload the page many times (10 is good) and measure again. With the bug, it'll probably be around 1GB; without, it could be 200MB-400MB. Then maybe reload another 5 times and measure again to be sure.
  • List of other uplifts needed: None
  • Risk to taking this patch: Low
  • Why is the change risky/not risky? (and alternatives if risky): It touches the core WeakMap marking logic, which is always a little scary, but it's still doing something that is safer than what it was doing before the regressing bug landed, and the logic for why this patch makes sense is pretty straightforward. It's also a very small patch.

For this backport, I have already rolled up the followup fix (which was just an adjustment to a test that was expecting the previous behavior), so no other patches are needed.

  • String changes made/needed: none
Attachment #9125132 - Flags: approval-mozilla-beta?
Flags: qe-verify+

When I said the fix has not been verified in Nightly, I mean that I just landed it to autoland and it hasn't been in a Nightly yet.

Status: NEW → RESOLVED
Closed: 2 months ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla75

FYI, I'll wait a few days to see if there are nightly regressions caused by this patch before uplifting to beta.

QA Whiteboard: [qa-triaged]

I was able to reproduce the bug using an affected Nightly build (2020-01-19), and by following the steps from comment 20.

I can confirm that on latest Nightly 75.0a1 the memory usage stays within 200-400MB range in about:memory. Tested on Win 10 x64, macOS 10.11 and Ubuntu 18.04 x64.

I'll not remove the qe-verify+ flag for now, in case the patch gets uplifted to Beta.

Status: RESOLVED → VERIFIED

Probably worth letting this fix ride Fx74 rather than trying to squeeze it into a 73 dot release at this point.

Duplicate of this bug: 1613478
See Also: → 1571513

Comment on attachment 9125132 [details]
Bug 1610193 - Do not proxy-preserve weakmap keys with a darker color than the map itself

Uplift approved for 74.0b3, thanks.

Attachment #9125132 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
Status: VERIFIED → REOPENED
Resolution: FIXED → ---
Status: REOPENED → RESOLVED
Closed: 2 months ago
Resolution: --- → FIXED

(In reply to Arthur Iakab [arthur_iakab] from comment #29)

> Backed out from beta
> Backed out changeset 38c1cbc9dc9e (bug 1610193) for causing spidermonkey bustage on weak-marking-03.js a=backout

Sorry, that's the wrong patch. Or rather, it's fine but it would require the other patch to land as well. I put a rolled up patch backported to beta in https://phabricator.services.mozilla.com/D62077 but I guess that didn't help. Either apply that patch, or backport both of:

https://hg.mozilla.org/mozilla-central/rev/92dd7149719c
https://hg.mozilla.org/mozilla-central/rev/50f41784af85

Flags: needinfo?(sphink)
QA Contact: ciprian.georgiu

This issue is verified fixed on Beta 74.0b3 (20200214015126) as well, under Windows 10 x64, macOS 10.15 and Ubuntu 18.04 x64.

Status: RESOLVED → VERIFIED
Flags: qe-verify+