Closed Bug 1497873 Opened 6 years ago Closed 5 years ago

Investigate improving the nursery resizing heuristic

Categories

(Core :: JavaScript: GC, enhancement, P1)

61 Branch
enhancement

Tracking

()

RESOLVED FIXED
mozilla67
Tracking Status
firefox67 --- fixed

People

(Reporter: jonco, Assigned: pbone)

References

(Blocks 2 open bugs)

Details

(Whiteboard: [MemShrink:P2])

Attachments

(5 files, 5 obsolete files)

Currently we resize the nursery depending on the promotion rate and the size of the nursery used, and according to some thresholds we can double the size of the nursery or shrink it by one chunk.

The problem with this is that growth is very aggressive while shrinking can take a while to have any effect and that the nursery size can end up growing and shrinking a lot even for a uniform workload.

It might make more sense instead to estimate an optimal nursery size based on the data we have, possibly using a moving average, and then resize to that.
Priority: -- → P3
Depends on Bug 1433007 because that bug complicates the GC size calculation and is already underway.

Blocks Bug 1449600 because this may make the nursery size more stable and therefore be "nicer" to the chunk allocator.

Related to Bug 1492720 because both bugs propose to change the nursery size calculation.
Blocks: MinGCMem
Depends on: 1433007
See Also: → 1492720
I'd like to work on this, I think the nursery size stability it should provide will make some of my other work have more tangible results.
Assignee: nobody → pbone
Status: NEW → ASSIGNED
Blocks: 1433007
No longer depends on: 1433007
This seems to improve some stability in nursery sizes.  Also the nursery is now very quick to react to changing workloads, maybe too quick.  For example in the some benchmarks as it switches between tasks the "switch" of tasks has different memory allocation patterns, the nursery reacts to this and then starts the next benchmark larger than necessary.  We could prerhaps add back in the condition that the current and previous promotion rates both exceed a threshold before we resize the nursery.
Attachment #9024923 - Flags: review?(jcoppeard)
Here's the first set of differences with this patch:

https://treeherder.mozilla.org/perf.html#/compare?originalProject=mozilla-central&newProject=try&newRevision=e92a6d2427ba&framework=1&showOnlyComparable=1&selectedTimeRange=172800

There are some sagnificant size improvments but also some perf regressions.
Whiteboard: [MemShrink]
It seemed that the Nursery was adjusting it's size too quickly.  So I added this patch that restores the behaviour where it needs two collections with a high or low promotion rate before adjusting its size.
Attachment #9025209 - Flags: feedback?(jcoppeard)
In the ares6 test and possibly others we were getting unlucky and hit a high promotion rate while the nursery was small. causing the wrong objects to get pretenured, then those cause other objects to be tenured and the generational GC stops being an optimisation.
Blocks: 1507445
(In reply to Paul Bone [:pbone] from comment #7)
These patches seem reasonable but as you note there are still some regressions.

In order to compare with the current setup you post some numbers illustrating how the nursery size changes over time for benchmarks that make particular use of the nursery?
Attached patch Tweaking the pretenure condition (obsolete) — Splinter Review
Here is another patch I'm testing with, it may commit on this bug or maybe by itself.
Attachment #9026276 - Flags: feedback?(jcoppeard)
As I noted (comment 7) the ares6 test got unlucky with 3 specific object groups getting pretenured, causing a lot more nursery objects surviving the nursery and being evicted, which also meant more regular GC activity.

I've added another patch which refused to make a pretenuring decision if the nursery is less than 4MB.  My thought that if the nursery is small and we promoted a lot, we should probably grow the nursery before we make a pretenuring decision.  This works out and the test is improved.  However there's still some regressions.  One of the interesting comparisions is:

pretenure tweak vs pretenure tweak + nursery sizing change:

All the tests:
https://treeherder.mozilla.org/perf.html#/compare?originalProject=mozilla-central&newProject=try&newRevision=c985a960915e&framework=1&showOnlyComparable=1&selectedTimeRange=172800

Ares6:
https://treeherder.mozilla.org/perf.html#/comparesubtest?originalProject=try&originalRevision=c985a960915e&newProject=try&newRevision=eefa2a8c1919&originalSignature=0bae7e5fb769123dc4504aea0458898f63bee0dd&newSignature=0bae7e5fb769123dc4504aea0458898f63bee0dd&framework=11

So there's still something sub-optimal about this new nursery sizing change.  The pretenuring tweak on its own doesn't change much (there's some slight changes but it's hard to be sure), as it is it's only valuable if we also commit the sizing changes.
(In reply to Jon Coppeard (:jonco) from comment #8)
> (In reply to Paul Bone [:pbone] from comment #7)
> These patches seem reasonable but as you note there are still some
> regressions.
> 
> In order to compare with the current setup you post some numbers
> illustrating how the nursery size changes over time for benchmarks that make
> particular use of the nursery?

I'll see what I can do.
Here the Nightly version starts with a 10MB nursery, while the test version starts with a 1MB Nursery.  That's because with the patched code the previous test (Basic) drops its nursery size quickly, while the nightly version does not - and sees a higher tenure rate.
Okay, based on that data I've tweaked it a little further (patches soon).  What I've got now is it only consideres the current tenure rate (not the previous one) and it will at most double or halve the nursery.  We needed the halving because sometimes you'd see a really low tenure rate, the nursery size would drop to 1MB, then take 4 collections to restore itself to 16MB.

Here's the tests, comparing a specific version of central, against my patches:

https://treeherder.mozilla.org/perf.html#/compare?originalProject=try&originalRevision=98c53f6ef2ba&newProject=try&newRevision=8fa0a2c613d9&framework=1&showOnlyConfident=1

There might be some regressions with about_preferences and tp6_facebook,  Neither do much GC work (I'm looking at the profiles generated from those tests) so I'm not sure why there's a difference.  I'll tidy up the patches and share them later tonight.
?
Attachment #9025209 - Attachment is patch: false
Attachment #9025209 - Flags: feedback?(jcoppeard)
Depends on: 1510560
The tests look good for this patch, but it must not land without Bug 1510560 without which it will regress.
Attachment #9024923 - Attachment is obsolete: true
Attachment #9026276 - Attachment is obsolete: true
Attachment #9024923 - Flags: review?(jcoppeard)
Attachment #9026276 - Flags: feedback?(jcoppeard)
Attachment #9028253 - Flags: review?(jcoppeard)
Whiteboard: [MemShrink] → [MemShrink:P2]
Attached file charts.zip
To help evaluate changes to the nursery size heuristics, I wrote some scripts to generate charts of the size and tenure rate against time for various benchmarks (octane plus some of talos ones).  Here's the output, with and without the patch in this bug.  (I can post the scripts too if interested).

The results show some improvements (e.g. for EarleyBoyer the nursery size is more stable) and some regressions (e.g. for Splay it varies a lot more).
I put the scripts to generate these charts on github: https://github.com/jonco3/nurserySizeCharts
Comment on attachment 9028253 [details] [diff] [review]
Bug 1497873 - Add a kind-of goal seeking herustic for nursery resizing

Review of attachment 9028253 [details] [diff] [review]:
-----------------------------------------------------------------

Cancelling review request for now.
Attachment #9028253 - Flags: review?(jcoppeard)
So in Jon's results, and replicated in mine, Splay running in the shell sees a lot of floundering WRT the nursery size.  While most of the other benchmarks show fairly stable behavour for stable tenure rates, and react quickly to changing tenure rates.  I also get the sense (but no hard data) that considering the previous tenure rate, and tweaking the pretenure condition result in better behavour in these tests (the latter definitly improved ares when tested on try due to some unlucky pretenuring).

I've exprimented with a few things to remove the floundering in Splay, including a larger deadzone for the tenure rate (didn't work).  So far what has worked, and seems a bit funny, is the "correct" calculation for the tenure rate (considering nursery used size not capacity).  I'm going to start some try/talos jobs now with this change and see what it's like.

One other change that I'm on the fence about is how it should behave WRT the previous tenure rate.  Should it observe it both for growing and shrinking the nursery, or only growing (as before)?  It seems like shrinking quickly might be desirable, particularly if there's not going to be another minor GC for a little while as the app hits a steady state (when tenure rate drops anyway).

¯\_(ツ)_/¯
My current patches with incorrect tenure rate calculation:

https://treeherder.mozilla.org/perf.html#/compare?originalProject=try&originalRevision=6e55e0f8e303&newProject=try&newRevision=8fac29e891b3&framework=1&showOnlyConfident=1

This still has some regressions for ares6, but other tests seem good.  Including some size improvements.

With correct tenure rate calculation:

https://treeherder.mozilla.org/perf.html#/compare?originalProject=try&originalRevision=6e55e0f8e303&newProject=try&newRevision=1c7c752cfa2e&framework=4&showOnlyConfident=1

Regresses awsy tests.

reminder to myself: Try to work out why ares6 regresses in the first comparison.  And for the 2nd, if we can have the correct tenure rate (which fixes the flip-floppy-ness of Splay) without regressing awsy.
So the problem with ares6 is in the babylon subtest (as always) there are more non-imcremental GCs with these patches due to GCBytesTrigger.
Priority: P3 → P1
Attachment #9025209 - Attachment is obsolete: true
Attachment #9028253 - Attachment is obsolete: true
Attachment #9032855 - Attachment is obsolete: true

Also allow the nursery to shrink up to half it's current size, previously
it'd be one chunk at a time. Growing is still capped at twice the current
size. These limits tend to prevent it changing too dramantically for small
changes in allocation patterns.

Depends on D17593

Hi Jon,

I'd like to land this next week. The current talos/raptor/js-bench results look good. The big red marks on some graphics test is because someone recently landed something that improved those benchmarks but the changes I tested were based of an older central version.

https://treeherder.mozilla.org/perf.html#/compare?originalProject=mozilla-central&newProject=try&newRevision=5a33f9b4e7466ecb0de6ba63cfae2699a7ab942d&framework=1&showOnlyComparable=1&showOnlyConfident=1&selectedTimeRange=86400

Note the raptor tests, some of those tests look very good.

Pushed by pbone@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/adcc2b05c708
Tweak the pretenure condition r=jonco
https://hg.mozilla.org/integration/autoland/rev/a3e1dfc5aae9
Add a kind-of goal seeking herustic for nursery resizing r=jonco

NI perfherder people. This may change a few of our benchmarks, see https://treeherder.mozilla.org/perf.html#/compare?originalProject=mozilla-central&newProject=try&newRevision=5a33f9b4e7466ecb0de6ba63cfae2699a7ab942d&framework=10&showOnlyComparable=1&showOnlyConfident=1&selectedTimeRange=86400 (unless it was reddit that changed their code).

I still want to know about any changes you happen to notice, but if you're trying to find a patch that cased a perf change, look here.

Flags: needinfo?(rwood)
Flags: needinfo?(jmaher)
Status: ASSIGNED → RESOLVED
Closed: 5 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla67

thanks :pbone!

:igoldan, :bebe, heads up- you should be seeing alerts by now

Flags: needinfo?(jmaher)
Flags: needinfo?(igoldan)
Flags: needinfo?(fstrugariu)

Thanks Paul.

Flags: needinfo?(rwood)

Hrm, maybe the one I saw on try really was that reddit changed their code. Which is good because I didn't think this change would make that much difference.

Flags: needinfo?(igoldan)
Flags: needinfo?(fstrugariu)
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: