Closed Bug 1293239 Opened 5 years ago Closed 5 years ago

Nursery promotion rate can be underestimated

Categories

(Core :: JavaScript: GC, defect)

defect
Not set
normal

Tracking

()

RESOLVED WONTFIX
mozilla51
Tracking Status
firefox51 --- fixed

People

(Reporter: jonco, Assigned: jonco)

References

Details

Attachments

(3 files, 1 obsolete file)

Attached patch fix-promotion-rate-estimate (obsolete) — Splinter Review
Currently the promotion rate is calculated like this:

  double promotionRate = mover.tenuredSize / double(allocationEnd() - start());

But allocationEnd() is the limit of the available nursery space so if minor GC has not been triggered by exhausting the nursery then the promotion rate will be an underestimate.
Attachment #8778849 - Flags: review?(terrence)
Comment on attachment 8778849 [details] [diff] [review]
fix-promotion-rate-estimate

Review of attachment 8778849 [details] [diff] [review]:
-----------------------------------------------------------------

I think the idea is that if we didn't reach the end of the nursery anyway, we certainly don't need/want to make it longer.

Specifically, if the code we are currently running is triggering barriers heavily, then a large fraction of the minor heap is going to be held live by the store buffer, so collections are going to be relatively long. My intuition is that making the heap larger is going to (modulo barriers) increase the time between collections, which is going to generally increase the collection length; whereas collecting more frequently will ideally flush the longer lived stuff sooner so that less things are available to entrain random garbage.

Of course, intuition isn't worth the garbage we're collecting, so how does this perform in real life? I'm mostly curious about Earley-Boyer: how is the performance affected there?
Attachment #8778849 - Flags: review?(terrence)
(In reply to Terrence Cole [:terrence] from comment #1)
Ok, that's interesting.  What I observed happening was that during splay the incremental GC slices end up shrinking the nursery.  I think this is probably undesirable too.  I'll have a look at Earley-Boyer and see what's going on there.
Comment on attachment 8778849 [details] [diff] [review]
fix-promotion-rate-estimate

Review of attachment 8778849 [details] [diff] [review]:
-----------------------------------------------------------------

Oh, that's an excellent point!

Note that this patch does implement the intended logic and I am all for taking it if it is a win: wouldn't be the first time my internal heuristics have been affronted by reality.
Attachment #8778849 - Flags: review+
Here's an updated patch.

This time we don't increase the nursery size unless we filled the currently available space.

We will still shrink the nursery if we didn't fill it completely if the promotion rate is low enough, but note that the promotion rate will in general be higher the earlier we collect.

Also, we don't resize the nursery at all unless we made a significant amount of allocations, so as not to base the behaviour on insufficient data.

I tested Earley Boyer (and octane in general) and this didn't make any noticable difference either way.
Attachment #8778849 - Attachment is obsolete: true
Attachment #8779318 - Flags: review?(terrence)
Comment on attachment 8779318 [details] [diff] [review]
fix-promotion-rate-estimate v2

Review of attachment 8779318 [details] [diff] [review]:
-----------------------------------------------------------------

WFM
Attachment #8779318 - Flags: review?(terrence) → review+
Pushed by jcoppeard@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/62471afba14b
Improve nursery resizing heuristics r=terrence
https://hg.mozilla.org/mozilla-central/rev/62471afba14b
Status: NEW → RESOLVED
Closed: 5 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla51
Depends on: 1296272
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
As dicussed in bug 1296272, here's a patch to revert to the previous heuristics.  It's not a straight backout because dynamic nursery chunk allocation has happened in the meantime (among other things).
Attachment #8782525 - Flags: review?(terrence)
Comment on attachment 8782525 [details] [diff] [review]
backout-resize-heuristic

Review of attachment 8782525 [details] [diff] [review]:
-----------------------------------------------------------------

Let's try it.
Attachment #8782525 - Flags: review?(terrence) → review+
Pushed by jcoppeard@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/491bc7e98f2a
Revert nursery resizing heuristics r=terrence
Keywords: leave-open
See comment 12
Flags: needinfo?(jcoppeard)
It seems unlikely that backing this out would cause a regression if the original landing didn't cause an improvement, but it's possible I guess.
Flags: needinfo?(jcoppeard)
(In reply to Jon Coppeard (:jonco) from comment #14)
> It seems unlikely that backing this out would cause a regression if the
> original landing didn't cause an improvement, but it's possible I guess.

I can reproduce this locally:

octane - EarleyBoyer

After:  21357   (revision: 491bc7e98f2a)
Before: 39736   (revision: 567ed465cf00)

Seems like https://hg.mozilla.org/mozilla-central/rev/491bc7e98f2a did cause a regression!
(In reply to Hannes Verschore [:h4writer] (PTO:12-21 Aug) from comment #15)
It seems this change is to blame.  Investigating.
I messed up the previous patch to revert the initial change by using the maximum nursery size rather than the current nursery size when calculating the promotion rate.

The problem was that the promotion rate was always calculated as being very small and so we weren't growing the nursery.  This caused the reduction in the Earley-Boyer score.
Attachment #8783522 - Flags: review?(terrence)
Comment on attachment 8783522 [details] [diff] [review]
bug1293239-fix-backout

Review of attachment 8783522 [details] [diff] [review]:
-----------------------------------------------------------------

Oh, right, because the meaning of nurserySize() has changed substantially.
Attachment #8783522 - Flags: review?(terrence) → review+
Pushed by jcoppeard@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/5c56991cb9a5
Fix error when reverting nursery resize heuristic change r=terrence
Depends on: 1297646
(In reply to Pulsebot from comment #19)
> Pushed by jcoppeard@mozilla.com:
> https://hg.mozilla.org/integration/mozilla-inbound/rev/5c56991cb9a5
> Fix error when reverting nursery resize heuristic change r=terrence

5c56991cb9a5 seems to have caused a 14MiB regression in startup memory usage on AWSY. Is that the intent?
Flags: needinfo?(jcoppeard)
(In reply to Eric Rahm [:erahm] from comment #21)
The original failed attempt at backing this out (491bc7e98f2a) introduced a corresponding 16MB improvement and the actual backout regressed it again.

I had to hunt a little to find this in AWSY but it's under this build:

https://hg.mozilla.org/integration/mozilla-inbound/rev/169429641d30
https://hg.mozilla.org/integration/mozilla-inbound/pushloghtml?fromchange=3a2ae963910c&tochange=169429641d30
Flags: needinfo?(jcoppeard)
(In reply to Jon Coppeard (:jonco) from comment #22)
> (In reply to Eric Rahm [:erahm] from comment #21)
> The original failed attempt at backing this out (491bc7e98f2a) introduced a
> corresponding 16MB improvement and the actual backout regressed it again.
> 
> I had to hunt a little to find this in AWSY but it's under this build:
> 
> https://hg.mozilla.org/integration/mozilla-inbound/rev/169429641d30
> https://hg.mozilla.org/integration/mozilla-inbound/
> pushloghtml?fromchange=3a2ae963910c&tochange=169429641d30

All I see is a 14MiB regression for explicit start (~10%) on the 22nd from 5c56991cb9a5, I don't see a previous improvement that was reverted.
(In reply to Eric Rahm [:erahm] from comment #23)
I had another look at this and it does seem to be a real issue.  

The symptom is that the explicit allocation of the browser is now 14MiB higher than before, and the data shows that this is due to the nursery allocation.

I don't think it's caused by the patch in this bug, but by intervening changes that happened between the original attempted backout of this patch and the actual backout.  The original failed backout just hid the effect of those changes by making it so the nursery never grew beyond its initial 1MiB allocation.

I think this is caused by two factors:

 - bug 1286506 makes nursery shrinking happen more slowly, taking two minor GCs to shrink it 1MiB.  This was done to reduce the nursery constantly being resized.
 - bug 1293127 reduces the number of minor GCs that need to happen, which is in general a good thing.

In concert however I think they mean that we don't shrink the nursery down again as fast as we used to.  If the user is not active and there are no GCs happening then the nursery size can remain large, which is not ideal.
Bug 1309615 fixes this issue in local testing.
Bug 1309615 seems to have fixed this on AWSY.
Status: REOPENED → RESOLVED
Closed: 5 years ago5 years ago
Resolution: --- → WONTFIX
Removing leave-open keyword from resolved bugs, per :sylvestre.
Keywords: leave-open
You need to log in before you can comment on or make changes to this bug.