Increase slice time for long-running incremental GCs

RESOLVED FIXED in Firefox 57

Status

enhancement
RESOLVED FIXED
Opened 2 years ago
Closed 2 years ago

People

(Reporter: jonco, Assigned: smaug)

Tracking

(Blocks 1 bug)

Version: 55 Branch
Target Milestone: mozilla57
Points: ---

Firefox Tracking Flags

(firefox57 fixed)

Details

(Whiteboard: [qf:p2])

Attachments

(1 attachment)

Reporter

Description

2 years ago
If a GC is not making sufficient progress for whatever reason, we should increase the slice budget to ensure that it completes. If we hit the next allocation threshold we will do a non-incremental GC, which will cause jank.

We could work out how many slices are 'normal' for a GC and start increasing the budget if a collection exceeds some threshold. Or we could look at allocation pressure: how far over the incremental threshold we are, or how far from the non-incremental threshold we are.

This was done for the cycle collector in bug 1304205.
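
As a rough sketch of the first idea above (grow the budget once a collection has taken more slices than is 'normal'), a policy could look something like the following. The names, thresholds and linear growth are assumptions for illustration, not SpiderMonkey's actual heuristics.

#include <algorithm>
#include <cstdint>
#include <cstdio>

// Illustrative slice-budget policy: use the default budget until a GC has run
// more slices than is considered "normal", then grow the budget linearly,
// capped so a single slice never gets too long.
struct SliceBudgetPolicy {
  uint32_t normalSliceCount = 20;  // assumed typical slice count per GC
  int64_t baseBudgetMs = 5;        // assumed default per-slice budget
  int64_t maxBudgetMs = 50;        // cap on a single slice

  int64_t NextBudgetMs(uint32_t slicesSoFar) const {
    if (slicesSoFar <= normalSliceCount) {
      return baseBudgetMs;
    }
    int64_t extra = int64_t(slicesSoFar - normalSliceCount);
    return std::min(baseBudgetMs + extra, maxBudgetMs);
  }
};

int main() {
  SliceBudgetPolicy policy;
  for (uint32_t slices : {10u, 20u, 40u, 80u}) {
    std::printf("%u slices -> %lld ms budget\n", (unsigned)slices,
                (long long)policy.NextBudgetMs(slices));
  }
  return 0;
}
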
Comment 1

2 years ago
There's a trade-off here (of course): we don't want to get into a situation where we jank at the end of every GC when we could've finished it in time without adjusting the slice length. Ideally we would be able to quantify how much of the GC we've completed, so we can predict how many more milliseconds we need, then compare that to how many more milliseconds we expect the mutator to be able to run without triggering the next GC (and schedule the mutator-to-GC ratio accordingly).

Then we could build in an element of persistence so that if we consistently end up behind, we start out with longer slices - and if we consistently finish with loads of time left, we start out with shorter slices. We can do something like that anyway, but without some sort of linear/exponential back-off the only metrics would be 'did we have a lot of time left over before the next trigger' and 'did we have to do a full GC because we were too slow' (the latter being the problem case).
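
A minimal sketch of that scheduling idea, assuming we can estimate the fraction of the GC already completed; the names and the linear extrapolation are assumptions for illustration only.

#include <algorithm>
#include <cstdio>

// Extrapolate how many more milliseconds of GC work remain from the fraction
// already completed and the time spent so far.
static double EstimateRemainingGCMs(double fractionComplete, double gcMsSoFar) {
  if (fractionComplete <= 0.0) {
    return gcMsSoFar;  // no data yet; assume at least as much work again
  }
  return gcMsSoFar * (1.0 - fractionComplete) / fractionComplete;
}

// Size the per-slice budget so the remaining work fits into the headroom we
// expect before the next allocation trigger forces a non-incremental GC.
static double PickSliceBudgetMs(double remainingGCMs, double msUntilNextTrigger,
                                double sliceIntervalMs, double minBudgetMs,
                                double maxBudgetMs) {
  double slicesLeft = std::max(1.0, msUntilNextTrigger / sliceIntervalMs);
  double needed = remainingGCMs / slicesLeft;
  return std::max(minBudgetMs, std::min(needed, maxBudgetMs));
}

int main() {
  double remaining = EstimateRemainingGCMs(0.4, 60.0);  // 40% done after 60ms of GC
  double budget = PickSliceBudgetMs(remaining, 500.0, 100.0, 5.0, 50.0);
  std::printf("remaining %.1f ms, next slice budget %.1f ms\n", remaining, budget);
  return 0;
}
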
Comment 2

2 years ago
In the CC, the time it runs is bounded (some number of slices), so I used that to guide how desperate the CC becomes. I'm not sure what the main causes are for the GC to stop running incrementally. I know the CC forces it to stop if the GC takes too long. Also maybe the GC will finish up if there's a need for memory.

Updated

2 years ago
Whiteboard: [qf] → [qf:p2]
Comment 3

2 years ago
(In reply to Andrew McCreight [:mccr8] from comment #2)
> I'm not sure what the main
> causes are for the GC to stop running incrementally. I know the CC forces it
> to stop if the GC takes too long. Also maybe the GC will finish up if
> there's a need for memory.

75% ZoneChange
20% AbortRequested
most of the rest is NonIncrementalRequested

https://telemetry.mozilla.org/new-pipeline/dist.html#!cumulative=0&end_date=2017-05-30&keys=__none__!__none__!__none__&max_channel_version=nightly%252F55&measure=GC_RESET_REASON&min_channel_version=null&processType=*&product=Firefox&sanitize=1&sort_keys=submissions&start_date=2017-03-08&table=0&trim=1&use_submission_date=0

So currently, it's mostly because we add a new zone to the set we're collecting. I'm not sure how often that's from hitting a GC trigger vs. the CC or somebody else adding a zone.

I also don't know which one the CC-forced GCs show up as. I failed at tracing the code. I did see that if we get to the case where we need a GC after the CC, then it does it by scheduling all of the zones for GC, which would presumably show up as ZoneChange if we were in the middle of an incremental GC? Or maybe that doesn't happen, because we wouldn't be GCing during a CC anyway?

/me confused
Assignee

Comment 4

2 years ago
This, or something similar, is showing up quite badly in BHR hang reports. The CC has been waiting for the GC to finish, but if that doesn't happen, it needs to finish the GC synchronously.
I'll write some kind of patch for this.
Assignee: nobody → bugs
Assignee

Comment 5

2 years ago
Perhaps something like this. It should usually start to kick in after 10% of the locked-out time (so around 3s), increasing the slice budget from there up to 50ms (which also happens to be the max idle time).
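
For illustration, the scaling described here might look roughly like the sketch below. The constants, names and formula are assumptions, not the actual patch (which changes InterSliceGCRunnerFired in dom/base/nsJSEnvironment.cpp).

#include <algorithm>
#include <cstdint>
#include <cstdio>

static const int64_t kDefaultBudgetMs = 5;       // sActiveIntersliceGCBudget default
static const int64_t kMaxBudgetMs = 50;          // also the max idle time
static const int64_t kCCLockoutLimitMs = 30000;  // assumed limit on CC lockout (30s)

// Grow the inter-slice budget once the CC has been locked out (and has tried
// to run) for more than ~10% of the lockout limit, reaching the cap at the limit.
static int64_t BudgetForCCLockout(int64_t lockedOutMs) {
  const int64_t kickInMs = kCCLockoutLimitMs / 10;  // ~3s
  if (lockedOutMs <= kickInMs) {
    return kDefaultBudgetMs;
  }
  double progress = double(lockedOutMs - kickInMs) / double(kCCLockoutLimitMs - kickInMs);
  int64_t budget = kDefaultBudgetMs + int64_t(progress * (kMaxBudgetMs - kDefaultBudgetMs));
  return std::min(budget, kMaxBudgetMs);
}

int main() {
  for (int64_t ms : {1000, 3000, 10000, 30000}) {
    std::printf("locked out %lld ms -> %lld ms budget\n",
                (long long)ms, (long long)BudgetForCCLockout(ms));
  }
  return 0;
}
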

remote: View your change here:
remote:   https://hg.mozilla.org/try/rev/3bc55d7ad05eeebcc94e3e00bc08d466328a8588
remote: 
remote: Follow the progress of your build on Treeherder:
remote:   https://treeherder.mozilla.org/#/jobs?repo=try&revision=3bc55d7ad05eeebcc94e3e00bc08d466328a8588
remote: 
remote: It looks like this try push has talos jobs. Compare performance against a baseline revision:
remote:   https://treeherder.mozilla.org/perf.html#/comparechooser?newProject=try&newRevision=3bc55d7ad05eeebcc94e3e00bc08d466328a8588
Attachment #8904973 - Flags: review?(continuation)
Comment 6

2 years ago
Comment on attachment 8904973 [details] [diff] [review]
longer_gc_slices_near_cc.diff

Review of attachment 8904973 [details] [diff] [review]:
-----------------------------------------------------------------

This looks reasonable to me. Maybe you should get feedback from a GC person, or at least give them a heads up.

::: dom/base/nsJSEnvironment.cpp
@@ +1785,5 @@
>  InterSliceGCRunnerFired(TimeStamp aDeadline, void* aData)
>  {
>    nsJSContext::KillInterSliceGCRunner();
>    MOZ_ASSERT(sActiveIntersliceGCBudget > 0);
> +  // We use longer budgets when CC has been locked since that means

The CC is always locked out when the GC is running (it is set to true in JS::GC_CYCLE_BEGIN). Maybe you mean something more like, when the CC has been locked out but the CC has tried to run?
Attachment #8904973 - Flags: review?(continuation) → review+
Assignee

Comment 7

2 years ago
(In reply to Andrew McCreight [:mccr8] from comment #6)
> This looks reasonable to me. Maybe you should get feedback from a GC person,
> or at least give them a heads up.
Sure
 

> The CC is always locked out when the GC is running (it is set to true in
> JS::GC_CYCLE_BEGIN). Maybe you mean something more like, when the CC has
> been locked out but the CC has tried to run?
Indeed
Assignee

Comment 8

2 years ago
Comment on attachment 8904973 [details] [diff] [review]
longer_gc_slices_near_cc.diff

s/when CC has been locked/when the CC has been locked out but the CC has tried to run/

sActiveIntersliceGCBudget is 5ms by default.
Attachment #8904973 - Flags: review?(sphink)
Assignee

Updated

2 years ago
Attachment #8904973 - Flags: review?(sphink) → review?(jcoppeard)
Reporter

Comment 9

2 years ago
Comment on attachment 8904973 [details] [diff] [review]
longer_gc_slices_near_cc.diff

Review of attachment 8904973 [details] [diff] [review]:
-----------------------------------------------------------------

Thanks for the heads up.

This looks good, although I think that 30s is still too long to let a GC run before increasing the slice time to the max.

We can experiment with changing the settings later though when we see what effect this has on telemetry.
Attachment #8904973 - Flags: review?(jcoppeard) → review+
Assignee

Comment 10

2 years ago
Yeah, 30s is a lot, but at least the budget is already increased gradually before that.

Perhaps it shouldn't be * 10, but something like * 50. It's better to eventually have a 250ms slice if needed than sync GCs right before the CC. But I'll try with * 10.
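
(For reference, with the default 5ms inter-slice budget mentioned in comment 8: the * 10 multiplier caps a slice at 50ms, which is also the max idle time, while * 50 would cap it at the 250ms mentioned above.)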

Comment 11

2 years ago
Pushed by opettay@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/01d736f31b39
Increase slice time for long-running incremental GCs when CC has tried to run, r=mccr8,jonco
https://hg.mozilla.org/mozilla-central/rev/01d736f31b39
Status: NEW → RESOLVED
Closed: 2 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla57