Open Bug 1376557 Opened 7 years ago Updated 2 years ago

finalisation slice can run late

Categories

(Core :: JavaScript: GC, defect, P3)

Performance Impact low

People

(Reporter: pbone, Unassigned)

References

Details

(Keywords: perf)

Attachments

(2 files)

The JS shell runs GC slices when the amount of allocated memory crosses a threshold.  If a slice ends after sweeping has finished but before finalisation (or perhaps part-way through sweeping), the allocated memory will have dropped, so the next slice won't run for some time (not until the allocated memory increases again).  This means that if other zones want to begin a collection, they must either interrupt this unfinished collection or, due to bug 1367455, be delayed.  Barriers may also still be in place (though I'm not sure), which may slow throughput.
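
To illustrate the problem, here is a minimal sketch of a slice scheduler driven purely by live allocated bytes.  All names here are illustrative, not SpiderMonkey's actual API:

  #include <cstddef>

  struct Zone {
      size_t gcBytes = 0;       // currently allocated GC bytes
      size_t sliceTrigger = 0;  // threshold at which the next slice runs
      bool collecting = false;  // an incremental GC is in progress
  };

  bool shouldRunSlice(const Zone& zone) {
      // If sweeping has just freed most of the heap, gcBytes is below
      // sliceTrigger even though finalisation work remains, so this returns
      // false and the unfinished collection sits idle until enough new
      // allocation happens.
      return zone.collecting && zone.gcBytes >= zone.sliceTrigger;
  }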
Assignee: nobody → pbone
Status: NEW → ASSIGNED
See Also: → 1367455
Whiteboard: [qf]
Here are the two regressions that I'm attempting to fix:

https://treeherder.mozilla.org/perf.html#/alerts?id=7016
Blocks: 1376904
No longer blocks: 1376904
Paul, is this something that only shows up in the shell, or does this pattern happen in the browser as well?  From the description it seems like the browser in general could be affected by this too.
Flags: needinfo?(pbone)
Attached image Profile timeline
This is a screenshot from the Gecko profiler.  This could be the same behaviour happening in the browser (from either the same or a different cause).  I'm also suspicious about the integration between the GC and the Gecko profiler; I'm currently trying to work out how all of this fits together so that I understand _completely_ what I'm seeing.
Flags: needinfo?(pbone)
Here is the Gecko profile file for splay latency.
Whiteboard: [qf] → [qf:p3]
Here's what I know:

 + The same behavior with the final delayed slice does occur in the browser.

 + The GC following the one with the delayed slice becomes non-incremental due to GCBytesTrigger.  This also
   happens on Beta (I think my earlier patch is now on Beta).  In my tests with the shell this was because
   the heap filled up during the collection, since the start of the collection was delayed.  But there are
   also other possibilities.

 + My patch, which I discussed with jonco, attempts to avoid this by adding a new monotonic counter for the
   amount of used memory: the counter is never decremented when memory is freed.  It is used to trigger the
   final slice within a reasonable time, completing the zonal GC sooner and therefore not delaying the
   following GC (see the sketch after this list).

 + My patch works: it does avoid this problem.  However, _all_ the GCs seem to switch to non-incremental
   (rather than only the delayed GCs) due to GCBytesTrigger.  This creates more frequent and longer pauses,
   lowering the benchmark scores further.  This shouldn't happen if my original guess (that the delayed GC
   became non-incremental because it was delayed) was correct.  Now none are being delayed and yet ALL are
   non-incremental, so there may also be a separate cause.

 + My patched version seems to have slices that "look" further apart, but I haven't measured this
   accurately.  It should not be caused by no longer delaying the final slice; that doesn't make sense.
   However, it does suggest that my new counter is not advancing as fast as the previous one, which could
   cause the GC to fail to reclaim memory quickly enough and therefore switch to non-incremental mode.

 + Since the slice budget was lowered to 5ms, the whole system is possibly more likely to fail to reclaim
   enough memory soon enough, as the trigger for slices wasn't changed to correspond to this.  I propose
   lowering the heap thresholds to begin collections earlier and therefore, hopefully, run enough slices
   before the heap fills up.  Ideally we should calculate some of these parameters dynamically, based on a
   projection of how much work remains during a collection and how much space is left in the heap (a sketch
   of such a projection follows the test list below).
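
To make the counter idea concrete, here is a minimal sketch.  All names (Zone, allocatedEver, nextSliceAt) are hypothetical, not the actual patch or SpiderMonkey API:

  #include <cstddef>

  struct Zone {
      size_t allocatedEver = 0;  // monotonic: bumped on allocation, never decremented
      size_t nextSliceAt = 0;    // the next slice fires when allocatedEver passes this
      bool collecting = false;   // an incremental GC is in progress
  };

  void noteAllocation(Zone& zone, size_t nbytes) {
      zone.allocatedEver += nbytes;  // freeing memory leaves the counter alone
  }

  bool shouldRunSlice(const Zone& zone) {
      // Unlike a live-bytes trigger, this cannot stall after sweeping frees
      // memory: any allocation moves allocatedEver towards nextSliceAt, so the
      // final slice runs within a bounded amount of further allocation.
      return zone.collecting && zone.allocatedEver >= zone.nextSliceAt;
  }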

There are some things I can test:

 + Adjust the threshold for my new counter, lower it and see if the behavior continues.

 + Search for places I'm missing where I should be incrementing the counter.

 + Re-increase the GC pause time to 10ms and see if the problem still occurs.

 + Lower the thresholds that start a collection.
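
A minimal sketch of the dynamic calculation proposed above: space the remaining slices so that the collection finishes before the heap fills.  The linear model and all names are assumptions, not an actual implementation:

  #include <algorithm>
  #include <cstddef>

  // Returns how many bytes of allocation should be allowed before the next
  // slice, given a projection of the remaining work and the remaining heap
  // headroom.
  size_t nextSliceInterval(size_t workRemaining,  // e.g. bytes still to mark/sweep
                           size_t workPerSlice,   // measured throughput of one slice
                           size_t heapLimit,
                           size_t heapUsed) {
      if (workPerSlice == 0)
          workPerSlice = 1;  // avoid division by zero before any measurement
      size_t slicesLeft =
          std::max<size_t>(1, (workRemaining + workPerSlice - 1) / workPerSlice);
      size_t headroom = heapLimit > heapUsed ? heapLimit - heapUsed : 0;
      // Trigger the next slice after at most headroom/slicesLeft bytes of
      // allocation, so all remaining slices fit before the heap fills up.
      return headroom / slicesLeft;
  }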
Blocks: 1367455
See Also: → 1386219
Priority: -- → P3
(In reply to Paul Bone [:pbone] from comment #5)
>  + My patch works: it does avoid this problem.  However, _all_ the GCs seem
>    to switch to non-incremental (rather than only the delayed GCs) due to
>    GCBytesTrigger.  This creates more frequent and longer pauses, lowering
>    the benchmark scores further.  This shouldn't happen if my original guess
>    (that the delayed GC became non-incremental because it was delayed) was
>    correct.  Now none are being delayed and yet ALL are non-incremental, so
>    there may also be a separate cause.

It seems like it's important to understand what's going on here. If you look at a profile where this is happening, does the information on thresholds make sense? As in, if this is GCBytesTrigger, then timings.trigger_amount and timings.trigger_threshold should be getting set. Are these significantly different than before your patch? (It just strikes me that if your new counter were never incremented at all, due to some bug, then it would have this observed behavior.)
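
A minimal sketch of that diagnostic, with hypothetical names (the "timings" struct and the counter arguments stand in for whatever the instrumentation actually records):

  #include <cassert>
  #include <cstddef>
  #include <cstdio>

  struct Timings {
      size_t trigger_amount = 0;
      size_t trigger_threshold = 0;
  };

  void reportNonIncremental(const Timings& timings, size_t counterNow,
                            size_t counterAtLastGC) {
      fprintf(stderr, "GCBytesTrigger: amount=%zu threshold=%zu\n",
              timings.trigger_amount, timings.trigger_threshold);
      // If the counter has not moved since the last GC, it is probably never
      // being incremented, which would explain every GC tripping the trigger
      // immediately and going non-incremental.
      assert(counterNow > counterAtLastGC && "new counter appears stuck");
  }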
Flags: needinfo?(pbone)
Good thinking. I'll check that when I next get a chance to look at this, probably not until Monday the 25th.
Keywords: perf
Flags: needinfo?(pbone)
See Also: → 1377430
Assignee: pbone → nobody
Status: ASSIGNED → NEW
Performance Impact: --- → P3
Whiteboard: [qf:p3]
Severity: normal → S3