run cycle collector less often when it is slow

RESOLVED WONTFIX

Status


defect
RESOLVED WONTFIX
8 years ago
3 years ago

People

(Reporter: mccr8, Unassigned)

Tracking

(Blocks 1 bug)

Trunk
Points:
---

Firefox Tracking Flags

(Not tracked)

Details

(Whiteboard: [Snappy:P2])

Attachments

(1 attachment, 2 obsolete attachments)

(Reporter)

Description

8 years ago
We get into various situations where the cycle collector starts taking a long time (>500ms, say) without freeing much.  In that case, we could try running the cycle collector less often.  This would cause memory usage to increase a bit, because it would take longer for things to be freed, but it would probably improve the user experience.

This would require increasing the CC timer, and possibly the purple buffer size threshold.
(Reporter)

Updated

8 years ago
Blocks: 698919
(Reporter)

Updated

7 years ago
Assignee: nobody → continuation
(Reporter)

Comment 1

7 years ago
This will require some tuning.  One data point to consider is bz's experience with the GC scheduling regression.  What was the gap in CCs there?
(Reporter)

Comment 2

7 years ago
After every CC, compute how long the CC took.  Multiply that by 9 to get the next delay.  If that's less than a certain minimum (5 seconds), set it to the minimum.  If that's larger than a certain maximum (30 seconds), set it to the maximum.  The next time the CC timer is set, use the delay that you calculated.

The basic idea is that we need a minimum so we don't run the CC more than we do now when the CC is behaving itself.  I put a fairly generous maximum in there to avoid weird scenarios, like if somebody turns off their computer in the middle of a CC, comes back a day later, and then Firefox decides that the CC actually took 24 hours to run, and then decides not to run the cycle collector again for a week.

The numbers are just something vaguely reasonable I threw together; they can easily be changed.
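
As a rough illustration of the scheduling rule described above (not the actual patch), here is a minimal Python sketch.  The function and constant names are hypothetical; only the multiplier of 9 and the 5 and 30 second bounds come from this comment.

```python
# Hypothetical names; the numeric values are the ones quoted in the comment.
CC_DELAY_MULTIPLIER = 9   # next delay = 9 x the duration of the last CC
MIN_CC_DELAY_S = 5.0      # never schedule the next CC sooner than this
MAX_CC_DELAY_S = 30.0     # never postpone the next CC longer than this


def next_cc_delay(last_cc_duration_s: float) -> float:
    """Return the delay, in seconds, to use the next time the CC timer is set."""
    scaled = last_cc_duration_s * CC_DELAY_MULTIPLIER
    # Clamp into [MIN_CC_DELAY_S, MAX_CC_DELAY_S].
    return min(MAX_CC_DELAY_S, max(MIN_CC_DELAY_S, scaled))
```

With these values, a well-behaved CC keeps the minimum delay, and a bogus measurement (such as the 24 hour case) is capped at the maximum instead of suspending collection for days.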

I need to come up with a "bz-ifier" that can make the CC sit and spin for an extra 1 second to see how that feels with this patch.
(Reporter)

Comment 3

7 years ago
One possibility for an additional pressure valve here would be to use the minimum time if the CC actually collected a bunch of stuff (like, say, more than 10,000 things).  That would avoid a failure cascade if the browser is actually generating a ton of garbage, wherein the CC is slow, gets delayed, so more garbage is generated between CCs, so the CC gets even slower, etc.  Likewise, we could increase the factor if it isn't collecting anything.  In bz's case (and most of the cases I've seen with super slow CCs), the CC isn't collecting more than around 500 items.
Something like this could be ok for FF11.
I'm still aiming to get BBP (black-bit-propagation aka CanSkip) landed for FF12.
That changes how CC is triggered, because CanSkip needs to run occasionally before CC.
(Reporter)

Comment 5

7 years ago
Yeah, that's kind of why I put off working on this.  I think it would be nice to have this in our back pocket in case the schedule slips on your CanSkip stuff, but maybe it isn't worth landing.  I think it is a little too scary to land in 11 at this point in the cycle, especially without any telemetry about CC scheduling already in place.
(Reporter)

Comment 6

7 years ago
I'd also still like a scheduling backstop like this for the CC, even with your stuff, for cases where the CC goes berserk.
(Reporter)

Comment 7

7 years ago
I'll try tweaking this and landing it, so we can maybe get it put on Aurora.
(In reply to Andrew McCreight [:mccr8] from comment #3)
> One possibility for an additional pressure valve here would be to use the
> minimum time if the CC actually collected a bunch of stuff (like, say, more
> than 10,000 things).

This definitely sounds important to me. If a significant amount is being actually collected, then collection should arguably be done more frequently in order to reduce the maximum pause time and keep the browser responsive.
That is very much what 3.x did. Collection happened more often when the user was inactive or
when plenty of garbage had been collected the previous time.
(Reporter)

Comment 10

7 years ago
P2 because this could be helpful in pathological situations.
Whiteboard: [Snappy] → [Snappy:P2]
(Reporter)

Comment 11

7 years ago
Posted patch CC less often (obsolete) — Splinter Review
This alters how often we CC by adjusting how many forget skippables we do before we check the size of the purple buffer.  The minimum, 15, keeps the current behavior, which is checking every 6 seconds.  The maximum, 50, checks every 20 seconds.

I use the somewhat arbitrary multiplier of 20 times the length of the previous CC to get when we'll check the CC the next time.  If a CC took 300ms, we get the minimum of 6 seconds.  If the CC took 500ms, it will be at least 10 seconds before we run the CC next.  If the CC took 1 second or more, we will check again in 20 seconds.

The throttling behavior is disabled if we collect at least 1000 objects.  This is to try to avoid a death spiral where we delay the CC longer, which causes a longer CC, etc.
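
The rule in this comment could be sketched roughly as follows.  This is not the attached patch: the names are hypothetical, and the 400ms spacing between forget skippables is an assumption inferred from "15 checks = 6 seconds".

```python
# Hypothetical names.  The 400 ms spacing is an assumption (15 x 400 ms = 6 s).
FORGET_SKIPPABLE_INTERVAL_MS = 400
MIN_FORGET_SKIPPABLES = 15       # current behavior: check every 6 seconds
MAX_FORGET_SKIPPABLES = 50       # most throttled: check every 20 seconds
CC_TIME_MULTIPLIER = 20          # check again after 20 x the last CC time
COLLECTED_THRESHOLD = 1000       # a productive CC disables throttling


def forget_skippables_before_check(last_cc_ms: int, collected: int) -> int:
    """Number of forget skippables to run before checking the purple buffer."""
    if collected >= COLLECTED_THRESHOLD:
        # The CC freed a lot, so do not delay it (avoids the death spiral).
        return MIN_FORGET_SKIPPABLES
    wanted_delay_ms = CC_TIME_MULTIPLIER * last_cc_ms
    count = wanted_delay_ms // FORGET_SKIPPABLE_INTERVAL_MS
    return min(MAX_FORGET_SKIPPABLES, max(MIN_FORGET_SKIPPABLES, count))
```

Under these assumptions a 300ms CC gives 15 (6 seconds), a 500ms CC gives 25 (10 seconds), and anything of 1 second or more gives 50 (20 seconds), matching the figures above.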

If you look at CC telemetry for the last week, CCs over 633ms are extremely rare.  Rare enough that you can't tell how big the bars are, but each happens less than 0.107% of the time.
Attachment #590080 - Attachment is obsolete: true
(Reporter)

Comment 12

7 years ago
Another factor to consider would be adjusting how often we do forget skippable.  In theory we could end up spending a lot of our time on that.  I think forget skippable is generally less vulnerable than the CC to redoing pointless work over and over again.  The main problem is that if we GC, the next forget skippable will take longer.
Comment on attachment 600085
CC less often

Could we just increase the sCCTimer value, basically NS_CC_SKIPPABLE_DELAY,
when the CC is slow? That would also increase the time between
forgetSkippables.
(Reporter)

Comment 14

7 years ago
This new patch increases the time between slices of CC work when the CC is slow.  The min time between them is 400ms, which is the current setting.  The max is 800ms, which is fairly conservative.  That will mean we do a CC every 12 seconds instead of every 6 seconds.

To determine the length of time between slices, we take the CC time and multiply it by 0.8.  Any CC of 500ms or shorter will end up with the minimum delay.  Multiply this by 15 to get the time between CCs.

500ms or less: 6 seconds
800ms: 9.6 seconds
1000ms or more: 12 seconds

If we collected at least 1000 objects during the CC, we use the minimum time, no matter how long the last CC took.

This is not ideal, because there could be a half million things in the CC graph, and maybe in that case freeing 1000 objects isn't really that productive.
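
For reference, the arithmetic in this comment can be reproduced with a small sketch.  The names are hypothetical and this is not the patch itself; the constants are the ones quoted above, and integer math (x 4 / 5) stands in for the 0.8 multiplier.

```python
# Hypothetical names; constants are the ones quoted in the comment.
MIN_SLICE_DELAY_MS = 400      # current setting
MAX_SLICE_DELAY_MS = 800      # conservative upper bound
SLICES_PER_CC = 15            # 15 slices between CCs
COLLECTED_THRESHOLD = 1000    # a productive CC resets to the minimum delay


def slice_delay_ms(last_cc_ms: int, collected: int) -> int:
    """Delay between slices of CC work, in milliseconds."""
    if collected >= COLLECTED_THRESHOLD:
        return MIN_SLICE_DELAY_MS
    scaled = last_cc_ms * 4 // 5  # 0.8 x the last CC time, in integer math
    return min(MAX_SLICE_DELAY_MS, max(MIN_SLICE_DELAY_MS, scaled))


def cc_interval_ms(last_cc_ms: int, collected: int) -> int:
    """Resulting time between CCs, in milliseconds."""
    return SLICES_PER_CC * slice_delay_ms(last_cc_ms, collected)
```

This gives 6000ms for a CC of 500ms or less, 9600ms for an 800ms CC, and 12000ms for 1000ms or more, the same values as the table above.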
Attachment #600085 - Attachment is obsolete: true
No longer blocks: 698919
Blocks: 698919
(Reporter)

Updated

4 years ago
Assignee: continuation → nobody
(Reporter)

Comment 15

3 years ago
This doesn't seem worth the time to implement.
Status: NEW → RESOLVED
Last Resolved: 3 years ago
Resolution: --- → WONTFIX