Open Bug 906827 Opened 6 years ago Updated 2 years ago

Implement a privileged API to delay any potential GC

Categories

(Core :: JavaScript Engine, defect)

defect
Not set

Tracking

()

People

(Reporter: jaws, Unassigned)

References

(Blocks 1 open bug)

Details

There are times within the Firefox front-end where performance of our animations are highly sensitive. One example of this is the tab-opening/tab-closing animation.

Every so often while opening or closing a tab, a GC can occur which can cause the animation to stutter.

Even with IGC and GGC, we will still likely have single digit millisecond pauses.

One way to solve this is to introduce a way for privileged code to delay any potential garbage collections, at the expense of higher memory use (of course).

I imagine the API could work like so:

1)
.. a new tab is about to be opened
.. .. GC.delayGC(250)
.. .. new tab begins opening
.. .. new tab finishes opening
.. GC

2)
.. a new tab is about to be opened
.. .. GC.delayGC(250)
.. .. new tab A begins opening
.. .. new tab A finishes opening
.. a new tab is about to be opened
.. .. GC.delayGC(250)
.. .. new tab B begins opening
.. .. new tab B finishes opening
.. GC

In the case of (2), the delay is not the sum of the two calls (500), but instead occurs 250ms after the *last* delayGC function call.

The argument to delayGC is the number of milliseconds to delay any potential GCs. It is possible that after a call to delayGC with an argument of 100ms, it may still take multiple seconds for a GC to occur based on the current memory state of the browser. The point of this method is to try to guarantee that a GC will not occur in the next N milliseconds.

I believe a delay like this is much better than shutting down + restarting GC because we need to guarantee that GC operations will resume in the future.
Seems like a slippery slope. People will be tempted to resolve all GC pauses by inserting delayGC calls. It also has a magic number (250ms) that should vary by platform and is really yet another tuning parameter. And actually, isn't this specific use problematic already? If you did something like "open bookmark folder as tabs", couldn't this starve out the GC for a very long time for stuff that's going to be allocating memory?

In your use case, it seems like it wouldn't be all that bad to have a GC before the animation starts. So what about something like adviseGC(JS::GC::ANIMATION_STARTING), which in this case would check for whether we need to GC, but using a lower threshold than usual? (So if we have plenty of memory left, this is a nop. But if it looks like we might need to GC soon, then do it now instead of during the animation.)
The "250ms" in Jared's example was how long our animations (are supposed to) last.

I think an API as proposed in comment 0 is too easy to abuse, but some form of input on GC scheduling decisions is certainly worth looking into.

We haven't yet measured the impact of GC on animations very well, AIUI, so it's probably premature to expend a lot of effort trying to come up with a solution here.
I vaguely remember hearing that we did do (or attempt to do) something to avoid doing GC/CC during the tabclose animation (I remember this being a big Snappy item).  Anyone know more?  If so, perhaps it's just not working right, or it broke, and we shold fix it.
A GC is scheduled for a few seconds after a tab close, but I don't know offhand of any mechanism for squelching GCs that are already on the way (by having a timer ticking down).

It doesn't seem like it would be that hard to add a mechanism to trigger our timer-based GCs down the road a bit, with some kind of hard lockouts.  That wouldn't catch everything, but we probably don't want to squelch GCs generated internally by the JS engine, at least for now.

Jared, is there any particular reason that is consistently triggering these GCs that are interrupting the animation?  The error console or whatever should log the reason for the GC if you have javascript.options.mem.log set.
Curious, have you profiled it is actually GC or CC causing the pauses? Since there are plenty of
other things which may cause some jank.

Anyhow, comment 1 sounds rather good to me.



(If I find some spare time, I may implement generic GC/CC scheduling timeline, which would hopefully
simplify the current scheduling setup quite a big)
Generational GC's are on the order 100us, not 10ms, so GGC will help tremendously here. 

Regardless, I do strongly agree that we should do more to ensure that the GC never runs at a time when it is especially likely to be noticed by the user. The optimal solution, of course, is to wait for the user to blink before starting a GC.

Given that there are privacy issues with turning on all of our user's cameras and running facial recognition on them -- I believe this is actually quite feasible these days -- we are stuck looking at proxies for user attention, like animation. I would actually be in favor of something more aggressive and less fragile than the proposed solution: instrument the browser's core animation and event loops and tune the GC differently for animations, active user interaction, and idle times.

That would be a much bigger project than what you are requesting, however. For the near term, if the frontend knows of some specific animations that are extremely noticeable when choppy, we should definitely add more explicit GC tunings to solve those cases asap.
(In reply to Steve Fink [:sfink] from comment #1)
> Seems like a slippery slope. People will be tempted to resolve all GC pauses
> by inserting delayGC calls.

Yes, I agree it is a slippery slope. Although usages of this could hopefully be criticized greatly to make sure that there are no unnecessary usages.

> In your use case, it seems like it wouldn't be all that bad to have a GC
> before the animation starts. So what about something like
> adviseGC(JS::GC::ANIMATION_STARTING), which in this case would check for
> whether we need to GC, but using a lower threshold than usual? (So if we
> have plenty of memory left, this is a nop. But if it looks like we might
> need to GC soon, then do it now instead of during the animation.)

I imagine that this would actually have the opposite effect that we want. It may bring us closer to a smooth animation, but could potentially delay the animation by multiple frames introducing a separate variant of jank.
While this bug could still be relevant, I think that the trigger which lead to opening it was "sometimes the ~10th frame during tab-close animation with Australis is taking longer than the test", which was suspected as GC/CC.

mconley tracked this issue down to full tabstrip repaint around that 10th frame (only happens with Australis), which is apparently caused by specific CSS setup. See bug 908796.
(In reply to Avi Halachmi (:avih) from comment #8)
> While this bug could still be relevant, I think that the trigger which lead
> to opening it was "sometimes the ~10th frame during tab-close animation with
> Australis is taking longer than the test", which was suspected as GC/CC.

Not actually the trigger, more of just a precautionary measure as we start placing more attention on improving tabstrip performance.
Assignee: general → nobody
You need to log in before you can comment on or make changes to this bug.