Bug 911499 (Open): Opened 6 years ago, Updated 4 years ago

inefficient animations on http://www.mozilla.org/en-US/firefox/os/

Categories: Core :: Graphics: Layers (defect)
Platform: x86, macOS
People: Reporter: gal; Unassigned

This might be the cause:

// Sub-pixel alignment is hard, lets punt on that.
if (state.mAnchor != nsPoint(0.0f, 0.0f)) {
  return false;
}

Also, layout.animated-image-layers.enabled must be set to true, and it's false by default right now I think. I will attach a test case next.
I assume we're looking at http://www.mozilla.org/en-US/firefox/os/.

I assume the performance of that page in general is platform-specific. On my Linux laptop, where GL is disabled by default, it's extremely smooth, even when I zoom in so all images are scaled (as they would be on Mac Retina). That may be XRender-accelerated image compositing saving the day (for once).

>    I think this is what's killing us:
>
>     @-moz-keyframes keep-scrolling {
>        from { background-position: 0 0; }
>        to { background-position: 0 -875px; }
>      }

Sounds plausible. This could cause us to redraw many large scaled images on every paint. On Mac that image scaling will be happening on the CPU.
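As a rough, hypothetical cost model (the numbers and function are invented for illustration, not taken from Gecko), animating background-position forces the painted area to be invalidated every frame, while a compositor-driven move pays the CPU paint cost only once:

```javascript
// Hypothetical cost model: CPU pixels painted over one animation,
// comparing a background-position animation (invalidates the painted
// area every frame) with a compositor-driven move (one initial paint,
// then the GPU repositions the layer).
function cpuPixelsRepainted(width, height, frames, compositorDriven) {
  const area = width * height;
  if (compositorDriven) return area; // one paint to build the layer
  return area * frames;              // repaint the area on every frame
}

// A 875px-tall background animated for 60 frames, as in the keyframes above.
const perFrame = cpuPixelsRepainted(1000, 875, 60, false);
const oneShot = cpuPixelsRepainted(1000, 875, 60, true);
console.log(perFrame / oneShot); // 60x more CPU painting
```

On a Retina display the areas are 4x larger again, which matches the observation that the page is smooth on non-Retina machines.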

I don't know that animating background-position is common; it's the first use of it I remember seeing.

(In reply to Andreas Gal :gal from comment #0)
> // Sub-pixel alignment is hard, lets punt on that.
> if (state.mAnchor != nsPoint(0.0f, 0.0f)) {
>   return false;
> }

We could potentially ignore this in the presence of animation, but in static cases things will break (e.g. with visible seams) if we ignore it. Getting image snapping correctly plumbed through the layer system would be hard.

> Also, layout.animated-image-layers.enabled must be set to true, and it's
> false by default right now I think. I will attach a test case next.

No, you'd need layout.gpu-image-scaling.enabled to be set.
roc, can we check whether the anchor is on a pixel boundary instead of requiring it to be 0? I think that would do the trick.
It's not that simple because of image scaling. Also, animation doesn't necessarily snap to pixels so I don't think that would help performance much.
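A toy check (not Gecko code; the function is invented for illustration) of why a pixel-boundary test on the anchor alone is insufficient once scaling is involved: an anchor on a CSS-pixel boundary can land between device pixels after scaling, and vice versa.

```javascript
// Sketch: does an anchor point land on a device-pixel boundary after
// scaling? A whole-numbered CSS-pixel anchor is not enough to guarantee
// this, which is why the simple boundary check doesn't cover scaling.
function isPixelAligned(anchorCss, scale) {
  const devicePixels = anchorCss * scale;
  return Number.isInteger(devicePixels);
}

console.log(isPixelAligned(3, 1));   // true: aligned at 1x
console.log(isPixelAligned(3, 1.1)); // false: 3.3 device pixels
console.log(isPixelAligned(0.5, 2)); // true: aligned at 2x despite 0.5
```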

I'd like to dig into this page. I actually don't understand how animation is being used here, since everything should be driven by the current scroll position.
(In reply to Robert O'Callahan (:roc) (Mozilla Corporation) from comment #1)
> I assume the performance of that page in general is platform-specific. On my
> Linux laptop, where GL is disabled by default, it's extremely smooth, even
> when I zoom in so all images are scaled (as they would be on Mac Retina).
> That may be Xrender accelerated image compositing saving the day (for once).

I tried it on the office Mac Mini (non-Retina display, but with zooming to ensure the images are scaled) and it still seemed pretty smooth. I guess pushing 4x the pixels with a Retina display must be the problem.

I'll look into what it would take to get better layerization here anyway.
I just had a look on a Retina Macbook Pro. Scrolling/animations are a little janky in a couple of places, but it's mostly smooth. Definitely not "2-3 FPS".
I see low FPS for the content inside the phone scrolling left and right, and when you scroll down there are 6 images that move as you scroll. Those repaint at every scroll tick; enabling paint flashing shows this well. In any case, it's not smooth, and we don't layerize here when we clearly should (by changing the site or Gecko).
Gnome-shell's shift-ctrl-alt-R screencast-recording tool, plus gmplayer's frame-by-frame control, is incredibly useful for debugging here...

One problem that's quite interesting and should be fixable without too much trouble or difficult tradeoffs is excessive repaints when the fuzzy background image changes --- one image fades out and the other fades in. This seems to be done by putting the new image under the old image and then fading the opacity of the old image from 1 to 0. We should be able to do this with just one ThebesLayer repaint --- at the beginning, when we move the old image to its own layer --- but instead we always take 4. The nsDisplayOpacity isn't treated as active, at least, not early enough. Part of the fix is to be more aggressive about layerizing content where we know there's an animation/transition on the content but the content hasn't started transitioning yet --- an extension of what we already do for OMTA. But that isn't enough, I'm still working on it.

Repainting the images scrolling inside the phone is also a problem, but probably less of a problem since that area of the page is really small.
The fading-in of the big background images is actually done using jQuery scripted animation. Here's the relevant JS source from https://github.com/mozilla/bedrock/blob/master/media/js/firefox/os/desktop.js#L146:

    $('.adaptive-bg[data-current="1"]').attr('data-current', 0).css('opacity', 1).addClass('top');

    // set new bg to current, fully visible, and put on bottom of stack
    $('#adaptive-bg-' + bg).attr('data-current', 1).css('opacity', 1).removeClass('top');

    // fade out non-current bg
    $('.adaptive-bg[data-current="0"]').animate({
      'opacity': 0
    }, 350);

In the steady state all but the current image are hidden (via opacity:0), the current image has opacity:1, and the previously selected image is on top of the z-order. To switch images we set the new image to opacity 1, put the old image on top of the z-order, and fade it to opacity 0 over 350ms. The next time we draw, the jQuery animation hasn't started yet (it uses timers, not requestAnimationFrame --- see http://bugs.jquery.com/ticket/9381, http://forum.jquery.com/topic/please-don-t-remove-requestanimationframe :-( ). Unfortunately, although nothing has really changed in this frame in terms of what's rendered, DLBI repaints the affected area of the ThebesLayer containing the backgrounds, because
-- We changed the opacity of the new image from 0 to 1.
-- We changed the z-index of the new and old images.
DLBI does not rely on the display list order to find z-order changes. Instead it ignores changes to z-order and we rely on explicit invalidation when there are changes to 'z-index' or when an element is removed from and reinserted into the DOM (that causes frame reconstruction).

The next thing that happens is that jQuery's animation timer kicks in and sets the old image to some partial opacity. This moves the old image to its own layer --- an inactive layer, since we intend to wait for the *second* scripted opacity change before assuming opacity is animated --- a recent change, see bug 894773. So the area has to be repainted in the retained ThebesLayer.

Next, jQuery's animation timer runs again and updates the partial opacity. Unfortunately this still doesn't trigger our active-layer heuristic because of a bug in the patch for bug 894773. So we update the opacity of the inactive layer, repainting the retained ThebesLayer again.

Finally the animation finishes. The opacity:0 element is removed. The disappearance of the inactive layer means we have to repaint the retained ThebesLayer one more time.
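The four repaints above can be modeled with a toy simulation (hypothetical, not real FrameLayerBuilder logic; all names are invented). Flipping the flag shows how actually activating the layer on the second opacity change would cut the count:

```javascript
// Toy model of the repaint sequence described above. The heuristic waits
// for a *second* scripted opacity change before treating opacity as
// animated (bug 894773); with the bug, the layer never becomes active,
// so every step dirties the retained ThebesLayer.
function countThebesRepaints({ activateOnSecondChange }) {
  let repaints = 0;
  let opacityChanges = 0;
  let layerActive = false;

  const scriptedOpacityChange = () => {
    opacityChanges += 1;
    if (activateOnSecondChange && opacityChanges >= 2) layerActive = true;
  };

  repaints += 1;                    // frame 1: z-order + opacity 0 -> 1
  scriptedOpacityChange();          // frame 2: first partial opacity
  if (!layerActive) repaints += 1;  //   image moves to an inactive layer
  scriptedOpacityChange();          // frame 3: second partial opacity
  if (!layerActive) repaints += 1;  //   still inactive -> repaint again
  if (!layerActive) repaints += 1;  // end: inactive layer disappears
  return repaints;
}

console.log(countThebesRepaints({ activateOnSecondChange: false })); // 4
console.log(countThebesRepaints({ activateOnSecondChange: true }));  // 2
```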
I guess I've hijacked this bug. Sorry!

(In reply to Robert O'Callahan (:roc) (Mozilla Corporation) from comment #8)
> DLBI does not rely on the display list order to find z-order changes.
> Instead it ignores changes to z-order and we rely on explicit invalidation
> when there are changes to 'z-index' or when an element is removed from and
> reinserted into the DOM (that causes frame reconstruction).

We could fix this, probably, but it would make DLBI even slower and more complicated and I don't want to go that way. I suspect that DOM/CSS changes to z-ordering that don't end up actually needing repainting are quite rare, and thus doing a bunch of work to detect them would not be a win.

> The next thing that happens is that jQuery's animation timer kicks in and
> sets the old image to some partial opacity. This moves the old image to its
> own layer --- an inactive layer, since we intend to wait for the *second*
> scripted opacity change before assuming opacity is animated --- a recent
> change, see bug 894773. So the area has to be repainted in the retained
> ThebesLayer.

There are a few things that would help here:
1) Web developers should be using CSS transitions and animations instead of jQuery's scripted animations, wherever possible. In this case, it would be easy, and the animation of 'opacity' would be shunted off to the OMTC compositor and we'll be golden.
2) jQuery should use requestAnimationFrame. In this case, that would mean the first repaint after the initial DOM changes would always also see an opacity change, so we'd get onto the "opacity changing" path quicker and coalesce away one frame of repainting.

> Next, jQuery's animation timer runs again and updates the partial opacity.
> Unfortunately this still doesn't trigger our active-layer heuristic because
> of a bug in the patch for bug 894773. So we update the opacity of the
> inactive layer, repainting the retained ThebesLayer again.

3) Obviously we should fix that bug immediately.
4) But we can do better. When an opacity change occurs in a setTimeout or requestAnimationFrame callback (as opposed to, say, an input event handler), we should immediately assume animation and make the layer active like we did before bug 894773.
5) We also need to make sure that when the image is put in an active layer, it gets optimized to an ImageLayer. Because these images use "contain", that requires the patch in bug 906462.

Taking the page as-is, and making the above fixes, I think we can get the background image switch down to two repaints of the ThebesLayer: the initial repaint due to z-order and opacity changes (which isn't actually needed), and a second repaint where the fading-out image is moved to an ImageLayer.
Assignee: nobody → roc
Summary: animating background-position causes invalidation → inefficient animations on http://www.mozilla.org/en-US/firefox/os/
(In reply to Robert O'Callahan (:roc) (Mozilla Corporation) from comment #9)
> There are a few things that would help here:
> 1) Web developers should be using CSS transitions and animations instead of
> jQuery's scripted animations, wherever possible. In this case, it would be
> easy, and the animation of 'opacity' would be shunted off to the OMTC
> compositor and we'll be golden.
Can we "fix" jQuery's animations to use CSS animations under the hood?
> 2) jQuery should use requestAnimationFrame. In this case, that would mean
> the first repaint after the initial DOM changes would always also see an
> opacity change, so we'd get onto the "opacity changing" path quicker and
> coalesce away one frame of repainting.
Same thing, but it sounds like an even more straightforward fix.
> 1) Web developers should be using CSS transitions and animations instead of
> jQuery's scripted animations, wherever possible. 

Agreed. However, jQuery's animations can animate many more properties than CSS can, so there is a bit of a mismatch here. If people want to use CSS animations, they should define them in their stylesheets and use jQuery just to change the classes that control the animations. There are also plugins that attempt to untangle the CSS-able animations from the non-CSS-able ones; several are cataloged here: http://addyosmani.com/blog/css3transitions-jquery/

> 2) jQuery should use requestAnimationFrame.

We tried it, and it broke a LOT of code. For example, people would run a 2-second animation using jQuery and then repeat it every 10 seconds using a setTimeout. When the window is out of focus the animations don't happen, but the timeout continues to stack them every 10 seconds. When the window comes back into focus they all happen at once. See http://bugs.jquery.com/ticket/9381. If this sort of behavior is acceptable, or you can ensure it doesn't happen in your codebase, there is a plugin to use requestAnimationFrame in jQuery animations: https://github.com/gnarf/jquery-requestAnimationFrame
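That stacking failure can be sketched with a fake clock (a hypothetical model, not jQuery internals): setTimeout keeps queuing animations while the page is hidden, but rAF-driven frames only run while it is visible, so everything flushes at once on refocus.

```javascript
// Fake-clock model: an animation is (re)queued by setTimeout every 10s,
// but rAF-driven animation frames fire only while the page is visible.
// Returns the largest number of queued animations flushed in one frame.
function simulate({ hiddenFrom, hiddenUntil, totalSeconds }) {
  const pending = [];   // animations started but not yet played
  let playedAtOnce = 0; // worst-case pile-up seen
  for (let t = 0; t <= totalSeconds; t++) {
    if (t % 10 === 0) pending.push(t);          // setTimeout fires regardless
    const visible = t < hiddenFrom || t >= hiddenUntil;
    if (visible && pending.length) {            // rAF only runs when visible
      playedAtOnce = Math.max(playedAtOnce, pending.length);
      pending.length = 0;
    }
  }
  return playedAtOnce;
}

// Hidden for 60s with an animation queued every 10s: six animations all
// play at once when the window comes back into focus.
console.log(simulate({ hiddenFrom: 5, hiddenUntil: 65, totalSeconds: 70 })); // 6
```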

As with browser feature changes, it's often better to let users who need something "opt in" to getting it rather than changing the old behavior out from under existing code.
(In reply to Brad Lassey [:blassey] (use needinfo?) from comment #11)
> (In reply to Robert O'Callahan (:roc) (Mozilla Corporation) from comment #9)
> > There are a few things that would help here:
> > 1) Web developers should be using CSS transitions and animations instead of
> > jQuery's scripted animations, wherever possible. In this case, it would be
> > easy, and the animation of 'opacity' would be shunted off to the OMTC
> > compositor and we'll be golden.
> Can we "fix" jQuery's animations to use CSS animations under the hood?

That seems very hard.
(In reply to Dave Methvin from comment #12)
> > 2) jQuery should use requestAnimationFrame.
> 
> We tried it and the outcome showed us that it broke a LOT of code. For
> example people would do a 2-second animation using jQuery and then repeat it
> every 10 seconds using a setTimeout. When the window is out of focus the
> animations don't happen but the timeout continues to stack them every 10
> seconds. When the window comes back into focus they all happen at once. See
> http://bugs.jquery.com/ticket/9381. If this sort of behavior is acceptable
> or you can ensure it doesn't happen in your codebase, there is a plugin to
> use requestAnimationFrame in jQuery animations:
> https://github.com/gnarf/jquery-requestAnimationFrame

Yes, I saw that ticket.

> As with browser feature changes, it's often better to let users who need
> something "opt in" to getting it rather than changing the old behavior out
> from under existing code.

Unfortunately this means the majority of developers are getting a suboptimal experience because a few developers are using the existing API in pathological ways. This isn't really the right place to discuss this though. Where can I contribute my thoughts to the jQuery community? :-)
Depends on: 911674
I filed bug 913438 to remove the mAnchor check in nsDisplayBackgroundImage::TryOptimizeToImageLayer. With that, and the patches in bug 911889, the background image changes described in comment #8 are handled more or less optimally now:
-- In the first frame we hoist the current image to the top of the z-order and insert the next image underneath it. FrameLayerBuilder observes that in the layer containing these images, only the current image is now visible (before there was some other <a> element on top of it), and turns the whole thing into a single ImageLayer for the current image, throwing out the previous ThebesLayer. No ThebesLayer painting required.
-- In the second frame the current image has opacity < 1 set on it. A new ImageLayer is created for the current image, with a ContainerLayer parent carrying the opacity. A new ThebesLayer is created to hold the new image and the <a> element on top of it, and they're drawn into the ThebesLayer.
-- In subsequent frames the opacity of the current image decreases. No ThebesLayers need to be repainted, the opacity change is drawn by the compositor.
-- At the end, the opacity of the current image reaches 0. The ContainerLayer and ImageLayer for it are destroyed. No ThebesLayer updates occur.
So we reduce that stage of the page to just one sizeable repaint, which is about optimal.
Here's my analysis of the rest of the page:

-- Right to left scrolling of the screens in the phone: these are absolutely positioned with scripted animation of 'left'. Should be fixed by bug 876321. (background-position is present, but not animated; I think it doesn't matter here, it's just being used for spriting. Except I guess it will inhibit use of ImageLayers, but that's probably OK.)

-- In some places as we scroll down, a ThebesLayer gets scrolled out of view. This breaks our recycling logic in FrameLayerBuilder; since nothing uses this disappearing ThebesLayer, it gets recycled for the next set of display items, which basically all move from the ThebesLayer they were on to the previous ThebesLayer. This of course forces them all to be repainted. We should make ContainerState::CreateOrRecycleThebesLayer smarter to avoid that problem. I've filed bug 913443 for that.

-- Further down, various phone images and text sidebars move up and down and left and right. These are a mix of scripted animation of 'top'/'left' of abs-pos elements (bug 876321) and scripted animation of margin-left/margin-right. The latter could probably be fixed the same way we fix 876321. Filed bug 913444.

I think that's all.
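One possible shape for the smarter recycling in bug 913443 (purely a sketch; the function, data shapes, and matching rule are invented, not ContainerState::CreateOrRecycleThebesLayer itself): prefer recycling the retained layer that already held the same display items, instead of handing layers out strictly in order, so items stay on "their" layer and need no repaint.

```javascript
// Sketch: assign retained layers to groups of display items by best
// overlap with each layer's previous contents, falling back to any free
// layer (whose contents must then be repainted) or a new layer.
function assignLayers(retained, itemGroups) {
  // retained: [{ id, items: Set }], itemGroups: [Set of item keys]
  return itemGroups.map(group => {
    let best = null;
    let bestOverlap = 0;
    for (const layer of retained) {
      if (layer.taken) continue;
      let overlap = 0;
      for (const item of group) if (layer.items.has(item)) overlap++;
      if (overlap > bestOverlap) { best = layer; bestOverlap = overlap; }
    }
    if (best) { best.taken = true; return best.id; }
    const free = retained.find(l => !l.taken); // no match: recycle anything
    if (free) { free.taken = true; return free.id; }
    return 'new';
  });
}

// Layer B previously held items y and z; in-order recycling would hand
// out A first and force a repaint, but overlap matching keeps B.
const result = assignLayers(
  [{ id: 'A', items: new Set(['x']) }, { id: 'B', items: new Set(['y', 'z']) }],
  [new Set(['y', 'z'])]
);
console.log(result); // ['B']
```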
Depends on: 876321, 913443, 913444
Apple's new iPhone design websites have a similar user experience to the /firefox/os/ website and the paint flashing is not too bad unless you get to the horizontal scrolling sections:

http://www.apple.com/iphone-5c/design/

It is also choppy on Firefox compared to Chrome.
I have patches for everything mentioned above.

There are a few more invalidations as we scroll the page that I'm trying to track down.
There's some unnecessary invalidation going on because we've merged multiple nsDisplayFixedPositions into a single layer and then we delete the frame for one of the nsDisplayFixedPositions, which somehow causes all the other ones to forget what their layer is.
I think we need to fix the fundamentals of nsDisplayFixedPosition. See bug 919144.
Depends on: 919144
No longer depends on: 906462