Closed Bug 1540906 Opened 5 years ago Closed 5 months ago

Synchronizing transform animations with geometric animations on other elements is causing performance issues

Categories

(Core :: DOM: Animation, defect, P1)

defect

Tracking

()

RESOLVED FIXED
121 Branch
Tracking Status
firefox68 --- wontfix
firefox121 --- fixed

People

(Reporter: birtles, Assigned: boris)

References

(Depends on 1 open bug, Blocks 4 open bugs)

Details

(Whiteboard: [layout:backlog:79])

Attachments

(2 files)

In bug 1301305 we introduced an optimization to ensure that when the author is animating 'transform' and 'margin' etc. at the same time we run them both on the main thread. This improved the visual appearance of trello.com

Unfortunately, it has caused performance issues for other content (see the list of bugs this one blocks). In bug 1502026 we tweaked this heuristic further to distinguish between transitions and animations and that helped performance on the iPhone page but there are still many other performance bug reports that are at least in part due to this optimization.

We should investigate some other way to fix the trello.com case, or see if it even still requires this optimization.

As well as ensuring we don't regress bug 1301305 too much, we should also check bug 1240632 comment 3 which provides STR for content that was fixed by bug 1301305.

That said, we might also want to extend the heuristic from bug 1301305 to include 'perspective' in order to fix bug 1330627 (could we do it only when the perspective animation is on an ancestor of the transform animation, perhaps?).

Whiteboard: [layout:backlog:perf]
Whiteboard: [layout:backlog:perf] → [layout:backlog:perf][layout:backlog:2020]
No longer blocks: 1512768
See Also: → 1619465

Hi Brian, Just wonder do you have any initial thoughts on this bug? In order to not regress bug 1301305 too much, it seems we may have to introduce some other ways to sync geometric animations with transform animations (i.e. not just fallback to main thread transform animations for this case). Perhaps I have to check the patches of bug 1301305 first.

Flags: needinfo?(brian)

I'm afraid I don't have any good ideas.

I would probably start by revisiting bug 1301305 to see if it's still an issue. If it's not, we could try turning off the pref that Hiro added for this (dom.animations.mainthread-synchronization-with-geometric-animations) in Nightly and see what happens. We'll want to check a number of configurations however (esp. Windows WebRender, Windows non-WebRender).

If it is still an issue, it would be good to get a profile and see why the perf difference is so big between Firefox and Chrome when that pref is disabled and if there's anything we can do to fix the underlying issue.

If fixing the underlying issue proves too difficult, then we have to think a bit more creatively:

  • Does it make sense to only synchronize margin animations (what Trello uses) but not left/top/bottom/right? At least when the latter are abpos? Would that fix some of the cases negatively affected by this heuristic?
  • Are there some margin animations we can emulate on the compositor using transforms? (Seems unlikely and very very difficult)
  • Is there some other heuristic we can apply to determine whether the animations should be synchronized that would fix trello but not affect other animations where this produces worse performance? Distance in the tree? Distance on the screen?

Obviously long-term we want to get more and more animations on the compositor so that this becomes less of an issue, but getting margin animations on the compositor seems a long way off.

Flags: needinfo?(brian)

(In reply to Brian Birtles (:birtles) from comment #4)

  • Is there some other heuristic we can apply to determine whether the animations should be synchronized that would fix trello but not affect other animations where this produces worse performance? Distance in the tree? Distance on the screen?

Regarding tweaking the heuristics, note that there is also bug 1503741 which might fix some cases.

Perhaps another concrete step is to go through the linked bug reports and list up what properties are being animated and any other interesting properties of the animations (Is either animation infinitely repeating? Are the elements ancestor/descendant? Are they close in the DOM? Close on the page? Do the animations have the same duration? etc. etc.) That might give some hints as to useful tweaks to the heuristic.

Whiteboard: [layout:backlog:perf][layout:backlog:2020] → [layout:backlog:79]

(In reply to Brian Birtles (:birtles) from comment #4)

I'm afraid I don't have any good ideas.

I would probably start by revisiting bug 1301305 to see if it's still an issue. If it's not, we could try turning off the pref that Hiro added for this (dom.animations.mainthread-synchronization-with-geometric-animations) in Nightly and see what happens. We'll want to check a number of configurations however (esp. Windows WebRender, Windows non-WebRender).

I was trying to disable dom.animations.mainthread-synchronization-with-geometric-animations to see the case in https://trello.com/b/k1vqA7Kk/firefox-layout-project-tracking. Before disabling this pref, I didn't see this "Menu slides closed and viewport expansion follows with a delay, so that background is seen first and then lists appear where the menu was." However, after disabling this pref, yes, I notice the background is seen first. But, I compare this with Chrome (84.0.4124.1), and I also notice the background is seen first on Chrome, on my mac... Does this mean this is still an issue on Chrome? However, the delay in Chrome is shorter.

And of course, I feel the animation of closing the menu is slower after enabling the pref.

(In reply to Brian Birtles (:birtles) from comment #4)

If it is still an issue, it would be good to get a profile and see why the perf difference is so big between Firefox and Chrome when that pref is disabled and if there's anything we can do to fix the underlying issue.

Try to dump to Gecko profiler: https://perfht.ml/2VRPRIG after disabling the pref.

(In reply to Boris Chiou [:boris] from comment #7)

(In reply to Brian Birtles (:birtles) from comment #4)

If it is still an issue, it would be good to get a profile and see why the perf difference is so big between Firefox and Chrome when that pref is disabled and if there's anything we can do to fix the underlying issue.

Try to dump to Gecko profiler: https://perfht.ml/2VRPRIG after disabling the perf.

You should get the profile on release-optimized build. I was wondering why some styling takes 4 - 5ms, it looks super slow.

(In reply to Hiroyuki Ikezoe (:hiro) from comment #8)

(In reply to Boris Chiou [:boris] from comment #7)

(In reply to Brian Birtles (:birtles) from comment #4)

If it is still an issue, it would be good to get a profile and see why the perf difference is so big between Firefox and Chrome when that pref is disabled and if there's anything we can do to fix the underlying issue.

Try to dump to Gecko profiler: https://perfht.ml/2VRPRIG after disabling the perf.

You should get the profile on release-optimized build. I was wondering why some styling takes 4 - 5ms, it looks super slow.

Oh. Right. I used the nightly debug version. I will try to re-profile it. Thanks.

The profile on release-nightly build: https://perfht.ml/2W52sJV (without webrender). Looks much better. On my mac, the issue of "The Menu slides closed and viewport expansion follows with a delay, so that background is seen first and then lists appear where the menu was" is not very obvious, compared to the same website on Chrome, I think. Perhaps we could just disable this pref on nightly (but keep enabling it on release and beta)?

With web-render: https://perfht.ml/2YAI8Sk
Without web-render: https://perfht.ml/2W52sJV

That's very encouraging! It would be great to try on Windows too if you have access to a Windows machine.

What's the spec of your mac? I can see some refresh driver ticks take 13ms - 16ms on both profiles, it's worrisome.

(In reply to Hiroyuki Ikezoe (:hiro) from comment #12)

What's the spec of your mac? I can see some refresh driver ticks take 13ms - 16ms on both profiles, it's worrisome.

Perhaps I'm too optimistic. The spec is 15-inch 2017; 3.1 GHz Quad-Core Intel Core i7; 16 GB 2133 MHz LPDDR3; Intel HD Graphics 630 1536 MB.

Yes. I notice it takes 13ms~16ms, for PaintFrame and BuildDisplayList. :(

(In reply to Brian Birtles (:birtles) from comment #11)

That's very encouraging! It would be great to try on Windows too if you have access to a Windows machine.

OK. I will try on Windows later, on my old desktop.

Then we should look into the profile in detail.
this is the first tick took 14ms, I can see there are two hotspot in terms of animations, ComposeSortedEffects and HasAnimationsForCompositor. Can you please take another profile with more fine-grained sampling interval?

(In reply to Hiroyuki Ikezoe (:hiro) from comment #15)

Then we should look into the profile in detail.
this is the first tick took 14ms, I can see there are two hotspot in terms of animations, ComposeSortedEffects and HasAnimationsForCompositor. Can you please take another profile with more fine-grained sampling interval?

Mac, interval: 0.1ms
With web-render: https://perfht.ml/2yxsIn6
Without web-render: https://perfht.ml/2WrGu2w

Windows 10 64bits (CPU: intel i5-2400 @3.1Ghz, RAM: 24GB DDR3 1333): interval: 1ms
With web-render: https://perfht.ml/2SIMu60
Without web-render: https://perfht.ml/2YRXYZ5

I don't know why my nightly on windows doesn't allow me to change the interval to a smaller value. (It said it cannot access property "version", e.meta is undefined)

I think both window and mac have similar performance in this case; In other words: "The Menu slides closed and viewport expansion follows with a delay, so that background is seen first and then lists appear where the menu was" is not very obvious, compared to that on Chrome.

However, it's worth to check what happens in the functions Hiro mentioned.

(In reply to Boris Chiou [:boris] from comment #16)

I think both window and mac have similar performance in this case; In other words: "The Menu slides closed and viewport expansion follows with a delay, so that background is seen first and then lists appear where the menu was" is not very obvious, compared to that on Chrome.

Does that mean, even with that pref off, you see better performance in Firefox than Chrome? For both Windows and Mac? With and without Web render?

(If so, that sounds like a good case for running a trial with that pref off!)

(In reply to Brian Birtles (:birtles) from comment #17)

(In reply to Boris Chiou [:boris] from comment #16)

I think both window and mac have similar performance in this case; In other words: "The Menu slides closed and viewport expansion follows with a delay, so that background is seen first and then lists appear where the menu was" is not very obvious, compared to that on Chrome.

Does that mean, even with that pref off, you see better performance in Firefox than Chrome? For both Windows and Mac? With and without Web render?

At least not worse than Chrome Canary. Almost the same or a little bit better on the website, with and without web-render. (I didn't see much difference after enabling the web-render.)

(If so, that sounds like a good case for running a trial with that pref off!)

Yes, we could make this pref off on nightly for trial.

(In reply to Brian Birtles (:birtles) from comment #17)

(In reply to Boris Chiou [:boris] from comment #16)

I think both window and mac have similar performance in this case; In other words: "The Menu slides closed and viewport expansion follows with a delay, so that background is seen first and then lists appear where the menu was" is not very obvious, compared to that on Chrome.

Does that mean, even with that pref off, you see better performance in Firefox than Chrome? For both Windows and Mac? With and without Web render?

(If so, that sounds like a good case for running a trial with that pref off!)

\o/

One thing I noticed is that it looks like the site does listen mouseout and mouseover events, (the callback for mouseout was handled first, then the callback for mouseover was handled), and it seems like the script in the mouseout callback triggers another flush styles/reflow. If it's the same characteristic on Chrome, maybe the site should stop doing the cause of the trigger, maybe. I don't know why they do it though.

Also I saw some flushes for throttled animations in the profile, once after we stop it (bug 1632957), the performance will be better. Actually the flushes cause additional reflows but I believe those reflows for updating scrollbar's positions and as far as I know Chrome doesn't update scrollbars at all during transform animations run on the compositor.

Note that I can no longer see the two functions related to animations in the new profiles.

Indeed, on my super-fast Ryzen 3950x machine, Chromes takes 20ms - 60ms in each frame. It's hard to believe though. :/ I can actually see two Layout stuff in a single frame.

The original site issue seems not obvious on nightly now, so perhaps we
could give this a trial to disable this pref, for the better performance
in other cases.

Assignee: nobody → boris.chiou
Status: NEW → ASSIGNED
Keywords: leave-open
Blocks: 1637075
Pushed by bchiou@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/614e528541e6
Disable the pref of synchronizing transform animations with geometric animations on nightly. r=hiro

Now that we triage by severity, setting priority to P1 to reflect backlog prioritization on this bug as either in-progress, or planned development in the near term. See https://wiki.mozilla.org/Platform/Layout#Backlog_Tracking_in_Bugzilla

Priority: P3 → P1

Now its Firefox 79. Perhaps we could enable this for beta and release channels from now? Any concerns, hiro?

Flags: needinfo?(hikezoe.birchill)

Yep, I have. As I mentioned in a review comment, we should have a talos test to be able to catch regressions that geometric and transform animations get jaggy due to future changes.

Flags: needinfo?(hikezoe.birchill)
No longer blocks: 1637075
Depends on: 1637075

The leave-open keyword is there and there is no activity for 6 months.
:boris, maybe it's time to close this bug?

Flags: needinfo?(boris.chiou)

Need to keep open this until we can ship this. I should remove leave-open.

Severity: normal → S3
Flags: needinfo?(boris.chiou)
Keywords: leave-open
Blocks: 1747345

Hi Boris, given we have this pref turned off in Nightly for over a year now, do you think we should ship this?

Flags: needinfo?(boris.chiou)

Yes, we should ship this. However, I have no idea to write a suitable talos test for this when working on this, so I left this open.

Perhaps we could write a test which uses transform and margin animations, and the reference file uses the transform animation only. To make sure they have similar performance.

Flags: needinfo?(boris.chiou)

Will try to write the pref test if possible.

See Also: → 1760795

This has been enabled for four years now on nightly, with no regression.
Since we know performance is better without it, I think we're better off
doing this.

Pushed by ealvarez@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/2b0cdf804685
Let de-synchronization of transform and geometric animations ride the trains. r=firefox-animation-reviewers,boris
Status: ASSIGNED → RESOLVED
Closed: 5 months ago
Resolution: --- → FIXED
Target Milestone: --- → 121 Branch
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: