Open Bug 1407536 Opened 4 years ago Updated 10 months ago

Performance issues (possibly CSS animation related) in Firefox Quantum beta on OS/X - Very high CPU usage on stripe.com - Perf.html attached

Categories

(Core :: Graphics, defect, P2)

57 Branch
x86
macOS
defect

Tracking Status
firefox-esr60 --- affected
firefox57 --- fix-optional
firefox60 --- wontfix
firefox61 --- wontfix
firefox62 --- wontfix
firefox63 --- wontfix
firefox64 --- affected
firefox65 --- affected

People

(Reporter: t.guichelaar, Unassigned)

References

(Depends on 1 open bug)

Details

(Keywords: perf, Whiteboard: [gfx-noted])

Attachments

(3 files)

User Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.12; rv:57.0) Gecko/20100101 Firefox/57.0
Build ID: 20171002181526

Steps to reproduce:

This is my first bug report so bear with me :-)

I installed the Firefox Quantum beta and cleared my profile.

Firefox never was very fast on my 15-inch Retina MacBook Pro, but it's gotten worse with Quantum.

I went to stripe.com as they are heavy on CSS animations / transitions. The CPU usage immediately skyrocketed.


Actual results:

CPU usage was very high, fans started spinning. Scrolling was very janky.


Expected results:

Buttery-smooth performance, like I get in Safari.
See Also: → 1404042
Keywords: perf
Uploaded that profile as https://perfht.ml/2kHLOPE fwiw.

In the #2 content process there's a bunch of PresShell::Paint off refresh ticks (both from DidComposite and vsync).  

The compositor is showing lots of stuff, but mostly it's waiting on things like GLContextCGL::SwapBuffers to do the swap.

Tinus, this was a clean profile so didn't have a non-default zoom level selected for stripe or anything like that, right?
Component: Untriaged → Graphics
Product: Firefox → Core
Whiteboard: [qf]
(In reply to Boris Zbarsky [:bz] (still digging out from vacation mail) from comment #1)
> Uploaded that profile as https://perfht.ml/2kHLOPE fwiw.
> 
> In the #2 content process there's a bunch of PresShell::Paint off refresh
> ticks (both from DidComposite and vsync).  
> 
> The compositor is showing lots of stuff, but mostly it's waiting on things
> like GLContextCGL::SwapBuffers to do the swap.
> 
> Tinus, this was a clean profile so didn't have a non-default zoom level
> selected for stripe or anything like that, right?

Yes, I tried directly after cleaning the profile (was a tip from someone on HN).

The zoom level was normal. 

I tried again a couple of minutes ago with beta 5.
This isolated CSS animation has 70% CPU usage in Firefox vs. 30% in Safari and 55% in Chrome.

https://codepen.io/davide_ravasi/full/WbRKrY/

If I add a second ball, the CPU usage goes to 100%.
On Windows, Edge averages 31% CPU usage, while Firefox uses around 20%. Great job there!
Priority: -- → P2
Whiteboard: [qf] → [qf][gfx-noted]
On the stripe.com page, a prime suspect may be the "Developers First" heading with its animated gear icons and the many SVG images (as <img> tags) that are being animated behind it.

On a 2015 MacBook Pro 13", I’m getting the following CPU usage for each browser’s relevant web content process, with this element in the middle of the viewport:
- Firefox Nightly: 110%
- Chrome: around 50%
- Safari: 45-50%
(In reply to Tinus from comment #0)
> Firefox never was very fast on my 15-inch Retina MacBook Pro, but it's
> gotten worse with Quantum.

This suggests that one way to make progress here is for somebody to use mozregression or a similar tool to figure out when it got worse.  (Though it might be worth testing with stylo disabled first to see if it was stylo that caused the change.)
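As a rough illustration of what a regression-bisection tool does, here is a minimal sketch of the search. It is not mozregression's actual implementation; `builds`, `is_bad` and `first_bad_build` are hypothetical names. It assumes builds are ordered oldest to newest, the first is known good, the last is known bad, and every build after the regression is bad:

```python
from typing import Callable, Sequence


def first_bad_build(builds: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Return the earliest build for which is_bad() is true.

    Assumes builds[0] is good, builds[-1] is bad, and that the
    good/bad boundary occurs exactly once in the sequence.
    """
    good, bad = 0, len(builds) - 1
    # Invariant: builds[good] is good and builds[bad] is bad.
    while bad - good > 1:
        mid = (good + bad) // 2
        if is_bad(builds[mid]):
            bad = mid
        else:
            good = mid
    return builds[bad]


# Hypothetical data: pretend the regression landed in build "b5".
builds = [f"b{i}" for i in range(10)]
print(first_bad_build(builds, lambda b: int(b[1:]) >= 5))  # prints b5
```

In practice `is_bad` would be a manual check (for example, "does the stripe.com tab exceed some CPU threshold?"), so each step costs one build download and one measurement, and the number of steps grows with the logarithm of the range.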
How do I test this with Stylo disabled? The flag in about:config seems to have been removed. I tried adding stripe.com to the layout.css.stylo-blocklist.blocked_domains pref, but that did not result in better performance.
The pref name is layout.css.servo.enabled.
I attached a new perf.html export with servo disabled. Same results basically.
layout.css.servo.enabled is set to false in this perf.html.

Machine info:
  Model Name:	MacBook Pro
  Model Identifier:	MacBookPro11,2
  Processor Name:	Intel Core i7
  Processor Speed:	2.2 GHz
  Number of Processors:	1
  Total Number of Cores:	4
  L2 Cache (per Core):	256 KB
  L3 Cache:	6 MB
  Memory:	16 GB
Whiteboard: [qf][gfx-noted] → [perf:investigate][gfx-noted]
(In reply to Tinus from comment #3)
> This isolated CSS animation has 70% CPU usage in Firefox vs. 30% in Safari
> and 55% in Chrome.
> 
> https://codepen.io/davide_ravasi/full/WbRKrY/

On Linux, this animation makes Chrome Dev Edition peg my processor at 200-250% CPU usage (and it drops to basically nothing if I go to example.org instead).  By contrast, it makes Firefox Nightly sit at 60-75% CPU.

Not sure if this means Chrome has regressed something, or if they have a Linux-specific bug, or what. But it sounds like to the extent that there's a discrepancy where we're losing here, it's specifically on Mac (not Windows, per comment 4, and not Linux per this comment).
It's been this way for years. Animation of various kinds on the Mac uses huge amounts of CPU; there are many bug reports about it. I don't think anything has changed recently performance-wise. I've been hoping Servo would improve things, but unfortunately it seems not, at least so far. It's particularly difficult because of bug 1237454, where hidden CSS animations like "loading" spinners still eat CPU.

With the page at stripe.com, I don't see any difference between FF52ESR, FF56.0.2, and FF57.0b14 with servo enabled or disabled. On my 2012 MacBook Air with macOS 10.12, I see about 98-108% CPU on each, according to the iStat Menus CPU meter. With Safari it's about 35%. That's exactly what I would expect based on past experience.

I would probably close this as invalid, as it's not something new or specific to Quantum or Servo, or as a duplicate of one of the older bugs, though I'm not sure which.
Whiteboard: [perf:investigate][gfx-noted] → [qf:investigate][gfx-noted]
So this bug seems to have low priority, while I have yet to see a Mac perform well with Firefox. What is going on?
I had a quick look at stripe.com, and it looks like hovering over the 'create account' and 'contact sales' buttons causes a lot of repainting.

These buttons have a :hover style that adds a transition for 'transform' and 'box-shadow'. We're running the transform animation on the compositor, even though it's only 0.15s long.

I think what happens is that we decide we need a separate Layer for transformed content (the button), which then pushes all content which is logically on top of the button into another Layer. We do this because we assume the asynchronously transformed content could move anywhere, and we need to be sure any content that it might move behind is split out from the background.

Allocating new layers for the popped out content, and redrawing it sucks, especially since we do the reverse again when the transition finishes.

Some ideas:

* Analyze the transition/animation (for simple cases at least) and figure out a bounding box to clip to. Only pop out other content if it intersects this clip, rather than doing it for the entire displayport. Should be easy here, since we're just transitioning 'transform:translateY(-1px);', and I don't think anything else could ever intersect that.

* If we do end up popping content, try to remember this and don't immediately undo it. If we've redrawn the whole page into foreground/background Layers, then we might as well keep them around until we have a decent reason to get rid of them.

* Don't try so hard for a 0.15s transition on a small element. Separate Layers and async animations are nice for long running things, but we're missing the entire transition just doing prep work here.
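The first and third ideas can be sketched in a few lines. This is only an illustration under stated assumptions, not Gecko code: `Rect`, `swept_bounds`, `needs_pop_out`, `worth_active_layer` and both thresholds are hypothetical, the animation is assumed to be a pure translation, and the timing function is assumed not to overshoot its end value.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    x: float
    y: float
    width: float
    height: float

    def translated(self, dx: float, dy: float) -> "Rect":
        return Rect(self.x + dx, self.y + dy, self.width, self.height)

    def union(self, other: "Rect") -> "Rect":
        left = min(self.x, other.x)
        top = min(self.y, other.y)
        right = max(self.x + self.width, other.x + other.width)
        bottom = max(self.y + self.height, other.y + other.height)
        return Rect(left, top, right - left, bottom - top)

    def intersects(self, other: "Rect") -> bool:
        # Strict comparisons: rects that merely touch do not intersect.
        return (self.x < other.x + other.width
                and other.x < self.x + self.width
                and self.y < other.y + other.height
                and other.y < self.y + self.height)


def swept_bounds(start: Rect, dx: float, dy: float) -> Rect:
    """Bounding box of everything a rect covers while translating by (dx, dy)."""
    return start.union(start.translated(dx, dy))


def needs_pop_out(animated: Rect, dx: float, dy: float, other: Rect) -> bool:
    """Pop 'other' into its own layer only if the animation can reach it."""
    return swept_bounds(animated, dx, dy).intersects(other)


# Hypothetical thresholds; real values would have to come from measurement.
MIN_DURATION_S = 0.5
MIN_AREA_PX = 64 * 64


def worth_active_layer(duration_s: float, bounds: Rect) -> bool:
    """Require both a minimum duration and a minimum size before layerizing."""
    area = bounds.width * bounds.height
    return duration_s >= MIN_DURATION_S and area >= MIN_AREA_PX


# The hover case from this comment: a button moving by translateY(-1px) for 0.15s.
button = Rect(100, 100, 120, 40)
content_below = Rect(100, 200, 300, 50)
print(needs_pop_out(button, 0, -1, content_below))  # prints False
print(worth_active_layer(0.15, button))             # prints False
```

Requiring both duration and size is one possible reading of the third idea; a real heuristic might weigh them differently.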

Jamie, this might be interesting to you!
Flags: needinfo?(jnicol)
The "developers first" animations are what's burning CPU for me, while at rest anyway. They seem to be done on the content thread rather than on the compositor. We should investigate why this is the case.

Matt's analysis of the hover effect on the buttons seems spot on too. This is actually a reasonably common scenario: creating a small, short-lived layer for an animation results in large knock-on layerisation changes and much more repainting.

> Analyze the transition/animation (for simple cases at least) and figure out a bounding box to clip to. Only pop out other content if it intersects this clip, rather than doing it for the entire displayport. Should be easy here, since we're just transitioning 'transform:translateY(-1px);', and I don't think anything else could ever intersect that.

This sounds ideal, and if not too difficult seems worthwhile implementing, for simple cases anyway. I've had this idea before but didn't know how to get the required information about the animations.

> Don't try so hard for a 0.15s transition on a small element. Separate Layers and async animations are nice for long running things, but we're missing the entire transition just doing prep work here.

Using a minimum size to ensure an active layer is worthwhile is something we do, but I've never thought about using the duration. It's a really great idea, and it doesn't sound like it'd be too hard.
Flags: needinfo?(jnicol)
(In reply to Jamie Nicol [:jnicol] from comment #15)
> The "developers first" animations are what's burning CPU for me, while at
> rest anyway. They seem to be done on the content thread rather than on the
> compositor. We should investigate why this is the case.

Looks like all of that 'animation' (both the spinning cogs and the flying symbols) is generated using JS and requestAnimationFrame, not a declarative CSS animation. That's a bit sad, since it's just opacity/transform, and we should be able to do it async if they let us.

> This sounds ideal, and if not too difficult seems worthwhile implementing,
> for simple cases anyway. I've had this idea before but didn't know how to get
> the required information about the animations.

We have nsLayoutUtils::ComputeSuitableScaleForAnimation already, which does something similar-ish. I think we'd want to use the full set of animations (css transitions are a subclass of Animation), not just the compositor ones (Element::GetAnimations?).

I *think* doing that from IsClippedWithRespectToParentAnimatedGeometryRoot and computing a bounding rect should work.

> Using a minimum size to ensure an active layer is worthwhile is something we
> do, but I've never thought about using the duration. It's a really great
> idea, and it doesn't sound like it'd be too hard.

Indeed, should be easy to pull the duration from the animations when we decide if we're an active layer.
(In reply to Matt Woodrow (:mattwoodrow) from comment #16)
> We have nsLayoutUtils::ComputeSuitableScaleForAnimation already, which does
> something similar-ish. I think we'd want to use the full set of animations
> (css transitions are a subclass of Animation), not just the compositor ones
> (Element::GetAnimations?).

Yeah, Element::GetAnimations is probably what you want, although it does some sorting which you might not need unless you want to skip overridden animations, so you might be able to use Element::GetAnimationsUnsorted instead, or even EffectSet::GetEffectSet if you want to go straight to the KeyframeEffect objects. That will give you all CSS animations / CSS transitions / Element.animate animations that are affecting the (pseudo-)element or are scheduled to do so.
FYI I'm also seeing exceptionally high CPU usage on my computer.

Here's the profile: https://perfht.ml/2G6glgj

I don't know if the profile captures the computer hardware too, so here's what I'm running:

- Late 2013 Retina MacBook Pro 13" (fully loaded: i7, 16GB RAM, etc.)
- Mac OS Sierra 10.12.6
- FF 60.0b5 (64-bit) -- seeing the same issue in Nightly

When I open Firefox, my fans immediately start spinning up. Not 100% or anything, but say 3500-4000 RPM which is way above normal. Fans stay around 1300 RPM with no browser open, and similarly are at 1300 when Chrome is on its new tab page (no other tabs open.)
(Happened to see this linked from https://news.ycombinator.com/item?id=17031959)
Status: UNCONFIRMED → NEW
Ever confirmed: true
Adding the site, stripe.com, to the bug summary.
Summary: Performance issues (possibly CSS animation related) in Firefox Quantum beta on OS/X - Very high CPU usage - Perf.html attached → Performance issues (possibly CSS animation related) in Firefox Quantum beta on OS/X - Very high CPU usage on stripe.com - Perf.html attached
Duplicate of this bug: 1459381
OS: Unspecified → Mac OS X
Hardware: Unspecified → x86
Duplicate of this bug: 1406360
I can confirm that this is still happening with the latest (2018-05-10) nightly build of Firefox. My system is a MacBook Pro 2017 on 10.13.4 (High Sierra). It does not happen if the tab with the bouncing ball (from https://codepen.io/davide_ravasi/full/WbRKrY/) is not active.

Let me know if there's anything else I can do to reproduce.
This is a profile from a debug build with optimization disabled.
That CodePen also reproduces it for me: tab 1 (CodePen), tab 2 (Google.com). Click tab 1, and the CPU temperature goes from 48°C to 70°C in ~20 seconds. Click tab 2, and it slowly decreases back down to 48°C.
For the bouncing ball, both the ball and shadow need to be continually repainted. If they were simply being transformed we could handle this on the compositor and it would be quick, but unfortunately some of the animated properties require us to rerasterize continually. This is in contrast with the stripe website, where I think the animation could be driven by the compositor were it implemented differently.

The box blur shows up in the profile because it is the most expensive part of that rerasterization. I don't know whether there is room for improvement in our implementation of it. As for why this is worse on Mac than on other platforms, it might be another case of slow texture upload (since we are continually uploading the newly rerasterized data).
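To make the distinction concrete, here is a minimal sketch that classifies an animation by the properties it changes. It assumes, based only on what is said in this bug, that 'transform' and 'opacity' can be animated without repainting; the set is not an exhaustive or authoritative Gecko list, and the function names are hypothetical.

```python
# Properties this bug describes as animatable on the compositor.
# Assumption drawn from the discussion here, not a complete Gecko list.
COMPOSITOR_FRIENDLY = frozenset({"transform", "opacity"})


def _normalize(animated_properties):
    return {p.strip().lower() for p in animated_properties}


def avoids_repaint(animated_properties) -> bool:
    """True if every animated property is in the compositor-friendly set."""
    props = _normalize(animated_properties)
    return bool(props) and props <= COMPOSITOR_FRIENDLY


def repaint_causes(animated_properties) -> list:
    """Properties that would force rerasterization on every frame."""
    return sorted(_normalize(animated_properties) - COMPOSITOR_FRIENDLY)


# Hypothetical property sets, for illustration only.
print(avoids_repaint(["transform", "opacity"]))      # prints True
print(avoids_repaint(["transform", "box-shadow"]))   # prints False
print(repaint_causes(["transform", "box-shadow"]))   # prints ['box-shadow']
```

Even when the check fails, the transform part of such an animation may still run on the compositor; the point is only that the remaining properties keep the main thread repainting.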
In light of https://www.reddit.com/r/webdev/comments/8hha0r/is_it_worth_supporting_firefox/ (quoted in full below), this bug (and similar ones) should be treated as existential threats to Firefox.

Anything that allows innocuous code to DDoS Firefox allows hostile publishers to punish Firefox users, with plausible deniability, deterring them from using the browser.

---------- from Reddit: ---------

# Is it worth supporting Firefox? #

I work for a major publisher. Ad-blockers are starting to hurt our revenue big time.

Our manager said that we should optimize by the order of average revenue per user. The browser with highest revenue per user should be optimized the most. Firefox is almost at the end of the list. I tried telling him that the people who use adblockers are likely to share more content, which brings in non-adblock users. However, that is also falling apart as adblock users are bringing other adblock users.

Chrome and Safari are our highest priority because of this. Firefox is at the bottom of the list. I still have the chance, and I don't want to leave Firefox :(

What would be an argument that you think would be valid at this point? Obviously a business requires money to keep running. We tried putting anti-adblock notices, but an adblocker used by Firefox users mostly bypasses our measures.

Ad reinsertion solutions exist for Chrome, but they also don't support Firefox.

We're doing a redesign now. Manager says that we should just discourage the users of Firefox (not optimizing for Firefox, and poor UI) from visiting our website as they bring a loss.

Edit: They are even going to approach Google and ask them to introduce new APIs that allow us to determine whether the browser is really Chrome or just Chromium, so that we can block Chromium. And a DRM for websites, instead of only the video DRM that exists, so that users cannot view source code and no extension can block the ads. Other publishers are doing the same thing. Expect new DRM for HTML5 too.
If you're not experiencing this issue and you're on the Firefox team, find a way to reproduce it, and use Firefox in that state until you fix it. Guaranteed you'll either fix it in the first week or you'll switch to Chrome. It's unbearable. I really want to use Firefox but I absolutely refuse on the basis that it rips through my battery while eating my CPU for common tasks.
Thanks everyone for giving this issue traction. I really want to use Firefox again but right now it is absolutely impossible.
Another thread about how FF is ‘back’ with numerous people complaining about this very issue. I’m starting to lose hope this will ever be fixed: https://news.ycombinator.com/item?id=17361168&p=2
I see memmove takes a lot of time in the profiles in comment 18 and comment 24; I guess that will be solved by bug 1265824 (or bug 1191965).  Also, the "developers first" animations Jamie mentioned in comment 15 have been improved by bug 1467619, no?
Also, as dbaron commented in comment 6, it would be very helpful if someone could narrow down which changeset made this worse.
FYI, the following setting has produced some improvements for me (the fan no longer runs consistently at max RPM, and the CPU is around 45°C):

about:config -> set gfx.compositor.glcontext.opaque to true
> about:config -> set gfx.compositor.glcontext.opaque to true

Yes! Thanks for sharing, that makes Firefox usable again.

The animation on the Stripe home page (see comment 1) still brings the CPU to ~90% when that tab is visible, but the overall perf is sooo much better.

Searching around for that preference, I found https://bugzilla.mozilla.org/show_bug.cgi?id=1429522 where @pcwalton is working on an actual fix. I'm glad this is being tackled, thanks folks.
This bug has become a grab bag of too many issues. From what I can tell, we have three issues here:
 - Stripe.com has an animation that causes too much repainting when layerization is affected.
 - https://codepen.io/davide_ravasi/full/WbRKrY/ keeps repainting a box-shadow which causes us to compute blurs on the CPU.
 - Compositing has too much GPU load.

For the first two, the way forward is to wait for WebRender, which does not have problems due to layerization, and which does box-shadow blurring on the GPU. So I'll make this bug depend on bug 1464426.

For the last, the work to address it will be happening in bug 1429522.
Depends on: 1429522
Whiteboard: [qf:investigate][gfx-noted] → [gfx-noted]