Closed Bug 1131937 Opened 9 years ago Closed 7 years ago

Dramatically high CPU usage on facebook group page

Categories

(Core :: Layout, defect)

38 Branch
defect
Not set
normal

Tracking

()

RESOLVED WORKSFORME
Performance Impact high
Tracking Status
platform-rel --- -

People

(Reporter: andrew.43, Assigned: bugs)

References

Details

(Keywords: perf, Whiteboard: [Power:P1][gfx-noted][platform-rel-Facebook])

User Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.10; rv:35.0) Gecko/20100101 Firefox/35.0
Build ID: 20150122214805

Steps to reproduce:

I compared the CPU usage of Firefox when browsing the News Feed and a group page on Facebook, using Activity Monitor.



Actual results:

In my case, the CPU usage on the News Feed page was about 20%, but on a group page, CPU usage was always more than 60%.

I also created a new Firefox profile and disabled all plugins to do the same comparison. The same thing happened again.


Expected results:

I think the CPU usage when browsing the News Feed and a group page on Facebook should not be so different.
I can confirm the same issue on my machine.
I'm running Firefox 36 on OS X 10.10.

When opening a Facebook group page in Firefox, I can see a big spike in CPU usage and can hear my fan spin up.
This doesn't happen on Chrome or Opera.

This happens while all my extensions are disabled.
Thanks for the reports and for testing using a new profile. Please follow these instructions https://developer.mozilla.org/en-US/docs/Mozilla/Performance/Reporting_a_Performance_Problem to see what takes up the CPU. It could be something on the page or in facebook's code, as well as a Firefox problem.
Flags: needinfo?(andrew.43)
Thanks for the help.
I followed the instructions; here's a link to the profile in Cleopatra: http://people.mozilla.org/~bgirard/cleopatra/#report=c9be4b8b951d7e6d62a2f9198480429dc1c49c49

For the first 5 seconds I was on the main Facebook page. After that I clicked on a link to a group page. I'm not sure how to read the results from Cleopatra, but if I understand it correctly, there are many more frames being generated when I'm in a Facebook group?

The test was performed on clean Firefox 38 install, nothing installed or configured.
> After that I clicked on a link to a group page.
Let me check I understand correctly: after it loaded you waited for ~7 seconds before finishing profiling, and the problem is the high CPU usage during that time (i.e. 8000-15000 on the timeline)?

This is a tough problem, since the profile only indicates that something causes the page to be redrawn, and the redraw is resource-hungry (almost all the samples are under nsPresShell::Paint)...

You could try taking another profile using a development https://nightly.mozilla.org/ build - it should be more detailed. You could also try https://msujaws.wordpress.com/2012/02/01/layout-paint-flashing-in-firefox/ to see what causes the page redraws. Also https://support.mozilla.org/en-US/questions/983066 seems to indicate that blocking certain items on the page might improve the situation.
Product: Firefox → Core
Version: 35 Branch → 38 Branch
Hi Nickolay,
first of all, thank you for your help with this issue.
Yes you understood correctly, this is exactly the problem (8k-15k timeline).

I'm going to do same test with latest nightly for OS X and update my findings.
Regarding the blocking, yes, I saw this question already. It might be a good workaround, but it is probably only that, a workaround. Since this problem doesn't happen on Chrome or Safari, it sounds like a bug worth filing :-)
So I re-ran the tests on the latest nightly for OS X 10.10: http://people.mozilla.org/~bgirard/cleopatra/#report=20ba437ed38cea7928bbcf7ff38272129ae9a1b5

I must say it looks much better, as Firefox now takes about 30% CPU instead of the 40%-50% in v38.
Chrome v43 takes about 20% on the same page, while Safari barely does anything, at less than 10%.

If these are the numbers for the final release, I guess it's not really a bug but just an optimization difference between the browsers under OS X?
Thanks for the updated report! Do note that the nightly version uses two processes (so called "e10s"), so make sure to measure both processes or disable multi-process.

Re blocking: figuring out what specifically causes the bad behavior, and recreating the problem in a page not as complex as facebook will make the bug much more approachable by the developers.

Re other browsers: it might be a browser problem or facebook behaving differently in different browsers, hard to tell at this stage. Thanks for filing the bug, your help so far is much appreciated! I'm not sure what else to suggest, hope that someone more knowledgeable will take it from here.
I've done some research into this problem (Linux, Firefox 41.0.0, NoScript, Adblock Plus). The report is correct: FB normally idles at <10% when I'm in my newsfeed or on my own page, but on a "group" page it spikes to 95% or more and STAYS there as long as I'm on the group page and that tab is visible (switching to another tab lowers it back down, until I switch back to that tab). I tried blocking and unblocking things and found this:
1) Blocking all JavaScript does NOT help.
2) Blocking all .css STOPS the high CPU usage, but makes FB unreadable.
Through experimentation I found that I could block just the .css that drives the "CHAT" column (only that column disappears when it is blocked), and that STOPPED the CPU cooking, but it has two drawbacks: (a) doing without chat, and (b) the .css and .js URLs (visible in Adblock Plus's "Blockable Elements" window) contain random numbers that change with each login! Anyway, it's something to do with FB chat, though the chat column appears the same whether on a group page or any other page. Hope that helps someone more knowledgeable in tracking this $#^% down!!!
I didn't see any facebook bugs on the sw:Power list (apart from bug 890154, which is on mobile), so adding this for triage.

These bits from comment 8 are interesting: "Blocking all .css STOPS the high CPU usage" and "it's something to do w/FB chat"
Whiteboard: [Power]
Just checked it with latest nightly on latest OS X (10.11). Looks the same. 
If I compare this to Chrome 46 this is what I get:
- On the Facebook main page I get less than 1% CPU usage and occasional spikes in the 1-10% zone.
- On a Facebook group page I get about 10% CPU usage and occasional spikes in the 10-15% zone.

On latest Firefox nightly:
- On Facebook main page I get about 2.5% CPU usage and occasional spikes in the 1-20% zone.
- On Facebook group page I get about 50% CPU usage, constantly, with no spikes.

Obviously this is totally unscientific and done by looking at top for a few seconds; however, the general picture I get is that Facebook's group page works differently and that Firefox is not as well optimized as Chrome at handling the kind of events that cause this high CPU usage.
Whiteboard: [Power] → [Power:P1]
Tested this issue on latest Firefox release 42.0 on Mac OS X 10.10, and now the differences are not so big.
- On Facebook main page I get between 1%-9% CPU usage.
- On Facebook group page I get around 30% CPU usage.

On Chrome 46 there are two processes consuming CPU: the main process and the "Chrome Helper Process", which seems to be opened for each independent tab and is the bigger consumer on Facebook pages. The results for the "Chrome Helper Process" are:
- On the Facebook main page I get less than 1% CPU usage.
- On a Facebook group page I get between 10%-17% CPU usage.

All the tests were performed while each page was left idle. Everything changes when scrolling through the page while new content is continuously downloaded. In both Firefox and Chrome, CPU usage then reached values as high as 120%: in the "firefox.exe" process for Firefox and in the "Google Chrome Helper" process for Chrome.
Status: UNCONFIRMED → NEW
Component: Untriaged → Graphics
Ever confirmed: true
Whiteboard: [Power:P1] → [Power:P1][gfx-noted]
Whiteboard: [Power:P1][gfx-noted] → [Power:P1][gfx-noted][platform-rel-Facebook]
platform-rel: --- → ?
Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:50.0) Gecko/20100101 Firefox/50.0
Build ID: 20161208153507

This issue is also reproducible on Windows, not just on Mac OS X (please see Bug 1328102).
Tested on Windows 7 x64 using the Firefox 50 release; CPU usage on a group page reaches 7-15% (in idle mode).
OS: Mac OS X → All
Hardware: x86 → All
I see that the bug I filed (1328102) is a duplicate of this bug. People, this bug is old; I see it dates back to at least 2015. Other, lesser-known browsers do not have this problem. What are you waiting for to solve this? Millions of people use Facebook.

Simona B, mulţumesc/thank you!
(In reply to Nickolay_Ponomarev from comment #9)
> I didn't see any facebook bugs on the sw:Power list (apart from bug 890154,
> which is on mobile), so adding this for triage.
> 
> These bits from comment 8 are interesting: "Blocking all .css STOPS the high
> CPU usage" and "it's something to do w/FB chat"

The Maxthon browser (WebKit engine) handles this issue best (under 2.5% CPU for Facebook in idle state, whether on the news feed or on groups). In Firefox, if you log in simultaneously with 2 profiles, the issue becomes very annoying: each profile consumes the same amount of CPU, and if you also have other programs running, the PC becomes slow.
Some "news": I tested an older version (ESR 17.0.6) and it has no such problem.
This issue started with version 18 and has another weird behaviour. If you minimize the active Facebook window (tab) to the taskbar, the CPU usage decreases to normal (about 0) in idle state. If the window is then restored on the screen, the CPU usage instantly jumps (in idle state) to values 20-30 times higher than when minimized.
https://vid.me/zx14
Can you use mozregression to narrow down the regression window to a smaller revision range? (http://mozilla.github.io/mozregression/)

I'm not sure if it'll work with ESR 17, but worth trying.

I think you can do: |mozregression --good 17 --bad 18|, or with the dates instead of the version number.
Here are the results from mozregression. It seems that the issue starts in the 2012-09-29 build:
https://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=895f66c4eada&tochange=85f561c755f6

I checked again on those 2 builds (good: 2012-09-28, bad: 2012-09-29) and I think it is confirmed.
https://sendvid.com/ada0kvk2

Other remarks:
   - Sometimes, for the earlier "bad" versions that I've tested, the issue does not become evident right away (on entering a Facebook group) but only after a few minutes. However, if you start other programs (another browser, etc.) during this time, the CPU usage of Firefox jumps instantly and stays permanently at the higher value (in idle state on Facebook groups) without touching anything in the browser. It is a weird behavior.
   - For the latest stable release, the CPU usage for this issue is about 2-3 times higher than in the earlier versions.
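For readers unfamiliar with how a regression window gets narrowed to a single day, the idea can be sketched as a binary search over dated builds. This is only an illustration, not mozregression's actual implementation: `first_bad_build` and `simulated_check` are hypothetical names, the outer start and end dates are arbitrary assumptions, and the check function simply hard-codes the 2012-09-29 result reported above instead of installing and measuring a real build.

```python
from datetime import date, timedelta
from typing import Callable


def first_bad_build(good: date, bad: date, is_bad: Callable[[date], bool]) -> date:
    """Return the earliest build date for which is_bad() is True.

    Assumes `good` is known-good, `bad` is known-bad, and that every
    build after the first bad one is also bad (a single regression).
    """
    if good >= bad:
        raise ValueError("the good build must predate the bad build")
    while (bad - good).days > 1:
        mid = good + timedelta(days=(bad - good).days // 2)
        if is_bad(mid):
            bad = mid   # the regression landed on or before `mid`
        else:
            good = mid  # the regression landed after `mid`
    return bad


def simulated_check(build: date) -> bool:
    # Hypothetical stand-in for "install this nightly, open a Facebook
    # group page, and watch idle CPU usage". The threshold date is the
    # one reported in this bug; nothing is actually measured here.
    return build >= date(2012, 9, 29)


# The outer dates are arbitrary assumptions chosen to bracket the result.
result = first_bad_build(date(2012, 6, 1), date(2012, 11, 1), simulated_check)
print(result)
```

Each step halves the window, so a range of a few months needs fewer than ten builds to be tested by hand.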
ni? myself so I don't forget looking into it when I'm over with exams, thanks for the regression range!
Flags: needinfo?(emilio+bugs)
Some news: it seems that the issue isn't limited to Facebook. A simple Google search (1 tab) takes the CPU load to about 4-5% in idle state. If I do another search in another window, the CPU load is the same for this new tab in idle state, and the total for Firefox is doubled. This happens only if the tabs are displayed on the monitor. If I minimize the windows to the taskbar, the CPU drops to 0 (FF 50.1.0).
https://sendvid.com/1i847qu3
platform-rel: ? → -
Given the regression range, bug 539356 sounds like a plausible cause to me of unexplainable, higher cpu usage. Of course, I have an untrained eye (read: I don't know the Firefox source code), and some profiling would be necessary to actually conclude this.
Bas, can you please find someone on the graphics team to look at this?  Perhaps Matt given the regression from DLBI?
Blocks: QF-Websites
Flags: needinfo?(bas)
This does at least sound like a layout, and not a graphics issue. Matt would be the right person to look at this.
Flags: needinfo?(bas) → needinfo?(matt.woodrow)
If it helps: on ESR 45 (45.8.0), under the same conditions, CPU usage is still high, but half or even less than on 50/51/52.
It looks like facebook is constantly modifying a 'transform' style property (setting the translate3d property with different x values) and this triggers painting.

The transformed content is offscreen, so we don't actually end up painting anything new, but we don't know this until we've done all the paint setup and run DLBI.

We can easily know that the content wasn't painted previously, but it's hard to tell if it's going to be on-screen with the new transform without falling back to old-style invalidation.

This is sort of a fundamental issue with how DLBI works, and the best long term solution is probably doing retained display lists where we can make the process of updating the display list super fast for this.

Given that this is facebook, we might want to try a shorter term fix for this specific instance of the problem, I'll have a think about how to do that.
Flags: needinfo?(matt.woodrow)
Whiteboard: [Power:P1][gfx-noted][platform-rel-Facebook] → [qf:p1][Power:P1][gfx-noted][platform-rel-Facebook]
Adding Jet if this turns out to be retained display lists.
Component: Graphics → Layout
Hey guys, thanks for looking into this. We looked at it from our (Facebook.com) end too, Matt's analysis is correct, this is coming from the placeholder at the bottom of the activity feed on a groups page, it is being animated with a shimmer effect. You can verify this by opening the "Animations" pane in DevTools' inspector tool -- if you pause the animation, the CPU usage drops precipitously. 

Chrome & Firefox both run the animation despite it being offscreen, although it looks like Chrome is using less CPU. For the time being, we're thinking of stopping the animation when it's offscreen using Intersection Observer (not supported in Firefox currently unfortunately). 

This is a common problem, and it would be nice to find a more general solution to the problem of off-screen animations. CSS containment, for example, seems like it might be a potential win.

By the way, you can always reach us via browsers @ fb.com if you ever need help narrowing down a facebook.com bug.
Hrm. So the current solution, at least from Facebook's end, doesn't actually solve anything for Firefox.
I hope we can find another shorter term solution.
Firefox 17.0.x uses <1% CPU for 99% of idle time (same conditions).
Wouldn't it be possible to calculate the bounding box coordinates of the transformed object (just some math involved there which shouldn't cost anything), compare those with the visible portion of the page, and if outside of it, simply not call Paint()?
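The arithmetic being proposed here can be sketched roughly as follows. This is a minimal illustration, not Gecko code: `Rect`, `transformed_bounds` and `needs_paint` are hypothetical names, the transform is assumed to be a 2D affine matrix in CSS `matrix(a, b, c, d, e, f)` order, and the sketch says nothing about where such a check could safely live in the invalidation pipeline or what it costs when many items change at once.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Rect:
    x: float
    y: float
    width: float
    height: float

    def intersects(self, other: "Rect") -> bool:
        # Two axis-aligned rectangles overlap only if they overlap on both axes.
        return (self.x < other.x + other.width and other.x < self.x + self.width
                and self.y < other.y + other.height and other.y < self.y + self.height)


# CSS matrix(a, b, c, d, e, f): x' = a*x + c*y + e, y' = b*x + d*y + f
Matrix = Tuple[float, float, float, float, float, float]


def translate(tx: float, ty: float) -> Matrix:
    """2D equivalent of translate3d(tx, ty, 0)."""
    return (1.0, 0.0, 0.0, 1.0, tx, ty)


def transformed_bounds(rect: Rect, m: Matrix) -> Rect:
    """Axis-aligned bounding box of `rect` after applying the affine matrix `m`."""
    a, b, c, d, e, f = m
    corners = [
        (rect.x, rect.y),
        (rect.x + rect.width, rect.y),
        (rect.x, rect.y + rect.height),
        (rect.x + rect.width, rect.y + rect.height),
    ]
    xs = [a * x + c * y + e for x, y in corners]
    ys = [b * x + d * y + f for x, y in corners]
    return Rect(min(xs), min(ys), max(xs) - min(xs), max(ys) - min(ys))


def needs_paint(rect: Rect, m: Matrix, viewport: Rect) -> bool:
    """True if the transformed item overlaps the visible area."""
    return transformed_bounds(rect, m).intersects(viewport)


# Made-up numbers: a placeholder far below the fold, shifted horizontally.
viewport = Rect(0, 0, 1280, 800)
offscreen_item = Rect(0, 5000, 500, 100)
onscreen_item = Rect(0, 700, 500, 100)
print(needs_paint(offscreen_item, translate(-300, 0), viewport))
print(needs_paint(onscreen_item, translate(50, 0), viewport))
```

The check itself is a handful of multiplications per item; the open question is how often it runs.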
(In reply to Vladan Djeric (:vladan) from comment #28)
> Chrome & Firefox both run the animation despite it being offscreen, although
> it looks like Chrome is using less CPU. For the time being, we're thinking
> of stopping the animation when it's offscreen using Intersection Observer
> (not supported in Firefox currently unfortunately).

Chrome does retaining of display lists, so they manage to figure out that the transformed item is offscreen much quicker. We're going to work on doing something similar asap.

Intersection observer is being worked on, and will hopefully ship in 54. I realize that's not much help for now.

> 
> This is a common problem, and it would be nice to find a more general
> solution to the problem of off-screen animations. CSS containment, for
> example, seems like it might be a potential win.

CSS containment could indeed be nice, but it still requires vigilance from the web authors. Probably ok for facebook, but still a pain for smaller sites.

> 
> By the way, you can always reach us via browsers @ fb.com if you ever need
> help narrowing down a facebook.com bug.

Thanks, that's helpful! Will remember that for next time.


(In reply to Mark Straver from comment #31)
> Wouldn't it be possible to calculate the bounding box coordinates of the
> transformed object (just some math involved there which shouldn't cost
> anything), compare those with the visible portion of the page, and if
> outside of it, simply not call Paint()?

We used to do this, but adding work per-change (rather than per-vsync tick) can have really bad corner cases.

I'll have a go at batching this work to the start of the next paint cycle, with a bail-out to a full paint when the batch gets too big. We're going to need the same machinery for retained display lists, so it's worth starting.
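One way to picture the batching idea, purely as a sketch under stated assumptions and not as the design that was implemented: transform changes are only recorded as they happen, visibility is checked once at the start of the next paint, and an oversized batch falls back to a full paint. `TransformChangeBatch`, `PaintDecision`, the `max_batch` limit and the rectangle tuples are all hypothetical, and the sketch assumes the caller only records items that were not painted in the previous frame.

```python
from enum import Enum
from typing import Dict, Tuple

Rect = Tuple[float, float, float, float]  # (x, y, width, height)


class PaintDecision(Enum):
    SKIP = "skip"
    FULL = "full"


def intersects(a: Rect, b: Rect) -> bool:
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    return ax < bx + bw and bx < ax + aw and ay < by + bh and by < ay + ah


class TransformChangeBatch:
    """Collects transform changes between paints (hypothetical helper)."""

    def __init__(self, max_batch: int = 64) -> None:
        self.max_batch = max_batch
        self._pending: Dict[str, Rect] = {}
        self._overflowed = False

    def note_change(self, item_id: str, new_bounds: Rect) -> None:
        # Constant work per change: just remember the latest bounds.
        if self._overflowed:
            return
        self._pending[item_id] = new_bounds
        if len(self._pending) > self.max_batch:
            # Too many distinct items changed; stop tracking and bail out.
            self._pending.clear()
            self._overflowed = True

    def decide(self, viewport: Rect) -> PaintDecision:
        # Called once at the start of a paint cycle; resets the batch.
        overflowed, pending = self._overflowed, self._pending
        self._pending, self._overflowed = {}, False
        if overflowed:
            return PaintDecision.FULL
        if any(intersects(bounds, viewport) for bounds in pending.values()):
            return PaintDecision.FULL
        return PaintDecision.SKIP


viewport: Rect = (0.0, 0.0, 1280.0, 800.0)
batch = TransformChangeBatch(max_batch=2)
# Three changes to the same offscreen item within one frame collapse to one entry.
for x in (-300.0, -200.0, -100.0):
    batch.note_change("shimmer", (x, 5000.0, 500.0, 100.0))
decision = batch.decide(viewport)
print(decision)
```

Repeated changes to the same item overwrite each other, so the per-paint cost depends on how many distinct items changed, not on how many times each one changed. A real paint can of course be required for reasons unrelated to this batch; `SKIP` here only means that the recorded transform changes did not require one.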
I'm clearing the ni? because matt diagnosed it, thanks! :)
Flags: needinfo?(emilio+bugs)
Depends on: 1352499, 1321865
Clearing the ni? as there's no longer a need.
Flags: needinfo?(andrew.43)
(In reply to Matt Woodrow (:mattwoodrow) from comment #26)
> This is sort of a fundamental issue with how DLBI works, and the best long
> term solution is probably doing retained display lists where we can make the
> process of updating the display list super fast for this.

(This seems to be bug 1352499.)

> Given that this is facebook, we might want to try a shorter term fix for
> this specific instance of the problem, I'll have a think about how to do
> that.

Did you end up having any ideas for this hypothetical short term fix? Or should we just focus on bug 1352499 as the solution here?
Flags: needinfo?(matt.woodrow)
I'm mainly just focusing on bug 1352499.

It's possible that we might be able to fast-track a subset of it for this specific class of problem, but I'll look into that more once we have the main project closer to completion.
Flags: needinfo?(matt.woodrow)
Assignee: nobody → bugs
OK. It sounds like our best bet here is:
 1) On the Mozilla end:
   Longer term: get bug 1352499 fixed [and maybe evaluate smaller fixes as Matt notes there] -- I'm guessing in 55 or 56?
   Shorter term: get Intersection Observer shipped (in 54 or 55 I believe).

 2) On the Facebook end:
    Stop the animation when it's offscreen, using Intersection Observer (as suggested in comment 28).  You should be able to use Firefox Nightly ( https://nightly.mozilla.org/ ) and Chrome to test that.

Vladan: are you still up for this Intersection Observer based solution? (Maybe that's already in progress?)
Flags: needinfo?(vladan.bugzilla)
(In reply to Daniel Holbert [:dholbert] (reduced availability - travel & post-PTO backlog) from comment #37)
>  2) On the Facebook end:
>     Stop the animation when it's offscreen, using Intersection Observer (as
> suggested in comment 28).  You should be able to use Firefox Nightly (
> https://nightly.mozilla.org/ ) and Chrome to test that.
> 
> Vladan: are you still up for this Intersection Observer based solution?
> (Maybe that's already in progress?)

Yes, I still think IntersectionObserver is a solution here, and we're working on adopting IntersectionObserver in our codebase currently. Another option is to just disable the shimmer effect altogether on groups' pages. I'll post here again after we've deployed one of the approaches.
Flags: needinfo?(vladan.bugzilla)
Thanks! I'm going to mark this as [qf:p2] since there's nothing immediately actionable on our end at the moment beyond shipping IntersectionObserver. (and we'll look forward to seeing your updates!)
Whiteboard: [qf:p1][Power:P1][gfx-noted][platform-rel-Facebook] → [qf:p2][Power:P1][gfx-noted][platform-rel-Facebook]
I've shipped a change to Facebook to stop the animation when offscreen without IntersectionObserver. The issue is gone in my test case. It's probably still a good idea to consider optimizations to stop offscreen CSS animations. As Vladan mentioned, Facebook, and probably other sites, are open to using CSS containment, or other options, on high-level site elements like news feed items if it helps Firefox disable CSS animations and perform other optimizations that are otherwise hard to verify.
Keywords: perf
(In reply to Benoit Girard (:BenWa) from comment #41)
> I've shipped a change to Facebook to stop the animation when offscreen
> without IntersectionObserver. The issue is gone in my test case. Probably
> still a good idea to consider optimizations to stop offscreen CSS animation.

We now skip restyling for offscreen transform animations (bug 1190721).
Whiteboard: [qf:p2][Power:P1][gfx-noted][platform-rel-Facebook] → [perf:p2][Power:P1][gfx-noted][platform-rel-Facebook]
Whiteboard: [perf:p2][Power:P1][gfx-noted][platform-rel-Facebook] → [qf:p2][Power:P1][gfx-noted][platform-rel-Facebook]
Flags: needinfo?(dholbert)
Whiteboard: [qf:p2][Power:P1][gfx-noted][platform-rel-Facebook] → [qf:p1][Power:P1][gfx-noted][platform-rel-Facebook]
I just retested this (with intersectionobserver disabled just for the heck of it, so I don't benefit from that), and I'm seeing that the animated element in question (the ::before element with "animation-name: placeHolderShimmer") is getting this style right away:
>   animation-play-state: paused;

That comes from these rules:
>#pagelet_group_pager :not(.async_saving) > div > ._2iwq::before {
>    animation-play-state: paused;
>}
>._2x3w._2iwq::before {
>    animation-play-state: paused;
>}

And so the offscreen animation does not play (and I can verify that in devtools' animation inspector).  So I think this is fixed, from that CSS change on Facebook's end.

Separately from that (without that tweak), I think this would've still been fixed via:
 (A) bug 1190721 (comment 42) -- landed in Firefox 58, which should throttle offscreen transform animations.
 (B) IntersectionObserver support (should be enabled by default in 57 & earlier as "dom.IntersectionObserver.enabled" about:config pref), which should also throttle the animation based on Facebook's usage per comment 41
 (C) ...and maybe also some help from retained display list (enabled on Nightly only, via bug 1416055)


But in any case, this seems to be WORKSFORME (I see slightly higher baseline CPU usage on the group page vs. my normal newsfeed page, but not substantially higher -- 6% vs 10% or something like that, with neither page spending much time above 20% which was noted as the expected-behavior in comment 0 here).

Please reopen/comment if I'm off-base with any of the above, or if there are still substantial issues here that I'm not seeing.
Status: NEW → RESOLVED
Closed: 7 years ago
Flags: needinfo?(dholbert)
Resolution: --- → WORKSFORME
Performance Impact: --- → P1
Whiteboard: [qf:p1][Power:P1][gfx-noted][platform-rel-Facebook] → [Power:P1][gfx-noted][platform-rel-Facebook]