Closed Bug 1428704 Opened 3 years ago Closed 2 years ago
FF57+ slows mac
OS to a crawl: impact on framerate 10/10
User Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_2) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/64.0.3282.71 Safari/537.36 Steps to reproduce: I'm using a MacBook Pro 13" (late 2013, macOS High Sierra with latest updates). Since installing Firefox 57 the browsers is appalingly slow. Browsing is very slow, the notebook fans are working overtime. When I check the about:performance page it says that my tabs are slowing my browser with "impact on framerate 10/10". I installed all Firefox 57 updates and this behaviour did not change. Then I moved to FF58 - the same. Removed all addons - the same. Created a new profile - the same. At the moment, I use only one addon - uBlock Origin. I did some tests with just the addons tab open and even it is shown as slowing Firefox (in about:performance) All other browsers work without a hitch. The notebook and OS are in good shape. Actual results: Firefox is very slow, fans are working hard to keep up, about:performance shows impact on framerate 10/10 (or in this region)
Hi Sveon, There might be one other version where you can test this, try with FF Nightly, you can download it from here: https://www.mozilla.org/en-US/firefox/channel/desktop/# I use FF 57 and FF Nightly on Mac OS X 10.12 and Mac Os X 10.13 and I don't have this issue.
Component: Untriaged → Add-ons Manager
Product: Firefox → Toolkit
On Firefox Nightly, one tab open - searching Google for "firefox". about:performance shows "impact on framerate 8/10".
In my case on Mac OS X 10.12 with FF Nightly 59.0a1(2018-01-11), I have "Impact on framerate: 6/10", please note that I don't have any add-ons installed on my profile.
This doesn't appear to be related to about:addons I would start by looking at the macos Activity Monitor to confirm that Firefox is actually the thing that is working so hard. If it is, gathering a profile would be the next step: https://developer.mozilla.org/en-US/docs/Mozilla/Performance/Profiling_with_the_Built-in_Profiler
Component: Add-ons Manager → Untriaged
Product: Toolkit → Firefox
No, it's not related to about:addons. Any website does this. It is working extremely slow, and if I open a few tabs the entire computer goes to a crawl and the fans go wild. I saved a performance profile... Opening Google.com, searching, going to firefox.com, browse a few pages, then facebook - newsfeed and profile page. Facebook is the worst. Visibly the experience is not smooth at all - scrolling, loading, etc.
Great, thanks. The next step is to share the profile here so a Firefox engineer can take a look at it.
This is the profile: https://perfht.ml/2CPy4XN One tab only: 1. Google 2. Search Google for "Firefox" 3. Browsing Firefox.com 4. Browsing Facebook After going to Facebook, the fans start. With one tab (apart from Facebook) it's working. If I open a few tabs, everything crawls.
This is another one, browsing a common news web site: https://perfht.ml/2CRyTj2 The performance is more or less ok, but the fans are working hard.
Hello Andrew, Do you know how might contact to look at this profile that Sveon shared? Thanks
Performance has improved on Firefox Nightly. However, the browser is visibly much much slower than other browsers on the same system. Particularly, using Facebook is almost unbearably slow. Full name: https://www.facebook.com/messages/t/ (as window 2147483649). Impact on framerate: 10/10 CPU usage: 30%. System usage: 2%. Blocking process calls: 0%. Measure start: 10 seconds ago. Processes: 27313 (child)
Hi Andrew, When you have time, please look at comment 11 and maybe we can find someone to look at Sveon's profile. Thanks
Nothing in the profile jumps out at me, marking this for the performance triage team to take a closer look.
Unfortunately, it looks like the profiles that were gathered were done on a non-Nightly build, so I'm only getting pseudostacks here. It's clear that some site, "dnevnik.bg" is doing a ton of JS work here for some reason, but because there are no native stacks, it's hard to see why. It also looks like something is animating, which might be contributing to the CPU usage. If the page appears to be static, but something is invisible, it's possible that older versions of Firefox are still doing some computation on them. Bug 1237454 recently landed in Nightly to try to address this. So, Sveon, are you still able to reproduce this on a more recent Nightly build? And if so, can we see a profile from that Nightly, since it'll have native stacks?
Keeping this in [fxperf] triage until we can get more data.
Here's a new profile with the latest nightly as of 20.02.2018: https://perfht.ml/2EJabWT Opened Facebook, then Facebook Messenger from the top icon, then Gmail. Facebook is really laggy and immediately fires up the notebook fans. Everything loads very slowly compared to, let's say, Chrome. Gmail is much better. If I open several tabs (5+) the whole computer becomes laggy and slow to respond. The computer runs macOS with all the latest updates. Other browsers with the same websites perform excellent.
Thanks Sveon, I've added this profile to our analysis queue. I hope to have an analysis to you within a few days.
This seems more like a platform-level issue based on a quick skim of the profile, so moving to the qf group for now.
Whiteboard: [fxperf] → [qf]
Component: Untriaged → General
Product: Firefox → Core
See Also: → 1430820
Hello Panos, Based on the QF triage team, we think this bug is along the same characteristic and could be something that the DevTools team can do in the longer term.
Xidorn, this is similar to bug 1363424. Is there anything here that you can spot?
Flags: needinfo?(past) → needinfo?(xidorn+moz)
So, for the range in this profile, which has ~58s work there, ~11.2s we are waiting for next event in the event loop. Outside that, I see ~1.1s spent in mach_msg_trap under reflow, probably for shaping using CoreText. This sounds a bit high to me, especially given that I can see continuous ranges where we keep calling into this. But I'm not sure what should I expect here, probably we should check with jfkthame and see what does he think about this. For the style system, restyle takes ~1.7s, parsing takes ~1.1s, updating stylist takes ~0.4s, and several other stuffs (e.g. loading binding, styling anonymous boxes, taking snapshots) take ~0.2s in total. This doesn't sound very unreasonable, especially given it spread across the range and apparently the script consistently triggers sync reflow via getting various layout-depending bits. Similarly, reflow (under ProcessReflowCommands) takes ~4.8s including the ~1.1s mentioned above. Painting (under ProcessPendingUpdatesPaint) takes ~2.0s in total, which might be a bit high given that we don't need to flush it for script, but I don't have enough experience to judge whether it's problematic. Apart from that, it seems the rest is mostly JS execution and probably compiling. Even if they aren't, I'm not quite sure what to look next. (It would be helpful if we can simply filter out all the bits we are just running JS, as I mentioned in bug 1363424 comment 23). In general, I don't really think the browser being busy running some JS should lag the whole OS, so more likely there is some interaction with the OS which causes that. Font shaping with CoreText is possibly one of that, and maybe there is some graphics bits as well. (e.g. maybe the reporter can try disabling hardware acceleration and see if that helps.)
Hello! I'm the reporter of bug 1430820 which has a similar issue with a very similar system (MacBook 12", Early 2015, Retina). I can confirm that both Firefox Quantum 59.0 and Firefox Nightly 61.0a1 (2018-03-14) have a little bit of a performance boost when hardware acceleration is disabled - but it does not solve the bug as a whole. I took your suggestion to disable hardware acceleration. Here are the results of my stopwatch. What I did was I closed the browser, removed all profiles and removed cache directories, and reopened the browser. This should simulate a fresh install - indeed, the fresh start came up each time. There's no extensions installed at all. The stopwatch was from the time I submitted my login on Facebook to when the homepage fully loaded. ------- Firefox Quantum 59.0: with hardware acceleration - 27.4s without hardware acceleration - 21.9s Firefox Nightly 61.0a1 (2018-03-14): with hardware acceleration - 27.3s without hardware acceleration - 21.7s Firefox ESR 52.7.1: with hardware acceleration - 7.8s without hardware acceleration - 7.9s Safari: 6.4s ------- To confirm your suspicion, Firefox does not generally slow down my Mac; it just takes a long time to load and, quite often, spikes in CPU. It just takes a long time to load. I concur with the reporter that I get "Impact on framerate: 10/10" on loading any page, even AOL and DuckDuckGo. Here's a profile that I captured with Nightly with hardware acceleration disabled while loading Facebook on fresh install with just the Gecko Profiler add-on installed (did this separately from the stopwatch times above): https://perfht.ml/2pf7d1X Hopefully this helps, even a little!
Would you mind also profiling with ESR so that there is a chance we can understand what the difference is there?
No problem! ESR with hardware acceleration: https://perfht.ml/2pfh5J0 ESR without hardware acceleration: https://perfht.ml/2Is7Y0r
Thanks! The first issue I can see is that reflow is much slower on later version than 52. So in the profile in comment 23 (https://perfht.ml/2pf7d1X), the total time taken under PresShell::DoReflow is just 144ms, while in the profile in comment 25 (https://perfht.ml/2Is7Y0r), it takes 3,861ms, in which we have 3,061ms working inside gfxMacFont. jfkthame, any thought about this?
Another observation is that, if you search > https://static.xx.fbcdn.net/rsrc.php/v3iUNm4/y7/l/en_US/SeBpw4zyG2a.js in the two profiles mentioned in comment 26, you can see that the "j" function in ESR takes only 266ms, while in Nightly it takes 741ms. The time in this function seems to be dominated by HTMLCanvasElement.toDataURL, which itself takes 500ms on Nightly. Since ESR only has pseudo-stack, it is unknown how long does the same thing takes there, but given that there are lots of "USC::" functions from AppleIntelIBDWGraphicsGLDriver there, it should also be calling into that function. So how does that regress? Jeff, do you have any thought about this?
The USC:: functions are from shader compilation happening when we try to read back the data from a canvas. While this isn't great it's also not surprising. It may be that we're using a different version of Skia between the two versions. Lee might have a better opinion of what's going on. That being said, I expect there's a larger problem going on here to cause such a big difference in load time between ESR and nightly. Perhaps it would be worth disabling the throbbers to see if that makes a difference.
Flags: needinfo?(jmuizelaar) → needinfo?(mconley)
Hi mdppatrick, The profiles from your ESR build show that multi-process Firefox is disabled. This might be because some add-ons are installed that disable it automatically. If you restart with your add-ons disabled, in about:support, does it say anything about Multi-process Windows being enabled? If so, does the performance behaviour change in that state on the ESR build? Another thing to try - on the Nightly that is slow to load Facebook, can you change the following preference in about:config layout.frame_rate and change that to 15? You'll then need to restart the browser. This will lower your frame rate from 60fps to 15fps. We want to see whether or not the high-fps throbber is related to the behaviour. Please don't forget to reset this setting back to its default (-1) once you've concluded the experiment. We also suspect that tnikkel's work in bug 1191965 will help here, at least for macOS users, as it'll hopefully reduce some of the memory overhead when drawing the throbbers (assuming the throbbers are related here). In the meantime, I'm going to see if I can land a hidden pref to disable the throbbers on Nightly for our users to test with.
Flags: needinfo?(mconley) → needinfo?(mdppatrick)
ni?ing myself for patch to disable throbbers.
Just tested with layout.frame_rate = 15 on the latest Nightly. Oh, wow! Firefox just started flying! No fans, cpu normal. Facebook works very fast.
Actually, quite the opposite: I opened ESR with add-ons *enabled* (by the way, the only add-on I have installed is the Gecko Profiler) and it says that multiprocess windows is enabled: "1/1 (Enabled by default)". I opened Facebook in ESR with add-ons disabled and there's no difference in the load times. I'm not sure how or why the profiles say that multi-process is disabled, but I can confirm that there's no negotiable difference - with or without it. Either way, it works well! Changing layout.frame_rate to 15 allowed for a fresh login on Facebook to completely load in 12.19s. After a series of reloads, loads average out around 9s. You may have hit it on the nail!
Whiteboard: [perf-tools] → [perf-tools][fxperf]
Depends on: 1447153
(In reply to Sveon from comment #17) > Here's a new profile with the latest nightly as of 20.02.2018: > > https://perfht.ml/2EJabWT When looking at this profile, the things that look unusual to me are: - we paint continuously. The following comments saying that throttling painting to 15fps helps significantly seem to confirm that it's the problem. - in the content process 1, we have a lot of time spent in CGSGetDisplaySystemState from the SkyLight library (https://perfht.ml/2u3OBaC). I wouldn't be surprised if these 2 things were related.
(In reply to Florian Quèze [:florian] from comment #33) > - in the content process 1, we have a lot of time spent in > CGSGetDisplaySystemState from the SkyLight library > (https://perfht.ml/2u3OBaC). > I wouldn't be surprised if these 2 things were related. Huh, interesting. I hadn't noticed the SkyLight stuff before. mstange, do you know why that (otherwise idle) content process should be talking to CoreGraphics so much?
(In reply to Mike Conley (:mconley) (:⚙️) (Totally backlogged on reviews and needinfos) from comment #34) > Huh, interesting. I hadn't noticed the SkyLight stuff before. mstange, do > you know why that (otherwise idle) content process should be talking to > CoreGraphics so much? I suspect this is just us blocking on the window server being busy.
I agree. SkyLight is involved in the communication with the WindowServer. The native event loop needs to talk to the window server for every event that it processes. So the only remaining question is why we're processing native events in an idle content process at all. I think this might be caused by the profiler's jank instrumentation, which sends Gecko events to all threads every 16ms. We should have a separate setting for that feature so that we can turn it off. I've filed bug 1447338 for that. We should also finish off bug 1384336 so that we can process Gecko events in the content process without being bogged down by the native event loop.
Hey mdppatrick, now that bug 1447153 has landed, can you try the experiment again, but this time using the default frame rate, but setting the following pref to true: browser.tabs.hideThrobber (you'll need to create this boolean pref, as it doesn't exist by default - and this will also require a restart) Do you find that the page load performance improves with that pref set to true, and is comparable with the 15fps test that we did?
I tested it and it looks OK. Latest Nightly with these settings: browser.tabs.hideThrobber = true layout.frame_rate = -1 (default) I opened 7-8 tabs, different sites including Facebook. CPU appears normal. Fans are silent. In the past, this would have ground to a halt the whole computer.
(In reply to Sveon from comment #38) Thanks, Sveon. I have one more experiment for you - can you please reset the browser.tabs.hideThrobber pref, and then set this pref: browser.schedulePressure.timeoutMs to 150ms and then restart and repeat the experiment?
Tested it. Nope! This way it's back to the slow and laggy state.
Depends on: 1448133
I concur with the reporter. Enabling "browser.tabs.hideThrobber" gave a first-time load of Facebook with about 9.3s, and loads after are about 6.7s. A huge improvement! "browser.schedulePressure.timeoutMs" does not improve anything, and even additional loads take about 19s to complete.
Okay, one more experiment - I've landed a patch this weekend that adds a Nightly-only pref that should drop the throbber framerate to 30fps from 60fps. Can somebody experiencing this try a recent Nightly with all of the previous prefs reset to their defaults, but: browser.tabs.30FpsThrobber set to true? (You'll need to add the perf). This will also require a restart of the browser. Does that improve page load performance at all?
I gave it a try with browser.tabs.30FpsThrobber = true. The performance seems acceptable and about the same as when browser.tabs.hideThrobber = true. With these settings the browser is usable (unlike before). However, I still find the CPU/GPU are getting hotter than when I use other browsers, but I guess this is related to different issues.
I did some further testing. It appears that 30FpsThrobber = true makes things significantly better than the default settings, but still not as good as when hideThrobber = true. Here are two new profiles from the latest Nightly https://perfht.ml/2pGiJ6A --> browser.tabs.30FpsThrobber = true https://perfht.ml/2I4s5R3 --> browser.tabs.hideThrobber = true Basically: 1. default settings --> awful 2. browser.tabs.30FpsThrobber = true --> better 3. browser.tabs.hideThrobber = true --> much better
(In reply to Sveon from comment #44) > https://perfht.ml/2pGiJ6A --> browser.tabs.30FpsThrobber = true > https://perfht.ml/2I4s5R3 --> browser.tabs.hideThrobber = true I isolated one refresh tick from both these profiles where the markers indicated that we were doing very similar work: https://perfht.ml/2ukFe6y --> browser.tabs.30FpsThrobber = true https://perfht.ml/2ujv9XN --> browser.tabs.hideThrobber = true You can see that in the first profile, there is compositing happening during this time, and in the second profile, there is not. And the reflows in the second profile finish about 2x faster. You can also see the samples being taken at a more consistent rate in the second profile; in the first profile there are gaps in the graphs at the top.
Note that both profiles show the same gfxMacFont problem as xidorn pointed out in comment 26, as well as bug 1430820 comment 15. Hide throbber: https://perfht.ml/2upAtIW (3616 samples) 30fps throbber: https://perfht.ml/2uqDoB5 (4700 samples) Most of that blocking seems to be us waiting on macOS to do some font shaping... I wonder if the frequent compositing is drawing resources away from the bits of the OS that respond to font shaping requests?
This may, at least in part, be the same issue as noted in bug 1430820 comment 17. I'm working on a patch there that could help, I think. (Depends on the site: in particular, content that uses Apple AAT fonts such as the macOS system font, and has non-zero letter-spacing applied, will be hitting a slow text shaping path.)
Whiteboard: [perf-tools][fxperf] → [perf-tools][fxperf:p2]
Guys, have you tried stopping the Firefox window from being transparent by setting gfx.compositor.glcontext.opaque TRUE and restarting Firefox? This only works on Nightly For me this gave by far the biggest performance win. Isn't this just bug 1422090 and the proposed fix bug 1429522? (and bug 1404042 etc etc etc) I guess there might be second order problems with fonts etc but as Jeff says in bug 1424042 comment 141, "The window transparency seems to be a big enough problem that it will be hard to see other problems". I think you're flogging a dead horse with these investigations until transparency goes away. IIUC, it goes like this. MacOS does its window composition on the GPU (G for Graphics). At scaled resolutions this involves interpolation which seems very hard work for the GPU. Firefox's lack of Core Animation means that even tiny animations cause full window refreshes @ [frame rate] which hit the GPU very hard. Then for historical reasons Firefox presents to MacOS as a transparent window, which hits the GPU even harder. Finally Intel seems to set up its CPU + iGPU package so that the GPU gets power priority:- when the GPU is working very hard the CPU gets power starved and its performance slows dramatically. So Firefox is a perfect storm. At highest scaled resolutions, a full screen Firefox window showing ANY animation at all asks the GPU to do a transparency & interpolation calculation on 1680x2x1050x2 = 7 Mpx @ [frame rate] which for any reasonable frame rate is way beyond the limit of my little 15 W iGPU and probably yours too. The GPU redlines, fries and drops frames. The CPU power starves and throttles, everything slows to red hot janky molasses. It's horrible. I see this on any page showing an animation, however small. If you're at 1680x1050 the fan spools up within seconds. That's the GPU transparent-ing and interpolat-ing like crazy. Setting layout.frame_rate to 15 fps dramatically reduces the load on the GPU and everything cools down. Disabling the tab throbber helps a bit during page loads. But even with the throbber disabled, Firefox paints quite often during page load, ~30-40 fps so killing the tab throbber has only limited benefit for me. In bug 1429522 comment 1 I saw that for me, stopping transparency and the tab throbber allows the CPU to run full speed until content starts to paint, but after that the GPU still redlines and the CPU throttles so page loads are not as fast as they could be. I can think of some fancy hacks like setting layout.frame_rate to a low value just during page load to throttle the whole browser, not just the throbber. Basically ensure that the GPU isn't too power greedy. But until there's a viable transparency fix, I'm not sure it's worth it.
Depends on: 1453701
Mike, I think I saw you discussing on IRC making the throbber 30fps by default. Does this mean we can close this bug?
Whiteboard: [perf-tools][fxperf:p2] → [perf-tools][fxperf:p3]
I suspect the ClientStorage work in bug 1265824 and the lowering of the framerate of the throbber in bug 1511095 will have helped here. Ultimately, the best solution is probably to use the OS compositor directly, rather then sending it things that we've already composited. That's what pcwalton is working on for planeshift, but we're unlikely to see that bear any kind of fruit until we're deeper into 2019. So yeah, I'm going to close this as INCOMPLETE unless anybody else has additional actionable information to help us solve this more quickly in a reasonable fashion.
Status: UNCONFIRMED → RESOLVED
Closed: 2 years ago
Resolution: --- → INCOMPLETE
You need to log in before you can comment on or make changes to this bug.