Closed Bug 1560090 Opened 5 years ago Closed 4 years ago

High refresh rate monitor (>200Hz) slows performance

Categories

(Core :: Networking, defect, P1)

68 Branch
defect

Tracking

()

RESOLVED DUPLICATE of bug 1587058

People

(Reporter: lpy750, Assigned: mayhemer)

References

Details

(Keywords: regression, Whiteboard: [necko-triaged])

Attachments

(1 file)

User Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:68.0) Gecko/20100101 Firefox/68.0

Steps to reproduce:

I recently noticed a huge decrease in performance when overclocking my monitor.

Tested on two PCs - older i7 CPU with Windows 10 (1809), and newer i9 CPU with Windows 10 (1903)

Steps to reproduce, with a fresh profile:

A demonstration video can be seen here - https://streamable.com/imxrv

This issue could be a duplicate of bug 1439683. I will also set a component for it. If it isn't the right one, please feel free to change it.

Component: Untriaged → Graphics
Product: Firefox → Core

There are some problems with very high refresh rates, which is why they are not "officially" supported. One problem is that painting the browser window that often is computationally very expensive, and can cause problems with scheduling other events, such as JavaScript or animation timers.
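To put the cost of painting "that often" into numbers, the time budget per frame shrinks quickly as the refresh rate rises. A minimal sketch of that arithmetic follows; the function name is made up for illustration, and the rates are the ones that come up in this report.

```python
def frame_budget_ms(refresh_hz: float) -> float:
    """Return the time available per frame, in milliseconds."""
    if refresh_hz <= 0:
        raise ValueError("refresh rate must be positive")
    return 1000.0 / refresh_hz


# Print the per-frame budget for refresh rates mentioned in this bug.
for hz in (60, 144, 240, 300):
    print(f"{hz:>3} Hz -> {frame_budget_ms(hz):.2f} ms per frame")
```

Any work scheduled between frames, such as timers or idle callbacks, has to fit into whatever is left of that budget after the frame itself has been handled.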

Priority: -- → P3

This happens to me, as well. I'm using a single 240hz display, on Windows 10 64-bit. It causes all page loads that aren't from files on disk to appear to be slow.

Reproduction

Here is how I can reproduce it on my machine:

  1. Clean reset of Firefox. (Just tested on 71.0 x64, and Nightly 73.0 x64.)

  2. Verify display is running at 240hz.

  3. Install at least 1 extension. Ones that I have tested: uBlock Origin (large impact), uMatrix (large impact), Firefox Multi-Account Containers (medium impact.)

  4. Navigate around any web site by clicking on links.

Page loads appear to be much slower than expected. This includes "websites" which are running on my local network, and consist only of serving simple static "hello world" HTML pages with no JavaScript, CSS or images, so internet connection speeds are probably not an issue. This also occurs even if the extensions aren't supposed to be doing any real work -- for example, if uMatrix is bypassed globally (but without the extension disabled in Firefox), or if the Firefox Multi-Account Containers addon is just using a single container.

And by "very slow" I mean the apparent timings of the page loads go from <10 milliseconds to >2 seconds (with uMatrix or uBlock Origin) or >1 second (with Firefox Multi-Account Containers.)

  5. Disable all extensions, and page loads become fast again. Alternatively, the extensions can be left enabled, and layout.frame_rate can be set to something other than -1, such as 90, and page loads will also become fast again.

Screen capture (GIF)

I have recorded a GIF screen capture of this in effect on one of my real websites, which is just some simple static files, without any JavaScript. I used a GIF recorder instead of full-featured capture software like OBS, because OBS can induce vsync of a lower frame rate on the desktop, which interferes with the results. Note that Firefox is not in a reset-clean state in this recording, but the effect is the same. The GIF is over 3 megabytes, so I'm hosting it externally on my own server, instead of uploading it here.

https://cancel.fm/stuff/share/ff_slow_1.gif

Gecko Profiler captures (Firefox Nightly)

Here are two Gecko Profiler captures using Firefox Nightly (73.0) with a clean profile, plus uBlock Origin and uMatrix installed. The action is simply refreshing the page at https://cancel.fm/ripcord/ (the same as in the screen recording above.) I noticed that the "Waiting for socket thread" state of network requests seems to be important, so I have looked up how to enable profiling of the socket process, and I have set MOZ_PROFILER_STARTUP_FILTERS=GeckoMain,Socket Thread in the environment.
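For anyone wanting to repeat this kind of capture, the environment could be prepared roughly as follows. This is only a sketch: MOZ_PROFILER_STARTUP_FILTERS and its value come from the comment above, MOZ_PROFILER_STARTUP=1 (the switch that starts the profiler at launch) is an addition not mentioned in this report, and the helper name is hypothetical.

```python
import os


def profiler_startup_env(thread_filters, base_env=None):
    """Build an environment that asks Firefox to profile from startup.

    thread_filters: thread names to record, e.g. ["GeckoMain", "Socket Thread"].
    base_env: environment to extend; defaults to the current process environment.
    """
    env = dict(os.environ if base_env is None else base_env)
    # Assumption: start the profiler at launch (not stated in the report).
    env["MOZ_PROFILER_STARTUP"] = "1"
    # From the report: restrict recording to the listed threads.
    env["MOZ_PROFILER_STARTUP_FILTERS"] = ",".join(thread_filters)
    return env


env = profiler_startup_env(["GeckoMain", "Socket Thread"], base_env={})
print(env["MOZ_PROFILER_STARTUP_FILTERS"])
```

The resulting dictionary can be passed as the env argument of subprocess.Popen when launching the browser; the path to the Firefox binary depends on the machine and is left out here.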

In capture #1, uBlock Origin and uMatrix are enabled. However, the internal settings of the extensions have been set to bypass the domain being loaded, which means these extensions aren't supposed to be applying their blocking rules or doing any significant work. For capture #2, the extensions have been disabled in Firefox Nightly's about:addons. You can see that the page loads much more quickly in #2.

  1. uBlock Origin and uMatrix enabled: https://perfht.ml/2Zwe4pd
  2. uBlock Origin and uMatrix disabled: https://perfht.ml/2QrcUr5

Machine information

This is a 16-core machine (AMD Threadripper 1950x) with 64gb memory.

Application Basics
------------------

Name: Firefox
Version: 73.0a1
Build ID: 20191227215440
Update Channel: nightly
User Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:73.0) Gecko/20100101 Firefox/73.0
OS: Windows_NT 10.0
Launcher Process: Enabled
Multiprocess Windows: 1/1 Enabled by default
Remote Processes: 6
Enterprise Policies: Inactive
...

Graphics
--------

Features
Compositing: WebRender
...
Uses Tiling (Content): true
Off Main Thread Painting Enabled: true
Off Main Thread Painting Worker Count: 4
Target Frame Rate: 240
DirectWrite: true (10.0.14393.0)
GPU #1
Active: Yes
Description: NVIDIA GeForce GTX 1080
Vendor ID: 0x10de
Device ID: 0x1b80
Driver Version: 24.21.13.9836
Driver Date: 6-24-2018
...

If there is any other kind of data you would like me to collect, please let me know.

Thanks for the in-depth analysis, Andrew Richards, and confirming what's happening. It's been a while since I played around with layout.frame_rate, and interestingly Firefox now crashes the moment I try to change the value. Even when I tried with Windows 10 Sandbox. And I was also able to reproduce the crash on a tablet. I can however use user.js to adjust the value.
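Setting the value through user.js, as described above, means the pref is applied when Firefox starts rather than changed at runtime. A minimal sketch of writing that override is shown below; the helper name is hypothetical and the demo runs against a temporary directory rather than a real profile folder.

```python
import tempfile
from pathlib import Path

PREF_NAME = "layout.frame_rate"


def set_frame_rate_pref(profile_dir: Path, frame_rate: int) -> str:
    """Write a single layout.frame_rate override into profile_dir/user.js."""
    user_js = profile_dir / "user.js"
    old_lines = (
        user_js.read_text(encoding="utf-8").splitlines() if user_js.exists() else []
    )
    # Drop any earlier override of this pref so only one line remains.
    kept = [line for line in old_lines if f'"{PREF_NAME}"' not in line]
    kept.append(f'user_pref("{PREF_NAME}", {frame_rate});')
    text = "\n".join(kept) + "\n"
    user_js.write_text(text, encoding="utf-8")
    return text


# Demo against a throwaway directory standing in for a profile folder.
with tempfile.TemporaryDirectory() as tmp:
    profile = Path(tmp)
    set_frame_rate_pref(profile, 90)
    demo_text = set_frame_rate_pref(profile, 60)
    print(demo_text, end="")
```

Firefox needs to be restarted for a user.js change to take effect, and a value of -1 restores the default behaviour of following the display.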

I found somewhat of a workaround which of course doesn't fix the issue, and I feel it's a matter of time before more people are using 240Hz monitors and experiencing this slowdown... start Firefox with a lower refresh rate such as 60/120/144Hz. Once Firefox has loaded, revert back to 240Hz. Firefox will continue to run as though the refresh rate is 60/120/144Hz and not slow when it's 240Hz.

Appending my last comment, changing the value of layout.frame_rate is crashing Firefox 72 Beta and Firefox Nightly 73, but NOT the latest stable release, Firefox 71.

That's interesting. Changing layout.frame_rate does not result in a crash for me. But the setting being changed after Firefox has already started seems to have a different effect than if it were already at that setting when Firefox started up.

Here is an example of what happens on my machine.

  1. Verify display is running at 240hz.
  2. Clean reset Firefox. Then, install uBlock Origin and uMatrix extensions.
  3. Restart Firefox.
  4. Visit any website and click around between pages. Page loads are very slow.
  5. Change layout.frame_rate from -1 (the default) to 60. Do not restart Firefox.
  6. Visit any website and click around between pages. Page loads are now significantly faster, but not as fast as if there were no extensions installed.
  7. Restart Firefox.
  8. Visit any website and click around between pages. Page load times appear to be faster, on average, than they were in step #6.
  9. Change layout.frame_rate from 60 to -1. Do not restart Firefox.
  10. Visit any website and click around between pages. Page loads are still pretty fast.
  11. Restart Firefox.
  12. Visit any website and click around between pages. Page loads are slow again.

Very strange.

See Also: → 1614212

Just a quick update, currently on 75.0b3 (64-bit) and still the same situation many months later. As I'm not usually running at 240Hz unless I'm gaming, it's not always annoying. I went through each add-on one-by-one and found out of the 22 I have installed, 3 of them 'trigger' the slowness.

Basically I have 19 add-ons that can all be running at the same time, and everything will be fine. However, out of the 3 remaining add-ons, running them even independently causes the whole browser to slow down. So maybe there's something in the code of these 3 that they have in common? There are the following:

Multithreaded Download Manager
uBlock Origin
User-Agent Switcher and Manager

Basically I have 19 add-ons that can all be running at the same time, and everything will be fine. However, out of the 3 remaining add-ons, running them even independently causes the whole browser to slow down.

Hmm. This isn't my experience. Every extension causes the browser to behave worse. But the degree of the badness varies. uBlock Origin and uMatrix both have strong effects. I talked about this above in my bug report, and you can see it in the profile captures.

To reiterate an example, the Firefox Multi-Account Containers extension has less of an effect, but the effect is still there. You may need to pay closer attention to the effect it's having -- you might just not be noticing it as easily. If you measured with the profiler, or paid closer attention to the behavior of the browser, you'd probably notice that stuff gets slower by at least 50 to 100 milliseconds in most cases. But that's just my experience, so I could be wrong. (I was able to reproduce this, identically, on 2 completely different computers, though.)

I haven't tried 75.0x versions yet, but I can confirm that the same thing is still happening in 74. The same behaviors are observed, except that changing layout.frame_rate now causes Firefox to crash most of the time.

I suspect this is happening to everyone on Windows (not sure about other OS) with higher refresh rate displays to varying degrees. I can see it happening, though to a lesser extent, at 160hz. I have talked to a friend with a 240hz display and they experience it as well. (Neither of us can use Firefox as a regular daily web browser, due to this problem.)

Actually, which extensions don't seem to cause the problem for you? I could give one of them a try and see what happens.

You're correct, some have more of an effect than others. uBlock Origin is the biggest offender. Multithreaded Download Manager & User-Agent Switcher and Manager do slow the browser, but not as much. The ones that don't have a visible effect on my end (like you said if I measured them I might see what they're doing, but visibly there appears to be no difference), are:

BetterTTV
Clickbait Remover for Youtube
Cookie AutoDelete
Decentraleyes
Disconnect for Facebook™
Enhancer for YouTube™
FrankerFaceZ
Google search link fix
Stylus
To Google Translate
View Image for Google Images
Old Twitter Layout

I can run any or all of them and Firefox runs fine. The moment I include one of the 3 I mentioned, everything slows right down.

Thanks! I'll give some of those a try tomorrow, if I have time, and see if there's any significant change in the profiler captures.

I ran two profiling sessions with uBO, at different frame rates -- the profiled scenario is forcing a reload of www.google.com.au five times:

There is a noticeable difference.

In the profiling sessions, the obvious difference is the time spent in the Web Content's poll, ~16 seconds with frame_rate at 300, while it is ~8 seconds with the default frame rate (60Hz). In fact, all the Parent, WebExtensions and Web Content processes display the same behavior, i.e. a lot more time spent in the poll function when frame_rate is 300.

I can see no meaningful difference in JavaScript execution, in either the web page's own code or uBO's code.

Though I didn't run a profiling session without uBO, I can confirm that with frame_rate: 300 and without uBO the page appeared to refresh as fast as with frame_rate: -1 and with uBO. I didn't try with other content blockers or any other extensions.

Note that uBO makes use of requestAnimationFrame ("RAF") in its content script, though as said it does not spend more time in its JS code. As a matter of fact, in the profile for the frame_rate: 300 case, uBO spent 20 ms in its RAF callback, while it spent 29 ms in the frame_rate: -1 case -- so the JavaScript code of either the page or uBO is not the issue here.

See Also: → 1622879

Bugbug thinks this bug is a regression, but please revert this change in case of error.

Keywords: regression

Ok, I was ready to claim this isn't a regression, but I've downloaded several Firefox versions to pinpoint when things have gone south. There are two issues here.

In regards to layout.frame_rate:

Changing this preference (unless done in user.js) crashes the browser beginning Firefox 72. Therefore,

layout.frame_rate crash bug - exists from Firefox 72-75beta (at the time of this post).

The initial reported bug:

The general browser slowdown when specific add-ons are installed becomes easily noticeable when the monitor's refresh rate is above 248Hz and/or layout.frame_rate is set to 248 or above. In some cases this will occur below 248Hz, such as 240Hz, but I'm not exactly sure why. Possibly something to do with a "clean" vs "used" profile.

This bug can be reproduced beginning Firefox 58.

"High refresh rate bug" - exists from Firefox 58-75beta (at the time of this post).

To reproduce and compare this bug to a version where it didn't exist:

  • Download Firefox 57

  • Set layout.frame_rate to 300 (or use a monitor with 250Hz or higher refresh rate)

  • Install uBlock Origin add-on (and optionally 'User-Agent Switcher and Manager' add-on)

  • Load a website that generally loads quickly. Refreshing the page should be instant

  • For example, Website A takes 0.3 seconds to load. Website B takes 0.6 seconds to load

  • Upgrade to Firefox 58 with the same steps as above

  • Load a website that generally loads quickly. Refreshing the page should be instant but isn't

  • For example, Website A takes 1.5 seconds to load. Website B takes 3 seconds to load

In summary:

Browser loading webpages up to 5x slower with certain add-ons and high refresh rate monitors (250Hz+), since Firefox 58.
layout.frame_rate crashing browser when value is changed, since Firefox 72.
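The "up to 5x" figure follows directly from the example timings given in the steps above. A small sketch of that arithmetic, with a made-up helper name and the illustrative Website A/B numbers from this comment:

```python
def slowdown_factor(before_s: float, after_s: float) -> float:
    """Return how many times slower a load became."""
    if before_s <= 0:
        raise ValueError("baseline load time must be positive")
    return after_s / before_s


# Example timings from this comment: (Firefox 57, Firefox 58), in seconds.
examples = {"Website A": (0.3, 1.5), "Website B": (0.6, 3.0)}
for name, (before_s, after_s) in examples.items():
    print(f"{name}: {slowdown_factor(before_s, after_s):.1f}x slower")
```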

(In reply to rhill@raymondhill.net from comment #12)

I ran two profiling sessions with uBO, at different frame rates -- the profiled scenario is forcing a reload of www.google.com.au five times:

Those stacks seem incorrect. Last time I saw callstacks spuriously go through c++ STL frames, I think it was actually Rust stacks or just bad symbols.

Yeah those profiles have bad symbols. It would be good to try again with a Firefox Nightly build that was downloaded from mozilla.org (because the Mozilla symbol server will have symbols for those.)

I'm not exactly sure what is meant by bad symbols, so hopefully I did this right. This is Firefox Nightly with a clean profile plus uBlock Origin add-on.

No settings were changed in Firefox. Only the refresh rate in Nvidia Control Panel was changed:

144Hz Test #1 (abc.net.au): https://perfht.ml/392d8fs
144Hz Test #2 (abc.net.au): https://perfht.ml/38Y6Spb
250Hz Test #1 (abc.net.au): https://perfht.ml/39ZWnmz
250Hz Test #2 (abc.net.au): https://perfht.ml/2QnMpUn

144Hz Test #1 (example.com): https://perfht.ml/3b8yl9d
144Hz Test #2 (example.com): https://perfht.ml/2UfKhiv
250Hz Test #1 (example.com): https://perfht.ml/2wjL0qO
250Hz Test #2 (example.com): https://perfht.ml/3aXhhTs

(I did not expect quite so much code in these sites: https://www.abc.net.au/res/sites/news-projects/news-core/1.26.0/desktop.js:1:**367079**)

Those profiles (good symbols this time!) look mostly idle to me, though some of them don't have the "Content Process", which is the process where I expect the issue is.
Also what does Test #1 mean, vs Test #2?

Indeed, some websites are very heavy on the JavaScript, which is why I also included example.com which is obviously a very basic website.

The captures are basically, hit start, wait 1 second, reload the webpage, end capture after a few seconds. Test #1 and Test #2 labels are basically the same thing, just done twice to allow for any variation, if that makes sense. Although I suppose I could've done multiple page reloads within one capture, but this is how I did it instead.

Ok thanks. I'm super curious so I'll try to repro locally tomorrow.

No problem. And I agree, "Test" isn't the best word to use, I could've called it "Capture".

I only used the default Gecko Profiler settings, so if there's a setting you want me to change to capture more valuable information, I could try that.

Confirmed locally on my machine with uBlock Origin, frame_rate >= 300 causes google.com to take weirdly long to load.

Assignee: nobody → jgilbert
Status: UNCONFIRMED → NEW
Ever confirmed: true

Huge gaps between navigate start, load start, and load done:
https://perfht.ml/3a6Bcze

I feel like this is a network/resource loading stack problem? The load times bound the network request bands in the profile. The profiler definitely sees requests taking longer.

Attached image Ann long socket.png

Waiting a quarter of a second for the socket thread, then another 250ms to send a response. :(

Component: Graphics → Networking

I'm glad to see you can reproduce it on your end. Simple things such as favicon.ico that should only take a few ms might take up to 500ms at ~300 fps, which obviously indicates something is very wrong.

Also I only just realised Firefox already/now has its own profiler. I was using the Gecko Profiler add-on all this time.

The profilers are the same. I think the recording UI is just different, is all.

It seems that the main thread is quite busy.
I think we can't do much for now until we fix bug 1528285.

See Also: → omt-networking
Whiteboard: [necko-triaged]

Can you add the Compositor and Renderer threads? Judging by the memory track in the GPU process, that's where the extra activity is, and that's probably what's starving the CPU for the rest of the system.
Is there any ongoing animation during this time? I.e. is there even a reason for the compositor to do anything? Maybe it's the tab throbber? On nightly, you can set the pref browser.tabs.hideThrobber to true. Can you check whether that improves performance?

(In reply to Markus Stange [:mstange] from comment #30)

Can you add the Compositor and Renderer threads? Judging by the memory track in the GPU process, that's where the extra activity is, and that's probably what's starving the CPU for the rest of the system.
Is there any ongoing animation during this time? I.e. is there even a reason for the compositor to do anything? Maybe it's the tab throbber? On nightly, you can set the pref browser.tabs.hideThrobber to true. Can you check whether that improves performance?

Just the browser chrome UI. I'll try without the throbber, and I'll get profiles with more threads.

Here's where the "Waiting for socket thread" string is from:
https://github.com/firefox-devtools/profiler/blob/95bb5bf269a48575155c34bad1a3b2320f942f78/src/components/tooltip/NetworkMarker.js#L110

There are so many dots to connect here...

you can set the pref browser.tabs.hideThrobber to true. Can you check whether that improves performance?

It does indeed improve performance from what I can see. Not to where it should be, but somewhere in between lower refresh rates, and Tab Throbber enabled.

It doesn't seem to help with the long "Waiting for socket thread" times. I'm still trying to figure out "who has the ball" here, because everyone in the profiles seems to be waiting on someone else to do something.

frame_rate: 300, throbber: false:
https://perfht.ml/2UjG0dS

Compositor and renderer, along with pretty much everyone else, are just waiting on something for the duration of the google.com load. I don't know where that work is happening, because the socket threads and process seem to be waiting idle too.

Causing page loading regressions on (increasingly common) high-refresh rate monitors is a high priority issue.

If bug 1528285 fixes this, it should be re-prioritized accordingly.
If we can't fix this in a timely fashion, we should stop-gap fix this by capping max refresh rate.

Leaving this unfixed will hemorrhage us power users if they compare us to Chrome.

FYI jbonisteel (gfx) and nhi (necko)

Assignee: jgilbert → nobody
Severity: normal → critical
Flags: needinfo?(nhnguyen)
Flags: needinfo?(jbonisteel)
Priority: P3 → P1
See Also: → 1613496

I wonder if we have any telemetry to get a sense of how many people use high-refresh rate monitors to understand the depth of the impact?

Flags: needinfo?(jbonisteel)

It's not easy to fix the problem of the main thread being too busy. There isn't an easy way to fix it, and we can't realistically get to this in the next couple of releases.

A few questions for Graphics team:

  1. Is this refresh rate supported? One of the earlier comments suggested that we don't support it.
  2. How many users could potentially be affected by this bug? (same question as comment #37)
Flags: needinfo?(nhnguyen) → needinfo?(jbonisteel)

I don't think we've proved that the main thread is busy yet, so I want to have that proved out before we assume it. In my profiles there was a lot of waiting, which didn't even seem to be blocking to me. I would like someone familiar with netwerk to look into this directly, and to certify that they think this is a problem elsewhere via experimentation on their own machine.

It's not just users individually affected by this bug, but also about the stories those affected users tell. Ten tweets of "I have a $5000 ultimate gaming machine and Firefox runs like crap" does a lot more damage than losing us ten users. Each blog post about "Firefox is slow on my new laptop" (and many new laptops are >60hz) serves to establish industry expectation that given how slow Firefox is on good machines, imagine how bad it is on slow machines.

Flags: needinfo?(nhnguyen)

The comment about 'official support' I think more reflects that there is work to be done here if it is expected that FF works well in these configurations.

It's not just users individually affected by this bug, but also about the stories those affected users tell.

True, of course, but I think the interesting thing to measure would be also - if we could prioritize this work - are we able to attract more users over time? A bit more of an open question though.

I don't know if there is telemetry that measures this currently, but I can take a look.

Can you make a http log (https://developer.mozilla.org/en-US/docs/Mozilla/Debugging/HTTP_logging) for me?

I will try to figure out what networking is doing, and what that "Waiting for socket thread" means exactly.

(logs contain cookie, use new profile and non-personal uri)

Flags: needinfo?(jgilbert)

As mentioned earlier, comment #30, hiding the throbber helps. I found that using userChrome.css to hide anything that animates: .tab-throbber, #stop-reload-button, .tab-loading-burst[bursting] (not a practical solution) restores about 99% of the speed. It's like there's another element somewhere that slows the browser down under some circumstances but I'm not sure what it is. This of course isn't a solution but I hope it can assist somewhat in tracking it down.

(In reply to Dragana Damjanovic [:dragana] from comment #41)

Can you make a http log (https://developer.mozilla.org/en-US/docs/Mozilla/Debugging/HTTP_logging) for me?

I will try to figure out what networking is doing, and what that "Waiting for socket thread" means exactly.

(logs contain cookie, use new profile and non-personal uri)

I can't. Please follow the STR to test it locally. A high-hz monitor is not required.

Flags: needinfo?(jgilbert)
Flags: needinfo?(nhnguyen) → needinfo?(honzab.moz)

(In reply to lpy750 from comment #42)

As mentioned earlier, comment #30, hiding the throbber helps. I found that using userChrome.css to hide anything that animates: .tab-throbber, #stop-reload-button, .tab-loading-burst[bursting] (not a practical solution) restores about 99% of the speed. It's like there's another element somewhere that slows the browser down under some circumstances but I'm not sure what it is. This of course isn't a solution but I hope it can assist somewhat in tracking it down.

Thanks for trying this, lpy750. The load delay is caused by the extension suspending each channel from "http-on-modify-request" and "http-on-examine-response". Why resuming is slowed down so much is still a question, but it could simply be the main thread being flooded by overly rapid vsync ticking.

Keeping ni? on me.

To follow up on earlier comments about telemetry and the question of how many users this could impact, it looks like around 3.74% of users have a monitor that would be considered high-refresh rate. Of course, that just answers the question about current users, not how many we could potentially attract if performance were better. I will chat with our PM to get product's take on this.

Flags: needinfo?(jbonisteel)

Going to work on this one now. There is now better logging of events and their delays (bug 1638925) which may give better clues what delays us.

Assignee: nobody → honzab.moz
Flags: needinfo?(honzab.moz)

I've added more logging of event chains lately. Even w/o it I can see that the channel loading the top level document is suspended by the extension for a significantly longer time (250ms vs 5ms) during opening. Thanks to the logs I can see that there is a ChromeUtils::IdleDispatch triggered by a timer (!!) on the top level doc load critical path! This is likely delayed by the higher number of refresh request events on the main thread (yet to be confirmed). That IdleDispatch is on the critical path of loads that are filtered by extensions was already reported long ago, from examination of Backtrack logs, in a different bug.

I can confirm that after SetNewListener is called (by extension filters) on the nsHttpChannel of the top level document (Parent/Main thread), an idle dispatch is scheduled with a timeout of 250ms. When executed, it sends a message to the extension process, which immediately answers back to the parent process, and after a series of DOM promises are fulfilled, the nsHttpChannel is finally resumed.

Because the main thread is flooded with VSync handling notifications, the idle dispatch can't fire before its own timeout. The higher the frame rate, the lower the chance that idle dispatches will fire before their timeout.
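The effect can be illustrated with a toy model. This is not how Gecko's scheduler is implemented; it only assumes that an idle task runs early when the gap between vsync ticks leaves enough slack, and otherwise waits for its timeout. The 250ms timeout comes from the comment above; the per-tick cost, the minimum idle slice, and all function names are invented for illustration.

```python
IDLE_TIMEOUT_MS = 250.0  # idle dispatch timeout reported in this bug


def idle_dispatch_delay_ms(refresh_hz, tick_cost_ms, min_idle_ms,
                           timeout_ms=IDLE_TIMEOUT_MS):
    """Toy estimate of how long an idle task waits before it runs.

    tick_cost_ms: assumed main-thread time spent handling one vsync tick.
    min_idle_ms: assumed smallest gap in which an idle task may be run.
    """
    frame_interval_ms = 1000.0 / refresh_hz
    slack_ms = frame_interval_ms - tick_cost_ms
    if slack_ms >= min_idle_ms:
        # Enough slack: the task runs right after the current tick is handled.
        return min(tick_cost_ms, timeout_ms)
    # No usable idle gap: the task only runs when its timeout forces it.
    return timeout_ms


def request_delay_ms(refresh_hz, tick_cost_ms, min_idle_ms, suspensions=2):
    """Assume each suspension point pays one idle-dispatch delay."""
    return suspensions * idle_dispatch_delay_ms(refresh_hz, tick_cost_ms, min_idle_ms)


# Made-up costs: 3 ms per vsync tick, 2 ms minimum idle slice.
for hz in (60, 144, 240, 300):
    print(f"{hz:>3} Hz -> {request_delay_ms(hz, 3.0, 2.0):.0f} ms added per request")
```

With these invented numbers the model flips from a few milliseconds to the full timeout somewhere between 144Hz and 240Hz, and two suspension points at 250ms each add up to roughly the 500ms per request seen in the profiles. The real threshold depends on how much main-thread time each refresh actually takes.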

See 1587058 - Redesign webRequest event coelescing to reduce head-of-line blocking, make it efficient with Fission for Backtrack logs and original report of this known issue.

I'm duplicating this bug to that one, because it's the main cause here.

Status: NEW → RESOLVED
Closed: 4 years ago
Resolution: --- → DUPLICATE

For what it's worth, just discovered that ui.prefersReducedMotion is a thing. And as expected, with that set to 1, there are no animations, and therefore no delays at these high refresh rates.

If https://bugzilla.mozilla.org/show_bug.cgi?id=1587058 is the cause, should its priority and severity not at least match the values for this issue, since it's causing it?

See Also: → 1864271