Closed Bug 1827052 Opened 1 year ago Closed 6 months ago

Firefox (and other Gecko-based browsers such as Waterfox) does not scale in performance as well as Chromium (Blink-based) browsers (Brave, ungoogled-chromium, Thorium, etc.) as more and more tabs are opened (more than a hundred tabs).

Categories

(Core :: Performance, defect)

Firefox 111
defect

RESOLVED INACTIVE

People

(Reporter: br9fwmvi2zmag5ejggxpkxpcxip3pm, Unassigned, NeedInfo)

Details

Attachments

(2 files)

User Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:109.0) Gecko/20100101 Firefox/111.0

Steps to reproduce:

(1)
Enable maximum logging with MOZ_LOG=all:5,timestamp,append and NSPR_LOG_MODULES=all:5,timestamp,append, and keep the Browser Console window open in "multiprocess" mode with all logging options enabled.

(2)
Open every link in a new tab, and type every URL into a new tab (never overwriting existing tabs).

(3)
Save all tabs (more than 100 open tabs) using the popular FLOSS browser extensions "Save Page WE" and "SingleFile", and also screenshot some tabs using the "Page Saver WE" extension.
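The logging setup in step (1) could be scripted roughly as follows. This is only a minimal sketch: the module string is copied from the step above, while the use of `MOZ_LOG_FILE` to redirect the output, the helper name `build_logging_env`, and the log path are assumptions that are not part of the original report.

```python
import os


def build_logging_env(log_file, base_env=None):
    """Return a copy of the environment with the Gecko logging variables set.

    `log_file` is a hypothetical path; the report does not say where the
    logs were written.
    """
    env = dict(os.environ if base_env is None else base_env)
    modules = "all:5,timestamp,append"  # value quoted from step (1)
    env["MOZ_LOG"] = modules
    env["NSPR_LOG_MODULES"] = modules
    # Assumption: send the log output to a file instead of stderr.
    env["MOZ_LOG_FILE"] = log_file
    return env


# Hypothetical usage (not executed here): launch the browser with this
# environment, for example
#   subprocess.Popen(["firefox"], env=build_logging_env("firefox.log"))
```

Building the environment separately from launching the browser makes it easy to inspect exactly which variables are passed before a long logging session starts.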

Actual results:

Works, but performance/responsiveness deteriorates rapidly as more and more tabs are opened (more than 100 tabs in total).

Expected results:

We exclusively use FLOSS browsers on relatively high-end workstation machines: machines with at least 128GB of RAM, at least 12 processor cores, and at least 4GB of GPU memory, all writing browser data exclusively to high-performance NVMe Gen3 or Gen4 drives.

We use both Firefox (and other Gecko-based browsers such as Waterfox, torbrowser, etc.) as well as Chromium (blink-based) browsers (primarily brave but also, ungoogled-chromium, thorium, etc.), and really push these browsers to the limit in terms of CPU/memory/IO utilization:

(1)
by opening every link in a new tab, and also typing every URL into a new tab (never overwriting existing tabs).

(2)
by enabling maximum logging with MOZ_LOG=all:5,timestamp,append and NSPR_LOG_MODULES=all:5,timestamp,append, and keeping the Browser Console window open in "multiprocess" mode with all logging options enabled, which we periodically save before it fills up. The MOZ_LOG/NSPR_LOG files easily exceed 150GB in size (combined) with more than 100 tabs open.

In the case of Blink-based Chromium browsers, we use the maximum logging options: "--enable-logging=stderr --v=9 --trace-startup --trace-startup-duration=10 --trace-startup-file=... --log-net-log=... --webview-verbose-logging --enable-extension-activity-logging --enable-gpu-client-logging --enable-gpu-service-logging --enable-gpu-command-logging --enable-gpu-driver-debug-logging --enable-sandbox-logging" along with other maximum log options, and the main log file gets even larger: easily 250GB or more with more than 100 tabs open.

(3)
by saving all tabs (more than 100 open tabs) using the popular FLOSS browser extensions "Save Page WE" and "SingleFile", and also screenshotting some tabs using the "Page Saver WE" extension.

Since all of this heavy-duty usage really pushes these competing browser engines to the limit, we have noticed that the Firefox Gecko-based browsers do not scale as well as the Chromium blink-based browsers, especially with more than 100 tabs opened at the same time.

With more than 100 tabs opened, Firefox starts to become more and more unresponsive, while Chromium tends to maintain responsiveness and performs much better.

(A)
We believe that the main reason is that Firefox content processes are allowed to grow disproportionately in terms of memory usage: the top content process easily grows to 25GB of private bytes or more, while the top content process of the Chromium-based browsers stays around 5GB or less. In other words, Firefox concentrates too much content in the top content processes, which end up using too much commit charge, and this slows down performance. See the attached screenshots, which illustrate this behavior.

On the other hand, the Chromium-based browsers more evenly distribute the memory usage of content processes, background processes and GPU processes so that performance/responsiveness is maintained better as the scaling of tabs is increased drastically.
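One way to put a number on this concentration is to compare the largest process against the total across all processes. The sketch below is only illustrative: the function name `memory_concentration` is hypothetical, and apart from the 25GB and 5GB top-process figures quoted above, the per-process sizes are made-up placeholders rather than measurements from the attached screenshots.

```python
def memory_concentration(private_bytes_gb):
    """Return (largest_gb, share_of_total) for a list of per-process sizes in GB."""
    if not private_bytes_gb:
        raise ValueError("need at least one process size")
    total = sum(private_bytes_gb)
    largest = max(private_bytes_gb)
    return largest, largest / total


# Placeholder data: only the first entry of each list comes from the report.
gecko_like = [25.0, 6.0, 3.0, 2.0]
blink_like = [5.0, 4.5, 4.0, 4.0, 3.5, 3.0]

for label, sizes in (("Gecko-like", gecko_like), ("Blink-like", blink_like)):
    largest, share = memory_concentration(sizes)
    print(f"{label}: largest={largest:.1f}GB, share of total={share:.0%}")
```

A share close to 1 means a single process dominates the memory footprint, while a low share indicates the more even distribution described for the Chromium-based browsers.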

In fact, we are able to use Chromium-based browsers with many 100s of tabs opened - even more than 500 tabs opened - whereas, with Firefox performance really starts to degrade rapidly with more than 200 tabs opened.

We have tried changing some of Firefox's about:config settings to try to obtain better performance but it does not seem to help: for example, by increasing "dom.ipc.processCount.webIsolated" from 1 to 4, "dom.ipc.processCount" from 8 to 98, "dom.ipc.processPrelaunch.fission.number" from 3 to 16.
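For reproducibility, the preference changes listed above could also be written to a `user.js` file in the profile folder instead of being edited by hand in about:config. The following is a minimal sketch under that assumption; the helper name `render_user_js` is hypothetical, and the values are the ones we tried, not recommended settings.

```python
import json


def render_user_js(prefs):
    """Render a dict of preference names and values as user.js lines."""
    lines = []
    for name, value in sorted(prefs.items()):
        # json.dumps yields JavaScript-compatible literals for the
        # string, integer and boolean values used by preferences.
        lines.append(f"user_pref({json.dumps(name)}, {json.dumps(value)});")
    return "\n".join(lines) + "\n"


# Values taken from the paragraph above.
TRIED_PREFS = {
    "dom.ipc.processCount.webIsolated": 4,
    "dom.ipc.processCount": 98,
    "dom.ipc.processPrelaunch.fission.number": 16,
}

print(render_user_js(TRIED_PREFS), end="")
```

Keeping the values in one place makes it easier to attach the exact configuration to a bug report or to revert it later.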

Note that this is not an issue of the computer having inadequate resources - as all of our workstation machines are relatively high-end with ample RAM memory, CPU cores, GPU memory and high-performance NVMe drives.

Note that this is not an issue of choice of Operating System - as we have noticed the same performance issue on both Windows and GNU/Linux.

So, this is strictly an issue of software-architecture and software-engineering - with software that does not performance-scale as well when the applications are pushed to the limit, in terms of CPU/memory/IO utilization.

(B)
Another way to improve the performance of Firefox is to create and release specialized builds of Firefox that use the most advanced features of modern CPUs, such as AVX2.

For example, Thorium is a specialized build of Chromium for both Windows and GNU/Linux that builds with AVX2, Polly, and other advanced-CPU-features and/or multi-core-libraries, to really make best utilization of modern CPU/GPU etc.

So, have the regular release of Firefox which is good-enough for the average computer user, but also release a separate performance-enhanced version of Firefox for high-end workstations.

Alternatively, make the Firefox installer have the option of selecting/installing the version of Firefox best suited to the user and/or computer-hardware - thus allowing Firefox to be installed both for the average computer user, but also high-end power users.

The Bugbug bot thinks this bug should belong to the 'Core::Performance' component, and is moving the bug to that component. Please correct in case you think the bot is wrong.

Component: Untriaged → Performance
Product: Firefox → Core

https://github.com/lowleveldesign/process-governor/

Also, we just tried using 'procgov64' command-line-tool for Windows (see above) to limit the maximum memory of the firefox top-content-process - in this case to 32GB - using the following command-line, which produces the following output:

$ '/e/Applications/https```github.comlowleveldesignprocess-governor`releases`download`2.10-1`procgov.zip/procgov64.exe' --verbose --verbose --verbose --verbose --verbose --maxmem 32G --nogui --pid 19560 --nowait

Process Governor v2.10.22150.6 - sets limits on your processes
Copyright (C) 2022 Sebastian Solnica (lowleveldesign.org)

CPU affinity mask: (not set)
Max CPU rate: (not set)
Max bandwidth (B): (not set)
Maximum committed memory (MB): 32,768
Maximum job committed memory (MB): (not set)
Minimum WS memory (MB): (not set)
Maximum WS memory (MB): (not set)
Preferred NUMA node: (not set)
Process user-time execution limit (ms): (not set)
Job user-time execution limit (ms): (not set)
Clock-time execution limit (ms): (not set)

Apparently, that simply marks the process for killing if the process exceeds that limit.

This indeed did happen to us, as that top-content-process simply runs-away unchecked, uncontrolled and unlimited - due to this defect.

And after this top-content-process is killed, as you probably know - the entire Firefox application does not abort/terminate, but extensions do not seem to work after this point.

We had to disable and re-enable each extension on the 'about:addons' page, and that seems to fix the problem with extensions working again.

We do not know, as of yet, if the killing of this top-content-process has made the browser lose unsaved information (for example, in the data written into the currently-loaded profile folder), and/or if it leaves Firefox in a bad/unstable state.

(Just so that we know: if Firefox engineers could please let us know what the actual consequences are of killing the top-content-process while the remaining content-processes are left untouched, that would be greatly appreciated.)

However, upon continuing to use Firefox after this point, new tabs open faster, and also seem to be saving correctly (using "Save Page WE" and/or "SingleFile").

So, at least for now, this behavior tends to reconfirm our theory that performance degrades noticeably when the private-bytes of the top-content-process exceeds approximately 12GB: as soon as this top-content-process is killed, the next largest content-process is only about 6GB in private-bytes, and as long as the top-content-process is 12GB or less, performance is reasonably fast.

By no means are we suggesting that this is the fix for this problem, because this is a rather risky hack that could leave Firefox in a bad/unstable state, possibly with corrupted data written to the current-profile folder. It would also be unreasonable to expect the average Firefox user to follow the above instructions to make Firefox run faster again.

So, in summation, the defect still remains: the top-content-process simply runs away unchecked, uncontrolled and unlimited in terms of memory utilization, unless it is manually killed or marked-for-killing (as above). This causes Firefox to rapidly degrade in performance (especially as new tabs are opened), and there is really no explanation as to why this happens other than that Firefox is incorrectly engineered into this defective behavior.

We would appreciate it if someone could explain whether there is a better, safe way to limit the maximum memory of content processes, so that new content processes are created and the memory of Firefox's processes (and even background processes) is more evenly distributed.

(For example if there is an about:config setting to achieve the above?)

We have attached another screenshot of the process-list sorted by memory usage (again compared to brave.exe) after this top-content-process is killed.

The content processes are highlighted in yellow, whereas the background processes are highlighted in teal - both for 'firefox.exe' and (chromium-based) 'brave.exe' processes.

Also, since this defect concerns run-away memory utilization, as a side-note to this defect:

When the commit-charge of all of the Firefox processes running on the computer causes the system to exceed the system-limit (on Windows: sizeof(physical RAM) + sizeof(all pagefile.sys files); on GNU/Linux: sizeof(physical RAM) + sizeof(all swap files/space)), then one of two things will happen:

(1)
If all the Firefox processes are not running as elevated-user/Administrator in Windows (or root user in GNU/Linux), then the kernel should simply kill the top-memory-using process(es).

(2)
However, if Firefox is running as elevated-user/Administrator in Windows (or root user in GNU/Linux) - which obviously is a very stupid idea that no browser users should be doing - then exceeding the system-limit will definitely crash the system (BSOD) on Windows, and could at least in theory, cause a kernel-panic on GNU/Linux.

We believe this is the worst-case scenario if this defect is not fixed, but again, not a likely scenario unless the user is inappropriately running Firefox as elevated-user/Administrator/root-user.

So, for example in one of our Windows workstations with 128GB RAM, we have had to set our pagefile.sys files to a combined total of 192GB to ensure that the system-limit (128GB + 192GB = 320GB) is high enough to avoid the above 2 scenarios (actually, just scenario #1).
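The arithmetic behind that system-limit can be written out as a small sketch. The function names `commit_limit_gb` and `headroom_gb` are hypothetical, and the pagefiles are passed as a single combined figure because only the combined total is stated above.

```python
def commit_limit_gb(physical_ram_gb, paging_files_gb):
    """System-limit as defined above: physical RAM plus all pagefile/swap space."""
    return physical_ram_gb + sum(paging_files_gb)


def headroom_gb(limit_gb, committed_gb):
    """Remaining commit charge before the system-limit is reached."""
    return limit_gb - committed_gb


# Workstation from the example above: 128GB RAM, 192GB of pagefiles combined.
limit = commit_limit_gb(128, [192])
print(f"system-limit: {limit}GB")
```

Comparing the current total commit charge against this limit with `headroom_gb` shows how close the system is to the two scenarios described above.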

After the Firefox top-content-process that was marked for killing by 'procgov64.exe' is killed (after exceeding 32GB in private-bytes; see other comment), this is a screenshot of the remaining processes, sorted by private-bytes.

The content processes are highlighted in yellow, whereas the background processes are highlighted in teal - both for 'firefox.exe' and (chromium-based) 'brave.exe' processes.

This bug was moved into the Performance component.

:br9fwmvi2zmag5ejggxpkxpcxip3pm, could you make sure the following information is on this bug?

  • For slowness or high CPU usage, capture a profile with http://profiler.firefox.com/, upload it and share the link here.
  • For memory usage issues, capture a memory dump from about:memory and attach it to this bug.
  • Troubleshooting information: Go to about:support, click "Copy raw data to clipboard", paste it into a file, save it, and attach the file here.

If the requested information is already in the bug, please confirm it is recent.

Thank you.

Flags: needinfo?(br9fwmvi2zmag5ejggxpkxpcxip3pm)

With no answer from the reporter, we don’t have enough data to reproduce and/or fix this issue. Please reopen or file a new bug with more information if you see it again. Details of our performance triage process can be found at https://wiki.mozilla.org/Performance/Triage.

Status: UNCONFIRMED → RESOLVED
Closed: 6 months ago
Resolution: --- → INACTIVE