Closed Bug 1658198 Opened 5 months ago Closed 5 months ago

Huge performance degradation on Firefox 79 when opening web pages

Categories

(Core :: Layout: Columns, defect)

79 Branch
x86_64
Linux
defect

Tracking

()

RESOLVED FIXED
81 Branch
Tracking Status
firefox-esr68 --- unaffected
firefox-esr78 --- unaffected
firefox79 --- wontfix
firefox80 --- fixed
firefox81 --- fixed

People

(Reporter: zrzut01, Assigned: TYLin)

References

(Regression)

Details

(Keywords: perf, regression)

Attachments

(2 files)

User Agent: Mozilla/5.0 (X11; Fedora; Linux x86_64; rv:75.0) Gecko/20100101 Firefox/75.0

Steps to reproduce:

Try to open https://www.picuki.com/profile/justinchapple

Actual results:

Some web pages take a very long time to open in Firefox 79. Rendering stalls and one Web Content process uses 99% of a CPU core. It takes ~3 min or longer for the page to finally render. The same page on the same machine takes ~2 s with the latest Chromium or Firefox 75. I tried running Firefox 79 with a fresh profile, with the same result. The machine is an AMD E-350 with 16GB of RAM running Fedora 32.

Expected results:

The web page should take no longer to render than on previous Firefox versions.

It also worked properly on the latest Firefox 78.x version.

Hi,

I am not able to reproduce the issue. I've tried on Ubuntu 18.04.3 LTS and Windows 10 Pro, so it may be Fedora-specific.

On the following versions:
release 79.0
beta, 80.0b4 (64-bit)
nightly 81.0a1 (2020-08-13) (64-bit)

Can you provide your about:support information? Which Fedora version are you using?

Please test whether the issue occurs in safe mode (add-ons disabled). Here is a link that can help you do that:
https://support.mozilla.org/en-US/kb/troubleshoot-firefox-issues-using-safe-mode

If the issue persists, also test it using a fresh profile. You can find the steps to do that below:
https://support.mozilla.org/en-US/kb/profile-manager-create-and-remove-firefox-profiles?redirectlocale=en-US&redirectslug=Managing-profiles#w_starting-the-profile-manager

Also, could you try this on the latest Nightly? You can download it from here: https://nightly.mozilla.org/

Thanks for the report.

Best regards, Clara.

Component: Untriaged → Widget: Gtk
Flags: needinfo?(zrzut01)
Product: Firefox → Core

What hardware did you use to perform the test?

Flags: needinfo?(zrzut01) → needinfo?(clara.guerrero)

I downloaded 78.1.0esr from mozilla.org and it works super fast and stable.

Fedora version 32, installed from the Fedora 32 installer, not upgraded. The about:support output is in the attachment.

  • removed ~/.mozilla
  • ran Firefox 79
  • tested it in Private Mode
    The results are the same.
Flags: needinfo?(clara.guerrero)
Attached file about_support79.txt

About:support contents

Thanks for the report! Could you try to find a regression range?
$ pip3 install --upgrade mozregression
$ MOZ_ENABLE_WAYLAND=1 ~/.local/bin/mozregression --good 78 --bad 79 -a https://www.picuki.com/profile/justinchapple

Keywords: perf, regression
OS: Unspecified → Linux
Hardware: Unspecified → x86_64

I tried twice, but mozregression crashed at the end both times, as shown in the logs from those two sessions. It still narrowed down the range of builds considerably, as you can see in the logs.

The issue has also been reported in Red Hat Bugzilla as Bug 1867382. I attached the logs there and did not attach them here to avoid duplication. If you would like me to attach them here as well, please ask in a comment.

Flags: needinfo?(jan)

Thanks! Can you check if this bug is still present in Nightly?
$ MOZ_ENABLE_WAYLAND=1 mozregression --launch 2020-08-13 --pref gfx.webrender.force-disabled:true -a https://www.picuki.com/profile/justinchapple
If it was fixed, please try to find a fix range:
$ MOZ_ENABLE_WAYLAND=1 mozregression --find-fix --bad 79 --good 2020-08-13 --pref gfx.webrender.force-disabled:true -a https://www.picuki.com/profile/justinchapple

https://bugzilla.redhat.com/show_bug.cgi?id=1867382#c5

54:53.34 INFO: Pushlog:
https://hg.mozilla.org/integration/autoland/pushloghtml?fromchange=e226df37045d7c9f4f8ef0014e760008c85f9f5e&tochange=c7610920fbf56608762c0652cf08e523a14d02a0

Could you try to launch them separately to check which one is the first bad one?
$ MOZ_ENABLE_WAYLAND=1 mozregression --repo autoland --launch e226df37045d7c9f4f8ef0014e760008c85f9f5e
$ MOZ_ENABLE_WAYLAND=1 mozregression --repo autoland --launch 91c26c5b1e33c9417048f1628ba0f07827e0c86d
$ MOZ_ENABLE_WAYLAND=1 mozregression --repo autoland --launch 398fc19f5f6adf0a58ddb43e2efbaac571db320b # <- this was backed out later and actually shipped with Firefox 80
$ MOZ_ENABLE_WAYLAND=1 mozregression --repo autoland --launch 47e04e14c6b5728c8ef60c96045178db4ad21f2b
$ MOZ_ENABLE_WAYLAND=1 mozregression --repo autoland --launch 015c7a1af896e2e2b6f3aa78312dd5c3aa19fbdf

Nightly and release have different preferences enabled by default. That can make it slightly harder to find the regression range.
For example, Nightly uses OpenGL rendering by default, while release still uses software rendering. This way you can get software rendering like release:
$ MOZ_ENABLE_WAYLAND=1 mozregression --good 78 --bad 79 --pref gfx.webrender.force-disabled:true layout.animation.prerender.partial:false -a https://www.picuki.com/profile/justinchapple

Flags: needinfo?(jan)

2020-08-13 bad

e226df37045d7c9f4f8ef0014e760008c85f9f5e good, no matter how I set 'gfx.webrender.force-disabled' and 'layout.animation.prerender.partial'
91c26c5b1e33c9417048f1628ba0f07827e0c86d bad, no matter how I set 'gfx.webrender.force-disabled' and 'layout.animation.prerender.partial'
398fc19f5f6adf0a58ddb43e2efbaac571db320b bad
47e04e14c6b5728c8ef60c96045178db4ad21f2b unable to check, causes crash of mozregression with below stacktrace
015c7a1af896e2e2b6f3aa78312dd5c3aa19fbdf unable to check, causes crash of mozregression with below stacktrace

Traceback (most recent call last):
File "/home/mac/temp/moz_venv/bin/mozregression", line 8, in <module>
sys.exit(main())
File "/home/mac/temp/moz_venv/lib64/python3.8/site-packages/mozregression/main.py", line 341, in main
sys.exit(method())
File "/home/mac/temp/moz_venv/lib64/python3.8/site-packages/mozregression/main.py", line 282, in launch_integration
self._launch(IntegrationInfoFetcher)
File "/home/mac/temp/moz_venv/lib64/python3.8/site-packages/mozregression/main.py", line 274, in _launch
build_info = fetcher.find_build_info(self.options.launch)
File "/home/mac/temp/moz_venv/lib64/python3.8/site-packages/mozregression/fetch_build_info.py", line 128, in find_build_info
task_id = tc_index.findTask(tk_route)["taskId"]
KeyError: 'taskId'

The interesting thing here is that if I run mozregression with the -a option and the URL provided, the bad build of Firefox opens the URL immediately without any problems. But only in that case: if I open it manually from the address bar (even in the same session), it hangs as described earlier.

(In reply to mac from comment #10)

e226df37045d7c9f4f8ef0014e760008c85f9f5e good, no matter how I set 'gfx.webrender.force-disabled' and 'layout.animation.prerender.partial'
91c26c5b1e33c9417048f1628ba0f07827e0c86d bad, no matter how I set 'gfx.webrender.force-disabled' and 'layout.animation.prerender.partial'

That's bug 1647332 and it already has one known regression: bug 1657345
Can you check whether this bug no longer occurs with the following build?
$ mozregression --repo try --launch 669bae0ecd7b27a27ac2acf25b09e69ddb2dd939 --pref gfx.webrender.force-disabled:true

The issue still occurs on 669bae0ecd7b27a27ac2acf25b09e69ddb2dd939.

Thanks for testing! Please also post a link to this upstream bug report into the RedHat bug to let them know.

Component: Widget: Gtk → Layout: Columns
Flags: needinfo?(aethanyc)
Regressed by: 1647332

I can repro after resizing the window horizontally a few times.

Status: UNCONFIRMED → NEW
Ever confirmed: true

Aand I can no longer repro :(

Could you tell us which CPU/GPU you are working on?

Flags: needinfo?(emilio)

Intel(R) Core(TM) i9-8950HK CPU @ 2.90GHz / Intel Corporation UHD Graphics 630 (Mobile).

But this is not likely to matter because layout is just running on a single thread so... Two questions:

  • Can you take a profile in https://profiler.firefox.com on Nightly, and post the link here?
  • Can you tell me what your window.devicePixelRatio is? (Mine is 2.5 for example)
Flags: needinfo?(emilio) → needinfo?(zrzut01)

I loaded the test case in my local Firefox debug build with the command MOZ_LOG=ColumnSet:5 ./mach run --layoutdebug https://www.picuki.com/profile/justinchapple, and saw log output like the following:

[Child 31817: Main Thread]: D/ColumnSet FindBestBalanceBSize: Choosing next guess=444401, iteration=893
...
[Child 31817: Main Thread]: D/ColumnSet FindBestBalanceBSize: Choosing next guess=444402, iteration=894
...
[Child 31817: Main Thread]: D/ColumnSet FindBestBalanceBSize: Choosing next guess=444403, iteration=895
...

Apparently, the URL makes Firefox go into the code path in nsColumnSetFrame::FindBestBalanceBSize that uses a linear search for the column balancing size, a path it shouldn't fall into.

I'll take a look.

Assignee: nobody → aethanyc
Status: NEW → ASSIGNED
Flags: needinfo?(aethanyc)

(In reply to Emilio Cobos Álvarez (:emilio) from comment #17)

Intel(R) Core(TM) i9-8950HK CPU @ 2.90GHz / Intel Corporation UHD Graphics 630 (Mobile).

But this is not likely to matter because layout is just running on a single thread so... Two questions:

  • Can you take a profile in https://profiler.firefox.com on Nightly, and post the link here?
  • Can you tell me what your window.devicePixelRatio is? (Mine is 2.5 for example)

Then, according to cpu-monkey.com data, your CPU is 10 times faster than mine, which is a significant difference. On such powerful hardware you can easily miss performance issues like this one.

Flags: needinfo?(zrzut01)

Sure, but a multi-minute hang divided by ten is still not something I'd miss, especially when looking for it :)

Generally these issues are algorithmic and depend on things like the viewport size, the DPI, and other factors that may vary across machines.

Anyhow, thank you so much for identifying the commit that broke it :)

OK. I know why the page is slow.

The font-size of the column container used in the page is 0, so Bug 1647332 Part 4 gives extraBlockSize [1] an initial value of 0. Later in the function, we use extraBlockSize to guess the next feasible column block-size, doubling extraBlockSize in every iteration. Doubling zero still yields zero, so the guess only increases by 1 app unit in the "sanitize" part of the logic [2]. As a result, we may take thousands of column balancing iterations (reflows) to find the first feasible column block-size.

[1] https://searchfox.org/mozilla-central/rev/50cb0892948fb4291b9a6b1b30122100ec7d4ef2/layout/generic/nsColumnSetFrame.cpp#1006
[2] https://searchfox.org/mozilla-central/rev/50cb0892948fb4291b9a6b1b30122100ec7d4ef2/layout/generic/nsColumnSetFrame.cpp#1102-1103
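
To illustrate why a zero starting value degenerates into a linear search, here is a rough Python sketch of the growth pattern described above. It is a simplified model and not the actual code in nsColumnSetFrame.cpp: the function name, the loop shape, and the 5000 app unit gap are all made up for illustration.

```python
def iterations_to_first_feasible(infeasible, first_feasible, extra_block_size):
    """Count guesses needed to reach first_feasible (hypothetical model).

    Each iteration doubles extra_block_size and adds it to the current
    guess. A "sanitize" step then forces the guess to grow by at least
    1 app unit so the loop always makes progress.
    """
    guess = infeasible
    iterations = 0
    while guess < first_feasible:
        extra_block_size *= 2               # doubling 0 stays 0
        next_guess = guess + extra_block_size
        guess = max(next_guess, guess + 1)  # sanitize: advance by >= 1 app unit
        iterations += 1
    return iterations


# Zero starting value: every step advances by exactly 1 app unit.
linear = iterations_to_first_feasible(0, 5000, 0)
# Any non-zero starting value: the step size grows geometrically.
geometric = iterations_to_first_feasible(0, 5000, 1)
print(linear, geometric)
```

In this model a zero start needs one reflow per app unit of distance, while even a start of 1 app unit needs only about log2 of the distance.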

Severity: -- → S2

We should probably uplift this to beta.

Pushed by aethanyc@gmail.com:
https://hg.mozilla.org/integration/autoland/rev/b7ad02f8114d
Provide a minimum starting value for extraBlockSize in FindBestBalanceBSize. r=heycam
Status: ASSIGNED → RESOLVED
Closed: 5 months ago
Resolution: --- → FIXED
Target Milestone: --- → 81 Branch
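
For readers who want a feel for what "a minimum starting value" buys, here is a small Python sketch of the idea. It is not the landed patch: the floor of one CSS pixel (60 app units) and all names below are assumptions chosen for illustration, and the guess loop is a simplified stand-in for FindBestBalanceBSize.

```python
APP_UNITS_PER_CSS_PIXEL = 60  # Gecko's app unit scale; using it as the floor is an assumption


def initial_extra_block_size(font_derived_size, floor=APP_UNITS_PER_CSS_PIXEL):
    """Clamp the starting value so it can never be zero (hypothetical helper)."""
    return max(font_derived_size, floor)


def guesses_until_feasible(start, first_feasible, extra_block_size):
    """Return the sequence of guesses made by a simplified doubling search."""
    guesses = []
    guess = start
    while guess < first_feasible:
        extra_block_size *= 2
        guess = max(guess + extra_block_size, guess + 1)  # always make progress
        guesses.append(guess)
    return guesses


# font-size: 0 yields a font-derived size of 0, which the floor replaces.
extra = initial_extra_block_size(0)
print(guesses_until_feasible(0, 5000, extra))
```

With the floor in place, the font-size:0 case in this model reaches a feasible size in a handful of guesses instead of one guess per app unit.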

Comment on attachment 9170117 [details]
Bug 1658198 - Provide a minimum starting value for extraBlockSize in FindBestBalanceBSize.

Beta/Release Uplift Approval Request

  • User impact if declined: Huge performance regression if font-size:0 or line-height:0 is used on multi-column container layout.
  • Is this code covered by automated tests?: Yes
  • Has the fix been verified in Nightly?: Yes
  • Needs manual test from QE?: No
  • If yes, steps to reproduce:
  • List of other uplifts needed: None
  • Risk to taking this patch: Low
  • Why is the change risky/not risky? (and alternatives if risky): Simple one-line change to set a sensible default value of a variable in the column balancing algorithm.
  • String changes made/needed: None
Attachment #9170117 - Flags: approval-mozilla-beta?

Comment on attachment 9170117 [details]
Bug 1658198 - Provide a minimum starting value for extraBlockSize in FindBestBalanceBSize.

approved for 80 rc1

Attachment #9170117 - Flags: approval-mozilla-beta? → approval-mozilla-release+
See Also: → 1659166
See Also: 1659166