Open Bug 1759017 Opened 4 years ago Updated 2 years ago

Stuttering with smooth scrolling after upgrading to FF-98 on linux

Categories

(Core :: Graphics, defect, P2)

Firefox 98
defect

Tracking

()

Tracking Status
firefox-esr91 --- unaffected
firefox-esr102 --- wontfix
firefox104 --- wontfix
firefox105 --- wontfix
firefox106 --- wontfix

People

(Reporter: tsebrenko, Unassigned)

Attachments

(7 files)

User Agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Firefox/91.0

Steps to reproduce:

  1. Open https://www.yahoo.com or https://www.tiktok.com with FF-98.
  2. Alternatively, open any site with many images and a lot of JavaScript, and scroll up and down.

Actual results:

Stuttering, laggy scrolling up and down.

Example from another user:
https://imgur.com/a/uZwwjZa

Expected results:

Smooth, fast scrolling without lag, as in FF-97 and FF-91.7.0. In fact, slightly noticeable scrolling problems already began in FF-97, but they are not present in the ESR branch.

The same problem from another user:
https://www.reddit.com/r/firefox/comments/taovn7/stuttering_with_smooth_scrolling_after_upgrading/

Here is the mozregression report from 2021-01-01 to today:
https://hg.mozilla.org/releases/mozilla-release/pushloghtml?fromchange=a4739fdb1d1cffb06d5971a6fb6901c1b3b41d26&tochange=f1dead7ff54438f9cd1fb0c9b84fe0bd0a660a7b

Just for my own info, I tested this on Debian 11 with FF-91.7 and a local build of m-c (which should be Fx100) running in a VM without 3D acceleration (i.e., software-only rendering). Went to yahoo, and selected a random "article". Scrolling was smooth, no jank observed, so I cannot confirm it.

When you get a chance, can you please attach your about:support output to this report?

Flags: needinfo?(jg.staffel)

Thanks for the report and using mozregression.

(In reply to jg.staffel from comment #0)

Here is the mozregression report from 2021-01-01 to today:
https://hg.mozilla.org/releases/mozilla-release/pushloghtml?fromchange=a4739fdb1d1cffb06d5971a6fb6901c1b3b41d26&tochange=f1dead7ff54438f9cd1fb0c9b84fe0bd0a660a7b

Hmm, that is a very large merge to the release branch. Usually I would expect mozregression to produce a pushlog on mozilla-central that is at most one day long. Maybe you can try mozregression again but use different settings? I'm not sure what setting you had that would cause that, though. I think the key difference is that you should be getting links to mozilla-central instead of mozilla-release.

Attached file about:support —
(In reply to Bob Hood from comment #2)

When you get a chance, can you please attach your about:support output to this report?

Attached file about:support —
In English

Openbox without a compositor and XFCE with compositing show the same behavior.
On my old ThinkPad X220 laptop I don't see any problems.
But on a PC with a GF1650S and Nvidia driver 470.103.01 or 510.54 the problems occur.

Flags: needinfo?(jg.staffel)

(In reply to Timothy Nikkel (:tnikkel) from comment #3)

Thanks for the report and using mozregression.

(In reply to jg.staffel from comment #0)

Here is the mozregression report from 2021-01-01 to today:
https://hg.mozilla.org/releases/mozilla-release/pushloghtml?fromchange=a4739fdb1d1cffb06d5971a6fb6901c1b3b41d26&tochange=f1dead7ff54438f9cd1fb0c9b84fe0bd0a660a7b

Hmm, that is a very large merge to the release branch. Usually I would expect mozregression to produce a pushlog on mozilla-central that is at most one day long. Maybe you can try mozregression again but use different settings? I'm not sure what setting you had that would cause that, though. I think the key difference is that you should be getting links to mozilla-central instead of mozilla-release.

Here is the new mozregression report:
https://hg.mozilla.org/integration/autoland/pushloghtml?fromchange=365268736d5b982cbe82d7df1ff8cc8712a2dd4a&tochange=ab135b079c63c3db4bffebd24388c756e67d39f9

(In reply to jg.staffel from comment #7)

Here is the new mozregression report:
https://hg.mozilla.org/integration/autoland/pushloghtml?fromchange=365268736d5b982cbe82d7df1ff8cc8712a2dd4a&tochange=ab135b079c63c3db4bffebd24388c756e67d39f9

That one doesn't seem to make sense either, because it covers changes that landed on 2022-03-01, so those changes are present in Firefox 99 but not in Firefox 98.

OK, I'm using mozregression-gui.
What settings need to be made?
Build type: shippable, opt, pgo, debug, asan or asan-debug?
Which repository?
What date range should I specify?

Your settings seem to have been okay this time; it's just that the builds you marked as good and bad led to that result.
Try 2022-01-01 to 2022-02-10, which should cover the time 98 was in the nightly cycle.

The other possibility is that the problem is not consistently reproducible, so that builds that have the bug sometimes scroll smoothly.

If I need to rebuild FF with some patch, then I can easily do it - I'm on Gentoo.

(In reply to jg.staffel from comment #12)

The new report between 2021-11-01 and 2022-01-10:
https://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=c833216e4a1317b940f1878eac9afd12cc9cc9be&tochange=579afa70a0b1c00897227d9e784d600ab504deb4

Thanks for trying again. I don't think anything in there could cause this. It must be that you aren't seeing this problem consistently in the same build (i.e. a build with the bug will sometimes work fine).

Let's try a different approach. Can you upload a profile using https://profiler.firefox.com/ ? Before starting a profile click the dropdown next to the profile button and edit settings and disable screenshots.

(In reply to Timothy Nikkel (:tnikkel) from comment #14)

Let's try a different approach. Can you upload a profile using https://profiler.firefox.com/ ? Before starting a profile click the dropdown next to the profile button and edit settings and disable screenshots.

But what if the profiler freezes when trying to send a captured profile? A new tab opens with the address https://profiler.firefox.com/from-browser and nothing loads. Other tabs don't load either. When I close the browser, the window closes, but the process keeps hanging in the background.
I tried to do it even on a new clean profile.

I tried both on the ESR branch and with FF-98.

I'm not sure. Maybe Markus knows what's happening with the profiler there?

Flags: needinfo?(mstange.moz)

On the latest ESR version, the profiler launched from Web developer tools works as expected: it lets me capture and store a profile.

But in version 98, running the profiler causes every page to freeze: the loading icon spins when switching tabs, and no new tab loads. Stopping the profiler does not bring FF back to a working state; everything still hangs. I have to close the window and manually kill the firefox process in the task manager.

That sounds really bad. jg.staffel, could you file a new bug about this in the Gecko Profiler component? This is the first I've heard of such a problem.

Flags: needinfo?(mstange.moz) → needinfo?(jg.staffel)
Flags: needinfo?(jg.staffel)

The profiler from "Web developer tools" works with ff-98 if I uncheck "enable new performance recorder" in the settings.
Will the attached report be enough?

Attached file profile.json.zip —

Web developer tools is for site authors to look at their own site. We are trying to look at the internals of Firefox with the profile to see what is going on, so unfortunately the attached profile doesn't give us much to go on.

Do you still see the same problem if you enter Troubleshoot Mode via about:support?

(In reply to Timothy Nikkel (:tnikkel) from comment #23)

Do you still see the same problem if you enter Troubleshoot Mode via about:support?

Yes

Are you able to do a screen recording to show what you see? Trying to think of different ways to understand/diagnose this.

(In reply to Timothy Nikkel (:tnikkel) from comment #25)

Are you able to do a screen recording to show what you see? Trying to think of different ways to understand/diagnose this.

My video with FF-98: https://imgur.com/a/iZKT0zb

Video from another user from reddit: https://imgur.com/a/uZwwjZa

My video with FF:ESR: https://imgur.com/a/9NT3nVC

Maybe there is a place to upload the original video?

Google drive?

My original video with FF-98:
https://vk.cc/cbOpPK

My original video with FF:ESR:
https://vk.cc/cbOq3d

When is the stuttering in the video that I'm looking for? Is it happening the whole time frequently? Or once every few seconds? Can you point me at a time stamp when a stutter happens, maybe the worst one so that I can see it?

(In reply to Timothy Nikkel (:tnikkel) from comment #30)

When is the stuttering in the video that I'm looking for? Is it happening the whole time frequently? Or once every few seconds? Can you point me at a time stamp when a stutter happens, maybe the worst one so that I can see it?

Look at the new video: https://vk.cc/cbOvfJ
The scrolling is jerky, stuttering all the time: up and down.

While in the video from ESR - everything is very smooth, without jerks.

The video smooths out the jerks a little, but they are very annoying - there is a very big difference between 98 and ESR, the eyes get very tired.

Can you tell me the best settings for screen capture?
I use:
ffmpeg -video_size 1920x1080 -framerate 60 -f x11grab -i :0.0 output.mkv

(In reply to tsebrenko from comment #33)

Can you tell me the best settings for screen capture?
I use:
ffmpeg -video_size 1920x1080 -framerate 60 -f x11grab -i :0.0 output.mkv

I don't know.

Some people are more sensitive to issues like this than others. I think I'm a person who is not very sensitive to stuttering, so it's hard for me to see it in a video. The issue usually becomes obvious in a profile (the time in ms at which we do an update is marked, so you can see when one is skipped, etc.), which is why I asked for a profile. Unfortunately the profiler is not working for you for some reason.
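
To illustrate what "seeing a skipped update" means, here is a minimal sketch. It assumes you already have a list of update timestamps in milliseconds and a 60 Hz display; the function name, the tolerance factor and the sample data are hypothetical and are not taken from the Firefox Profiler or from any profile attached to this bug.

```python
def find_skipped_frames(paint_times_ms, interval_ms=1000 / 60, tolerance=1.5):
    """Return (previous_timestamp, gap) pairs for suspiciously long gaps.

    A gap longer than `tolerance` frame intervals is treated as at least
    one missed update. Both defaults are assumptions for this sketch.
    """
    skipped = []
    for prev, cur in zip(paint_times_ms, paint_times_ms[1:]):
        gap = cur - prev
        if gap > interval_ms * tolerance:
            skipped.append((prev, gap))
    return skipped


# Hypothetical timestamps: one update is missing between 33.4 and 66.8 ms.
sample = [0.0, 16.7, 33.4, 66.8, 83.5, 100.2]
for start, gap in find_skipped_frames(sample):
    print(f"gap of {gap:.1f} ms after t={start:.1f} ms")
```

Even a single gap of roughly two frame intervals in an otherwise regular series can be perceived as a stutter, which is easier to spot in numbers than in a screen recording.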

(In reply to tsebrenko from comment #34)

The results of mozregression --good 97 --bad 98
https://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=e40a136dc8760f3fdd8a29fad47581d16d646c80&tochange=7e667b6a803df651804a554cb7e60e1f1bbce7c6

That also doesn't make sense. I would not expect that change to affect the bits that make it into the Firefox executable; it is related to our automation infrastructure only. This happens if you mark a build as good because you didn't see the bug, but the build actually had the bug and you just got lucky that it didn't show up the time you tested. What happens if you test each build a few times? You can do this by testing the build, exiting it, and then telling mozregression to retry that build. If you do this and find that the same build is sometimes good and sometimes bad, then we know the problem only shows up intermittently. If you do this and find the results to be consistent when you retry, then we can think of something else.
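
The retry procedure described above can be summarized as a small bookkeeping sketch. It assumes you note a "good" or "bad" verdict by hand after each retry of the same build; the function and the verdict strings are hypothetical and are not part of mozregression.

```python
from collections import Counter


def classify_build(verdicts):
    """Classify one build from several manual test runs.

    `verdicts` is a list of "good"/"bad" strings recorded by the tester.
    Mixed verdicts mean the bug shows up only intermittently on this build.
    """
    counts = Counter(verdicts)
    if counts["good"] and counts["bad"]:
        return "intermittent"
    if counts["bad"]:
        return "bad"
    if counts["good"]:
        return "good"
    return "untested"


# Hypothetical notes from retrying the same build three times.
print(classify_build(["good", "bad", "good"]))
print(classify_build(["good", "good", "good"]))
```

If any build comes out as intermittent, a single "good" verdict during bisection cannot be trusted, which would explain regression ranges that point at unrelated changes.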

Has the scrolling algorithm (graph) changed in version 97 or 98?
Something like a pulsation when scrolling?
Acceleration at the beginning and at the end of the scroll?

Maybe there are settings in about:config that can return the old behavior?

Same problem on Windows. Most visible on https://www.twitch.tv/directory/all

My mozregression results:

Narrowed integration regression window from [ee9cb109, c55e3a40] (3 builds) to [395f90b6, c55e3a40] (2 builds) (~1 steps left)
Bug 1571758 - Wrench reftests for scroll offset generation. r=botond
Differential Revision: https://phabricator.services.mozilla.com/D133445

(In reply to tsebrenko from comment #36)

Has the scrolling algorithm (graph) changed in version 97 or 98?
Something like a pulsation when scrolling?
Acceleration at the beginning and at the end of the scroll?

Maybe there are settings in about:config that can return the old behavior?

Bug 1752862 landed; it added some prefs that you can play with: apz.gtk.kinetic_scroll.delta_mode, apz.gtk.kinetic_scroll.page_delta_mode_multiplier, and apz.gtk.kinetic_scroll.pixel_delta_mode_multiplier. See here for some explanation of the meaning of their values:
https://hg.mozilla.org/integration/autoland/rev/e635157e4111#l1.12

(In reply to vladkosi from comment #37)

Same problem on Windows. Most visible on https://www.twitch.tv/directory/all

My mozregression results:

Narrowed integration regression window from [ee9cb109, c55e3a40] (3 builds) to [395f90b6, c55e3a40] (2 builds) (~1 steps left)
Bug 1571758 - Wrench reftests for scroll offset generation. r=botond
Differential Revision: https://phabricator.services.mozilla.com/D133445

Thank you! This seems plausible. Using the changeset ids from that I get this pushlog

https://hg.mozilla.org/integration/autoland/pushloghtml?fromchange=395f90b6&tochange=c55e3a40

which is bug 1571758. Hiro do you have any thoughts?
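
For anyone who wants to turn the two changeset ids printed by mozregression into a pushlog link, a minimal sketch follows. It only reproduces the URL pattern visible in the links in this bug; the helper name is hypothetical.

```python
from urllib.parse import urlencode


def pushlog_url(repo_path, fromchange, tochange):
    """Build an hg.mozilla.org pushlog URL for a changeset range.

    `repo_path` is the repository path as it appears in the URLs above,
    for example "integration/autoland" or "mozilla-central".
    """
    query = urlencode({"fromchange": fromchange, "tochange": tochange})
    return f"https://hg.mozilla.org/{repo_path}/pushloghtml?{query}"


print(pushlog_url("integration/autoland", "395f90b6", "c55e3a40"))
```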

Flags: needinfo?(hikezoe.birchill)

(In reply to Timothy Nikkel (:tnikkel) from comment #38)

Bug 1752862 landed; it added some prefs that you can play with: apz.gtk.kinetic_scroll.delta_mode, apz.gtk.kinetic_scroll.page_delta_mode_multiplier, and apz.gtk.kinetic_scroll.pixel_delta_mode_multiplier. See here for some explanation of the meaning of their values:
https://hg.mozilla.org/integration/autoland/rev/e635157e4111#l1.12

apz.gtk.kinetic_scroll.enabled=false did not help.

I think I found the "problematic" commit:
https://hg.mozilla.org/integration/autoland/rev/e947de18d580

I had the same issue ... now, after updating my NVIDIA driver, the stuttering is gone. I had a 470 version driver installed, but 470.57.02 - which is too old.

The way I found it was 6 hours of hg bisect, mach build, mach run and scrolling this list: https://hg.mozilla.org/mozilla-unified/tags

Finally got this output:

Testing changeset 676613:e947de18d580 "Bug 1742994 - Bump required nvidia driver version for HW-WR to 470.82 (i.e. always use EGL), r=aosmond" (2 changesets remaining, ~1 tests)


The first bad revision is:
changeset:   676613:e947de18d580
user:        REMOVED TO NOT FEED THE SPAMMERS
date:        Mon Jan 24 10:50:18 2022 +0000
summary:     Bug 1742994 - Bump required nvidia driver version for HW-WR to 470.82 (i.e. always use EGL), r=aosmond

For me the problem occurs with nvidia-drivers-470.103.01 and 510.54. At first I thought there might be something wrong with the driver and installed 510.54, but that didn't help.
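
The driver versions mentioned so far can be checked against the 470.82 threshold from the bug 1742994 summary with a short sketch. It assumes a plain numeric, component-by-component comparison; Firefox's real driver check lives in its graphics code and may differ, and the helper names here are hypothetical.

```python
def parse_version(text):
    """Split a dotted driver version into a tuple of integers."""
    return tuple(int(part) for part in text.split("."))


# Threshold taken from the commit summary of bug 1742994.
REQUIRED = parse_version("470.82")


def meets_requirement(driver_version, required=REQUIRED):
    """True if the driver is at least the required version (assumed rule)."""
    return parse_version(driver_version) >= required


# Driver versions reported in this bug.
for version in ("470.57.02", "470.103.01", "510.54"):
    print(version, meets_requirement(version))
```

Under this assumption 470.57.02 is below the threshold, while 470.103.01 and 510.54 are above it (103 is greater than 82 when compared as integers), so the version bump alone would not explain the stutter on the newer drivers.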

(In reply to fabian.becker from comment #41)

I think I found the "problematic" commit:
https://hg.mozilla.org/integration/autoland/rev/e947de18d580

For future reference a quicker way of pin-pointing that issue would be to look at the gfx section of about:support.

tsebrenko, I now notice that you've only uploaded the esr version of about:support. It would be good to upload about:support from 98 as well to see if there are any changes in there from the esr one that might point to something.

Attached file about:support —
> tsebrenko, I now notice that you've only uploaded the esr version of about:support. It would be good to upload about:support from 98 as well to see if there are any changes in there from the esr one that might point to something.

Ok, about:support from 98:

The 98 about:support doesn't include the section with "HW_COMPOSITING:" etc. Would be good to compare those too.

I also noticed these modified prefs, are they modified on purpose? Do they affect the stuttering?
mousewheel.acceleration.factor: 6
mousewheel.acceleration.start: 5
mousewheel.min_line_scroll_amount: 26

Tested on a second Windows PC with the same results:

changeset 395f90b6e8803ec3663a7c2356ad4affd450cd6d -> good
changeset c55e3a40a4de20482f056a3de9e9823fb8c82768 -> bad

Attached file about:support-98-full —

(In reply to Timothy Nikkel (:tnikkel) from comment #47)

The 98 about:support doesn't include the section with "HW_COMPOSITING:" etc. Would be good to compare those too.

I also noticed these modified prefs, are they modified on purpose? Do they affect the stuttering?
mousewheel.acceleration.factor: 6
mousewheel.acceleration.start: 5
mousewheel.min_line_scroll_amount: 26

These scroll settings do not affect the stuttering; I took them from my work profile to try them out. The problem is the same with or without them.

I have had the same issue since updating Firefox to 98.0 (firefox-nightly is also affected):
https://bugzilla.mozilla.org/show_bug.cgi?id=1759147

I use i3, but I've also tried Gnome and KDE - the same results. I use gtx770 with nvidia-470xx-dkms driver and cannot test newer driver versions, because Nvidia has stopped support for Kepler GPUs in latest drivers. But switching back to Firefox 97 solves the issue completely.

There are three possibilities I can think of; two of them are bug 1753436 and bug 1758352.

  1. Bug 1571758 caused a performance regression (the former bug, bug 1753436) and the regression was fixed in the 99 cycle, but I didn't uplift the fix into 98. So if the stutter is not observable in 99, that's the cause.

  2. Bug 1571758 caused another regression (the latter, bug 1758352). It was originally reported with regard to keyboard scrolling, but in theory, if the site in question calls scrollBy (or some such) in setTimeout/setInterval (maybe requestIdleCallback?) callbacks, it's possible something goes wrong there.

  3. In the profile in comment 21, I can see the site has at least one scroll event handler, which means we consider the document to have a scroll-linked effect. I can see both cases where a nsRefreshDriver::Tick call is processed quite fast (around 4ms) and cases where the call takes relatively long (over 10ms), so it's possible we use the same scroll offset on two different frames, and that is causing the jittery scroll; see bug 1571758 comment 30 for the details.

Possibility 3 looks highly suspicious to me. Introducing a force-responsive mode that ignores scroll-linked effects would be the only way I can think of to solve this jitter.
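
As a rough illustration of possibility 3, here is a toy model of how a mix of fast and slow ticks could put the same scroll offset on two consecutive frames. It is not a description of Gecko's actual pipeline: the per-frame deadline, the fixed scroll step and both function names are assumptions made up for this sketch.

```python
def displayed_offsets(tick_durations_ms, deadline_ms=8.0, step_px=10):
    """Return the scroll offset shown on each frame in the toy model.

    Each frame wants to advance the scroll position by `step_px`. If the
    tick for that frame takes longer than `deadline_ms` (a hypothetical
    cutoff), the frame is shown with the previously published offset.
    """
    shown = []
    last_ready = 0
    for frame, duration in enumerate(tick_durations_ms, start=1):
        target = frame * step_px
        if duration <= deadline_ms:
            last_ready = target  # tick finished in time, new offset is used
        shown.append(last_ready)  # otherwise the old offset is repeated
    return shown


def repeated_frames(offsets):
    """Indices of frames that show the same offset as the frame before."""
    return [i for i in range(1, len(offsets)) if offsets[i] == offsets[i - 1]]


# Tick durations loosely modelled on the "around 4ms" and "over 10ms" cases.
offsets = displayed_offsets([4, 4, 11, 4, 12, 4])
print(offsets)
print(repeated_frames(offsets))
```

In this model every slow tick produces a frame that does not move, followed by a frame that jumps twice as far, which is one way a scroll could look jittery even though the average speed is unchanged.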

Hmm maybe I am wrong. It looks like all reporter cases are on Linux? There's something I am not aware of.

Flags: needinfo?(hikezoe.birchill)

(In reply to Hiroyuki Ikezoe (:hiro) from comment #52)

Hmm maybe I am wrong. It looks like all reporter cases are on Linux? There's something I am not aware of.

I can see the stutters on two Windows 10 PCs; one has an Nvidia GPU, the other Intel HD graphics.

changeset 395f90b6e8803ec3663a7c2356ad4affd450cd6d -> good
changeset c55e3a40a4de20482f056a3de9e9823fb8c82768 -> bad

(In reply to Hiroyuki Ikezoe (:hiro) from comment #52)

There are three possibilities I can think of; two of them are bug 1753436 and bug 1758352.

  1. Bug 1571758 caused a performance regression (the former bug, bug 1753436) and the regression was fixed in the 99 cycle, but I didn't uplift the fix into 98. So if the stutter is not observable in 99, that's the cause.

FF-100.0a1 (2022-03-13) has the same problem.

tsebrenko, would you mind trying this build to see whether the jittery scrolling can still be observed or not? You can download target.tar.bz2 in the "Artifacts and Debugging tools" pane in the bottom right box.

If the jitter doesn't happen on the build, can you please set a new pref "apz.force-responsive" to false to see whether the jitter does happen? If it doesn't 3) in comment 52 is the culprit of the jitter.

Flags: needinfo?(tsebrenko)
Severity: -- → S3
Priority: -- → P2

(In reply to Hiroyuki Ikezoe (:hiro) from comment #55)

would you mind trying this build

I apologize for replying to the question addressed to another person, but I still observe the issue with this build. Just for the info.

Thanks for the info. Then there's definitely another underlying problem I haven't noticed unfortunately. :/

(In reply to Hiroyuki Ikezoe (:hiro) from comment #55)

tsebrenko, would you mind trying this build to see whether the jittery scrolling can still be observed or not? You can download target.tar.bz2 in the "Artifacts and Debugging tools" pane in the bottom right box.

If the jitter doesn't happen on the build, can you please set a new pref "apz.force-responsive" to false to see whether the jitter does happen? If it doesn't 3) in comment 52 is the culprit of the jitter.

I can't run this version because it is 32-bit:
XPCOMGlueLoad error for the file /home/data/reserved/firefox/libmozgtk.so:
libgtk-3.so.0: cannot open shared object file: No such file or directory
Failed to load XPCOM.

$ whereis libgtk-3.so.0
libgtk-3.so.0: /usr/lib64/libgtk-3.so.0

Or am I doing something wrong?

Flags: needinfo?(tsebrenko)

The first launch - everything is fine, there is no problem.
Restarting with apz.force-responsive=false - I don't see the difference.

Restarting with apz.force-responsive=true and scrolling for 3 minutes - the stuttering started. Multiple tabs were open at the same time: yahoo.com, youtube.com, https://www.twitch.tv/directory/all, pikabu.ru. After switching between tabs it disappears, but then it reappears.

I can reproduce this issue as well and also tried the build from https://firefox-ci-tc.services.mozilla.com/api/queue/v1/task/FJsgYRxiSf-ibx5EZXe5Qg/runs/1/artifacts/public/build/target.tar.bz2 and it seems to solve the issue for me. Strangely, however, it doesn't seem to act any differently based on how apz.force-responsive is set. I also can't reproduce the behavior tsebrenko saw where eventually the stuttering reappeared. It seems to stay gone for me.

Thanks for trying the build. I really appreciate it.

(In reply to Michael Marley from comment #61)

Strangely, however, it doesn't seem to act any differently based on how apz.force-responsive is set.

That's not strange at all. That's what I expected. The preference value shouldn't have any visible impact on scroll smoothness; I mean scrolling keeps the same pace regardless of the preference value.

So, from what I can tell, there are at least three different problems that users have commented on, both in this bug and in bug 1759147.

  1. Older NVIDIA drivers have fallen back to SW-WR since the 98 release (comment 41)
  2. An (expected) side effect of bug 1571758 on documents having scroll-linked effects (comment 61)
  3. Unknown reasons, but it definitely was introduced in bug 1571758 (cases of the original reporters of both bugs)

I can reproduce this bug very easily on this page. I am using very specific scroll settings that seem to make the lag very obvious (general.smoothScroll.msdPhysics.enabled = true, apz.gtk.kinetic_scroll.enabled = false). The try build in comment 59 seems to work perfectly, the scrolling is much better. I am using Fedora Workstation 35 in the default wayland configuration.

(In reply to G from comment #63)

I can reproduce this bug very easily on this page. I am using very specific scroll settings that seem to make the lag very obvious (general.smoothScroll.msdPhysics.enabled = true, apz.gtk.kinetic_scroll.enabled = false). The try build in comment 59 seems to work perfectly, the scrolling is much better. I am using Fedora Workstation 35 in the default wayland configuration.

I can't figure out how to edit on bugzilla so I will also add that I have done all of this in the latest Nightly build. Google.com also quite recently started using a scroll-linked effect so I can reproduce the lag on google search results as well, quite concerning.

Is the patch from that build something that could be merged then?

(I also forgot to mention, the systems on which I am testing all have Intel graphics, so the bump in the required driver version doesn't affect me.)

Depends on: 1760222

Unfortunately no. It will regress bug 1571758. What I wanted to confirm with the build is that the code introduced in bug 1571758 doesn't have any performance impact on scrolling.

Anyway, I've opened a new bug, bug 1760222, to mitigate the jitter fixed by the build. I don't think it's perfect, and to be honest I am a bit skeptical, but let's give it a try.

Thanks. I will be happy to try out any test builds.

Setting Regressed by field after analyzing regression range found by mozregression in comment #34.

Regressed by: 1751474

Set release status flags based on info from the regressing bug 1751474

:vringar, since you are the author of the regressor, bug 1751474, could you take a look?

For more information, please visit auto_nag documentation.

Bad bot. (That was an incorrect regression range.)

Flags: needinfo?(stefan)
No longer regressed by: 1751474

The bug has a release status flag that shows some version of Firefox is affected, thus it will be considered confirmed.

Status: UNCONFIRMED → NEW
Ever confirmed: true

(In reply to Release mgmt bot [:suhaib / :marco/ :calixte] from comment #72)

The bug has a release status flag that shows some version of Firefox is affected, thus it will be considered confirmed.

Yes, because you wrongly picked up a regression range.
