Closed Bug 1732365 Opened 3 years ago Closed 3 years ago

Jank on GLX/fvwm X11/Intel when moving the second Firefox window to another page

Categories

(Core :: Graphics: WebRender, defect)

x86_64
Linux
defect

Tracking

RESOLVED WORKSFORME
Tracking Status
firefox-esr78 --- disabled
firefox-esr91 --- wontfix
firefox92 --- wontfix
firefox93 --- wontfix
firefox94 --- disabled

People

(Reporter: jan, Unassigned)

References

(Depends on 1 open bug, Blocks 1 open bug, Regression)

Details

(Keywords: perf, regression)

Attachments

(2 files)

Found this on Twitter:
https://twitter.com/secalertsasia/status/1436096928236855297

Firefox on Linux is still not working well with WebRender for me (again) https://cstu.io/b06961

https://utcc.utoronto.ca/~cks/space/blog/web/FirefoxWebRenderFailureII

September 5, 2021

(Because my issue is so peculiar, as covered in my original entry, I haven't filed any sort of bug about it.)

For historical reasons, this version has been running with $MOZ_X11_EGL set to "1", and did not exhibit the problem.

When I took out this setting, my official Firefox 91 began having severe jank in the same situation as my original problem.

Forcing gfx.webrender.software to "true" in my Firefox 91 profiles appears to fix the problem in light testing. (Time will create a more thorough test, if I keep not seeing the jank. It's very obvious jank, at least.)

Does this problem only occur when having multiple Firefox windows open?

Please reset the necessary prefs back to their default to reproduce this bug, restart Firefox, reproduce this bug, open about:support, click on "Copy text to clipboard" and paste it here. Thanks!

Flags: needinfo?(cks+mozilla)

I don't think I've seen it in the past with a single Firefox window, but I will keep an eye out. I usually have more than one Firefox window open. I've reset the relevant preferences (all gfx.* prefs at their default value) in my official Firefox 92.0 install, and will keep an eye out for when this reoccurs and do the about:support thing. (I couldn't reproduce it on demand right now, which is part of what makes this bug fun.)

Flags: needinfo?(cks+mozilla)

Chris, could you briefly post the output of xrandr --listproviders here? If it contains name:Intel (and not name:modesetting), that would explain the issue.

Flags: needinfo?(cks+mozilla)

You don't need to be able to reproduce it right now. Seeing about:support of the configuration that had this problem should be enough to be able to compare with other bugs.

Please also post the terminal output of $ xrandr --listproviders. Thanks!

Attached file about:support (text version) —
```
; xrandr --listproviders
Providers: number : 1
Provider 0: id: 0x46 cap: 0xf, Source Output, Sink Output, Source Offload, Sink Offload crtcs: 3 outputs: 4 associated providers: 0 name:modesetting
```

And the about:support from Firefox 92 as it currently stands, well, I've attached it in two versions, the text and the raw form.

It may be relevant that my display is a single HiDPI 3840x2160 Dell P2715Q, hence some of the scaling settings in my about:support.
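
For anyone comparing several machines, the check asked for above (does the provider report name:Intel or name:modesetting?) can be scripted. This is only a minimal sketch: the helper name `provider_names` is made up for illustration, and it assumes the `name:` field looks like it does in the output pasted in this bug.

```python
import re


def provider_names(xrandr_output):
    """Return the value of every `name:` field found in the given text."""
    return re.findall(r"\bname:(\S+)", xrandr_output)


# Output pasted earlier in this bug (without the shell prompt line).
sample = (
    "Providers: number : 1\n"
    "Provider 0: id: 0x46 cap: 0xf, Source Output, Sink Output, "
    "Source Offload, Sink Offload crtcs: 3 outputs: 4 "
    "associated providers: 0 name:modesetting\n"
)

names = provider_names(sample)
# True only if some provider is named "Intel" (compared case-insensitively).
uses_intel_driver = any(name.lower() == "intel" for name in names)
print(names, uses_intel_driver)
```

For the output attached here, the only provider name found is `modesetting`, so the name:Intel explanation does not apply to this machine.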

Flags: needinfo?(cks+mozilla)

Version: 92.0
OS Theme: Adwaita / Adwaita
Multiprocess Windows: 4/4
Fission Windows: 0/4 Disabled by default
Window Protocol: x11
Desktop Environment: unknown
Target Frame Rate: 60
Description: Mesa Intel(R) UHD Graphics 630 (CFL GT2)
Display0: 3840x2160 default
GDK_DPI_SCALE: 0.5
GDK_SCALE: 2

When I took out this setting, my official Firefox 91 began having severe jank in the same situation as my original problem (in Firefox windows on fvwm virtual screens other than my first one).

https://packages.ubuntu.com/hirsute/fvwm

Are you able to reproduce the problem easily with 91?
$ pip3 install --upgrade mozregression
$ mozregression --launch 91 --pref gfx.webrender.all:true -a https://mozilla.org -a https://vsynctester.com

This machine is Fedora 34 with a custom-built and somewhat old fvwm (it works, I don't update it very often). Unfortunately mozregression doesn't seem to be launching Firefox 91; it reports:

 0:01.24 INFO: Using date 2021-07-12 for release 91
 0:02.70 INFO: Downloading build from: https://archive.mozilla.org/pub/firefox/nightly/2021/07/2021-07-12-21-56-04-mozilla-central/firefox-92.0a1.en-US.linux-x86_64.tar.bz2
[...]
 1:06.55 INFO: application_buildid: 20210712215604
 1:06.55 INFO: application_changeset: 3880d0d21aa306cfac44ea0d0fa188b59ae4233c
 1:06.55 INFO: application_name: Firefox
 1:06.55 INFO: application_repository: https://hg.mozilla.org/mozilla-central
 1:06.55 INFO: application_version: 92.0a1

and its About box reports 92.0a1. The vsynctester.com mostly had grey text in the box, with occasional red flashes and a few cyan.

mozregression only uses Nightly builds, and 92.0a1 is the version of Nightly that corresponds to what became the 91 release. In other words, it is expected that it reports 92.0a1 when asked for release 91.

Drag the second tab to create a second window: The purpose of vsynctester is just that something is changing in the window. You could equally play back a YouTube video.
One question is, for example, if stuttering occurs in one window if the other is hidden underneath a different application or placed on a different workspace.

In Firefox 91 (as downloaded by mozregression), this appears to reproduce. When I create a second window and move either window to a different fvwm virtual page, the vsynctester page immediately becomes extremely janky. If the two windows are on the same fvwm virtual page (either my top left one or any other), it seems fine.

Can you reproduce the problem with all of these?
$ mozregression --launch 92 --pref gfx.webrender.all:true -a https://mozilla.org -a https://vsynctester.com
$ mozregression --launch 2021-09-22 --pref gfx.webrender.all:true gfx.x11-egl.force-disabled:true -a https://mozilla.org -a https://vsynctester.com
$ mozregression --launch 2021-09-22 --pref gfx.webrender.all:true gfx.x11-egl.force-enabled:true -a https://mozilla.org -a https://vsynctester.com

I can now reproduce the problem on Firefox 92 as well. If I have a Firefox 92 window playing a Youtube video or displaying https://vsynctester.com/, interacting with other Firefox 92 windows on a different fvwm virtual page becomes very janky; scrolling is very slow and the vsynctester.com mouse input test at https://www.vsynctester.com/testing/mouse.html is almost unresponsive. So it seems that in Firefox 92 it takes a reasonably graphics intensive page to do this, and the page itself (when displayed) is fine but others are not fine when it's active.

$ mozregression --launch 92 --pref gfx.webrender.all:true -a https://mozilla.org -a https://vsynctester.com
$ mozregression --launch 2021-09-22 --pref gfx.webrender.all:true gfx.x11-egl.force-disabled:true -a https://mozilla.org -a https://vsynctester.com
Both of these reproduce the full problem, where either window is janky when they're on different fvwm virtual pages.

$ mozregression --launch 2021-09-22 --pref gfx.webrender.all:true gfx.x11-egl.force-enabled:true -a https://mozilla.org -a https://vsynctester.com
This has no problems; each window is responsive when on different fvwm virtual pages.
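
Since the same command keeps being retyped with different prefs, here is a small sketch that assembles it. It uses only the options that already appear in this bug (`--launch`, `--pref`, `-a`); the helper name `mozregression_launch` is hypothetical and not part of mozregression.

```python
import shlex


def mozregression_launch(build, prefs, urls):
    """Build a `mozregression --launch` command line as a single string."""
    argv = ["mozregression", "--launch", str(build)]
    if prefs:
        # All prefs follow a single --pref option, as in the commands above.
        argv.append("--pref")
        argv.extend(f"{name}:{value}" for name, value in prefs.items())
    for url in urls:
        argv.extend(["-a", url])
    return shlex.join(argv)


urls = ["https://mozilla.org", "https://vsynctester.com"]
for egl_pref in ("gfx.x11-egl.force-disabled", "gfx.x11-egl.force-enabled"):
    prefs = {"gfx.webrender.all": "true", egl_pref: "true"}
    print(mozregression_launch("2021-09-22", prefs, urls))
```

The two printed lines correspond to the GLX and EGL test commands for the 2021-09-22 build.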

Are these two fine or does the problem also occur?
$ mozregression --launch 2021-09-22 --pref gfx.webrender.all:true gfx.x11-egl.force-disabled:true layout.frame_rate:60 -a https://mozilla.org -a https://vsynctester.com

$ mozregression --launch 2021-09-22 --pref gfx.webrender.all:true gfx.x11-egl.force-disabled:true layout.frame_rate:0 -a https://mozilla.org -a https://vsynctester.com
(How many fps does vsynctester show above the diagram here?)

The first one has the full problem (both windows janky when separated). The second one appears to have no problem, and regardless of whether the two windows are on the same fvwm virtual page or on different ones, vsynctester reports 80 to 85 fps.

How many fps do you get with this one? (How many fps does vsynctester show above the diagram here?)
$ mozregression --launch 2021-09-22 --pref gfx.webrender.all:true gfx.x11-egl.force-enabled:true -a https://mozilla.org -a https://vsynctester.com

I seem to get between 55 to 60 fps depending on where I park the mouse (completely outside the window is best) and where the scrolling background is at (white is best). The fps seems to be the same regardless of whether the two windows are on the same fvwm virtual page or on different ones.

bug 1635186 comment 49, bug 1716049 and this bug are similar.

  • EGL (bug 1684194 comment 9: shared GL context for all windows) + GLX_SGI_video_sync/Mesa + SwapInterval(0): no problem, 60 Hz
  • GLX + layout.frame_rate:0 + ASAP mode / no GLX_SGI_video_sync (?) + SwapInterval(0): no problem, but ~85 Hz (unthrottled?)
  • GLX + layout.frame_rate:-1 + GLX_SGI_video_sync + SwapInterval(0): unknown
  • GLX + layout.frame_rate:-1 + GLX_SGI_video_sync + SwapInterval(1): problem occurs
  • GLX + layout.frame_rate:60 + 60 Hz software timer + SwapInterval(1): problem occurs
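
To make the pattern in this list easier to see, the five rows can be written down as data and grouped by swap interval. This is just a restatement of the list above, not new measurements; the EGL row has no layout.frame_rate listed, so it is recorded as `None`.

```python
# (backend, layout.frame_rate, vsync source, swap interval, outcome)
OBSERVATIONS = [
    ("EGL", None, "GLX_SGI_video_sync/Mesa", 0, "no problem, 60 Hz"),
    ("GLX", 0, "ASAP mode", 0, "no problem, ~85 Hz"),
    ("GLX", -1, "GLX_SGI_video_sync", 0, "unknown"),
    ("GLX", -1, "GLX_SGI_video_sync", 1, "problem occurs"),
    ("GLX", 60, "60 Hz software timer", 1, "problem occurs"),
]

# Swap intervals seen in the rows where the jank occurs.
problem_intervals = {row[3] for row in OBSERVATIONS if row[4] == "problem occurs"}
# Swap intervals seen in the rows that are known to be fine.
ok_intervals = {row[3] for row in OBSERVATIONS if row[4].startswith("no problem")}

print("problem rows use SwapInterval:", problem_intervals)
print("working rows use SwapInterval:", ok_intervals)
```

Every row with the problem uses SwapInterval(1) and every known-good row uses SwapInterval(0), which is why the untested GLX + SwapInterval(0) row is the interesting one.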

bug 1716049 comment 31 had a try build with the option to set SwapInterval to 0 without enabling ASAP mode,
but it seemed to have a minor logic bug and it can't be opened with mozregression anymore (too old?).
https://hg.mozilla.org/try/rev/5d4e698df126255d585fd69437550178fbb4af84
If useGlxVsync is true, then GLX_SGI_video_sync should be used. But it can only ever become true if StaticPrefs::gfx_software_vsync_AtStartup() is true. Therefore I think the if at the beginning should have been inverted. With this assumption, we should have set gfx.software-vsync to true to be able to use GLX_SGI_video_sync.

It would be great if someone could make a new try build that is usable with mozregression.

Has STR: --- → yes
Depends on: linux-egl
Regressed by: 1702301
Summary: Jank on Linux → Jank on HiDPI/non-Gnome/X11/Intel with GLX
Has Regression Range: --- → yes
See Also: → 1716049, 1635186

layout.frame_rate:0 has different behavior depending on when it's set:

  • If layout.frame_rate:0 is already set before the Firefox session starts, Firefox will run unthrottled.
  • If Firefox starts with layout.frame_rate:-1 and then it is set to layout.frame_rate:0 while the Firefox session is running, Firefox will run at the proper vsync rate.
    • This is the essence of the workaround I described in bug 1720634 which I still use, now in combination with gfx.x11-egl.force-disabled.

This might be why the behavior is unexpected when trying to test this with mozregression.

You mean layout.frame_rate:-1 forces non-ASAP on init (=throttling to correct display refresh rate?), but then we can still change SwapInterval to 0 by setting layout.frame_rate:0? I can confirm the difference in behavior:

94-102 fps and the GLX/Xwayland bug (bug 1635186 comment 49) can't be reproduced:
mozregression --launch 2021-09-22 --pref gfx.webrender.all:true gfx.x11-egl.force-disabled:true layout.frame_rate:0 -a https://mozilla.org -a https://vsynctester.com

61 fps and the GLX/Xwayland bug (bug 1635186 comment 49) can't be reproduced:
Open about:config, set layout.frame_rate to 0, don't restart, test.
mozregression --launch 2021-09-22 --pref gfx.webrender.all:true gfx.x11-egl.force-disabled:true layout.frame_rate:-1 -a https://mozilla.org -a https://vsynctester.com
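
One way to read the two results above is as a toy model in which the pacing is decided once at startup while the swap interval follows the current pref value. This is an interpretation of the observations in this bug, not Firefox's actual code, and `describe_session` is a hypothetical name.

```python
def describe_session(startup_frame_rate, current_frame_rate):
    """Toy model: pacing is fixed at startup, swap interval follows the live pref."""
    # Assumption: ASAP (unthrottled) mode is chosen once, from the startup value.
    pacing = "unthrottled" if startup_frame_rate == 0 else "vsync-paced"
    # Assumption: the swap interval is re-read from the current value, as in
    # the `isASAP ? 0 : 1` expression quoted elsewhere in this bug.
    swap_interval = 0 if current_frame_rate == 0 else 1
    return pacing, swap_interval


print(describe_session(0, 0))    # started with layout.frame_rate:0
print(describe_session(-1, 0))   # started with -1, set to 0 at runtime
print(describe_session(-1, -1))  # default, never changed
```

Under this model only the second case combines vsync pacing with SwapInterval(0), which would match the 61 fps result without the multi-window jank.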

Nightly 94 is fixed by the switch to EGL, it will become Beta 94 in 1.5 weeks.
Stable 93 is unlikely to get a dot release with an untested GLX SwapInterval(0).
ESR 91 will still be around for half a year and needs a fix for Nvidia (bug 1716049) and this bug. Release managers seem to prefer keeping horrible bugs instead of risking a possible fix in ESR that could cause regressions, therefore an uplift candidate needs active usage.
If we could set GLX SwapInterval to 0 (controllable by pref) in Nightly and merge bug 1732002 to have a greater GLX/X11/Nvidia test population, then it might be possible to get it backported to ESR at some point. Better than suddenly enforcing SW WR for the whole Linux ESR.

Nical or Robert, could you create a patch and a try build that does only the following aspect of https://hg.mozilla.org/try/rev/5d4e698df126255d585fd69437550178fbb4af84 (bug 1716049 comment 31)?

-    const bool isASAP = (StaticPrefs::layout_frame_rate() == 0);
-    mGLX->fSwapInterval(*mDisplay, mDrawable, isASAP ? 0 : 1);
+    const bool isASAP = StaticPrefs::layout_frame_rate() == 0;
+    int swapinterval = isASAP ? 0 : StaticPrefs::gfx_swapinterval();
+    mGLX->fSwapInterval(*mDisplay, mDrawable, swapinterval);
+- name: gfx.swapinterval
+  type: RelaxedAtomicInt32
+  value: 0 // <---------------- 0 instead of 1
+  mirror: always

IIUC, Linux desktops without compositor would get tearing as with software rendering, but composited desktops wouldn't.
The switch to RenderCompositorEGL (SwapInterval 0) was the fix for EGL/X11: bug 1635186 comment 49
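
For readers who don't want to parse the diff, the requested change boils down to the following selection logic. This is a Python paraphrase of the C++ lines above, with the proposed gfx.swapinterval pref passed in as a plain argument; the function names are made up for illustration.

```python
def current_swap_interval(layout_frame_rate):
    """Behavior of the removed lines: 0 in ASAP mode, otherwise always 1."""
    is_asap = layout_frame_rate == 0
    return 0 if is_asap else 1


def proposed_swap_interval(layout_frame_rate, gfx_swapinterval=0):
    """Behavior of the added lines: 0 in ASAP mode, otherwise the pref value."""
    is_asap = layout_frame_rate == 0
    return 0 if is_asap else gfx_swapinterval


for frame_rate in (0, -1, 60):
    print(frame_rate,
          current_swap_interval(frame_rate),
          proposed_swap_interval(frame_rate))
```

With the proposed default of 0 the swap interval would be 0 for every layout.frame_rate value, and setting the pref to 1 would restore the current behavior.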

Could do, OTOH this setup would get EGL by default in 94 if no further problems come up - and that would fix the issue here IIUC. Is there any good reason to invest more into GLX? I'd love to see it deprecated sooner rather than later, it's really old tech.

(In reply to Robert Mader [:rmader] from comment #26)
Fully agreed, but should ESR 91 really be left broken on Linux until ~June 2022? https://wiki.mozilla.org/Release_Management/Calendar
That's the default Firefox for Debian users. Or should HW WR be disabled there?

(In reply to Darkspirit from comment #27)

Fully agreed, but should ESR 91 really be left broken on Linux until ~June 2022? https://wiki.mozilla.org/Release_Management/Calendar
That's the default Firefox for Debian users. Or should HW WR be disabled there?

Hm, hard call. In case we wanted to backport it, that would mean:

  • it has to be the default - users willing to change configs can already do so by e.g. disabling hardware acceleration or enabling EGL
  • that means it should be the default in stable for at least one cycle, otherwise maintainers wouldn't take it I assume
  • IIUC it only affects a rather small user base (non-composited WM)
  • it might have regressions for some users (tearing), who could rightfully expect an ESR not to change in such a way

So I personally would go for leaving it be - or disabling HW acceleration for everybody. But rather leave it be.

(In reply to Robert Mader [:rmader] from comment #28)

  • IIUC it only affects a rather small user base (non-composited WM)

bug 1720634 (60 Hz / window count = Hz per window) also occurs with Gnome X11/Nvidia.
I closed it as duplicate of bug 1716049 because they seem to suffer from the same root problem.
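
The "60 Hz / window count" rule of thumb can be spelled out with a little arithmetic. The sketch below assumes (this is an assumption, not something confirmed in this bug) that with SwapInterval(1) each window's buffer swap waits for its own vblank and that the swaps happen one after another, so all windows share a single refresh budget. The name `hz_per_window` is hypothetical.

```python
def hz_per_window(display_hz, window_count):
    """Per-window update rate if every swap waits for its own vblank in turn."""
    if window_count < 1:
        raise ValueError("window_count must be at least 1")
    return display_hz / window_count


for count in (1, 2, 3, 4):
    print(count, "window(s):", hz_per_window(60, count), "Hz each")
```

On a 60 Hz display this gives 60, 30, 20 and 15 Hz for one to four windows, which is the kind of slowdown described in bug 1720634.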

(Chris Siebenmann from comment #14)

If I have a Firefox 92 window playing a Youtube video or displaying https://vsynctester.com/, interacting with other Firefox 92 windows on a different fvwm virtual page becomes very janky; scrolling is very slow and the vsynctester.com mouse input test at https://www.vsynctester.com/testing/mouse.html is almost unresponsive.

"Almost unresponsive" (GLX/fvwm X11 desktop/Intel) sounds like the behavior I see when manually enabling WR on GLX/Xwayland (bug 1635186 comment 49).

Options for Linux ESR 91:
a) Advising Nvidia and non-Gnome Mesa users who run into this multi window regression to manually enable gfx.x11-egl.force-enabled (bug 1689464, bug 1684194, bug 1713468) or gfx.webrender.software.

b) comment 21: Trying to switch GLX SwapInterval from 1 to 0 (like EGL is already doing), test it for some time with Nvidia/X11 (bug 1732002) and Mesa <=20 (bug 1695933), then uplift it into ESR.

c) Disabling HW WR for Nvidia and non-Gnome Mesa in Linux ESR 91 - for the profit/usability of users with multiple windows, at the cost of users who use only one window.
ESR 91 is still in Debian experimental and not in other channels: Debian ESR users who'll get the upgrade from 78 (Basic by default) to 91 wouldn't notice the change. But Chromium might perform better for SW WR users.

@jrmuizel: What is your opinion?

Flags: needinfo?(jmuizelaar)

Is disabling HW-WR for non-composited WM an option?

(In reply to Jeff Muizelaar [:jrmuizel] from comment #30)

Is disabling HW-WR for non-composited WM an option?

No, and disabling HW WR for ESR 91 on non-Gnome Mesa seems to be overkill.
If you don't want to touch GLX swap interval:
a) wontfix
b) detect fvwm correctly in ESR 91 and block WR/GLX for it. 94 on Mesa 21 is fixed by bug 1695933.


bug 1716049, GLX SwapInterval on Nvidia (basically 60 Hz / window count = Hz per window):
Composited/uncomposited does not matter.
(Nicolas Silva [:nical] from bug 1716049 comment 13)

We could also accept tearing when multiple windows are presenting. It's kind of gross but perhaps better than losing hardware acceleration altogether for all linux users with proprietary nvidia drivers, which is what we'll have to do in the short term if we don't find a solution.

bug 1732002 disabled the fix (EGL/X11/Nvidia), the GLX bug is back.
It's bad and noticeable, but usable. bug 1716049 could be wontfixed (or hw WR be disabled for ESR91, but Nvidia users tend to have large screens).

EGL can be reenabled for an upcoming Nvidia driver version that supports EGL_NV_robustness_video_memory_purge,
but robustness needs to be fixed for the RenderCompositorEGL codepath (bug 1731172 comment 21).

(Irrelevant: EGL/Nvidia could already be enabled on PopOS because it sets the NVreg_PreserveVideoMemoryAllocations=1 Nvidia driver flag: bug 1731172 comment 18)


GLX SwapInterval on Mesa (massively slowed down rendering until both windows are visible):

  • Can't reproduce with GLX / i3 X11 / Intel.

  • Can't reproduce with GLX / Mate X11 (compositing disabled) / Intel.

  • Can't reproduce with GLX / KDE X11 (compositing disabled) / Intel. Tried with multiple Activities and Workspaces.

  • Irrelevant: Reproducible with GLX/Xwayland (bug 1635186), but WR was disabled there. WR ships with EGL now. (bug 1695933 + bug 1730671)

  • Can only reproduce with fvwm at the moment. This bug can be wontfixed or at least fvwm be correctly detected by Firefox.
    fvwm has 4 desks (tabs of workspaces) with 4 pages (workspaces) each. The workspace switcher looks like a Windows logo with tabs above.
    If I have two Firefox windows, one on the top left page on desk 1 and the other on the top left page on desk 2, then there is no problem.
    The bug occurs if the second Firefox window is on a different page (for example, one is top left, the other is bottom right), desk number does not matter.

    about:support

    Desktop Environment unknown

    $ printenv | grep fvwm

    DESKTOP_SESSION=fvwm

    XDG_SESSION_DESKTOP=fvwm

    GDMSESSION=fvwm
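
The three variables shown by printenv would be enough to recognize an fvwm session. The following is only a sketch of that idea using the variable names from the output above; it is not how Firefox fills in the "Desktop Environment" field in about:support, and `session_name` is a hypothetical helper.

```python
import os

# Variables that were set to "fvwm" in the printenv output above.
SESSION_VARS = ("DESKTOP_SESSION", "XDG_SESSION_DESKTOP", "GDMSESSION")


def session_name(environ=os.environ):
    """Return the first non-empty session variable, lowercased, or "unknown"."""
    for var in SESSION_VARS:
        value = environ.get(var, "").strip().lower()
        if value:
            return value
    return "unknown"


print(session_name({"DESKTOP_SESSION": "fvwm", "GDMSESSION": "fvwm"}))
print(session_name({}))
```

With the environment from this bug the helper returns `fvwm`; with none of the variables set it falls back to `unknown`.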

Flags: needinfo?(jmuizelaar)
Summary: Jank on HiDPI/non-Gnome/X11/Intel with GLX → Jank on GLX/fvwm X11/Intel when moving the second Firefox window to another page

Sorry for panicking without reason.
Mesa: I would wontfix this edge case fvwm issue for ESR91, it can be fixed by gfx.x11-egl.force-enabled=true.
Nvidia: They kindly backported the EGL fix to 470: bug 1731172 comment 24

Status: NEW → RESOLVED
Closed: 3 years ago
Resolution: --- → WORKSFORME