Closed Bug 1675768 Opened 3 years ago Closed 3 years ago

Ship WebRender on Linux to more configurations in early beta

Categories

(Core :: Graphics: WebRender, task, P3)

Desktop
Linux
task

Tracking

()

RESOLVED FIXED
84 Branch
Tracking Status
firefox84 --- affected

People

(Reporter: aosmond, Assigned: aosmond)

References

(Blocks 1 open bug)

Details

Attachments

(1 file)

Need to review wr-linux first for exact scope. But candidates might be KDE (all desktop env?), Wayland, Intel large screen?

I'd like to see Wayland and EGL/X11 included because

  1. both are not enabled by default anyway (Wayland only on some distributions like Fedora, which then could also easily carry a patch changing the default again).
  2. they benefit more from WR than the GLX backend, which is why many (most?) users who enable them explicitly also enable WebRender already. That's because they support
  • dmabuf sharing for faster WebGL and VAAPI support
  • partial damage
  3. It's kinda odd if enabling Wayland or EGL/X11 backend disables WR. I think it would be more consistent for users if these options leave WR untouched, at least as long as they are not enabled by default.

Martin, what do you think about that?

Flags: needinfo?(stransky)

I don't have any issue with that. Testing basic/OpenGL with EGL/X11 and Wayland isn't that useful since we'll probably ship WebRender / WebRender SWGL first.

Concerns

  • GLX/Xwayland likely still has unacceptable delays (bug 1672414), MOZ_X11_EGL should be default on Xwayland right away. No GLX on Xwayland. I encountered these tab switch delays myself until I switched to MOZ_X11_EGL/Xwayland.
  • WebRender/GLX/KDE could be fatal if Firefox doesn't correctly detect multiple GPUs: If an Intel card enabled WebRender on non-compositing KDE/Nvidia, it would break Firefox: bug 1663273. MOZ_X11_EGL would prevent this bug. But then, Nvidia/X11 might be using EGL as well which needs bug 1663152 to be merged.

Suggestions

  • Regardless of WebRender, Xwayland (=Mesa) should default to MOZ_X11_EGL right away (bug 788319) instead of GLX (bug 1672414). Let it ride to release. First it would only affect WebGL and users who have manually enabled WebRender.
    • widget.dmabuf-webgl.enabled would likely cause problems on multi GPU setups (e.g. Intel APU + AMD GPU). I would suggest disabling it on non-Nightly until bug 1588904 is fixed.
  • Nightly on X11/Mesa could default to MOZ_X11_EGL as well: Mesa 20.2/Ubuntu 20.10 supports EGL_KHR_swap_buffers_with_damage on X11.
    • "Intel large screen" (EGL_KHR_swap_buffers_with_damage) and Xwayland (slow tab switch with GLX) depend on MOZ_X11_EGL: Let WebRender/EGL/Gnome/Xwayland and WebRender/EGL/KDE/Xwayland&X11 ride to Early Beta possibly after bug 1640779 or bug 1669275 is fixed.
  • Enable SW-WR by default on Nightly/proprietary Nvidia.

There are still some regressions left with WebRender/X11 but I think we can proceed on Wayland with WebRender and also tentatively switch Nightly users to X11/EGL:

  • switch users on X11 with Mesa drivers to EGL on Nightly
  • enable WebRender by default on Wayland/Mesa and single GPU systems
Flags: needinfo?(stransky)

We don't currently detect multiple GPUs on Linux in the GfxInfo code, so that is something that would need to be added.

85 Nightly

  • Ship EGL to users with Mesa drivers; either X11 or XWayland.
  • Ship WebRender to users on Wayland and with Mesa drivers and only 1 GPU.
  • Ship Software WebRender to proprietary Nvidia (probably another bug).

85 Beta

  • Ship WebRender to all desktop environments on X11.

85 Release

  • Ship WebRender to all screen sizes with Intel (already the case on ATI/AMD).
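
For reference, the 85 Nightly part of the plan above can be written down as a small decision function. This is only a sketch to make the conditions explicit; the names (LinuxConfig, planned_compositor, planned_gl_backend) and the string labels are made up for illustration and do not correspond to anything in the GfxInfo code. The Beta and Release steps (desktop environments, Intel screen sizes) are not modelled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LinuxConfig:
    """Hypothetical description of a user's setup (illustration only)."""
    channel: str    # "nightly", "beta" or "release"
    windowing: str  # "wayland", "xwayland" or "x11"
    driver: str     # "mesa" or "nvidia-proprietary"
    gpu_count: int


def planned_compositor(cfg: LinuxConfig) -> str:
    """Compositor the 85 Nightly plan would pick, or "unchanged"."""
    if cfg.channel != "nightly":
        return "unchanged"
    if cfg.driver == "nvidia-proprietary":
        # Proprietary Nvidia gets Software WebRender.
        return "software-webrender"
    if cfg.driver == "mesa" and cfg.windowing == "wayland" and cfg.gpu_count == 1:
        # Wayland + Mesa + a single GPU gets hardware WebRender.
        return "webrender"
    return "unchanged"


def planned_gl_backend(cfg: LinuxConfig) -> str:
    """GL backend the 85 Nightly plan would pick, or "unchanged"."""
    if (cfg.channel == "nightly" and cfg.driver == "mesa"
            and cfg.windowing in ("x11", "xwayland")):
        return "egl"
    return "unchanged"
```

With this, a Nightly user on Wayland/Mesa with two GPUs stays "unchanged", which is exactly the case that depends on multi-GPU detection being added first.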

One open question I have then is if Wayland is sufficiently stable (or we anticipate it will reach such in time for a train) that we won't care to test/ship the XWayland + WebRender configuration?

(In reply to Andrew Osmond [:aosmond] from comment #5)

We don't currently detect multiple GPUs on Linux in the GfxInfo code, so that is something that would need to be added.

85 Nightly

  • Ship EGL to users with Mesa drivers; either X11 or XWayland.
  • Ship WebRender to users on Wayland and with Mesa drivers and only 1 GPU.
  • Ship Software WebRender to proprietary Nvidia (probably another bug).

85 Beta

  • Ship WebRender to all desktop environments on X11.

85 Release

  • Ship WebRender to all screen sizes with Intel (already the case on ATI/AMD).

I agree with this list :) Looks like I should get back to bug 1669275

One open question I have then is if Wayland is sufficiently stable (or we anticipate it will reach such in time for a train) that we won't care to test/ship the XWayland + WebRender configuration?

We will need Xwayland for quite some time. As for Gnome, we shouldn't ship WR Wayland for versions before 3.36 (also used in Ubuntu 20.04) and KWin is not ready at all yet, especially as soon as we reland bug 1629140, which I hope to do for 85. Xwayland can serve as a great compatibility layer in the meantime.

Correction, we shouldn't ship Wayland (not WR) for Gnome < 3.36 and current Kwin.

(In reply to Andrew Osmond [:aosmond] from comment #5)

One open question I have then is if Wayland is sufficiently stable (or we anticipate it will reach such in time for a train) that we won't care to test/ship the XWayland + WebRender configuration?

The Wayland backend is disabled by default, that's covered by Bug 1543600, and we're not going to enable it anytime soon as Wayland does not work with the Flash plugin and we don't have a test suite finished yet (Bug 1578640). Wayland is enabled by default on some specific distros like Fedora where the Wayland backend is developed.

So I think it's safe to enable WR on Wayland and test/develop that configuration together.

Note: Also Wayland is enabled on Fedora/Gnome only where it's well tested, not on KDE.

IIUC, flash support will get removed in 85 (bug 1675349) - so that shouldn't be a blocker any more.

While there are many improvements, there are a few alarming regressions with EGL:

https://treeherder.mozilla.org/perfherder/compare?originalProject=mozilla-central&newProject=try&newRevision=ed39b214afb83f76bcb45739bcedb81a86155644&framework=1&selectedTimeRange=604800

The SVG related tests are a bit of a mystery. Robert/Martin, offhand do you know what is happening with XRes? Does it have weird implications with EGL vs GLX?

Hm, not absolutely sure but I'd hope we don't (need to) use XRes with the EGL backend at all. It looks like something we'd mainly use in combination with xrender - which is not compatible with the EGL backend (at least I think so - I haven't ported pixmap support for example). For users who have gfx.xrender.enabled:true, we should probably stick to the GLX backend (or remove xrender support altogether at some point).

It definitely wouldn't have XRender enabled in those tests:

https://searchfox.org/mozilla-central/rev/5a1a34953a26117f3be1a00db20c8bbdc03273d6/gfx/config/gfxConfigManager.cpp#325

A ForceDisable is a runtime failure, and that overrides a UserForceEnable (done via the envvars in this case). So it would have been running without WR/acceleration and GLX/EGL would have made no difference.
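
To illustrate the precedence described above, here is a heavily simplified model. It is not the gfxConfig implementation (the class and method names below are invented for this sketch); it only shows the one rule that matters here: a runtime force-disable wins over a user force-enable, which in turn wins over the default.

```python
from typing import Optional


class FeatureState:
    """Toy model of layered feature status (illustration only)."""

    def __init__(self, default_enabled: bool) -> None:
        self.default_enabled = default_enabled
        self.user_forced: Optional[bool] = None  # None means "no user override"
        self.runtime_disabled = False

    def user_force_enable(self) -> None:
        # E.g. the user asked for the feature via an envvar or pref.
        self.user_forced = True

    def force_disable(self) -> None:
        # A runtime failure was detected while initializing the feature.
        self.runtime_disabled = True

    def is_enabled(self) -> bool:
        # Runtime failures are checked first, so they override everything else.
        if self.runtime_disabled:
            return False
        if self.user_forced is not None:
            return self.user_forced
        return self.default_enabled
```

In this model, calling user_force_enable() and then force_disable() leaves the feature off, which matches the situation in the try push: the envvars asked for WR, the runtime failure vetoed it.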

It sounds like the XRes tests should be deprecated or continue to run with GLX then.

XRes seems to (inaccurately) report memory usage for X pixmaps:
https://searchfox.org/mozilla-central/rev/5a1a34953a26117f3be1a00db20c8bbdc03273d6/testing/talos/talos/cmanager_linux.py#124-141
https://www.freedesktop.org/wiki/Software/xrestop/

According to this, linux64 is running on Ubuntu 16.04: https://searchfox.org/mozilla-central/rev/5a1a34953a26117f3be1a00db20c8bbdc03273d6/taskcluster/ci/test/test-platforms.yml#120
The try build had two WebGL test failures, which made me think of bug 1663152. According to https://wiki.mozilla.org/TestEngineering/Performance/Platforms#HPE_Moonshot it's running on an Intel APU though, but is that information correct?

Do you see the same problem when running talos on linux1804-64-shippable-qr?
Which prefs are actually enabled?
We might not be comparing GLX with EGL, but with EGL+DMABUF: Do you get the same result with widget.dmabuf-webgl.enabled=false?
(Only Mesa supports DMABUF. Proprietary Nvidia falls back to readback.)

rasterflood_gradient sets layout.frame_rate to 0: https://searchfox.org/mozilla-central/rev/5a1a34953a26117f3be1a00db20c8bbdc03273d6/testing/talos/talos/test.py#1149

tp5o_webext also uses XRes: https://searchfox.org/mozilla-central/rev/5a1a34953a26117f3be1a00db20c8bbdc03273d6/testing/talos/talos/unittests/test_config.py#719

Since bug 1650583 comment 30, EGL no longer uses GLX_SGI_video_sync (IIUC vsync/framerate detection) because mixing them would break EGL on proprietary Nvidia. Bug 1669275 should reintroduce that only for Mesa, not for proprietary Nvidia. (bug 1640779 is about EGL-only vsync.)

Since bug 1663003 comment 17, GLX is used again to find X Visual for EGL window transparency. Are there leaks or something like that?

It looks like it was running basic because talos isn't configured correctly for us to use EGL.

I've split off the EGL work into bug 1677203.

Pushed by aosmond@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/9bfbe000361f
Ship WebRender to early beta for all Linux desktops and Intel users with large screens. r=jrmuizel
Status: ASSIGNED → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED
Target Milestone: --- → 84 Branch

I'm slightly concerned about bug 1663273 - bug 1663273 comment 52 suggests we'll have issues on potentially all not-composited window managers. Maybe we should limit WR roll-out to composited window managers, something we can check with gdk_screen_is_composited() (we already do in other places).
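
A rough sketch of what such a gate could look like, written in Python via PyGObject rather than in the C++ widget code. The function names are invented for illustration; the only real API used is GDK 3's Gdk.Screen.get_default() / is_composited(), which is the binding for gdk_screen_is_composited(). Treating "unknown" as "not composited" is an assumption of this sketch, not something decided in this bug.

```python
from typing import Optional


def query_screen_is_composited() -> Optional[bool]:
    """Ask GDK 3 whether the default screen is composited.

    Returns None when PyGObject/GDK 3 is unavailable or there is no display.
    """
    try:
        import gi
        gi.require_version("Gdk", "3.0")
        from gi.repository import Gdk
    except (ImportError, ValueError):
        return None
    screen = Gdk.Screen.get_default()
    if screen is None:
        return None
    return bool(screen.is_composited())


def should_roll_out_webrender(qualified_hardware: bool,
                              screen_is_composited: Optional[bool]) -> bool:
    """Hypothetical gate: qualified hardware AND a composited window manager."""
    # Unknown compositing state (None) is conservatively treated as "no".
    return qualified_hardware and screen_is_composited is True
```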

Also reopening as we haven't landed all steps agreed upon.

Status: RESOLVED → REOPENED
Resolution: FIXED → ---

(In reply to Robert Mader [:rmader] from comment #19)

I'm slightly concerned about bug 1663273 - bug 1663273 comment 52 suggests we'll have issues on potentially all not-composited window managers. Maybe we should limit WR roll-out to composited window managers, something we can check with gdk_screen_is_composited() (we already do in other places).

Does that happen with non-NVIDIA proprietary drivers? We don't currently turn on WebRender for those users.

Also reopening as we haven't landed all steps agreed upon.

I put the EGL work into bug 1677203, blocked by our CI work that should be coming in the next month or so. They are in the middle of doing the work to migrate away from 16.04. There seem to be some historical weird issues around that configuration that I'd rather not cause us to break, when I can just wait a bit and have it go away :).

Technically we never blocked Wayland users from getting WR on nightly, but we needed to know the vendor/device ID to pass the allowlist -- once the PCI bus change lands that should be fixed (once the soft freeze is over). Similarly I'll enable Software WebRender for NVIDIA proprietary once the soft freeze is over.
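
As a standalone illustration of getting vendor/device IDs without going through X11/GLX, the sketch below reads them from the Linux sysfs PCI tree and applies a single-GPU allowlist check. This is not how the PCI patch or glxtest does it; the function names and the allowlist contents are assumptions made for the example. The sysfs layout (vendor, device and class files per device) and the vendor IDs for Intel (0x8086) and AMD/ATI (0x1002) are standard.

```python
import os
from typing import List, Tuple

# Hypothetical allowlist for this sketch: Intel and AMD/ATI PCI vendor IDs.
ALLOWED_VENDORS = {0x8086, 0x1002}


def _read_hex(path: str) -> int:
    # sysfs exposes these values as text such as "0x8086\n".
    with open(path) as f:
        return int(f.read().strip(), 16)


def list_display_devices(sysfs_root: str = "/sys/bus/pci/devices") -> List[Tuple[int, int]]:
    """Return (vendor_id, device_id) for every PCI display controller found."""
    devices: List[Tuple[int, int]] = []
    if not os.path.isdir(sysfs_root):
        return devices
    for entry in sorted(os.listdir(sysfs_root)):
        base = os.path.join(sysfs_root, entry)
        try:
            # The top byte of the 24-bit class code is the base class;
            # 0x03 means "display controller".
            if (_read_hex(os.path.join(base, "class")) >> 16) != 0x03:
                continue
            vendor = _read_hex(os.path.join(base, "vendor"))
            device = _read_hex(os.path.join(base, "device"))
        except (OSError, ValueError):
            continue
        devices.append((vendor, device))
    return devices


def passes_allowlist(devices: List[Tuple[int, int]]) -> bool:
    """Hypothetical rule: exactly one GPU, and its vendor is allowlisted."""
    return len(devices) == 1 and devices[0][0] in ALLOWED_VENDORS
```

The same device list also gives a cheap multi-GPU check (len(devices) > 1), which is the piece that is currently missing on Linux.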

See Also: → 1677293
See Also: → wr-linux-egl-nightly

(In reply to Andrew Osmond [:aosmond] from comment #20)

(In reply to Robert Mader [:rmader] from comment #19)

I'm slightly concerned about bug 1663273 - bug 1663273 comment 52 suggests we'll have issues on potentially all not-composited window managers. Maybe we should limit WR roll-out to composited window managers, something we can check with gdk_screen_is_composited() (we already do in other places).

Does that happen with non-NVIDIA proprietary drivers? We don't currently turn on WebRender for those users.

Oh right, I forgot that! Then it's probably fine.

Also reopening as we haven't landed all steps agreed upon.

I put the EGL work into bug 1677203, blocked by our CI work that should be coming in the next month or so. They are in the middle of doing the work to migrate away from 16.04. There seem to be some historical weird issues around that configuration that I'd rather not cause us to break, when I can just wait a bit and have it go away :).

Right.

Technically we never blocked Wayland users from getting WR on nightly, but we needed to know the vendor/device ID to pass the allowlist -- once the PCI bus change lands that should be fixed (once the soft freeze is over). Similarly I'll enable Software WebRender for NVIDIA proprietary once the soft freeze is over.

This already worked by falling back to X11/GLX in glxtest if available. With your PCI patch we can adapt glxtest to not do that any more, which is pretty cool for flatpak / Xwayland-less use-cases. However, there's still at least one other blocker which is display detection - which should be easy to implement for Wayland. Will create a bug for it.

Technically we never blocked Wayland users from getting WR on nightly, but we needed to know the vendor/device ID to pass the allowlist -- once the PCI bus change lands that should be fixed (once the soft freeze is over). Similarly I'll enable Software WebRender for NVIDIA proprietary once the soft freeze is over.

This already worked by falling back to X11/GLX in glxtest if available. With your PCI patch we can adapt glxtest to not do that any more, which is pretty cool for flatpak / Xwayland-less use-cases. However, there's still at least one other blocker which is display detection - which should be easy to implement for Wayland. Will create a bug for it.

Ahhhh, so the DISPLAY envvar check is what puts it down the glxtest path I guess?

https://searchfox.org/mozilla-central/rev/44e6dfd7e02edd95e5fd4d4c25c8b946131f92cd/toolkit/xre/glxtest.cpp#558
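
To spell out the guess above: on a Wayland session with Xwayland running, both DISPLAY and WAYLAND_DISPLAY are normally set, so a check that looks at DISPLAY first ends up on the X11/GLX path. The sketch below models that ordering only; the function name and the returned labels are invented for illustration and are not taken from glxtest.cpp.

```python
from typing import Mapping


def pick_probe_path(env: Mapping[str, str]) -> str:
    """Hypothetical model of which probing path a DISPLAY-first check selects."""
    if env.get("DISPLAY"):
        # An X server (native X11 or Xwayland) is reachable: probe via GLX.
        return "x11-glx"
    if env.get("WAYLAND_DISPLAY"):
        # Wayland only, no Xwayland (e.g. some flatpak setups).
        return "wayland-only"
    return "no-display"
```

For example, pick_probe_path({"DISPLAY": ":0", "WAYLAND_DISPLAY": "wayland-0"}) returns "x11-glx", while dropping DISPLAY from the mapping returns "wayland-only". In real use one would pass os.environ.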

Which display detection case are we hitting? I've lifted the last screen size restriction on nightly/beta, and we don't check the refresh rate on nightly either.

Ah, the SCREEN_INFO path in glxtest is desirable for about:support/diagnosing issues and telemetry, but I don't think we make any decisions based on it.

(I was just concerned about an Intel (qualified)+Nvidia setup because on Mac you had to handle switching between GPUs and I don't know how System76 Linux laptops do it. bug 1663273 comment 45 contains a fix/workaround: Remove the if (gdk_screen_is_composited(screen)) { check and it's fixed.)

(In reply to Andrew Osmond [:aosmond] from comment #23)

Ah, the SCREEN_INFO path in glxtest is desirable for about:support/diagnosing issues and telemetry, but I don't think we make any decisions based on it.

Right, we don't necessarily need it but it's good to have :)

See Also: → 1669275

For the record, apparently there's a nasty bug on dual-gpu system with prop. nvidia driver where the system will consume more power with the EGL backend, even if the nv gpu is switched off / not used at all :(

See Also: → 1678897
Status: REOPENED → RESOLVED
Closed: 3 years ago → 3 years ago
Resolution: --- → FIXED

(In reply to Robert Mader [:rmader] from comment #26)

For the record, apparently there's a nasty bug on dual-gpu system with prop. nvidia driver where the system will consume more power with the EGL backend, even if the nv gpu is switched off / not used at all :(

Can you be more specific? Does that affect laptops with dual GPUs (Intel+Nvidia) or workstations? I think I have both boxes available so I can test that.

Flags: needinfo?(robert.mader)

(In reply to Martin Stránský [:stransky] from comment #27)

Can you be more specific? Does that affect laptop with dual GPU (intel+nvidia) or workstations? I think I have both boxes available so I can test that.

I was just referring to bug 1678897, but that was apparently just a very odd setup. Thanks anyways :)

Flags: needinfo?(robert.mader)

Andrew, is there a plan to enable WebRender for other X.org desktops with Intel/AMD? Like KDE, which should be OK now with Bug 1479135 and Bug 1663273 fixed.
Thanks.

Flags: needinfo?(aosmond)
Flags: needinfo?(aosmond)