Open Bug 1617498 (WR-linux-wayland-compositing) Opened 2 years ago Updated 11 hours ago

[meta] WR Wayland Compositing

Categories

(Core :: Graphics: WebRender, enhancement, P3)

Desktop
Linux
enhancement

ASSIGNED

People

(Reporter: gw, Assigned: rmader)

References

(Depends on 10 open bugs, Blocks 2 open bugs)

Details

(Keywords: meta)

WebRender has a trait that can be implemented by Gecko which allows all rendering to occur in native compositor surfaces [1].

On Windows, we render directly into DirectComposition surfaces, while on Mac we render directly into CoreAnimation surfaces. It would be great if we could also do this on Linux, when supported by the underlying windowing system.

The advantage is that WebRender no longer composites the set of picture cache slices into a single buffer before handing to the OS. Instead, the OS compositor is able to composite the picture cache slices directly. This can result in significant performance and battery improvements. We're also able to support compositing video directly to a native compositor surface, which can provide further performance and power savings (this work is being tracked in [2]).

I don't believe this is feasible on X11, since there's no way that I'm aware of to draw into surface tiles with the GPU, and composite them with a single atomic transaction (if there is a way, please let me know!).

However, I believe that Wayland supports everything we need, so long as the wp_viewporter [3] or a similar extension is supported. WebRender needs this in order to support clipping of the Wayland subsurfaces that the picture cache tiles would be rasterized into. It appears that this extension is available in GNOME [4] and also KWin / Plasma [5].

[1] https://searchfox.org/mozilla-central/rev/a37fc61f172b432e7ae0b6b4c4a12cac2a787a0f/gfx/wr/webrender/src/composite.rs#451

[2] https://bugzilla.mozilla.org/show_bug.cgi?id=1579235

[3] https://cgit.freedesktop.org/wayland/wayland-protocols/tree/stable/viewporter/viewporter.xml

[4] https://gitlab.gnome.org/GNOME/mutter/issues/132

[5] https://phabricator.kde.org/D26171
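
For illustration, the clipping requirement described above can be sketched as follows. This is not Gecko or WebRender code: the `Rect` type and the `viewport_for_clipped_tile` function are hypothetical names, and the sketch assumes buffer scale 1 and no buffer transform. It computes the values a client would pass to the `set_source` and `set_destination` requests of the viewporter protocol [3], plus the subsurface position, so that only the visible part of a tile is presented.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Rect:
    """Axis-aligned rectangle in integer pixels (hypothetical helper type)."""
    x: int
    y: int
    width: int
    height: int


def intersect(a: Rect, b: Rect) -> Optional[Rect]:
    """Return the overlap of two rectangles, or None if they do not overlap."""
    left = max(a.x, b.x)
    top = max(a.y, b.y)
    right = min(a.x + a.width, b.x + b.width)
    bottom = min(a.y + a.height, b.y + b.height)
    if right <= left or bottom <= top:
        return None
    return Rect(left, top, right - left, bottom - top)


def viewport_for_clipped_tile(
    tile: Rect, clip: Rect
) -> Optional[Tuple[Rect, Tuple[int, int], Tuple[int, int]]]:
    """Compute viewport parameters for a tile clipped by `clip`.

    Both rectangles are in window coordinates. Returns None when the tile
    is fully clipped, otherwise (source, destination, position):
      source      -> crop rectangle in tile-buffer coordinates (set_source)
      destination -> on-screen size of the cropped area (set_destination)
      position    -> where the subsurface is placed in the window
    """
    visible = intersect(tile, clip)
    if visible is None:
        return None
    source = Rect(visible.x - tile.x, visible.y - tile.y,
                  visible.width, visible.height)
    destination = (visible.width, visible.height)
    position = (visible.x, visible.y)
    return source, destination, position
```

With a 512x512 tile at the window origin and a clip rectangle starting at (100, 50), only the overlapping region is cropped out of the buffer; the rest of the tile stays rasterized but is not shown.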

CCing a few people that might be interested in this work.

That can be done on Wayland by rendering to dmabuf as it's implemented for WebGL (Bug 1586696). Also cross-process fence synchronization is available (Bug 1614568).

It appears that this extension is available in GNOME [4] and also KWin / Plasma [5].

Weston also supports it well.

Author of the Gnome Viewport implementation here. I wouldn't be surprised if you run into bugs in Mutter when using subsurfaces so advanced (we don't have any clients doing that yet). So great to see this and I'll be following this bug closely. Feel free to always ping me.

Great, thanks Robert! We shouldn't need any cross-process synchronization for this case, I think - all surface allocation and rasterization occurs inside the GPU process.

Do we have a ticket for the GPU process on Wayland?

I believe GPU process is enabled on Linux now by default on nightly? I'm not sure if that's different when using Wayland?

Even if not using a dedicated GPU process, WR still exists in a single process as far as all allocation and rasterization are concerned.

(In reply to Glenn Watson [:gw] from comment #6)

I believe GPU process is enabled on Linux now by default on nightly? I'm not sure if that's different when using Wayland?

Wayland does not use GPU process. It's disabled because Wayland can't share plain surfaces/windows across processes. Wayland can only share the underlying GPU memory (by dmabuf) which can be mapped to EGLImage/framebuffer in different processes.

Priority: -- → P3
OS: Unspecified → Linux
Hardware: Unspecified → Desktop

Side note: the upcoming Sway version will have viewport support, too.

Sway 1.5 with viewporter support is out.

Using wl-viewports would apparently allow us to scale videos more efficiently. YUV conversion in the compositor is not mandatory in Wayland - the Mutter tracking bug for that is here: https://gitlab.gnome.org/GNOME/mutter/-/issues/1366 (hopefully available around 3.40 if everything works out).

See Also: → 1623530, 1653166

Yes - there are patches in progress for WR to make use of native OS compositor transforms where available to scale videos efficiently in the compositor / hardware (see https://phabricator.services.mozilla.com/D84328). We can make use of the viewport scaling functionality in wayland to achieve the same efficiency savings here as with DirectComposition and CoreAnimation.

Depends on: 1668805
Assignee: nobody → robert.mader
Status: NEW → ASSIGNED
Alias: WR-linux-wayland-compositing
Summary: Implement WebRender native compositor trait for Wayland → [meta] WR Windows Compositing
Summary: [meta] WR Windows Compositing → [meta] WR Wayland Compositing
Depends on: 1695500
Depends on: 1697673
Depends on: 1699754
Depends on: 1699985

Status update: the example compositor now works quite well and can be tested (see bug 1695500). So far Weston is the only compositor able to run it properly - compositor bugs are tracked in bug 1699754.

The main takeaway from implementing the example compositor Wayland backend for me is that:
1: Wayland seems to offer everything needed to map the features used on other platforms
2: We may want to use Wayland APIs directly instead of using the EGL-Wayland platform in order to have more control over buffers etc.

The second point is something for later when the basic functionality stands. However it may make sense to create a little library for that so it can be reused by other projects that want to do similar compositor integration.

Depends on: 1700151
Depends on: 1700684
Depends on: 1707202
See Also: → 1707209
Depends on: 1711214
Depends on: 1711224
Depends on: 1711244
Depends on: 1711461
See Also: 1623530
Depends on: 1712472
Depends on: 1713202
Depends on: 1714326
Depends on: 1714771
Depends on: 1716006

Little status update here: after the latest round of patches things seem to run quite stably for me. So I think this is now dogfoodable, and if you run a recent Gnome (40.1/3.38.5) or KDE (5.22), you're invited to give this a try. Simply switch on gfx.webrender.compositor.force-enabled on the latest nightly (of course you also need to run with MOZ_ENABLE_WAYLAND=1).
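
As a rough sketch of the setup described above: the pref can be flipped in about:config, or written to a profile's user.js. The helper names below (`wayland_env`, `user_js_line`) are hypothetical, and the binary name in the commented launch line is an assumption that depends on how Nightly is installed.

```python
import os
from typing import Dict, Mapping, Optional

# Pref name taken from the comment above.
PREF_NAME = "gfx.webrender.compositor.force-enabled"


def wayland_env(base: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Return a copy of the environment with MOZ_ENABLE_WAYLAND=1 set."""
    env = dict(os.environ if base is None else base)
    env["MOZ_ENABLE_WAYLAND"] = "1"
    return env


def user_js_line(pref: str = PREF_NAME, enabled: bool = True) -> str:
    """Format a boolean pref as a user.js line."""
    value = "true" if enabled else "false"
    return f'user_pref("{pref}", {value});'


print(user_js_line())
# To launch with the environment (binary name is an assumption):
# import subprocess
# subprocess.run(["firefox"], env=wayland_env())
```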

Depends on: 1716044
Depends on: 1716108

I have now done some (not very scientific) performance profiling on my Thinkpad T460p (Skylake). What immediately stands out is that we have heavily reduced GPU utilization when e.g. scrolling a static page. I tested this with intel_gpu_top: both reported utilization and frequencies drop by about 30%, while RC6 time increased by about 10%. This is on a FullHD screen - on 4K I'd expect even bigger differences. Reducing GPU overhead is the central idea behind this effort, so it's nice to see that it works out.

CPU-wise, we seem to consume about the same in FF; however, at least Gnome-Shell consumes about twice as much CPU time as normal (still way less than FF). It is somewhat expected that we trade GPU vs CPU time to some extent. However, I think there's quite a bit of optimization potential, both in how FF uses the Wayland protocol and in the implementation in Gnome-Shell.

Power-consumption-wise, I didn't spot a significant difference on my machine yet. Apparently the lower GPU frequency gets compensated by the extra CPU time, or there are other things at play, so that the package (I have an integrated Intel GPU) does not power down. This finding is a bit sad, as saving energy is the eventual main goal of the whole effort.

Note that I only looked for very obvious and easy to spot differences - nothing below a safe 10% change. Also, other hardware may be affected differently. Also, this was only for HW-WR, not SW-WR.
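
To make the "nothing below 10%" cut-off concrete, here is a minimal sketch of how two sets of readings could be compared. It assumes the percentages above are relative changes against a baseline run, which the comment does not state explicitly. The function names are hypothetical, and the numbers in the usage lines are placeholders, not measurements from this bug.

```python
from typing import Dict


def relative_change(baseline: float, measured: float) -> float:
    """Relative change of `measured` against `baseline` (-0.3 means 30% lower)."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return (measured - baseline) / baseline


def significant_changes(
    baseline: Dict[str, float],
    measured: Dict[str, float],
    threshold: float = 0.10,
) -> Dict[str, float]:
    """Keep only metrics whose relative change is at least `threshold`."""
    changes = {}
    for name, base_value in baseline.items():
        if name not in measured:
            continue  # metric missing from the second run
        change = relative_change(base_value, measured[name])
        if abs(change) >= threshold:
            changes[name] = change
    return changes


# Placeholder values for illustration only.
before = {"gpu_busy": 40.0, "cpu_time": 20.0}
after = {"gpu_busy": 28.0, "cpu_time": 21.0}
print(significant_changes(before, after))
```

In this placeholder run only `gpu_busy` is reported, because the `cpu_time` change stays below the threshold.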

Robert, I have a 4K display running off Intel UHD 620 graphics (Whiskey Lake). Do you know of a good (scientific) profiling utility for GNOME/Fedora so I could do some testing? Perhaps there's a way of logging intel_gpu_top output to a file.

I see in this blog macOS has a tool to show the area being repainted. Are you aware of such a tool on Linux/Wayland?

Depends on: 1717902

Hi Vincent. Created bug 1717902 for discussions and findings around performance and profiling, let's continue there.

Depends on: 1718569
Depends on: 1718570
Depends on: 1720375
Depends on: 1718688

After bug 1718570 landed I now consider the compositor backend to be on feature parity with the default one. To my knowledge, there's no broken feature (I previously worried about e.g. screenshots, but they work) - and in many situations the compositor backend is already much faster. So while there is outstanding performance work and potentially some bugs will get discovered, we are getting closer to the point where we can enable compositor integration by default - at least for a subset of users using recent versions of their compositors.

@rmader sorry for asking in such a random place, but on my system (Arch Linux, GNOME Wayland, the 2021-07-11 Nightly, AMD GPU), with the compositor enabled I sometimes get rectangular parts of the window flickering with portions from another tab. I don't get along very well with the Bugzilla search, so if that's a known issue, can you please point me to it? Otherwise I'll try to update and file a bug.

(In reply to Laurențiu Nicola from comment #18)

@rmader sorry for asking in such a random place, but on my system (Arch Linux, GNOME Wayland, the 2021-07-11 Nightly, AMD GPU), with the compositor enabled I sometimes get rectangular parts of the window flickering with portions from another tab. I don't get along very well with the Bugzilla search, so if that's a known issue, can you please point me to it? Otherwise I'll try to update and file a bug.

No worries, this probably affected all users until bug 1718570 landed - so thanks for asking.
Despite its title about partial damage (thus better performance), its main achievement was actually to give much better guarantees about correctness. So if you update nightly to the latest version, my expectation would be that what you describe should not happen any more - buffer content should now always be correct (minus WebRender, system compositor or driver bugs of course). If you still see such issues please file a new bug blocking this one.

Depends on: 1720850
Depends on: 1720874
No longer depends on: 1720874
Depends on: 1721036
Depends on: 1721298
Depends on: 1723012
Depends on: 1723940

Hello Robert, what's the status of this feature? Should it be enabled by default, do we need to test it somehow?
It may be possible to run the test suite on the compositor to compare results; for instance, I use locally:

MOZ_ENABLE_WAYLAND=1 ./mach mochitest dom/base/test --setpref widget.wayland.test-workarounds.enabled=true --enable-webrender

or, for the long version:

MOZ_ENABLE_WAYLAND=1 ./mach mochitest dom --setpref widget.wayland.test-workarounds.enabled=true --enable-webrender

You can use --setpref to enable the feature.
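
A minimal sketch of how such a command line could be assembled with the compositor pref added. The `mochitest_command` helper is hypothetical; the pref name gfx.webrender.compositor.force-enabled comes from the earlier status update in this bug, and passing it via --setpref is an assumption about what "enable the feature" means here.

```python
import shlex
from typing import Dict, List


def mochitest_command(test_path: str, prefs: Dict[str, str]) -> List[str]:
    """Build the mach mochitest argument list with one --setpref per pref."""
    cmd = ["./mach", "mochitest", test_path]
    for name, value in prefs.items():
        cmd += ["--setpref", f"{name}={value}"]
    cmd.append("--enable-webrender")
    return cmd


prefs = {
    "widget.wayland.test-workarounds.enabled": "true",
    # Assumption: this is the pref meant by "the feature".
    "gfx.webrender.compositor.force-enabled": "true",
}
command = mochitest_command("dom/base/test", prefs)
# Print a shell line that can be pasted into a mozilla-central checkout.
print("MOZ_ENABLE_WAYLAND=1 " + shlex.join(command))
```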

Flags: needinfo?(robert.mader)
Depends on: 1725371

(In reply to Martin Stránský [:stransky] (ni? me) from comment #20)

Hello Robert, what's the status of this feature? Should it be enabled by default, do we need to test it somehow?

I think it's quite close to being ready from the FF side, but it uncovered a lot of bugs in compositors (some of them listed in bug 1699754), so it will still take some time until most/all of them are fixed and have reached users - the good thing is that this will benefit other applications as well that try to do similar things. Opened bug 1725372 to track things.

Flags: needinfo?(robert.mader)
Depends on: 1726807
Depends on: 1726954
Depends on: 1725368
Depends on: 1727936
Depends on: 1729233
Depends on: 1729613
Depends on: 1731450
Depends on: 1732051