Closed Bug 1725245 Opened 3 years ago Closed 3 months ago

Add Ubuntu 22.04 as a test platform to run tests for MOZ_ENABLE_WAYLAND=1

Categories

(Testing :: CI Configuration, task)

Firefox 93
Desktop
Linux
task

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: aosmond, Unassigned)

References

(Depends on 1 open bug, Blocks 7 open bugs)

The latest Fedora and Ubuntu releases, and presumably any derivatives thereof, ship Firefox on Wayland by default. They don't use the XWayland compatibility layer, but rather native Wayland, forced on via our MOZ_ENABLE_WAYLAND environment variable.

Right now we are unable to test this configuration in CI because our version of Ubuntu is too old to work well, given how active Wayland development is.

We don't need our existing tests on Linux to be fully ported, but it would be good to have the option to run some tests using Wayland on a newer Ubuntu, such as 21.04. This would allow us to start greening the tree for Wayland, and push it out officially from our side.

Blocks: wayland
See Also: → wayland-tests

The recent Wayland test suite (Bug 1578640) based on Ubuntu 18.04 is working relatively well.

The only issue there is missing focus support. The failures you see there are caused by broken focus: either we wait for focus (which produces timeouts) or we expect that a window has focus (which produces reftest failures).

These failures are fairly random, as they depend on how focus is handled by the compositor - we use a workaround where we hide/show a window and hope it gets focus after the show.

The only reliable way to fix that is an implementation of the XDG activation protocol (https://wayland.app/protocols/xdg-activation-v1).
This is not finished yet (https://gitlab.gnome.org/GNOME/mutter/-/merge_requests/1845). Robert, do you know when this may be available in Mutter? For instance, I don't see it in the mutter-40.3-1.fc34.x86_64 package.

So our goal is to have a compositor with the XDG activation protocol available so that we can test Wayland reliably.

Flags: needinfo?(robert.mader)

Copying my answer from the issue here:

Current status is that we'll ship it in 41 - we may backport it to 40, but Ubuntu 21.04 still uses 3.38. So I currently don't see a way to get it into the FF CI apart from

  • waiting for 21.10
  • using another distro
  • asking Ubuntu devs to backport it to Gnome 3.36 so we can use Ubuntu 20.04 (I don't think they'll backport to Ubuntu 21.04, but that would also be an option)

That being said, I think the Mutter version in Ubuntu 18.04 has quite a few Wayland-related bugs - not to mention all the missing fixes for compositor integration. So I guess we'd want at least Ubuntu 20.04 so we don't run into already fixed issues.

Flags: needinfo?(robert.mader)

I think we want 21.04, given that it is the release where users are getting Wayland today. Test what is already shipping :). They don't get it in 20.04 by default.

:stransky - do you know of prior art/docs on getting wayland running on docker? Currently we run tests on docker with xvfb - my understanding is wayland on xvfb == xwayland; that is not wayland proper. Any advice you have on this would be helpful.

Flags: needinfo?(stransky)

For the record from matrix:

robert.mader
jmaher: for the record, "wayland on xvfb == xwayland" sounds wrong to me. Xwayland runs on top of Wayland, so Xwayland would imply "x11 on wayland on xvfb". So if a nested Wayland session runs on top of xvfb, that's entirely valid.
"Wayland on native backend" would certainly be better than "Wayland on X11 backend", but for docker tests the latter should be fine for now.
jmaher
robert.mader: ok, then maybe this isn't such a hard problem to solve- I was reading up on it and kept running into more resources about pure Wayland without xvfb; I will keep looking into this a bit more
robert.mader
jmaher: Gnome-Shell/Mutter only supports native headless Wayland mode since very recently (since Gnome 40), see https://gitlab.gnome.org/GNOME/mutter/-/merge_requests/1698. Ubuntu 21.04 still ships 3.38 (because of big UI changes, also mind the versioning switch). So IIUC this will only work from Ubuntu 21.10 on :/
jmaher
ok, that seems very new- probably not what we want right now
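
As a rough illustration of the version constraints mentioned in this thread, here is a minimal Python sketch that maps a Mutter/GNOME version string to the two features discussed. The thresholds (headless Wayland from GNOME 40, XDG activation from 41) are taken only from the comments above and are not checked against Mutter release notes; the function names are hypothetical.

```python
from typing import Dict, Tuple


def parse_version(version: str) -> Tuple[int, ...]:
    """Parse '3.38.4' or '40.3' into ints, stopping at the first non-numeric part."""
    parts = []
    for piece in version.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def mutter_features(version: str) -> Dict[str, bool]:
    """Report which features a given Mutter version is assumed to provide.

    Assumption: thresholds come from this bug's comments, not from upstream docs.
    Tuple comparison handles the 3.38 -> 40 versioning switch naturally.
    """
    v = parse_version(version)
    return {
        "headless_wayland": v >= (40,),
        "xdg_activation": v >= (41,),
    }


if __name__ == "__main__":
    for candidate in ("3.38.4", "40.3", "41.0"):
        print(candidate, mutter_features(candidate))
```

Under these assumptions, the 3.38 shipped in Ubuntu 21.04 would provide neither feature, which matches the conclusion in the chat above.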

For unrelated reasons (OOM, which should be fixed with an upgraded Mesa), we would like to run the marionette tests on this new 21.04 image as well.

See Also: → 1709584

(In reply to Joel Maher ( :jmaher ) (UTC -0800) from comment #4)

:stransky - do you know of prior art/docs on getting wayland running on docker? Currently we run tests on docker with xvfb - my understanding is wayland on xvfb == xwayland; that is not wayland proper. Any advice you have on this would be helpful.

The testing wayland scripts are already there - see Bug 1578640

The important part is here:
https://searchfox.org/mozilla-central/rev/28c8f793f9f8c80de24a179b4472098692a7e9a4/taskcluster/scripts/tester/test-linux.sh#210

This launches a Wayland compositor (Mutter in this case), which provides the Wayland display.

xvfb is used as the underlying X11 server because, AFAIK, Mutter fails to run without X11. If a newer version does not need X11 to run, you can just run mutter without it.

Generally, when you set the WAYLAND_DISPLAY and MOZ_ENABLE_WAYLAND=1 env variables, Firefox will run under Wayland, and it does not matter whether X11, Xwayland or any other X server is running.
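
To make the environment setup concrete, here is a minimal Python sketch of building such a launch environment. It is not the code from test-linux.sh; the helper name `wayland_env` is hypothetical, and the default socket name `wayland-0` is an assumption (use whatever name the compositor actually created).

```python
import os
from typing import Dict, Mapping, Optional


def wayland_env(base: Optional[Mapping[str, str]] = None,
                display: str = "wayland-0") -> Dict[str, str]:
    """Return a copy of the environment with the Wayland variables set.

    DISPLAY and other X11 variables are deliberately left untouched:
    per the comment above, a running X server does not matter.
    """
    env = dict(os.environ if base is None else base)
    env["WAYLAND_DISPLAY"] = display   # assumed socket name
    env["MOZ_ENABLE_WAYLAND"] = "1"    # ask Firefox to use its Wayland backend
    return env


if __name__ == "__main__":
    env = wayland_env({"DISPLAY": ":0"})
    print(env)
```

A caller could then pass the result to `subprocess.run(["firefox"], env=wayland_env())`, assuming a compositor is already listening on that socket.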

Flags: needinfo?(stransky)
Blocks: linux-egl
Summary: Add Ubuntu 21.04 as a test platform to run tests for Wayland → Add Ubuntu 21.04 as a test platform to run tests for MOZ_ENABLE_WAYLAND=1 and MOZ_X11_EGL=1 (Test what is already shipping)
See Also: → 1741997
Blocks: 1744060
No longer blocks: linux-egl

Given that 21.04 is going EOL next week and 22.04 (LTS) is already in alpha, I'm taking the freedom to update the request.

Summary: Add Ubuntu 21.04 as a test platform to run tests for MOZ_ENABLE_WAYLAND=1 and MOZ_X11_EGL=1 (Test what is already shipping) → Add Ubuntu 22.04 as a test platform to run tests for MOZ_ENABLE_WAYLAND=1 and MOZ_X11_EGL=1 (Test what is already shipping)
See Also: → 1752113
See Also: 1752113
See Also: wayland-tests

Ubuntu 22.04 is live now and it's even running Firefox on Wayland by default. We need to do the testing ASAP.

we just did a manual QA pass on this, ASAP == DONE :)

But we still don't have CI for this right? That seems unfortunate :(

These comments are not productive. Let's work to keep our bugzilla comments about issues and solving problems, not opinions.

I was just trying to confirm that CI for this is not set up yet, sorry if it came across the wrong way... If there's anything I can help with to get it set up let me know, I'd be happy to help.

yeah, CI is not set up, and probably won't be until the summer. We don't have headcount and our contractor is blocked on access (which goes to headcount in other teams). There is a long list of things; our priorities are migrating from AWS to GCP and replacing old performance hardware (old phones and laptops). We are using manual QA to give us sanity checks in the meantime.

The biggest problem we have is that our current test environments run in docker images, and the last 3 times we have spent 3+ months just getting a desktop windowing environment to work properly from within docker. I have proposed switching to a VM, where this is a task that can be completed in a day or two for a simple experiment, but automation requires some retooling and provisioning changes (and then we are back to migrating cloud providers).

This is a higher priority than Windows 11, OSX upgrades, or Android emulator upgrades.

Thanks for the update Joel (and for all the work you and your team do)

No longer blocks: 1752113

(In reply to Martin Stránský [:stransky] (ni? me) from comment #9)

Ubuntu 22.04 is live now and it's even running Firefox on Wayland by default. We need to do the testing ASAP.

A minor clarification: Ubuntu 22.04 hasn't been released yet; it will go live on 2022-04-21. Also note that users of Ubuntu 20.04 won't be offered the upgrade until 22.04.1 is released, on 2022-08-04. See the release schedule for details.

And to further mitigate potential issues with Firefox on Wayland not being ready for prime time, we can easily disable it (by setting DISABLE_WAYLAND: 1 in the launch environment) until it is deemed ready.

Blocks: snap

I've had the firefox snap crash 3 times today, while running it from the candidate channel (100.0-1) in a wayland session (Ubuntu 22.04). These crashes didn't trigger the crash reporter.
The last time (just now) I happened to be running it from a terminal, and I got this:

(firefox:80147): Gdk-WARNING **: 21:44:48.720: The program 'firefox' received an X Window System error.
This probably reflects a bug in the program.
The error was 'BadValue'.
  (Details: serial 980422 error_code 2 request_code 53 (core protocol) minor_code 0)
  (Note to programmers: normally, X errors are reported asynchronously;
   that is, you will receive the error a while after causing it.
   To debug your program, run it with the GDK_SYNCHRONIZE environment
   variable to change this behavior. You can then get a meaningful
   backtrace from your debugger if you break on the gdk_x_error() function.)

I wonder whether that might speak in favour of reverting to native wayland (which I've been running for months without seeing any such crash)?
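
For anyone reading the numeric fields in the "Details" line of the log above, here is a small Python sketch that decodes them. The error-code table follows the X11 core protocol; the request table is only a short excerpt, and the function name is hypothetical.

```python
import re
from typing import Dict, Optional, Union

# Core X11 error codes as defined by the X Window System protocol.
X11_ERRORS = {
    1: "BadRequest", 2: "BadValue", 3: "BadWindow", 4: "BadPixmap",
    5: "BadAtom", 6: "BadCursor", 7: "BadFont", 8: "BadMatch",
    9: "BadDrawable", 10: "BadAccess", 11: "BadAlloc", 12: "BadColor",
    13: "BadGC", 14: "BadIDChoice", 15: "BadName", 16: "BadLength",
    17: "BadImplementation",
}

# Excerpt of core request opcodes only; not a complete table.
X11_CORE_REQUESTS = {53: "CreatePixmap", 54: "FreePixmap", 55: "CreateGC"}

DETAILS_RE = re.compile(
    r"serial (\d+) error_code (\d+) request_code (\d+) \(([^)]*)\) minor_code (\d+)"
)


def decode_x_error(line: str) -> Optional[Dict[str, Union[int, str]]]:
    """Decode a GDK 'Details:' line; return None if it does not match."""
    match = DETAILS_RE.search(line)
    if match is None:
        return None
    serial, error, request, origin, minor = match.groups()
    request_name = f"unknown({request})"
    if origin == "core protocol":
        # Extension requests use dynamically assigned opcodes, so only
        # core-protocol opcodes are looked up here.
        request_name = X11_CORE_REQUESTS.get(int(request), request_name)
    return {
        "serial": int(serial),
        "error": X11_ERRORS.get(int(error), f"unknown({error})"),
        "request": request_name,
        "minor_code": int(minor),
    }


if __name__ == "__main__":
    print(decode_x_error(
        "(Details: serial 980422 error_code 2 request_code 53 "
        "(core protocol) minor_code 0)"
    ))
```

For the line in this report, the sketch yields BadValue on a CreatePixmap request, which says nothing yet about the root cause but may help when searching for related reports.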

(In reply to Olivier Tilloy from comment #20)

I've had the firefox snap crash 3 times today, while running it from the candidate channel (100.0-1) in a wayland session (Ubuntu 22.04). These crashes didn't trigger the crash reporter.
The last time (just now) I happened to be running it from a terminal, and I got this:

(firefox:80147): Gdk-WARNING **: 21:44:48.720: The program 'firefox' received an X Window System error.
This probably reflects a bug in the program.
The error was 'BadValue'.
  (Details: serial 980422 error_code 2 request_code 53 (core protocol) minor_code 0)
  (Note to programmers: normally, X errors are reported asynchronously;
   that is, you will receive the error a while after causing it.
   To debug your program, run it with the GDK_SYNCHRONIZE environment
   variable to change this behavior. You can then get a meaningful
   backtrace from your debugger if you break on the gdk_x_error() function.)

I wonder whether that might speak in favour of reverting to native wayland (which I've been running for months without seeing any such crash)?

Did you file a bug?

Flags: needinfo?(olivier)

(In reply to Pascal Chevrel:pascalc from comment #21)

Did you file a bug?

Now I have: bug 1766849

Flags: needinfo?(olivier)

Another data point: several users have reported that a cold startup of the firefox snap (just after booting the machine) on Ubuntu 22.04 in a Wayland session is significantly faster when using the beta channel than with the candidate channel. The only meaningful difference between the two is that the snap in the candidate channel forces the use of XWayland, whereas beta makes use of native Wayland support.

in lieu of CI infrastructure, can you accept as an alternative some kind of community feedback? Because Firefox under wayland has been great on Fedora for > 12 months, and the beta channel seems to work, and using xwayland is exposing people to https://gitlab.gnome.org/GNOME/gtk/-/issues/4136 (a gtk4 bug that is triggered by xwayland). While it's not a mozilla problem and I can't speak for the Ubuntu release team, it's a bit ironic that despite efforts to avoid gtk4 bugs in Ubuntu 22.04, the Firefox snap has shipped one anyway.

When we upgrade our CI from 18.04 to 22.04 it would be great if we could take care of properly setting up the font cache to not trigger a generation for the very first start of Firefox. For all of our CI jobs it adds another 15s and also some test jobs fail under such a condition because the startup of Firefox takes longer than usual. I'll add bug 1670290 as dependency. Thanks.
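
One possible way to pre-populate the cache at image build time is sketched below in Python. It assumes the first-start cost is fontconfig cache generation and that `fc-cache` is installed in the image; the helper names are hypothetical, and the cache would need to be built for the same user and fontconfig version that the tests later run with.

```python
import shutil
import subprocess
from typing import List


def font_cache_command(force: bool = True, verbose: bool = False) -> List[str]:
    """Build the fc-cache command line (-f forces regeneration, -v is verbose)."""
    cmd = ["fc-cache"]
    if force:
        cmd.append("-f")
    if verbose:
        cmd.append("-v")
    return cmd


def warm_font_cache() -> bool:
    """Run fc-cache if available; return False when the tool is missing."""
    if shutil.which("fc-cache") is None:
        return False
    subprocess.run(font_cache_command(), check=True)
    return True


if __name__ == "__main__":
    print("font cache warmed:", warm_font_cache())
```

Whether this alone removes the 15s delay would need to be confirmed in bug 1670290.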

Depends on: 1670290
Blocks: 1779715
Blocks: snap-wayland
Blocks: wayland-stable
No longer blocks: 1779715

Where do we stand on this? I thought we were targeting October originally.

David is working on standing up a VM, but has run into some issues lately.

Depends on: 1756660
Depends on: 1814042
No longer blocks: 1744060
Summary: Add Ubuntu 22.04 as a test platform to run tests for MOZ_ENABLE_WAYLAND=1 and MOZ_X11_EGL=1 (Test what is already shipping) → Add Ubuntu 22.04 as a test platform to run tests for MOZ_ENABLE_WAYLAND=1
Blocks: 1709584
See Also: 1709584
Blocks: 1743143

Folks, how do we look here? Ubuntu 18.04 (recently used for testing) is EOL and the latest nightly can't be built there due to the old python 3.6. So all test fixes/tweaks have to be done on try directly, which wastes Mozilla resources (right now on Bug 1787182, for instance).

Michelle, can you provide an update here or link the bugs where that work is happening?

Flags: needinfo?(mgoossens)

Maybe bug 1801347?

(In reply to Martin Stránský [:stransky] (ni? me) from comment #28)

Folks, how do we look here? Ubuntu 18.04 (recently used for testing) is EOL and the latest nightly can't be built there due to the old python 3.6. So all test fixes/tweaks have to be done on try directly, which wastes Mozilla resources (right now on Bug 1787182, for instance).

FWIW, 18.04 has python 3.7: https://packages.ubuntu.com/bionic/python3.7

(In reply to Mike Hommey [:glandium] from comment #30)

Maybe bug 1801347?

Wayland is just a part of it.

AFAIK we need a new testing environment to test what we already ship on X11 - EGL, dmabuf/WebGL, disabled system titlebar etc. All "new" features are off and we don't test them due to the old setup in Mozilla automation.

Blocks: 1819259

https://mozilla-hub.atlassian.net/browse/RELOPS-376
https://mozilla-hub.atlassian.net/browse/RELOPS-132 (perhaps?)

Now to give a little context, I was assigned to this. Then someone left, and I do all their Mac tasks now so I haven't looked at it (I think this is also why the priority was dropped.)
Julia is on PTO for a few days, so I can't ask her right now what the plan is.
I'll ping-pong the NI over to the main Linux magician at our team.

Flags: needinfo?(mgoossens) → needinfo?(aerickson)

This is on our radar. We're discussing in Slack #wayland-support.

Flags: needinfo?(aerickson)
See Also: → wayland-ci

IIUC there are now several runners using Ubuntu 22.04, notably for bug 1813588, so I guess this can be closed and most dependent bugs can depend on that bug instead?

No longer blocks: 1795841

Comment 34 makes sense to me. Andrew, OK to close?

Flags: needinfo?(ahal)

<edited> wrong andrew :)

Wfm!

Status: NEW → RESOLVED
Closed: 3 months ago
Flags: needinfo?(ahal)
Resolution: --- → FIXED
Component: General → CI Configuration