Need Mesa Lavapipe 22.1.2 or later to run WebGPU CTS on Linux
Categories
(Testing :: General, task, P2)
Tracking
(Not tracked)
People
(Reporter: jimb, Unassigned)
We need to run the WebGPU conformance test suite in CI on Linux for Mozilla Central. This requires a conformant Vulkan implementation. Linux Mesa lavapipe 22.1.2 was certified as a conformant implementation in 2022-07, so that should be good enough. Lavapipe is a CPU-based renderer, so it does not require a machine with a GPU. It seems that Ubuntu 22.04 LTS only has Mesa 22.0.1, which is not good enough, but Jammy Updates has Mesa 22.2.5.
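As a rough illustration of the version requirement above, this Python sketch compares a Debian-style package version string against the 22.1.2 minimum. The function names and the parsing rule (take the leading dotted number, ignoring any epoch and Debian revision) are assumptions made for this example, not part of any Mesa or Ubuntu tooling.

```python
import re
from typing import Tuple

# Minimum lavapipe version named in this bug's title.
MIN_MESA = (22, 1, 2)


def mesa_upstream_version(package_version: str) -> Tuple[int, ...]:
    """Return the leading dotted numeric version as a tuple of ints.

    Hypothetical helper: skips an optional epoch ("1:") and ignores
    everything after the upstream number, such as "-0ubuntu0.1~22.04.1".
    """
    match = re.match(r"(?:\d+:)?(\d+(?:\.\d+)*)", package_version)
    if match is None:
        raise ValueError(f"unrecognized version string: {package_version!r}")
    return tuple(int(part) for part in match.group(1).split("."))


def is_new_enough(package_version: str, minimum: Tuple[int, ...] = MIN_MESA) -> bool:
    """True if the upstream version is at least `minimum`."""
    return mesa_upstream_version(package_version) >= minimum
```

With the versions mentioned in this bug, `is_new_enough("22.0.1")` is false and `is_new_enough("22.2.5")` is true.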
Comment 1•1 year ago
Currently we are building an Ubuntu 22.04 VM running Wayland (not X11). Will WebGPU/Mesa work on that, or does it need X11?
From a little digging through Google searches, it seems that it doesn't matter whether Ubuntu is running X11 or Wayland.
:jimb, can you confirm?
Comment 2•1 year ago
:jmaher: Answering for Jim here, we believe that this should work. The best way to figure this out is going to be with testing, though. Is there a way to access an interactive environment with the current WIP of the Ubuntu 22.04 environment?
Comment 3•1 year ago
I am looking into options with the current Ubuntu 22.04 Wayland environment. I just got back from a long PTO, so I am not sure who else is on PTO, etc. I expect to have an answer later this week or first thing next week.
Comment 4•1 year ago
Mostly for :jimb: I found this upstream issue in wgpu noting that lavapipe seems to stall on certain hardware profiles: https://github.com/gfx-rs/wgpu/issues/1974. It's hard to tell if this might affect us until we know more about the hardware profile of CI runners, though.
Comment 5•1 year ago
The current Wayland 22.04 image has version 22.2.5-0ubuntu0.1~22.04.1 of mesa-vulkan-drivers. There is a minor update available (details in Bug 1841867).
Have we tried running the tests on the workers that use the image (https://firefox-ci-tc.services.mozilla.com/worker-manager/gecko-t%2Ft-linux-vm-2204-wayland)?
Comment 6•1 year ago
:aerickson: I can answer that. Short answer: nope! Do you have some instructions to help us test consuming this image? I'm unfamiliar with plumbing specific images to Taskcluster jobs, but I'd likely be the one to take it on.
Comment 7•1 year ago
./mach try fuzzy -q 'test linux webgpu' --worker-override="t-linux-large-gcp=gecko-t/t-linux-vm-2204-wayland"
Comment 8•1 year ago
Just tried before and after Try builds. Both are broken. Looks like the worker type is different (t-linux-large-gcp vs t-linux-xlarge-gcp) and the image didn't get picked up in these test runs, so I'll try pushing again with an additional --worker-override="t-linux-xlarge-gcp=gecko-t/t-linux-vm-2204-wayland".
EDIT: That didn't work either. Gonna reach out to Andrew and Joel directly, see if I can get some instruction on how to get this consumed properly for a test run.
Comment 9•1 year ago
The override isn't working. Is there going to be a new push with the new --worker-override...?
Comment 10•1 year ago
:jmaher: I'm not sure who you're addressing (does support for --worker-override=… still need to be added?), but attempting to override xlarge workers on Linux didn't work either.
Comment 11•1 year ago
I think you want just t-linux-large and t-linux-xlarge, like:
--worker-override="t-linux-xlarge=gecko-t/t-linux-vm-2204-wayland"
See https://mozilla-hub.atlassian.net/wiki/spaces/ROPS/pages/318996714/Using+mach+try+fuzzy+--worker-override - it's confusing. I couldn't find an alias t-linux-xlarge-gcp in taskcluster/ci/config.yml.
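For anyone scripting these pushes, here is a minimal Python sketch that assembles the command line from a mapping of worker aliases to worker pools. The helper name `build_try_command` is hypothetical, and emitting one `--worker-override` flag per alias assumes that mach accepts the flag more than once. The sketch only builds and prints the command; it does not run it.

```python
import shlex
from typing import Dict, List


def build_try_command(query: str, overrides: Dict[str, str]) -> List[str]:
    """Build an argv list for `./mach try fuzzy` with worker overrides.

    Hypothetical helper: one --worker-override=ALIAS=POOL flag is added
    per entry in `overrides` (repeating the flag is an assumption).
    """
    argv = ["./mach", "try", "fuzzy", "-q", query]
    for alias, pool in overrides.items():
        argv.append(f"--worker-override={alias}={pool}")
    return argv


if __name__ == "__main__":
    pool = "gecko-t/t-linux-vm-2204-wayland"
    command = build_try_command(
        "test linux webgpu",
        {"t-linux-large": pool, "t-linux-xlarge": pool},
    )
    # shlex.join quotes arguments so the line can be pasted into a shell.
    print(shlex.join(command))
```

Keeping the aliases in a mapping makes it easy to add or drop an override without retyping the full pool name.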
Comment 12•1 year ago
:aerickson: Aha, that seems to change something, but now all non-build jobs are ending in exception with no logs.
Comment 13•1 year ago
The original workers are docker-workers (which use a different payload). The new workers are generic-workers, and the worker-override option doesn't seem to change the payload to match what the worker expects. I think some taskgraph hacking is required to get the jobs to run on the instances.
Comment 14•1 year ago
:aerickson and :jmaher have been extremely helpful in actually getting a working Try build that exercises this environment, and it seems that we are successfully running most WebGPU tests. I believe that the spirit of this ticket has been resolved, and that we just need to adjust test expectation metadata with the WIP patch stack I have against bug 1836805.
Thanks for your help, everyone!