Webgpu CI jobs are not running in configs that can do anything but fail
Categories
(Core :: Graphics: WebGPU, defect, P1)
Tracking
| | Tracking | Status |
|---|---|---|
| firefox116 | --- | fixed |
People
(Reporter: jgilbert, Assigned: ErichDonGubler)
References
(Blocks 3 open bugs)
Details
Attachments
(7 files, 4 obsolete files)
10.57 KB, text/x-python
48 bytes, text/x-phabricator-request (6 files)
Quis custodiet ipsos custodes?
This is the most recent try run from bug :
https://treeherder.mozilla.org/jobs?repo=try&revision=adefaafc0b5d68929d5b0a8d3369fd523a154b59
Despite the extensive changes in that bug, apparently all's well. That seemed too suspicious to me.
Here's that same run, but asking for a GPU:
https://treeherder.mozilla.org/jobs?repo=try&revision=fd60d0fa06e61a00633f0fca75faba6a474e7d65
Linux is still not getting a GPU, so it successfully continues to fail every test as expected.
Windows is getting a GPU now, though, and we finally start to see daylight:
[task 2023-05-04T08:06:39.943Z] 08:06:39 INFO - TEST-START | /_mozilla/webgpu/chunked/1/cts.https.html?q=webgpu:api,operation,adapter,requestDevice:default:*
[task 2023-05-04T08:06:39.971Z] 08:06:39 INFO - Setting pref dom.webgpu.enabled to true
[task 2023-05-04T08:06:41.745Z] 08:06:41 INFO -
[task 2023-05-04T08:06:41.745Z] 08:06:41 INFO - TEST-UNEXPECTED-PASS | /_mozilla/webgpu/chunked/1/cts.https.html?q=webgpu:api,operation,adapter,requestDevice:default:* | : - expected FAIL
[task 2023-05-04T08:06:41.745Z] 08:06:41 INFO - TEST-INFO | expected FAIL
[task 2023-05-04T08:06:41.746Z] 08:06:41 INFO - TEST-OK | /_mozilla/webgpu/chunked/1/cts.https.html?q=webgpu:api,operation,adapter,requestDevice:default:* | took 1803ms
Chunking also probably doesn't work for this job right now: I asked for 4 chunks and am still getting 2.
Similarly, mochitest-webgpu Linux jobs look like:
[task 2023-05-04T04:07:56.305Z] 04:07:56 INFO - TEST-START | dom/webgpu/mochitest/test_device_creation.html
[task 2023-05-04T04:07:56.490Z] 04:07:56 INFO - GECKO(1976) | Validation error without device target: No suitable adapter found
[task 2023-05-04T04:07:56.525Z] 04:07:56 INFO - GECKO(1976) | MEMORY STAT | vsize 2570MB | residentFast 146MB | heapAllocated 8MB
[task 2023-05-04T04:07:56.681Z] 04:07:56 INFO - TEST-OK | dom/webgpu/mochitest/test_device_creation.html | took 375ms
mochitest-webgpu does not appear to be running on Windows right now.
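For anyone wanting to check a downloaded log for the pattern shown in the excerpts above, here is a minimal sketch that pulls out `TEST-UNEXPECTED-*` lines. The regex is inferred only from the log lines quoted in this description, and `unexpected_results` is a hypothetical helper name, not part of any Mozilla tooling.

```python
import re

# Inferred from the quoted log lines, e.g.:
#   ... INFO - TEST-UNEXPECTED-PASS | <test id> | : - expected FAIL
# Other log shapes may exist; this pattern is an assumption, not a spec.
UNEXPECTED = re.compile(
    r"TEST-UNEXPECTED-(?P<status>[A-Z-]+) \| (?P<test>\S+)"
    r"(?: \| .*?expected (?P<expected>[A-Z-]+))?"
)


def unexpected_results(log_lines):
    """Yield (test, actual status, expected status or None) per unexpected line."""
    for line in log_lines:
        match = UNEXPECTED.search(line)
        if match:
            yield match.group("test"), match.group("status"), match.group("expected")
```

Feeding it the Windows excerpt above would surface the `requestDevice:default:*` test as an unexpected PASS against an expected FAIL, while the Linux mochitest excerpt yields nothing because the test is reported as `TEST-OK`.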
Reporter
Comment 1 • 2 years ago
Comment 2 • 1 year ago
Kelsey and Erich are working on re-evaluating what our test expectations should be, so that when we land the fix we won't get massive oranges from unexpected passes.
Reporter
Comment 3 • 1 year ago
This is the script I wrote to automate re-marking tests, since wpt-update doesn't seem to be working, and Erich's previous experience was that it was slow anyway.
Feed it "wptreport.json"s one at a time, generally from the list of artifacts from a W(webgpuN) job, and it will re-mark tests based on the unexpected values in the report json.
It generally tries to preserve existing comments and line orders, with some narrow exceptions.
It has a framework to support amending expectations with e.g. if os == "linux": FAIL/PASS later on, but consider that aspect a WIP.
If this pans out, this will land somewhere more permanent, possibly in-tree.
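The attached script itself is not reproduced in this thread. As a rough illustration of the first step it describes (finding the unexpected values in a report), here is a sketch, not the attached implementation. It assumes a `wptreport.json` has a top-level `results` list whose entries carry `test`, `status`, and `subtests`, and that an `expected` key appears only when the actual status differed from the expectation. `collect_unexpected` and `load_report` are hypothetical names.

```python
import json


def load_report(path):
    """Read one wptreport.json file from disk."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def collect_unexpected(report):
    """Map test id -> {subtest name, or None for the test itself: (actual, expected)}.

    Assumption: an entry has an "expected" key only when its actual status
    did not match the expectation recorded in the metadata.
    """
    updates = {}
    for result in report.get("results", []):
        mismatches = {}
        if "expected" in result:
            mismatches[None] = (result["status"], result["expected"])
        for subtest in result.get("subtests", []):
            if "expected" in subtest:
                mismatches[subtest["name"]] = (subtest["status"], subtest["expected"])
        if mismatches:
            updates[result["test"]] = mismatches
    return updates
```

Rewriting the expectation files while preserving comments and line order, which is the harder part the comment describes, is deliberately left out of this sketch.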
Reporter
Comment 4 • 1 year ago
Erich will try to mark these failures in a one-off way; the automated marking will likely land separately once it is finished.
Reporter
Comment 5 • 1 year ago
I think we should keep the old title.
"Webgpu CI jobs are not running in configs that can do anything but fail" is the defect/bug, "Narrow WebGPU CTS testing in CI to Windows (for now)" is the solution/patch.
Assignee
Comment 6 • 1 year ago
Assignee
Comment 7 • 1 year ago
Depends on D179812
Assignee
Comment 8 • 1 year ago
Depends on D179813
Assignee
Comment 9 • 1 year ago
Depends on D179814
Assignee
Comment 10 • 1 year ago
Co-Authored-By: Kelsey Gilbert <jgilbert@mozilla.com>
Depends on D179815
Assignee
Comment 11 • 1 year ago
Depends on D179816
Assignee
Comment 12 • 1 year ago
Seeing if I can land the first patches only depending on #webgpu-reviewers with this Try build. 🤞🏻 EDIT: It's green, ship it!
Comment 13 • 1 year ago
Assignee
Comment 14 • 1 year ago
Since D179816 has approval, I'm going to try reordering it to be earlier in the patch stack, so it can land. Checking that everything is still green in CI with this Try push. EDIT: It's green! Landing.
Comment 15 • 1 year ago
bugherder
https://hg.mozilla.org/mozilla-central/rev/09f78a71f86a
https://hg.mozilla.org/mozilla-central/rev/485b73c565ed
https://hg.mozilla.org/mozilla-central/rev/d64d0a03298f
Reporter
Comment 16 • 1 year ago
This only partially landed.
Comment 17 • 1 year ago
Comment 18 • 1 year ago
Comment 19 • 1 year ago
Backed out for causing webgpu failures at mozilla/webgpu/chunked/1/cts.https.
Backout link: https://hg.mozilla.org/integration/autoland/rev/9b04e12f5edd9f428f3c9bd0d6b4b9a99cdc646e
Push where failures started: https://treeherder.mozilla.org/jobs?repo=autoland&resultStatus=testfailed%2Cbusted%2Cexception%2Cretry%2Cusercancel&revision=3e7d709ab2a3a86ff58f4406faa05e738ca49ec9&selectedTaskRun=ZNkrNVr_Qz6q5LlNst4UoA.0
Failure log: https://treeherder.mozilla.org/logviewer?job_id=418319498&repo=autoland&lineNumber=1868
Comment 20 • 1 year ago
Backed out changeset 1ceebfa927ec for causing TypeError related webgpu failures
Comment 21 • 1 year ago
Assignee
Comment 22 • 1 year ago
Investigating backouts. I'm greatly surprised that the last single revision got backed out, given the positive Try push I noted in comment 14.
Comment 23 • 1 year ago
Comment 25 • 1 year ago
bugherder
Reporter
Comment 26 • 1 year ago
This still has parts that haven't landed.
Assignee
Comment 27 • 1 year ago
Currently investigating why web-platform-tests jobs generated by Taskcluster's decision task are including WebGPU tests. One can observe this with Try pushes like this one, which, like backout pushes above, fail because /_mozilla/webgpu tests are still being included (despite the intent to not do so with bug 1829715, CC :jgilbert). This is the single biggest source of CI failures with current pushes.
Assignee
Comment 28 • 1 year ago
Looks like we were never actually filtering out webgpu tests from wpt. This Try push's wpt5 on Linux 18.04 x64 WebRender debug doesn't have any of the changes that haven't yet landed and been backed out, but apparently contains /_mozilla/webgpu test runs. 😭
Sooo...we get to figure out how to actually do that filtering correctly now, since we're specifically accepting that the virtualization: virtual environment specified in the web-platform-tests is broken, and we need virtual-with-gpu instead.
Asked :jgraham about this in Firefox CI on Matrix.
Assignee
Comment 29 • 1 year ago
Talked with :jgraham. Current plan is to expose an inverse filtering operation based on WPT tag, i.e., an --exclude-tag=… option or something similar. I'm going to try writing the PR myself, but if that doesn't work as I expect, I can either get support or hand off to :jgraham in his next work day.
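To make the idea of an inverse tag filter concrete, here is a small sketch of how include and exclude filtering by tag could combine. This is not wptrunner's implementation: the `select_tests` function and the shape of its input (a mapping from test id to a set of tags) are assumptions made purely for illustration.

```python
def select_tests(tests, include_tags=(), exclude_tags=()):
    """Return test ids that pass both the include and exclude tag filters.

    tests: hypothetical mapping of test id -> set of tag strings.
    A test is kept when it carries at least one included tag (if any
    include tags were given) and carries none of the excluded tags.
    """
    include = set(include_tags)
    exclude = set(exclude_tags)
    selected = []
    for test_id, tags in tests.items():
        if include and not (tags & include):
            continue  # an include filter is active and this test matches none of it
        if tags & exclude:
            continue  # the exclude filter wins over any include match
        selected.append(test_id)
    return selected
```

Under these assumptions, a generic web-platform-tests job would call something like `select_tests(tests, exclude_tags=["webgpu"])`, while a dedicated WebGPU job on GPU-capable workers would pass `include_tags=["webgpu"]` instead.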
Assignee
Comment 30 • 1 year ago
Assignee
Comment 31 • 1 year ago
Depends on D180993
Comment 32 • 1 year ago
Comment on attachment 9339159 [details]
WIP: Bug 1831263: feat(wpt): add --exclude-tag
Revision D180993 was moved to bug 1838742. Setting attachment 9339159 [details] to obsolete.
Comment 33 • 1 year ago
Comment on attachment 9339160 [details]
WIP: Bug 1831263: test(wpt): use --exclude-tag=webgpu instead of --exclude=… for web-platform-tests r?#webgpu-reviewers
Revision D180994 was moved to bug 1838742. Setting attachment 9339160 [details] to obsolete.
Comment 34 • 1 year ago
Comment 35 • 1 year ago
bugherder
Assignee
Comment 36 • 1 year ago
LOL, shoulda used leave-open.
Assignee
Comment 37 • 1 year ago
Depends on D179815
Comment 38 • 1 year ago
Comment 39 • 1 year ago
Backed out for webgpu unexpected passes.
Failure log:
- https://treeherder.mozilla.org/logviewer?job_id=420019674&repo=autoland
- https://treeherder.mozilla.org/logviewer?job_id=420019822&repo=autoland
Backout link: https://hg.mozilla.org/integration/autoland/rev/b3731aa59f4b75e213c7516d9d202053e59e7028
Comment 40 • 1 year ago
Comment 41 • 1 year ago
Backed out for causing webgpu unexpected fails.
Backout link: https://hg.mozilla.org/integration/autoland/rev/37ce560e4c4242d065c31dc5d81368a994123371
Comment 42 • 1 year ago
Comment 43 • 1 year ago
Comment 44 • 1 year ago
Comment 45 • 1 year ago
bugherder