Intermittent TCw mozilla/tests/webgpu/cts/webgpu/* | Failure while resetting counters OR Reached unreachable code
Categories
(Testing :: General, defect, P5)
Tracking
(firefox-esr115 unaffected, firefox121 unaffected, firefox122 unaffected, firefox123 wontfix, firefox124 wontfix)
Version | Tracking | Status |
---|---|---|
firefox-esr115 | --- | unaffected |
firefox121 | --- | unaffected |
firefox122 | --- | unaffected |
firefox123 | --- | wontfix |
firefox124 | --- | wontfix |
People
(Reporter: intermittent-bug-filer, Unassigned)
References
(Regression)
Details
(Keywords: intermittent-failure, regression)
Filed by: ncsoregi [at] mozilla.com
Parsed log: https://treeherder.mozilla.org/logviewer?job_id=442396690&repo=mozilla-central
Full log: https://firefox-ci-tc.services.mozilla.com/api/queue/v1/task/bgTk9adpQbuPAM-lwBf13g/runs/0/artifacts/public/logs/live_backing.log
[task 2024-01-06T10:33:31.860Z] 10:33:31 INFO -
[task 2024-01-06T10:33:31.860Z] 10:33:31 INFO - TEST-UNEXPECTED-FAIL | /_mozilla/webgpu/cts/webgpu/shader/execution/expression/call/builtin/tanh/cts.https.html?q=webgpu:shader,execution,expression,call,builtin,tanh:f16:* | :inputSource="const";vectorize="_undef_" - assert_unreached:
[task 2024-01-06T10:33:31.860Z] 10:33:31 INFO - - EXCEPTION: WebGPU device failed to initialize with Error "requestAdapter returned null"; not retrying
[task 2024-01-06T10:33:31.860Z] 10:33:31 INFO - assert@https://web-platform.test:8443/_mozilla/webgpu/common/util/util.js:38:11
[task 2024-01-06T10:33:31.860Z] 10:33:31 INFO - acquire@https://web-platform.test:8443/_mozilla/webgpu/webgpu/util/device_pool.js:42:11
[task 2024-01-06T10:33:31.860Z] 10:33:31 INFO -
[task 2024-01-06T10:33:31.860Z] 10:33:31 INFO - Reached unreachable code
[task 2024-01-06T10:33:31.860Z] 10:33:31 INFO - wpt_fn@https://web-platform.test:8443/_mozilla/webgpu/common/runtime/wpt.js:75:25
[task 2024-01-06T10:33:31.861Z] 10:33:31 INFO -
Comment 1•6 months ago
There are test coverage wpt failures such as:
- TEST-UNEXPECTED-FAIL | /_mozilla/webgpu/cts/webgpu/api/operation/adapter/requestAdapter/cts.https.html?q=webgpu:api,operation,adapter,requestAdapter:requestAdapter_no_parameters:* | : - assert_unreached:
- TEST-UNEXPECTED-ERROR | /_mozilla/webgpu/cts/webgpu/api/validation/render_pipeline/overrides/cts.https.html?q=webgpu:api,validation,render_pipeline,overrides:identifier,vertex:* | Traceback (most recent call last):
- TEST-UNEXPECTED-FAIL | /_mozilla/webgpu/cts/webgpu/shader/execution/expression/call/builtin/tanh/cts.https.html?q=webgpu:shader,execution,expression,call,builtin,tanh:f16:* | :inputSource="const";vectorize="_undef_" - assert_unreached:
- TEST-UNEXPECTED-ERROR | /_mozilla/webgpu/cts/webgpu/api/operation/adapter/requestAdapter/cts.https.html?q=webgpu:api,operation,adapter,requestAdapter:requestAdapter_no_parameters:* | Traceback (most recent call last):
- TEST-UNEXPECTED-ERROR | /_mozilla/webgpu/cts/webgpu/api/validation/render_pipeline/primitive_state/cts.https.html?q=webgpu:api,validation,render_pipeline,primitive_state:strip_index_format:* | Traceback (most recent call last):
Erich, these appeared on today's merge, after Bug 1822630 reached mozilla-central.
I've created a general bug since they might have the same root cause, but let us know if separate bugs are needed.
Is there any chance you could take a look at this?
Thank you!
Comment 2•6 months ago
Set release status flags based on info from the regressing bug 1822630
Comment hidden (Intermittent Failures Robot)
Comment 4•6 months ago
:nataliaCs: There seem to be at least two separate sets of symptoms among the 6 instances of failures currently represented by the Orange Factor index for the past week (which you mentioned in comment 1):
- Infrastructure failures, perhaps caused by bugs in wptrunner (CC :jgraham):
  - https://treeherder.mozilla.org/logviewer?job_id=442396691&repo=mozilla-central&lineNumber=2313
  - https://treeherder.mozilla.org/logviewer?job_id=442397875&repo=mozilla-central&lineNumber=1885
  - https://treeherder.mozilla.org/logviewer?job_id=442397944&repo=mozilla-central&lineNumber=3754
  - https://treeherder.mozilla.org/logviewer?job_id=442397967&repo=mozilla-central&lineNumber=1886
- WebGPU CTS is being run in a Linux 18.04 environment, which, as we have determined since bug 1837027, does not have the proper environment for running WebGPU tests:
  - https://treeherder.mozilla.org/logviewer?job_id=442396690&repo=mozilla-central&lineNumber=2329
  - https://treeherder.mozilla.org/logviewer?job_id=442396782&repo=mozilla-central&lineNumber=2335
It's salient to note that all of these failures are in test-coverage-wpt jobs, which the WebGPU team does not have direct stewardship over. Furthermore, I don't see how either set of symptoms could be relevant to the changes made in bug 1822630, which (1) changed the layout of WPT test files (which should be mostly internal to WebGPU WPT tests) and (2) changed the way in which those tests were distributed among chunks in CI (which should also not be visible in other test configurations). Also, these offending tests are running at tier 2, while WebGPU CTS is currently marked as running at tier 3. The discrepancy makes this problem visible, and raises the question: Should we be running tier 3 tests in a tier 2 job at all?
NI'ing :jmaher, who is likely to understand more about both the test-coverage-wpt test and its configuration. CC'ing :jgraham, in case he also has knowledge that gives us leverage to determine the right solution.
Comment 5•6 months ago
Thank you very much for taking the time to check this!
Comment 6•6 months ago
There are some tasks like test-verify and test-coverage that make some assumptions and do the best they can to run properly. They are not run per push and provide ancillary data to CI. In this case test-coverage-wpt is set up to run on linux, but it was never set up to run on linux2204 or handle specifics like webgpu, canvas, or privatebrowsing.
We have a reference to test-coverage-wpt in test-sets.yml:
https://searchfox.org/mozilla-central/source/taskcluster/ci/test/test-sets.yml#133
I would be surprised if the test-coverage tools will work on 22.04; that is a TODO item to see if that works. Eventually we will need to migrate all test-coverage tasks to be run on 22.04 instead of 18.04.
In addition to that, we need to see if we can create a different job for webgpu, maybe test-coverage-wpt-webgpu or something like that. We should look at test-verify as well.
So there are 2 TODO items:
- see if test-coverage / code-coverage tools work on existing linux2204-wayland image
- figure out how to split out webgpu and other tagged/subharnesses from wpt
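The second TODO item could be pictured roughly as follows. This is only a sketch in Python, not how the taskgraph actually selects tests: the `SPLIT_OUT` tuple, the `group_for` and `split_tests` helpers, and the idea of detecting a group by a directory name in the test path are all hypothetical, and the real configuration may rely on tags or subsuite definitions instead.

```python
from collections import defaultdict

# Hypothetical: names (taken from the comment above) that would get their own job.
SPLIT_OUT = ("webgpu", "canvas", "privatebrowsing")


def group_for(test_path: str) -> str:
    """Return the group a test path would fall into (illustrative rule only)."""
    parts = test_path.strip("/").split("/")
    for name in SPLIT_OUT:
        # Assumption: a group is recognised by a matching directory component.
        if name in parts:
            return name
    return "wpt"  # everything else stays in the generic wpt job


def split_tests(paths):
    """Partition test paths into {group name: [paths]}."""
    groups = defaultdict(list)
    for path in paths:
        groups[group_for(path)].append(path)
    return dict(groups)
```

With a rule like this, a path such as `tests/web-platform/mozilla/tests/webgpu/cts/...` would land in a `webgpu` group, which a separate job (for example the `test-coverage-wpt-webgpu` idea above) could then run on a suitable image.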
Comment 7•6 months ago
Are the tests still at _mozilla/webgpu?
using this guide for a way to find "subsuites":
https://searchfox.org/mozilla-central/source/taskcluster/gecko_taskgraph/util/chunking.py#32
I ask because one of the failing jobs has this in the path: tests/web-platform/mozilla/tests/webgpu/cts/webgpu/api/validation/render_pipeline/overrides/cts.https.html
Comment 8•6 months ago
:jmaher: That's correct, yes! [struck out: The path with mozilla instead of _mozilla looks like an error of some kind. I'll investigate.] Oh, derp, right, _mozilla in the WPT URL means mozilla in-tree. Nothing looks incorrect to me ATM, actually. 😅
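To make the `_mozilla` to `mozilla` correspondence concrete, here is a minimal Python sketch. The helper name `wpt_url_to_tree_path` is hypothetical, and the `tests/web-platform/mozilla/tests` prefix is simply assumed from the job path quoted in comment 7; the actual mapping lives in the WPT harness configuration and may differ.

```python
from urllib.parse import urlsplit

# Assumption: prefix copied from the failing job's path quoted in comment 7.
IN_TREE_PREFIX = "tests/web-platform/mozilla/tests"
URL_MARKER = "/_mozilla/"


def wpt_url_to_tree_path(url: str) -> str:
    """Map a /_mozilla/ WPT URL (or URL path) to the assumed on-disk path."""
    path = urlsplit(url).path  # drops scheme, host, and the ?q=... query
    if not path.startswith(URL_MARKER):
        raise ValueError(f"not a Mozilla-specific WPT URL: {url!r}")
    return f"{IN_TREE_PREFIX}/{path[len(URL_MARKER):]}"
```

Applied to the `render_pipeline/overrides` test ID from comment 1, this yields the `tests/web-platform/mozilla/tests/webgpu/cts/...` path seen in the failing job, which is consistent with nothing having moved outside of `_mozilla/webgpu`.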
Comment 9•6 months ago
:jmaher: For context, bug 1822630 changed the layout of _mozilla/webgpu somewhat, but shouldn't have moved tests outside of it.
Comment 10•6 months ago
Set release status flags based on info from the regressing bug 1822630
Comment 11•5 months ago
https://wiki.mozilla.org/Bug_Triage#Intermittent_Test_Failure_Cleanup
For more information, please visit BugBot documentation.