Closed Bug 1822630 Opened 1 year ago Closed 6 months ago

Distribute tests for WebGPU CTS better

Categories

(Core :: Graphics: WebGPU, defect, P1)

defect

Tracking

()

RESOLVED FIXED
123 Branch
Tracking Status
firefox123 --- fixed

People

(Reporter: ErichDonGubler, Assigned: ErichDonGubler)

References

(Blocks 1 open bug)

Details

Attachments

(8 files)


Our current vendoring of gpuweb/cts is far from perfect WRT the tests we generate for CI. There are two related pain points we'd like to resolve with this bug:

  1. Chunked test files contain a fixed number of tests, evenly divided from the set generated by CTS upstream scripting. Each individual chunk requires its own set of metadata for the tests it contains. Together, these facts create a particularly painful problem: when we vendor in updates to the CTS, additions and removals of tests cause the following:

    1. The chunk that new tests land in and all subsequent chunks have to have lines moved between them. This generates a high volume of diff noise, making it difficult for a human to visually identify tests that have, in fact, been added or removed (rather than being moved between chunks).
    2. Expectation metadata for tests must also be manually moved to match generated changes, because of the above. This is tedious, because a high percentage of tests still fail, ATOW.
  2. :ErichDonGubler's initial stab at WebGPU CTS left an action item for improving the distribution of task times per :jmaher's request, and we'd like to honor that here. To wit, the WPT tests we generate for WebGPU CTS (viz., wpt* jobs in Taskcluster runs like these) need to:

    1. Stay within 50 minutes of execution time for optimized builds.
    2. Stay within 30 minutes of execution time for debug builds.
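
To illustrate the diff-noise problem in point 1, here is a minimal sketch. It assumes fixed-size "linear" chunking over an ordered test list; the test names, the chunk size, and the helper names `linear_chunks` and `moved_tests` are hypothetical and are not taken from the vendoring scripts.

```python
def linear_chunks(tests, size):
    """Split an ordered test list into fixed-size chunks (the "linear" scheme)."""
    return [tests[i:i + size] for i in range(0, len(tests), size)]


def moved_tests(before, after):
    """Return tests present in both chunkings whose chunk index changed."""
    index_before = {t: i for i, chunk in enumerate(before) for t in chunk}
    index_after = {t: i for i, chunk in enumerate(after) for t in chunk}
    return sorted(
        t for t in index_before
        if t in index_after and index_before[t] != index_after[t]
    )


# Hypothetical test names; a real CTS listing is far larger.
old = [f"test_{n:02}" for n in range(12)]
new = old[:2] + ["test_01a"] + old[2:]  # upstream adds one test near the front

churn = moved_tests(linear_chunks(old, 4), linear_chunks(new, 4))
print(churn)
```

In this toy case, one added test pushes one existing test out of every chunk from the insertion point onward, and the expectation metadata for each of those tests has to move with it.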
Assignee: nobody → egubler
See Also: → webgpu-cts
Assignee: egubler → nobody
Severity: -- → S3
Flags: needinfo?(jimb)
Assignee: nobody → egubler
Status: NEW → ASSIGNED

Discussed the way we might resolve this a bit with :jgilbert, :teoxoy, and :jimb, since it's become relevant with :jgilbert's work on bug 1831263. Our tentative direction is to use a hybrid manual-automatic WPT test chunking approach based on the upstream tree structure, instead of the “linear” chunking we do. The things we're thinking of getting with it:

  • Hand-pick a set of test paths that are relatively fast smoke-level tests, much like WebGL's “core” tasks in CI are currently structured. This is intended to allow folks to quickly triage whether or not they've broken something fundamental.
    • Maybe split the tests b/w API and shaders?
  • “auto”-chunk everything else into CTS test files based on, say, unique path components to a depth of 5; e.g., webgpu:api,operation,adapter,requestDevice,* and all tests underneath it become their own WPT test file.
  • When a set of tests from the above are taking too long, we split it, and add task chunks as necessary. We feel that it should be much easier to understand and resolve longer running sets of tests when they're divided conceptually, so this seems tractable to plan on doing.
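
The “auto”-chunking idea above can be sketched roughly as follows. This is only an illustration and not the re-vendoring tool's actual logic. It assumes test paths of the form `suite:path,components:test:params`, counts the suite name as the first of the 5 components, and uses illustrative test paths; `group_key` and `group_tests` are hypothetical names.

```python
from collections import defaultdict


def group_key(test_path, depth=5):
    """Build a wildcard key from the suite plus leading file-path components."""
    suite, _, rest = test_path.partition(":")
    file_path = rest.split(":", 1)[0]  # drop the test name and params
    components = [suite] + [c for c in file_path.split(",") if c and c != "*"]
    kept = components[:depth]
    return f"{kept[0]}:{','.join(kept[1:])},*"


def group_tests(test_paths, depth=5):
    """Map each wildcard key to the sorted test paths that fall under it."""
    groups = defaultdict(list)
    for path in test_paths:
        groups[group_key(path, depth)].append(path)
    return {key: sorted(paths) for key, paths in groups.items()}


# Illustrative paths only; not a real CTS listing.
tests = [
    "webgpu:api,operation,adapter,requestDevice:default:*",
    "webgpu:api,operation,adapter,requestDevice:limits,supported:*",
    "webgpu:api,operation,adapter,requestAdapter:requestAdapter:*",
    "webgpu:shader,execution,expression,binary,af_addition:scalar:*",
    "webgpu:shader,execution,expression,binary,af_comparison:equals:*",
]
groups = group_tests(tests)
for key, members in groups.items():
    print(key, len(members))
```

Because the key depends only on a test's own path, adding or removing a test upstream changes only the one group it belongs to.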

When a set of tests from the above are taking too long, we split it. We feel that it should be much easier to understand and resolve longer running sets of tests when they're divided conceptually, so this seems tractable to plan on doing.

This sounds fantastic to me.

Removing from current work queue, since I haven't been actively working on this recently.

Assignee: egubler → nobody
Status: ASSIGNED → NEW
Priority: -- → P3
Blocks: 1834558
See Also: → 1849016
Blocks: webgpu-phase-2
No longer blocks: webgpu-v1-cts-blockers
See Also: → 1850356
Assignee: nobody → egubler
Blocks: webgpu-v1
No longer blocks: webgpu-phase-2
Priority: P3 → P1

I am going to unassign myself from this bug, indicative of the fact that I'm no longer actively working on it. However, I wanted to dump some notes on my hot take for the larger divisions of labor:


Suggested chunks (and bikeshedding options), based on understanding derived from personal experience and upstream docs on the test tree layout:

  • webgpu-{core,smoke,essential}: Fast tests that cover a reasonable breadth of the WebGPU API in a small amount of time
    • webgpu:api,operation,*
    • webgpu:shader,execution,*
    • webgpu:idl,*
  • webgpu-{long_running,ugly,exhaustive}: Tests that have particularly large matrices of subtests (e.g., texture format, type of shader binding)
  • webgpu-{ext,other,full}: Anything else.
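
A rough sketch of how the suggested divisions could be expressed as a prefix table. The chunk names here arbitrarily pick one option from each set of bikeshedding choices above, the long-running list is left empty because no paths were named for it, and `CHUNK_PREFIXES`, `FALLBACK_CHUNK`, and `chunk_for` are hypothetical names, not existing tooling.

```python
# First matching entry wins; order therefore expresses priority.
CHUNK_PREFIXES = [
    ("webgpu-core", [
        "webgpu:api,operation,",
        "webgpu:shader,execution,",
        "webgpu:idl,",
    ]),
    # No paths were suggested for this chunk, so the list stays empty here.
    ("webgpu-long_running", []),
]
FALLBACK_CHUNK = "webgpu-other"  # "Anything else."


def chunk_for(test_path):
    """Return the name of the first chunk whose prefix list matches the path."""
    for chunk, prefixes in CHUNK_PREFIXES:
        if any(test_path.startswith(prefix) for prefix in prefixes):
            return chunk
    return FALLBACK_CHUNK


print(chunk_for("webgpu:api,operation,adapter,requestDevice:default:*"))
print(chunk_for("webgpu:api,validation,buffer,create:size:*"))
```

Keeping the table ordered means a slow subtree could later be carved out of the core chunk by adding a more specific entry ahead of it.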
Assignee: egubler → nobody
Priority: P1 → P3

Did an experiment with this bug today where I put all current CTS tests under a new cts path segment (viz., added the final segment in testing/web-platform/mozilla/tests/webgpu/cts): try:a0e6d9c553c1

Apparently, the additional path segment affects the decision task's chunking algorithm in a way that breaks how we want to run tests, ATM. So, we'll need to handle this first, before we can put tests behind even more path segments, it seems.

The division of test paths into arguments for chunked Taskcluster jobs appears to happen in the decision task, but I haven't dived into details of how (or even where in code) that's being done.

Depends on: 1861473
See Also: → 1861473
Assignee: nobody → egubler
Status: NEW → ASSIGNED
Priority: P3 → P1

I've figured out how to change decision task logic to more granularly pass /_mozilla/webgpu test paths to WPT test chunks as of try:5c6accf69184. I'm excited! This means this task is unblocked; assigned to myself, since I intend to do this ASAP.

No longer blocks: 1834558
Attachment #9369003 - Attachment description: Bug 1822630: test(webgpu): switch to single WPT file per CTS `spec.ts` path r=#webgpu-reviewers! → WIP: Bug 1822630: test(webgpu): switch to single WPT file per CTS `spec.ts` path r=#webgpu-reviewers!
  • Run moz-webgpu-cts process-reports --preset=reset-contradictory … on try:2234a8fd0c4acf8df6f0973c79faa9b4993a8798
  • Run moz-webgpu-cts process-reports --preset=merge … on try:cf792a6099e01c2058d60008926821ff3157b926

Run moz-webgpu-cts process-reports --preset=reset-contradictory … on try:2f71d85c7150ef03b83066f1a88efade262689e1.

Attachment #9370533 - Attachment description: Bug 1822630: refactor(webgpu): use `concat!(…)` to avoid multiline strings in CTS re-vendor bin. r=#webgpu-reviewers! → Bug 1822630, step 1, part 3: refactor(webgpu): use `concat!(…)` to avoid multiline strings in CTS re-vendor bin. r=#webgpu-reviewers!
Attachment #9369002 - Attachment description: WIP: Bug 1822630: test(webgpu): set effectively inf. depth for CTS test grouping r=#webgpu-reviewers! → Bug 1822630, step 2, part 1: test(webgpu): set effectively inf. depth for CTS test grouping r=#webgpu-reviewers!
Attachment #9369003 - Attachment description: WIP: Bug 1822630: test(webgpu): switch to single WPT file per CTS `spec.ts` path r=#webgpu-reviewers! → Bug 1822630, step 2, part 3: test(webgpu): switch to single WPT file per CTS `spec.ts` path r=#webgpu-reviewers!

This is not a perfect migration; this metadata has been relocated by
running moz-webgpu-cts process-reports …:

Depends on D196628

Keywords: leave-open
See Also: → 1871429
Pushed by egubler@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/033186437ce2
step 1, part 1: test(webgpu): try to green up CI from before consuming `wgpu` e1baa5a56e3f73c586cd001623dea2872ac609fd plus fixes r=webgpu-reviewers,nical
https://hg.mozilla.org/integration/autoland/rev/e2e799de9ca5
step 1, part 2: test(webgpu): adjust WPT exps. for `wgpu` e1baa5a56e3f73c586cd001623dea2872ac609fd r=webgpu-reviewers,supply-chain-reviewers,nical
https://hg.mozilla.org/integration/autoland/rev/ae50a3b121cf
step 1, part 3: refactor(webgpu): use `concat!(…)` to avoid multiline strings in CTS re-vendor bin. r=webgpu-reviewers,nical
Attachment #9369002 - Attachment description: Bug 1822630, step 2, part 1: test(webgpu): set effectively inf. depth for CTS test grouping r=#webgpu-reviewers! → Bug 1822630, step 2, part 1: test(webgpu): set effectively inf. depth for CTS test grouping r=#webgpu-reviewers!,jmaher!
Pushed by egubler@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/009ad7a2177f
step 2, part 1: test(webgpu): set effectively inf. depth for CTS test grouping r=webgpu-reviewers,jimb,jmaher
https://hg.mozilla.org/integration/autoland/rev/f538337928c1
step 2, part 2: test(webgpu): delete all WPT metadata r=webgpu-reviewers,jimb
https://hg.mozilla.org/integration/autoland/rev/c048a9a51bfe
step 2, part 3: test(webgpu): switch to single WPT file per CTS `spec.ts` path r=webgpu-reviewers,nical
https://hg.mozilla.org/integration/autoland/rev/21569fc8a100
step 2, part 4: test(webgpu): re-introduce migrated WPT metadata r=webgpu-reviewers,nical
https://hg.mozilla.org/integration/autoland/rev/2a8643887846
step 2, part 5: test(webgpu): order subtests in WPT files for CTS by name r=webgpu-reviewers,nical
Keywords: leave-open
Regressions: 1873336
Regressions: 1873696