Open Bug 1868357 Opened 10 months ago Updated 1 month ago

Intermittent svg/as-image/canvas-drawImage-alpha-2.html == svg/as-image/canvas-drawImage-alpha-2-ref.html | single tracking bug

Categories

(Core :: Graphics: WebRender, defect)

People

(Reporter: intermittent-bug-filer, Unassigned)

References

(Blocks 1 open bug)

Details

(Keywords: intermittent-failure, intermittent-testcase, Whiteboard: [stockwell infra])

Attachments

(1 obsolete file)

Filed by: csabou [at] mozilla.com
Parsed log: https://treeherder.mozilla.org/logviewer?job_id=438887501&repo=autoland
Full log: https://firefox-ci-tc.services.mozilla.com/api/queue/v1/task/LP_DabG5QmmIJXBFe9Mhzg/runs/0/artifacts/public/logs/live_backing.log
Reftest URL: https://hg.mozilla.org/mozilla-central/raw-file/tip/layout/tools/reftest/reftest-analyzer.xhtml#logurl=https://firefox-ci-tc.services.mozilla.com/api/queue/v1/task/LP_DabG5QmmIJXBFe9Mhzg/runs/0/artifacts/public/logs/live_backing.log&only_show_unexpected=1


[task 2023-12-05T12:36:20.530Z] 12:36:20     INFO - REFTEST TEST-PASS | layout/reftests/svg/as-image/canvas-drawImage-alpha-1.html == layout/reftests/svg/as-image/canvas-drawImage-alpha-1-ref.html | image comparison, max difference: 0, number of differing pixels: 0
[task 2023-12-05T12:36:20.531Z] 12:36:20     INFO - REFTEST TEST-END | layout/reftests/svg/as-image/canvas-drawImage-alpha-1.html == layout/reftests/svg/as-image/canvas-drawImage-alpha-1-ref.html
[task 2023-12-05T12:36:20.538Z] 12:36:20     INFO - REFTEST TEST-START | layout/reftests/svg/as-image/canvas-drawImage-alpha-2.html == layout/reftests/svg/as-image/canvas-drawImage-alpha-2-ref.html
[task 2023-12-05T12:36:20.538Z] 12:36:20     INFO - REFTEST TEST-LOAD | file:///Z:/task_170177374376981/build/tests/reftest/tests/layout/reftests/svg/as-image/canvas-drawImage-alpha-2.html | 21 / 138 (15%)
[task 2023-12-05T12:36:20.578Z] 12:36:20     INFO - REFTEST TEST-LOAD | file:///Z:/task_170177374376981/build/tests/reftest/tests/layout/reftests/svg/as-image/canvas-drawImage-alpha-2-ref.html | 21 / 138 (15%)
[task 2023-12-05T12:36:20.629Z] 12:36:20     INFO - REFTEST INFO | REFTEST fuzzy test (0, 0) <= (47, 675) <= (1, 39900)
[task 2023-12-05T12:36:20.929Z] 12:36:20     INFO - REFTEST TEST-UNEXPECTED-FAIL | layout/reftests/svg/as-image/canvas-drawImage-alpha-2.html == layout/reftests/svg/as-image/canvas-drawImage-alpha-2-ref.html | image comparison, max difference: 47, number of differing pixels: 675
[task 2023-12-05T12:36:20.930Z] 12:36:20     INFO - REFTEST TEST-END | layout/reftests/svg/as-image/canvas-drawImage-alpha-2.html == layout/reftests/svg/as-image/canvas-drawImage-alpha-2-ref.html
[task 2023-12-05T12:36:20.932Z] 12:36:20     INFO - REFTEST TEST-START | layout/reftests/svg/as-image/canvas-drawImage-slice-1a.html == layout/reftests/svg/as-image/lime100x100-ref.html
[task 2023-12-05T12:36:20.932Z] 12:36:20     INFO - REFTEST TEST-LOAD | file:///Z:/task_170177374376981/build/tests/reftest/tests/layout/reftests/svg/as-image/canvas-drawImage-slice-1a.html | 22 / 138 (15%)
[task 2023-12-05T12:36:20.948Z] 12:36:20     INFO - REFTEST TEST-PASS | layout/reftests/svg/as-image/canvas-drawImage-slice-1a.html == layout/reftests/svg/as-image/lime100x100-ref.html |
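
For anyone triaging similar logs: the "fuzzy test" line above appears to print the allowed lower bound, the observed values, and the allowed upper bound as (max difference, differing pixels) pairs. Below is a minimal sketch of checking such a line, assuming each component is compared against its own range. The helper name `check_fuzzy` and the regex are illustrative only and are not part of the reftest harness.

```python
import re

# Matches the harness line "fuzzy test (a, b) <= (c, d) <= (e, f)".
# The pattern is an assumption based on the single log line quoted above.
FUZZY_RE = re.compile(
    r"fuzzy test \((\d+), (\d+)\) <= \((\d+), (\d+)\) <= \((\d+), (\d+)\)"
)


def check_fuzzy(line):
    """Return (within_bounds, details) for a 'fuzzy test' log line, or None."""
    m = FUZZY_RE.search(line)
    if m is None:
        return None
    min_diff, min_px, diff, px, max_diff, max_px = map(int, m.groups())
    diff_ok = min_diff <= diff <= max_diff   # per-channel max difference
    px_ok = min_px <= px <= max_px           # number of differing pixels
    return diff_ok and px_ok, {"diff_ok": diff_ok, "pixels_ok": px_ok}


line = "REFTEST INFO | REFTEST fuzzy test (0, 0) <= (47, 675) <= (1, 39900)"
print(check_fuzzy(line))
```

Under that reading, the failure here comes from the max difference (47, against an allowed 1); the 675 differing pixels are well inside the allowed 39900.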

It looks like we had a handful of failures in the same directory, most of them looking ~fuzzy but with one that's decidedly not-fuzzy:

skip-if(!browserIsRemote||!d2d||gpuProcess) == data:text/plain,FAIL about:blank

The reftest screenshot shows us rendering the FAIL text, with the reference case being blank. The comment in reftest.list adds some context about what that test is actually trying to do:
https://searchfox.org/mozilla-central/rev/413b88689f3ca2a30b3c49465730c0e7d40f9188/layout/reftests/layers/reftest.list#33-40

Essentially the test is a "canary" that is never actually meant to run (per its skip-if conditions); if the conditions change such that it becomes enabled, it will fail and signal why the other tests around it are failing, I think.
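
To make the canary logic concrete, here is a small sketch that mirrors the skip-if expression as a plain boolean. Only `gpuProcess: false` is taken from the sandbox dump in this log; the `browserIsRemote` and `d2d` values are assumptions for illustration, and `canary_is_skipped` is a made-up helper, not harness code.

```python
def canary_is_skipped(sandbox):
    """Mirror the condition skip-if(!browserIsRemote||!d2d||gpuProcess)."""
    return (
        not sandbox["browserIsRemote"]
        or not sandbox["d2d"]
        or sandbox["gpuProcess"]
    )


# gpuProcess comes from the sandbox dump in this log; the other two values
# are ASSUMED for illustration and do not appear in the log excerpt.
sandbox = {"browserIsRemote": True, "d2d": True, "gpuProcess": False}

# With no GPU process, nothing in the expression is true, so the canary runs.
print(canary_is_skipped(sandbox))

# With a working GPU process, the canary is skipped as intended.
print(canary_is_skipped({**sandbox, "gpuProcess": True}))
```

With those assumed values, the first call returns False (the canary runs and fails by design) and the second returns True.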

In this case, gpuProcess at least is false, as shown at the start of the log, and there are also some messages about the GPU process failing to spin up:

[task 2023-12-05T12:30:05.204Z] 12:30:05     INFO - [GFX1-]: Killing GPU process due to IPC reply timeout
[task 2023-12-05T12:30:05.205Z] 12:30:05     INFO - [GFX1-]: Fallback WR to SW-WR + D3D11
[task 2023-12-05T12:30:05.205Z] 12:30:05     INFO - [GFX1-]: Failed to create remote compositor
[task 2023-12-05T12:30:05.208Z] 12:30:05     INFO - [Parent 9224, Main Thread] WARNING: base::KillProcess refusing to terminate process handle 0: file /builds/worker/checkouts/gecko/ipc/chromium/src/base/process_util_win.cc:349
[task 2023-12-05T12:30:05.209Z] 12:30:05     INFO - [GFX1-]: [D3D11] failed to get compositor device.
[task 2023-12-05T12:30:05.210Z] 12:30:05     INFO - [GFX1-]: Failed to initialize CompositorD3D11 for SWGL: FEATURE_FAILURE_D3D11_NO_DEVICE
[...]
[task 2023-12-05T12:31:42.298Z] 12:31:42     INFO - REFTEST INFO | Dumping representation of sandbox which can be used for expectation annotations
[...]
[task 2023-12-05T12:31:42.309Z] 12:31:42     INFO - REFTEST INFO |     gpuProcess: false

So it looks like this task was meant to be a WebRender task (per "qr" in the name), but hardware WebRender (and the GPU process) failed for some reason, so some failures/fuzziness are expected as a result.
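
When checking whether another failing run fits this pattern, a quick scan of the log for the GFX messages quoted above can help. This is a rough sketch only: the marker strings are copied from this log, while `gpu_fallback_markers` is a hypothetical helper and not an existing tool.

```python
# Substrings copied from the log excerpt above.
MARKERS = (
    "Killing GPU process due to IPC reply timeout",
    "Fallback WR to SW-WR",
    "Failed to create remote compositor",
)


def gpu_fallback_markers(log_lines):
    """Return the marker strings found in the log, in first-seen order."""
    found = []
    for line in log_lines:
        for marker in MARKERS:
            if marker in line and marker not in found:
                found.append(marker)
    return found


sample = [
    "[task 2023-12-05T12:30:05.204Z] 12:30:05     INFO - [GFX1-]: Killing GPU process due to IPC reply timeout",
    "[task 2023-12-05T12:30:05.205Z] 12:30:05     INFO - [GFX1-]: Fallback WR to SW-WR + D3D11",
    "[task 2023-12-05T12:31:42.309Z] 12:31:42     INFO - REFTEST INFO |     gpuProcess: false",
]
print(gpu_fallback_markers(sample))
```

A non-empty result suggests the run fell back to software WebRender, in which case the fuzzy failures are a symptom and not the underlying problem.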

Component: SVG → Graphics: WebRender

This doesn't seem to have been a one-off. Per OrangeFactor it has happened twice, both times on the same push: https://hg.mozilla.org/integration/autoland/rev/10602eb918edd2ccf85a22fb60b3dc62c845c42a . Both logs match what I described in comment 1: "Killing GPU process due to IPC reply timeout", followed by "Fallback WR to SW-WR + D3D11".

gw, do you know what might be going on here? I wonder if there was a recent regression that might be leading to the GPU process getting stalled or something?

(If this doesn't happen again, we can probably forget about it. But the two-instances-right-away suggests that it might happen more.)

Flags: needinfo?(gwatson)

Not sure what is going on here. It's possible Kelsey or Sotaro may have some ideas, but the logs don't give much detail to go on.

Flags: needinfo?(gwatson)
See Also: → 1858436

Bug 1858436 comment 18 might be the explanation here.

[GFX1-]: Killing GPU process due to IPC reply timeout

appears in the logs for this new surge.

Attachment #9383664 - Attachment is obsolete: true

This is an infra issue -> comment 17

Whiteboard: [stockwell disable-recommended] → [stockwell infra]