Closed Bug 1718438 Opened 4 years ago Closed 3 years ago

Intermittent Windows 10 AArch64 qr opt [tier 2] bugs/76331-1.html == bugs/76331-1-ref.html | image comparison, max difference: 255, number of differing pixels: 7043 ("RenderThread detected a device reset in PostUpdate" "Device reset due to WR device")

Categories

(Core :: Graphics: WebRender, defect, P5)

Tracking

()

RESOLVED WORKSFORME
Tracking Status
firefox-esr78 --- unaffected
firefox89 --- unaffected
firefox90 --- unaffected
firefox91 --- affected

People

(Reporter: intermittent-bug-filer, Unassigned)

References

(Regression)

Details

(Keywords: intermittent-failure, regression)

Summary: Intermittent [tier 2] bugs/76331-1.html == bugs/76331-1-ref.html | image comparison, max difference: 255, number of differing pixels: 7043 → Intermittent Windows 10 AArch64 qr opt [tier 2] bugs/76331-1.html == bugs/76331-1-ref.html | image comparison, max difference: 255, number of differing pixels: 7043
See Also: → 1696069
See Also: → 1719114
See Also: → 1718427
Has Regression Range: --- → yes

Andrew, is there anything that can be done to reduce the failure rate of reftests on Windows AArch64 with WebRender enabled? The tasks are failing permanently and contain several failures (not always the same ones).

Flags: needinfo?(aosmond)

It looks like maybe this has gone away? No intermittents have been tagged with this bug for over a month. The most recent one is from July 28: https://treeherder.mozilla.org/logviewer?job_id=346634932&repo=mozilla-beta

Also: I think this is really a WebRender bug (or a failure being surfaced via WebRender).

In the failure log from comment 0, it looks like we're capturing and showing a completely different testcase's rendering for the "Test" screenshot, and the following graphics error logging appears in the log just before the test failure:

INFO - [GFX1-]: EGL image is not valid.
INFO - [GFX1-]: Internal D3D11 error: HRESULT: 0x887A0005: Error allocating Texture2D
INFO - [GFX1-]: Context has been lost.
INFO - [GFX1]: Device reset due to WR device: 0x887a0006
INFO - [GFX1-]: GFX: RenderThread detected a device reset in PostUpdate
INFO - [GFX1]: Device reset due to WR device: 0x887a0006
INFO - [GFX1]: Device reset due to WR device: 0x887a0006
INFO - [GFX1]: Device reset due to WR device: 0x887a0006
INFO - REFTEST TEST-LOAD | file:///C:/tasks/task_1624770307/build/tests/reftest/tests/layout/reftests/bugs/76331-1-ref.html | 86 / 2048 (4%)
INFO - [GFX1-]: Failed to make render context current during destroying.
INFO - [GFX1-]: Failed to make render context current during destroying.
INFO - [GFX1-]: Failed to make render context current during destroying.
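As a reading aid for the hex values in the excerpt above, here is a minimal sketch that maps them to DXGI error names. It assumes the values are DXGI HRESULTs (the first is labeled `HRESULT` in the log; the `Device reset due to WR device` value is assumed to be one as well). The helper name `decode_hresults` is hypothetical and not part of any Mozilla tooling.

```python
import re

# DXGI error codes as defined in the Windows SDK headers.
DXGI_ERRORS = {
    0x887A0005: "DXGI_ERROR_DEVICE_REMOVED",
    0x887A0006: "DXGI_ERROR_DEVICE_HUNG",
    0x887A0007: "DXGI_ERROR_DEVICE_RESET",
}

# Matches an 8-digit hex literal such as 0x887a0006 (case-insensitive digits).
HRESULT_RE = re.compile(r"\b0x[0-9a-fA-F]{8}\b")


def decode_hresults(line):
    """Return (code, name) pairs for each 8-digit hex value in a log line.

    Assumption: every such value is an HRESULT; codes not in the table
    above are reported as "unknown".
    """
    results = []
    for match in HRESULT_RE.finditer(line):
        code = int(match.group(0), 16)
        results.append((code, DXGI_ERRORS.get(code, "unknown")))
    return results


if __name__ == "__main__":
    for line in (
        "INFO - [GFX1-]: Internal D3D11 error: HRESULT: 0x887A0005: Error allocating Texture2D",
        "INFO - [GFX1]: Device reset due to WR device: 0x887a0006",
    ):
        for code, name in decode_hresults(line):
            print(f"{code:#010x} {name}")
```

Under that assumption, the texture allocation failure corresponds to a removed device and the subsequent WR resets to a hung device, which would be consistent with a driver or GPU problem on the test machine rather than a layout regression.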

And in the most recent failure, there's a similar (but more concise) bit of graphics logging just before the test failure:

INFO - [GFX1-]: Readback took too long: 1273 ms
INFO - [GFX1]: Device reset due to WR device: 0x887a0006
INFO - [GFX1-]: GFX: RenderThread detected a device reset in PostUpdate
INFO - [GFX1]: Device reset due to WR device: 0x887a0006

The "GFX: RenderThread detected a device reset in PostUpdate" and "Device reset due to WR device" lines are present in both failures, so they seem to hint at a root cause here.
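The comparison above (which graphics messages show up in both failures) can be sketched as a small script. This is only an illustration: the function names are hypothetical, the `[GFX1]`/`[GFX1-]` prefix pattern is taken from the excerpts quoted in this comment, and numbers are normalized to `#` so that differing codes or timings do not hide a match.

```python
import re

# Captures the message text after a "[GFX1]:" or "[GFX1-]:" prefix.
GFX_RE = re.compile(r"\[GFX1-?\]:\s*(.*)")
# Hex literals and standalone decimal numbers (leaves "D3D11", "Texture2D" alone).
NUMBER_RE = re.compile(r"0x[0-9a-fA-F]+|\b\d+\b")


def gfx_messages(log_text):
    """Return the set of normalized GFX messages found in a log excerpt."""
    messages = set()
    for line in log_text.splitlines():
        match = GFX_RE.search(line)
        if match:
            messages.add(NUMBER_RE.sub("#", match.group(1)).strip())
    return messages


def common_gfx_messages(*logs):
    """Return the GFX messages that appear in every given log excerpt."""
    sets = [gfx_messages(log) for log in logs]
    return set.intersection(*sets) if sets else set()


# Excerpts copied from the two failure logs quoted in this comment.
FIRST_FAILURE = """\
INFO - [GFX1-]: EGL image is not valid.
INFO - [GFX1-]: Internal D3D11 error: HRESULT: 0x887A0005: Error allocating Texture2D
INFO - [GFX1-]: Context has been lost.
INFO - [GFX1]: Device reset due to WR device: 0x887a0006
INFO - [GFX1-]: GFX: RenderThread detected a device reset in PostUpdate
INFO - [GFX1-]: Failed to make render context current during destroying.
"""

LATEST_FAILURE = """\
INFO - [GFX1-]: Readback took too long: 1273 ms
INFO - [GFX1]: Device reset due to WR device: 0x887a0006
INFO - [GFX1-]: GFX: RenderThread detected a device reset in PostUpdate
INFO - [GFX1]: Device reset due to WR device: 0x887a0006
"""

if __name__ == "__main__":
    for message in sorted(common_gfx_messages(FIRST_FAILURE, LATEST_FAILURE)):
        print(message)
```

Run on the two excerpts, the intersection contains only the two device-reset messages; the same function could be pointed at full logs from other intermittents to check whether they share this signature.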

--> Reclassifying this as WebRender, and adding those to the bug summary. (Though this may be WORKSFORME if we continue to not-see-new-reports.)

Summary: Intermittent Windows 10 AArch64 qr opt [tier 2] bugs/76331-1.html == bugs/76331-1-ref.html | image comparison, max difference: 255, number of differing pixels: 7043 → Intermittent Windows 10 AArch64 qr opt [tier 2] bugs/76331-1.html == bugs/76331-1-ref.html | image comparison, max difference: 255, number of differing pixels: 7043 ("RenderThread detected a device reset in PostUpdate" "Device reset due to WR device")
Component: Layout → Graphics: WebRender
Status: NEW → RESOLVED
Closed: 3 years ago
Flags: needinfo?(aosmond)
Resolution: --- → WORKSFORME