Closed Bug 1718329 Opened 3 months ago Closed 2 months ago

Crash in [@ mozilla::wr::RenderCompositorD3D11SWGL::TileD3D11::Map]

Categories

(Core :: Graphics: WebRender, defect)

Unspecified
Windows 10
defect

Tracking

()

RESOLVED FIXED
92 Branch
Tracking Status
firefox-esr78 --- unaffected
firefox-esr91 91+ fixed
firefox89 --- unaffected
firefox90 --- unaffected
firefox91 + fixed
firefox92 + fixed

People

(Reporter: mccr8, Assigned: aosmond)

References

(Blocks 1 open bug, Regression)

Details

(Keywords: crash, regression, topcrash, Whiteboard: [not-a-fission-bug])

Crash Data

Attachments

(1 file)

Maybe Fission related. (DOMFissionEnabled=1)

Crash report: https://crash-stats.mozilla.org/report/index/529fd197-c4f9-40d6-9647-4ee140210625

Reason: EXCEPTION_ACCESS_VIOLATION_WRITE

Top 10 frames of crashing thread:

0 xul.dll mozilla::wr::RenderCompositorD3D11SWGL::TileD3D11::Map gfx/webrender_bindings/RenderCompositorD3D11SWGL.cpp:314
1 xul.dll mozilla::wr::RenderCompositorLayersSWGL::MapTile gfx/webrender_bindings/RenderCompositorLayersSWGL.cpp:194
2 xul.dll mozilla::wr::wr_compositor_map_tile gfx/webrender_bindings/RenderCompositor.cpp:139
3 xul.dll webrender_bindings::bindings::{{impl}}::map_tile gfx/webrender_bindings/src/bindings.rs:1421
4 xul.dll webrender::compositor::sw_compositor::{{impl}}::bind gfx/wr/webrender/src/compositor/sw_compositor.rs:1261
5 xul.dll webrender::renderer::Renderer::draw_frame gfx/wr/webrender/src/renderer/mod.rs:4509
6 xul.dll webrender::renderer::Renderer::render_impl gfx/wr/webrender/src/renderer/mod.rs:1955
7 xul.dll webrender::renderer::Renderer::render gfx/wr/webrender/src/renderer/mod.rs:1701
8 xul.dll webrender_bindings::bindings::wr_renderer_render gfx/webrender_bindings/src/bindings.rs:636
9 xul.dll mozilla::wr::RenderThread::UpdateAndRender gfx/webrender_bindings/RenderThread.cpp:485

Null derefs.

It first showed up in the 20210623095324 build. The crashing line changed in bug 1717737, so this may be a regression from that bug, or just a signature change.

Severity: -- → S2

Some of this is signature shift from bug 1717519, and some of it is legitimate problems detected by bug 1717966.

Bug 1718334 will help us figure out what's going on here.

Depends on: 1718334

This doesn't look like a Fission crash even though the crash report in comment 0 has DOMFissionEnabled=1. Adding [not-a-fission-bug] whiteboard tag to hide this bug from Fission bug triage queries.

Whiteboard: [not-a-fission-bug]

The volume on both nightly and beta is worrying me. Jim, can we get an assignee on this one? Thanks.

Flags: needinfo?(jmathies)
Blocks: gfx-triage
Flags: needinfo?(jmathies)
Flags: needinfo?(jmuizelaar)
Depends on: 1721903

It looks like a lot of these crashes are happening after a device reset. I expect our handling of that situation could be better.

The two crashes that have come in so far report a DEVICE_LOST error after Map().

I am setting it as a blocker for 91 and 92 given the volume of crashes. This bug needs an owner.

Severity: S2 → S1
Flags: needinfo?(jmathies)
Priority: -- → P1
Keywords: topcrash

Jeff, should we backout bug 1717737?

No, we're not crashing there. We're hitting the RELEASE_ASSERT. FWIW, the change in 91 (bug 1718334) won't have changed overall crash volume, it just unified all of the subsequent crashes under a single signature.

Flags: needinfo?(jmuizelaar)
Regressed by: 1718334

OK, this is not blocking then.

Crash Signature: [@ mozilla::wr::RenderCompositorD3D11SWGL::TileD3D11::Map] → [@ mozilla::wr::RenderCompositorD3D11SWGL::TileD3D11::Map] [@ OOM | unknown | mozilla::wr::RenderCompositorD3D11SWGL::TileD3D11::Map]
Flags: needinfo?(jmuizelaar)
Assignee: nobody → aosmond
Severity: S1 → S3
Flags: needinfo?(jmuizelaar)
Flags: needinfo?(jmathies)
Flags: needinfo?(aosmond)
Priority: P3 → --

We discussed this in triage. The crashes hit a device removed error during compositing, so we need to better handle device resets on this code path. I've improved things on this front, so I'll take a look.

I think all we need to do here is just return false in MapTile (and other functions as the case may be) and let the existing device reset plumbing handle it when the render call returns:

https://searchfox.org/mozilla-central/rev/9c91451cc2392d942a42493fc895f5aeeddde45d/gfx/webrender_bindings/RenderThread.cpp#497

If we encounter a device reset in
RenderCompositorD3D11SWGL::TileD3D11::Map, we should fail the call, and
rely upon the device reset checks at the end of a render pass to
recreate our compositor sessions.
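
The approach described above can be illustrated with a minimal, self-contained sketch. It is not the landed patch: FakeDevice, Tile, RenderFrame, MapResult and RenderOutcome are hypothetical stand-ins invented for this example, and no real D3D11 calls are made. It only shows the control flow: the map call reports failure instead of asserting, and a check at the end of the render pass notices the device reset and recovers.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical result codes standing in for the HRESULT of a real Map() call.
enum class MapResult { Ok, DeviceRemoved };

// Mock device (an assumption for this sketch, not a D3D11 type).
struct FakeDevice {
  bool removed = false;
  std::vector<uint8_t> staging = std::vector<uint8_t>(64 * 64 * 4);

  MapResult Map(void** aData, int32_t* aStride) {
    if (removed) {
      // After a device reset the mapping fails and no pointer is handed out.
      *aData = nullptr;
      *aStride = 0;
      return MapResult::DeviceRemoved;
    }
    *aData = staging.data();
    *aStride = 64 * 4;
    return MapResult::Ok;
  }
};

// Hypothetical tile wrapper: fail the call rather than asserting or
// letting the caller write through a null pointer.
struct Tile {
  FakeDevice* device;

  bool Map(void** aData, int32_t* aStride) {
    if (device->Map(aData, aStride) != MapResult::Ok) {
      return false;
    }
    return *aData != nullptr;
  }
};

enum class RenderOutcome { Rendered, DeviceResetHandled, Skipped };

// Hypothetical render pass: skip the tile when mapping fails, then rely on
// the device reset check at the end of the pass to recover.
RenderOutcome RenderFrame(Tile& aTile) {
  void* data = nullptr;
  int32_t stride = 0;
  if (aTile.Map(&data, &stride)) {
    assert(data != nullptr && stride > 0);
    static_cast<uint8_t*>(data)[0] = 0xFF;  // Draw into the mapped tile.
    return RenderOutcome::Rendered;
  }
  if (aTile.device->removed) {
    // Stand-in for recreating the compositor session after a reset.
    aTile.device->removed = false;
    return RenderOutcome::DeviceResetHandled;
  }
  return RenderOutcome::Skipped;
}
```

In this sketch, a frame rendered while `removed` is set returns DeviceResetHandled without touching the tile memory, and the next frame renders normally again.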

Flags: needinfo?(aosmond)
Pushed by aosmond@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/56bb3ec8839f
Gracefully handle device reset when mapping tiles with SW-WR + D3D11 compositing. r=jrmuizel
Status: NEW → RESOLVED
Closed: 2 months ago
Resolution: --- → FIXED
Target Milestone: --- → 92 Branch

Shall the remaining [@ OOM | unknown | mozilla::wr::RenderCompositorD3D11SWGL::TileD3D11::Map ] crashes get a new bug?

Flags: needinfo?(aosmond)

Yes, they appear to be a write failure after the map (the site of the original crash in question):

https://searchfox.org/mozilla-central/rev/064a1e6a2a6f6aa30be8bf4edea2f8408f779d4d/gfx/webrender_bindings/RenderCompositorD3D11SWGL.cpp#327

So I would say that is a different problem.

Flags: needinfo?(aosmond)
No longer blocks: gfx-triage

Comment on attachment 9233799 [details]
Bug 1718329 - Gracefully handle device reset when mapping tiles with SW-WR + D3D11 compositing.

Beta/Release Uplift Approval Request

  • User impact if declined: Users experiencing device resets with SW-WR + D3D11 may crash
  • Is this code covered by automated tests?: Yes
  • Has the fix been verified in Nightly?: Yes
  • Needs manual test from QE?: No
  • If yes, steps to reproduce:
  • List of other uplifts needed: None
  • Risk to taking this patch: Low
  • Why is the change risky/not risky? (and alternatives if risky): It's been verified in nightly and just properly handles a device reset with the existing plumbing.
  • String changes made/needed:
Attachment #9233799 - Flags: approval-mozilla-release?

Comment on attachment 9233799 [details]
Bug 1718329 - Gracefully handle device reset when mapping tiles with SW-WR + D3D11 compositing.

ESR Uplift Approval Request

  • If this is not a sec:{high,crit} bug, please state case for ESR consideration: High crash rate
  • User impact if declined: Users experiencing device resets with SW-WR + D3D11 may see crashes
  • Fix Landed on Version: 92
  • Risk to taking this patch: Low
  • Why is the change risky/not risky? (and alternatives if risky): It's been verified in nightly and just properly handles a device reset with the existing plumbing.
  • String or UUID changes made by this patch:
Attachment #9233799 - Flags: approval-mozilla-esr91?

Comment on attachment 9233799 [details]
Bug 1718329 - Gracefully handle device reset when mapping tiles with SW-WR + D3D11 compositing.

approved for 91.0.1 (release + esr)

Attachment #9233799 - Flags: approval-mozilla-release?
Attachment #9233799 - Flags: approval-mozilla-release+
Attachment #9233799 - Flags: approval-mozilla-esr91?
Attachment #9233799 - Flags: approval-mozilla-esr91+