Closed Bug 1829415 Opened 2 years ago Closed 2 years ago

Crash in [@ mozilla::webgpu::Queue::WriteBuffer]

Categories

(Core :: Graphics: WebGPU, defect)

Unspecified
Windows

Tracking

RESOLVED DUPLICATE of bug 1829305
Tracking Status
firefox-esr102 --- unaffected
firefox112 --- unaffected
firefox113 --- unaffected
firefox114 + fixed

People

(Reporter: RyanVM, Assigned: ErichDonGubler)

References

(Regression)

Details

(Keywords: crash, regression)

Crash Data

Crash report: https://crash-stats.mozilla.org/report/index/3c19e129-177e-42d3-bc65-a02c50230421

MOZ_CRASH Reason: MOZ_CRASH(IPC failure)

Top 4 frames of crashing thread:

0  xul.dll  mozilla::webgpu::Queue::WriteBuffer  dom/webgpu/Queue.cpp:162
1  xul.dll  mozilla::dom::GPUQueue_Binding::writeBuffer  dom/bindings/WebGPUBinding.cpp:21840
2  xul.dll  mozilla::dom::binding_detail::GenericMethod<mozilla::dom::binding_detail::NormalThisPolicy, mozilla::dom::binding_detail::ThrowExceptions>  dom/bindings/BindingUtils.cpp:3335
3  ?  @0x000001828384d43d  

Jim, could we set priority and severity? Thanks

Flags: needinfo?(jimb)

The bug is marked as tracked for firefox114 (nightly). We have limited time to fix this: the soft freeze is in 8 days. However, the bug still isn't assigned.

:bhood, could you please find an assignee for this tracked bug? Given that it is a regression and we know the cause, we could also simply back out the regressor. If you disagree with the tracking decision, please talk with the release managers.

For more information, please visit auto_nag documentation.

Flags: needinfo?(bhood)

It seems like we're MOZ_CRASH-ing due to a mBridge->SendQueueWriteAction message failing to send. This could happen at any point due to the actor dying, so it may be worthwhile to not actually fail fatally in that situation.

The only reason this would fail is if !mBridge->CanSend(), so if you want to keep the assertion around, you could also check that first to make sure that the bridge is still alive and hasn't been shut down. That seems risky in this situation, though, as the __delete__ message is marked as going the opposite direction from the SendQueueWriteAction: https://searchfox.org/mozilla-central/rev/8329a650e3b4f866176ae54016702eb35fb8b0d6/dom/webgpu/ipc/PWebGPU.ipdl#66,89.
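
To make the suggestion concrete, here is a minimal, self-contained sketch of the "check CanSend() first, don't crash" pattern. It is not the actual Gecko code: MockBridge, WriteResult, and the free function WriteBuffer are hypothetical stand-ins invented for illustration, and only the names CanSend and SendQueueWriteAction are borrowed from the discussion above. The sketch is single-threaded, so it does not model the race with __delete__ mentioned above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the IPC actor; NOT the real Gecko bridge class.
class MockBridge {
 public:
  bool CanSend() const { return mAlive; }

  // Assumption for this sketch: a send fails only once the actor is dead.
  bool SendQueueWriteAction(const std::vector<std::uint8_t>& aData) {
    if (!mAlive) {
      return false;
    }
    mBytesSent += aData.size();
    return true;
  }

  // Simulates the actor being torn down.
  void Shutdown() { mAlive = false; }

  std::size_t BytesSent() const { return mBytesSent; }

 private:
  bool mAlive = true;
  std::size_t mBytesSent = 0;
};

// Hypothetical result type: report a closed bridge instead of crashing.
enum class WriteResult { Sent, BridgeClosed };

WriteResult WriteBuffer(MockBridge& aBridge,
                        const std::vector<std::uint8_t>& aData) {
  // A dead actor is an expected condition, so bail out gracefully.
  if (!aBridge.CanSend()) {
    return WriteResult::BridgeClosed;
  }
  const bool sent = aBridge.SendQueueWriteAction(aData);
  // The assertion is kept, but it is only reached while the bridge is alive.
  assert(sent && "send should only fail when CanSend() is false");
  (void)sent;  // Avoids an unused-variable warning when NDEBUG is defined.
  return WriteResult::Sent;
}
```

With this shape, a caller that writes after the bridge has shut down gets WriteResult::BridgeClosed back and can drop the write or surface an error, rather than taking down the process.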

Assignee: nobody → egubler
Status: NEW → ASSIGNED
Duplicate of this bug: 1830175

This URL consistently crashes on Win11 for me on several machines: https://webgpu.github.io/webgpu-samples/samples/twoCubes
Bisection points to: https://phabricator.services.mozilla.com/D175262 which is already marked as the regressor.

:nika: Your link returns "Unable to find blame for revision" for me, but I'm able to use this one to the same effect, I think: https://searchfox.org/mozilla-central/rev/0ffaecaa075887ab07bf4c607c61ea2faa81b172/dom/webgpu/ipc/PWebGPU.ipdl

:tcampbell (CC :jimb, :bhood): This bug is likely the same root cause as bug 1829305, which has a fix that has just barely started being consumed in Nightly. I no longer get crashes locally. Can you reproduce on latest Nightly still?

Flags: needinfo?(nical.bugzilla)
Flags: needinfo?(jimb)
Flags: needinfo?(bhood)

:ErichDonGubler I confirmed the latest Nightly won't crash for this.

:vitalyankh: Ah, okay. This particular bug must be the other half of the IPC boundary in bug 1829305. It's tempting to mark this bug as a duplicate. I think I'll do that, after I file a separate bug to ensure that we deal with the valid architectural concern that :nika brings up in comment 3.

Latest Nightly works for me as well. Bisecting for fix shows https://phabricator.services.mozilla.com/D176207 as you expected. Thanks! None of those samples crash for me anymore, though some still throw errors.

:tcampbell: That's expected, and we're working on it! We're tracking that with webgpu-v1-samples, in case you're curious! 🙂

Status: ASSIGNED → RESOLVED
Closed: 2 years ago
Duplicate of bug: 1829305
Resolution: --- → DUPLICATE