Closed Bug 1876389 Opened 5 months ago Closed

Update `wgpu` to `trunk` upstream (week of 2024-01-22)

Categories

(Core :: Graphics: WebGPU, task, P1)

RESOLVED FIXED
124 Branch
Tracking Status
firefox124 --- fixed

People

(Reporter: ErichDonGubler, Assigned: jimb)

References

(Blocks 1 open bug)

Attachments

(1 file, 5 obsolete files)

No description provided.

I'll take care of this on Monday (1-29).

Those patches aren't sufficient yet. Remaining errors:

error[E0405]: cannot find trait `IdentityHandlerFactory` in module `wgc::identity`
  --> gfx/wgpu_bindings/src/identity.rs:32:42
   |
32 | impl<I: wgc::id::TypedId> wgc::identity::IdentityHandlerFactory<I> for IdentityRecyclerFactory {
   |                                          ^^^^^^^^^^^^^^^^^^^^^^ not found in `wgc::identity`
error[E0405]: cannot find trait `TypedId` in module `wgc::id`
  --> gfx/wgpu_bindings/src/identity.rs:32:18
   |
32 | impl<I: wgc::id::TypedId> wgc::identity::IdentityHandlerFactory<I> for IdentityRecyclerFactory {
   |                  ^^^^^^^ not found in `wgc::id`
error[E0405]: cannot find trait `GlobalIdentityHandlerFactory` in module `wgc::identity`
  --> gfx/wgpu_bindings/src/identity.rs:42:21
   |
42 | impl wgc::identity::GlobalIdentityHandlerFactory for IdentityRecyclerFactory {}
   |                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^ not found in `wgc::identity`
error[E0107]: struct takes 0 generic arguments but 1 generic argument was supplied
  --> gfx/wgpu_bindings/src/server.rs:78:26
   |
78 |     global: wgc::global::Global<IdentityRecyclerFactory>,
   |                          ^^^^^^------------------------- help: remove these generics
   |                          |
   |                          expected 0 generic arguments
   |
note: struct defined here, with 0 generic parameters
  --> /home/jimb/moz/central/third_party/rust/wgpu-core/src/global.rs:46:12
   |
46 | pub struct Global {
   |            ^^^^^^
Some errors have detailed explanations: E0107, E0405.
For more information about an error, try `rustc --explain E0107`.
error: could not compile `wgpu_bindings` (lib) due to 4 previous errors
Attachment #9377576 - Attachment is obsolete: true
Attachment #9377628 - Attachment is obsolete: true
Attachment #9377574 - Attachment is obsolete: true
Attachment #9377573 - Attachment is obsolete: true
Pushed by jblandy@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/9dd0cf97b35d
Update `wgpu` to revision 87b6513df32e8a9c588962ba8509019c277438e2. r=webgpu-reviewers,supply-chain-reviewers,nical
Pushed by nfay@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/6181ee1cdb28
Fix file-whitespace lint failure in gfx/wgpu_bindings/src/identity.rs r=fix CLOSED TREE
Pushed by jblandy@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/4fd09aad97af
Update `wgpu` to revision 87b6513df32e8a9c588962ba8509019c277438e2. r=webgpu-reviewers,supply-chain-reviewers,nical
Blocks: 1877046
Regressions: 1878390

Backed out for causing webgpu failures on test_command_buffer_creation.html.

[task 2024-02-02T20:22:23.669Z] 20:22:23     INFO - TEST-START | dom/webgpu/mochitest/test_command_buffer_creation.html
[task 2024-02-02T20:23:52.263Z] 20:23:52     INFO - GECKO(8372) | [Parent 10016, Jump List #2] WARNING: NS_ENSURE_SUCCESS(rv, rv) failed with result 0x80520012 (NS_ERROR_FILE_NOT_FOUND): file /builds/worker/checkouts/gecko/widget/windows/WinUtils.cpp:950
[task 2024-02-02T20:23:52.265Z] 20:23:52     INFO - GECKO(8372) | [Parent 10016, Main Thread] WARNING: NS_ENSURE_SUCCESS(rv, rv) failed with result 0x80040111 (NS_ERROR_NOT_AVAILABLE): file /builds/worker/checkouts/gecko/widget/windows/JumpListBuilder.cpp:219
[task 2024-02-02T20:23:52.266Z] 20:23:52     INFO - GECKO(8372) | [Parent 10016, Main Thread] WARNING: NS_ENSURE_SUCCESS(rv, rv) failed with result 0x80040111 (NS_ERROR_NOT_AVAILABLE): file /builds/worker/checkouts/gecko/widget/windows/JumpListBuilder.cpp:219
[task 2024-02-02T20:23:52.267Z] 20:23:52     INFO - GECKO(8372) | [Parent 10016, Jump List #2] WARNING: NS_ENSURE_SUCCESS(rv, rv) failed with result 0x80520012 (NS_ERROR_FILE_NOT_FOUND): file /builds/worker/checkouts/gecko/widget/windows/WinUtils.cpp:950
[task 2024-02-02T20:25:52.248Z] 20:25:52     INFO - GECKO(8372) | [Parent 10016, Main Thread] WARNING: NS_ENSURE_SUCCESS(rv, rv) failed with result 0x80040111 (NS_ERROR_NOT_AVAILABLE): file /builds/worker/checkouts/gecko/widget/windows/JumpListBuilder.cpp:219
[task 2024-02-02T20:25:52.249Z] 20:25:52     INFO - GECKO(8372) | [Parent 10016, Main Thread] WARNING: NS_ENSURE_SUCCESS(rv, rv) failed with result 0x80040111 (NS_ERROR_NOT_AVAILABLE): file /builds/worker/checkouts/gecko/widget/windows/JumpListBuilder.cpp:219
[task 2024-02-02T20:25:52.250Z] 20:25:52     INFO - GECKO(8372) | [Parent 10016, Jump List #3] WARNING: NS_ENSURE_SUCCESS(rv, rv) failed with result 0x80520012 (NS_ERROR_FILE_NOT_FOUND): file /builds/worker/checkouts/gecko/widget/windows/WinUtils.cpp:950
[task 2024-02-02T20:27:16.700Z] 20:27:16     INFO -  [Parent 8760, Main Thread] WARNING: NS_ENSURE_SUCCESS(rv, rv) failed with result 0x80004005 (NS_ERROR_FAILURE): file /builds/worker/checkouts/gecko/toolkit/components/places/Database.cpp:537
[task 2024-02-02T20:27:16.728Z] 20:27:16     INFO -  console.error: (new TypeError("connection not specified or invalid.", "resource://gre/modules/Sqlite.sys.mjs", 1548))
[task 2024-02-02T20:27:16.729Z] 20:27:16     INFO -  console.error: (new TypeError("can't access property \"executeBeforeShutdown\", db is undefined", "resource://gre/modules/PlacesUtils.sys.mjs", 1466))
[task 2024-02-02T20:27:16.731Z] 20:27:16     INFO -  [Parent 8760, IPDL Background] WARNING: QM_TRY failure (ERROR): 'OkIf(gBasePath)', file dom/quota/ActorsParent.cpp:1703
[task 2024-02-02T20:27:16.732Z] 20:27:16     INFO -  [Parent 8760, IPDL Background] WARNING: Trying to create QuotaManager before profile-do-change! Forgot to call do_get_profile()?: file /builds/worker/checkouts/gecko/dom/quota/ActorsParent.cpp:1702
[task 2024-02-02T20:27:16.732Z] 20:27:16     INFO -  [Parent 8760, IPDL Background] WARNING: QM_TRY failure (ERROR): 'GetOrCreate().map([](const auto& res) { return Ok{}; }) failed with resultCode 0x80004005, resultName NS_ERROR_FAILURE', file dom/quota/ActorsParent.cpp:1726
[task 2024-02-02T20:27:16.732Z] 20:27:16     INFO -  [Parent 8760, IPDL Background] WARNING: QM_TRY failure (ERROR): 'QuotaManager::EnsureCreated() failed with resultCode 0x80004005, resultName NS_ERROR_FAILURE', file dom/quota/QuotaParent.cpp:715
[task 2024-02-02T20:27:24.949Z] 20:27:24     INFO - TEST-INFO | started process screenshot
[task 2024-02-02T20:27:25.211Z] 20:27:25     INFO - TEST-INFO | screenshot: exit 0
[task 2024-02-02T20:27:25.211Z] 20:27:25     INFO - Buffered messages logged at 20:22:00
[task 2024-02-02T20:27:25.212Z] 20:27:25     INFO - TEST-PASS | dom/webgpu/mochitest/test_command_buffer_creation.html | Pref should be enabled. 
[task 2024-02-02T20:27:25.212Z] 20:27:25     INFO - Buffered messages finished
[task 2024-02-02T20:27:25.213Z] 20:27:25     INFO - TEST-UNEXPECTED-FAIL | dom/webgpu/mochitest/test_command_buffer_creation.html | Test timed out. - 
[task 2024-02-02T20:27:25.963Z] 20:27:25     INFO - GECKO(8372) | MEMORY STAT | vsize 600MB | vsizeMaxContiguous 1806MB | residentFast 67MB | heapAllocated 5MB
[task 2024-02-02T20:27:26.051Z] 20:27:26     INFO - TEST-OK | dom/webgpu/mochitest/test_command_buffer_creation.html | took 325481ms
[task 2024-02-02T20:27:26.135Z] 20:27:26     INFO - TEST-START | dom/webgpu/mochitest/test_context_configure.html
Flags: needinfo?(jimb)

I can reproduce this locally. Interesting logging output:

 0:13.79 TEST_START: dom/webgpu/mochitest/test_basic_canvas.worker.html
 0:15.28 GECKO(412650) MEMORY STAT vsizeMaxContiguous not supported in this build configuration.
 0:15.28 GECKO(412650) MEMORY STAT | vsize 2649MB | residentFast 150MB | heapAllocated 9MB
 0:15.34 GECKO(412650) [Parent 412650, CanvasRenderer] WARNING: IPC message 'PWebGPU::Msg_DeviceLost' discarded: actor cannot send: file /home/jimb/moz/central/ipc/glue/ProtocolUtils.cpp:545
 0:15.34 GECKO(412650) [Parent 412650, CanvasRenderer] ###!!! ASSERTION: SendDeviceLost failed: 'Error', file /home/jimb/moz/central/dom/webgpu/ipc/WebGPUParent.cpp:218
Initializing stack-fixing for the first stack frame, this may take a while...
 0:31.80 GECKO(412650) #01: NS_DebugBreak (/home/jimb/moz/central/xpcom/base/nsDebugImpl.cpp:0)
 0:31.81 GECKO(412650) #02: mozilla::webgpu::WebGPUParent::DeviceLostCallback(unsigned char*, unsigned char, char const*) (/home/jimb/moz/central/dom/webgpu/ipc/WebGPUParent.cpp:340)
 0:31.81 GECKO(412650) #03: wgpu_core::device::DeviceLostClosure::call (src/device/mod.rs:309)
 0:31.81 GECKO(412650) #04: wgpu_core::device::resource::Device<A>::prepare_to_die (src/device/resource.rs:0)
 0:31.81 GECKO(412650) #05: wgpu_core::hub::Hub<A>::clear (src/hub.rs:223)
 0:31.82 GECKO(412650) #06: <wgpu_core::global::Global as core::ops::drop::Drop>::drop (src/global.rs:0)
 0:31.82 GECKO(412650) #07: core::ptr::drop_in_place<wgpu_bindings::server::Global> (/rustc/82e1608dfa6e0b5569232559e3d385fea5a93112/library/core/src/ptr/mod.rs:498)
 0:31.82 GECKO(412650) #08: wgpu_server_delete (gfx/wgpu_bindings/src/server.rs:136)
 0:31.82 GECKO(412650) #09: mozilla::ipc::IProtocol::DestroySubtree(mozilla::ipc::IProtocol::ActorDestroyReason) (/home/jimb/moz/central/ipc/glue/ProtocolUtils.cpp:627)
 0:31.82 GECKO(412650) #10: mozilla::ipc::IProtocol::DestroySubtree(mozilla::ipc::IProtocol::ActorDestroyReason) (/home/jimb/moz/central/ipc/glue/ProtocolUtils.cpp:0)
 0:31.82 GECKO(412650) #11: mozilla::gfx::PCanvasManagerParent::OnChannelClose() (/home/jimb/moz/central/obj-release-debug/ipc/ipdl/PCanvasManagerParent.cpp:607)
 0:31.82 GECKO(412650) #12: mozilla::ipc::MessageChannel::OnNotifyMaybeChannelError() (/home/jimb/moz/central/ipc/glue/MessageChannel.cpp:2090)
 0:31.82 GECKO(412650) #13: mozilla::detail::RunnableMethodImpl<mozilla::ipc::MessageChannel*, void (mozilla::ipc::MessageChannel::*)(), false, (mozilla::RunnableKind)1, >::Run() (/home/jimb/moz/central/xpcom/threads/nsThreadUtils.h:1213)
 0:31.82 GECKO(412650) #14: nsThread::ProcessNextEvent(bool, bool*) (/home/jimb/moz/central/xpcom/threads/nsThread.cpp:1194)
 0:31.82 GECKO(412650) #15: NS_ProcessNextEvent(nsIThread*, bool) (/home/jimb/moz/central/xpcom/threads/nsThreadUtils.cpp:480)
 0:31.82 GECKO(412650) #16: mozilla::ipc::MessagePumpForNonMainThreads::Run(base::MessagePump::Delegate*) (/home/jimb/moz/central/ipc/glue/MessagePump.cpp:0)
 0:31.82 GECKO(412650) #17: MessageLoop::Run() (/home/jimb/moz/central/ipc/chromium/src/base/message_loop.cc:346)
 0:31.82 GECKO(412650) #18: nsThread::ThreadFunc(void*) (/home/jimb/moz/central/xpcom/threads/nsThread.cpp:372)
 0:31.84 GECKO(412650) #19: _pt_root (/home/jimb/moz/central/nsprpub/pr/src/pthreads/ptthread.c:204)
 0:31.91 GECKO(412650) #20: set_alt_signal_stack_and_start(PthreadCreateParams*) (/home/jimb/moz/central/mozglue/interposers/pthread_create_interposer.cpp:81)
 0:31.91 GECKO(412650) #21: ??? (/lib64/libc.so.6 + 0x8e897)
 0:31.91 GECKO(412650) #22: ??? (/lib64/libc.so.6 + 0x1156fc)
 0:31.91 GECKO(412650) #23: ??? (???:???)
 0:15.35 TEST_END: Test OK. Subtests passed 3/3. Unexpected 0
Flags: needinfo?(jimb)

So, what I believe is going on here is:

  • The channel between the content process and the GPU process gets shut down for some reason. This may or may not be normal behavior; I don't understand this part yet.
  • As shown in the stack trace,
    • This causes the PWebGPUParent for that content process to close.
    • That drops the wgpu_core::Global.
    • That frees the Global's Devices.
    • That causes the Devices' DeviceLostCallback to be called.
    • That causes the GPU process to try to send a DeviceLost message to the content process.
      But of course, the channel to the content process is gone; that's why we're closing the Device in the first place.

Once the channel to the content process is dropped, we clearly shouldn't try to send it any more messages. I think WebGPUParent probably needs a flag that says we are in the midst of cleaning up for a closing channel, so it shouldn't bother sending any messages back. Maybe IProtocol or MessageChannel already has a flag for this.
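
For illustration, here is a minimal sketch of the teardown sequence and the kind of guard suggested above. It is written in Rust only to keep it self-contained; the real send path lives in C++ in WebGPUParent. Every name in it (`Channel`, `closing`, `send_device_lost`, `FakeDevice`, `FakeGlobal`, `make_global`) is a hypothetical stand-in, not an actual Gecko, IPDL, or `wgpu_core` API.

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// Hypothetical stand-in for the parent-side end of the IPC channel.
#[derive(Default)]
struct Channel {
    /// Set once teardown for a closing channel has begun.
    closing: bool,
    /// Messages actually sent, recorded so the behavior is observable.
    sent: Vec<String>,
    /// Count of sends skipped because the channel was closing.
    suppressed: usize,
}

impl Channel {
    fn send_device_lost(&mut self, reason: &str) {
        if self.closing {
            // Nobody is listening any more; skip the send instead of asserting.
            self.suppressed += 1;
            return;
        }
        self.sent.push(reason.to_string());
    }
}

/// Hypothetical device that carries a one-shot "device lost" callback.
struct FakeDevice {
    on_lost: Option<Box<dyn FnOnce(&str)>>,
}

/// Hypothetical owner of all devices, standing in for the real global state.
struct FakeGlobal {
    devices: Vec<FakeDevice>,
}

impl Drop for FakeGlobal {
    fn drop(&mut self) {
        // Dropping the global tears down each device, which fires its callback.
        for device in self.devices.drain(..) {
            if let Some(callback) = device.on_lost {
                callback("global dropped");
            }
        }
    }
}

/// Builds a global whose devices all report loss through `channel`.
fn make_global(channel: &Rc<RefCell<Channel>>, device_count: usize) -> FakeGlobal {
    let devices: Vec<FakeDevice> = (0..device_count)
        .map(|_| {
            let channel = Rc::clone(channel);
            let on_lost: Box<dyn FnOnce(&str)> = Box::new(move |reason: &str| {
                channel.borrow_mut().send_device_lost(reason);
            });
            FakeDevice { on_lost: Some(on_lost) }
        })
        .collect();
    FakeGlobal { devices }
}
```

If `closing` is set before the global is dropped, the callbacks still run but nothing is sent. If it is not set, each device produces one message. The sketch only models skipping the send; it says nothing about why the channel closes.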

But it's also unclear to me why the content process is shutting down. That may be a separate bug.

Preventing the attempt to send the DeviceLost message on the closed channel does prevent the crash, but the hang still happens.

So we have more to debug here.

Blocks: 1843891
Attachment #9377740 - Attachment description: Bug 1876389 - Update `wgpu` to revision 87b6513df32e8a9c588962ba8509019c277438e2. r=#webgpu-reviewers,#supply-chain-reviewers → Bug 1876389: Update `wgpu` to revision 32e70bc1635905c508d408eb1cf22b2aa062ffe1. r=#webgpu-reviewers,#supply-chain-reviewers
Attachment #9378339 - Attachment is obsolete: true
Pushed by jblandy@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/b7ae0fc09c77
Update `wgpu` to revision 32e70bc1635905c508d408eb1cf22b2aa062ffe1. r=webgpu-reviewers,supply-chain-reviewers,nical
Status: NEW → RESOLVED
Closed: 4 months ago
Resolution: --- → FIXED
Target Milestone: --- → 124 Branch