Closed Bug 1864967 Opened 1 year ago Closed 9 months ago

Crash [@ /lib/x86_64-linux-gnu/libVkLayer_khronos_validation.so+0x1009a33]

Categories

(Core :: Graphics: WebGPU, defect)

x86_64
Linux

RESOLVED FIXED
129 Branch
Tracking Status
firefox-esr115 --- disabled
firefox120 --- disabled
firefox121 --- disabled
firefox122 --- disabled
firefox127 --- disabled
firefox128 --- disabled
firefox129 --- fixed

People

(Reporter: jkratzer, Assigned: jimb)

References

(Blocks 3 open bugs)

Details

(4 keywords, Whiteboard: [bugmon:bisected,confirmed])

Attachments

(1 file)

4.59 KB, application/octet-stream

Testcase found while fuzzing mozilla-central rev d12a09b7c773 (built with: --enable-debug --enable-fuzzing).

Testcase can be reproduced using the following commands:

$ pip install fuzzfetch grizzly-framework
$ python -m fuzzfetch --build d12a09b7c773 --debug --fuzzing -n firefox
$ python -m grizzly.replay ./firefox/firefox testcase.zip
[@ /lib/x86_64-linux-gnu/libVkLayer_khronos_validation.so+0x1009a33]

    ==202740==ERROR: UndefinedBehaviorSanitizer: SEGV on unknown address 0x000000000090 (pc 0x7fb5c708ba33 bp 0x7fb7533f6df0 sp 0x7fb7533f6ca0 T202869)
    ==202740==The signal is caused by a READ memory access.
    ==202740==Hint: address points to the zero page.
        #0 0x7fb5c708ba33  (/lib/x86_64-linux-gnu/libVkLayer_khronos_validation.so+0x1009a33) (BuildId: 4d2fac772f9b637bd3fd789ab892acecd55393e0)
        #1 0x7fb5c6ee3a1d  (/lib/x86_64-linux-gnu/libVkLayer_khronos_validation.so+0xe61a1d) (BuildId: 4d2fac772f9b637bd3fd789ab892acecd55393e0)
        #2 0x7fb7f6f5f061 in wgpu_core::command::compute::_$LT$impl$u20$wgpu_core..global..Global$LT$G$GT$$GT$::command_encoder_run_compute_pass_impl::hd928693a1ff9f0bb /third_party/rust/wgpu-core/src/command/compute.rs:587:25
        #3 0x7fb7f6f9c223 in wgpu_bindings::server::Global::command_encoder_action::h1a3a1fab9070d21c /gfx/wgpu_bindings/src/server.rs:792:35
        #4 0x7fb7f6fa7e82 in wgpu_server_command_encoder_action /gfx/wgpu_bindings/src/server.rs:911:5
        #5 0x7fb7f11d8719 in mozilla::webgpu::WebGPUParent::RecvCommandEncoderAction(unsigned long, unsigned long, mozilla::ipc::ByteBuf const&) /dom/webgpu/ipc/WebGPUParent.cpp:1289:3
        #6 0x7fb7f11e440a in mozilla::webgpu::PWebGPUParent::OnMessageReceived(IPC::Message const&) /builds/worker/workspace/obj-build/ipc/ipdl/PWebGPUParent.cpp:482:80
        #7 0x7fb7ef254760 in mozilla::gfx::PCanvasManagerParent::OnMessageReceived(IPC::Message const&) /builds/worker/workspace/obj-build/ipc/ipdl/PCanvasManagerParent.cpp:279:32
        #8 0x7fb7ee7c252f in mozilla::ipc::MessageChannel::DispatchAsyncMessage(mozilla::ipc::ActorLifecycleProxy*, IPC::Message const&) /ipc/glue/MessageChannel.cpp:1813:25
        #9 0x7fb7ee7bf282 in mozilla::ipc::MessageChannel::DispatchMessage(mozilla::ipc::ActorLifecycleProxy*, mozilla::UniquePtr<IPC::Message, mozilla::DefaultDelete<IPC::Message>>) /ipc/glue/MessageChannel.cpp:1732:9
        #10 0x7fb7ee7bff02 in mozilla::ipc::MessageChannel::RunMessage(mozilla::ipc::ActorLifecycleProxy*, mozilla::ipc::MessageChannel::MessageTask&) /ipc/glue/MessageChannel.cpp:1525:3
        #11 0x7fb7ee7c104f in mozilla::ipc::MessageChannel::MessageTask::Run() /ipc/glue/MessageChannel.cpp:1623:14
        #12 0x7fb7edb074cd in nsThread::ProcessNextEvent(bool, bool*) /xpcom/threads/nsThread.cpp:1192:16
        #13 0x7fb7edb0e45d in NS_ProcessNextEvent(nsIThread*, bool) /xpcom/threads/nsThreadUtils.cpp:480:10
        #14 0x7fb7ee7c96e5 in mozilla::ipc::MessagePumpForNonMainThreads::Run(base::MessagePump::Delegate*) /ipc/glue/MessagePump.cpp:330:5
        #15 0x7fb7ee6e23c1 in RunHandler /ipc/chromium/src/base/message_loop.cc:363:3
        #16 0x7fb7ee6e23c1 in MessageLoop::Run() /ipc/chromium/src/base/message_loop.cc:345:3
        #17 0x7fb7edb027b3 in nsThread::ThreadFunc(void*) /xpcom/threads/nsThread.cpp:370:10
        #18 0x7fb8016a8d0f in _pt_root /nsprpub/pr/src/pthreads/ptthread.c:201:5
        #19 0x7fb801f49ac2 in start_thread nptl/pthread_create.c:442:8
        #20 0x7fb801fdba3f  misc/../sysdeps/unix/sysv/linux/x86_64/clone3.S:81
    
    UndefinedBehaviorSanitizer can not provide additional info.
    SUMMARY: UndefinedBehaviorSanitizer: SEGV (/lib/x86_64-linux-gnu/libVkLayer_khronos_validation.so+0x1009a33) (BuildId: 4d2fac772f9b637bd3fd789ab892acecd55393e0) 
    ==202740==ABORTING
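
For triage, the crash signature in the title is the module and offset from frame #0 of the trace above. The following is a minimal sketch of extracting those fields from an unsymbolized sanitizer frame, assuming the frame layout shown in this report. The names `FRAME_RE`, `parse_frame`, and `load_base` are hypothetical helpers and are not part of grizzly or fuzzfetch.

```python
import re

# Matches unsymbolized sanitizer frames of the form:
#   #0 0x<pc>  (<module path>+0x<offset>) (BuildId: ...)
# Symbolized frames ("#2 0x... in some::function ...") do not match.
FRAME_RE = re.compile(
    r"#(?P<index>\d+)\s+0x(?P<pc>[0-9a-f]+)\s+"
    r"\((?P<module>[^()]+)\+0x(?P<offset>[0-9a-f]+)\)"
)


def parse_frame(line):
    """Return index, pc, module, and offset for an unsymbolized frame, else None."""
    m = FRAME_RE.search(line)
    if m is None:
        return None
    return {
        "index": int(m["index"]),
        "pc": int(m["pc"], 16),
        "module": m["module"],
        "offset": int(m["offset"], 16),
    }


def load_base(frame):
    """Address at which the module was mapped: program counter minus offset."""
    return frame["pc"] - frame["offset"]
```

Applied to frames #0 and #1 of this trace, both yield the same load base, which is consistent with both frames belonging to the same mapping of the validation layer.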
Attached file Testcase
Group: core-security → gfx-core-security

Are we shipping this validation library?

This looks like a crash in the validation library itself (a null pointer dereference), not the validation library reporting a wgpu error. It is possible, though, that it crashed while trying to report a security-relevant error.

Verified bug as reproducible on mozilla-central 20231115214519-6f3be95d6511.
The bug appears to have been introduced in the following build range:

Start: 07aa1efdf65fea586bf4d2a0c6a5a8620fd8cc96 (20230412184105)
End: 26012bc0b7fe3d60462f200f280b610b88a8f375 (20230412205820)
Pushlog: https://hg.mozilla.org/integration/autoland/pushloghtml?fromchange=07aa1efdf65fea586bf4d2a0c6a5a8620fd8cc96&tochange=26012bc0b7fe3d60462f200f280b610b88a8f375
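
The pushlog link above follows a simple pattern: the autoland `pushloghtml` endpoint with `fromchange` and `tochange` set to the Start and End revisions. The following is a small sketch of rebuilding such a link from a bisection range, assuming only the URL shape visible in this report. `PUSHLOG_BASE` and `pushlog_url` are hypothetical names.

```python
from urllib.parse import urlencode

# Endpoint copied from the pushlog link in this report.
PUSHLOG_BASE = "https://hg.mozilla.org/integration/autoland/pushloghtml"


def pushlog_url(start, end):
    """Build a pushlog URL covering the changesets between start and end."""
    query = urlencode({"fromchange": start, "tochange": end})
    return f"{PUSHLOG_BASE}?{query}"


if __name__ == "__main__":
    print(pushlog_url(
        "07aa1efdf65fea586bf4d2a0c6a5a8620fd8cc96",
        "26012bc0b7fe3d60462f200f280b610b88a8f375",
    ))
```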

Keywords: regression
Whiteboard: [bugmon:confirm] → [bugmon:bisected,confirmed]

This bug has been marked as a regression. Setting status flag for Nightly to affected.

Setting affected versions based on the Fx114 pushlog from comment 3.

I believe this is S3.

We do not ship this validation layer directly; rather, it is installed by users or developers.

I feel like this might be something for jimb to look into, or to delegate :)

Severity: -- → S3
Flags: needinfo?(jimb)

stderr prior to crash:

[ERROR wgpu_hal::vulkan::instance] VALIDATION [VUID-vkCmdBindPipeline-commonparent (0xa6c0eb9e)]
    	Validation Error: [ VUID-vkCmdBindPipeline-commonparent ] Object 0: handle = 0x7f434c009470, type = VK_OBJECT_TYPE_INSTANCE; | MessageID = 0xa6c0eb9e | Object 0x7e000000007e of type VkPipeline was not created, allocated or retrieved from the correct device. The Vulkan spec states: Both of commandBuffer, and pipeline must have been created, allocated, or retrieved from the same VkDevice (https://www.khronos.org/registry/vulkan/specs/1.3-extensions/html/vkspec.html#VUID-vkCmdBindPipeline-commonparent)
[ERROR wgpu_hal::vulkan::instance] 	objects: (type: INSTANCE, hndl: 0x7f434c009470, name: ?)
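
The VUID quoted above requires that the command buffer and the pipeline share the same parent VkDevice. The following toy model illustrates that rule only. It is not wgpu-core's code, the validation layer's code, or the fix from any patch, and every class and function name in it is hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Device:
    name: str


@dataclass(frozen=True)
class Pipeline:
    device: Device  # the device that created this pipeline


@dataclass(frozen=True)
class CommandBuffer:
    device: Device  # the device that allocated this command buffer


class DeviceMismatch(Exception):
    """Raised when two objects do not share a parent device."""


def bind_pipeline(cmd_buf, pipeline):
    """Reject the bind up front if the parents differ (compared by identity)."""
    if cmd_buf.device is not pipeline.device:
        raise DeviceMismatch(
            f"pipeline belongs to {pipeline.device.name!r}, "
            f"command buffer belongs to {cmd_buf.device.name!r}"
        )
    # A real implementation would record the bind command here.
```

Performing a check of this kind before reaching the driver means a mismatched handle is reported as an ordinary error instead of being passed down to layers that may not tolerate it.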

Testcase crashes using the initial build (mozilla-central 20231115095248-d12a09b7c773) but not with tip (mozilla-central 20240615092745-71d0793fa963).

The bug appears to have been fixed in the following build range:

Start: 60ccda7205ffdc385f41dbafaca8a8eb9dc13aff (20240614034530)
End: 1941da777755f8a365b0b6fc6735e4f1b40e3db7 (20240614065412)
Pushlog: https://hg.mozilla.org/integration/autoland/pushloghtml?fromchange=60ccda7205ffdc385f41dbafaca8a8eb9dc13aff&tochange=1941da777755f8a365b0b6fc6735e4f1b40e3db7

jkratzer, can you confirm that the above bisection range is responsible for fixing this issue?
Removing bugmon keyword as no further action is possible. Please review the bug and re-add the keyword for further analysis.

Flags: needinfo?(jimb) → needinfo?(jkratzer)
Keywords: bugmon

Appears to have been fixed via bug 1901628.

Status: NEW → RESOLVED
Closed: 9 months ago
Flags: needinfo?(jkratzer)
Resolution: --- → FIXED
Assignee: nobody → jimb
Depends on: 1901628
Target Milestone: --- → 129 Branch
Group: gfx-core-security → core-security-release
QA Whiteboard: [post-critsmash-triage]
Flags: qe-verify-
Group: core-security-release