Closed Bug 1816731 Opened 1 year ago Closed 5 months ago

Assertion failure: false (MOZ_ASSERT_UNREACHABLE: Invalid width / buffer stride!), at /dom/webgpu/ipc/WebGPUParent.cpp:693

Categories

(Core :: Graphics: WebGPU, defect)

x86_64
Linux
defect

Tracking

()

RESOLVED FIXED
121 Branch
Tracking Status
firefox-esr115 --- disabled
firefox112 --- disabled
firefox120 --- disabled
firefox121 --- fixed

People

(Reporter: jkratzer, Assigned: bradwerth)

References

(Blocks 3 open bugs)

Details

(Keywords: regression, testcase, Whiteboard: [bugmon:bisected,confirmed,origRev=721a7c52f1ab5ca451ec9bb0635752915e231893])

Attachments

(3 files, 2 obsolete files)

Testcase found while fuzzing mozilla-central rev 36b67e826e2d (built with: --enable-debug --enable-fuzzing).

Testcase can be reproduced using the following commands:

$ pip install fuzzfetch grizzly-framework
$ python -m fuzzfetch --build 36b67e826e2d --debug --fuzzing -n firefox
$ python -m grizzly.replay ./firefox/firefox testcase.html
Assertion failure: false (MOZ_ASSERT_UNREACHABLE: Invalid width / buffer stride!), at /dom/webgpu/ipc/WebGPUParent.cpp:693

    ==45331==ERROR: UndefinedBehaviorSanitizer: SEGV on unknown address 0x000000000000 (pc 0x7f99656ba97f bp 0x7f99022a74a0 sp 0x7f99022a7440 T45397)
    ==45331==The signal is caused by a WRITE memory access.
    ==45331==Hint: address points to the zero page.
        #0 0x7f99656ba97f in mozilla::webgpu::WebGPUParent::RecvDeviceCreateSwapChain(unsigned long, unsigned long, mozilla::layers::RGBDescriptor const&, nsTArray<unsigned long> const&, mozilla::layers::RemoteTextureOwnerId const&) /dom/webgpu/ipc/WebGPUParent.cpp:693:5
        #1 0x7f99656d5645 in mozilla::webgpu::PWebGPUParent::OnMessageReceived(IPC::Message const&) /builds/worker/workspace/obj-build/ipc/ipdl/PWebGPUParent.cpp:2088:80
        #2 0x7f99636211e0 in mozilla::gfx::PCanvasManagerParent::OnMessageReceived(IPC::Message const&) /builds/worker/workspace/obj-build/ipc/ipdl/PCanvasManagerParent.cpp:214:32
        #3 0x7f9962bf779a in mozilla::ipc::MessageChannel::DispatchAsyncMessage(mozilla::ipc::ActorLifecycleProxy*, IPC::Message const&) /ipc/glue/MessageChannel.cpp:1800:25
        #4 0x7f9962bf4417 in mozilla::ipc::MessageChannel::DispatchMessage(mozilla::ipc::ActorLifecycleProxy*, mozilla::UniquePtr<IPC::Message, mozilla::DefaultDelete<IPC::Message>>) /ipc/glue/MessageChannel.cpp:1725:9
        #5 0x7f9962bf4f45 in mozilla::ipc::MessageChannel::RunMessage(mozilla::ipc::ActorLifecycleProxy*, mozilla::ipc::MessageChannel::MessageTask&) /ipc/glue/MessageChannel.cpp:1525:3
        #6 0x7f9962bf627f in mozilla::ipc::MessageChannel::MessageTask::Run() /ipc/glue/MessageChannel.cpp:1623:14
        #7 0x7f9961fba232 in nsThread::ProcessNextEvent(bool, bool*) /xpcom/threads/nsThread.cpp:1219:16
        #8 0x7f9961fc058d in NS_ProcessNextEvent(nsIThread*, bool) /xpcom/threads/nsThreadUtils.cpp:477:10
        #9 0x7f9962bfe913 in mozilla::ipc::MessagePumpForNonMainThreads::Run(base::MessagePump::Delegate*) /ipc/glue/MessagePump.cpp:330:5
        #10 0x7f9962b1f618 in MessageLoop::RunInternal() /ipc/chromium/src/base/message_loop.cc:381:10
        #11 0x7f9962b1f521 in RunHandler /ipc/chromium/src/base/message_loop.cc:374:3
        #12 0x7f9962b1f521 in MessageLoop::Run() /ipc/chromium/src/base/message_loop.cc:356:3
        #13 0x7f9961fb5627 in nsThread::ThreadFunc(void*) /xpcom/threads/nsThread.cpp:384:10
        #14 0x7f997510cc86 in _pt_root /nsprpub/pr/src/pthreads/ptthread.c:201:5
        #15 0x7f99759adb42 in start_thread nptl/pthread_create.c:442:8
        #16 0x7f9975a3f9ff  misc/../sysdeps/unix/sysv/linux/x86_64/clone3.S:81
    
    UndefinedBehaviorSanitizer can not provide additional info.
    SUMMARY: UndefinedBehaviorSanitizer: SEGV /dom/webgpu/ipc/WebGPUParent.cpp:693:5 in mozilla::webgpu::WebGPUParent::RecvDeviceCreateSwapChain(unsigned long, unsigned long, mozilla::layers::RGBDescriptor const&, nsTArray<unsigned long> const&, mozilla::layers::RemoteTextureOwnerId const&)
    ==45331==ABORTING
Attached file Testcase (obsolete)

Verified bug as reproducible on mozilla-central 20230214161440-e027953e2470.
The bug appears to have been introduced in the following build range:

Start: 8027f6771a74487960e588a23e3eae848d5eff5c (20220517153704)
End: ecba9892a284f7d1c79c37b8fcc9342c4955e2eb (20220517141829)
Pushlog: https://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=8027f6771a74487960e588a23e3eae848d5eff5c&tochange=ecba9892a284f7d1c79c37b8fcc9342c4955e2eb

Keywords: regression
Whiteboard: [bugmon:confirm] → [bugmon:bisected,confirmed]

This bug has been marked as a regression. Setting status flag for Nightly to affected.

:aosmond, bug 1768337 seems to be the only WebGPU bug in the regression range. Could you take a look?

Flags: needinfo?(aosmond)

Reviewing the code, we only perform bounds checking once we reach the compositor process. That makes sense because we cannot and should not trust the content process, but it also means we don't check at all in the content process. The assertion is safe in that a non-debug build would simply drop the request once the values supplied from JS fail our bounds checks.

A quick glance at:
https://www.w3.org/TR/webgpu/#dom-gpuadapter-requestdevice

suggests we should be evaluating these bounds checks sooner and throwing an OperationError? These checks would be in addition to the compositor process checks.
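
For illustration, the kind of overflow-safe stride check discussed above could look roughly like the following. This is a standalone sketch, not the code in WebGPUParent.cpp: `CheckedRowStride`, `CheckedBufferSize`, and the 256-byte `kRowAlignment` are names and values assumed for the example. The point is that invalid input yields an empty result the caller can drop, rather than tripping an assertion.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <optional>

// Assumption for this sketch: rows are padded to a 256-byte boundary.
constexpr uint32_t kRowAlignment = 256;
static_assert((kRowAlignment & (kRowAlignment - 1)) == 0,
              "alignment must be a power of two");

// Hypothetical helper: returns the aligned row stride in bytes, or
// std::nullopt if the inputs are invalid or the result overflows uint32_t.
std::optional<uint32_t> CheckedRowStride(int32_t width, uint32_t bytesPerPixel) {
  if (width <= 0 || bytesPerPixel == 0) {
    return std::nullopt;  // untrusted input: reject, do not assert
  }
  // Widen to 64 bits so the multiply and the round-up cannot wrap.
  const uint64_t raw = static_cast<uint64_t>(width) * bytesPerPixel;
  const uint64_t mask = static_cast<uint64_t>(kRowAlignment - 1);
  const uint64_t aligned = (raw + mask) & ~mask;
  // Internal invariant of the rounding above, not a check on caller input.
  assert(aligned % kRowAlignment == 0);
  if (aligned > std::numeric_limits<uint32_t>::max()) {
    return std::nullopt;
  }
  return static_cast<uint32_t>(aligned);
}

// Hypothetical helper: total buffer size for `height` rows, or std::nullopt
// if the stride is invalid or the total does not fit in uint32_t.
std::optional<uint32_t> CheckedBufferSize(int32_t width, int32_t height,
                                          uint32_t bytesPerPixel) {
  const std::optional<uint32_t> stride = CheckedRowStride(width, bytesPerPixel);
  if (!stride || height <= 0) {
    return std::nullopt;
  }
  const uint64_t total =
      static_cast<uint64_t>(*stride) * static_cast<uint64_t>(height);
  if (total > std::numeric_limits<uint32_t>::max()) {
    return std::nullopt;
  }
  return static_cast<uint32_t>(total);
}
```

A receiver written this way would treat an empty result as "ignore this message" in both debug and release builds, which is the behavior the compositor side needs when the content process is not trusted.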

Flags: needinfo?(aosmond) → needinfo?(jimb)

Testcase crashes using the initial build (mozilla-central 20230213170842-36b67e826e2d) but not with tip (mozilla-central 20230407213355-c3356b6d41ca).

The bug appears to have been fixed in the following build range:

Start: cdea2170a020d1529306ca468d3210133365c477 (20230405213026)
End: 6f3869e6e810960b6a869bfcbd0c1ce23fa9dd4e (20230405223044)
Pushlog: https://hg.mozilla.org/integration/autoland/pushloghtml?fromchange=cdea2170a020d1529306ca468d3210133365c477&tochange=6f3869e6e810960b6a869bfcbd0c1ce23fa9dd4e

jkratzer, can you confirm that the above bisection range is responsible for fixing this issue?
Removing bugmon keyword as no further action possible. Please review the bug and re-add the keyword for further analysis.

Flags: needinfo?(jimb) → needinfo?(jkratzer)
Keywords: bugmon
Attached file testcase.html (obsolete)
Attachment #9317703 - Attachment is obsolete: true
Flags: needinfo?(jkratzer)

Bug 1814091 changed CanvasContext.getPreferredFormat to GPU.getPreferredCanvasFormat which broke the testcase.

Keywords: bugmon
Whiteboard: [bugmon:bisected,confirmed] → [bugmon:bisected,confirmed,origRev=721a7c52f1ab5ca451ec9bb0635752915e231893]

Unable to reproduce bug 1816731 using build mozilla-central 20230406092115-b1b50f07c34b. Without a baseline, bugmon is unable to analyze this bug.
Removing bugmon keyword as no further action possible. Please review the bug and re-add the keyword for further analysis.

Keywords: bugmon

It doesn't seem like we have the validation implemented for GPUCanvasContext.configure(configuration).

Severity: -- → S3

I can reproduce the crash with an updated testcase. I will add it to the bug.

Assignee: nobody → bwerth
Attachment #9327944 - Attachment is obsolete: true
Attached file testcase_1816731.html

(In reply to Teodor Tanasoaia [:teoxoy] from comment #11)

> It doesn't seem like we have the validation implemented for GPUCanvasContext.configure(configuration).

I agree. Those steps are implemented by CanvasContext::Configure. I'll try to improve them.
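
As a rough idea of what an early, content-side check might look like, here is a minimal standalone sketch. It is not the patch that landed and not the body of CanvasContext::Configure: `CanvasSizeError` is a hypothetical helper, and it assumes the limit passed in is the device's `maxTextureDimension2D` (8192 by default in WebGPU). It only checks the upper bound on each dimension.

```cpp
#include <cstdint>
#include <optional>
#include <string>

// Hypothetical helper: returns an error message if the canvas is too large to
// back with a 2D texture, or std::nullopt if the size is acceptable.
// Assumption: maxTextureDimension2D is taken from the device's limits.
std::optional<std::string> CanvasSizeError(uint32_t width, uint32_t height,
                                           uint32_t maxTextureDimension2D) {
  if (width > maxTextureDimension2D) {
    return "canvas width " + std::to_string(width) +
           " exceeds maxTextureDimension2D " +
           std::to_string(maxTextureDimension2D);
  }
  if (height > maxTextureDimension2D) {
    return "canvas height " + std::to_string(height) +
           " exceeds maxTextureDimension2D " +
           std::to_string(maxTextureDimension2D);
  }
  return std::nullopt;  // size is within the limit
}
```

A caller could run this before sending anything to the compositor and report the returned message to script, while the compositor keeps its own independent checks.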

Blocks: 1864904
Pushed by bwerth@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/9f0b87cd1aeb
Part 1: Prevent configuration of a WebGPU context for a too-big canvas. r=webgpu-reviewers,ErichDonGubler
https://hg.mozilla.org/integration/autoland/rev/c59cb31eb037
Part 2: Add a test of context.configure. r=webgpu-reviewers,ErichDonGubler
Regressions: 1865409
Status: NEW → RESOLVED
Closed: 5 months ago
Resolution: --- → FIXED
Target Milestone: --- → 121 Branch
Flags: in-testsuite+