Closed Bug 1873047 Opened 11 months ago Closed 6 months ago

AddressSanitizer: breakpoint on unknown address 0x7ffd1a1620b2 [@ AnnotateMozCrashReason]

Categories

(Core :: Graphics: WebGPU, defect, P1)

x86_64
Windows
defect

Tracking

()

RESOLVED FIXED
124 Branch
Tracking Status
firefox-esr115 --- unaffected
firefox121 --- disabled
firefox122 --- disabled
firefox123 --- disabled
firefox124 --- fixed

People

(Reporter: jkratzer, Assigned: nical)

References

(Blocks 1 open bug)

Details

(Keywords: regression, testcase, Whiteboard: [bugmon:bisected,confirmed])

Crash Data

Attachments

(14 files)

Testcase found while fuzzing mozilla-central rev 9ea90dc23395 (built with: --enable-address-sanitizer --enable-fuzzing).

Testcase can be reproduced using the following commands:

$ pip install fuzzfetch grizzly-framework
$ python -m fuzzfetch --build 9ea90dc23395 --asan --fuzzing  -n firefox
$ python -m grizzly.replay.bugzilla ./firefox/firefox <bugid>
AddressSanitizer: breakpoint on unknown address 0x7ffd1a1620b2 [@ AnnotateMozCrashReason]

    =================================================================
    ==3564==ERROR: AddressSanitizer: breakpoint on unknown address 0x7ffd1a1620b2 (pc 0x7ffd1a1620b2 bp 0x00446f1fa670 sp 0x00446f1fa450 T0)
    ==3564==*** WARNING: Failed to initialize DbgHelp!              ***
    ==3564==*** Most likely this means that the app is already      ***
    ==3564==*** using DbgHelp, possibly with incompatible flags.    ***
    ==3564==*** Due to technical reasons, symbolization might crash ***
    ==3564==*** or produce wrong results.                           ***
        #0 0x7ffd1a1620b1 in AnnotateMozCrashReason /builds/worker/workspace/obj-build/dist/include/mozilla/Assertions.h:43
        #1 0x7ffd1a1620b1 in mozilla::webgpu::Sampler::Sampler(class mozilla::webgpu::Device *const, unsigned __int64) /dom/webgpu/Sampler.cpp:19
        #2 0x7ffd1a130445 in mozilla::webgpu::Device::CreateSampler(struct mozilla::dom::GPUSamplerDescriptor const &) /dom/webgpu/Device.cpp:200
        #3 0x7ffd18b1fa77 in mozilla::dom::GPUDevice_Binding::createSampler /builds/worker/workspace/obj-build/dom/bindings/./WebGPUBinding.cpp:19125
        #4 0x7ffd197d5f89 in mozilla::dom::binding_detail::GenericMethod<struct mozilla::dom::binding_detail::NormalThisPolicy, struct mozilla::dom::binding_detail::ThrowExceptions>(struct JSContext *, unsigned int, class JS::Value *) /dom/bindings/BindingUtils.cpp:3258
        #5 0x7ffd2421d8c8 in CallJSNative /js/src/vm/Interpreter.cpp:479
        #6 0x7ffd2421d8c8 in js::InternalCallOrConstruct(struct JSContext *, class JS::CallArgs const &, enum js::MaybeConstruct, enum js::CallReason) /js/src/vm/Interpreter.cpp:573
        #7 0x7ffd2423b3b0 in InternalCall /js/src/vm/Interpreter.cpp:640
        #8 0x7ffd2423b3b0 in js::CallFromStack /js/src/vm/Interpreter.cpp:645
        #9 0x7ffd2423b3b0 in js::Interpret(struct JSContext *, class js::RunState &) /js/src/vm/Interpreter.cpp:3060
        #10 0x7ffd2421c50a in MaybeEnterInterpreterTrampoline /js/src/vm/Interpreter.cpp:393
        #11 0x7ffd2421c50a in js::RunScript(struct JSContext *, class js::RunState &) /js/src/vm/Interpreter.cpp:451
        #12 0x7ffd2421da08 in js::InternalCallOrConstruct(struct JSContext *, class JS::CallArgs const &, enum js::MaybeConstruct, enum js::CallReason) /js/src/vm/Interpreter.cpp:605
        #13 0x7ffd2421f80b in InternalCall /js/src/vm/Interpreter.cpp:640
        #14 0x7ffd2421f80b in js::Call(struct JSContext *, class JS::Handle<class JS::Value>, class JS::Handle<class JS::Value>, class js::AnyInvokeArgs const &, class JS::MutableHandle<class JS::Value>, enum js::CallReason) /js/src/vm/Interpreter.cpp:672
        #15 0x7ffd22a0714f in js::CallSelfHostedFunction(struct JSContext *, class JS::Handle<class js::PropertyName *>, class JS::Handle<class JS::Value>, class js::AnyInvokeArgs const &, class JS::MutableHandle<class JS::Value>) /js/src/vm/SelfHosting.cpp:1520
        #16 0x7ffd22e07bf2 in AsyncFunctionResume /js/src/vm/AsyncFunction.cpp:149
        #17 0x7ffd22afb2e6 in AsyncFunctionPromiseReactionJob /js/src/builtin/Promise.cpp:2120
        #18 0x7ffd22afb2e6 in PromiseReactionJob /js/src/builtin/Promise.cpp:2178
        #19 0x7ffd2421d8c8 in CallJSNative /js/src/vm/Interpreter.cpp:479
        #20 0x7ffd2421d8c8 in js::InternalCallOrConstruct(struct JSContext *, class JS::CallArgs const &, enum js::MaybeConstruct, enum js::CallReason) /js/src/vm/Interpreter.cpp:573
        #21 0x7ffd2421f80b in InternalCall /js/src/vm/Interpreter.cpp:640
        #22 0x7ffd2421f80b in js::Call(struct JSContext *, class JS::Handle<class JS::Value>, class JS::Handle<class JS::Value>, class js::AnyInvokeArgs const &, class JS::MutableHandle<class JS::Value>, enum js::CallReason) /js/src/vm/Interpreter.cpp:672
        #23 0x7ffd22ddeb25 in JS::Call(struct JSContext *, class JS::Handle<class JS::Value>, class JS::Handle<class JS::Value>, class JS::HandleValueArray const &, class JS::MutableHandle<class JS::Value>) /js/src/vm/CallAndConstruct.cpp:119
        #24 0x7ffd17f2a106 in mozilla::dom::PromiseJobCallback::Call(class mozilla::dom::BindingCallContext &, class JS::Handle<class JS::Value>, class mozilla::ErrorResult &) /builds/worker/workspace/obj-build/dom/bindings/./PromiseBinding.cpp:83
        #25 0x7ffd13a86855 in mozilla::dom::PromiseJobCallback::Call /builds/worker/workspace/obj-build/dist/include/mozilla/dom/PromiseBinding.h:198
        #26 0x7ffd13a86855 in mozilla::dom::PromiseJobCallback::Call /builds/worker/workspace/obj-build/dist/include/mozilla/dom/PromiseBinding.h:211
        #27 0x7ffd13a86855 in mozilla::PromiseJobRunnable::Run(class mozilla::AutoSlowOperation &) /xpcom/base/CycleCollectedJSContext.cpp:210
        #28 0x7ffd13a5f77e in mozilla::CycleCollectedJSContext::PerformMicroTaskCheckPoint(bool) /xpcom/base/CycleCollectedJSContext.cpp:712
        #29 0x7ffd13a60db6 in mozilla::CycleCollectedJSContext::AfterProcessTask(unsigned int) /xpcom/base/CycleCollectedJSContext.cpp:499
        #30 0x7ffd15753b16 in XPCJSContext::AfterProcessTask(unsigned int) /js/xpconnect/src/XPCJSContext.cpp:1499
        #31 0x7ffd13d04672 in nsThread::ProcessNextEvent(bool, bool *) /xpcom/threads/nsThread.cpp:1237
        #32 0x7ffd13d147aa in NS_ProcessNextEvent(class nsIThread *, bool) /xpcom/threads/nsThreadUtils.cpp:480
        #33 0x7ffd15470b27 in mozilla::ipc::MessagePump::Run(class base::MessagePump::Delegate *) /ipc/glue/MessagePump.cpp:85
        #34 0x7ffd1538b803 in MessageLoop::RunInternal /ipc/chromium/src/base/message_loop.cc:370
        #35 0x7ffd1538b803 in MessageLoop::RunHandler(void) /ipc/chromium/src/base/message_loop.cc:363
        #36 0x7ffd1538b5ca in MessageLoop::Run(void) /ipc/chromium/src/base/message_loop.cc:345
        #37 0x7ffd1dcd8e7c in nsBaseAppShell::Run(void) /widget/nsBaseAppShell.cpp:148
        #38 0x7ffd1df48f07 in nsAppShell::Run(void) /widget/windows/nsAppShell.cpp:822
        #39 0x7ffd22135fbe in XRE_RunAppShell(void) /toolkit/xre/nsEmbedFunctions.cpp:721
        #40 0x7ffd1538b803 in MessageLoop::RunInternal /ipc/chromium/src/base/message_loop.cc:370
        #41 0x7ffd1538b803 in MessageLoop::RunHandler(void) /ipc/chromium/src/base/message_loop.cc:363
        #42 0x7ffd1538b5ca in MessageLoop::Run(void) /ipc/chromium/src/base/message_loop.cc:345
        #43 0x7ffd221355a4 in XRE_InitChildProcess(int, char **const, struct XREChildData const *) /toolkit/xre/nsEmbedFunctions.cpp:656
        #44 0x7ff756952713 in content_process_main /browser/app/../../ipc/contentproc/plugin-container.cpp:57
        #45 0x7ff756952713 in NS_internal_main(int, char **, char **) /browser/app/nsBrowserApp.cpp:375
        #46 0x7ff7569514d8 in wmain /toolkit/xre/nsWindowsWMain.cpp:151
        #47 0x7ff756a34207 in invoke_main D:\a\_work\1\s\src\vctools\crt\vcstartup\src\startup\exe_common.inl:90
        #48 0x7ff756a34207 in __scrt_common_main_seh D:\a\_work\1\s\src\vctools\crt\vcstartup\src\startup\exe_common.inl:288
        #49 0x7ffd7898257c  (C:\Windows\System32\KERNEL32.DLL+0x18001257c)
        #50 0x7ffd79c6aa57  (C:\Windows\SYSTEM32\ntdll.dll+0x18005aa57)
    
    AddressSanitizer can not provide additional info.
    SUMMARY: AddressSanitizer: breakpoint /builds/worker/workspace/obj-build/dist/include/mozilla/Assertions.h:43 in AnnotateMozCrashReason
    ==3564==ABORTING
Attached file Testcase
Crash Signature: [@ core::result::unwrap_failed | wgpu_core::command::CommandEncoder<T>::open<T> ]

Regressed by: bug 1853140

Keywords: regression
Regressed by: 1853140

Set release status flags based on info from the regressing bug 1853140

:nical, since you are the author of the regressor, bug 1853140, could you take a look? Also, could you set the severity field?

Flags: needinfo?(nical.bugzilla)
Severity: -- → S3
Priority: -- → P1

Verified bug as reproducible on mozilla-central 20240104213501-45533d2448ef.
The bug appears to have been introduced in the following build range:

Start: ba00fe639072b671be556332e4628092d64d31df (20230615205119)
End: 08fd41d3ba9b86259882ef3f3a02a49faa61a5c5 (20230616001721)
Pushlog: https://hg.mozilla.org/integration/autoland/pushloghtml?fromchange=ba00fe639072b671be556332e4628092d64d31df&tochange=08fd41d3ba9b86259882ef3f3a02a49faa61a5c5

Whiteboard: [bugmon:confirm] → [bugmon:bisected,confirmed]
Assignee: nobody → nical.bugzilla
Flags: needinfo?(nical.bugzilla)
No longer regressed by: 1853140

Things are going wrong at multiple levels here:

    1. The test case manages to cause a device loss, at least in Mayank's case. It didn't reproduce for me, but that depends on the driver.
    2. wgpu detects the error but fails to propagate it all the way to the API entry points and crashes instead.
    3. The wgpu process crashes, and some of our WebGPU DOM glue then crashes when the IPC connection is not available.

I have prepared a fix for 2) upstream at https://github.com/gfx-rs/wgpu/pull/4999
I'll put together a fix for 3) here.

It would be nice to better understand 1) as well but I haven't gotten to that point yet, and fixing the other two (in fact just fixing 2)) should be enough to gracefully recover from the device loss and unblock fuzzing.
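To illustrate point 2), here is a minimal sketch of the difference between unwrapping a fallible backend call and propagating its error to the entry point. It is not the actual wgpu code or the content of the upstream PR; `DeviceError`, `begin_encoding`, `open_unchecked` and `open_checked` are hypothetical names made up for this example.

```rust
/// Hypothetical error type standing in for a device-level failure.
#[derive(Debug, PartialEq)]
enum DeviceError {
    Lost,
}

/// Hypothetical stand-in for a backend call that fails once the device is lost.
fn begin_encoding(device_lost: bool) -> Result<u32, DeviceError> {
    if device_lost {
        Err(DeviceError::Lost)
    } else {
        Ok(1) // pretend this is a raw encoder handle
    }
}

/// Crash-prone shape: the error is detected but `unwrap` turns it into a panic.
fn open_unchecked(device_lost: bool) -> u32 {
    begin_encoding(device_lost).unwrap()
}

/// Recoverable shape: `?` hands the error back to the caller, so the API
/// entry point can report a device loss instead of taking the process down.
fn open_checked(device_lost: bool) -> Result<u32, DeviceError> {
    let raw = begin_encoding(device_lost)?;
    Ok(raw)
}
```

With the second shape, a caller can match on `Err(DeviceError::Lost)` and surface it as a regular WebGPU error; with the first, the same condition ends in `core::result::unwrap_failed`, as in the crash signature above.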

These JS proxies are always safe to create. The ids are validated on the other side so if the GPU process comes back up and we try to use the object (on a new device), it will generate an error as expected.
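As a rough model of why creating the proxy is safe, the sketch below allocates ids on the client side whether or not the connection is up, and leaves validation to the other side. This is only an illustration of the idea, not the Firefox implementation (which is C++); `Client`, `Server`, `SamplerProxy` and their methods are hypothetical names.

```rust
use std::collections::HashSet;

type Id = u64;

/// Client-side handle. Creating one never fails, even with IPC down.
#[derive(Debug, Clone, Copy, PartialEq)]
struct SamplerProxy {
    id: Id,
}

/// Hypothetical model of the other side, which owns the real resources.
struct Server {
    live: HashSet<Id>,
}

impl Server {
    /// Every id is validated here before it is used.
    fn use_sampler(&self, id: Id) -> Result<(), String> {
        if self.live.contains(&id) {
            Ok(())
        } else {
            Err(format!("invalid sampler id {}", id))
        }
    }
}

/// Hypothetical model of the content side; `server: None` means IPC is down.
struct Client {
    next_id: Id,
    server: Option<Server>,
}

impl Client {
    /// Always returns a proxy. The resource is only registered on the
    /// other side if the connection is currently available.
    fn create_sampler(&mut self) -> SamplerProxy {
        let id = self.next_id;
        self.next_id += 1;
        if let Some(server) = self.server.as_mut() {
            server.live.insert(id);
        }
        SamplerProxy { id }
    }

    /// Using a proxy yields an error (not a crash) if the id is unknown
    /// on the other side or the connection is down.
    fn use_sampler(&self, sampler: SamplerProxy) -> Result<(), String> {
        match &self.server {
            Some(server) => server.use_sampler(sampler.id),
            None => Err("connection is down".to_string()),
        }
    }
}
```

In this model, a proxy created while the connection is down stays invalid after the other side comes back, so using it produces an error, while proxies created afterwards work normally.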

Depends on D197799

gfx-rs/wgpu#4999 has merged, resolving item 2) from :nical's comment 8.

Status: NEW → ASSIGNED
Pushed by egubler@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/8f0c15267535
Generate a valid Sampler object on the JS side even if IPC is down. r=webgpu-reviewers,ErichDonGubler
https://hg.mozilla.org/integration/autoland/rev/7142405fee57
Generate a valid Texture object on the JS side even if IPC is down. r=webgpu-reviewers,ErichDonGubler
https://hg.mozilla.org/integration/autoland/rev/bba7b8f2946f
Generate a valid TextureView object on the JS side even if IPC is down. r=webgpu-reviewers,ErichDonGubler
https://hg.mozilla.org/integration/autoland/rev/055f1bcf0675
Generate a valid BindGroupLayout object on the JS side even if IPC is down. r=webgpu-reviewers,ErichDonGubler
https://hg.mozilla.org/integration/autoland/rev/6a97aed289ab
Generate a valid BindGroup object on the JS side even if IPC is down. r=webgpu-reviewers,ErichDonGubler
https://hg.mozilla.org/integration/autoland/rev/aca3db8439db
Generate a valid ShaderModule object on the JS side even if IPC is down. r=webgpu-reviewers,ErichDonGubler
https://hg.mozilla.org/integration/autoland/rev/30b515db994b
Generate a valid CommandEncoder object on the JS side even if IPC is down. r=webgpu-reviewers,ErichDonGubler
https://hg.mozilla.org/integration/autoland/rev/2a1661674e34
Generate a valid PipelineLayout object on the JS side even if IPC is down. r=webgpu-reviewers,ErichDonGubler
https://hg.mozilla.org/integration/autoland/rev/bc67106c19d8
Generate a valid ComputePipeline object on the JS side even if IPC is down. r=webgpu-reviewers,ErichDonGubler
https://hg.mozilla.org/integration/autoland/rev/d0475e27ae48
Generate a valid RenderPipeline object on the JS side even if IPC is down. r=webgpu-reviewers,ErichDonGubler
https://hg.mozilla.org/integration/autoland/rev/79282fd571c5
Move CommandEncoder and RenderBundleEncoder ::Finish to their respective file. r=webgpu-reviewers,ErichDonGubler

Bug marked as FIXED but still reproduces on mozilla-central 20240120093931-5471899cc9d0. If you believe this to be incorrect, please remove the bugmon keyword to prevent further analysis.

Status: RESOLVED → REOPENED
Resolution: FIXED → ---

When I run the attached testcase locally, I get the following crash. Do you want me to open a new bug for this issue?

I think that this is still crashing for the same reason as bug 1874478 for which fixes have landed upstream. So let's wait until the next wgpu update and see how things are going then.

:nical: The fix should already be consumed, with the tracked WGPU update landing yesterday. Can you validate that it is fixed as expected?

Flags: needinfo?(nical.bugzilla)

Testcase crashes using the initial build (mozilla-central 20240103160634-9ea90dc23395) but not with tip (mozilla-central 20240209214145-9c7562b79131).

The bug appears to have been fixed in the following build range:

Start: f3efca74da0f43269bd8ac07e2a5d27e89c4d7c3 (20240123145016)
End: 936300bf2ee78e086143fb3718e2c0af755385b0 (20240123160955)
Pushlog: https://hg.mozilla.org/integration/autoland/pushloghtml?fromchange=f3efca74da0f43269bd8ac07e2a5d27e89c4d7c3&tochange=936300bf2ee78e086143fb3718e2c0af755385b0

Removing bugmon keyword as no further action possible. Please review the bug and re-add the keyword for further analysis.

Keywords: bugmon
Flags: needinfo?(nical.bugzilla)

I believe this is fixed now, so closing.

No longer blocks: webgpu-triage
Status: REOPENED → RESOLVED
Closed: 10 months ago → 6 months ago
Resolution: --- → FIXED
Depends on: 1875543
Target Milestone: 123 Branch → 124 Branch