Closed Bug 1874080 Opened 10 months ago Closed 10 months ago

Hit MOZ_CRASH(Found dangling CheckedUnsafePtr) at /builds/worker/workspace/obj-build/dist/include/mozilla/dom/quota/CheckedUnsafePtr.h:419

Categories

(Core :: DOM: Workers, defect, P3)

defect

Tracking

()

VERIFIED FIXED
123 Branch
Tracking Status
firefox-esr115 123+ fixed
firefox121 --- wontfix
firefox122 --- wontfix
firefox123 + verified

People

(Reporter: tsmith, Assigned: edenchuang)

References

(Blocks 1 open bug, Regression)

Details

(5 keywords, Whiteboard: [bugmon:bisected,confirmed][adv-main123+r][adv-esr115.8+r])

Attachments

(2 files)

Attached file testcase.html

Found while fuzzing m-c 20231112-43bee97ffb8c (--enable-address-sanitizer --enable-fuzzing)

To reproduce via Grizzly Replay:

$ pip install fuzzfetch grizzly-framework
$ python -m fuzzfetch -a --fuzzing -n firefox
$ python -m grizzly.replay.bugzilla ./firefox/firefox <bugid>

Hit MOZ_CRASH(Found dangling CheckedUnsafePtr) at /builds/worker/workspace/obj-build/dist/include/mozilla/dom/quota/CheckedUnsafePtr.h:419

#0 0x7f51155482e8 in NotifyCheckFailure /builds/worker/workspace/obj-build/dist/include/mozilla/dom/quota/CheckedUnsafePtr.h:419:31
#1 0x7f51155482e8 in NotifyCheckFailure<mozilla::CrashOnDanglingCheckedUnsafePtr> /builds/worker/workspace/obj-build/dist/include/mozilla/dom/quota/CheckedUnsafePtr.h:387:13
#2 0x7f51155482e8 in mozilla::CheckCheckedUnsafePtrs<mozilla::CrashOnDanglingCheckedUnsafePtr>::Check(nsTArray<mozilla::detail::CheckedUnsafePtrBaseCheckingEnabled*>&) /builds/worker/workspace/obj-build/dist/include/mozilla/dom/quota/CheckedUnsafePtr.h:411:7
#3 0x7f51154d7b10 in mozilla::detail::SupportCheckedUnsafePtrImpl<mozilla::CrashOnDanglingCheckedUnsafePtr, (mozilla::CheckingSupport)1>::~SupportCheckedUnsafePtrImpl() /builds/worker/workspace/obj-build/dist/include/mozilla/dom/quota/CheckedUnsafePtr.h:459:13
#4 0x7f511843c6d6 in Release /builds/worker/checkouts/gecko/dom/workers/WorkerPrivate.h:222:3
#5 0x7f511843c6d6 in Release /builds/worker/workspace/obj-build/dist/include/mozilla/RefPtr.h:49:40
#6 0x7f511843c6d6 in Release /builds/worker/workspace/obj-build/dist/include/mozilla/RefPtr.h:409:36
#7 0x7f511843c6d6 in assign_assuming_AddRef /builds/worker/workspace/obj-build/dist/include/mozilla/RefPtr.h:68:7
#8 0x7f511843c6d6 in operator= /builds/worker/workspace/obj-build/dist/include/mozilla/RefPtr.h:180:5
#9 0x7f511843c6d6 in Terminate /builds/worker/checkouts/gecko/dom/workers/Worker.cpp:196:20
#10 0x7f511843c6d6 in mozilla::dom::Worker::cycleCollection::Unlink(void*) /builds/worker/checkouts/gecko/dom/workers/Worker.cpp:209:8
#11 0x7f510da92adc in nsCycleCollector::CollectWhite() /builds/worker/checkouts/gecko/xpcom/base/nsCycleCollector.cpp:3133:26
#12 0x7f510da96ef8 in nsCycleCollector::Collect(mozilla::CCReason, ccIsManual, js::SliceBudget&, nsICycleCollectorListener*, bool) /builds/worker/checkouts/gecko/xpcom/base/nsCycleCollector.cpp:3499:26
#13 0x7f510da9641a in nsCycleCollector::ShutdownCollect() /builds/worker/checkouts/gecko/xpcom/base/nsCycleCollector.cpp:3410:20
#14 0x7f510da995f6 in nsCycleCollector::Shutdown(bool) /builds/worker/checkouts/gecko/xpcom/base/nsCycleCollector.cpp:3709:5
#15 0x7f510da9bcf0 in nsCycleCollector_shutdown(bool) /builds/worker/checkouts/gecko/xpcom/base/nsCycleCollector.cpp:4033:18
#16 0x7f510dd51fc7 in mozilla::ShutdownXPCOM(nsIServiceManager*) /builds/worker/checkouts/gecko/xpcom/build/XPCOMInit.cpp:702:3
#17 0x7f511e1bf972 in XRE_InitChildProcess(int, char**, XREChildData const*) /builds/worker/checkouts/gecko/toolkit/xre/nsEmbedFunctions.cpp:660:16
#18 0x55c170f1dd9c in content_process_main /builds/worker/checkouts/gecko/browser/app/../../ipc/contentproc/plugin-container.cpp:57:28
#19 0x55c170f1dd9c in main /builds/worker/checkouts/gecko/browser/app/nsBrowserApp.cpp:375:18
#20 0x7f5135829d8f in __libc_start_call_main csu/../sysdeps/nptl/libc_start_call_main.h:58:16
#21 0x7f5135829e3f in __libc_start_main csu/../csu/libc-start.c:392:3
#22 0x55c170e420a8 in _start (/home/user/workspace/browsers/m-c-20240109162901-fuzzing-asan-opt/firefox+0xdc0a8) (BuildId: 8680b303547d6f1d0e0b5e1ccb4e2e4b700049d7)
Flags: in-testsuite?

Verified bug as reproducible on mozilla-central 20240110213539-1c750a173258.
The bug appears to have been introduced in the following build range:

Start: 53b4b785ae2a7f70257069e77b138fe36b53698a (20230612211509)
End: eb926a42fef72714535ed99bfa3586454449f898 (20230612192238)
Pushlog: https://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=53b4b785ae2a7f70257069e77b138fe36b53698a&tochange=eb926a42fef72714535ed99bfa3586454449f898

Keywords: regression
Whiteboard: [bugmon:bisected,confirmed]

Interesting shutdown case.

Component: Storage: Quota Manager → DOM: Workers

When nsCycleCollector_shutdown is called on the main thread, other threads should be gone already.
I didn't check who has still a raw pointer to WorkerPrivate at that point. That is surprising.

Severity: -- → S2
Priority: -- → P3

Eden probably knows best how to find the offender.

Flags: needinfo?(echuang)

(In reply to Bugmon [:jkratzer for issues] from comment #1)

Verified bug as reproducible on mozilla-central 20240110213539-1c750a173258.
The bug appears to have been introduced in the following build range:

Start: 53b4b785ae2a7f70257069e77b138fe36b53698a (20230612211509)
End: eb926a42fef72714535ed99bfa3586454449f898 (20230612192238)
Pushlog: https://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=53b4b785ae2a7f70257069e77b138fe36b53698a&tochange=eb926a42fef72714535ed99bfa3586454449f898

I think bug 1777921 is in the range only because it increased the number of GC/CC runs, but the underlying problem is probably older.

Yeah, I was thinking the same.

Keywords: pernosco-wanted

Successfully recorded a pernosco session. A link to the pernosco session will be added here shortly.

A pernosco session for this bug can be found here.

We don't run shutdown CCs in content processes so this stack isn't concerning by itself, but it could be a sign of an issue that could happen before the shutdown so let's leave it hidden until somebody figures out what is going wrong.

(In reply to Andrew McCreight [:mccr8] from comment #9)

We don't run shutdown CCs in content processes so this stack isn't concerning by itself, but it could be a sign of an issue that could happen before the shutdown so let's leave it hidden until somebody figures out what is going wrong.

Interestingly we are not yet in content process shutdown in the pernosco session, the CC run there in question is triggered by IdleTaskRunner::Run. It unlinks the Worker object on the main thread which removes apparently the last strong reference to WorkerPrivate, which as such is not concerning. The concern here is that some other object still holds a raw* to the same WorkerPrivate as detected by CheckedUnsafePtr that might or might not be used later (we have no evidence it is a real problem).

In the stack in comment 0 the same is happening, but just triggered by the last possible CC run on shutdown which I assume just happens to be the first to run after the worker shutdown.

Log from CheckedUnsafePtr

CheckedUnsafePtr [0x1082406c8]
Location of creation: WebTaskSchedulerWorker, WebTaskSchedulerWorker.cpp:22
Stack of creation:
#01: mozilla::dom::WebTaskSchedulerWorker::WebTaskSchedulerWorker(mozilla::dom::WorkerPrivate*)[/Users/echuang/Firefox/mozilla-central/obj-aarch64-apple-darwin/toolkit/library/build/XUL +0x310be6c]
#02: mozilla::dom::WebTaskScheduler::CreateForWorker(mozilla::dom::WorkerPrivate*)[/Users/echuang/Firefox/mozilla-central/obj-aarch64-apple-darwin/toolkit/library/build/XUL +0x310af54]
#03: mozilla::dom::WorkerGlobalScope::Scheduler()[/Users/echuang/Firefox/mozilla-central/obj-aarch64-apple-darwin/toolkit/library/build/XUL +0x2e04a70]
#04: mozilla::dom::WorkerGlobalScope_Binding::get_scheduler(JSContext*, JS::Handle<JSObject*>, void*, JSJitGetterCallArgs)[/Users/echuang/Firefox/mozilla-central/obj-aarch64-apple-darwin/toolkit/library/build/XUL +0x1a10420]
#05: bool mozilla::dom::binding_detail::GenericGetter<mozilla::dom::binding_detail::MaybeGlobalThisPolicy, mozilla::dom::binding_detail::ThrowExceptions>(JSContext*, unsigned int, JS::Value*)[/Users/echuang/Firefox/mozilla-central/obj-aarch64-apple-darwin/toolkit/library/build/XUL +0x1d35a8c]
#06: js::InternalCallOrConstruct(JSContext*, JS::CallArgs const&, js::MaybeConstruct, js::CallReason)[/Users/echuang/Firefox/mozilla-central/obj-aarch64-apple-darwin/toolkit/library/build/XUL +0x43b8f00]
#07: js::Call(JSContext*, JS::Handle<JS::Value>, JS::Handle<JS::Value>, js::AnyInvokeArgs const&, JS::MutableHandle<JS::Value>, js::CallReason)[/Users/echuang/Firefox/mozilla-central/obj-aarch64-apple-darwin/toolkit/library/build/XUL +0x43b96a4]
#08: js::CallGetter(JSContext*, JS::Handle<JS::Value>, JS::Handle<JS::Value>, JS::MutableHandle<JS::Value>)[/Users/echuang/Firefox/mozilla-central/obj-aarch64-apple-darwin/toolkit/library/build/XUL +0x43b9d8c]
#09: js::NativeGetProperty(JSContext*, JS::Handle<js::NativeObject*>, JS::Handle<JS::Value>, JS::Handle<JS::PropertyKey>, JS::MutableHandle<JS::Value>)[/Users/echuang/Firefox/mozilla-central/obj-aarch64-apple-darwin/toolkit/library/build/XUL +0x44d83d4]
#10: js::GetProperty(JSContext*, JS::Handle<JS::Value>, JS::Handle<js::PropertyName*>, JS::MutableHandle<JS::Value>)[/Users/echuang/Firefox/mozilla-central/obj-aarch64-apple-darwin/toolkit/library/build/XUL +0x43c8d38]
#11: js::Interpret(JSContext*, js::RunState&)[/Users/echuang/Firefox/mozilla-central/obj-aarch64-apple-darwin/toolkit/library/build/XUL +0x43bfad8]
#12: js::RunScript(JSContext*, js::RunState&)[/Users/echuang/Firefox/mozilla-central/obj-aarch64-apple-darwin/toolkit/library/build/XUL +0x43b8b44]
#13: js::InternalCallOrConstruct(JSContext*, JS::CallArgs const&, js::MaybeConstruct, js::CallReason)[/Users/echuang/Firefox/mozilla-central/obj-aarch64-apple-darwin/toolkit/library/build/XUL +0x43b9160]
#14: js::Call(JSContext*, JS::Handle<JS::Value>, JS::Handle<JS::Value>, js::AnyInvokeArgs const&, JS::MutableHandle<JS::Value>, js::CallReason)[/Users/echuang/Firefox/mozilla-central/obj-aarch64-apple-darwin/toolkit/library/build/XUL +0x43b96a4]
#15: JS::Call(JSContext*, JS::Handle<JS::Value>, JS::Handle<JS::Value>, JS::HandleValueArray const&, JS::MutableHandle<JS::Value>)[/Users/echuang/Firefox/mozilla-central/obj-aarch64-apple-darwin/toolkit/library/build/XUL +0x4441b68]
#16: mozilla::dom::EventHandlerNonNull::Call(mozilla::dom::BindingCallContext&, JS::Handle<JS::Value>, mozilla::dom::Event&, JS::MutableHandle<JS::Value>, mozilla::ErrorResult&)[/Users/echuang/Firefox/mozilla-central/obj-aarch64-apple-darwin/toolkit/library/build/XUL +0x1b6000c]
#17: mozilla::JSEventHandler::HandleEvent(mozilla::dom::Event*)[/Users/echuang/Firefox/mozilla-central/obj-aarch64-apple-darwin/toolkit/library/build/XUL +0x20f1b00]
#18: mozilla::EventListenerManager::HandleEventSingleListener(mozilla::EventListenerManager::Listener*, nsAtom*, mozilla::WidgetEvent*, mozilla::dom::Event*, mozilla::dom::EventTarget*, bool)[/Users/echuang/Firefox/mozilla-central/obj-aarch64-apple-darwin/toolkit/library/build/XUL +0x20d72a8]
#19: mozilla::EventListenerManager::HandleEventWithListenerArray(mozilla::EventListenerManager::ListenerArray*, nsAtom*, mozilla::EventMessage, nsPresContext*, mozilla::WidgetEvent*, mozilla::dom::Event**, mozilla::dom::EventTarget*, bool)[/Users/echuang/Firefox/mozilla-central/obj-aarch64-apple-darwin/toolkit/library/build/XUL +0x20d7f70]
#20: mozilla::EventListenerManager::HandleEventInternal(nsPresContext*, mozilla::WidgetEvent*, mozilla::dom::Event**, mozilla::dom::EventTarget*, nsEventStatus*, bool)[/Users/echuang/Firefox/mozilla-central/obj-aarch64-apple-darwin/toolkit/library/build/XUL +0x20d78b8]
#21: mozilla::EventTargetChainItem::HandleEventTargetChain(nsTArray<mozilla::EventTargetChainItem>&, mozilla::EventChainPostVisitor&, mozilla::EventDispatchingCallback*, mozilla::ELMCreationDetector&)[/Users/echuang/Firefox/mozilla-central/obj-aarch64-apple-darwin/toolkit/library/build/XUL +0x20cf6e4]
#22: mozilla::EventDispatcher::Dispatch(mozilla::dom::EventTarget*, nsPresContext*, mozilla::WidgetEvent*, mozilla::dom::Event*, nsEventStatus*, mozilla::EventDispatchingCallback*, nsTArray<mozilla::dom::EventTarget*>*)[/Users/echuang/Firefox/mozilla-central/obj-aarch64-apple-darwin/toolkit/library/build/XUL +0x20d166c]
#23: mozilla::DOMEventTargetHelper::DispatchEvent(mozilla::dom::Event&, mozilla::dom::CallerType, mozilla::ErrorResult&)[/Users/echuang/Firefox/mozilla-central/obj-aarch64-apple-darwin/toolkit/library/build/XUL +0x20b1340]
#24: mozilla::dom::EventTarget::DispatchEvent(mozilla::dom::Event&)[/Users/echuang/Firefox/mozilla-central/obj-aarch64-apple-darwin/toolkit/library/build/XUL +0x20ddd10]
#25: mozilla::dom::MessageEventRunnable::DispatchDOMEvent(JSContext*, mozilla::dom::WorkerPrivate*, mozilla::DOMEventTargetHelper*, bool)[/Users/echuang/Firefox/mozilla-central/obj-aarch64-apple-darwin/toolkit/library/build/XUL +0x2dd6c34]
#26: mozilla::dom::WorkerRunnable::Run()[/Users/echuang/Firefox/mozilla-central/obj-aarch64-apple-darwin/toolkit/library/build/XUL +0x2e00ce8]
#27: nsThread::ProcessNextEvent(bool, bool*)[/Users/echuang/Firefox/mozilla-central/obj-aarch64-apple-darwin/toolkit/library/build/XUL +0x29e130]
#28: NS_ProcessNextEvent(nsIThread*, bool)[/Users/echuang/Firefox/mozilla-central/obj-aarch64-apple-darwin/toolkit/library/build/XUL +0x2a2104]
#29: mozilla::dom::WorkerPrivate::DoRunLoop(JSContext*)[/Users/echuang/Firefox/mozilla-central/obj-aarch64-apple-darwin/toolkit/library/build/XUL +0x2df6b3c]
#30: mozilla::dom::workerinternals::(anonymous namespace)::WorkerThreadPrimaryRunnable::Run()[/Users/echuang/Firefox/mozilla-central/obj-aarch64-apple-darwin/toolkit/library/build/XUL +0x2de7308]
#31: nsThread::ProcessNextEvent(bool, bool*)[/Users/echuang/Firefox/mozilla-central/obj-aarch64-apple-darwin/toolkit/library/build/XUL +0x29e130]
#32: NS_ProcessNextEvent(nsIThread*, bool)[/Users/echuang/Firefox/mozilla-central/obj-aarch64-apple-darwin/toolkit/library/build/XUL +0x2a2104]
#33: mozilla::ipc::MessagePumpForNonMainThreads::Run(base::MessagePump::Delegate*)[/Users/echuang/Firefox/mozilla-central/obj-aarch64-apple-darwin/toolkit/library/build/XUL +0x99b724]
#34: MessageLoop::Run()[/Users/echuang/Firefox/mozilla-central/obj-aarch64-apple-darwin/toolkit/library/build/XUL +0x945058]
#35: nsThread::ThreadFunc(void*)[/Users/echuang/Firefox/mozilla-central/obj-aarch64-apple-darwin/toolkit/library/build/XUL +0x29ba40]
#36: _pt_root[/Users/echuang/Firefox/mozilla-central/obj-aarch64-apple-darwin/dist/Nightly.app/Contents/MacOS/libnss3.dylib +0x14b7d0]
#37: _pthread_start[/usr/lib/system/libsystem_pthread.dylib +0x7034]

Stack of last assignment

The case shows a race condition between WorkerPrivate releasing and Cycle-Colloect a WorkerGlobalScope.
The CheckedUnsafePtr holder is WebTaskSchedulerWorker which is strong referenced by WorkerGlobalScope. And we only release the strong reference while WorkerGlobalScope is released.
So if WorkerGlobalScope is cycle-collected after WorkerPrivate released, we hit the CheckedUnsafePtr complaining.

I guess we can unlink/nullify WorkerGlobalScope::WebTaskSchedulerWorker earlier at where we are closing/canceling the worker.

Olli, how do you think about it ?

Flags: needinfo?(echuang) → needinfo?(smaug)

Yeah, that sounds good. Also, WebTaskScheduler is only enabled on Nightly, so the patch doesn't need to be uplifted.

Flags: needinfo?(smaug)

Eden, are you going to take this?

Flags: needinfo?(echuang)
Assignee: nobody → echuang
Status: NEW → ASSIGNED

Yes, I take this.

Flags: needinfo?(echuang)
Keywords: sec-moderate

Keeping ni on me for landing a test case.

Flags: needinfo?(echuang)
Pushed by echuang@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/3f58e5011a65 Nullify WebTaskSchedulerWorker::mWorkerPrivate while Disconnect(). r=smaug
Status: ASSIGNED → RESOLVED
Closed: 10 months ago
Resolution: --- → FIXED
Target Milestone: --- → 123 Branch

Verified bug as fixed on rev mozilla-central 20240115203916-85dba6fcf3e0.
Removing bugmon keyword as no further action possible. Please review the bug and re-add the keyword for further analysis.

Status: RESOLVED → VERIFIED
Keywords: bugmon

Based on comment #1, this bug contains a bisection range found by bugmon. However, the Regressed by field is still not filled.

:edenchuang, if possible, could you fill the Regressed by field?

For more information, please visit BugBot documentation.

Flags: needinfo?(echuang)

See comment 5, I'd say this was regressed ever since it came into existence in bug 1734997.

Regressed by: 1734997

Set release status flags based on info from the regressing bug 1734997

Group: dom-core-security → core-security-release
Flags: needinfo?(echuang)
Flags: needinfo?(echuang)

Comment on attachment 9372841 [details]
Bug 1874080 - Nullify WebTaskSchedulerWorker::mWorkerPrivate while Disconnect(). r=smaug

ESR Uplift Approval Request

  • If this is not a sec:{high,crit} bug, please state case for ESR consideration: WebTaskScheduler keeps a raw pointer to WorkerPrivate after WorkerPrivate is freed.
  • User impact if declined: Nightly user could meet UAF when using WebTaskSchduler in a Worker.
  • Fix Landed on Version: 123
  • Risk to taking this patch: Low
  • Why is the change risky/not risky? (and alternatives if risky): The patch nullifies the raw pointer when releasing the WorkerPrivate.
Flags: needinfo?(echuang)
Attachment #9372841 - Flags: approval-mozilla-esr115?

Comment on attachment 9372841 [details]
Bug 1874080 - Nullify WebTaskSchedulerWorker::mWorkerPrivate while Disconnect(). r=smaug

Approved for 115.8esr.

Attachment #9372841 - Flags: approval-mozilla-esr115? → approval-mozilla-esr115+
Whiteboard: [bugmon:bisected,confirmed] → [bugmon:bisected,confirmed][adv-main123+r]
Whiteboard: [bugmon:bisected,confirmed][adv-main123+r] → [bugmon:bisected,confirmed][adv-main123+r][adv-esr115.8+r]
Group: core-security-release
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: