Closed Bug 1655632 Opened 4 years ago Closed 4 years ago

AddressSanitizer: SEGV /gecko/js/src/vm/JSContext.h:344:37 in realm

Categories

(Core :: DOM: Service Workers, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
84 Branch
Tracking Status
firefox-esr78 --- wontfix
firefox81 --- wontfix
firefox82 --- wontfix
firefox83 --- wontfix
firefox84 --- fixed

People

(Reporter: jkratzer, Assigned: tt)

References

(Blocks 1 open bug)

Details

(Keywords: crash, csectype-nullptr, testcase, Whiteboard: [bugmon:confirm])

Attachments

(2 files)

Found while fuzzing mozilla-central rev f4703bddd567. I have a testcase, but it is not fully reduced. I will attach a Pernosco trace of this testcase shortly.

==27981==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000098 (pc 0x7fe990db026b bp 0x7fe8f4bf4a70 sp 0x7fe8f4bf4a70 T23)
==27981==The signal is caused by a READ memory access.
==27981==Hint: address points to the zero page.
    #0 0x7fe990db026a in realm /gecko/js/src/vm/JSContext.h:344:37
    #1 0x7fe990db026a in JS::CurrentGlobalOrNull(JSContext*) /gecko/js/src/jsapi.cpp:1235:12
    #2 0x7fe98c685adf in mozilla::dom::WorkerJSContext::DispatchToMicroTask(already_AddRefed<mozilla::MicroTaskRunnable>) /gecko/dom/workers/RuntimeService.cpp:913:38
    #3 0x7fe9845f37c4 in mozilla::CycleCollectedJSContext::enqueuePromiseJob(JSContext*, JS::Handle<JSObject*>, JS::Handle<JSObject*>, JS::Handle<JSObject*>, JS::Handle<JSObject*>) /gecko/xpcom/base/CycleCollectedJSContext.cpp:254:3
    #4 0x7fe991230b44 in JSRuntime::enqueuePromiseJob(JSContext*, JS::Handle<JSFunction*>, JS::Handle<JSObject*>, JS::Handle<js::GlobalObject*>) /gecko/js/src/vm/Runtime.cpp:610:24
    #5 0x7fe990ff851e in EnqueuePromiseReactionJob(JSContext*, JS::Handle<JSObject*>, JS::Handle<JS::Value>, JS::PromiseState) /gecko/js/src/builtin/Promise.cpp:1257:25
    #6 0x7fe990fbf406 in operator() /gecko/js/src/builtin/Promise.cpp:1619:12
    #7 0x7fe990fbf406 in ForEachReaction<(lambda at /builds/worker/checkouts/gecko/js/src/builtin/Promise.cpp:1618:44)> /gecko/js/src/builtin/Promise.cpp:1591:12
    #8 0x7fe990fbf406 in TriggerPromiseReactions /gecko/js/src/builtin/Promise.cpp:1618:10
    #9 0x7fe990fbf406 in ResolvePromise(JSContext*, JS::Handle<js::PromiseObject*>, JS::Handle<JS::Value>, JS::PromiseState, JS::Handle<js::SavedFrame*>) /gecko/js/src/builtin/Promise.cpp:1310:10
    #10 0x7fe990ff0cd7 in FulfillMaybeWrappedPromise(JSContext*, JS::Handle<JSObject*>, JS::Handle<JS::Value>) /gecko/js/src/builtin/Promise.cpp:1345:10
    #11 0x7fe990fae843 in ResolvePromiseInternal(JSContext*, JS::Handle<JSObject*>, JS::Handle<JS::Value>) /gecko/js/src/builtin/Promise.cpp:991:12
    #12 0x7fe990fbe9e3 in js::PromiseObject::resolve(JSContext*, JS::Handle<js::PromiseObject*>, JS::Handle<JS::Value>) /gecko/js/src/builtin/Promise.cpp:5614:12
    #13 0x7fe990dcd913 in ResolveOrRejectPromise(JSContext*, JS::Handle<JSObject*>, JS::Handle<JS::Value>, bool) /gecko/js/src/jsapi.cpp:3927:19
    #14 0x7fe98c75bbd3 in mozilla::dom::Promise::MaybeResolve(JSContext*, JS::Handle<JS::Value>) /gecko/dom/promise/Promise.cpp:294:8
    #15 0x7fe98c75d5b4 in void mozilla::dom::Promise::MaybeSomething<JS::Handle<JS::Value> const&>(JS::Handle<JS::Value> const&, void (mozilla::dom::Promise::*)(JSContext*, JS::Handle<JS::Value>)) /builds/worker/workspace/obj-build/dist/include/mozilla/dom/Promise.h:327:5
    #16 0x7fe98c6f407d in operator() /gecko/dom/workers/WorkerScope.cpp:929:27
    #17 0x7fe98c6f407d in InvokeMethod<(lambda at /builds/worker/checkouts/gecko/dom/workers/WorkerScope.cpp:927:16), void ((lambda at /builds/worker/checkouts/gecko/dom/workers/WorkerScope.cpp:927:16)::*)(const mozilla::MozPromise<bool, nsresult, true>::ResolveOrRejectValue &) const, mozilla::MozPromise<bool, nsresult, true>::ResolveOrRejectValue> /builds/worker/workspace/obj-build/dist/include/mozilla/MozPromise.h:553:12
    #18 0x7fe98c6f407d in InvokeCallbackMethod<false, (lambda at /builds/worker/checkouts/gecko/dom/workers/WorkerScope.cpp:927:16), void ((lambda at /builds/worker/checkouts/gecko/dom/workers/WorkerScope.cpp:927:16)::*)(const mozilla::MozPromise<bool, nsresult, true>::ResolveOrRejectValue &) const, mozilla::MozPromise<bool, nsresult, true>::ResolveOrRejectValue, RefPtr<mozilla::MozPromise<bool, nsresult, true>::Private> > /builds/worker/workspace/obj-build/dist/include/mozilla/MozPromise.h:584:5
    #19 0x7fe98c6f407d in mozilla::MozPromise<bool, nsresult, true>::ThenValue<mozilla::dom::ServiceWorkerGlobalScope::SkipWaiting(mozilla::ErrorResult&)::$_2>::DoResolveOrRejectInternal(mozilla::MozPromise<bool, nsresult, true>::ResolveOrRejectValue&) /builds/worker/workspace/obj-build/dist/include/mozilla/MozPromise.h:837:7
    #20 0x7fe985b540a2 in mozilla::MozPromise<bool, nsresult, true>::ThenValueBase::ResolveOrRejectRunnable::Run() /builds/worker/workspace/obj-build/dist/include/mozilla/MozPromise.h:410:21
    #21 0x7fe98c6ea457 in mozilla::dom::(anonymous namespace)::ExternalRunnableWrapper::Cancel() /gecko/dom/workers/WorkerPrivate.cpp:198:22
    #22 0x7fe98c6d8ea4 in mozilla::dom::WorkerRunnable::Run() /gecko/dom/workers/WorkerRunnable.cpp:240:5
    #23 0x7fe9847ff8ac in nsThread::ProcessNextEvent(bool, bool*) /gecko/xpcom/threads/nsThread.cpp:1234:14
    #24 0x7fe9847f96ae in NS_ProcessPendingEvents(nsIThread*, unsigned int) /gecko/xpcom/threads/nsThreadUtils.cpp:461:19
    #25 0x7fe98c6c39bb in mozilla::dom::WorkerPrivate::ClearMainEventQueue(mozilla::dom::WorkerPrivate::WorkerRanOrNot) /gecko/dom/workers/WorkerPrivate.cpp:3573:5
    #26 0x7fe98c684a0f in mozilla::dom::workerinternals::(anonymous namespace)::WorkerThreadPrimaryRunnable::Run() /gecko/dom/workers/RuntimeService.cpp:2239:21
    #27 0x7fe9847ff8ac in nsThread::ProcessNextEvent(bool, bool*) /gecko/xpcom/threads/nsThread.cpp:1234:14
    #28 0x7fe98480a79c in NS_ProcessNextEvent(nsIThread*, bool) /gecko/xpcom/threads/nsThreadUtils.cpp:513:10
    #29 0x7fe985bc2ed4 in mozilla::ipc::MessagePumpForNonMainThreads::Run(base::MessagePump::Delegate*) /gecko/ipc/glue/MessagePump.cpp:332:5
    #30 0x7fe985aa1e97 in RunInternal /gecko/ipc/chromium/src/base/message_loop.cc:334:10
    #31 0x7fe985aa1e97 in RunHandler /gecko/ipc/chromium/src/base/message_loop.cc:327:3
    #32 0x7fe985aa1e97 in MessageLoop::Run() /gecko/ipc/chromium/src/base/message_loop.cc:309:3
    #33 0x7fe9847f8257 in nsThread::ThreadFunc(void*) /gecko/xpcom/threads/nsThread.cpp:447:10
    #34 0x7fe9a9c85d3e in _pt_root /gecko/nsprpub/pr/src/pthreads/ptthread.c:201:5
    #35 0x7fe9a98c76da in start_thread (/lib/x86_64-linux-gnu/libpthread.so.0+0x76da)
    #36 0x7fe9a88a5a3e in clone /build/glibc-2ORdQG/glibc-2.27/misc/../sysdeps/unix/sysv/linux/x86_64/clone.S:95

AddressSanitizer can not provide additional info.
SUMMARY: AddressSanitizer: SEGV /gecko/js/src/vm/JSContext.h:344:37 in realm
Thread T23 (DOM Worker) created by T0 (Web Content) here:
    #0 0x55bde7acca1a in pthread_create /builds/worker/fetches/llvm-project/llvm/projects/compiler-rt/lib/asan/asan_interceptors.cc:209:3
    #1 0x7fe9a9c761e5 in _PR_CreateThread /gecko/nsprpub/pr/src/pthreads/ptthread.c:458:14
    #2 0x7fe9a9c6715e in PR_CreateThread /gecko/nsprpub/pr/src/pthreads/ptthread.c:533:12
    #3 0x7fe9847faf37 in nsThread::Init(nsTSubstring<char> const&) /gecko/xpcom/threads/nsThread.cpp:659:8
    #4 0x7fe98c6e75c7 in mozilla::dom::WorkerThread::Create(mozilla::dom::WorkerThreadFriendKey const&) /gecko/dom/workers/WorkerThread.cpp:94:7
    #5 0x7fe98c65ff7a in mozilla::dom::workerinternals::RuntimeService::ScheduleWorker(mozilla::dom::WorkerPrivate&) /gecko/dom/workers/RuntimeService.cpp:1351:14
    #6 0x7fe98c65ecb4 in mozilla::dom::workerinternals::RuntimeService::RegisterWorker(mozilla::dom::WorkerPrivate&) /gecko/dom/workers/RuntimeService.cpp:1218:19
    #7 0x7fe98c6b913e in mozilla::dom::WorkerPrivate::Constructor(JSContext*, nsTSubstring<char16_t> const&, bool, mozilla::dom::WorkerType, nsTSubstring<char16_t> const&, nsTSubstring<char> const&, mozilla::dom::WorkerLoadInfo*, mozilla::ErrorResult&, nsTString<char16_t>) /gecko/dom/workers/WorkerPrivate.cpp:2420:24
    #8 0x7fe98c6f7482 in mozilla::dom::RemoteWorkerChild::ExecWorkerOnMainThread(mozilla::dom::RemoteWorkerData&&) /gecko/dom/workers/remoteworkers/RemoteWorkerChild.cpp:437:41
    #9 0x7fe98c715869 in operator() /gecko/dom/workers/remoteworkers/RemoteWorkerChild.cpp:298:29
    #10 0x7fe98c715869 in mozilla::detail::RunnableFunction<mozilla::dom::RemoteWorkerChild::ExecWorker(mozilla::dom::RemoteWorkerData const&)::$_2>::Run() /builds/worker/workspace/obj-build/dist/include/nsThreadUtils.h:577:5
    #11 0x7fe9847c427d in mozilla::SchedulerGroup::Runnable::Run() /gecko/xpcom/threads/SchedulerGroup.cpp:146:20
    #12 0x7fe9847ce9e9 in mozilla::RunnableTask::Run() /gecko/xpcom/threads/TaskController.cpp:242:16
    #13 0x7fe9847caed5 in mozilla::TaskController::DoExecuteNextTaskOnlyMainThreadInternal(mozilla::detail::BaseAutoLock<mozilla::Mutex&> const&) /gecko/xpcom/threads/TaskController.cpp:512:26
    #14 0x7fe9847c8d92 in mozilla::TaskController::ExecuteNextTaskOnlyMainThreadInternal(mozilla::detail::BaseAutoLock<mozilla::Mutex&> const&) /gecko/xpcom/threads/TaskController.cpp:371:15
    #15 0x7fe9847c91cf in mozilla::TaskController::ProcessPendingMTTask(bool) /gecko/xpcom/threads/TaskController.cpp:168:36
    #16 0x7fe9847da824 in operator() /gecko/xpcom/threads/TaskController.cpp:86:37
    #17 0x7fe9847da824 in mozilla::detail::RunnableFunction<mozilla::TaskController::InitializeInternal()::$_5>::Run() /builds/worker/workspace/obj-build/dist/include/nsThreadUtils.h:577:5
    #18 0x7fe9847ff8ac in nsThread::ProcessNextEvent(bool, bool*) /gecko/xpcom/threads/nsThread.cpp:1234:14
    #19 0x7fe98480a79c in NS_ProcessNextEvent(nsIThread*, bool) /gecko/xpcom/threads/nsThreadUtils.cpp:513:10
    #20 0x7fe985bc1144 in mozilla::ipc::MessagePump::Run(base::MessagePump::Delegate*) /gecko/ipc/glue/MessagePump.cpp:109:5
    #21 0x7fe985aa1e97 in RunInternal /gecko/ipc/chromium/src/base/message_loop.cc:334:10
    #22 0x7fe985aa1e97 in RunHandler /gecko/ipc/chromium/src/base/message_loop.cc:327:3
    #23 0x7fe985aa1e97 in MessageLoop::Run() /gecko/ipc/chromium/src/base/message_loop.cc:309:3
    #24 0x7fe98cde1ab8 in nsBaseAppShell::Run() /gecko/widget/nsBaseAppShell.cpp:137:27
    #25 0x7fe9909aa6a6 in XRE_RunAppShell() /gecko/toolkit/xre/nsEmbedFunctions.cpp:913:20
    #26 0x7fe985aa1e97 in RunInternal /gecko/ipc/chromium/src/base/message_loop.cc:334:10
    #27 0x7fe985aa1e97 in RunHandler /gecko/ipc/chromium/src/base/message_loop.cc:327:3
    #28 0x7fe985aa1e97 in MessageLoop::Run() /gecko/ipc/chromium/src/base/message_loop.cc:309:3
    #29 0x7fe9909a9c8f in XRE_InitChildProcess(int, char**, XREChildData const*) /gecko/toolkit/xre/nsEmbedFunctions.cpp:744:34
    #30 0x55bde7b14f53 in content_process_main /gecko/browser/app/../../ipc/contentproc/plugin-container.cpp:56:28
    #31 0x55bde7b14f53 in main /gecko/browser/app/nsBrowserApp.cpp:303:18
    #32 0x7fe9a87a5b96 in __libc_start_main /build/glibc-2ORdQG/glibc-2.27/csu/../csu/libc-start.c:310

==27981==ABORTING

A pernosco session is available at the following URL:
https://pernos.co/debug/gdNHeQjOjU4yhS-6LA8bOQ/index.html

Group: core-security → dom-core-security

So we seem to have a null JSContext, which leads to a zero-page read here.

I see plenty of ways for WorkerPrivate* GetCurrentThreadWorkerPrivate() to return nullptr, which in turn makes JSContext* GetCurrentWorkerThreadJSContext() return nullptr. Interestingly, the Pernosco session shows that this is not what happens here (though I am a bit afraid that it could bite us some day, too).

In fact, the Pernosco session shows that WorkerPrivate* GetCurrentThreadWorkerPrivate() does NOT return nullptr. Instead, the mJSContext of the existing WorkerPrivate is nullptr.

The mJSContext of WorkerPrivate is directly set in very few points:

I assume that some event is still in the queue after mJSContext has been set to nullptr. We only assert on cx, so in release builds this nullptr slips through. I'd propose propagating an error instead.
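To illustrate the proposal, here is a minimal sketch of replacing an assert-only check with an explicit null check that returns an error. Everything in it (FakeContext, DispatchStatus, DispatchChecked) is a hypothetical stand-in invented for this example, not the Gecko or SpiderMonkey API.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for JSContext; not the real SpiderMonkey type.
struct FakeContext {
  std::string realmName;
};

// Hypothetical status code standing in for nsresult-style error propagation.
enum class DispatchStatus { Ok, NoContext };

// An assert on cx disappears when NDEBUG is defined (release builds), so a
// null cx would be dereferenced anyway. This variant checks explicitly and
// reports the failure to the caller instead.
inline DispatchStatus DispatchChecked(const FakeContext* cx,
                                      std::string* outRealm) {
  if (!cx) {
    return DispatchStatus::NoContext;  // propagate instead of crashing
  }
  *outRealm = cx->realmName;  // safe: cx was checked above
  return DispatchStatus::Ok;
}
```

A caller would then have to handle DispatchStatus::NoContext, for example by dropping the microtask during shutdown. Whether that is acceptable in the real code is a design question this sketch does not answer.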

Jason, is there any chance of getting a fully reduced, reproducible test case?
Eden, can you take a look?

Flags: needinfo?(jkratzer)
Flags: needinfo?(echuang)
Keywords: bugmon
Bugmon Analysis:
Failed to identify testcase.  Please ensure that the testcase meets the requirements identified here: https://github.com/MozillaSecurity/bugmon#testcase-identification
Removing bugmon keyword as no further action possible.
Please review the bug and re-add the keyword for further analysis.
Assignee: nobody → echuang
Flags: needinfo?(echuang)
Group: dom-core-security

It seems we have no fully reduced testcase here, but we do have the Pernosco session.
Tom, would you mind taking a look to see if something obvious stands out?

Assignee: echuang → nobody
Flags: needinfo?(jkratzer) → needinfo?(ttung)

I saw "The debugging database for this trace has expired (typically 7 days after the trace was collected) and is being rebuilt. This may take up to a couple hours. " in the pernosco session and I will take a closer look once the rebuild is done.

I couldn't find the testcase in the attachments. Jason, would you mind attaching it even if it's not reduced, or pointing me to where it is?

(In reply to Jens Stutte [:jstutte] (REO for FF 81) from comment #2)
Reading through Jens's analysis, I think the only possibility is that DoRunLoop() was executed right before WorkerJSContext::DispatchToMicroTask, which consequently accessed the nulled cx.

Combined with the information in comment #0, the sequence seems to be:

  • WorkerThreadPrimaryRunnable::Run (frame 26 in comment #0)
    • DoRunLoop and set mJSContext to be a nullptr
    • ClearMainEventQueue (frame 25 in comment #0)
      ...
      • CycleCollectedJSContext
        • DispatchToMicroTask (where it accessed the cx from worker and thus crashed)
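The ordering above can be modeled with a small self-contained sketch: a worker clears its context pointer when its run loop exits and only then drains leftover events, so a late event sees a null context. MiniWorker, FakeContext and their methods are hypothetical names made up for this illustration and do not mirror the real WorkerPrivate implementation.

```cpp
#include <cassert>
#include <functional>
#include <queue>
#include <utility>

struct FakeContext {};  // hypothetical stand-in for JSContext

// Hypothetical miniature of a worker that owns an event queue.
class MiniWorker {
 public:
  explicit MiniWorker(FakeContext* cx) : mContext(cx) {}

  FakeContext* GetContext() const { return mContext; }

  void Post(std::function<void()> task) { mQueue.push(std::move(task)); }

  // Models the suspected ordering: the context pointer is cleared first,
  // then events that are still pending get processed.
  void Shutdown() {
    mContext = nullptr;        // analogous to the run loop nulling mJSContext
    while (!mQueue.empty()) {  // analogous to draining the main event queue
      std::function<void()> task = std::move(mQueue.front());
      mQueue.pop();
      task();
    }
  }

 private:
  FakeContext* mContext;
  std::queue<std::function<void()>> mQueue;
};
```

Posting a task that calls GetContext() and then calling Shutdown() makes that task observe nullptr, which is the situation the crashing frame appears to be in.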

I haven't figured out what can cause the NS_ProcessPendingEvents call (frame 24).

So, if my analysis is correct, I reckon that, based on this, the WorkerJSContext is still alive.

The question is whether it's reasonable to change the way DispatchToMicroTask accesses the WorkerJSContext, or to defer the point at which the worker nulls mJSContext, or, as Jens suggested in comment #2, to accept that the context can be nullptr in some edge cases and propagate the error in release builds for such cases.

I will need to read through the worker shutdown code to get a clearer picture of this.

Assignee: nobody → ttung
Flags: needinfo?(ttung) → needinfo?(jkratzer)

(In reply to Tom Tung [:tt, :ttung] from comment #5)

The question is whether it's reasonable to change the way DispatchToMicroTask accesses the WorkerJSContext, or to defer the point at which the worker nulls mJSContext, or, as Jens suggested in comment #2, to accept that the context can be nullptr in some edge cases and propagate the error in release builds for such cases.

To elaborate on "change the way to access the WorkerJSContext in DispatchToMicroTask": I meant that maybe we can use the aCx here rather than getting the JSContext again from GetCurrentWorkerThreadJSContext (https://searchfox.org/mozilla-central/rev/23dd7c485a6525bb3a974abeaafaf34bfb43d76b/dom/workers/RuntimeService.cpp#914), since it's possible that the cx has already been set to null by the worker.
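A rough sketch of the difference between the two lookup paths, under the assumption that the context object outlives the worker's cached pointer. All names here (FakeContext, FakeWorker, CurrentWorker, ContextOwner) are hypothetical and only imitate the shape of the problem. This is not the actual patch.

```cpp
#include <cassert>

struct FakeContext {  // hypothetical stand-in for JSContext
  int id;
};

struct FakeWorker {  // hypothetical stand-in for the worker object
  FakeContext* mContext = nullptr;
};

// Hypothetical per-thread "current worker" slot, imitating a global lookup.
inline FakeWorker*& CurrentWorker() {
  static thread_local FakeWorker* current = nullptr;
  return current;
}

// Global-style getter: goes through the worker's cached pointer.
inline FakeContext* GetContextViaGlobal() {
  FakeWorker* worker = CurrentWorker();
  return worker ? worker->mContext : nullptr;
}

// Hypothetical owner of the context, imitating a context wrapper class.
class ContextOwner {
 public:
  explicit ContextOwner(int id) : mContext{id} {}

  FakeContext* Context() { return &mContext; }  // member getter

  FakeContext* DispatchViaGlobal() { return GetContextViaGlobal(); }
  FakeContext* DispatchViaMember() { return Context(); }

 private:
  FakeContext mContext;
};
```

Once the worker's cached pointer is cleared, DispatchViaGlobal() yields nullptr while DispatchViaMember() still returns the owner's live context, which is the property the suggestion relies on.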

Attached file testcase.zip

Apologies for the delay. The attached testcase does not trigger 100% of the time, and it becomes less reliable the further it is reduced. However, it should be stable enough to test any patches you may have.

The easiest way to replay this testcase is with grizzly.replay:

unzip testcase.zip
pip install grizzly-framework
python3 -m grizzly.replay --xvfb --repeat 10 ~/builds/mc-asan/firefox ./
Flags: needinfo?(jkratzer)
Attachment #9170974 - Attachment description: Bug 1655632 - Use the member JSContext rather than a common utility function; → Bug 1655632 - Use the member JSContext getter rather than a global JSContext getter;

(In reply to Jason Kratzer [:jkratzer] from comment #9)

Created attachment 9171041 [details]
testcase.zip

Thanks! Without the patch, I can reproduce the issue within the first ten runs; with the patch, I cannot reproduce it in more than thirty runs.

Status: NEW → ASSIGNED
Pushed by bugmail@asutherland.org:
https://hg.mozilla.org/integration/autoland/rev/6fb0d10298d6
Use the member JSContext getter rather than a global JSContext getter; r=dom-workers-and-storage-reviewers,asuth
Status: ASSIGNED → RESOLVED
Closed: 4 years ago
Resolution: --- → FIXED
Target Milestone: --- → 84 Branch