Closed Bug 1729434 Opened 3 years ago Closed 3 years ago

Deadlock with the nativeallocation feature enabled

Categories

(Core :: Gecko Profiler, defect, P2)

RESOLVED FIXED
94 Branch
Tracking Status
firefox-esr78 --- unaffected
firefox-esr91 --- unaffected
firefox92 --- unaffected
firefox93 --- wontfix
firefox94 --- fixed

People

(Reporter: florian, Assigned: mozbugz)

References

(Regression)

Details

(Keywords: regression)

Attachments

(2 files)

Stack from lldb:

(lldb) bt
error: need to add support for DW_TAG_base_type 'auto' encoded with DW_ATE = 0x0, bit_size = 0
* thread #1, name = 'GeckoMain', queue = 'com.apple.main-thread', stop reason = signal SIGSTOP
  * frame #0: 0x00007fff7202e062 libsystem_kernel.dylib`__psynch_mutexwait + 10
    frame #1: 0x00007fff720ec917 libsystem_pthread.dylib`_pthread_mutex_firstfit_lock_wait + 83
    frame #2: 0x00007fff720ea937 libsystem_pthread.dylib`_pthread_mutex_firstfit_lock_slow + 222
    frame #3: 0x000000011591d97b libmozglue.dylib`mozilla::detail::MutexImpl::lock() at Mutex_posix.cpp:96:3 [opt]
    frame #4: 0x000000011591d974 libmozglue.dylib`mozilla::detail::MutexImpl::lock(this=<unavailable>) at Mutex_posix.cpp:118 [opt]
    frame #5: 0x000000010b4147fe XUL`profiler_capture_backtrace_into(mozilla::ProfileChunkedBuffer&, mozilla::StackCaptureOptions) [inlined] mozilla::baseprofiler::detail::BaseProfilerMutex::Lock(this=<unavailable>) at BaseProfilerDetail.h:54:35 [opt]
    frame #6: 0x000000010b4147e0 XUL`profiler_capture_backtrace_into(mozilla::ProfileChunkedBuffer&, mozilla::StackCaptureOptions) [inlined] mozilla::baseprofiler::detail::BaseProfilerAutoLock::BaseProfilerAutoLock(this=<unavailable>, aMutex=<unavailable>) at BaseProfilerDetail.h:101 [opt]
    frame #7: 0x000000010b4147e0 XUL`profiler_capture_backtrace_into(mozilla::ProfileChunkedBuffer&, mozilla::StackCaptureOptions) [inlined] mozilla::baseprofiler::detail::BaseProfilerAutoLock::BaseProfilerAutoLock(this=<unavailable>, aMutex=<unavailable>) at BaseProfilerDetail.h:100 [opt]
    frame #8: 0x000000010b4147e0 XUL`profiler_capture_backtrace_into(mozilla::ProfileChunkedBuffer&, mozilla::StackCaptureOptions) [inlined] mozilla::profiler::ThreadRegistration::OnThreadRef::ConstRWOnThreadWithLock::ConstRWOnThreadWithLock(this=<unavailable>, aLockedRWOnThread=<unavailable>, aDataMutex=<unavailable>) at ProfilerThreadRegistration.h:174 [opt]
    frame #9: 0x000000010b4147e0 XUL`profiler_capture_backtrace_into(mozilla::ProfileChunkedBuffer&, mozilla::StackCaptureOptions) [inlined] mozilla::profiler::ThreadRegistration::OnThreadRef::ConstRWOnThreadWithLock::ConstRWOnThreadWithLock(this=<unavailable>, aLockedRWOnThread=<unavailable>, aDataMutex=<unavailable>) at ProfilerThreadRegistration.h:174 [opt]
    frame #10: 0x000000010b4147e0 XUL`profiler_capture_backtrace_into(mozilla::ProfileChunkedBuffer&, mozilla::StackCaptureOptions) [inlined] mozilla::profiler::ThreadRegistration::OnThreadRef::ConstLockedRWOnThread(this=<unavailable>) const at ProfilerThreadRegistration.h:181 [opt]
    frame #11: 0x000000010b4147e0 XUL`profiler_capture_backtrace_into(mozilla::ProfileChunkedBuffer&, mozilla::StackCaptureOptions) [inlined] auto mozilla::profiler::ThreadRegistration::OnThreadRef::WithConstLockedRWOnThread<DoSyncSample(unsigned int, mozilla::profiler::ThreadRegistrationUnlockedReaderAndAtomicRWOnThread const&, mozilla::TimeStamp const&, Registers const&, ProfileBuffer&, mozilla::StackCaptureOptions)::$_15::operator()(mozilla::profiler::ThreadRegistration::OnThreadRef) const::'lambda'(mozilla::profiler::ThreadRegistrationLockedRWOnThread const&)>(this=<unavailable>, aF=<unavailable>)::$_15::operator()(mozilla::profiler::ThreadRegistration::OnThreadRef) const::'lambda'(mozilla::profiler::ThreadRegistrationLockedRWOnThread const&)&&) const at ProfilerThreadRegistration.h:187 [opt]
    frame #12: 0x000000010b4147e0 XUL`profiler_capture_backtrace_into(mozilla::ProfileChunkedBuffer&, mozilla::StackCaptureOptions) [inlined] DoSyncSample(this=<unavailable>, aOnThreadRef=<unavailable>)::$_15::operator()(mozilla::profiler::ThreadRegistration::OnThreadRef) const at platform.cpp:2352 [opt]
    frame #13: 0x000000010b4147e0 XUL`profiler_capture_backtrace_into(mozilla::ProfileChunkedBuffer&, mozilla::StackCaptureOptions) at ProfilerThreadRegistration.h:285 [opt]
    frame #14: 0x000000010b41479b XUL`profiler_capture_backtrace_into(mozilla::ProfileChunkedBuffer&, mozilla::StackCaptureOptions) [inlined] DoSyncSample(aFeatures=<unavailable>, aThreadData=0x0000000115b37120, aNow=<unavailable>, aRegs=0x00007ffee87e3b10, aBuffer=0x00007ffee87e3b50, aCaptureOptions=Full) at platform.cpp:2350 [opt]
    frame #15: 0x000000010b414638 XUL`profiler_capture_backtrace_into(mozilla::ProfileChunkedBuffer&, mozilla::StackCaptureOptions) [inlined] profiler_capture_backtrace_into(this=<unavailable>, aOnThreadRef=<unavailable>)::$_24::operator()(mozilla::profiler::ThreadRegistration::OnThreadRef) const at platform.cpp:5559 [opt]
    frame #16: 0x000000010b414493 XUL`profiler_capture_backtrace_into(mozilla::ProfileChunkedBuffer&, mozilla::StackCaptureOptions) at ProfilerThreadRegistration.h:299 [opt]
    frame #17: 0x000000010b41445e XUL`profiler_capture_backtrace_into(aChunkedBuffer=<unavailable>, aCaptureOptions=Full) at platform.cpp:5542 [opt]
    frame #18: 0x000000010b401d3e XUL`mozilla::profiler::profiler_add_native_allocation_marker(long long, unsigned long) at BaseProfilerMarkersDetail.h:285:9 [opt]
    frame #19: 0x000000010b401c02 XUL`mozilla::profiler::profiler_add_native_allocation_marker(long long, unsigned long) [inlined] mozilla::ProfileBufferBlockIndex AddMarkerToBuffer<mozilla::profiler::profiler_add_native_allocation_marker(long long, unsigned long)::NativeAllocationMarker, long long, unsigned long, mozilla::baseprofiler::BaseProfilerThreadId>(aBuffer=0x0000000115b5e008, aName=0x00007ffee87e8058, aCategory=<unavailable>, aOptions=0x00007ffee87e8018, aPayloadArguments=0x00007ffee87e8000, aPayloadArguments=0x00007ffee87e7ff8, aPayloadArguments=0x00007ffee87e7ff0)::NativeAllocationMarker, long long const&, unsigned long const&, mozilla::baseprofiler::BaseProfilerThreadId const&) at ProfilerMarkers.h:107 [opt]
    frame #20: 0x000000010b401be5 XUL`mozilla::profiler::profiler_add_native_allocation_marker(long long, unsigned long) [inlined] mozilla::ProfileBufferBlockIndex profiler_add_marker<mozilla::profiler::profiler_add_native_allocation_marker(long long, unsigned long)::NativeAllocationMarker, long long, unsigned long, mozilla::baseprofiler::BaseProfilerThreadId>(aName=0x00007ffee87e8058, aCategory=<unavailable>, aOptions=0x00007ffee87e8018, aPayloadArguments=0x00007ffee87e8000, aPayloadArguments=0x00007ffee87e7ff8, aPayloadArguments=0x00007ffee87e7ff0)::NativeAllocationMarker, long long const&, unsigned long const&, mozilla::baseprofiler::BaseProfilerThreadId const&) at ProfilerMarkers.h:142 [opt]
    frame #21: 0x000000010b401bbe XUL`mozilla::profiler::profiler_add_native_allocation_marker(aSize=256, aMemoryAddress=4662464768) at memory_hooks.cpp:111 [opt]
    frame #22: 0x000000010b401952 XUL`mozilla::profiler::AllocCallback(aPtr=0x0000000115e79100, aReqSize=<no summary available>) at memory_hooks.cpp:429:7 [opt]
    frame #23: 0x000000010b4015b7 XUL`replace_moz_arena_realloc(aArena=<no summary available>, aPtr=<unavailable>, aSize=<no summary available>) at memory_hooks.cpp:549:3 [opt]
    frame #24: 0x00000001082014b1 XUL`mozilla::Vector<char, 0ul, js::SystemAllocPolicy>::growStorageBy(unsigned long) [inlined] js_arena_realloc(arena=<no summary available>, p=<unavailable>, bytes=<no summary available>) at Utility.h:400:10 [opt]
    frame #25: 0x00000001082014a9 XUL`mozilla::Vector<char, 0ul, js::SystemAllocPolicy>::growStorageBy(unsigned long) [inlined] char* js_pod_arena_realloc<char>(arena=<no summary available>, prior=<no value available>, oldSize=<no summary available>, newSize=<no summary available>) at Utility.h:605 [opt]
    frame #26: 0x00000001082014a9 XUL`mozilla::Vector<char, 0ul, js::SystemAllocPolicy>::growStorageBy(unsigned long) [inlined] char* js::AllocPolicyBase::maybe_pod_arena_realloc<char>(this=<no summary available>, arenaId=<no summary available>, p=<no value available>, oldSize=<no summary available>, newSize=<no summary available>) at AllocPolicy.h:40 [opt]
    frame #27: 0x00000001082014a9 XUL`mozilla::Vector<char, 0ul, js::SystemAllocPolicy>::growStorageBy(unsigned long) [inlined] char* js::AllocPolicyBase::pod_arena_realloc<char>(this=<no summary available>, arenaId=<no summary available>, p=<no value available>, oldSize=<no summary available>, newSize=<no summary available>) at AllocPolicy.h:53 [opt]
    frame #28: 0x00000001082014a9 XUL`mozilla::Vector<char, 0ul, js::SystemAllocPolicy>::growStorageBy(unsigned long) [inlined] char* js::AllocPolicyBase::pod_realloc<char>(this=<no summary available>, p=<no value available>, oldSize=<no summary available>, newSize=<no summary available>) at AllocPolicy.h:78 [opt]
    frame #29: 0x000000010820149f XUL`mozilla::Vector<char, 0ul, js::SystemAllocPolicy>::growStorageBy(unsigned long) [inlined] mozilla::detail::VectorImpl<char, 0ul, js::SystemAllocPolicy, true>::growTo(aV=<no summary available>, aNewCap=<no summary available>) at Vector.h:209 [opt]
    frame #30: 0x000000010820149f XUL`mozilla::Vector<char, 0ul, js::SystemAllocPolicy>::growStorageBy(this=<no summary available>, aIncr=<no summary available>) at Vector.h:1021 [opt]
    frame #31: 0x000000010c0d12d1 XUL`js::wasm::Code::ensureProfilingLabels(bool) const [inlined] mozilla::Vector<char, 0ul, js::SystemAllocPolicy>::internalEnsureCapacity(this=<no summary available>, aNeeded=<no summary available>) at Vector.h:1363:9 [opt]
    frame #32: 0x000000010c0d12b8 XUL`js::wasm::Code::ensureProfilingLabels(bool) const [inlined] bool mozilla::Vector<char, 0ul, js::SystemAllocPolicy>::append<char>(this=<no summary available>, aInsBegin=<no value available>, aInsEnd=<no value available>) at Vector.h:1383 [opt]
    frame #33: 0x000000010c0d12b8 XUL`js::wasm::Code::ensureProfilingLabels(bool) const [inlined] bool mozilla::Vector<char, 0ul, js::SystemAllocPolicy>::append<char>(this=<no summary available>, aInsBegin=<no value available>, aInsLength=<no summary available>) at Vector.h:1469 [opt]
    frame #34: 0x000000010c0d12b8 XUL`js::wasm::Code::ensureProfilingLabels(this=<no summary available>, profilingEnabled=<no summary available>) const at WasmCode.cpp:1431 [opt]
    frame #35: 0x000000010c149b4c XUL`js::wasm::Realm::ensureProfilingLabels(this=<no summary available>, profilingEnabled=<no summary available>) at WasmRealm.cpp:121:15 [opt]
    frame #36: 0x000000010b925538 XUL`js::GeckoProfilerRuntime::enable(this=<no summary available>, enabled=<no summary available>) at GeckoProfiler.cpp:149:13 [opt]
    frame #37: 0x000000010b92648f XUL`js::EnableContextProfilingStack(cx=<no summary available>, enabled=<no summary available>) at GeckoProfiler.cpp:499:34 [opt] [artificial]
    frame #38: 0x000000010b416044 XUL`mozilla::profiler::ThreadRegistrationLockedRWOnThread::PollJSSampling(this=<no summary available>) at ProfilerThreadRegistrationData.cpp:185:7 [opt]
    frame #39: 0x000000010b43d1a4 XUL`mozilla::detail::RunnableFunction<TriggerPollJSSamplingOnMainThread()::$_31>::Run() [inlined] PollJSSamplingForCurrentThread(this=<no summary available>, aThreadData=<no summary available>)::$_21::operator()(mozilla::profiler::ThreadRegistration::OnThreadRef) const::'lambda'(mozilla::profiler::ThreadRegistrationLockedRWOnThread&)::operator()(mozilla::profiler::ThreadRegistrationLockedRWOnThread&) const at platform.cpp:4727:27 [opt]
    frame #40: 0x000000010b43d19c XUL`mozilla::detail::RunnableFunction<TriggerPollJSSamplingOnMainThread()::$_31>::Run() [inlined] auto mozilla::profiler::ThreadRegistration::OnThreadRef::WithLockedRWOnThread<PollJSSamplingForCurrentThread()::$_21::operator()(mozilla::profiler::ThreadRegistration::OnThreadRef) const::'lambda'(mozilla::profiler::ThreadRegistrationLockedRWOnThread&)>(this=<no summary available>, aF=<no summary available>)::$_21::operator()(mozilla::profiler::ThreadRegistration::OnThreadRef) const::'lambda'(mozilla::profiler::ThreadRegistrationLockedRWOnThread&)&&) at ProfilerThreadRegistration.h:225 [opt]
    frame #41: 0x000000010b43d17e XUL`mozilla::detail::RunnableFunction<TriggerPollJSSamplingOnMainThread()::$_31>::Run() [inlined] PollJSSamplingForCurrentThread(this=<no summary available>, aOnThreadRef=OnThreadRef @ scalar)::$_21::operator()(mozilla::profiler::ThreadRegistration::OnThreadRef) const at platform.cpp:4725 [opt]
    frame #42: 0x000000010b43d17e XUL`mozilla::detail::RunnableFunction<TriggerPollJSSamplingOnMainThread()::$_31>::Run() at ProfilerThreadRegistration.h:285 [opt]
    frame #43: 0x000000010b43d154 XUL`mozilla::detail::RunnableFunction<TriggerPollJSSamplingOnMainThread()::$_31>::Run() [inlined] PollJSSamplingForCurrentThread() at platform.cpp:4723 [opt]
    frame #44: 0x000000010b43d154 XUL`mozilla::detail::RunnableFunction<TriggerPollJSSamplingOnMainThread()::$_31>::Run() [inlined] TriggerPollJSSamplingOnMainThread(this=<no summary available>)::$_31::operator()() const at platform.cpp:4749 [opt]
    frame #45: 0x000000010b43d154 XUL`mozilla::detail::RunnableFunction<TriggerPollJSSamplingOnMainThread()::$_31>::Run(this=<unavailable>) at nsThreadUtils.h:531 [opt]
    frame #46: 0x0000000107856fc8 XUL`mozilla::SchedulerGroup::Runnable::Run(this=<unavailable>) at SchedulerGroup.cpp:144:20 [opt]
    frame #47: 0x00000001078723a5 XUL`mozilla::RunnableTask::Run(this=<unavailable>) at TaskController.cpp:502:16 [opt]
    frame #48: 0x000000010785bf0d XUL`mozilla::TaskController::DoExecuteNextTaskOnlyMainThreadInternal(this=<no summary available>, aProofOfLock=<no summary available>) at TaskController.cpp:805:26 [opt]
    frame #49: 0x000000010785af7a XUL`mozilla::TaskController::ExecuteNextTaskOnlyMainThreadInternal(this=<no summary available>, aProofOfLock=<no summary available>) at TaskController.cpp:641:15 [opt]
    frame #50: 0x000000010785b1c9 XUL`mozilla::TaskController::ProcessPendingMTTask(this=<no summary available>, aMayWait=<no summary available>) at TaskController.cpp:425:36 [opt]
    frame #51: 0x0000000107875b55 XUL`mozilla::detail::RunnableFunction<mozilla::TaskController::InitializeInternal()::$_1>::Run() [inlined] mozilla::TaskController::InitializeInternal(this=<no summary available>)::$_1::operator()() const at TaskController.cpp:138:37 [opt]
    frame #52: 0x0000000107875b44 XUL`mozilla::detail::RunnableFunction<mozilla::TaskController::InitializeInternal()::$_1>::Run(this=<unavailable>) at nsThreadUtils.h:531 [opt]
    frame #53: 0x00000001078672f8 XUL`nsThread::ProcessNextEvent(this=<unavailable>, aMayWait=<no summary available>, aResult=<no summary available>) at nsThread.cpp:1148:16 [opt]
    frame #54: 0x000000010786b8c9 XUL`NS_ProcessNextEvent(aThread=<unavailable>, aMayWait=<no summary available>) at nsThreadUtils.cpp:466:10 [opt]
    frame #55: 0x0000000107e77aa5 XUL`mozilla::ipc::MessagePump::Run(this=<unavailable>, aDelegate=<unavailable>) at MessagePump.cpp:107:5 [opt]
    frame #56: 0x0000000107e22287 XUL`MessageLoop::Run() [inlined] MessageLoop::RunInternal(this=<unavailable>) at message_loop.cc:331:10 [opt]
    frame #57: 0x0000000107e2227b XUL`MessageLoop::Run() [inlined] MessageLoop::RunHandler(this=<unavailable>) at message_loop.cc:324 [opt]
    frame #58: 0x0000000107e2227b XUL`MessageLoop::Run(this=<unavailable>) at message_loop.cc:306 [opt]
    frame #59: 0x000000010a1b8429 XUL`nsBaseAppShell::Run(this=<unavailable>) at nsBaseAppShell.cpp:137:27 [opt]
    frame #60: 0x000000010a22b111 XUL`nsAppShell::Run(this=<unavailable>) at nsAppShell.mm:753:26 [opt]
    frame #61: 0x000000010b6fdfa4 XUL`XRE_RunAppShell() at nsEmbedFunctions.cpp:917:20 [opt]
    frame #62: 0x0000000107e22287 XUL`MessageLoop::Run() [inlined] MessageLoop::RunInternal(this=<unavailable>) at message_loop.cc:331:10 [opt]
    frame #63: 0x0000000107e2227b XUL`MessageLoop::Run() [inlined] MessageLoop::RunHandler(this=<unavailable>) at message_loop.cc:324 [opt]
    frame #64: 0x0000000107e2227b XUL`MessageLoop::Run(this=<unavailable>) at message_loop.cc:306 [opt]
    frame #65: 0x000000010b6fdd64 XUL`XRE_InitChildProcess(aArgc=<no summary available>, aArgv=<no summary available>, aChildData=<no summary available>) at nsEmbedFunctions.cpp:749:34 [opt]
    frame #66: 0x0000000107419f3f plugin-container`main [inlined] content_process_main(bootstrap=<unavailable>, argc=<no summary available>, argv=<no summary available>) at plugin-container.cpp:57:28 [opt]
    frame #67: 0x0000000107419f16 plugin-container`main(argc=<no summary available>, argv=<no summary available>) at MozillaRuntimeMain.cpp:72 [opt]
    frame #68: 0x00007fff71eeacc9 libdyld.dylib`start + 1

Could this be a regression from bug 1721939?

Thanks for this. Looking at these two frames:

    frame #11: 0x000000010b4147e0 XUL`profiler_capture_backtrace_into(mozilla::ProfileChunkedBuffer&, mozilla::StackCaptureOptions) [inlined] auto mozilla::profiler::ThreadRegistration::OnThreadRef::WithConstLockedRWOnThread<DoSyncSample(unsigned int, mozilla::profiler::ThreadRegistrationUnlockedReaderAndAtomicRWOnThread const&, mozilla::TimeStamp const&, Registers const&, ProfileBuffer&, mozilla::StackCaptureOptions)::$_15::operator()(mozilla::profiler::ThreadRegistration::OnThreadRef) const::'lambda'(mozilla::profiler::ThreadRegistrationLockedRWOnThread const&)>(this=<unavailable>, aF=<unavailable>)::$_15::operator()(mozilla::profiler::ThreadRegistration::OnThreadRef) const::'lambda'(mozilla::profiler::ThreadRegistrationLockedRWOnThread const&)&&) const at ProfilerThreadRegistration.h:187 [opt]
    frame #38: 0x000000010b416044 XUL`mozilla::profiler::ThreadRegistrationLockedRWOnThread::PollJSSampling(this=<no summary available>) at ProfilerThreadRegistrationData.cpp:185:7 [opt]

Frame #38 locks the per-thread mutex, and then, still inside that scope and on the same thread, frame #11 tries to lock it again, so the thread waits on itself forever.

> Could this be a regression from bug 1721939?

Close! It's actually its follow-up, bug 1722261.
That bug started using the new mutexes, but the "native allocations" code doesn't check them (yet) the way it checks the main profiler mutex.

I think we'll need to add these mutexes to profiler_is_locked_on_current_thread.
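
To illustrate the idea, here is a minimal standalone sketch, not the actual Gecko code. Every name in it (OwnerTrackingMutex, gMainMutex, gPerThreadMutex, IsAnyProfilerLockHeldOnCurrentThread, OnAllocation, AllocateWhileHoldingPerThreadLock) is hypothetical. It only assumes that each mutex can report whether the current thread owns it, so that the allocation hook can skip its work instead of locking the same mutex a second time.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>
#include <thread>

// Hypothetical stand-in for a profiler mutex that remembers which thread owns it.
class OwnerTrackingMutex {
 public:
  void Lock() {
    mMutex.lock();
    mOwner.store(std::this_thread::get_id(), std::memory_order_relaxed);
  }

  void Unlock() {
    assert(IsLockedOnCurrentThread());
    mOwner.store(std::thread::id{}, std::memory_order_relaxed);
    mMutex.unlock();
  }

  bool IsLockedOnCurrentThread() const {
    return mOwner.load(std::memory_order_relaxed) ==
           std::this_thread::get_id();
  }

 private:
  std::mutex mMutex;
  std::atomic<std::thread::id> mOwner{std::thread::id{}};
};

OwnerTrackingMutex gMainMutex;       // models the main profiler mutex
OwnerTrackingMutex gPerThreadMutex;  // models one per-thread data mutex

// The check has to cover every mutex that the allocation hook may end up
// locking, not only the main one.
bool IsAnyProfilerLockHeldOnCurrentThread() {
  return gMainMutex.IsLockedOnCurrentThread() ||
         gPerThreadMutex.IsLockedOnCurrentThread();
}

// Models the allocation hook. Returns true if a marker would be recorded,
// false if the allocation is skipped to avoid locking a mutex twice.
bool OnAllocation() {
  if (IsAnyProfilerLockHeldOnCurrentThread()) {
    return false;
  }
  gPerThreadMutex.Lock();
  // ... capture a backtrace and record the marker here ...
  gPerThreadMutex.Unlock();
  return true;
}

// Models code that allocates while it already holds the per-thread mutex.
bool AllocateWhileHoldingPerThreadLock() {
  gPerThreadMutex.Lock();
  const bool recorded = OnAllocation();
  gPerThreadMutex.Unlock();
  return recorded;
}
```

In this sketch, if IsAnyProfilerLockHeldOnCurrentThread only looked at gMainMutex, AllocateWhileHoldingPerThreadLock would lock gPerThreadMutex a second time on the same thread, which is the shape of the hang in the stack above.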

Assignee: nobody → gsquelart
Severity: -- → S4
Keywords: regression
Priority: -- → P2
Regressed by: 1722261
Has Regression Range: --- → yes

Set release status flags based on info from the regressing bug 1722261

In the following patch, profiler_is_locked_on_current_thread could allocate memory on some platforms (e.g., while accessing TLS on Linux, because the very first access on a thread allocates some memory for it).
So we need to prevent the interception of memory allocations before calling profiler_is_locked_on_current_thread.
This is done by making the ThreadIntercept class RAII: as soon as it is created it blocks nested interceptions, and only then does it call profiler_is_locked_on_current_thread, if necessary, to potentially block even more interceptions.

This has the advantage of making ThreadIntercept safer overall: there is now only one way to use it, which is to create it on the stack and check IsBlocked().
Other functions like Block() and Unblock() previously made it possible to do incorrect things.
We don't need the extra space of Maybe from the old MaybeGet() function anymore.
And ThreadIntercept itself is smaller than AutoBlockIntercepts and its reference to the old ThreadIntercept inside the Maybe.
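
For readers who want the shape of that RAII pattern, here is a rough sketch under stated assumptions. It is not the landed patch: the members (tIsBlocked, mWasBlocked, mIsBlocked), the tProfilerLockHeld flag, IsProfilerLockedOnCurrentThread (a stand-in for profiler_is_locked_on_current_thread), gMarkersRecorded and InterceptedAllocation are all made up for illustration. Only the ThreadIntercept and IsBlocked() names come from the description above.

```cpp
#include <cassert>

// Hypothetical stand-in for profiler_is_locked_on_current_thread(); a real
// implementation would ask the profiler mutexes instead of reading a flag.
thread_local bool tProfilerLockHeld = false;
bool IsProfilerLockedOnCurrentThread() { return tProfilerLockHeld; }

class ThreadIntercept {
 public:
  ThreadIntercept() : mWasBlocked(tIsBlocked) {
    // Block nested interceptions first, so that the lock check below may
    // allocate memory without re-entering the hook.
    tIsBlocked = true;
    mIsBlocked = mWasBlocked || IsProfilerLockedOnCurrentThread();
  }

  // Restore the previous state when the object leaves scope.
  ~ThreadIntercept() { tIsBlocked = mWasBlocked; }

  ThreadIntercept(const ThreadIntercept&) = delete;
  ThreadIntercept& operator=(const ThreadIntercept&) = delete;

  // True if this allocation must not be intercepted.
  bool IsBlocked() const { return mIsBlocked; }

 private:
  static thread_local bool tIsBlocked;
  const bool mWasBlocked;
  bool mIsBlocked;
};

thread_local bool ThreadIntercept::tIsBlocked = false;

int gMarkersRecorded = 0;

// Models an allocation hook. Recording a marker may itself allocate, which
// is simulated here by the nested call.
void InterceptedAllocation() {
  ThreadIntercept intercept;
  if (intercept.IsBlocked()) {
    return;
  }
  ++gMarkersRecorded;
  InterceptedAllocation();  // nested call sees IsBlocked() and returns
}
```

The only way to use the class in this sketch is to create it on the stack and check IsBlocked(); there is no separate Block() or Unblock() call to forget or to pair incorrectly.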

Pushed by gsquelart@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/3870a49f8476
Rework ThreadIntercept to block while calling profiler_is_locked_on_current_thread - r=canaltinova
https://hg.mozilla.org/integration/autoland/rev/24f07c6c463a
Prevent more reentrant deadlocks in nativeallocations - r=canaltinova
Status: NEW → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED
Target Milestone: --- → 94 Branch