Deadlock when profiling and recording power metrics
Categories
(Core :: Gecko Profiler, defect, P2)
Tracking
()
People
(Reporter: padenot, Unassigned)
Details
There are four mutex at play here:
M0 is the thread registry lock from the Firefox Profiler
M1 is the LabeledMirrorLock from Glean
M2 is the lock to record scalars from Telemetry
M3 is the lock for a particular nsTimer
With the notation x -> y
with locks x and y meaning x
was held and y
is now acquired, we have:
M0 -> M1 -> M2 -> M3 -> M0
The stacks are the following:
Mutex M0 acquired by main thread here:
#0 pthread_rwlock_rdlock ??:?
#1 mozilla::detail::RWLockImpl::readLock() ??:?
#2 profiler_record_wakeup_count(nsTSubstring<char> const&) /objdir-ff-tsan/dist/include/mozilla/BaseProfilerDetail.h:224
#3 BaseProfilerAutoLockShared /objdir-ff-tsan/dist/include/mozilla/BaseProfilerDetail.h:266
#4 LockedRegistry /objdir-ff-tsan/dist/include/mozilla/ProfilerThreadRegistry.h:217
#5 profiler_record_wakeup_count(nsTSubstring<char> const&) /tools/profiler/core/platform.cpp:6646
#6 mozilla::glean::RecordPowerMetrics() /toolkit/components/glean/ipc/FOGIPC.cpp:357
#7 mozilla::dom::ContentChild::RecvNotifyProcessPriorityChanged(mozilla::hal::ProcessPriority const&) /dom/ipc/ContentChild.cpp:2802
Mutex M1 acquired here while holding mutex M0 in main thread:
#0 pthread_mutex_lock ??:?
#1 mozilla::detail::MutexImpl::lock() ??:?
#2 Lock /objdir-ff-tsan/dist/include/mozilla/Mutex.h:65
#3 Lock /objdir-ff-tsan/dist/include/mozilla/StaticMutex.h:39
#4 AutoLockBase /objdir-ff-tsan/dist/include/mozilla/DataMutex.h:82
#5 Lock /objdir-ff-tsan/dist/include/mozilla/DataMutex.h:96
#6 GetLabeledMirrorLock /objdir-ff-tsan/dist/include/mozilla/glean/bindings/ScalarGIFFTMap.h:33
#7 mozilla::glean::impl::CounterMetric::Add(int) const /toolkit/components/glean/bindings/private/Counter.cpp:26
#8 mozilla::glean::RecordThreadCpuUse(nsTSubstring<char> const&, unsigned long, unsigned long) /toolkit/components/glean/ipc/FOGIPC.cpp:165
#9 mozilla::profiler::ThreadRegistrationUnlockedConstReaderAndAtomicRW::RecordWakeCount() const /tools/profiler/core/ProfilerThreadRegistrationData.cpp:290
#10 profiler_record_wakeup_count(nsTSubstring<char> const&) /tools/profiler/core/platform.cpp:6650
#11 mozilla::glean::RecordPowerMetrics() /toolkit/components/glean/ipc/FOGIPC.cpp:357
#12 mozilla::dom::ContentChild::RecvNotifyProcessPriorityChanged(mozilla::hal::ProcessPriority const&) /dom/ipc/ContentChild.cpp:2802
Mutex M2 acquired here while holding mutex M1 in main thread:
#0 pthread_mutex_lock ??:?
#1 mozilla::detail::MutexImpl::lock() ??:?
#2 Lock /objdir-ff-tsan/dist/include/mozilla/Mutex.h:65
#3 Lock /objdir-ff-tsan/dist/include/mozilla/StaticMutex.h:39
#4 BaseAutoLock /objdir-ff-tsan/dist/include/mozilla/Mutex.h:236
#5 Add /toolkit/components/telemetry/core/TelemetryScalar.cpp:2632
#6 mozilla::Telemetry::ScalarAdd(mozilla::Telemetry::ScalarID, nsTSubstring<char16_t> const&, unsigned int) /toolkit/components/telemetry/core/Telemetry.cpp:2040
#7 operator()<mozilla::DataMutexBase<mozilla::UniquePtr<nsBaseHashtable<nsIntegralHashKey<unsigned int, 0>, mozilla::Tuple<mozilla::Telemetry::ScalarID, nsTString<char16_t> >, mozilla::Tuple<mozilla::Telemetry::ScalarID, nsTString<char16_t> >, nsDefaultConverter<mozilla::Tuple<mozilla::Telemetry::ScalarID, nsTString<char16_t> >, mozilla::Tuple<mozilla::Telemetry::ScalarID, nsTString<char16_t> > > >, mozilla::DefaultDelete<nsBaseHashtable<nsIntegralHashKey<unsigned int, 0>, mozilla::Tuple<mozilla::Telemetry::ScalarID, nsTString<char16_t> >, mozilla::Tuple<mozilla::Telemetry::ScalarID, nsTString<char16_t> >, nsDefaultConverter<mozilla::Tuple<mozilla::Telemetry::ScalarID, nsTString<char16_t> >, mozilla::Tuple<mozilla::Telemetry::ScalarID, nsTString<char16_t> > > > > >, mozilla::StaticMutexNameless>::AutoLockBase<mozilla::UniquePtr<nsBaseHashtable<nsIntegralHashKey<unsigned int, 0>, mozilla::Tuple<mozilla::Telemetry::ScalarID, nsTString<char16_t> >, mozilla::Tuple<mozilla::Telemetry::ScalarID, nsTString<char16_t> >, nsDefaultConverter<mozilla::Tuple<mozilla::Telemetry::ScalarID, nsTString<char16_t> >, mozilla::Tuple<mozilla::Telemetry::ScalarID, nsTString<char16_t> > > >, mozilla::DefaultDelete<nsBaseHashtable<nsIntegralHashKey<unsigned int, 0>, mozilla::Tuple<mozilla::Telemetry::ScalarID, nsTString<char16_t> >, mozilla::Tuple<mozilla::Telemetry::ScalarID, nsTString<char16_t> >, nsDefaultConverter<mozilla::Tuple<mozilla::Telemetry::ScalarID, nsTString<char16_t> >, mozilla::Tuple<mozilla::Telemetry::ScalarID, nsTString<char16_t> > > > > > > > /toolkit/components/glean/bindings/private/Counter.cpp:29
#8 apply<(lambda at /toolkit/components/glean/bindings/private/Counter.cpp:26:36)> /objdir-ff-tsan/dist/include/mozilla/Maybe.h:602
#9 mozilla::glean::impl::CounterMetric::Add(int) const /toolkit/components/glean/bindings/private/Counter.cpp:26
#10 mozilla::glean::RecordPowerMetrics() /toolkit/components/glean/ipc/FOGIPC.cpp:332
#11 mozilla::dom::ContentChild::RecvNotifyProcessPriorityChanged(mozilla::hal::ProcessPriority const&) /dom/ipc/ContentChild.cpp:2802
Mutex M3 acquired here while holding mutex M2 in main thread:
#0 pthread_mutex_lock ??:?
#1 mozilla::detail::MutexImpl::lock() ??:?
#2 Lock /objdir-ff-tsan/dist/include/mozilla/Mutex.h:65
#3 Lock /objdir-ff-tsan/dist/include/mozilla/StaticMutex.h:39
#4 BaseAutoLock /objdir-ff-tsan/dist/include/mozilla/Mutex.h:236
#5 RemoveTimer /xpcom/threads/nsTimerImpl.cpp:107
#6 InitCommon /xpcom/threads/nsTimerImpl.cpp:395
#7 InitHighResolutionWithNamedFuncCallback /xpcom/threads/nsTimerImpl.cpp:427
#8 InitWithNamedFuncCallback /xpcom/threads/nsTimerImpl.cpp:413
#9 InitWithNamedFuncCallback /xpcom/threads/nsTimerImpl.h:200
#10 DoArmIPCTimerMainThread /toolkit/components/telemetry/core/ipc/TelemetryIPCAccumulator.cpp:88
#11 ArmIPCTimer /toolkit/components/telemetry/core/ipc/TelemetryIPCAccumulator.cpp:103
#12 mozilla::TelemetryIPCAccumulator::RecordChildScalarAction(unsigned int, bool, mozilla::Telemetry::ScalarActionType, mozilla::Variant<unsigned int, bool, nsTString<char16_t> > const&) /toolkit/components/telemetry/core/ipc/TelemetryIPCAccumulator.cpp:186
#13 TelemetryScalar::Add(mozilla::Telemetry::ScalarID, unsigned int) /toolkit/components/telemetry/core/TelemetryScalar.cpp:2595
#14 mozilla::Telemetry::ScalarAdd(mozilla::Telemetry::ScalarID, unsigned int) /toolkit/components/telemetry/core/Telemetry.cpp:2019
#15 mozilla::glean::impl::CounterMetric::Add(int) const /toolkit/components/glean/bindings/private/Counter.cpp:24
#16 mozilla::glean::RecordPowerMetrics() /toolkit/components/glean/ipc/FOGIPC.cpp:331
#17 mozilla::dom::ContentChild::RecvNotifyProcessPriorityChanged(mozilla::hal::ProcessPriority const&) /dom/ipc/ContentChild.cpp:2802
Mutex M0 acquired here while holding mutex M3 in main thread:
#0 pthread_rwlock_rdlock ??:?
#1 mozilla::detail::RWLockImpl::readLock() ??:?
#2 LockShared /objdir-ff-tsan/dist/include/mozilla/BaseProfilerDetail.h:224
#3 BaseProfilerAutoLockShared /objdir-ff-tsan/dist/include/mozilla/BaseProfilerDetail.h:266
#4 LockedRegistry /objdir-ff-tsan/dist/include/mozilla/ProfilerThreadRegistry.h:217
#5 WithOffThreadRefOr<(lambda at /objdir-ff-tsan/dist/include/mozilla/ProfilerThreadState.h:69:7), bool> /objdir-ff-tsan/dist/include/mozilla/ProfilerThreadRegistry.h:270
#6 profiler_thread_is_being_profiled /objdir-ff-tsan/dist/include/mozilla/ProfilerThreadState.h:67
#7 profiler_thread_is_being_profiled_for_markers /objdir-ff-tsan/dist/include/mozilla/ProfilerMarkers.h:129
#8 AddTimer /xpcom/threads/TimerThread.cpp:635
#9 TimerThreadWrapper::AddTimer(nsTimerImpl*, mozilla::detail::BaseAutoLock<mozilla::Mutex&> const&) /xpcom/threads/nsTimerImpl.cpp:100
#10 InitCommon /xpcom/threads/nsTimerImpl.cpp:406
#11 InitHighResolutionWithNamedFuncCallback /xpcom/threads/nsTimerImpl.cpp:427
#12 InitWithNamedFuncCallback /xpcom/threads/nsTimerImpl.cpp:413
#13 InitWithNamedFuncCallback /xpcom/threads/nsTimerImpl.h:200
#14 SetTimerInternal /xpcom/threads/IdleTaskRunner.cpp:274
#15 Schedule /xpcom/threads/IdleTaskRunner.cpp:213
#16 mozilla::ScheduleTimedOut(nsITimer*, void*) /xpcom/threads/IdleTaskRunner.cpp:172
#17 operator() /xpcom/threads/nsTimerImpl.cpp:661
#18 matchN<mozilla::Variant<nsTimerImpl::UnknownCallback, nsCOMPtr<nsITimerCallback>, nsCOMPtr<nsIObserver>, nsTimerImpl::FuncCallback, nsTimerImpl::ClosureCallback> &, (lambda at /xpcom/threads/nsTimerImpl.cpp:661:7), (lambda at /xpcom/threads/nsTimerImpl.cpp:662:7)> /objdir-ff-tsan/dist/include/mozilla/Variant.h:309
#19 matchN<mozilla::Variant<nsTimerImpl::UnknownCallback, nsCOMPtr<nsITimerCallback>, nsCOMPtr<nsIObserver>, nsTimerImpl::FuncCallback, nsTimerImpl::ClosureCallback> &, (lambda at /xpcom/threads/nsTimerImpl.cpp:658:7), (lambda at /xpcom/threads/nsTimerImpl.cpp:661:7), (lambda at /xpcom/threads/nsTimerImpl.cpp:662:7)> /objdir-ff-tsan/dist/include/mozilla/Variant.h:318
#20 matchN<mozilla::Variant<nsTimerImpl::UnknownCallback, nsCOMPtr<nsITimerCallback>, nsCOMPtr<nsIObserver>, nsTimerImpl::FuncCallback, nsTimerImpl::ClosureCallback> &, (lambda at /xpcom/threads/nsTimerImpl.cpp:657:7), (lambda at /xpcom/threads/nsTimerImpl.cpp:658:7), (lambda at /xpcom/threads/nsTimerImpl.cpp:661:7), (lambda at /xpcom/threads/nsTimerImpl.cpp:662:7)> /objdir-ff-tsan/dist/include/mozilla/Variant.h:318
#21 matchN<mozilla::Variant<nsTimerImpl::UnknownCallback, nsCOMPtr<nsITimerCallback>, nsCOMPtr<nsIObserver>, nsTimerImpl::FuncCallback, nsTimerImpl::ClosureCallback> &, (lambda at /xpcom/threads/nsTimerImpl.cpp:656:7), (lambda at /xpcom/threads/nsTimerImpl.cpp:657:7), (lambda at /xpcom/threads/nsTimerImpl.cpp:658:7), (lambda at /xpcom/threads/nsTimerImpl.cpp:661:7), (lambda at /xpcom/threads/nsTimerImpl.cpp:662:7)> /objdir-ff-tsan/dist/include/mozilla/Variant.h:318
#22 matchN<mozilla::Variant<nsTimerImpl::UnknownCallback, nsCOMPtr<nsITimerCallback>, nsCOMPtr<nsIObserver>, nsTimerImpl::FuncCallback, nsTimerImpl::ClosureCallback> &, (lambda at /xpcom/threads/nsTimerImpl.cpp:656:7), (lambda at /xpcom/threads/nsTimerImpl.cpp:657:7), (lambda at /xpcom/threads/nsTimerImpl.cpp:658:7), (lambda at /xpcom/threads/nsTimerImpl.cpp:661:7), (lambda at /xpcom/threads/nsTimerImpl.cpp:662:7)> /objdir-ff-tsan/dist/include/mozilla/Variant.h:902
#23 match<(lambda at /xpcom/threads/nsTimerImpl.cpp:656:7), (lambda at /xpcom/threads/nsTimerImpl.cpp:657:7), (lambda at /xpcom/threads/nsTimerImpl.cpp:658:7), (lambda at /xpcom/threads/nsTimerImpl.cpp:661:7), (lambda at /xpcom/threads/nsTimerImpl.cpp:662:7)> /objdir-ff-tsan/dist/include/mozilla/Variant.h:857
#24 Fire /xpcom/threads/nsTimerImpl.cpp:655
#25 nsTimerEvent::Run() /xpcom/threads/TimerThread.cpp:365
Comment 1•1 year ago
|
||
The severity field is not set for this bug.
:florian, could you have a look please?
For more information, please visit auto_nag documentation.
Updated•1 year ago
|
Updated•1 year ago
|
Interesting, I will take a closer look at it soon but how often do you encounter this?
If I'm not mistaken, it looks like glean and profiler try to access the profiler thread registery mutex at the same time. We could maybe return early on glean if the profiler is running?
Description
•