Closed Bug 1342839 Opened 7 years ago Closed 3 years ago

Massive warning messages for the order of releasing resource of TaskTracer

Categories

(Core :: Gecko Profiler, defect)

defect
Not set
normal

Tracking

()

RESOLVED WONTFIX
Tracking Status
firefox54 --- affected

People

(Reporter: sinker, Assigned: sinker)

References

Details

(Whiteboard: [ps-radar])

Attachments

(1 file, 1 obsolete file)

Following messages are given off continuously when TaskTracer is enabled.

[Child 13337] WARNING: Resource acquired is being released in non-LIFO order; why?
: file /home/thinker/progm/mozilla-central/xpcom/threads/BlockingResourceBase.cpp, line 353
--- Mutex : StaticMutex (currently acquired)
 calling context
  [stack trace unavailable]
Hi Eric,
I have a code that give off massive warning messages caused by |BlockingResourceBase::Release()|.

  https://dxr.mozilla.org/mozilla-central/source/xpcom/threads/BlockingResourceBase.cpp#353

For performance reason, TaskTracer hands over a lock from |sMutex| to |mLogsMutex| as the code here.

  https://dxr.mozilla.org/mozilla-central/source/tools/profiler/tasktracer/GeckoTaskTracer.cpp#259

It would acquire |sMutex| before acquiring |mLogsMutex|, but release |sMutex| and keep |mLogsMutex| locked for the access a part of resources.  It reduces the chance of blocking on single mutex sharing among all thread.

Do you have an suggestion?
Flags: needinfo?(erahm)
Assignee: nobody → tlee
Markus, would you review this patch?
Flags: needinfo?(erahm) → needinfo?(mstange)
Attachment #8845869 - Attachment is obsolete: true
Sorry I didn't get around to looking at this, I need to take a look at your code to figure out what's going on. Deliberately reordering in BlockingResourceBase is a huge red flag (but might be the only option, I just need to take a look at the code). Adding ni? for myself again, I should have time to take a look on Monday.
Flags: needinfo?(erahm)
(In reply to Thinker Li [:sinker] from comment #1)
> Hi Eric,
> I have a code that give off massive warning messages caused by
> |BlockingResourceBase::Release()|.
> 
>  
> https://dxr.mozilla.org/mozilla-central/source/xpcom/threads/
> BlockingResourceBase.cpp#353
> 
> For performance reason, TaskTracer hands over a lock from |sMutex| to
> |mLogsMutex| as the code here.
> 
>  
> https://dxr.mozilla.org/mozilla-central/source/tools/profiler/tasktracer/
> GeckoTaskTracer.cpp#259
> 
> It would acquire |sMutex| before acquiring |mLogsMutex|, but release
> |sMutex| and keep |mLogsMutex| locked for the access a part of resources. 
> It reduces the chance of blocking on single mutex sharing among all thread.
> 
> Do you have an suggestion?

Can't you just reduce the scope of locking |sMutex|? ie:

> TraceInfoHolder
> GetOrCreateTraceInfo()
> {
>   TraceInfo* info = sTraceInfoTLS.get();
>   // sStarted is atomic, shouldn't need the lock
>   ENSURE_TRUE(IsStartLogging(), TraceInfoHolder{});
> 
>   if (info && info->mObsolete) {
>     // TraceInfo is obsolete: remove it.
>     StaticMutexAutoLock lock(sMutex);
>     FreeTraceInfo(info);
>     info = nullptr;
>   }
> 
>   if (!info) {
>     {
>       StaticMutexAutoLock lock(sMutex);
>       info = AllocTraceInfo(Thread::GetCurrentId());
>     }
>     sTraceInfoTLS.set(info); // this is TLS you shouldn't need the lock
>   }
>   // TraceInfoHolder doesn't mess with static stuff, shouldn't need the lock
>   return TraceInfoHolder{info}; // |mLogsMutex| will be held, then
>                                 // ||sMutex| will be released for
>                                 // efficiency reason.
> }
Flags: needinfo?(erahm)
(In reply to Eric Rahm [:erahm] from comment #6)
> Can't you just reduce the scope of locking |sMutex|? ie:

The gap between two mutices gives a chance to the main thread to free |TraceInfo| after acquired |sMutex|.
It is potentially to crash the thread.
(In reply to Thinker Li [:sinker] from comment #7)
> The gap between two mutices gives a chance to the main thread to free
> |TraceInfo| after acquired |sMutex|.
> It is potentially to crash the thread.

I'll take a more thorough look at the code. It's not clear to me how the main thread would free a TraceInfo stored in another thread's TLS.
Okay so AFAICT the only time this would be an issue is if |ShutdownTaskTracer| runs (presumably on the main thread) at the same time |FreeTraceInfo| is called on another thread. |FreeTraceInfo| is called when an IPC thread or a nsThread shuts down or if the info has been marked as obsolete and |GetOrCreateInfo| is called.

So it comes down to: do you expect nsThreads and IPC threads to still be running when |ShutdownTaskTracer| is invoked?
(In reply to Eric Rahm [:erahm] from comment #9)
> So it comes down to: do you expect nsThreads and IPC threads to still be
> running when |ShutdownTaskTracer| is invoked?

I don't think so, but I can not control others to add more threads
that would stay alive when leak sanitizer does checking.
Al what to do here is to ease leak sanitizer.
Thinker: in bug 1342306 I made a big change to the profiler. I added gPS, which is a variable holding all the global state, and gPSMutex, which is a mutex protecting it. A lot (but not all) of the places where TaskTracer locks sMutex overlap with the places where gPSMutex is locked. So it might be worth considering removing sMutex and just using gPSMutex instead?

I'm not sure; I don't have a good sense of how TaskTracer works, how often its functions are called, etc.
(In reply to Thinker Li [:sinker] from comment #10)
> (In reply to Eric Rahm [:erahm] from comment #9)
> > So it comes down to: do you expect nsThreads and IPC threads to still be
> > running when |ShutdownTaskTracer| is invoked?
> 
> I don't think so, but I can not control others to add more threads
> that would stay alive when leak sanitizer does checking.
> Al what to do here is to ease leak sanitizer.

Only nsThread and ipc threads are opting in, I think it's okay to require any other threads be shutdown by the time you call |ShutdownTaskTracer|. I'm assuming you mean TSAN rather than LSAN. If you're still seeing warnings I'm okay with just converting the warning in the deadlock detector to an opt-in logging message.
Nicolas, it is doable, but what is the benefit?

In theory, separated mutices can reduce contention between threads.
Separated mutices also avoid potential problems of recursive lock.
For example, GeckoProfiler run in async with TaskTracer, it seems likely
that GeckoProfiler are triggered for sampling and try to acquired the mutex
while TaskTracer is holding it.

With a different point of view, we are planning to merge buffers used
by TaskTracer and GeckoProfiler.  Then, we need to syncrhonize clients
accessing buffers.  It is a good reason of a common mutex to synchronize
users of buffers.
Whiteboard: [ps-radar]
It was just an idea. No problem if you don't think it will help.
Flags: needinfo?(mstange)
Comment on attachment 8845872 [details] [diff] [review]
Fix warning messages for releasing order, v2

Review of attachment 8845872 [details] [diff] [review]:
-----------------------------------------------------------------

::: xpcom/threads/BlockingResourceBase.cpp
@@ +373,5 @@
>  }
>  
> +void
> +BlockingResourceBase::SwapTop(BlockingResourceBase& aNext)
> +{

Eric, how do you think to swap the orders here to suppress warning messages?
Attachment #8845872 - Flags: feedback?(erahm)
Attachment #8845872 - Flags: feedback?(erahm)

Task Tracer was removed in bug 1715257.

Status: NEW → RESOLVED
Closed: 3 years ago
Resolution: --- → WONTFIX
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: