Massive warning messages for the order of releasing resource of TaskTracer

NEW
Assigned to

Status

()

2 years ago
a year ago

People

(Reporter: sinker, Assigned: sinker)

Tracking

(Blocks: 1 bug)

Trunk
Points:
---

Firefox Tracking Flags

(firefox54 affected)

Details

(Whiteboard: [ps-radar])

Attachments

(1 attachment, 1 obsolete attachment)

(Assignee)

Description

2 years ago
Following messages are given off continuously when TaskTracer is enabled.

[Child 13337] WARNING: Resource acquired is being released in non-LIFO order; why?
: file /home/thinker/progm/mozilla-central/xpcom/threads/BlockingResourceBase.cpp, line 353
--- Mutex : StaticMutex (currently acquired)
 calling context
  [stack trace unavailable]
(Assignee)

Comment 1

2 years ago
Hi Eric,
I have a code that give off massive warning messages caused by |BlockingResourceBase::Release()|.

  https://dxr.mozilla.org/mozilla-central/source/xpcom/threads/BlockingResourceBase.cpp#353

For performance reason, TaskTracer hands over a lock from |sMutex| to |mLogsMutex| as the code here.

  https://dxr.mozilla.org/mozilla-central/source/tools/profiler/tasktracer/GeckoTaskTracer.cpp#259

It would acquire |sMutex| before acquiring |mLogsMutex|, but release |sMutex| and keep |mLogsMutex| locked for the access a part of resources.  It reduces the chance of blocking on single mutex sharing among all thread.

Do you have an suggestion?
Flags: needinfo?(erahm)
Blocks: 1323076
Assignee: nobody → tlee
(Assignee)

Comment 2

2 years ago
Created attachment 8845869 [details] [diff] [review]
Fix warning messages for releasing order
(Assignee)

Comment 3

2 years ago
Markus, would you review this patch?
Flags: needinfo?(erahm) → needinfo?(mstange)
(Assignee)

Comment 4

2 years ago
Created attachment 8845872 [details] [diff] [review]
Fix warning messages for releasing order, v2
Attachment #8845869 - Attachment is obsolete: true

Comment 5

2 years ago
Sorry I didn't get around to looking at this, I need to take a look at your code to figure out what's going on. Deliberately reordering in BlockingResourceBase is a huge red flag (but might be the only option, I just need to take a look at the code). Adding ni? for myself again, I should have time to take a look on Monday.
Flags: needinfo?(erahm)

Comment 6

2 years ago
(In reply to Thinker Li [:sinker] from comment #1)
> Hi Eric,
> I have a code that give off massive warning messages caused by
> |BlockingResourceBase::Release()|.
> 
>  
> https://dxr.mozilla.org/mozilla-central/source/xpcom/threads/
> BlockingResourceBase.cpp#353
> 
> For performance reason, TaskTracer hands over a lock from |sMutex| to
> |mLogsMutex| as the code here.
> 
>  
> https://dxr.mozilla.org/mozilla-central/source/tools/profiler/tasktracer/
> GeckoTaskTracer.cpp#259
> 
> It would acquire |sMutex| before acquiring |mLogsMutex|, but release
> |sMutex| and keep |mLogsMutex| locked for the access a part of resources. 
> It reduces the chance of blocking on single mutex sharing among all thread.
> 
> Do you have an suggestion?

Can't you just reduce the scope of locking |sMutex|? ie:

> TraceInfoHolder
> GetOrCreateTraceInfo()
> {
>   TraceInfo* info = sTraceInfoTLS.get();
>   // sStarted is atomic, shouldn't need the lock
>   ENSURE_TRUE(IsStartLogging(), TraceInfoHolder{});
> 
>   if (info && info->mObsolete) {
>     // TraceInfo is obsolete: remove it.
>     StaticMutexAutoLock lock(sMutex);
>     FreeTraceInfo(info);
>     info = nullptr;
>   }
> 
>   if (!info) {
>     {
>       StaticMutexAutoLock lock(sMutex);
>       info = AllocTraceInfo(Thread::GetCurrentId());
>     }
>     sTraceInfoTLS.set(info); // this is TLS you shouldn't need the lock
>   }
>   // TraceInfoHolder doesn't mess with static stuff, shouldn't need the lock
>   return TraceInfoHolder{info}; // |mLogsMutex| will be held, then
>                                 // ||sMutex| will be released for
>                                 // efficiency reason.
> }
Flags: needinfo?(erahm)
(Assignee)

Comment 7

2 years ago
(In reply to Eric Rahm [:erahm] from comment #6)
> Can't you just reduce the scope of locking |sMutex|? ie:

The gap between two mutices gives a chance to the main thread to free |TraceInfo| after acquired |sMutex|.
It is potentially to crash the thread.

Comment 8

2 years ago
(In reply to Thinker Li [:sinker] from comment #7)
> The gap between two mutices gives a chance to the main thread to free
> |TraceInfo| after acquired |sMutex|.
> It is potentially to crash the thread.

I'll take a more thorough look at the code. It's not clear to me how the main thread would free a TraceInfo stored in another thread's TLS.

Comment 9

2 years ago
Okay so AFAICT the only time this would be an issue is if |ShutdownTaskTracer| runs (presumably on the main thread) at the same time |FreeTraceInfo| is called on another thread. |FreeTraceInfo| is called when an IPC thread or a nsThread shuts down or if the info has been marked as obsolete and |GetOrCreateInfo| is called.

So it comes down to: do you expect nsThreads and IPC threads to still be running when |ShutdownTaskTracer| is invoked?
(Assignee)

Comment 10

2 years ago
(In reply to Eric Rahm [:erahm] from comment #9)
> So it comes down to: do you expect nsThreads and IPC threads to still be
> running when |ShutdownTaskTracer| is invoked?

I don't think so, but I can not control others to add more threads
that would stay alive when leak sanitizer does checking.
Al what to do here is to ease leak sanitizer.
Thinker: in bug 1342306 I made a big change to the profiler. I added gPS, which is a variable holding all the global state, and gPSMutex, which is a mutex protecting it. A lot (but not all) of the places where TaskTracer locks sMutex overlap with the places where gPSMutex is locked. So it might be worth considering removing sMutex and just using gPSMutex instead?

I'm not sure; I don't have a good sense of how TaskTracer works, how often its functions are called, etc.

Comment 12

2 years ago
(In reply to Thinker Li [:sinker] from comment #10)
> (In reply to Eric Rahm [:erahm] from comment #9)
> > So it comes down to: do you expect nsThreads and IPC threads to still be
> > running when |ShutdownTaskTracer| is invoked?
> 
> I don't think so, but I can not control others to add more threads
> that would stay alive when leak sanitizer does checking.
> Al what to do here is to ease leak sanitizer.

Only nsThread and ipc threads are opting in, I think it's okay to require any other threads be shutdown by the time you call |ShutdownTaskTracer|. I'm assuming you mean TSAN rather than LSAN. If you're still seeing warnings I'm okay with just converting the warning in the deadlock detector to an opt-in logging message.
(Assignee)

Comment 13

2 years ago
Nicolas, it is doable, but what is the benefit?

In theory, separated mutices can reduce contention between threads.
Separated mutices also avoid potential problems of recursive lock.
For example, GeckoProfiler run in async with TaskTracer, it seems likely
that GeckoProfiler are triggered for sampling and try to acquired the mutex
while TaskTracer is holding it.

With a different point of view, we are planning to merge buffers used
by TaskTracer and GeckoProfiler.  Then, we need to syncrhonize clients
accessing buffers.  It is a good reason of a common mutex to synchronize
users of buffers.
Whiteboard: [ps-radar]
It was just an idea. No problem if you don't think it will help.
Flags: needinfo?(mstange)
(Assignee)

Comment 15

2 years ago
Comment on attachment 8845872 [details] [diff] [review]
Fix warning messages for releasing order, v2

Review of attachment 8845872 [details] [diff] [review]:
-----------------------------------------------------------------

::: xpcom/threads/BlockingResourceBase.cpp
@@ +373,5 @@
>  }
>  
> +void
> +BlockingResourceBase::SwapTop(BlockingResourceBase& aNext)
> +{

Eric, how do you think to swap the orders here to suppress warning messages?
Attachment #8845872 - Flags: feedback?(erahm)

Updated

a year ago
Attachment #8845872 - Flags: feedback?(erahm)
You need to log in before you can comment on or make changes to this bug.