Closed Bug 1049068 Opened 10 years ago Closed 10 years ago

Remove Callstack from DeadlockDetector

Categories

(Core :: XPCOM, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla34

People

(Reporter: erahm, Assigned: erahm)

References

Details

Attachments

(6 files)

The callstack reference [1] is only useful (although quite costly memory / performance wise) in tracemalloc builds, which are going away (bug 1014341). This should also save use 2*sizeof(void*) bytes per blocking resource.

Removing the callstack also has the arguably more important benefit of being able to bypass grabbing the DeadlockDetector mutex for instances where this is the only lock the thread is acquiring [2].

[1] http://hg.mozilla.org/mozilla-central/annotate/71497ed2e0db/xpcom/glue/DeadlockDetector.h#l27
[2] http://hg.mozilla.org/mozilla-central/annotate/71497ed2e0db/xpcom/glue/DeadlockDetector.h#l414
This removes printing of callstacks and preserves the current "stack trace unavailable" messaging.
Attachment #8468852 - Flags: review?(nfroyd)
Assignee: nobody → erahm
Status: NEW → ASSIGNED
This switches over from using a callstack to indicate if a a resource is locked or not to just using a bool.
Attachment #8468853 - Flags: review?(nfroyd)
This removes the unused mCallContext from ResourceAcquisition.
Attachment #8468854 - Flags: review?(nfroyd)
This removes the unused mFirstSeen from OrderingEntry. This allows us to move the early exit from CheckAcquisition to before the lock, which in theory could improve performance.
Attachment #8468855 - Flags: review?(nfroyd)
This removes unused CallStack params from various functions.
Attachment #8468858 - Flags: review?(nfroyd)
This removes CallStack now that it is no longer referenced.
Attachment #8468859 - Flags: review?(nfroyd)
(In reply to Eric Rahm [:erahm] from comment #0)
> The callstack reference [1] is only useful (although quite costly memory /
> performance wise) in tracemalloc builds, which are going away (bug 1014341).
> This should also save use 2*sizeof(void*) bytes per blocking resource.

Just to make sure I understand this number, that comes from not needing to store the call stack in the OrderingEntry (sizeof(void*)) and in BlockingResourceBase (sizeof(void*) to bool, subject to alignment concerns), is that correct?

> Removing the callstack also has the arguably more important benefit of being
> able to bypass grabbing the DeadlockDetector mutex for instances where this
> is the only lock the thread is acquiring [2].

Less mutex contention is a good thing...assuming that this mutex is a bottleneck.  Do you see measurable performance/scalability wins from this?

My concern here is that being able to see callstacks for this sort of stuff is extremely useful.  Removing that completely sounds like it's going to make the deadlock detector significantly less useful.  (Admittedly, the ability to see the callstacks was limited to an off-the-beaten-path build, so I don't know how useful it was in practice.)  It seems like the better path would be to use whatever DMD provides to replace trace-malloc's functionality in this context and/or to provide some bits that would let use save space and/or perform the code-hoisting in part 4 without needing to entirely gut the stack trace functionality.
Flags: needinfo?(erahm)
(In reply to Nathan Froyd (:froydnj) from comment #8)
> (In reply to Eric Rahm [:erahm] from comment #0)
> > The callstack reference [1] is only useful (although quite costly memory /
> > performance wise) in tracemalloc builds, which are going away (bug 1014341).
> > This should also save use 2*sizeof(void*) bytes per blocking resource.
> 
> Just to make sure I understand this number, that comes from not needing to
> store the call stack in the OrderingEntry (sizeof(void*)) and in
> BlockingResourceBase (sizeof(void*) to bool, subject to alignment concerns),
> is that correct?
> 

Yes, this was before I realized I need the bool to preserve the assertions.

> > Removing the callstack also has the arguably more important benefit of being
> > able to bypass grabbing the DeadlockDetector mutex for instances where this
> > is the only lock the thread is acquiring [2].
> 
> Less mutex contention is a good thing...assuming that this mutex is a
> bottleneck.  Do you see measurable performance/scalability wins from this?
> 

I don't have any measurements, I'm going on the hunch that having to grab the one-lock-to-rule-them-all every time you want to uses a mutex is going to impact performance. I think I can add some instrumentation to see how often there's contention. I can also add a multi-threaded scalability test which should give us some numbers to play with.

> My concern here is that being able to see callstacks for this sort of stuff
> is extremely useful.  Removing that completely sounds like it's going to
> make the deadlock detector significantly less useful.  (Admittedly, the
> ability to see the callstacks was limited to an off-the-beaten-path build,
> so I don't know how useful it was in practice.)  It seems like the better
> path would be to use whatever DMD provides to replace trace-malloc's
> functionality in this context and/or to provide some bits that would let use
> save space and/or perform the code-hoisting in part 4 without needing to
> entirely gut the stack trace functionality.

I agree, my goal is to add callstacks back in a more usable fashion after the detemplatization work in bug 1049051. I'd probably do so under a #define guard and store the references in BlockingResourceBase itself. I'll go ahead and file that follow up.
Flags: needinfo?(erahm)
Blocks: 1050445
Attachment #8468852 - Flags: review?(nfroyd) → review+
Attachment #8468853 - Flags: review?(nfroyd) → review+
Attachment #8468854 - Flags: review?(nfroyd) → review+
Attachment #8468855 - Flags: review?(nfroyd) → review+
Attachment #8468858 - Flags: review?(nfroyd) → review+
Comment on attachment 8468859 [details] [diff] [review]
Part 6: Remove CallStack

Review of attachment 8468859 [details] [diff] [review]:
-----------------------------------------------------------------

::: xpcom/glue/DeadlockDetector.cpp
@@ -6,5 @@
>  
>  #include "DeadlockDetector.h"
> -
> -namespace mozilla {
> -const CallStack CallStack::kNone((CallStack::callstack_id)-1);

Might as well hg rm this file now that you've deleted this.
Attachment #8468859 - Flags: review?(nfroyd) → review+
The great thing about removal patches is that there's very little for the reviewer to argue with. :)

I would still like to see some numbers.  I can imagine that for stress tests, not having to acquire the detector's mutex is a win, but I can equally imagine that for typical browser things, that mutex is uncontended most of the time.  And acquiring uncontended mutexes, while overhead, is generally pretty fast.
(In reply to Nathan Froyd (:froydnj) from comment #10)
> Comment on attachment 8468859 [details] [diff] [review]
> Part 6: Remove CallStack
> 
> Review of attachment 8468859 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> ::: xpcom/glue/DeadlockDetector.cpp
> @@ -6,5 @@
> >  
> >  #include "DeadlockDetector.h"
> > -
> > -namespace mozilla {
> > -const CallStack CallStack::kNone((CallStack::callstack_id)-1);
> 
> Might as well hg rm this file now that you've deleted this.

Good call.

> I would still like to see some numbers.  I can imagine that for stress
> tests, not having to acquire the detector's mutex is a win, but I can
> equally imagine that for typical browser things, that mutex is uncontended
> most of the time.  And acquiring uncontended mutexes, while overhead, is
> generally pretty fast.

I added instrumentation to CheckAcquisition and did a basic start up Fx, open gmail, quit.

m-c w/o patches:
  number of contentions = 30,017
  number of times we bailed early = 807,616

m-c w/ patches:
  number of contentions = 5,831
  number of times we bailed early = 652,627

I'm claiming victory on this one.
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: