Closed Bug 1126576 Opened 9 years ago Closed 7 years ago

Simplify Profiler + Pseudostack lifetime management

Categories

(Core :: Gecko Profiler, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla54
Tracking Status
firefox54 --- fixed

People

(Reporter: mstange, Assigned: n.nethercote)

References

Details

Attachments

(2 files)

Rough notes after a discussion about profiler lifetime management:

 - We'd like to get rid of the PseudoStack's refcount, the mStackCounter == 0 checks in push/pop, and the delete-on-last-pop behavior.
 - In order to do that, we may need to make sure that profiler initialization + shutdown happens in a way that's balanced on the stack.
 - The profiler shutdown call in XPCOMShutdown may be the main cause of our problems; we should find out whether we can just use the destructor of the profiler RAII object in XRE_main for shutting down the profiler.
I have taken a stab at this.

>  - We'd like to get rid of the PseudoStack's refcount, the mStackCounter ==
> 0 checks in push/pop, and the delete-on-last-pop behavior.

I've done this.

>  - In order to do that, we may need to make sure that profiler
> initialization + shutdown happens in a way that's balanced on the stack.
>  - The profiler shutdown call in XPCOMShutdown may be the main cause of our
> problems; we should find out whether we can just use the destructor of the
> profiler RAII object in XRE_main for shutting down the profiler.

I don't understand these points. PseudoStack lifetimes match thread lifetimes. Profiler init/shutdown doesn't seem relevant...
I'm not totally sure about this, as per comment 1. Please review carefully!
Attachment #8832786 - Flags: review?(mstange)
Assignee: nobody → n.nethercote
Status: NEW → ASSIGNED
(In reply to Nicholas Nethercote [:njn] from comment #1)
> >  - In order to do that, we may need to make sure that profiler
> > initialization + shutdown happens in a way that's balanced on the stack.
> >  - The profiler shutdown call in XPCOMShutdown may be the main cause of our
> > problems; we should find out whether we can just use the destructor of the
> > profiler RAII object in XRE_main for shutting down the profiler.
> 
> I don't understand these points. PseudoStack lifetimes match thread
> lifetimes. Profiler init/shutdown doesn't seem relevant...

For the main thread's PseudoStack, your patch still does "PseudoStack* stack = new PseudoStack();" in profiler_init and "delete tlsPseudoStack.get();" inside profiler_shutdown. How is profiler init/shutdown not relevant?
Flags: needinfo?(n.nethercote)
It's possible that your patch does the right thing, but your comment confuses me.
(In reply to Markus Stange [:mstange] from comment #5)
> 
> For the main thread's PseudoStack, your patch still does "PseudoStack* stack
> = new PseudoStack();" in profiler_init and "delete tlsPseudoStack.get();"
> inside profiler_shutdown. How is profiler init/shutdown not relevant?

Oh, I see. That code controls the main thread's PseudoStack. So you're just worried about that PseudoStack possibly not being freed? As you said in comment 0, XRE_main contain a GeckoProfilerInitRAII, and that does call profiler_shutdown() when it's destroyed. So I think this all works out.
Flags: needinfo?(n.nethercote) → needinfo?(mstange)
(In reply to Nicholas Nethercote [:njn] from comment #7)
> Oh, I see. That code controls the main thread's PseudoStack. So you're just
> worried about that PseudoStack possibly not being freed?

Or that it's accessed after it's freed. GeckoProfilerInitRAII has one call to profiler_shutdown(), but XPCOMShutdown has another earlier call. And any SamplerStackFrameRAII objects on the stack have a pointer to the PseudoStack in their mHandle field.
I think that's what comment 0 is referring to when it says "in a way that's balanced on the stack" - if PseudoStack creation / destruction is balanced, then every SamplerStackFrameRAII object is either created before the PseudoStack is (and thus has a null mHandle), or it's created after the PseudoStack has been created and its mHandle pointer is guaranteed to stay valid at least until destruction of the SamplerStackFrameRAII object.
Flags: needinfo?(mstange)
Right, so there is an init/shutdown pair in XRE_main via the GeckoProfilerInitRAII, and then there is another one in NS_InitXPCOM2() and ShutdownXPCOM(). The former encompasses the latter, so it seems like the latter should be removed?
But it's still not clear to me if my patches are ok. I don't think anything you've said indicates they are wrong?
Flags: needinfo?(mstange)
(In reply to Nicholas Nethercote [:njn] from comment #9)
> Right, so there is an init/shutdown pair in XRE_main via the
> GeckoProfilerInitRAII, and then there is another one in NS_InitXPCOM2() and
> ShutdownXPCOM(). The former encompasses the latter, so it seems like the
> latter should be removed?

Yes, probably. Unless there are unforeseen problems with that. For example, when we're dumping the profile to a file on shutdown (when using the MOZ_PROFILER_SHUTDOWN env var), we need to make sure that that's not using parts of XPCOM that have already shut down. That's the only potential problem that I can think of at the moment; it looks like bug 662444 is still a ways off.

(In reply to Nicholas Nethercote [:njn] from comment #10)
> But it's still not clear to me if my patches are ok. I don't think anything
> you've said indicates they are wrong?

It's not clear to me either. I just wanted to address your comments on this bug first. I'll take another look at the patches tomorrow.
Flags: needinfo?(mstange)
(In reply to Nicholas Nethercote [:njn] from comment #10)
> But it's still not clear to me if my patches are ok. I don't think anything
> you've said indicates they are wrong?

I think I have a definite answer to this now. Your patches are good.

The use-after-free that I was worried about is completely orthogonal to your patches. For some reason I was thinking that SamplerStackFrameRAII called addref/deref on its mHandle pseudostack, but that's not the case. So your patches don't change anything there.
The fact that we have a GeckoProfilerInitRAII instance which encompasses all SamplerStackFrameRAIIs is what saves us from the use-after-free.
Comment on attachment 8832786 [details] [diff] [review]
(part 1) - Remove refcounting from PseudoStack

Review of attachment 8832786 [details] [diff] [review]:
-----------------------------------------------------------------

::: tools/profiler/public/GeckoProfiler.h
@@ +311,5 @@
>  class nsISupports;
>  class ProfilerMarkerPayload;
>  
> +// Each thread gets its own PseudoStack on thread creation, which is then
> +// destroyed on thread destruction. tlsPseudoStack is the owning reference.

May want to mention that the mechanism to achieve this is different on the main thread vs non-main threads: main thread has GeckoProfilerInitRAII, other threads have AutoProfilerRegister.
Attachment #8832786 - Flags: review?(mstange) → review+
Comment on attachment 8832787 [details] [diff] [review]
(part 2) - Remove refcounting from Pseudostack

Review of attachment 8832787 [details] [diff] [review]:
-----------------------------------------------------------------

This patch could use a better description.

This piece of code actually makes me a little worried - Application Thread sounds Android-y to me. I hope we're not calling profiler_init() on a non-main thread on Android?
Attachment #8832787 - Flags: review?(mstange) → review+
> This piece of code actually makes me a little worried - Application Thread
> sounds Android-y to me. I hope we're not calling profiler_init() on a
> non-main thread on Android?

The change is a no-op, though I understand you are concerned that the existing code is buggy. With that in mind, profiler_init() is called by the following:

- profiler_start()
- GeckoProfilerInitRAII
- NS_InitXPCOM2
- NS_InitMinimalXPCOM

The only one I can see that might be a concern is the call to profiler_start() in ANRReporter::RequestNativeStack(). I have no idea what that is for.
https://hg.mozilla.org/mozilla-central/rev/5aa8ef17a2cc
https://hg.mozilla.org/mozilla-central/rev/1b4fbaa8d228
Status: ASSIGNED → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla54
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: