Simplify Profiler + Pseudostack lifetime management

RESOLVED FIXED in Firefox 54

Status

()

Core
Gecko Profiler
RESOLVED FIXED
2 years ago
3 months ago

People

(Reporter: mstange, Assigned: njn)

Tracking

(Blocks: 1 bug)

Trunk
mozilla54
Points:
---

Firefox Tracking Flags

(firefox54 fixed)

Details

Attachments

(2 attachments)

(Reporter)

Description

2 years ago
Rough notes after a discussion about profiler lifetime management:

 - We'd like to get rid of the PseudoStack's refcount, the mStackCounter == 0 checks in push/pop, and the delete-on-last-pop behavior.
 - In order to do that, we may need to make sure that profiler initialization + shutdown happens in a way that's balanced on the stack.
 - The profiler shutdown call in XPCOMShutdown may be the main cause of our problems; we should find out whether we can just use the destructor of the profiler RAII object in XRE_main for shutting down the profiler.
(Reporter)

Updated

4 months ago
Blocks: 1328363
(Assignee)

Comment 1

3 months ago
I have taken a stab at this.

>  - We'd like to get rid of the PseudoStack's refcount, the mStackCounter ==
> 0 checks in push/pop, and the delete-on-last-pop behavior.

I've done this.

>  - In order to do that, we may need to make sure that profiler
> initialization + shutdown happens in a way that's balanced on the stack.
>  - The profiler shutdown call in XPCOMShutdown may be the main cause of our
> problems; we should find out whether we can just use the destructor of the
> profiler RAII object in XRE_main for shutting down the profiler.

I don't understand these points. PseudoStack lifetimes match thread lifetimes. Profiler init/shutdown doesn't seem relevant...
(Assignee)

Comment 2

3 months ago
Created attachment 8832786 [details] [diff] [review]
(part 1) - Remove refcounting from PseudoStack

I'm not totally sure about this, as per comment 1. Please review carefully!
Attachment #8832786 - Flags: review?(mstange)
(Assignee)

Updated

3 months ago
Assignee: nobody → n.nethercote
Status: NEW → ASSIGNED
(Assignee)

Comment 3

3 months ago
Created attachment 8832787 [details] [diff] [review]
(part 2) - Remove refcounting from Pseudostack
Attachment #8832787 - Flags: review?(mstange)
(Assignee)

Comment 4

3 months ago
Try looks good:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=52043c9f13c5fd23537af2b5420f2a7ec01f443d&selectedJob=73803504
(Reporter)

Comment 5

3 months ago
(In reply to Nicholas Nethercote [:njn] from comment #1)
> >  - In order to do that, we may need to make sure that profiler
> > initialization + shutdown happens in a way that's balanced on the stack.
> >  - The profiler shutdown call in XPCOMShutdown may be the main cause of our
> > problems; we should find out whether we can just use the destructor of the
> > profiler RAII object in XRE_main for shutting down the profiler.
> 
> I don't understand these points. PseudoStack lifetimes match thread
> lifetimes. Profiler init/shutdown doesn't seem relevant...

For the main thread's PseudoStack, your patch still does "PseudoStack* stack = new PseudoStack();" in profiler_init and "delete tlsPseudoStack.get();" inside profiler_shutdown. How is profiler init/shutdown not relevant?
Flags: needinfo?(n.nethercote)
(Reporter)

Comment 6

3 months ago
It's possible that your patch does the right thing, but your comment confuses me.
(Assignee)

Comment 7

3 months ago
(In reply to Markus Stange [:mstange] from comment #5)
> 
> For the main thread's PseudoStack, your patch still does "PseudoStack* stack
> = new PseudoStack();" in profiler_init and "delete tlsPseudoStack.get();"
> inside profiler_shutdown. How is profiler init/shutdown not relevant?

Oh, I see. That code controls the main thread's PseudoStack. So you're just worried about that PseudoStack possibly not being freed? As you said in comment 0, XRE_main contain a GeckoProfilerInitRAII, and that does call profiler_shutdown() when it's destroyed. So I think this all works out.
Flags: needinfo?(n.nethercote) → needinfo?(mstange)
(Reporter)

Comment 8

3 months ago
(In reply to Nicholas Nethercote [:njn] from comment #7)
> Oh, I see. That code controls the main thread's PseudoStack. So you're just
> worried about that PseudoStack possibly not being freed?

Or that it's accessed after it's freed. GeckoProfilerInitRAII has one call to profiler_shutdown(), but XPCOMShutdown has another earlier call. And any SamplerStackFrameRAII objects on the stack have a pointer to the PseudoStack in their mHandle field.
I think that's what comment 0 is referring to when it says "in a way that's balanced on the stack" - if PseudoStack creation / destruction is balanced, then every SamplerStackFrameRAII object is either created before the PseudoStack is (and thus has a null mHandle), or it's created after the PseudoStack has been created and its mHandle pointer is guaranteed to stay valid at least until destruction of the SamplerStackFrameRAII object.
Flags: needinfo?(mstange)
(Assignee)

Comment 9

3 months ago
Right, so there is an init/shutdown pair in XRE_main via the GeckoProfilerInitRAII, and then there is another one in NS_InitXPCOM2() and ShutdownXPCOM(). The former encompasses the latter, so it seems like the latter should be removed?
(Assignee)

Comment 10

3 months ago
But it's still not clear to me if my patches are ok. I don't think anything you've said indicates they are wrong?
Flags: needinfo?(mstange)
(Reporter)

Comment 11

3 months ago
(In reply to Nicholas Nethercote [:njn] from comment #9)
> Right, so there is an init/shutdown pair in XRE_main via the
> GeckoProfilerInitRAII, and then there is another one in NS_InitXPCOM2() and
> ShutdownXPCOM(). The former encompasses the latter, so it seems like the
> latter should be removed?

Yes, probably. Unless there are unforeseen problems with that. For example, when we're dumping the profile to a file on shutdown (when using the MOZ_PROFILER_SHUTDOWN env var), we need to make sure that that's not using parts of XPCOM that have already shut down. That's the only potential problem that I can think of at the moment; it looks like bug 662444 is still a ways off.

(In reply to Nicholas Nethercote [:njn] from comment #10)
> But it's still not clear to me if my patches are ok. I don't think anything
> you've said indicates they are wrong?

It's not clear to me either. I just wanted to address your comments on this bug first. I'll take another look at the patches tomorrow.
Flags: needinfo?(mstange)
(Reporter)

Comment 12

3 months ago
(In reply to Nicholas Nethercote [:njn] from comment #10)
> But it's still not clear to me if my patches are ok. I don't think anything
> you've said indicates they are wrong?

I think I have a definite answer to this now. Your patches are good.

The use-after-free that I was worried about is completely orthogonal to your patches. For some reason I was thinking that SamplerStackFrameRAII called addref/deref on its mHandle pseudostack, but that's not the case. So your patches don't change anything there.
The fact that we have a GeckoProfilerInitRAII instance which encompasses all SamplerStackFrameRAIIs is what saves us from the use-after-free.
(Reporter)

Comment 13

3 months ago
Comment on attachment 8832786 [details] [diff] [review]
(part 1) - Remove refcounting from PseudoStack

Review of attachment 8832786 [details] [diff] [review]:
-----------------------------------------------------------------

::: tools/profiler/public/GeckoProfiler.h
@@ +311,5 @@
>  class nsISupports;
>  class ProfilerMarkerPayload;
>  
> +// Each thread gets its own PseudoStack on thread creation, which is then
> +// destroyed on thread destruction. tlsPseudoStack is the owning reference.

May want to mention that the mechanism to achieve this is different on the main thread vs non-main threads: main thread has GeckoProfilerInitRAII, other threads have AutoProfilerRegister.
Attachment #8832786 - Flags: review?(mstange) → review+
(Reporter)

Comment 14

3 months ago
Comment on attachment 8832787 [details] [diff] [review]
(part 2) - Remove refcounting from Pseudostack

Review of attachment 8832787 [details] [diff] [review]:
-----------------------------------------------------------------

This patch could use a better description.

This piece of code actually makes me a little worried - Application Thread sounds Android-y to me. I hope we're not calling profiler_init() on a non-main thread on Android?
Attachment #8832787 - Flags: review?(mstange) → review+
(Assignee)

Comment 15

3 months ago
> This piece of code actually makes me a little worried - Application Thread
> sounds Android-y to me. I hope we're not calling profiler_init() on a
> non-main thread on Android?

The change is a no-op, though I understand you are concerned that the existing code is buggy. With that in mind, profiler_init() is called by the following:

- profiler_start()
- GeckoProfilerInitRAII
- NS_InitXPCOM2
- NS_InitMinimalXPCOM

The only one I can see that might be a concern is the call to profiler_start() in ANRReporter::RequestNativeStack(). I have no idea what that is for.
(Assignee)

Comment 16

3 months ago
https://hg.mozilla.org/integration/mozilla-inbound/rev/5aa8ef17a2ccf798a43a721adfc613121baeffaf
Bug 1126576 (part 1) - Remove refcounting from PseudoStack. r=mstange.

https://hg.mozilla.org/integration/mozilla-inbound/rev/1b4fbaa8d228379eb879871aaeee3e91f6dc9c27
Bug 1126576 (part 2) - Remove redundant check from profiler_init(). r=mstange.

Comment 17

3 months ago
bugherder
https://hg.mozilla.org/mozilla-central/rev/5aa8ef17a2cc
https://hg.mozilla.org/mozilla-central/rev/1b4fbaa8d228
Status: ASSIGNED → RESOLVED
Last Resolved: 3 months ago
status-firefox54: --- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla54
You need to log in before you can comment on or make changes to this bug.