Last Comment Bug 867757 - Provide a way to insert arbitrary data in a profile
: Provide a way to insert arbitrary data in a profile
Status: RESOLVED FIXED
:
Product: Core
Classification: Components
Component: Gecko Profiler (show other bugs)
: Trunk
: x86 All
: -- normal (vote)
: mozilla27
Assigned To: Aaron Klotz [:aklotz]
:
Mentors:
Depends on: 921435 925990
Blocks: 887976 899764 867762 899782 918825 921330
  Show dependency treegraph
 
Reported: 2013-05-01 13:49 PDT by Benoit Girard (:BenWa)
Modified: 2014-02-19 13:27 PST (History)
12 users (show)
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---


Attachments
Patch Rev 1 (35.72 KB, patch)
2013-06-10 16:14 PDT, Aaron Klotz [:aklotz]
bgirard: review-
Details | Diff | Splinter Review
Patch Rev 2 (44.31 KB, patch)
2013-06-19 15:05 PDT, Aaron Klotz [:aklotz]
bgirard: review-
Details | Diff | Splinter Review
Patch Rev 3 (53.78 KB, patch)
2013-06-26 15:31 PDT, Aaron Klotz [:aklotz]
bgirard: review-
Details | Diff | Splinter Review
Allow Markers to carry payload - WIP (14.24 KB, patch)
2013-07-07 21:12 PDT, Benoit Girard (:BenWa)
no flags Details | Diff | Splinter Review
Allow Markers to carry payload - WIP (18.00 KB, patch)
2013-07-07 22:53 PDT, Benoit Girard (:BenWa)
no flags Details | Diff | Splinter Review
Allow Markers to carry payload (24.84 KB, patch)
2013-07-09 20:45 PDT, Benoit Girard (:BenWa)
no flags Details | Diff | Splinter Review
Allow Markers to carry payload (30.61 KB, patch)
2013-07-10 09:29 PDT, Benoit Girard (:BenWa)
aklotz: review-
Details | Diff | Splinter Review
Allow Markers to carry payload (31.59 KB, patch)
2013-07-10 21:29 PDT, Benoit Girard (:BenWa)
aklotz: review+
Details | Diff | Splinter Review
Patch Rev 4 (82.40 KB, patch)
2013-07-12 13:26 PDT, Aaron Klotz [:aklotz]
bgirard: review-
Details | Diff | Splinter Review
Patch Rev 5 (101.16 KB, patch)
2013-07-30 15:02 PDT, Aaron Klotz [:aklotz]
no flags Details | Diff | Splinter Review
Patch rev 6 (116.89 KB, patch)
2013-09-03 09:59 PDT, Aaron Klotz [:aklotz]
no flags Details | Diff | Splinter Review
Part 1: Allow markers to carry payload (32.03 KB, patch)
2013-09-19 13:28 PDT, Aaron Klotz [:aklotz]
aklotz: review+
Details | Diff | Splinter Review
Part 2: Immediate sample current thread's stack (rev 7) (111.13 KB, patch)
2013-09-24 10:25 PDT, Aaron Klotz [:aklotz]
bgirard: review+
Details | Diff | Splinter Review
Part 2: Immediate sample current thread's stack (111.23 KB, patch)
2013-09-25 08:13 PDT, Aaron Klotz [:aklotz]
aklotz: review+
Details | Diff | Splinter Review

Description Benoit Girard (:BenWa) 2013-05-01 13:49:54 PDT
In addition to storing samples, you can insert markers into the profile. We need a way to insert arbitrary data into the profile to track interesting things in the profile such as IO, memory usage and perhaps other timing information.
Comment 1 Aaron Klotz [:aklotz] 2013-06-10 16:14:50 PDT
Created attachment 760627 [details] [diff] [review]
Patch Rev 1

This patch adds the capability to immediately request SPS to sample the calling thread's stack and annotate it.

* The PROFILER_SAMPLE_IMMEDIATELY macros are the entry points for doing this.
* From the perspective of the profile, the resulting stack is a regular stack that includes additional annotations via the 'M' (immediateName), 'D' (immediateDuration), and 'P' (immediatePayload) tags. My intention here is that we can eventually detect those annotations in the client and handle the affected stacks appropriately.
* I've added a bit of a hack to TableTicker::doNativeBacktrace to allow us to pull the current thread's callstack on Windows without needing to spin up and synchronize with a worker thread. The downside is that this implementation doesn't give us stack pointers to work with, so I intend to file a follow-up bug to make that work better.
* For Linux and Android I wanted to use the same code for both, calling getcontext() to retrieve the current thread's context. There's an implementation of getcontext() for Android included with breakpad, so this patch removes android-signal-defs.h and uses the breakpad stuff instead. that code looked OK to me, but please let me know if this is going to be an issue.
Comment 2 Benoit Girard (:BenWa) 2013-06-12 15:24:15 PDT
Comment on attachment 760627 [details] [diff] [review]
Patch Rev 1

Review of attachment 760627 [details] [diff] [review]:
-----------------------------------------------------------------

Great first attempt. You really dove deep in the profiler internals which are poorly documented without asking many questions. It's impressive but don't be shy to ask me and Julian questions.

Let's have another look with this stuff fixed :).

::: tools/profiler/GeckoProfiler.h
@@ +74,5 @@
>  
> +// Walk the current thread's callstack immediately, annotate it, and insert it
> +// into the profile. This is useful for gathering stacks for noteworthy events
> +// such as main thread I/O.
> +#define PROFILER_SAMPLE_IMMEDIATELY(name) do {} while(0)

I don't think we should expose those. For now we should make it so a marker always includes its callsite.

In the future I want to be able to add a feature to charge a single callsite to the execution of something else which would achieve something similar to your duration parameter. It will likely be a useful feature for watching IO because we can log the callsites of various async IO function call and charge them to the code doing the final flush.

We can extend markers to carry a JSON payload however. A standard 'duration' field wouldn't hurt either so feel free to port that code to markers. Once this information is carried in the profile then we can visualize them however we want.

::: tools/profiler/Makefile.in
@@ +20,5 @@
>    $(NULL)
>  
> +ifeq ($(OS_TARGET),Android)
> +LOCAL_INCLUDES += \
> +  -I$(topsrcdir)/toolkit/crashreporter/google-breakpad/src/common/android/include \

What do you need from here? It's not clear why this change requires this.

::: tools/profiler/TableTicker.cpp
@@ +395,5 @@
>  
>  void TableTicker::doNativeBacktrace(ThreadProfile &aProfile, TickSample* aSample)
>  {
> +#ifdef XP_WIN
> +  // Temporary solution, doesn't use pseudostack

This is temporary. Do you plan on fixing this before landing? If not what about using NS_Stackwalk and getting the pseudostack?

@@ +396,5 @@
>  void TableTicker::doNativeBacktrace(ThreadProfile &aProfile, TickSample* aSample)
>  {
> +#ifdef XP_WIN
> +  // Temporary solution, doesn't use pseudostack
> +  if (aProfile.ThreadId() == GetCurrentThreadId()) {

GetCurrentThreadId shouldn't be assumed to be signal/thread-pause safe. If you have information you need to fetch it should be called before pausing the threads and stored in aSample.

@@ +525,5 @@
> +  ThreadProfile* pProfile = tlsThreadProfile.get();
> +  if (!pProfile) {
> +    int tid;
> +#if defined(XP_WIN)
> +    tid = GetCurrentThreadId();

This get_tid stuff is getting worse. Would be nice if you can move it into the class OS abstraction and clean up the usage a little.

@@ +535,5 @@
> +    for (uint32_t i = 0; i < sRegisteredThreads->size(); i++) {
> +      ThreadInfo* curInfo = sRegisteredThreads->at(i);
> +      if (curInfo->ThreadId() == tid) {
> +        pProfile = curInfo->Profile();
> +        tlsThreadProfile.set(pProfile);

You use this code to set tlsThreadProfile for a thread if it isn't set but you don't guarantee that this tlsThreadProfile.set has a matched set(nullptr) when ~TableTicker is ran.

@@ +546,5 @@
> +    return;
> +  }
> +
> +  TickSample sample;
> +  sample.threadProfile = pProfile;

I think to provide a dead stack all we need to do is provide a temporary allocated ThreadProfile that we read out before returning. If we stack allocate ThreadProfile it shouldn't have much overhead and we're reusing the existing code. Except it's more complicated with UnwinderTick since it will be filled in async =\.

@@ +559,5 @@
> +  sample.AnnotateImmediateData(aName, aDuration, aPayload);
> +
> +  sample.timestamp = mozilla::TimeStamp::Now();
> +
> +  if (HasUnwinderThread()) {

Call ::Tick for this code.

::: tools/profiler/TableTicker.h
@@ +93,5 @@
>  
>    // Called within a signal. This function must be reentrant
>    virtual void Tick(TickSample* sample);
>  
> +  virtual void SampleImmediately(const char* aName,

This can use a short comment to explain what it does.

::: tools/profiler/UnwinderThread2.cpp
@@ -826,5 @@
>      struct __darwin_i386_thread_state* ss = &mc->__ss;
>      buff->regs.eip = ss->__eip;
>      buff->regs.esp = ss->__esp;
>      buff->regs.ebp = ss->__ebp;
> -#   elif defined(SPS_PLAT_x86_android)

cool

::: tools/profiler/android-signal-defs.h
@@ -1,1 @@
> -/* -*- Mode: C++; tab-width: 2; indent-tabs-mode: nil; c-basic-offset: 2 -*- */

nice

::: tools/profiler/platform-macos.cc
@@ +403,5 @@
> +
> +void TickSample::PopulateContext(void* aContext)
> +{
> +#if defined(SPS_PLAT_amd64_darwin)
> +  asm (

I'm not familiar with asm so we will need to find another reviewer for this.

::: tools/profiler/platform.h
@@ +261,5 @@
>          lr(NULL),
>  #endif
>          function(NULL),
>          context(NULL),
> +        frames_count(0),

I think function and frames_count are useless. If you feel inclined you can remove them :).
Comment 3 Aaron Klotz [:aklotz] 2013-06-17 11:15:49 PDT
(In reply to Benoit Girard (:BenWa) from comment #2)
> Comment on attachment 760627 [details] [diff] [review]
> Patch Rev 1
> 
> Review of attachment 760627 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> ::: tools/profiler/GeckoProfiler.h
> @@ +74,5 @@
> >  
> > +// Walk the current thread's callstack immediately, annotate it, and insert it
> > +// into the profile. This is useful for gathering stacks for noteworthy events
> > +// such as main thread I/O.
> > +#define PROFILER_SAMPLE_IMMEDIATELY(name) do {} while(0)
> 
> I don't think we should expose those. For now we should make it so a marker
> always includes its callsite.
> 
> In the future I want to be able to add a feature to charge a single callsite
> to the execution of something else which would achieve something similar to
> your duration parameter. It will likely be a useful feature for watching IO
> because we can log the callsites of various async IO function call and
> charge them to the code doing the final flush.
> 
> We can extend markers to carry a JSON payload however. A standard 'duration'
> field wouldn't hurt either so feel free to port that code to markers. Once
> this information is carried in the profile then we can visualize them
> however we want.

Will do.

> 
> ::: tools/profiler/Makefile.in
> @@ +20,5 @@
> >    $(NULL)
> >  
> > +ifeq ($(OS_TARGET),Android)
> > +LOCAL_INCLUDES += \
> > +  -I$(topsrcdir)/toolkit/crashreporter/google-breakpad/src/common/android/include \
> 
> What do you need from here? It's not clear why this change requires this.

This is necessary for the changes to use Breakpad's implementation of getcontext() on Android. The function is actually implemented in assembly as breakpad_getcontext(), so among other things the ucontext.h in this include path #defines getcontext() to breakpad_getcontext().

> 
> ::: tools/profiler/TableTicker.cpp
> @@ +395,5 @@
> >  
> >  void TableTicker::doNativeBacktrace(ThreadProfile &aProfile, TickSample* aSample)
> >  {
> > +#ifdef XP_WIN
> > +  // Temporary solution, doesn't use pseudostack
> 
> This is temporary. Do you plan on fixing this before landing? If not what
> about using NS_Stackwalk and getting the pseudostack?

Well, the whole issue is that, in the context of SPS, it is currently too expensive for a thread to call NS_Stackwalk() on itself on Windows. For each call to NS_StackWalk, a new thread is spun up in order to suspend the calling thread, obtain its context, and walk its stack. I don't think we should be doing that every time some main thread I/O occurs, so I'd like to land these changes first and investigate improvements to NS_StackWalk later in another bug. Improving that is going to take a bit of work.

> 
> @@ +396,5 @@
> >  void TableTicker::doNativeBacktrace(ThreadProfile &aProfile, TickSample* aSample)
> >  {
> > +#ifdef XP_WIN
> > +  // Temporary solution, doesn't use pseudostack
> > +  if (aProfile.ThreadId() == GetCurrentThreadId()) {
> 
> GetCurrentThreadId shouldn't be assumed to be signal/thread-pause safe. If
> you have information you need to fetch it should be called before pausing
> the threads and stored in aSample.

OK. I'll add necessary info to the sample so we don't need to do this.

> 
> @@ +525,5 @@
> > +  ThreadProfile* pProfile = tlsThreadProfile.get();
> > +  if (!pProfile) {
> > +    int tid;
> > +#if defined(XP_WIN)
> > +    tid = GetCurrentThreadId();
> 
> This get_tid stuff is getting worse. Would be nice if you can move it into
> the class OS abstraction and clean up the usage a little.

Sure.

> 
> @@ +535,5 @@
> > +    for (uint32_t i = 0; i < sRegisteredThreads->size(); i++) {
> > +      ThreadInfo* curInfo = sRegisteredThreads->at(i);
> > +      if (curInfo->ThreadId() == tid) {
> > +        pProfile = curInfo->Profile();
> > +        tlsThreadProfile.set(pProfile);
> 
> You use this code to set tlsThreadProfile for a thread if it isn't set but
> you don't guarantee that this tlsThreadProfile.set has a matched
> set(nullptr) when ~TableTicker is ran.

I take care of that in Sampler::UnregisterCurrentThread(). I actually wanted to populate that TLS slot during Sampler::RegisterCurrentThread(), but the ThreadProfile object isn't necessarily created at that time, hence the lazy version that I added here.

We can't use ~TableTicker to clear this because that will only clear the TLS slot for the thread that is executing ~TableTicker, not any of the others.

Suggestions welcome.

> 
> @@ +559,5 @@
> > +  sample.AnnotateImmediateData(aName, aDuration, aPayload);
> > +
> > +  sample.timestamp = mozilla::TimeStamp::Now();
> > +
> > +  if (HasUnwinderThread()) {
> 
> Call ::Tick for this code.

Sure.

> 
> ::: tools/profiler/TableTicker.h
> @@ +93,5 @@
> >  
> >    // Called within a signal. This function must be reentrant
> >    virtual void Tick(TickSample* sample);
> >  
> > +  virtual void SampleImmediately(const char* aName,
> 
> This can use a short comment to explain what it does.

OK.

> ::: tools/profiler/platform.h
> @@ +261,5 @@
> >          lr(NULL),
> >  #endif
> >          function(NULL),
> >          context(NULL),
> > +        frames_count(0),
> 
> I think function and frames_count are useless. If you feel inclined you can
> remove them :).

Sure :)
Comment 4 Benoit Girard (:BenWa) 2013-06-17 12:20:25 PDT
(In reply to Aaron Klotz [:aklotz] from comment #3)
> > @@ +535,5 @@
> > > +    for (uint32_t i = 0; i < sRegisteredThreads->size(); i++) {
> > > +      ThreadInfo* curInfo = sRegisteredThreads->at(i);
> > > +      if (curInfo->ThreadId() == tid) {
> > > +        pProfile = curInfo->Profile();
> > > +        tlsThreadProfile.set(pProfile);
> > 
> > You use this code to set tlsThreadProfile for a thread if it isn't set but
> > you don't guarantee that this tlsThreadProfile.set has a matched
> > set(nullptr) when ~TableTicker is ran.
> 
> I take care of that in Sampler::UnregisterCurrentThread(). I actually wanted
> to populate that TLS slot during Sampler::RegisterCurrentThread(), but the
> ThreadProfile object isn't necessarily created at that time, hence the lazy
> version that I added here.
> 
> We can't use ~TableTicker to clear this because that will only clear the TLS
> slot for the thread that is executing ~TableTicker, not any of the others.
> 
> Suggestions welcome.
> 

Taking care of it in UnregisterCurrentThread will work just fine for clearing it when a thread is terminated before we stopped profiling but this code isn't safe if we stop profiling while there are registered secondary threads. Maybe we need something like a weak pointer to check if the object is still alive? Maybe there's something simpler that can be used. The lifetimes are very annoying I have to admit with the locking restrictions.
Comment 5 Aaron Klotz [:aklotz] 2013-06-19 15:05:17 PDT
Created attachment 765027 [details] [diff] [review]
Patch Rev 2

Changes made as discussed in comment 3. For comment 4 I removed the TLS.
Comment 6 Benoit Girard (:BenWa) 2013-06-21 08:42:16 PDT
Comment on attachment 765027 [details] [diff] [review]
Patch Rev 2

Review of attachment 765027 [details] [diff] [review]:
-----------------------------------------------------------------

As we discussed on IRC I think this patch isn't safe because we can interrupt the main thread while it's writing to the buffer to write to the buffer ourselves.

::: tools/profiler/BreakpadSampler.cpp
@@ -182,3 @@
>    PseudoStack* stack = currThreadProfile.GetPseudoStack();
> -  for (int i = 0; stack->getMarker(i) != NULL; i++) {
> -    utb__addEntry( utb, ProfileEntry('m', stack->getMarker(i)) );

Why are you removing this?

::: tools/profiler/GeckoProfiler.h
@@ +71,5 @@
>  // only recorded if a sample is collected while it is active, marker will always
>  // be collected.
>  #define PROFILER_MARKER(info) do {} while (0)
> +#define PROFILER_MARKER_WITH_DURATION(info, duration) do {} while(0)
> +#define PROFILER_MARKER_WITH_PAYLOAD(info, payload) do {} while(0)

Please document payload and duration here and include format/units. Looks like the format is any string to be parsed by the front end. Perhaps state 'Any string beginning with '{' should be JSON encoded.

::: tools/profiler/TableTicker.cpp
@@ -497,3 @@
>    PseudoStack* stack = currThreadProfile.GetPseudoStack();
> -  for (int i = 0; stack->getMarker(i) != NULL; i++) {
> -    addDynamicTag(currThreadProfile, 'm', stack->getMarker(i));

Why are you removing this?
Comment 7 Aaron Klotz [:aklotz] 2013-06-26 15:31:20 PDT
Created attachment 768040 [details] [diff] [review]
Patch Rev 3

This revision of the patch does the following:

1. SampleImmediately allocates a ThreadProfile and captures the current thread's stack into it, including marker annotations.

2. Markers are now represented as pointers to ThreadProfile objects. That ThreadProfile sampled in (1) is addMarker()ed to the thread's PseudoStack.

3. When the next "real" sample occurs, the contents of the marker ThreadProfile is inserted into the thread's real ThreadProfile object.

4. To protect against (3) merging in a ThreadProfile whose stack is still being populated by the unwinder thread, a flag is set in the ThreadProfile. Any marker ThreadProfiles will not be merged or destroyed unless that flag is clear. The flag is set during (1) and cleared by the unwinder thread.

I think that the way that PseudoStack::clearMarkers() handles this is safe because that function is only called by the threads themselves when adding a new marker.
Comment 8 Benoit Girard (:BenWa) 2013-06-28 12:46:45 PDT
Comment on attachment 768040 [details] [diff] [review]
Patch Rev 3

Review of attachment 768040 [details] [diff] [review]:
-----------------------------------------------------------------

I suggested to call uwt synchronously at the end of my review so a far comments here are no longer relevant. Getting closer :)

::: tools/profiler/BreakpadSampler.cpp
@@ +176,5 @@
>      sample->threadProfile = GetPrimaryThreadProfile();
>    }
>  
>    ThreadProfile& currThreadProfile = *sample->threadProfile;
> +  currThreadProfile.SetUnwinding(true);

We talked on IRC how this should just call the uwt syncrounsly. When sampling we can't because we've got threads paused and this will risk deadlocking but this is fine if we're calling from the main thread. We just need to make sure we don't share any locks with the unwinder threads.

@@ +181,4 @@
>  
>    /* Get hold of an empty inter-thread buffer into which to park
>       the ProfileEntries for this sample. */
>    UnwinderThreadBuffer* utb = uwt__acquire_empty_buffer();

We will want to 'new' our own buffer, we don't need to limit ourselves to a preallocated buffer here.

@@ +185,5 @@
>  
>    /* This could fail, if no buffers are currently available, in which
>       case we must give up right away.  We cannot wait for a buffer to
>       become available, as that risks deadlock. */
>    if (!utb)

otherwise this would of just silently dropped a marker.

::: tools/profiler/GeckoProfiler.h
@@ +72,5 @@
>  // be collected.
>  #define PROFILER_MARKER(info) do {} while (0)
> +#define PROFILER_MARKER_WITH_DURATION(info, duration) do {} while(0)
> +#define PROFILER_MARKER_WITH_PAYLOAD(info, payload) do {} while(0)
> +#define PROFILER_MARKER_WITH_PAYLOAD_DURATION(info, payload, duration) do {} while(0)

We should support markers without backtraces since taking a breaktrace is a huge performance overhead.

::: tools/profiler/ProfileEntry.cpp
@@ +456,5 @@
>  }
>  
> +void ThreadProfile::SetUnwinding(bool aInProgress)
> +{
> +  mIsUnwindInProgress = aInProgress;

This needs to be an atomic store and possibly volatile? .. But actually we might kill this together.

::: tools/profiler/ProfileEntry.h
@@ +75,5 @@
>    mozilla::Mutex* GetMutex();
>    void BuildJSObject(JSAObjectBuilder& b, JSCustomObject* profile);
>  
> +  void SetUnwinding(bool aInProgress);
> +  bool IsUnwinding() const { return mIsUnwindInProgress; }

IsUnwindPending might be a better name.

Let's add some comments here. Here's mine maybe you can think of something more descriptive:
// mEntries is still being populated by the unwinder thread.

::: tools/profiler/PseudoStack.h
@@ +227,5 @@
>  
>    // Keep a list of active checkpoints
>    StackEntry volatile mStack[1024];
>    // Keep a list of active markers to be applied to the next sample taken
> +  ThreadProfile* mMarkers[1024];

ThreadProfile is not a good structure to store markers. See my comments below to contain it using a more appropriate struct/class.

::: tools/profiler/TableTicker.cpp
@@ +395,5 @@
>  
>  void TableTicker::doNativeBacktrace(ThreadProfile &aProfile, TickSample* aSample)
>  {
> +#ifdef XP_WIN
> +  // Temporary solution, doesn't use pseudostack

You gave a better explanation in bugzilla about what is going on here. Can you migrate some of this to this comments? I hate seeing 'temporary' comments that are vague and sometimes stay in our tree for years.

@@ +517,5 @@
>      InplaceTick(sample);
>    }
>  }
>  
> +void TableTicker::SampleImmediately(const char* aName,

This needs a better name. InsertMarker?

@@ +538,5 @@
> +  sample.PopulateContext(nullptr);
> +#endif
> +
> +  sample.AnnotateImmediateData(aName, aPayload, aDuration);
> +  sample.timestamp = mozilla::TimeStamp::Now();

I don't think these variables belong in TickSample. I think we want to create a Marker struct that has a name, an optional filled ThreadProfile pointer (so that we can chose if we want the backtrace for individual markers, or just a marker with no backtrace) and other members like this timestamp, duration or payloads...

@@ +540,5 @@
> +
> +  sample.AnnotateImmediateData(aName, aPayload, aDuration);
> +  sample.timestamp = mozilla::TimeStamp::Now();
> +
> +  Tick(&sample);

... I don't think this Tick sample should receive any data about the marker. It should only be passed to addMarker.

@@ +542,5 @@
> +  sample.timestamp = mozilla::TimeStamp::Now();
> +
> +  Tick(&sample);
> +
> +  if (!HasUnwinderThread()) {

I don't see why this is conditional on not having an unwinder thread. Can you maybe add a comment for this?

::: tools/profiler/platform.h
@@ +275,5 @@
> +                                    double* aDuration)
> +  {
> +    immediateName = aName;
> +    immediatePayload = aPayload;
> +    immediateDuration = aDuration;

These should be markerXXX and not immediateXXX but like I said elsewhere I think it's better to move this information elsewhere.

@@ +278,5 @@
> +    immediatePayload = aPayload;
> +    immediateDuration = aDuration;
> +  }
> +
> +  inline bool IsImmediate()

Maybe IsSamplingCurrentThread is a better name?

@@ +316,5 @@
>    // program counter.
>    virtual void Tick(TickSample* sample) = 0;
>  
> +  // Immediately generate an annotated backtrace for the current thread
> +  virtual void SampleImmediately(const char* aName,

Maybe SampleCurrentThread is a better name?
Comment 9 Benoit Girard (:BenWa) 2013-07-07 21:12:52 PDT
Created attachment 771912 [details] [diff] [review]
Allow Markers to carry payload - WIP

I've been working on something to allow markers to provide arbitrary data. This approach is more flexible as it will allow markers to carry things such as images for example and it uses up one entry from the circular buffer.

Aaron lets find some time to discuss how to merge our two approaches.
Comment 10 Benoit Girard (:BenWa) 2013-07-07 22:53:06 PDT
Created attachment 771932 [details] [diff] [review]
Allow Markers to carry payload - WIP
Comment 11 Benoit Girard (:BenWa) 2013-07-09 20:45:14 PDT
Created attachment 773049 [details] [diff] [review]
Allow Markers to carry payload

This should no longer leak and I updated cleopatra to support the new format.

I left a 'location' property commented out but this is where I expected the callsite for IO to be stored.
Comment 12 Benoit Girard (:BenWa) 2013-07-10 09:29:24 PDT
Created attachment 773348 [details] [diff] [review]
Allow Markers to carry payload

https://tbpl.mozilla.org/?tree=Try&rev=a60211f1b3bc
Comment 13 Aaron Klotz [:aklotz] 2013-07-10 11:47:50 PDT
Comment on attachment 773348 [details] [diff] [review]
Allow Markers to carry payload

Review of attachment 773348 [details] [diff] [review]:
-----------------------------------------------------------------

Good patch, however I have a few concerns.

::: tools/profiler/BreakpadSampler.cpp
@@ +182,5 @@
>    PseudoStack* stack = currThreadProfile.GetPseudoStack();
> +  ProfilerMarkerLinkedList* pendingMarkersList = stack->getPendingMarkers();
> +  while (pendingMarkersList && pendingMarkersList->peek()) {
> +    ProfilerMarker* marker = pendingMarkersList->popHead();
> +    currThreadProfile.addTag(ProfileEntry('m', marker));

Shouldn't this be calling utb__addEntry() instead of directly adding the marker tag to the profile?

@@ +189,1 @@
>  

This could theoretically race with updates to the generation ID that would happen when the unwinder thread flushes.

::: tools/profiler/moz.build
@@ +36,5 @@
>          'IOInterposer.cpp',
>          'NSPRInterposer.cpp',
>          'ProfilerIOInterposeObserver.cpp',
>          'SQLiteInterposer.cpp',
> +        'ProfilerMarkers.cpp',

I don't see ProfilerMarkers.cpp in this patch.

::: tools/profiler/platform.cpp
@@ +136,5 @@
> +ProfilerMarker*
> +ProfilerMarkerLinkedList::popHead() {
> +  ProfilerMarker* head = mHead;
> +
> +  mHead = head->mNext;

What if mHead is null? I know that you're using peek() to check before calling popHead(), so this shouldn't be an issue in practice. Can you put something defensive here so that it's clear in the code that you considered that case?
Comment 14 Benoit Girard (:BenWa) 2013-07-10 21:29:04 PDT
Created attachment 773765 [details] [diff] [review]
Allow Markers to carry payload

https://tbpl.mozilla.org/?tree=Try&rev=46809a3b7dc0
Comment 15 Aaron Klotz [:aklotz] 2013-07-12 13:26:27 PDT
Created attachment 774868 [details] [diff] [review]
Patch Rev 4

This version leverages BenWa's payload patch. Here is a summary of the latest changes:

* New functions, utb__acquire_sync_buffer() and utb__release_sync_buffer() manage a UnwinderThreadBuffer to be used for synchronous backtrace.
* Code that populates the UnwinderThreadBuffer has been moved into a new populateBuffer() function so that the same function can be used by both the signal handler case as well as the synchronous backtrace case.
* A new TLS slot is used to store the top of the stack so that this value is available during synchronous backtrace without needing to lock and access any profiler data structures.
* The unwinder thread's buffer handling code has been moved into process_buffer() to separate it from the infinite loop and buffer management code that is not used by the synchronous backtrace stuff.
* The backtrace is returned as a ThreadProfile from profiler_get_backtrace() and passed to IOMarkerPayload which is a ProfilerMarkerPayload and is thus included with the I/O marker.
Comment 16 Benoit Girard (:BenWa) 2013-07-13 09:21:08 PDT
Comment on attachment 774868 [details] [diff] [review]
Patch Rev 4

Review of attachment 774868 [details] [diff] [review]:
-----------------------------------------------------------------

Only minor stuff. Good work! I'd like to take another look but it's almost ready.

Julian want to double check the BreakpadSampler.cpp/Unwinder* changes? They looks fine with me but another look wouldn't hurt.

::: tools/profiler/BreakpadSampler.cpp
@@ +160,1 @@
>    ThreadProfile& currThreadProfile = *sample->threadProfile;

IMO currThreadProfile is becoming ambigious now that you're added isSamplingCurrentThread. In the case when !isSamplingCurrentThread then currThreadProfile is poorly named. Maybe rename to sampledThreadProfile?

::: tools/profiler/GeckoProfiler.h
@@ +103,5 @@
>  // to retrieve the profile.
>  static inline void profiler_stop() {}
>  
> +// Immediately capture the current thread's call stack and return it
> +static inline ThreadProfile* profiler_get_backtrace() { return nullptr; }

We shouldn't return a ThreadProfile* since I consider it to be an implementation detail and I want to keep as much flexibility to change it as possible. We should return either an opaque type or a pure virtual ProfilerBacktrace object.

::: tools/profiler/ProfilerMarkers.cpp
@@ +44,5 @@
> +  b.DefineProperty(data, "type", "io");
> +  b.DefineProperty(data, "source", mSource);
> +  b.DefineProperty(data, "duration", mDuration);
> +
> +  if (mStack) {

I think the location is going to be common enough that it's should be in the parent MarkerPayload as an optional field. Same with the duration I think. IMO the duration should be captured as two TimeStamp for the start/end so that it's more flexible and less ambiguous (for example is the duration starting from the marker forward, or ending at the marker).

::: tools/profiler/TableTicker.cpp
@@ +402,5 @@
> +     walk its stack. We shouldn't do that every time we need a thread to obtain
> +     its own backtrace in SPS.*/
> +  if (aSample->isSamplingCurrentThread) {
> +    void* ptrs[100];
> +    USHORT count = RtlCaptureStackBackTrace(0, 100, ptrs, nullptr);

To bad we don't get the stack address so we could merge the JS stack =\. This is fine for now much I imagine later we will want to include the Pseudoframes (JS) info.

::: tools/profiler/UnwinderThread2.cpp
@@ +1025,5 @@
> +/* Copy ProfileEntries presented to us by the sampling thread.
> +   Most of them are copied verbatim into |buff->aProfile|,
> +   except for 'hint' tags, which direct us to do something
> +   different. */
> +static void process_buffer(UnwinderThreadBuffer* buff, int oldest_ix)

This entire block is only a move right?

::: tools/profiler/platform.cpp
@@ +28,5 @@
>  #endif
>  
>  mozilla::ThreadLocal<PseudoStack *> tlsPseudoStack;
>  mozilla::ThreadLocal<TableTicker *> tlsTicker;
> +mozilla::ThreadLocal<void *> tlsStackTop;

Any reason why you introduce a TLS rather then looking it up for the regsistered thread array?

@@ +720,5 @@
> +{
> +  if (!stack_key_initialized)
> +    return nullptr;
> +
> +  // Don't capture a stack if we're not profiling

It would be nice to not have this requirement but this should happen in a new bug.

@@ +725,5 @@
> +  if (!profiler_is_active()) {
> +    return nullptr;
> +  }
> +
> +  // Don't capture a stack if we don't want to include personal information

We can still collect the c++ stack, just not dynamic Pseudoframes+Markers that can include URLs but this is also fine for now.
Comment 17 Benoit Girard (:BenWa) 2013-07-15 10:46:13 PDT
CC'ing anton as a heads up. When this lands the marker format will change from a string to an object.
Comment 18 Aaron Klotz [:aklotz] 2013-07-30 15:02:21 PDT
Created attachment 783375 [details] [diff] [review]
Patch Rev 5

Also requesting r? Julian for BreakpadSampler.cpp/Unwinder* changes.

(In reply to Benoit Girard (:BenWa) from comment #16)
> Comment on attachment 774868 [details] [diff] [review]
> Patch Rev 4
> 
> Review of attachment 774868 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> ::: tools/profiler/BreakpadSampler.cpp
> @@ +160,1 @@
> >    ThreadProfile& currThreadProfile = *sample->threadProfile;
> 
> IMO currThreadProfile is becoming ambigious now that you're added
> isSamplingCurrentThread. In the case when !isSamplingCurrentThread then
> currThreadProfile is poorly named. Maybe rename to sampledThreadProfile?
> 

Done.

> ::: tools/profiler/GeckoProfiler.h
> @@ +103,5 @@
> >  // to retrieve the profile.
> >  static inline void profiler_stop() {}
> >  
> > +// Immediately capture the current thread's call stack and return it
> > +static inline ThreadProfile* profiler_get_backtrace() { return nullptr; }
> 
> We shouldn't return a ThreadProfile* since I consider it to be an
> implementation detail and I want to keep as much flexibility to change it as
> possible. We should return either an opaque type or a pure virtual
> ProfilerBacktrace object.

I created an abstract ProfilerBacktrace class that ThreadProfile now inherits from. Now profiler_get_backtrace returns a ProfilerBacktrace pointer.

> 
> ::: tools/profiler/ProfilerMarkers.cpp
> @@ +44,5 @@
> > +  b.DefineProperty(data, "type", "io");
> > +  b.DefineProperty(data, "source", mSource);
> > +  b.DefineProperty(data, "duration", mDuration);
> > +
> > +  if (mStack) {
> 
> I think the location is going to be common enough that it's should be in the
> parent MarkerPayload as an optional field. Same with the duration I think.
> IMO the duration should be captured as two TimeStamp for the start/end so
> that it's more flexible and less ambiguous (for example is the duration
> starting from the marker forward, or ending at the marker).

Done. I've modified the parent ProfilerMarkerPayload class to hold timestamps and stack, and added a helper function to add common properties to the JS object. I've also added an overload of profiler_time that accepts a TimeStamp and converts that to profiler time to assist with serializing those TimeStamp values. The IOInterposer stuff has been updated to work with the two TimeStamp objects instead of the duration.

> 
> ::: tools/profiler/TableTicker.cpp
> @@ +402,5 @@
> > +     walk its stack. We shouldn't do that every time we need a thread to obtain
> > +     its own backtrace in SPS.*/
> > +  if (aSample->isSamplingCurrentThread) {
> > +    void* ptrs[100];
> > +    USHORT count = RtlCaptureStackBackTrace(0, 100, ptrs, nullptr);
> 
> To bad we don't get the stack address so we could merge the JS stack =\.
> This is fine for now much I imagine later we will want to include the
> Pseudoframes (JS) info.

Yeah, I plan to take a closer look at this very soon. I created bug 899782 to track this.

> 
> ::: tools/profiler/UnwinderThread2.cpp
> @@ +1025,5 @@
> > +/* Copy ProfileEntries presented to us by the sampling thread.
> > +   Most of them are copied verbatim into |buff->aProfile|,
> > +   except for 'hint' tags, which direct us to do something
> > +   different. */
> > +static void process_buffer(UnwinderThreadBuffer* buff, int oldest_ix)
> 
> This entire block is only a move right?

Correct.

> 
> ::: tools/profiler/platform.cpp
> @@ +28,5 @@
> >  #endif
> >  
> >  mozilla::ThreadLocal<PseudoStack *> tlsPseudoStack;
> >  mozilla::ThreadLocal<TableTicker *> tlsTicker;
> > +mozilla::ThreadLocal<void *> tlsStackTop;
> 
> Any reason why you introduce a TLS rather then looking it up for the
> regsistered thread array?

Accessing that array requires the spinlock.

> 
> @@ +720,5 @@
> > +{
> > +  if (!stack_key_initialized)
> > +    return nullptr;
> > +
> > +  // Don't capture a stack if we're not profiling
> 
> It would be nice to not have this requirement but this should happen in a
> new bug.
> 
> @@ +725,5 @@
> > +  if (!profiler_is_active()) {
> > +    return nullptr;
> > +  }
> > +
> > +  // Don't capture a stack if we don't want to include personal information
> 
> We can still collect the c++ stack, just not dynamic Pseudoframes+Markers
> that can include URLs but this is also fine for now.

I've created bug 899764 to cover these last two items.
Comment 19 Benoit Girard (:BenWa) 2013-08-01 13:28:00 PDT
Comment on attachment 783375 [details] [diff] [review]
Patch Rev 5

Review of attachment 783375 [details] [diff] [review]:
-----------------------------------------------------------------

::: tools/profiler/BreakpadSampler.cpp
@@ +329,5 @@
> +     become available, as that risks deadlock. */
> +  if (!utb)
> +    return;
> +
> +  populateBuffer(utb, sample, &uwt__release_full_buffer, mJankOnly);

I'm sorry but I though this was the right way to do it but Julian pointed that breakpad isn't thread safe. Julian can you provide some input on the best way of processing a uwt buffer outside of the sampling thread and blocking on the results.
Comment 20 Julian Seward [:jseward] 2013-08-05 03:50:56 PDT
(In reply to Benoit Girard (:BenWa) from comment #19)
> Comment on attachment 783375 [details] [diff] [review]
> I'm sorry but I though this was the right way to do it but Julian pointed
> that breakpad isn't thread safe. Julian can you provide some input on the
> best way of processing a uwt buffer outside of the sampling thread and
> blocking on the results.

I was concerned with two things about this:

* breakpad isn't thread safe (I'm pretty sure) and we'd wind up
  running it in both the unwinder thread and the sampling thread.
  Making it thread-safe would be a pretty big exercise.

* Calling the unwinder from arbitrary (?) points inside the
  sampling thread puts us back in the "malloc-lock induced
  deadlock" area.

The safe way to do this is basically to add whatever extra data we
want to struct _UnwinderThreadBuffer, and have the unwinder thread
process that extra data.  It already does this using ::entsFixed set
of ProfileEntries, so perhaps you can piggyback on that mechanism.

From the sampling side, what currently happens is:

* the sampled thread calls uwt__acquire_empty_buffer to get hold of a
  buffer to fill.  That may fail, in which case too bad -- the sample
  is lost.

* if it succeeds, it adds ProfileEntries to the buffer using,
  utb__addEntry, that tell the unwinder thread what to do, and specify
  extra entries to copy into the ThreadProfile.

* it releases the buffer to the profiling thread, using
  uwt__release_full_buffer.  If a native unwind is required, then it
  also supplies the thread's initial register state at that point.

uwt__acquire_empty_buffer and uwt__release_full_buffer run on the
sampling thread and so have to be very careful not to do malloc.
They also take care of the locking/unlocking to coordinate passing
the buffers back and forth to the unwinder thread.

The central function is TableTicker::UnwinderTick in BreakpadSampler.cpp.

So .. I think you can extend this mechanism, to

(1) put whatever stuff you want in, via utb__addEntry

(2) IIRC from IRC chat with BenWa, you want a way for the
    sampling thread to stall until it knows that a buffer has
    been processed.  The buffers contain a 64-bit unique number
    (struct _UnwinderThreadBuffer::seqNo) which could be used
    for this purpose.

Re (2) I'll need to think about how to do this, but basically
it's going to be the same style of behaviour as happens in 
unwind_thr_fn in UnwinderThread2.cpp, except in the sampling
thread: polling the unwinder thread until it has gone past the
sample in question, using an exponential backoff in sleeps so as
not to waste too much CPU.
Comment 21 Aaron Klotz [:aklotz] 2013-09-03 09:59:56 PDT
Created attachment 798936 [details] [diff] [review]
Patch rev 6

This revision is quite similar to the previous one, with a few changes:

- When we create a backtrace, we add it to a linked list on the PseudoStack, similar to (but distinct from) markers. In the signal handler, we add those pointers to the unwinder thread buffer using a 'B' profile entry.

- I left in the refactoring of unwinder thread code where I moved the meat of the unwinder thread into process_buffer(). When the next profiler sample occurs and the unwinder thread encounters a 'B' profile entry, it unwinds that entry into the backtrace's profile object by calling process_buffer() recursively (Note that this recursion depth has an upper bound of 2 process_buffer() calls).

- The profile object used for this purpose must also own its associated UnwinderThreadBuffer. The lifetime of the profile object is managed using reference counting, since it may be manipulated by both the thread that created it and the unwinder thread.

- If the unwinder thread hasn't processed the entry by the time it is needed for serialization, the stack is omitted from the profile (similar to what happens if a regular sample hasn't been unwound yet). I know that for get_backtrace to be most useful in general, we will need to know when the unwinder thread has finished processing the stack. I think that such improvements would be another suitable objective to follow up with in bug 899764.
Comment 22 Benoit Girard (:BenWa) 2013-09-04 15:43:01 PDT
Comment on attachment 798936 [details] [diff] [review]
Patch rev 6

Review of attachment 798936 [details] [diff] [review]:
-----------------------------------------------------------------

Here's some feedback but so far it's looking good. Still checking UnwinderThread*.

::: storage/src/TelemetryVFS.cpp
@@ +86,5 @@
>      uint32_t mainThread = NS_IsMainThread() ? 1 : 0;
>      Telemetry::AccumulateTimeDelta(static_cast<Telemetry::ID>(id + mainThread),
>                                     start, end);
>      if (evtFn && mainThread) {
> +      evtFn(start, end);

Did you intent to include this here?

::: tools/profiler/BreakpadSampler.cpp
@@ +311,5 @@
> +  SyncBuf* syncBuf = utb__acquire_sync_buffer(tlsStackTop.get());
> +  if (!syncBuf) {
> +    return;
> +  }
> +  SyncProfile* syncProfile = static_cast<SyncProfile*>(sample->threadProfile);

Can you implement this as sample->threadProfile->AsSyncProfile using a virtual function. This is a manual RTTI an avoids a static_cast.

::: tools/profiler/GeckoProfiler.h
@@ +112,5 @@
>  
> +class ProfilerBacktrace;
> +
> +// Immediately capture the current thread's call stack and return it
> +static inline ProfilerBacktrace* profiler_get_backtrace() { return nullptr; }

If we're exposing this publicly we need a way to delete it.

::: tools/profiler/ProfileEntry.cpp
@@ +92,5 @@
>    // and union variant fields, so the following was derived
>    // by looking through all the use points of TableTicker.cpp.
> +  //   mTagMarker (ProfilerMarker*) m
> +  //   mTagData   (const char*)  c,s
> +  //   mTagPtr    (void*)        d,l,L,B, S(start-of-stack)

Can you add name for the tags you're introducing like (start-of-stack). I know this is terrible and this needs to go away but until then let's keep it documented.

::: tools/profiler/ProfileEntry.h
@@ +60,5 @@
>  };
>  
>  typedef void (*IterateTagsCallback)(const ProfileEntry& entry, const char* tagStringData);
>  
> +class ThreadProfile : public ProfilerBacktrace

I don't know how I feel about making ThreadProfile a ProfilerBacktrace. It's an interesting idea and avoids having an intermediate ProfilerBacktrace representation. It might be a good idea as long as we only expose it as an opaque object externally without exposing the details of ThreadProfile.

::: tools/profiler/PseudoStack.h
@@ +102,5 @@
>  
>  class ProfilerMarkerPayload;
> +template<typename T>
> +class ProfilerLinkedList;
> +// class ProfilerMarkerLinkedList;

remove commented code

@@ +134,5 @@
>    int mGenID;
>  };
>  
> +typedef struct _UnwinderThreadBuffer UnwinderThreadBuffer;
> +struct SyncBuf

Can we get a better name and/or a class comment for this. It's not obvious what this is.

@@ +256,5 @@
> +class PendingSyncBufs
> +{
> +public:
> +  PendingSyncBufs()
> +    :mSignalLock(false)

style nit, space after :

@@ +266,5 @@
> +    MOZ_ASSERT(aBuff);
> +    mSignalLock = true;
> +    STORE_SEQUENCER();
> +    mSyncBufs.insert(aBuff);
> +    mSignalLock = false;

Umm, I think it's this possible for mSignalLock = false to be visible before insert()? I hope we don't have this problem elsewhere.

@@ +270,5 @@
> +    mSignalLock = false;
> +    STORE_SEQUENCER();
> +  }
> +
> +  SyncBufLinkedList* getSyncBufs()

// Called from a signal

::: tools/profiler/SyncProfile.h
@@ +13,5 @@
> +
> +class SyncProfile : public ThreadProfile
> +{
> +public:
> +  NS_INLINE_DECL_THREADSAFE_REFCOUNTING(SyncProfile)

Why does this need reference counting? I'd REALLY like to avoid this. So far all the lifetimes are clearly defined and haven't needed ref counting. This is a bad precedence to lose control over lifetimes. If its to share this object to the unwinder thread I think we can manage the lifetime without reference counting.

::: tools/profiler/TableTicker.cpp
@@ +618,5 @@
>      MOZ_ASSERT(false);
>      return;
>    }
>  
> +  ThreadProfile* threadProfile = new ThreadProfile("Temp", PROFILE_DEFAULT_ENTRY,

Should this not use the SyncProfile like GetBacktrace()? They should be able to share code via a helper GetBacktrace()

::: tools/profiler/UnwinderThread2.cpp
@@ +729,5 @@
> +  for (size_t i = 0; i < N_PROF_ENT_PAGES; i++)
> +    buff->entsPages[i] = ProfEntsPage_INVALID;
> +}
> +
> +struct SyncUnwinderThreadBuffer : public SyncBuf, public UnwinderThreadBuffer

Multiple inheritance? Can we avoid it please :)
Comment 23 Benoit Girard (:BenWa) 2013-09-17 15:16:38 PDT
Comment on attachment 798936 [details] [diff] [review]
Patch rev 6

Review of attachment 798936 [details] [diff] [review]:
-----------------------------------------------------------------

The UnwinderThread2 changes look good. Just want to look at the final revision once more but I'll look at it ASAP.
Comment 24 Aaron Klotz [:aklotz] 2013-09-19 13:28:58 PDT
Created attachment 807402 [details] [diff] [review]
Part 1: Allow markers to carry payload

Rebased atop Terrence's patch. Carrying forward r+
Comment 25 Benoit Girard (:BenWa) 2013-09-19 17:10:03 PDT
Comment on attachment 807402 [details] [diff] [review]
Part 1: Allow markers to carry payload

https://tbpl.mozilla.org/?tree=Try&rev=4ca6cdf96bb8
Comment 26 Benoit Girard (:BenWa) 2013-09-19 20:34:38 PDT
Waiting for inbound to re-open.
Comment 28 Wes Kocher (:KWierso) 2013-09-20 19:45:07 PDT
https://hg.mozilla.org/mozilla-central/rev/a10c5bd96395
Comment 29 Aaron Klotz [:aklotz] 2013-09-24 10:25:58 PDT
Created attachment 809298 [details] [diff] [review]
Part 2: Immediate sample current thread's stack (rev 7)

(In reply to Benoit Girard (:BenWa) from comment #22)
> ::: storage/src/TelemetryVFS.cpp
> @@ +86,5 @@
> >      uint32_t mainThread = NS_IsMainThread() ? 1 : 0;
> >      Telemetry::AccumulateTimeDelta(static_cast<Telemetry::ID>(id + mainThread),
> >                                     start, end);
> >      if (evtFn && mainThread) {
> > +      evtFn(start, end);
> 
> Did you intent to include this here?
Removed as those changes were made obsolete by bug 902587.

> 
> ::: tools/profiler/BreakpadSampler.cpp
> @@ +311,5 @@
> > +  SyncBuf* syncBuf = utb__acquire_sync_buffer(tlsStackTop.get());
> > +  if (!syncBuf) {
> > +    return;
> > +  }
> > +  SyncProfile* syncProfile = static_cast<SyncProfile*>(sample->threadProfile);
> 
> Can you implement this as sample->threadProfile->AsSyncProfile using a
> virtual function. This is a manual RTTI an avoids a static_cast.
Done.

> ::: tools/profiler/GeckoProfiler.h
> @@ +112,5 @@
> >  
> > +class ProfilerBacktrace;
> > +
> > +// Immediately capture the current thread's call stack and return it
> > +static inline ProfilerBacktrace* profiler_get_backtrace() { return nullptr; }
> 
> If we're exposing this publicly we need a way to delete it.
Done via profiler_free_backtrace()

> 
> ::: tools/profiler/ProfileEntry.cpp
> @@ +92,5 @@
> >    // and union variant fields, so the following was derived
> >    // by looking through all the use points of TableTicker.cpp.
> > +  //   mTagMarker (ProfilerMarker*) m
> > +  //   mTagData   (const char*)  c,s
> > +  //   mTagPtr    (void*)        d,l,L,B, S(start-of-stack)
> 
> Can you add name for the tags you're introducing like (start-of-stack). I
> know this is terrible and this needs to go away but until then let's keep it
> documented.
Done.

> ::: tools/profiler/ProfileEntry.h
> @@ +60,5 @@
> >  };
> >  
> >  typedef void (*IterateTagsCallback)(const ProfileEntry& entry, const char* tagStringData);
> >  
> > +class ThreadProfile : public ProfilerBacktrace
> 
> I don't know how I feel about making ThreadProfile a ProfilerBacktrace. It's
> an interesting idea and avoids having an intermediate ProfilerBacktrace
> representation. It might be a good idea as long as we only expose it as an
> opaque object externally without exposing the details of ThreadProfile.
I've reworked this a little bit such that now the ThreadProfile is a member of ProfilerBacktrace.

> 
> ::: tools/profiler/PseudoStack.h
> @@ +102,5 @@
> >  
> >  class ProfilerMarkerPayload;
> > +template<typename T>
> > +class ProfilerLinkedList;
> > +// class ProfilerMarkerLinkedList;
> 
> remove commented code
Done.

> 
> @@ +134,5 @@
> >    int mGenID;
> >  };
> >  
> > +typedef struct _UnwinderThreadBuffer UnwinderThreadBuffer;
> > +struct SyncBuf
> 
> Can we get a better name and/or a class comment for this. It's not obvious
> what this is.
I did s/SyncBuf/LinkedUWTBuffer/ and added a comment.

> 
> @@ +256,5 @@
> > +class PendingSyncBufs
> > +{
> > +public:
> > +  PendingSyncBufs()
> > +    :mSignalLock(false)
> 
> style nit, space after :
Done.

> @@ +266,5 @@
> > +    MOZ_ASSERT(aBuff);
> > +    mSignalLock = true;
> > +    STORE_SEQUENCER();
> > +    mSyncBufs.insert(aBuff);
> > +    mSignalLock = false;
> 
> Umm, I think it's this possible for mSignalLock = false to be visible before
> insert()? I hope we don't have this problem elsewhere.
Fixed.

> @@ +270,5 @@
> > +    mSignalLock = false;
> > +    STORE_SEQUENCER();
> > +  }
> > +
> > +  SyncBufLinkedList* getSyncBufs()
> 
> // Called from a signal
Done.

> ::: tools/profiler/SyncProfile.h
> @@ +13,5 @@
> > +
> > +class SyncProfile : public ThreadProfile
> > +{
> > +public:
> > +  NS_INLINE_DECL_THREADSAFE_REFCOUNTING(SyncProfile)
> 
> Why does this need reference counting? I'd REALLY like to avoid this. So far
> all the lifetimes are clearly defined and haven't needed ref counting. This
> is a bad precedence to lose control over lifetimes. If its to share this
> object to the unwinder thread I think we can manage the lifetime without
> reference counting.
I removed the reference counting. Now the SyncProfile knows whether it is
owned by a ProfilerBacktrace or not and can manage itself accordingly. The 
SyncProfile::mOwnerState variable tracks this state.

> ::: tools/profiler/TableTicker.cpp
> @@ +618,5 @@
> >      MOZ_ASSERT(false);
> >      return;
> >    }
> >  
> > +  ThreadProfile* threadProfile = new ThreadProfile("Temp", PROFILE_DEFAULT_ENTRY,
> 
> Should this not use the SyncProfile like GetBacktrace()? They should be able
> to share code via a helper GetBacktrace()
Done.
 
> ::: tools/profiler/UnwinderThread2.cpp
> @@ +729,5 @@
> > +  for (size_t i = 0; i < N_PROF_ENT_PAGES; i++)
> > +    buff->entsPages[i] = ProfEntsPage_INVALID;
> > +}
> > +
> > +struct SyncUnwinderThreadBuffer : public SyncBuf, public UnwinderThreadBuffer
> 
> Multiple inheritance? Can we avoid it please :)
Fixed.
Comment 30 Benoit Girard (:BenWa) 2013-09-24 12:09:03 PDT
Comment on attachment 809298 [details] [diff] [review]
Part 2: Immediate sample current thread's stack (rev 7)

Review of attachment 809298 [details] [diff] [review]:
-----------------------------------------------------------------

Looks great! Thanks for doing this.

::: tools/profiler/Makefile.in
@@ +11,5 @@
>    -I$(topsrcdir)/ipc/chromium/src \
>    -I$(topsrcdir)/toolkit/crashreporter/google-breakpad/src \
>    $(NULL)
>  
> +ifeq ($(OS_TARGET),Android)

Can you add a comment for that requires this?
Comment 31 Aaron Klotz [:aklotz] 2013-09-25 08:13:03 PDT
Created attachment 809922 [details] [diff] [review]
Part 2: Immediate sample current thread's stack

Added comment. Carrying forward r+.
Comment 33 :Ehsan Akhgari 2013-09-25 14:59:31 PDT
Comment on attachment 807402 [details] [diff] [review]
Part 1: Allow markers to carry payload

Review of attachment 807402 [details] [diff] [review]:
-----------------------------------------------------------------

::: tools/profiler/ProfileEntry.cpp
@@ +158,5 @@
>    , mPlatformData(aPlatform)
>    , mStackTop(aStackTop)
>  {
>    mEntries = new ProfileEntry[mEntrySize];
> +  mGeneration = 0;

You're leaving mPendingGenerationFlush uninitialized.

::: tools/profiler/ProfileEntry.h
@@ +20,5 @@
>  
>    // aTagData must not need release (i.e. be a string from the text segment)
>    ProfileEntry(char aTagName, const char *aTagData);
>    ProfileEntry(char aTagName, void *aTagPtr);
> +  ProfileEntry(char aTagName, ProfilerMarker *aTagMarker);

Nit: const.

@@ +32,5 @@
>    bool is_ent_hint();
>    bool is_ent(char tagName);
>    void* get_tagPtr();
>    void log();
> +  const ProfilerMarker* getMarker() {

Nit: const.

@@ +45,5 @@
>    union {
>      const char* mTagData;
>      char        mTagChars[sizeof(void*)];
>      void*       mTagPtr;
> +    ProfilerMarker* mTagMarker;

Nit: const.

@@ +83,5 @@
>    int ThreadId() const { return mThreadId; }
>  
>    PlatformData* GetPlatformData() { return mPlatformData; }
> +  int GetGenerationID() const { return mGeneration; }
> +  bool HasGenerationExpired(int aGenID) {

const

::: tools/profiler/ProfilerMarkers.h
@@ +36,5 @@
> +   */
> +  template<typename Builder>
> +  typename Builder::Object PreparePayload(Builder& b)
> +  {
> +    return preparePayload(b);

I've already told you that I don't like this code.  :-)

@@ +56,5 @@
> +
> +class gfxASurface;
> +class ProfilerMarkerImagePayload : public ProfilerMarkerPayload {
> +public:
> +  ProfilerMarkerImagePayload(gfxASurface *aImg);

Nit: explicit.

::: tools/profiler/PseudoStack.h
@@ +107,5 @@
> +class ThreadProfile;
> +class ProfilerMarker {
> +  friend class ProfilerMarkerLinkedList;
> +public:
> +  ProfilerMarker(const char* aMarkerName,

Nit: explicit.

@@ +132,5 @@
> +  ProfilerMarker* mNext;
> +  int mGenID;
> +};
> +
> +class ProfilerMarkerLinkedList {

Why are you reinventing mozilla::LinkedList?

@@ +142,5 @@
> +
> +  void insert(ProfilerMarker* elem);
> +  ProfilerMarker* popHead();
> +
> +  const ProfilerMarker* peek() {

I'm not sure how much nit-picking comments you'd like to have here, but in general the profiler code mixes and matches the Gecko naming convention of CamelCase with camelCase quite often and it makes it really hard to read this code if you are not closely familiar with it.

::: tools/profiler/platform.cpp
@@ +94,5 @@
>  }
>  
> +ProfilerMarker::ProfilerMarker(const char* aMarkerName,
> +    ProfilerMarkerPayload* aPayload)
> +  : mMarkerName(strdup(aMarkerName))

It would be nice if the name was copied in a static buffer in the object so that you wouldn't need two memory allocations.

@@ +101,5 @@
> +}
> +
> +ProfilerMarker::~ProfilerMarker() {
> +  free(mMarkerName);
> +  delete mPayload;

Hmm, so if I understand correctly, the caller of addMarker() should allocate a marker payload object on the heap and then give up the ownership down to the ProfilerMarker object?  That could use some documentation.
Comment 34 Wes Kocher (:KWierso) 2013-09-25 20:38:41 PDT
https://hg.mozilla.org/mozilla-central/rev/ff7c4578831c

Note You need to log in before you can comment on or make changes to this bug.