Last Comment Bug 766579 - Support stack intertwining (C++/JS/Labels)
: Support stack intertwining (C++/JS/Labels)
Status: RESOLVED FIXED
:
Product: Core
Classification: Components
Component: Gecko Profiler (show other bugs)
: unspecified
: x86 Mac OS X
: -- normal (vote)
: ---
Assigned To: Benoit Girard (:BenWa)
:
Mentors:
Depends on: 707308
Blocks: 761261
  Show dependency treegraph
 
Reported: 2012-06-20 08:35 PDT by Benoit Girard (:BenWa)
Modified: 2012-07-10 20:35 PDT (History)
4 users (show)
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---


Attachments
Part 1: Add SP param to NS_StackWalk (13.42 KB, patch)
2012-06-20 08:37 PDT, Benoit Girard (:BenWa)
no flags Details | Diff | Review
Part 1: Add SP param to NS_StackWalk (13.44 KB, patch)
2012-06-20 08:40 PDT, Benoit Girard (:BenWa)
no flags Details | Diff | Review
Part 2: Intertwine during sampling (5.62 KB, patch)
2012-06-20 15:45 PDT, Benoit Girard (:BenWa)
no flags Details | Diff | Review
Part 1: Add SP param to NS_StackWalk (13.82 KB, patch)
2012-06-22 13:15 PDT, Benoit Girard (:BenWa)
no flags Details | Diff | Review
Part 1: Add SP param to NS_StackWalk (13.89 KB, patch)
2012-06-22 15:20 PDT, Benoit Girard (:BenWa)
no flags Details | Diff | Review
Add SP param to NS_StackWalk (13.93 KB, patch)
2012-06-27 13:09 PDT, Benoit Girard (:BenWa)
dbaron: review+
Details | Diff | Review
Part 2: Intertwine during sampling (7.57 KB, patch)
2012-06-27 16:07 PDT, Benoit Girard (:BenWa)
jmuizelaar: review+
Details | Diff | Review
Part 1: Add SP param to NS_StackWalk (13.94 KB, patch)
2012-07-09 21:25 PDT, Benoit Girard (:BenWa)
bgirard: review+
Details | Diff | Review
Part 2: Intertwine during sampling (7.52 KB, patch)
2012-07-09 21:27 PDT, Benoit Girard (:BenWa)
bgirard: review+
Details | Diff | Review
Part 2: Intertwine during sampling (8.07 KB, patch)
2012-07-09 21:28 PDT, Benoit Girard (:BenWa)
bgirard: review+
Details | Diff | Review
Part 3: Follow up (2.36 KB, patch)
2012-07-10 12:30 PDT, Benoit Girard (:BenWa)
dbaron: review+
Details | Diff | Review

Description Benoit Girard (:BenWa) 2012-06-20 08:35:35 PDT
Currently when profiling we either operate in 'stackwalk' mode or 'pseudostack' mode. Full frame information is present in the C++ stack, while with bug 707308 we will get rich contextual information including JS and URI information.

The goal is to intertwine the two stacks using the stack pointer and bring this data together into a single stack.
Comment 1 Benoit Girard (:BenWa) 2012-06-20 08:37:33 PDT
Created attachment 634933 [details] [diff] [review]
Part 1: Add SP param to NS_StackWalk
Comment 2 Benoit Girard (:BenWa) 2012-06-20 08:40:42 PDT
Created attachment 634934 [details] [diff] [review]
Part 1: Add SP param to NS_StackWalk

Fixed a todo comment
Comment 3 :Ehsan Akhgari (busy, don't ask for review please) 2012-06-20 08:52:21 PDT
Comment on attachment 634934 [details] [diff] [review]
Part 1: Add SP param to NS_StackWalk

I think you should get dbaron's approval on the NS_StackWalk changes (although they look fine to me).
Comment 4 Benoit Girard (:BenWa) 2012-06-20 14:29:47 PDT
Comment on attachment 634934 [details] [diff] [review]
Part 1: Add SP param to NS_StackWalk

A bit more context why this is needed:

We keep two separate stack. By knowing what the SP are for each of the entry, the upcoming part 2 patch will be able to order them in a single stack when recording a sample.
Comment 5 Benoit Girard (:BenWa) 2012-06-20 15:44:17 PDT
Here's a few sample of it in action:
http://dl.dropbox.com/u/10523664/Screenshots/4x.png
http://dl.dropbox.com/u/10523664/Screenshots/4y.png
Comment 6 Benoit Girard (:BenWa) 2012-06-20 15:45:02 PDT
Created attachment 635102 [details] [diff] [review]
Part 2: Intertwine during sampling
Comment 7 David Baron :dbaron: ⌚️UTC-7 (review requests must explain patch) 2012-06-20 15:59:41 PDT
Could you describe (preferably in a code comment) how the stack pointer and code pointer pairs are supposed to be associated?  I can imagine two, or maybe four possible answers.  In particular:

1. A stack pointer should be an address on the stack that partitions the stack into pieces associated with different functions.  Is the stack pointer passed along with the PC of the caller (more likely, I'd think) or the PC of the callee?

2. I'd presume that you're defining the stack pointer such that any address greater than or equal to the stack pointer is associated with one stack frame, and any address less than the stack pointer is associated with the other.  (With no prejudice as to which is callee vs. caller, since it could be either way.)  I don't think it much matters which frame the saved PC (and maybe saved stack pointer) are associated with, though, but if it happens to be consistent you should perhaps define it in the comment as well.
Comment 8 Benoit Girard (:BenWa) 2012-06-20 16:18:59 PDT
(In reply to David Baron [:dbaron] from comment #7)
> Could you describe (preferably in a code comment) how the stack pointer and
> code pointer pairs are supposed to be associated?  I can imagine two, or
> maybe four possible answers.  In particular:
> 
> 1. A stack pointer should be an address on the stack that partitions the
> stack into pieces associated with different functions.  Is the stack pointer
> passed along with the PC of the caller (more likely, I'd think) or the PC of
> the callee?
> 
> 2. I'd presume that you're defining the stack pointer such that any address
> greater than or equal to the stack pointer is associated with one stack
> frame, and any address less than the stack pointer is associated with the
> other.  (With no prejudice as to which is callee vs. caller, since it could
> be either way.)  I don't think it much matters which frame the saved PC (and
> maybe saved stack pointer) are associated with, though, but if it happens to
> be consistent you should perhaps define it in the comment as well.

I'm defining it as follows. Note that the definition is extremely vague because libunwind guarantees only to provide the SP/IP of the frame while other implementation provide the SP. Note that the contract to the caller would be very vague and the implementation could later chance to specify something very specific (such as exactly the BP or exactly the SP) at a later time with breaking compatibility. The reason I did not do this is because 1) it is not required for my use case, 2) I'm not sure all the implementation are compatible enough to provide something specific.

The stack pointer is an address between the base address of the frame inclusive and it's calle's base address exclusive. If the function does not have a calle's it is between the base address and the 'sp' register. If the address is not known (or implemented) 0 is returned.

If you're satisfied with this property I'll post a revised patch with this comment.
Comment 9 Benoit Girard (:BenWa) 2012-06-20 16:19:52 PDT
could be changed without* breaking compatibility.
Comment 10 David Baron :dbaron: ⌚️UTC-7 (review requests must explain patch) 2012-06-20 16:24:31 PDT
I guess that seems ok with s/calle/callee/, though I'd hoped you'd at least be able to say it was at one end of the range or the other.
Comment 11 Benoit Girard (:BenWa) 2012-06-20 16:43:49 PDT
Discussed this on IRC, we need a strong definition otherwise we can't intertwine consistently if this properly isn't consistent.
Comment 12 Benoit Girard (:BenWa) 2012-06-22 13:15:19 PDT
Created attachment 635879 [details] [diff] [review]
Part 1: Add SP param to NS_StackWalk

It's now as:
// aSP will be the best approximation possible of the SP at the time
// we walk the stack. If no approximation can be made this value
// will be NULL.
Comment 13 Benoit Girard (:BenWa) 2012-06-22 15:20:56 PDT
Created attachment 635940 [details] [diff] [review]
Part 1: Add SP param to NS_StackWalk

The previous patch didn't build because an if block didn't use {} and I added a second statement to it without noticing.
Comment 14 Mozilla RelEng Bot 2012-06-22 17:15:25 PDT
Try run for dcdbe62dcf56 is complete.
Detailed breakdown of the results available here:
    https://tbpl.mozilla.org/?tree=Try&rev=dcdbe62dcf56
Results (out of 30 total builds):
    exception: 14
    success: 1
    failure: 15
Builds (or logs if builds failed) available at:
http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/b56girard@gmail.com-dcdbe62dcf56
Comment 15 Benoit Girard (:BenWa) 2012-06-22 17:22:12 PDT
(In reply to Mozilla RelEng Bot from comment #14)
> Try run for dcdbe62dcf56 is complete.
> Detailed breakdown of the results available here:
>     https://tbpl.mozilla.org/?tree=Try&rev=dcdbe62dcf56
> Results (out of 30 total builds):
>     exception: 14
>     success: 1
>     failure: 15
> Builds (or logs if builds failed) available at:
> http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/b56girard@gmail.
> com-dcdbe62dcf56

Someone aborted this job by accident, a new job is pending.
Comment 16 David Baron :dbaron: ⌚️UTC-7 (review requests must explain patch) 2012-06-26 13:32:56 PDT
(In reply to Benoit Girard (:BenWa) from comment #12)
> Created attachment 635879 [details] [diff] [review]
> Part 1: Add SP param to NS_StackWalk
> 
> It's now as:
> // aSP will be the best approximation possible of the SP at the time
> // we walk the stack. If no approximation can be made this value
> // will be NULL.

I don't see how this answers the question of which way the sp and pc are paired up.  Or did you decide not to go with actually providing a strong-enough definition that it's useful for intertwining reliably?
Comment 17 Benoit Girard (:BenWa) 2012-06-27 13:09:11 PDT
Created attachment 637227 [details] [diff] [review]
Add SP param to NS_StackWalk

Updated the comment as we discussed on IRC.
Comment 18 Benoit Girard (:BenWa) 2012-06-27 16:07:14 PDT
Created attachment 637301 [details] [diff] [review]
Part 2: Intertwine during sampling

I changed this patch to support some changes we made to be more flexible for JS frames.

Here's a screenshot of the Gecko Profiler addons responsiveness canvas graph drawing with these patches and the ones from bug 761261 when the panel is active:
https://dl.dropbox.com/u/10523664/Screenshots/58.png
Comment 19 Benoit Girard (:BenWa) 2012-07-07 17:10:03 PDT
Review ping. Bug 761261 will be landing shortly and this change is required to fully leverage JS profiling.
Comment 20 Benoit Girard (:BenWa) 2012-07-09 08:18:55 PDT
Bug 761261 landed last night. This is now the long pole for JS profiling with unwind information.
Comment 21 David Baron :dbaron: ⌚️UTC-7 (review requests must explain patch) 2012-07-09 14:29:55 PDT
Comment on attachment 637227 [details] [diff] [review]
Add SP param to NS_StackWalk

nsStackWalk.h:

>+// aSP will be the best approximation possible of what the SP will be
>+// pointing to when the execution returns to executing that frame.

Maybe clearer to say "returns to executing at that PC"?

And definitely "what the SP will be" -> "what the stack pointer will be".
(and rewrap)

nsStackWalk.cpp:

In the windows code, it probably would have been better to allocate
the SPs and the PCs in a single array than in 2, but I guess this is ok.
(i.e., have a struct with 2 void*, and allocate an array of structs).
Might be nice to do as a followup, though.

In FramePointerStackWalk, no need to over-parenthesize.  That said, if
you wanted to be exact I think you'd want to use bp+3 for the cases
in the #if of the ifdef a few lines above, and bp+2 for the others.
(This could be refactored by making the #ifdef increment bp.)  Given
that it's that easy (if you agree that's correct), I think it's worth
doing.

>+    // TODO Use something like 'unw_get_reg(&cursor, UNW_REG_SP, &sp)' to get sp

Perhaps this should refer to _Unwind_GetGR() instead?  We're using
unwind.h (part of gcc), not libunwind.h (a separate library).


r=dbaron with those things fixed
Comment 22 David Baron :dbaron: ⌚️UTC-7 (review requests must explain patch) 2012-07-09 14:49:23 PDT
Also, merging with bug 767479 requires some care.
Comment 23 :Ehsan Akhgari (busy, don't ask for review please) 2012-07-09 15:26:59 PDT
Comment on attachment 637227 [details] [diff] [review]
Add SP param to NS_StackWalk

Review of attachment 637227 [details] [diff] [review]:
-----------------------------------------------------------------

::: xpcom/base/nsStackWalk.cpp
@@ +515,5 @@
>          if (data.pc_count > data.pc_size) {
>              data.pcs = (void**) malloc(data.pc_count * sizeof(void*));
>              data.pc_size = data.pc_count;
>              data.pc_count = 0;
> +            data.sps = (void**) malloc(data.sp_count * sizeof(void*));

Please use _alloca here as well.

@@ +539,5 @@
>      if (data.pc_size > ArrayLength(local_pcs))
>          free(data.pcs);
>  
> +    if (data.sp_size > ArrayLength(local_sps))
> +        free(data.sps);

And then you can get rid of this whole block!
Comment 24 Jeff Muizelaar [:jrmuizel] 2012-07-09 16:29:16 PDT
Comment on attachment 637301 [details] [diff] [review]
Part 2: Intertwine during sampling

Review of attachment 637301 [details] [diff] [review]:
-----------------------------------------------------------------

Do well.

::: tools/profiler/TableTicker.cpp
@@ +501,5 @@
> +    size_t strLen = strlen(sampleLabel) + 1;
> +    for (size_t j = 0; j < strLen;) {
> +      // Store as many characters in the void* as the platform allows
> +      char text[sizeof(void*)/sizeof(char)];
> +      for (size_t pos = 0; pos < sizeof(void*)/sizeof(char) && j+pos < strLen; pos++) {

- sizeof(char)

@@ +532,5 @@
>  
>  #ifdef USE_NS_STACKWALK
>  typedef struct {
>    void** array;
> +  void** sparray;

sp_array

@@ +587,1 @@
>      for (size_t i = array.count; i > 0; --i) {

// interleave the pseudoStack
add explanation of what's happening

@@ +592,5 @@
> +          break;
> +
> +        //if (entry.isCopyLabel()) {
> +          addProfileEntry(stack, aProfile, pseudoStackPos);
> +        //}

??

@@ +605,5 @@
> +            break;
> +
> +          addProfileEntry(stack, aProfile, pseudoStackPos);
> +          pseudoStackPos++;
> +        }

Simplify
Comment 25 Benoit Girard (:BenWa) 2012-07-09 21:25:30 PDT
Created attachment 640498 [details] [diff] [review]
Part 1: Add SP param to NS_StackWalk

Carry forward r=dbaron
Comment 26 Benoit Girard (:BenWa) 2012-07-09 21:27:32 PDT
Created attachment 640499 [details] [diff] [review]
Part 2: Intertwine during sampling

Carry forward r=jmuizel

Code is much simpler, fixed sizeof(char), and added a big fat over the top comment :)
Comment 27 Benoit Girard (:BenWa) 2012-07-09 21:28:42 PDT
Created attachment 640500 [details] [diff] [review]
Part 2: Intertwine during sampling
Comment 29 David Baron :dbaron: ⌚️UTC-7 (review requests must explain patch) 2012-07-10 12:10:26 PDT
What about these comments:

(In reply to David Baron [:dbaron] from comment #21)
> In FramePointerStackWalk, no need to over-parenthesize.  That said, if
> you wanted to be exact I think you'd want to use bp+3 for the cases
> in the #if of the ifdef a few lines above, and bp+2 for the others.
> (This could be refactored by making the #ifdef increment bp.)  Given
> that it's that easy (if you agree that's correct), I think it's worth
> doing.
> 
> >+    // TODO Use something like 'unw_get_reg(&cursor, UNW_REG_SP, &sp)' to get sp
> 
> Perhaps this should refer to _Unwind_GetGR() instead?  We're using
> unwind.h (part of gcc), not libunwind.h (a separate library).

And also now that you've done this:

(In reply to Ehsan Akhgari [:ehsan] from comment #23)
> ::: xpcom/base/nsStackWalk.cpp
> @@ +515,5 @@
> >          if (data.pc_count > data.pc_size) {
> >              data.pcs = (void**) malloc(data.pc_count * sizeof(void*));
> >              data.pc_size = data.pc_count;
> >              data.pc_count = 0;
> > +            data.sps = (void**) malloc(data.sp_count * sizeof(void*));
> 
> Please use _alloca here as well.

You *have* to do this as well:

> @@ +539,5 @@
> >      if (data.pc_size > ArrayLength(local_pcs))
> >          free(data.pcs);
> >  
> > +    if (data.sp_size > ArrayLength(local_sps))
> > +        free(data.sps);
> 
> And then you can get rid of this whole block!
Comment 30 Benoit Girard (:BenWa) 2012-07-10 12:20:06 PDT
I fixed the malloc and _Unwind_GetGR() but missed removing the block, bp and didn't merge correctly with ehsan new branch from bug 767479. 

Follow up coming right now. Really sorry about that :(
Comment 31 Benoit Girard (:BenWa) 2012-07-10 12:30:47 PDT
Created attachment 640722 [details] [diff] [review]
Part 3: Follow up

Again, sorry about that.
Comment 32 David Baron :dbaron: ⌚️UTC-7 (review requests must explain patch) 2012-07-10 13:41:06 PDT
Comment on attachment 640722 [details] [diff] [review]
Part 3: Follow up

r=dbaron, but you should also fix the unw_get_reg comment (I don't see it fixed in https://hg.mozilla.org/integration/mozilla-inbound/rev/461dd0c4f2b7 , at least.)
Comment 35 Benoit Girard (:BenWa) 2012-07-10 20:35:28 PDT
\o/ Merged in time for nightly. Thanks Ryan!

Note You need to log in before you can comment on or make changes to this bug.