The default bug view has changed. See this FAQ.

Support stack intertwining (C++/JS/Labels)

RESOLVED FIXED

Status

()

Core
Gecko Profiler
RESOLVED FIXED
5 years ago
5 years ago

People

(Reporter: BenWa, Assigned: BenWa)

Tracking

Firefox Tracking Flags

(Not tracked)

Details

Attachments

(3 attachments, 8 obsolete attachments)

(Assignee)

Description

5 years ago
Currently when profiling we either operate in 'stackwalk' mode or 'pseudostack' mode. Full frame information is present in the C++ stack, while with bug 707308 we will get rich contextual information including JS and URI information.

The goal is to intertwine the two stacks using the stack pointer and bring this data together into a single stack.
(Assignee)

Comment 1

5 years ago
Created attachment 634933 [details] [diff] [review]
Part 1: Add SP param to NS_StackWalk
Assignee: nobody → bgirard
Status: NEW → ASSIGNED
Attachment #634933 - Flags: feedback?
(Assignee)

Updated

5 years ago
Attachment #634933 - Flags: feedback? → feedback?(jmuizelaar)
(Assignee)

Comment 2

5 years ago
Created attachment 634934 [details] [diff] [review]
Part 1: Add SP param to NS_StackWalk

Fixed a todo comment
Attachment #634933 - Attachment is obsolete: true
Attachment #634933 - Flags: feedback?(jmuizelaar)
Attachment #634934 - Flags: feedback?(jmuizelaar)
Comment on attachment 634934 [details] [diff] [review]
Part 1: Add SP param to NS_StackWalk

I think you should get dbaron's approval on the NS_StackWalk changes (although they look fine to me).
(Assignee)

Updated

5 years ago
Depends on: 707308
(Assignee)

Comment 4

5 years ago
Comment on attachment 634934 [details] [diff] [review]
Part 1: Add SP param to NS_StackWalk

A bit more context why this is needed:

We keep two separate stack. By knowing what the SP are for each of the entry, the upcoming part 2 patch will be able to order them in a single stack when recording a sample.
Attachment #634934 - Flags: review?(dbaron)
(Assignee)

Comment 5

5 years ago
Here's a few sample of it in action:
http://dl.dropbox.com/u/10523664/Screenshots/4x.png
http://dl.dropbox.com/u/10523664/Screenshots/4y.png
Blocks: 761261
(Assignee)

Comment 6

5 years ago
Created attachment 635102 [details] [diff] [review]
Part 2: Intertwine during sampling
Attachment #635102 - Flags: review?(jmuizelaar)
Could you describe (preferably in a code comment) how the stack pointer and code pointer pairs are supposed to be associated?  I can imagine two, or maybe four possible answers.  In particular:

1. A stack pointer should be an address on the stack that partitions the stack into pieces associated with different functions.  Is the stack pointer passed along with the PC of the caller (more likely, I'd think) or the PC of the callee?

2. I'd presume that you're defining the stack pointer such that any address greater than or equal to the stack pointer is associated with one stack frame, and any address less than the stack pointer is associated with the other.  (With no prejudice as to which is callee vs. caller, since it could be either way.)  I don't think it much matters which frame the saved PC (and maybe saved stack pointer) are associated with, though, but if it happens to be consistent you should perhaps define it in the comment as well.
(Assignee)

Comment 8

5 years ago
(In reply to David Baron [:dbaron] from comment #7)
> Could you describe (preferably in a code comment) how the stack pointer and
> code pointer pairs are supposed to be associated?  I can imagine two, or
> maybe four possible answers.  In particular:
> 
> 1. A stack pointer should be an address on the stack that partitions the
> stack into pieces associated with different functions.  Is the stack pointer
> passed along with the PC of the caller (more likely, I'd think) or the PC of
> the callee?
> 
> 2. I'd presume that you're defining the stack pointer such that any address
> greater than or equal to the stack pointer is associated with one stack
> frame, and any address less than the stack pointer is associated with the
> other.  (With no prejudice as to which is callee vs. caller, since it could
> be either way.)  I don't think it much matters which frame the saved PC (and
> maybe saved stack pointer) are associated with, though, but if it happens to
> be consistent you should perhaps define it in the comment as well.

I'm defining it as follows. Note that the definition is extremely vague because libunwind guarantees only to provide the SP/IP of the frame while other implementation provide the SP. Note that the contract to the caller would be very vague and the implementation could later chance to specify something very specific (such as exactly the BP or exactly the SP) at a later time with breaking compatibility. The reason I did not do this is because 1) it is not required for my use case, 2) I'm not sure all the implementation are compatible enough to provide something specific.

The stack pointer is an address between the base address of the frame inclusive and it's calle's base address exclusive. If the function does not have a calle's it is between the base address and the 'sp' register. If the address is not known (or implemented) 0 is returned.

If you're satisfied with this property I'll post a revised patch with this comment.
(Assignee)

Comment 9

5 years ago
could be changed without* breaking compatibility.
I guess that seems ok with s/calle/callee/, though I'd hoped you'd at least be able to say it was at one end of the range or the other.
(Assignee)

Comment 11

5 years ago
Discussed this on IRC, we need a strong definition otherwise we can't intertwine consistently if this properly isn't consistent.
(Assignee)

Comment 12

5 years ago
Created attachment 635879 [details] [diff] [review]
Part 1: Add SP param to NS_StackWalk

It's now as:
// aSP will be the best approximation possible of the SP at the time
// we walk the stack. If no approximation can be made this value
// will be NULL.
Attachment #634934 - Attachment is obsolete: true
Attachment #634934 - Flags: review?(dbaron)
Attachment #634934 - Flags: feedback?(jmuizelaar)
(Assignee)

Updated

5 years ago
Attachment #635879 - Flags: review?(dbaron)
(Assignee)

Comment 13

5 years ago
Created attachment 635940 [details] [diff] [review]
Part 1: Add SP param to NS_StackWalk

The previous patch didn't build because an if block didn't use {} and I added a second statement to it without noticing.
Attachment #635940 - Flags: review?(dbaron)

Comment 14

5 years ago
Try run for dcdbe62dcf56 is complete.
Detailed breakdown of the results available here:
    https://tbpl.mozilla.org/?tree=Try&rev=dcdbe62dcf56
Results (out of 30 total builds):
    exception: 14
    success: 1
    failure: 15
Builds (or logs if builds failed) available at:
http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/b56girard@gmail.com-dcdbe62dcf56
(Assignee)

Comment 15

5 years ago
(In reply to Mozilla RelEng Bot from comment #14)
> Try run for dcdbe62dcf56 is complete.
> Detailed breakdown of the results available here:
>     https://tbpl.mozilla.org/?tree=Try&rev=dcdbe62dcf56
> Results (out of 30 total builds):
>     exception: 14
>     success: 1
>     failure: 15
> Builds (or logs if builds failed) available at:
> http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/b56girard@gmail.
> com-dcdbe62dcf56

Someone aborted this job by accident, a new job is pending.
(In reply to Benoit Girard (:BenWa) from comment #12)
> Created attachment 635879 [details] [diff] [review]
> Part 1: Add SP param to NS_StackWalk
> 
> It's now as:
> // aSP will be the best approximation possible of the SP at the time
> // we walk the stack. If no approximation can be made this value
> // will be NULL.

I don't see how this answers the question of which way the sp and pc are paired up.  Or did you decide not to go with actually providing a strong-enough definition that it's useful for intertwining reliably?
(Assignee)

Updated

5 years ago
Attachment #635879 - Attachment is obsolete: true
Attachment #635879 - Flags: review?(dbaron)
(Assignee)

Comment 17

5 years ago
Created attachment 637227 [details] [diff] [review]
Add SP param to NS_StackWalk

Updated the comment as we discussed on IRC.
Attachment #637227 - Flags: review?(dbaron)
(Assignee)

Updated

5 years ago
Attachment #635940 - Attachment is obsolete: true
Attachment #635940 - Flags: review?(dbaron)
(Assignee)

Comment 18

5 years ago
Created attachment 637301 [details] [diff] [review]
Part 2: Intertwine during sampling

I changed this patch to support some changes we made to be more flexible for JS frames.

Here's a screenshot of the Gecko Profiler addons responsiveness canvas graph drawing with these patches and the ones from bug 761261 when the panel is active:
https://dl.dropbox.com/u/10523664/Screenshots/58.png
Attachment #635102 - Attachment is obsolete: true
Attachment #635102 - Flags: review?(jmuizelaar)
Attachment #637301 - Flags: review?(jmuizelaar)
(Assignee)

Comment 19

5 years ago
Review ping. Bug 761261 will be landing shortly and this change is required to fully leverage JS profiling.
(Assignee)

Comment 20

5 years ago
Bug 761261 landed last night. This is now the long pole for JS profiling with unwind information.
Comment on attachment 637227 [details] [diff] [review]
Add SP param to NS_StackWalk

nsStackWalk.h:

>+// aSP will be the best approximation possible of what the SP will be
>+// pointing to when the execution returns to executing that frame.

Maybe clearer to say "returns to executing at that PC"?

And definitely "what the SP will be" -> "what the stack pointer will be".
(and rewrap)

nsStackWalk.cpp:

In the windows code, it probably would have been better to allocate
the SPs and the PCs in a single array than in 2, but I guess this is ok.
(i.e., have a struct with 2 void*, and allocate an array of structs).
Might be nice to do as a followup, though.

In FramePointerStackWalk, no need to over-parenthesize.  That said, if
you wanted to be exact I think you'd want to use bp+3 for the cases
in the #if of the ifdef a few lines above, and bp+2 for the others.
(This could be refactored by making the #ifdef increment bp.)  Given
that it's that easy (if you agree that's correct), I think it's worth
doing.

>+    // TODO Use something like 'unw_get_reg(&cursor, UNW_REG_SP, &sp)' to get sp

Perhaps this should refer to _Unwind_GetGR() instead?  We're using
unwind.h (part of gcc), not libunwind.h (a separate library).


r=dbaron with those things fixed
Attachment #637227 - Flags: review?(dbaron) → review+
Also, merging with bug 767479 requires some care.
Comment on attachment 637227 [details] [diff] [review]
Add SP param to NS_StackWalk

Review of attachment 637227 [details] [diff] [review]:
-----------------------------------------------------------------

::: xpcom/base/nsStackWalk.cpp
@@ +515,5 @@
>          if (data.pc_count > data.pc_size) {
>              data.pcs = (void**) malloc(data.pc_count * sizeof(void*));
>              data.pc_size = data.pc_count;
>              data.pc_count = 0;
> +            data.sps = (void**) malloc(data.sp_count * sizeof(void*));

Please use _alloca here as well.

@@ +539,5 @@
>      if (data.pc_size > ArrayLength(local_pcs))
>          free(data.pcs);
>  
> +    if (data.sp_size > ArrayLength(local_sps))
> +        free(data.sps);

And then you can get rid of this whole block!
Comment on attachment 637301 [details] [diff] [review]
Part 2: Intertwine during sampling

Review of attachment 637301 [details] [diff] [review]:
-----------------------------------------------------------------

Do well.

::: tools/profiler/TableTicker.cpp
@@ +501,5 @@
> +    size_t strLen = strlen(sampleLabel) + 1;
> +    for (size_t j = 0; j < strLen;) {
> +      // Store as many characters in the void* as the platform allows
> +      char text[sizeof(void*)/sizeof(char)];
> +      for (size_t pos = 0; pos < sizeof(void*)/sizeof(char) && j+pos < strLen; pos++) {

- sizeof(char)

@@ +532,5 @@
>  
>  #ifdef USE_NS_STACKWALK
>  typedef struct {
>    void** array;
> +  void** sparray;

sp_array

@@ +587,1 @@
>      for (size_t i = array.count; i > 0; --i) {

// interleave the pseudoStack
add explanation of what's happening

@@ +592,5 @@
> +          break;
> +
> +        //if (entry.isCopyLabel()) {
> +          addProfileEntry(stack, aProfile, pseudoStackPos);
> +        //}

??

@@ +605,5 @@
> +            break;
> +
> +          addProfileEntry(stack, aProfile, pseudoStackPos);
> +          pseudoStackPos++;
> +        }

Simplify
Attachment #637301 - Flags: review?(jmuizelaar) → review+
(Assignee)

Comment 25

5 years ago
Created attachment 640498 [details] [diff] [review]
Part 1: Add SP param to NS_StackWalk

Carry forward r=dbaron
Attachment #637227 - Attachment is obsolete: true
Attachment #640498 - Flags: review+
(Assignee)

Comment 26

5 years ago
Created attachment 640499 [details] [diff] [review]
Part 2: Intertwine during sampling

Carry forward r=jmuizel

Code is much simpler, fixed sizeof(char), and added a big fat over the top comment :)
Attachment #637301 - Attachment is obsolete: true
Attachment #640499 - Flags: review+
(Assignee)

Comment 27

5 years ago
Created attachment 640500 [details] [diff] [review]
Part 2: Intertwine during sampling
Attachment #640499 - Attachment is obsolete: true
Attachment #640500 - Flags: review+
(Assignee)

Comment 28

5 years ago
https://hg.mozilla.org/integration/mozilla-inbound/rev/461dd0c4f2b7
https://hg.mozilla.org/integration/mozilla-inbound/rev/83fc31dccd91
What about these comments:

(In reply to David Baron [:dbaron] from comment #21)
> In FramePointerStackWalk, no need to over-parenthesize.  That said, if
> you wanted to be exact I think you'd want to use bp+3 for the cases
> in the #if of the ifdef a few lines above, and bp+2 for the others.
> (This could be refactored by making the #ifdef increment bp.)  Given
> that it's that easy (if you agree that's correct), I think it's worth
> doing.
> 
> >+    // TODO Use something like 'unw_get_reg(&cursor, UNW_REG_SP, &sp)' to get sp
> 
> Perhaps this should refer to _Unwind_GetGR() instead?  We're using
> unwind.h (part of gcc), not libunwind.h (a separate library).

And also now that you've done this:

(In reply to Ehsan Akhgari [:ehsan] from comment #23)
> ::: xpcom/base/nsStackWalk.cpp
> @@ +515,5 @@
> >          if (data.pc_count > data.pc_size) {
> >              data.pcs = (void**) malloc(data.pc_count * sizeof(void*));
> >              data.pc_size = data.pc_count;
> >              data.pc_count = 0;
> > +            data.sps = (void**) malloc(data.sp_count * sizeof(void*));
> 
> Please use _alloca here as well.

You *have* to do this as well:

> @@ +539,5 @@
> >      if (data.pc_size > ArrayLength(local_pcs))
> >          free(data.pcs);
> >  
> > +    if (data.sp_size > ArrayLength(local_sps))
> > +        free(data.sps);
> 
> And then you can get rid of this whole block!
(Assignee)

Comment 30

5 years ago
I fixed the malloc and _Unwind_GetGR() but missed removing the block, bp and didn't merge correctly with ehsan new branch from bug 767479. 

Follow up coming right now. Really sorry about that :(
(Assignee)

Comment 31

5 years ago
Created attachment 640722 [details] [diff] [review]
Part 3: Follow up

Again, sorry about that.
Attachment #640722 - Flags: review?(dbaron)
(Assignee)

Updated

5 years ago
Whiteboard: [leave open]
Comment on attachment 640722 [details] [diff] [review]
Part 3: Follow up

r=dbaron, but you should also fix the unw_get_reg comment (I don't see it fixed in https://hg.mozilla.org/integration/mozilla-inbound/rev/461dd0c4f2b7 , at least.)
Attachment #640722 - Flags: review?(dbaron) → review+
(Assignee)

Comment 33

5 years ago
https://hg.mozilla.org/integration/mozilla-inbound/rev/b824d4c9e575
https://hg.mozilla.org/mozilla-central/rev/461dd0c4f2b7
https://hg.mozilla.org/mozilla-central/rev/83fc31dccd91
https://hg.mozilla.org/mozilla-central/rev/b824d4c9e575
(Assignee)

Comment 35

5 years ago
\o/ Merged in time for nightly. Thanks Ryan!
Whiteboard: [leave open]
(Assignee)

Updated

5 years ago
Status: ASSIGNED → RESOLVED
Last Resolved: 5 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.