Closed Bug 894264 Opened 11 years ago Closed 11 years ago

Breakpad Stack scan: don't generate frames we aren't going to use

Categories

(Core :: Gecko Profiler, defect)

x86_64
Linux
defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla26

People

(Reporter: jseward, Assigned: jseward)

References

Details

Attachments

(1 file, 1 obsolete file)

SPS's connection to Breakpad allows use of a user-specified number
of stack-scanned frames in stacks, by default zero, since stack-scanned
frames in profiles are generally bad news.

This is done by allowing Breakpad to unwind as much as it can, and
then post processing the stacks, truncating them after the allowed
number of scanned frames have been seen.  This has the advantage of
not modifying the Breakpad interface, but is inefficient because 
stack scanning is expensive.  Profiling results on x86_64-linux show
that a call to GetCallerFrameByStackScan costs about 34000 instructions
(even after the improvements of bug 892774), whereas a call to 
GetCallerFrameByCFIFrameInfo costs about 5400 instructions.

So we're wasting a lot of unwind time generating scanned frames which
in most cases we're not going to use.
Attached patch Patch, v1 (obsolete) — Splinter Review
(In reply to Julian Seward from comment #1)
> Created attachment 776779 [details] [diff] [review]
> Patch, v1

* adds a new class FrameAuditor which allows the caller of
  Stackwalker::Walk to inspect each frame as it is generated and
  request unwinding be halted, on completely arbitrary condition

* Uses the above to limit unwinding in the case where we want
  to limit the number of scanned frames, but that limit is > 0

* adds a new static bool stack_scan_allowed_ to Stackwalker, which
  is used in the common case of zero scanned frames allowed, to
  completely avoid generating them.  Plumb this through to all the
  StackwalkerARCH::GetCallerFrame functions.

* use all the above in do_breakpad_unwind_Buffer to minimise
  stack-scan cost for all settings of sUnwindStackScan.

* (not directly related, but) make configuration variables
  sUnwindStackScan and sUnwindInterval be unsigned, since negative
  values for either are meaningless.
Attachment #776779 - Flags: review?(ted)
Comment on attachment 776779 [details] [diff] [review]
Patch, v1

Review of attachment 776779 [details] [diff] [review]:
-----------------------------------------------------------------

I thought I hit "submit" on a review of this the other day but apparently not. :-(

You've implemented two separate mechanisms here--the more general FrameAuditor, and the simpler stack_scan_allowed. Do we really need both? I'd be a lot happier if you settled on one and we used that instead of adding multiple ways to do the same thing. I realize that there are tradeoffs in both approaches, and it sounded like the stack_scan_allowed method was more useful (to prevent any stack scans from happening). Do you have a use case for the FrameAuditor API given the stack_scan_allowed API?

::: toolkit/crashreporter/google-breakpad/src/google_breakpad/processor/stackwalker.h
@@ +104,5 @@
>  
> +  static void set_stack_scan_allowed(bool stack_scan_allowed) {
> +    stack_scan_allowed_ = stack_scan_allowed;
> +  }
> +  static bool get_stack_scan_allowed() { return stack_scan_allowed_; }

I don't think this needs to be static, but I guess this is modeled on set_max_frames above. I suppose you could argue either way. The max_frames code is written this way because when using MinidumpProcessor you don't actually get to see the Stackwalker objects that are instantiated, so maybe this makes sense as well.

::: toolkit/crashreporter/google-breakpad/src/processor/stackwalker.cc
@@ +140,5 @@
>        break;
>      }
>  
>      // Get the next frame and take ownership.
> +    frame.reset(GetCallerFrame(stack, stack_scan_allowed_));

Do you need to pass stack_scan_allowed_ in? Everything that's going to reference it is a Stackwalker subclass and could just reference the class static anyway, right?
Attachment #776779 - Flags: review?(ted)
(In reply to Ted Mielczarek [:ted.mielczarek] from comment #3)

Thanks for the review.

> You've implemented two separate mechanisms here--the more general
> FrameAuditor, and the simpler stack_scan_allowed. Do we really need both?
> I'd be a lot happier if you settled on one and we used that instead of
> adding multiple ways to do the same thing.

It's true that the FrameAuditor mechanism is more general than we
need.  It doesn't overlap exactly with the stack_scan_allowed
mechanism though.

The problem I was trying to solve is to be able to set the
stack-scanned-frame limit to any number from zero to infinity, and
have this done with zero unnecessary overhead.  For x86_64-linux it is
indeed useful to be able to select 0, 1, 2, 3, etc max-stack-scanned
frames, so there is a use case for more than just the
"none/any-number" distinction.

What complicates it is that a limit of zero is a special case -- in
which we know that we can always skip the call to
FindCallerByStackScan.  For all other limits (> 0), we have to call
FindCallerByStackScan, examine the result using the FrameAuditor,
which counts stack-scanned frames, and say "no more, now" when the
limit is reached.  Using the FrameAuditor only in the zero-limit case
would require one call into FindCallerByStackScan which is
unnecessary.

But yes, it is overly general.  I'll try to redo it just by passing
around counters of the number of stack scanned frames so far seen, and
allowed.


> Do you need to pass stack_scan_allowed_ in? Everything that's going to
> reference it is a Stackwalker subclass and could just reference the class
> static anyway, right?

Maybe; I'll check.  These global config variables might become a
problem if we ever make breakpad be multithreaded, so as to increase
unwinding throughput.  On the whole I'm not enthusiastic about them,
but MT-breakpad is pie-in-the-sky department at this point.
Perhaps you could change set_stack_scan_allowed(bool) to max_stack_scan_frames(int)? Then you could handle the "zero scanning allowed" case as currently, but also nudge it higher without introducing the separate FrameAuditor interface.
Attached patch Patch, v3Splinter Review
Respin, which has the same performance characteristics but is less
intrusive.  Changes compared to v1:

* Remove FrameAuditor completely

* Remove set_stack_scan_allowed(bool)
  and replace by set_max_frames_dubious(uint32), per comment 3

* Remove unrelated change to config variables sUnwindStackScan and
  sUnwindInterval described in comment 2.  These ought to be fixed,
  but this is not the place for that.

One item of some doubt .. I was originally going to call it
set_max_frames_scanned(uint32).  But it actually limits not only
scanned frames but also FRAME_TRUST_CFI_SCAN and FRAME_TRUST_NONE.
Hence I decided to call it "dubious" instead.  I can change the name
back to "scanned" if you prefer.


> > +  static bool get_stack_scan_allowed() { return stack_scan_allowed_; }
> 
> I don't think this needs to be static,

This is gone now, so .. moot.


> > +    frame.reset(GetCallerFrame(stack, stack_scan_allowed_));
>
> Do you need to pass stack_scan_allowed_ in? Everything that's going to
> reference it is a Stackwalker subclass and could just reference the
> class static anyway, right?

I looked into this.  That might work if the yes/no decision for stack
scanned frames was independent of the number of frames we've already
seen, in any given unwind.  But it isn't -- Stackwalker::Walk has to
monitor how many scanned frames have been seen so far and decide on
stack_scan_allowed_ for each iteration of the unwind loop.

We could conceivably remove some (but not all) of the passing of this
bool by adding it as a field in the CallStack object passed to the
GetCallerFrame implementations.  That seems pretty unclean though,
since CallStack appears to be designed to hold the frames and nothing
else.
Attachment #776779 - Attachment is obsolete: true
Attachment #788088 - Flags: review?(ted)
Comment on attachment 788088 [details] [diff] [review]
Patch, v3

Review of attachment 788088 [details] [diff] [review]:
-----------------------------------------------------------------

Thanks, this looks a lot nicer.

::: toolkit/crashreporter/google-breakpad/src/google_breakpad/processor/stackwalker.h
@@ +93,5 @@
>      max_frames_set_ = true;
>    }
>    static uint32_t max_frames() { return max_frames_; }
>  
> +  static void set_max_frames_dubious(uint32_t max_frames_dubious) {

I don't think calling it "scanned" is unreasonable, since in practice the only way to get these frames is by stack scanning. (CFI_SCAN is just a scan after a CFI unwind.)

That being said, I don't hate this naming, so if you feel strongly I can live with it.

::: toolkit/crashreporter/google-breakpad/src/processor/stackwalker.cc
@@ +88,5 @@
>    // no more.
>  
> +  // Keep track of the number of dubious frames seen so far, as the
> +  // caller may have set a limit.
> +  uint32_t n_dubious_frames = 0;

It's not very Googley to name something with an n_ prefix. I would either use "dubious_frames" or "dubious_frame_count".

@@ +130,5 @@
>      }
>  
> +    // Keep track of the number of dubious frames so far.
> +    switch (frame.get()->trust) {
> +       case StackFrame::FRAME_TRUST_NONE:

These ought not to exist in practice, but it's fine to lump this in here.

@@ +131,5 @@
>  
> +    // Keep track of the number of dubious frames so far.
> +    switch (frame.get()->trust) {
> +       case StackFrame::FRAME_TRUST_NONE:
> +       case StackFrame::FRAME_TRUST_SCAN: 

nit: trailing whitespace

::: tools/profiler/UnwinderThread2.cpp
@@ +1937,5 @@
>    sw->set_max_frames(256);
>  
> +  // Set the max number of scanned or otherwise dubious frames
> +  // to the user specified limit
> +  sw->set_max_frames_dubious((sUnwindStackScan > 256) ? 256 

nit: trailing whitespace
Attachment #788088 - Flags: review?(ted) → review+
https://hg.mozilla.org/mozilla-central/rev/7c097d951a23
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla26
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: