875131 - profile using C++-like stack scanning instead of the SPS psuedo-stack

Reporter

Description

•

12 years ago

Currently, when profiling is enabled, JIT code pushes and pops entries on the SPS pseudo stack. This makes profiling simple (the profiler just reads the pseudo stack from the signal handler), but this incurs non-trivial overhead (5-10% iirc, maybe more) so we can't leave profiling turned on all the time. An alternative, that wouldn't incur any runtime overhead, would be to support directly walking the stack from the signal handler. This is harder and requires we think Real Hard about racing with the signal handler. I think it's doable, though, and we already perform on-demand $rip-to-line/script mapping (when the signal occurs while executing JIT code). Taras pointed out a really good use case for doing the no-overhead approach: the hang detector. What we apparently already do is, when the Gecko event loop is hung on a single event for more than X seconds, we take a stacktrace (using normal C++ stack walking) and send this stack back via Telemetry/FHR. This is useful for pointing out hangs in Gecko code, but a lot of hang stacks end up going into the JS blackhole. However, if we supported on-demand JS stack walking, we could also see what JS was running. This would give us insight into, e.g., Chrome or Addon code that was commonly janking our users. It'd also be useful to see if there were any popular sites that were janky and, e.g., see if they were only janky in FF.

Luke Wagner [:luke]

Reporter

Comment 1

•

12 years ago

One obvious problem is that, if we're using the system's builtin stack walker, we won't have a chance to help out when the walking gets to jit frames. However, from my minimal understanding, we're planning to switch from using the system builtin stack walker to using breakpad's stack walking which means we'd be able to hack it up. Is that right Julian/Taras? If so, Julian: what would your ideal interface be between the stack walker and the JS engine?

Benoit Girard (:BenWa)

Comment 2

•

12 years ago

SPS has had two stack scanning implementation: 1) Hand written one. It scanned the stack, looked for addresses pointing to a mapped library and check that it was a proper call site. This was done for arm-android. It never worked better then the pseudostack mode because we often had hold return addresses on the stack. Jeff: 'think large unintialized arrays on the stack that contain old return addresses'. 2) The breakpad stack scanner. The results are very poor thus the samples doesn't aggregate together in the UI. Because of this we don't get any samples with large counts so we don't know where to look at. HOWEVER, Julian and I have discussed grabbing stack scanning but NOT displaying them in the treeview. When you get a pseudostack site with a large sample count you could right click and reveal the stack scanning for this site. This solves the nonsensical aggregation problem (something I believe even a trained power user would not get used to). (In reply to Luke Wagner [:luke] from comment #1) > we're planning to switch > from using the system builtin stack walker to using breakpad's stack walking > which means we'd be able to hack it up. Is that right Julian/Taras? > Yes that's correct I believe and would be very interesting to try.

Benoit Girard (:BenWa)

Comment 3

•

12 years ago

hold/old

Luke Wagner [:luke]

Reporter

Comment 4

•

12 years ago

What makes the samples poor with breakpad? If it is jit code on the stack, then we could be fixing that :)

Benoit Girard (:BenWa)

Comment 5

•

12 years ago

(In reply to Luke Wagner [:luke] from comment #4) > What makes the samples poor with breakpad? If it is jit code on the stack, > then we could be fixing that :) Even with JIT code aside (say running JS Interepter only using prefs) un-initialized memory containing stale return addresses throw off good stack scanning heuristics.

Luke Wagner [:luke]

Reporter

Comment 6

•

12 years ago

Oh, I thought we used debug/dwarf info to know the precise size of stack frames. Is that only on some platforms?

Benoit Girard (:BenWa)

Comment 7

•

12 years ago

I was trying to point out that stack scanning showed to be very poor for c++ because of the reason I mention above. Perhaps these stack scanning heuristics would work better for jitted frame.

Luke Wagner [:luke]

Reporter

Comment 8

•

12 years ago

Ok, so I think you've explained why (1) in comment 2 had poor results, which makes sense. I was asking why in (2), where presumably you have precise stack frame information (modulo jit code, which I was saying in comment 4 we could fix here), the results were poor.

Mike Hommey [:glandium]

Comment 9

•

12 years ago

(In reply to Luke Wagner [:luke] from comment #8) > Ok, so I think you've explained why (1) in comment 2 had poor results, which > makes sense. I was asking why in (2), where presumably you have precise > stack frame information (modulo jit code, which I was saying in comment 4 we > could fix here), the results were poor. I think you're just both not talking about the same thing when you say "stack scanning". What Benoit is talking about is actually scanning the stack contents to attempt finding frames when you don't have debug info.

Benoit Girard (:BenWa)

Comment 10

•

12 years ago

Yes, I've been using "stack scanning" to mean inferring the stack without deterministic rules for reconstructing it by using custom inexact heuristics and "stack unwinding" to mean reconstructing the stack using guaranteed information (ABI, unwind info).

Luke Wagner [:luke]

Reporter

Comment 11

•

12 years ago

Ah, I assumed that when you said "breakpad stack scanner" in (2) in comment 2, you were talking about "stack unwinding" (viz., I thought breakpad did stack unwinding, but that is from old and probably mis-remembered conversations with jimb). So, am I right that the profiler/breakad *never* do unwinding, only scanning? (Do any of our tools unwind?) In that case, it sounds like all the profiler needs from the JIT is a list of (symbol, address ranges) pairs? Anything else?

Nobody; OK to take it and work on it

Assignee

Updated

•

11 years ago

Assignee: general → nobody

Nick Fitzgerald [:fitzgen] [⏰PST; UTC-8]

Updated

•

9 years ago

Blocks: gametool

Jordan Santell [:jsantell] [@jsantell]

Updated

•

9 years ago

Priority: -- → P3

Whiteboard: [gaming-tools]

Luke Wagner [:luke]

Reporter

Comment 12

•

9 years ago

With all of djvj's work in the intervening years, I think this bug is mostly WFM, although, as a separate, more focused bug, it would simplify everything if we could have the ProfilingFrameIterator traverse InterpreterActivations instead of having the interpreter push magic isJs psuedo-frames.

Status: NEW → RESOLVED

Closed: 9 years ago

Resolution: --- → WORKSFORME

Bugzilla

profile using C++-like stack scanning instead of the SPS psuedo-stack

Categories

(Core :: JavaScript Engine, defect, P3)

Tracking

()

People

(Reporter: luke, Unassigned)

References

Details

(Whiteboard: [gaming-tools])

Crash Data

Security

(public)

User Story

Description

Comment 1

Comment 2

Comment 3

Comment 4

Comment 5

Comment 6

Comment 7

Comment 8

Comment 9

Comment 10

Comment 11

Updated

Updated

Updated

Comment 12