[Meta] Remove JS profiler pseudostack use by jitcode.

RESOLVED FIXED

Status

()

defect
RESOLVED FIXED
5 years ago
4 years ago

People

(Reporter: djvj, Unassigned)

Tracking

(Blocks 1 bug)

Firefox Tracking Flags

(Not tracked)

Details

The pseudostack handling with the profiler is just too error prone.  We've been dealing with bug after bug relating to corner case after corner case in what should be a relatively simple subsystem of the engine.

We need to eliminate the JS use of the pseudostack pronto.  We are continually introducing hard-to-understand, disparate code changes.. and wasting profiling runtime to track information that can be passively tracked.

Building any new features on top of the current pseudostack-based implementation is a mistake.

So it's simple: we kill the JS use of the profiler pseudostack.
You might want to have a look at previous attempts, such as Bug 785234 and Bug 728045.
The compression ratio of such information matters a lot for 2 reasons:
 - We want to profile applications which are using almost all the memory (such as on B2G)
 - We do not want to change (much) the memory profile of an application.

Also, I think that this is a bad idea to store a mapping from the IP to the JSScript's pc.  I think it would make more sense only save the stack depth.  By doing so we will only have the PC of calls that we can recover from the OSI Points.

Also, note that mapping a PC to a stack depth is not garanteed, for example one case where we are using some stack allocations is when we are inlining fun.apply, in which case the stack depth is not known at compile time.  Also, if we want to be precise, we would have to store the stack depth as part of the Labels, to make sure that all jumps are expecting the same stack depth.

One idea for statically known stack depth, would be to register sparse ExecutableAllocator bit-masks for each stack depth of executable code.  Even if this would not be good enough for x86/x64, this might be way to map 4 bytes to one 1 bit on RISC architectures.

Note that all code that we generate does not necessary know everything about their stack depth, such as trampolines.  This would be nice to have, but we need to be careful, for example EnterJit / RectifierFrame do not know their size ahead of time.  Identically, the Bailout path is aware of its own stack, but it does not know anything about the size of the frame which jumped on it.

If we go that way, then I would appreciate if we could store these information on a few dedicated pages, such as we can register these pages for breakpad, and be able to iterate the stack when reconstructing the C++ stack.

If we have a low overhead enough, we can even remove most uses of the frame descriptor.
(In reply to Nicolas B. Pierron [:nbp] from comment #1)
> The compression ratio of such information matters a lot for 2 reasons:

IIUC, we'd use two separate maps:
 - callsite-return-pc --> (stack depth, script, pc)
 - pc --> (stack-depth, script)

In both cases, I'd expect we can use a simple linear array of records and not have too many entries per function as the number of callsites and number of stack-depth-changes shouldn't be significant.  We'd need to measure to be sure, of course.

>  - We want to profile applications which are using almost all the memory
> (such as on B2G)

From what I understand, LuL will already maintain in-memory stack-walking metadata for *all native code* so this shouldn't a significant change.  Regardless, if your app is up to the memory limit, profiling will perturb it and, to profile, you'll probably want a ref phone to bump up the total memory, or something.

> Also, I think that this is a bad idea to store a mapping from the IP to the
> JSScript's pc.

Only for callsites, not in general.

> Also, note that mapping a PC to a stack depth is not garanteed, for example
> one case where we are using some stack allocations is when we are inlining
> fun.apply, in which case the stack depth is not known at compile time.

Yes, but those cases are relatively rare and we can deal with them on a case-by-case basis (by having some flag bits that indicate the special case and say how to recover the argc necessary to unwind the frame.

> Also, if we want to be precise, we would have to store the stack depth as
> part of the Labels, to make sure that all jumps are expecting the same stack
> depth.

I think we can avoid all this and just instrument the MacroAssembler to record intervals of pc ranges that have the same stack depth.

> If we go that way, then I would appreciate if we could store these
> information on a few dedicated pages, such as we can register these pages
> for breakpad, and be able to iterate the stack when reconstructing the C++
> stack.

I'm not sure what "dedicated pages" means, but, yes, this stack-walking metadata should be generally available for both profiling and other on-demand stack-walking like breakpad.  Adding JS stacks to the chromehang monitor is one of the motivations for all this a long time ago.

> If we have a low overhead enough, we can even remove most uses of the frame
> descriptor.

We should be able to actually remove both the frame descriptor and callee token.  The overhead should just be a log(n) lookup.  This should be fine for stack iteration; I think the only really hot uses don't even use general stack iteration but rather use currentScript().
Depends on: 1004831
CC'ing Ting-Yuan and Kan-ru, since they're building some tools on top of the SPS stacks.
Depends on: 1057082
Blocks: 1068885
I'm closing this, as we removed the use of the pseudostack by the JIT.  That was the original intent of this bug and it's been fulfilled.  The title was a bit off, so I've fixed that to reflect the intent.
Status: NEW → RESOLVED
Closed: 4 years ago
Resolution: --- → FIXED
Summary: [Meta] Remove JS profiler pseudostack → [Meta] Remove JS profiler pseudostack use by jitcode.
You need to log in before you can comment on or make changes to this bug.