Right now all contiguous calls to asm.js within a single module get summarized as a single entry but we're now getting requests for per-function profile info. My initial inclination is to the pseudo-stack as this slows down the code and requires recompilation. Instead, I was thinking of chaning the profiler to do exact stack-walking like one might do for C++ by maintaining a [pc -> frame-offset] mapping in the AsmJSModule. (Calls into and out of asm.js would still maintain a single per-activation entry on the SPS stack that delimited the precise stack range.) This would mean a new SPS frame format which in theory Ion could also use in the future.
Ideally if asm.js can follow the ABI of the platform then we only need to provide an internal addr2line like function. The only tricky thing would be that would need an addr2line like function for asm.js code that has since be discard (say you profile, close a tab and want to dump the profile and then need to call addr2line). I could provide a callback when it's safe to expire a section of addr2line mappings. This route would mean that asm.js plays better with any profiler and debugger (although likely symbolication may still be busted for them). We could also make your approach work too.
If we were able to make asm.js play better with ordinary profilers/debuggers, that would be ideal but given that those tools will be looking for symbol information in a section of the executable (or in a side pdb), I don't see how that would work with dynamic codegen (unless the existing standard had some hooks we could abuse). Without the advantage of making existing tools work, I'd rather avoid the per-platform work required. Also, iirc, the C++ profiler is not exact so random stuff on the stack can be interpreted as a return address. It'd be nice if the asm.js profile info was exact.
(In reply to Luke Wagner [:luke] from comment #2) > Also, iirc, the C++ profiler is not exact so random stuff on the stack can > be interpreted as a return address. It'd be nice if the asm.js profile info > was exact. Would be nice if you filed a bug for those. You may be talking about JIT frames that we don't filter out.
(In reply to Benoit Girard (:BenWa) from comment #3) What I'm talking about is that, last I was told (by you, I think), the C++ stack walker doesn't use EH or other stack-frame-size info to walk the stack but, rather, it looks at every word (that is not marked as part of the JS stack), looks it up (presumably using addr2line) and if there is a hit, puts that in the callstack. Is that correct?
(In reply to Luke Wagner [:luke] from comment #4) > (In reply to Benoit Girard (:BenWa) from comment #3) > What I'm talking about is that, last I was told (by you, I think), the C++ > stack walker doesn't use EH or other stack-frame-size info to walk the stack > but, rather, it looks at every word (that is not marked as part of the JS > stack), looks it up (presumably using addr2line) and if there is a hit, puts > that in the callstack. Is that correct? This is called 'stack-scanning'. We tried building this a long time ago but it *horrible*. We would look for addresses in the stack that point to code section where it was a call site. This doesn't work at all because a function would return leaving it's returning information on the stack. Then something else would later increment the stack pointer to include the old return address but not re-initialize that data. So we frequently found stale return addresses on the stack making stack scanning not viable. We instead use either a system unwinder (windows), a custom frame pointer unwinder (mac) or a custom EHABI unwinder (mobile).
Ok, I guess bug 875131 really confused me on what you said SPS was doing and what you said didn't work. Still, it seems easier to just have a new SPS frame type that claims a region of the stack which can then be unwound using basic stack-frame size info stord in the AsmJSModule.
Possibly related: There exists bug 914561, for saving the register info when entering JITcode and storing the address of that frame in the SPS stack, so that when the native unwinder runs into JITcode (or, as a special case, if the sample occurs there) and fails, we can skip over the JS part of the stack and restart from the saved state. In particular, when I talked with :nbp and :djvj about this at a work week last year, we came to the conclusion that trying to generate unwind info wouldn't be practical for Ion, because (I'm told) it emits a lot of push/pop-equivalents instead of preallocating the space when entering the function (or, if the frame truly needs to be variable-sized, using a frame pointer as GCC does for VLAs/alloca), and each one would have to start a separate "function" for EH. But it sounds like that's not the case for Odin.
It's also the case for Odin that we have to handle push/pop during the frame (either from the regalloc or as part of the call sequence). My rough plan is to instrument the MacroAssembler to record these ranges and build a big sorted list so that once can map pc -> stack-size. This could be used to work for general Ion, but the problem is that the [pc -> stack-size] map would be for the whole IonRuntime (or Zone or Compartment) and constantly updated, presenting possible (but solvable) race condition.
This bug has gone stale. asm.js profiling support will draft off of the more general Ion profiler work in bug 994136 and dependents.