Closed Bug 1330150 Opened 3 years ago Closed 3 years ago
Profiler does not find asm
.js function calls on Windows, but does on OS X .
Running a profile of an asm.js application with geckoprofiler, there is a section of asm.js code that turns out to be taking about 33% of total execution time, but on Windows this section is hidden, and only shows up as "native call, asm.js", which has 0 samples attached to it. Running a geckoprofile of the same page on OS X however correctly shows the samples from within that section of code. Not sure if this is an issue with asm.js backend, profiler engine, or gecko profiler UI, but marking tentatively in Gecko Profiler, and CCing a few folks. Unfortunately the test case is closed source, subject to Mozilla partner NDA, so please contact me directly for a test case.
Thanks for reporting. So currently "native call, asm.js" is only going to be shown for time under the interrupt handler (slow script dialog or various other async interrupts that happen in workers). Is this an app running in workers and/or did the slow-script dialog appear? Also, could you show a bit of context (with function names changed to protect privacy if need be) to show how this "native call" is appearing?
(n/m, I see your email)
The slow script dialog (ribbon) does appear initially when the page is loading up and compiling shaders, but these samples do not correspond to that, geckoprofiler was started up only after the page has loaded up. The test page is not using workers at all, it is all single threaded on browser main thread.
The only special thing I can see in a call stack looking like that is that it calls function pointers twice, which causes asm.js <-> JS jumps, i.e. asm.js -> regular JS -> asm.js -> regular JS -> asm.js, and looks like it gets confused the second time it is re-entering asm.js from regular JS.
Hah, tracing this back and back I finally found the source of the problem: in GeckoSampler.cpp, before we even look at the JS and pseudo stacks, we try to do a native stack-walk and, if that fails, we just drop the whole sample. So for many platforms, we use fp-based unwinding which tends to skip over JIT code and not fail, but Win64 uses unwind metadata (of which there is none for JIT code) and that appears to fail quite frequently when the pc gets into JIT code. So this is likely significantly distorting many profiles with JIT code and it just took a horrible case to draw attention to it. Good job Jukka for noticing and reporting! This patch unconditionally merges in the JS and pseudo-stacks, even if native unwinding fails.
Attachment #8826066 - Flags: review?(ehsan)
Comment on attachment 8826066 [details] [diff] [review] dont-drop-sample-if-fail Review of attachment 8826066 [details] [diff] [review]: ----------------------------------------------------------------- Hmm, I was wondering why we were doing this in the first place, and this goes all the way back to bug 718025 where I added support for stackwalking on Windows! I can't remember why I was checking |rv|, I guess for "good programming practices"?! Oh well. :-) Thanks a lot for the great find.
Attachment #8826066 - Flags: review?(ehsan) → review+
Amazing work with the fast fix, looking forward to testing out a new Nightly with the fix!
Pushed by email@example.com: https://hg.mozilla.org/integration/mozilla-inbound/rev/ba0df33538d8 Don't abandon sample if native stack walk fails (r=ehsan)
You need to log in before you can comment on or make changes to this bug.