Open Bug 1749523 Opened 3 years ago Updated 2 years ago

It looks like JIT "unsymbolicated" frames are showing up in profiles nowadays

Categories

(Core :: Gecko Profiler, enhancement, P2)

People

(Reporter: julienw, Unassigned)

Attachments

(1 file)

Attached file object.html

This is very visible in this profile https://share.firefox.dev/3JYeCet that I captured from running the attachment.

Here we clearly see 4 different frames starting with 0x, which are maybe (?) different jitted versions of the function.

I think we shouldn't see them as they obscure the view by having different stacks. Happy to hear different feedback though.

Also I thought we got rid of them in the past already but maybe I dreamt it?

Markus, you're the symbolication expert! Any thoughts, please?

Severity: -- → S3
Flags: needinfo?(mstange.moz)
Priority: -- → P2

Here we clearly see 4 different frames starting with 0x, which are maybe (?) different jitted versions of the function.

Looking more closely, out of these 4 0x frames, only 2 are for the same function. Looking at the JS subcategories we clearly see that one is baseline while the other is ion. There's also 1 sample run by the interpreter.

To me they're clearly distractions, but I'm happy to hear more feedback.

We initially (mostly) fixed this in bug 1426124, with follow-up work broken out to bug 1463559. In the profile from comment 0, it looks like the hex addresses all call into C++ code, so I think this is just bug 1463559.


This filtering happens in MergeStacks and is currently unrelated to symbolication.
However, in theory it could be considered symbolication. This is an interesting thought so I'll talk about that here briefly. But the rest of this comment isn't really relevant to this bug.

MergeStacks merges the native stack, the JS JIT stack, and the label stack, and tries to remove JS JIT native frames from the native stack.
The native stack contains two types of native addresses: addresses from "library" code and addresses from JIT code.
The JS JIT stack contains JS function names for JS code running in a JIT.
The label stack contains JS function names for JS code running in the interpreter, and other label frames.
Then, during symbolication, native frames from "library" code get replaced with function names.

So we basically do the following:

  1. Native stack + JS JIT stack + label stack - JS JIT native frames -> Combined stack
  2. Native frames for library code + library symbols -> Symbolicated frames for library code
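As a rough illustration of these two steps, here is a simplified Python model. It is not the actual MergeStacks code: the `Frame` type, the ordering by stack address, and the `is_jit_address` predicate are all assumptions made for this sketch.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class Frame:
    kind: str                      # "native", "jit" or "label"
    sp: int                        # stack address; larger means older frame
    address: Optional[int] = None  # set for native frames only
    name: Optional[str] = None     # set for jit and label frames only


def merge_stacks(native: List[Frame], jit: List[Frame], labels: List[Frame],
                 is_jit_address: Callable[[int], bool]) -> List[Frame]:
    """Step 1: combine the three stacks, dropping native frames in JIT code."""
    kept_native = [f for f in native if not is_jit_address(f.address)]
    combined = kept_native + jit + labels
    # Oldest frame first. The sort is stable, so frames with equal sp
    # keep their relative order from `combined`.
    return sorted(combined, key=lambda f: f.sp, reverse=True)


def symbolicate_library_frames(stack: List[Frame],
                               library_symbols: Dict[int, str]) -> List[str]:
    """Step 2: replace library addresses with function names."""
    names = []
    for f in stack:
        if f.kind == "native":
            # Anything without a symbol stays a bare hex address.
            names.append(library_symbols.get(f.address, hex(f.address)))
        else:
            names.append(f.name)
    return names


native = [Frame("native", 1000, address=0x1000),
          Frame("native", 600, address=0xDEAD0000),  # JIT code
          Frame("native", 400, address=0x2000)]
jit = [Frame("jit", 600, name="foo")]
labels = [Frame("label", 800, name="js::RunScript")]
symbols = {0x1000: "main", 0x2000: "memcpy"}

good = symbolicate_library_frames(
    merge_stacks(native, jit, labels, lambda a: a >= 0xDEAD0000), symbols)
leaky = symbolicate_library_frames(
    merge_stacks(native, jit, labels, lambda a: False), symbols)
```

In this model, `leaky` shows what happens when the filter in step 1 misses a JIT native frame: the frame survives into the combined stack and, since no library symbol matches it, it is displayed as a hex address.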

Instead, we could do:

  1. Native stack + label stack -> Combined stack
  2. Native frames for JIT code + "JIT symbols" -> Symbolicated frames for JIT code
  3. Native frames for library code + library symbols -> Symbolicated frames for library code

So, in step 2, we would take the native stack and treat it as the "ground truth" for all the frames that are on the stack, including JS JIT frames. We'd no longer ask the JS engine about which JIT code is on the stack; we'd only ask it to translate native addresses to JS frames. But in order for that to work, we'd have to ensure that the native stack is complete, and that native stack unwinding always succeeds in JIT code.
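A minimal sketch of steps 2 and 3 of this alternative, again as a simplified Python model. The "JIT symbols" are represented as a list of address ranges, which is an assumption for illustration; `lookup_jit_symbol` and `symbolicate_native_stack` are hypothetical names, and the merge with the label stack (step 1) is left out.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical "JIT symbols": (start, end, JS function name) per code range.
JitRange = Tuple[int, int, str]


def lookup_jit_symbol(address: int,
                      jit_symbols: List[JitRange]) -> Optional[str]:
    """Translate a native address inside JIT code to a JS function name."""
    for start, end, name in jit_symbols:
        if start <= address < end:
            return name
    return None


def symbolicate_native_stack(addresses: List[int],
                             jit_symbols: List[JitRange],
                             library_symbols: Dict[int, str]) -> List[str]:
    """Treat the native stack as ground truth and name every frame in it."""
    names = []
    for address in addresses:
        jit_name = lookup_jit_symbol(address, jit_symbols)
        if jit_name is not None:
            names.append(jit_name)  # step 2: JIT code
        else:
            # step 3: library code, falling back to the hex address
            names.append(library_symbols.get(address, hex(address)))
    return names


jit_symbols = [(0x5000, 0x5100, "foo"),   # e.g. a baseline version of foo
               (0x6000, 0x6200, "foo")]   # e.g. an ion version of foo
library_symbols = {0x1000: "main", 0x2000: "memcpy"}

result = symbolicate_native_stack(
    [0x1000, 0x5010, 0x6020, 0x2000], jit_symbols, library_symbols)
```

Note that in this model both compiled versions of `foo` resolve to the same name, so there is no separate filtering step that could leave a stray hex frame behind. It only works if every JIT address on the stack is covered by a range, which is the completeness requirement mentioned above.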

Flags: needinfo?(mstange.moz)

In the profiler, you can differentiate between hex addresses from "library" code and hex addresses from JIT code: For library code, the call node will have a hex address and then the library name (e.g. libxul.so). For JIT code, there will just be the hex address.
The profile contains the list of native libraries that were loaded in the process, with start and end addresses for each library. JIT code is outside of those ranges and not assigned to any library.
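To illustrate the range check, here is a small Python sketch. The shape of the library list (dicts with `name`, `start` and `end` keys) and the function name `find_library` are assumptions for the example, not a statement about the exact profile format.

```python
from typing import Dict, List, Optional, Union

Lib = Dict[str, Union[str, int]]


def find_library(address: int, libs: List[Lib]) -> Optional[str]:
    """Return the name of the library whose range contains the address.

    Returns None when the address is outside every loaded library,
    which is the case for JIT code.
    """
    for lib in libs:
        if lib["start"] <= address < lib["end"]:
            return lib["name"]
    return None


# Made-up address range, for illustration only.
libs = [{"name": "libxul.so", "start": 0x7F0000000000, "end": 0x7F0008000000}]

in_library = find_library(0x7F0000001234, libs)
in_jit_code = find_library(0x3A5C0000, libs)
```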

See Also: → 1751122
See Also: → 1463559

(In reply to Markus Stange [:mstange] from comment #3)

We initially (mostly) fixed this in bug 1426124, with follow-up work broken out to bug 1463559. In the profile from comment 0, it looks like the hex addresses all call into C++ code, so I think this is just bug 1463559.

It turns out that this is also happening in other cases after all. Here's one example profile from bug 1751122 where all hex addresses are leaf frames, and not calls into C++: https://share.firefox.dev/3rEDtvB

(In reply to Markus Stange [:mstange] from comment #5)

It turns out that this is also happening in other cases after all. Here's one example profile from bug 1751122 where all hex addresses are leaf frames, and not calls into C++: https://share.firefox.dev/3rEDtvB

In this one, all hex addresses are children of a wasm function... maybe the check added in bug 1426124 doesn't work properly in this case (or something changed since then).

(In reply to Markus Stange [:mstange] from comment #4)

In the profiler, you can differentiate between hex addresses from "library" code and hex addresses from JIT code: For library code, the call node will have a hex address and then the library name (e.g. libxul.so). For JIT code, there will just be the hex address.
The profile contains the list of native libraries that were loaded in the process, with start and end addresses for each library. JIT code is outside of those ranges and not assigned to any library.

Would that be good enough to merge such call nodes to their caller in the frontend? I feel like this could also replace the fix in bug 1426124 as well as fix the wasm case. What do you all think?
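To make the question concrete, here is a rough Python sketch of what such a frontend merge could look like. It assumes a stack is a root-first list where symbolicated frames are strings and unsymbolicated frames are integer addresses; `merge_unassigned_addresses` and the library list shape are hypothetical.

```python
from typing import Dict, List, Union

Frame = Union[str, int]            # function name, or raw native address
Lib = Dict[str, Union[str, int]]   # assumed keys: "name", "start", "end"


def is_unassigned_address(frame: Frame, libs: List[Lib]) -> bool:
    """True for a raw address that falls outside every loaded library."""
    if not isinstance(frame, int):
        return False
    return not any(lib["start"] <= frame < lib["end"] for lib in libs)


def merge_unassigned_addresses(stack: List[Frame],
                               libs: List[Lib]) -> List[Frame]:
    """Drop unassigned hex frames so their children attach to the caller."""
    return [f for f in stack if not is_unassigned_address(f, libs)]


libs = [{"name": "libxul.so", "start": 0x10000, "end": 0x20000}]

# Two samples that differ only by which jitted version of foo was running.
sample_a = merge_unassigned_addresses(
    ["main", "foo", 0xDEAD0000, "memcpy"], libs)
sample_b = merge_unassigned_addresses(
    ["main", "foo", 0xBEEF0000, "memcpy"], libs)

# Leaf case: the hex frame is dropped and its time goes to the caller.
leaf = merge_unassigned_addresses(
    ["main", "wasm-function[3]", 0xDEAD0010], libs)

# An unsymbolicated library address is kept.
kept = merge_unassigned_addresses(["main", 0x10400], libs)
```

With this, the two samples collapse into the same stack, and the leaf case is handled the same way as the calls-into-C++ case.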

Flags: needinfo?(mstange.moz)
Flags: needinfo?(jdemooij)

I think we should hold off on making any changes here until we've completed most of the profiler JIT work. Now that we have frame pointers in all JIT code, the approach I outlined at the end of comment 3 may be workable.

Flags: needinfo?(mstange.moz)
Flags: needinfo?(jdemooij)