Closed Bug 1707489 Opened 3 years ago Closed 3 years ago

Use C++ to JIT entry registers on Win/x64 to resume stack walking when lost

Categories

(Core :: Gecko Profiler, task, P2)

x86_64
Windows
task

Tracking

()

RESOLVED FIXED
91 Branch
Tracking Status
firefox91 --- fixed

People

(Reporter: mozbugz, Assigned: mozbugz)

References

(Depends on 1 open bug)

Details

Attachments

(4 files, 3 obsolete files)

48 bytes, text/x-phabricator-request
Details | Review
48 bytes, text/x-phabricator-request
Details | Review
48 bytes, text/x-phabricator-request
Details | Review
48 bytes, text/x-phabricator-request
Details | Review

From bug 1700869, JS::ProfilingFrameIterator::getCppEntryRegisters() can give us the appropriate register values that can be used to resume stack walking if it gets lost in JIT code.

This bug/task will build the foundations for this, and focus on the Windows x86-64 implementation. More platforms will be added in later bugs.

Since mozglue doesn't know about JIT, this is a generic control structures that will only deal with the few registers needed to stop and resume stack-walking.

On platforms like Windows, we will need PC and SP, but also BP. To avoid polluting common functions with this unneeded information, the new StackWalkControl will record it before the callback is invoked.

StackWalkControl also contains "ResumePoint"s, with the following information:

  • Where to stop stack-walking (this will be used to avoid unnecessarily walking JIT code native frames),
  • Where to resume (if stopped, or if the walker gets lost),
  • Whether to output stop/resume frames.

Depends on D113269

DoMozStackWalkThread is the main stack walker for Windows.
Note that since StackWalkControl is only implemented for x86-64, so this patch effectively only applies to that platform (for now).

Depends on D113270

This is almost code-free: An empty StackWalkControl is constructed where&when needed, and then passed to ExtractJsFrames and the stack walker, but is not actively used yet.
(See next patch for ExtractJsFrames implementation.)

Depends on D113271

ExtractJsFrames should now fill the provided StackWalkControl with C++-to-JIT entry registers, if any.

This patch connects the JIT data to StackWalkControl and MozStackWalkThread that have been implemented on Windows/x86-64 in previous patches. So on this platform, profile should now contain fewer incomplete stacks.

Depends on D113272

Try: https://treeherder.mozilla.org/jobs?repo=try&group_state=expanded&revision=b04ed4825de9438fd8f4a8389f8eb5ce1b90662d

Startup+reddit profile before: https://share.firefox.dev/3xligsy
Startup+reddit profile with patches: https://share.firefox.dev/2QvcILv

The main difference to notice is the number of incomplete stacks, which is visible in the Call Tree, with stacks starting with a label like "XREMain::XRE_main" or "XRE_InitChildProcess", instead of a native function like RtlUserThreadStart.
For example, looking at "Web Content (3/3)" which handled Reddit, the number of incomplete stacks went from 44% down to 6.3%. And in the parent process, 12% to 0.7%. 😄🎉

It's also noticeable more visually in the Stack Chart, where frames near the root now form longer uninterrupted blocks, with more of the native gray frames.

Let's wait for bug 1712674 to be resolved (as it's a simpler fix for most bad-stack issues), before coming back here...

Depends on: 1712674
Attachment #9218218 - Attachment is obsolete: true
Attachment #9218219 - Attachment is obsolete: true
Attachment #9218220 - Attachment is obsolete: true

This structure manages a list of known "ResumePoint", which contains the registers needed to restart stack walks from JIT C++ entry points.

Depends on D113272

When mozglue's MozStackWalkThread ends, the last found SP can be checked to see if it was called by JIT code, in which case we can restart the native stack walk from the known caller of the JIT code.

Depends on D113273

Pushed by gsquelart@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/5fb8ebb1a365
Pass StackWalkControl to ExtractJsFrames and then forward it to stack walkers - r=canaltinova
https://hg.mozilla.org/integration/autoland/rev/e582c06a7323
Implement StackWalkControl struct on Windows/x86-64 - r=canaltinova
https://hg.mozilla.org/integration/autoland/rev/a5c3c4a00acc
Retrieve entry registers and stop areas from JIT - r=canaltinova
https://hg.mozilla.org/integration/autoland/rev/92276cf98902
Use StackWalkControl in Profiler's DoMozStackWalkBacktrace - r=canaltinova

Backed out for causing build bustages. The build bustages are happening on Windows 2012 AArch64 opt and debug.

Push with failures

Failure log

Backout link

Flags: needinfo?(gsquelart)

Thank you.
Using #error in the false branch of if constexpr (StackWalkControl::scIsSupported) was wrong, as the preprocessor #error is not suppressed as I may have thought. I'll fix and re-test before landing...

Flags: needinfo?(gsquelart)
Pushed by gsquelart@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/b8d18e3e34c1
Pass StackWalkControl to ExtractJsFrames and then forward it to stack walkers - r=canaltinova
https://hg.mozilla.org/integration/autoland/rev/5eebaf1cdc31
Implement StackWalkControl struct on Windows/x86-64 - r=canaltinova
https://hg.mozilla.org/integration/autoland/rev/750af5adba75
Retrieve entry registers and stop areas from JIT - r=canaltinova
https://hg.mozilla.org/integration/autoland/rev/d93c8ecddf48
Use StackWalkControl in Profiler's DoMozStackWalkBacktrace - r=canaltinova
Duplicate of this bug: 1635987
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: