Open Bug 1517739 Opened 5 years ago Updated 2 years ago

Enable CFI unwinding for client-side stack-walking in Windows AArch64

Categories

(Toolkit :: Crash Reporting, enhancement)

enhancement

Tracking

()

People

(Reporter: gsvelto, Unassigned)

References

Details

+++ This bug was initially created as a clone of Bug #1513284 +++

The minidump-analyzer does not support CFI unwinding for the Windows AArch64 targets yet. This entails expanding the unwind information parser [1] to cope with AArch64 specific bits. More info about unwinding can be found in Microsoft documentation:

https://docs.microsoft.com/en-us/cpp/build/arm64-exception-handling?view=vs-2017

[1] https://searchfox.org/mozilla-central/rev/ecf61f8f3904549f5d65a8a511dbd7ea4fd1a51d/toolkit/crashreporter/minidump-analyzer/Win64ModuleUnwindMetadata.h

Note: this is needed for sending crash pings, but doesn't impact our existing crash reporting infrastructure.

No longer blocks: aarch64-crash-reporting

There's two parts to this, right? There's making dump_syms actually produce something intelligent for this information, and then there's making minidump_stackwalk understand the information? Unless we're assuming that the first part can produce information that's compatible with the DWARF/x86-64 CFI code, and then minidump_stackwalk Just Works For Free?

We need to teach breakpad's PDBSourceLineWriter to handle the ARM64-specific unwinding opcodes (as described in the MSDN ARM64 exception handling documentation) so that it can spit out meaningful CFI directives. Then we need to check if breakpad's CFI unwinding machinery is sufficiently generic to handle that output, in theory it should be and should indeed Just Work For Free but until we check we can't be sure.

I am 95% positive the CFI machinery is sufficiently generic to handle whatever the arm64 exception handling bits have. Looking through the unwind codes in the docs, most of them map very nicely to CFI.

But wait. PDBSourceLineWriter is one place for decoding x86-64 windows unwind information:

https://searchfox.org/mozilla-central/rev/8e8ccec700159dc4efe061cfec2af10b21a8e62a/toolkit/crashreporter/google-breakpad/src/common/windows/pdb_source_line_writer.cc#725-856

but comment 0 describes a second place that we need to decode this information as well? Do we need both?

I am pretty sure I understand the exception handling bits well enough to tackle this, but it would be nice to know exactly what needs to be modified and why. :)

Flags: needinfo?(gsvelto)

Yes, what I said in comment 3 is effectively a separate bug so maybe we should file it separately, sorry for the confusion.

I originally opened this bug for teaching the minidump-analyzer we use for client-side stack-walking to understand the CFI opcodes of AArch64 builds. This needs to happen in the code I mentioned in comment 0. When I filed this I hadn't thought about dump_syms at all.

What's more urgent right now is to teach dump_syms to do the same thing via PDBSourceLineWriter; since the output of dump_syms is fed into Socorro we need it so that Socorro's minidump_stackwalk can CFI-walk AArch64 crashes.

The sad part is that those two tools (minidump-analyzer and minidump_stackwalk) basically do the same thing in two very different ways, that's why we need to change the code in two different places. Sigh. We even have a bug for merging them (bug 1487410).

Flags: needinfo?(gsvelto)
Blocks: 1529355
Severity: normal → S3
You need to log in before you can comment on or make changes to this bug.