Closed Bug 547075 Opened 14 years ago Closed 14 years ago

Breakpad should produce stack traces for ARM Windows Mobile minidumps

Categories

(Toolkit :: Crash Reporting, defect)

x86
Linux
defect
Not set
normal


RESOLVED WONTFIX

People

(Reporter: jimb, Unassigned)


To get backtraces for Windows Mobile on ARM, Breakpad needs to learn to process Windows IDiaFrameData stack-walking debugging information for ARM. At present, Breakpad's support for IDiaFrameData info is x86-specific.

Background:

Producing a stack trace on ARM processors requires debugging information beyond function names/addresses and source line information. Call instructions save the return address in the link register, but there is no convention for where the link register should be saved, when it gets saved, or indeed whether it should be saved at all.

On Linux, DWARF CFI and .eh_frame sections provide instruction-by-instruction annotations which describe how to recover saved registers (like LR). On Windows, the DIA interface's IDiaFrameData class provides similar annotations.

IDiaFrameData:
http://msdn.microsoft.com/en-us/library/hf0s0y3f%28VS.100%29.aspx
What this entails is taking the code in StackwalkerX86::GetCallerByWindowsFrameInfo, here:

http://code.google.com/p/google-breakpad/source/browse/trunk/src/processor/stackwalker_x86.cc#94

finding some reasonable way to make it platform-independent, and then making both x86 and ARM use the platform-independent interface.
Actually, comment 1 is wrong: Windows Mobile on ARM doesn't use the IDiaFrameData stack-walking debugging information. These links are probably more relevant:

SEH in RISC environments:
http://msdn.microsoft.com/en-us/library/ms925504.aspx

PDATA structures:
http://msdn.microsoft.com/en-us/library/aa448751.aspx
Right, so, we believe the way to fix this will be:
1) Make dump_syms know how to parse the PE file (EXE or DLL) and find the .pdata section
2) Parse the .pdata section, which contains the aforementioned PDATA structures
3) For each function, parse the prolog to determine how to restore registers for that function's caller (as described here: http://blogs.msdn.com/hopperx/archive/2006/02/03/524170.aspx ). Alternatively, parse a PDATA_EH struct from the .text section if the PDATA struct specifies that the function uses exception handling
4) Output this data to the symbol file in some useful format that the stackwalker can use. jimb suggested that we might be able to just translate it to DWARF CFI to reuse his existing infrastructure for handling CFI.
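
For illustration, step 2 might look roughly like the sketch below. It assumes the 8-byte PDATA layout from the PDATA documentation linked above (a 32-bit FuncStart followed by a word packing PrologLen:8, FuncLen:22, ThirtyTwoBit:1, ExceptionFlag:1), and it assumes the bitfields are allocated from the least significant bit, as MSVC does. The names PdataEntry and parse_pdata are made up for this example; nothing like this exists in dump_syms.

```python
import struct
from typing import List, NamedTuple


class PdataEntry(NamedTuple):
    func_start: int       # address of the function's first instruction
    prolog_len: int       # prologue length (assumed to be counted in instructions)
    func_len: int         # function length (assumed to be counted in instructions)
    thirty_two_bit: bool  # set for 32-bit instructions, clear for 16-bit ones
    exception_flag: bool  # set when a PDATA_EH struct precedes the function


def parse_pdata(data: bytes) -> List[PdataEntry]:
    """Decode the raw bytes of a .pdata section into PdataEntry records."""
    entries = []
    usable = len(data) - len(data) % 8  # ignore any trailing partial entry
    for offset in range(0, usable, 8):
        func_start, packed = struct.unpack_from("<II", data, offset)
        entries.append(PdataEntry(
            func_start=func_start,
            prolog_len=packed & 0xFF,
            func_len=(packed >> 8) & 0x3FFFFF,
            thirty_two_bit=bool((packed >> 30) & 1),
            exception_flag=bool((packed >> 31) & 1),
        ))
    return entries
```

Each decoded entry then tells the prolog parser in step 3 where a function starts and how many instructions of prologue it has to look at.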
One thing that kind of sucks is that dump_syms on Windows doesn't currently have to look at the executable image at all; it gets everything it needs from the PDB file via DIA. I don't think this will be a showstopper though.
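
Looking at the executable image need not be much work. As a rough sketch, and not a description of anything dump_syms does today, locating the .pdata section means walking the standard PE headers: the DOS header's e_lfanew field at offset 0x3C, the "PE\0\0" signature, the 20-byte COFF header, and the 40-byte section headers that follow the optional header. The function name find_section is hypothetical.

```python
import struct
from typing import Optional, Tuple


def find_section(image: bytes, wanted: str = ".pdata") -> Optional[Tuple[int, int, int]]:
    """Return (virtual_address, file_offset, raw_size) of a section, or None."""
    if image[:2] != b"MZ":
        raise ValueError("not a PE image: missing MZ signature")
    # e_lfanew: file offset of the PE signature
    (pe_offset,) = struct.unpack_from("<I", image, 0x3C)
    if image[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("not a PE image: missing PE signature")

    coff = pe_offset + 4
    # COFF header: Machine, NumberOfSections, TimeDateStamp,
    # PointerToSymbolTable, NumberOfSymbols, SizeOfOptionalHeader,
    # Characteristics
    _machine, num_sections, _, _, _, opt_size, _ = struct.unpack_from(
        "<HHIIIHH", image, coff)

    section_table = coff + 20 + opt_size
    for index in range(num_sections):
        header = section_table + 40 * index
        # First 24 bytes of a section header: Name, VirtualSize,
        # VirtualAddress, SizeOfRawData, PointerToRawData
        name, _vsize, vaddr, raw_size, raw_ptr = struct.unpack_from(
            "<8sIIII", image, header)
        if name.rstrip(b"\0").decode("ascii", "replace") == wanted:
            return vaddr, raw_ptr, raw_size
    return None
```

The bytes at image[file_offset:file_offset + raw_size] are then the PDATA table itself.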
(In reply to comment #3)
> 4) Output this data to the symbol file in some useful format that the
> stackwalker can use. jimb suggested that we might be able to just translate it
> to DWARF CFI to reuse his existing infrastructure for handling CFI.

Actually, I had in mind something much simpler.  DWARF CFI is really hairy, but the STACK CFI records that we use to represent it in breakpad symbol files are really simple:

STACK CFI INIT 804c4b0 40 .cfa: $esp 4 + $eip: .cfa 4 - ^
STACK CFI 804c4b1 .cfa: $esp 8 + $ebp: .cfa 8 - ^

STACK CFI INIT records establish the initial rules for a given address and range; STACK CFI records are deltas, showing which rules changed at a given address.
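
To make the INIT-plus-deltas idea concrete, here is a small sketch that computes the rules in force at a given address. It is not Breakpad's implementation; parse_rules and rules_at are names invented for this example, and it assumes each delta record follows its INIT record and that deltas appear in increasing address order.

```python
from typing import Dict, List


def parse_rules(tokens: List[str]) -> Dict[str, str]:
    """Turn ['.cfa:', '$esp', '4', '+', ...] into {'.cfa': '$esp 4 +', ...}."""
    rules: Dict[str, List[str]] = {}
    register = None
    for token in tokens:
        if token.endswith(":"):
            register = token[:-1]
            rules[register] = []
        elif register is None:
            raise ValueError("expression token before any register name")
        else:
            rules[register].append(token)
    return {reg: " ".join(expr) for reg, expr in rules.items()}


def rules_at(records: List[str], address: int) -> Dict[str, str]:
    """Return the rules that apply at `address`, or {} if no record covers it."""
    rules: Dict[str, str] = {}
    active = False
    for line in records:
        fields = line.split()
        if fields[:3] == ["STACK", "CFI", "INIT"]:
            if active:
                break  # reached the next function's records
            start, size = int(fields[3], 16), int(fields[4], 16)
            if start <= address < start + size:
                active = True
                rules = parse_rules(fields[5:])
        elif fields[:2] == ["STACK", "CFI"] and active:
            # A delta replaces only the rules it mentions.
            if int(fields[2], 16) <= address:
                rules.update(parse_rules(fields[3:]))
    return rules
```

With the two records above, rules_at(records, 0x804c4b0) gives just the INIT rules, while any later address inside the function also picks up the new .cfa rule and the $ebp rule.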

The rules are a list of REGISTER: EXPRESSION pairs, where EXPRESSION is a postfix expression: "$esp 4 +" means $esp+4; '^' is a memory fetch.  .cfa is a synthetic register, representing the base of the stack frame; it makes the other rules a bit easier to write. If there's no explicit rule for $esp, you use the .cfa as the caller's $esp.

So that example says that, at the function's entry, the .cfa is four bytes past the stack pointer, and the return address is saved four bytes before the .cfa. Then, one instruction later, the .cfa is eight bytes past the stack pointer, and the frame pointer, $ebp, has been saved on top of the return address, so that instruction must have been "push $ebp".
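
As a worked version of that reading, the sketch below evaluates the postfix rules against a made-up register state and stack. It only handles the operators used in the example ('+', '-', '^'), treats numeric literals as decimal, and models memory as a dict from address to 32-bit word; all of those are simplifying assumptions, and evaluate and caller_registers are hypothetical names, not Breakpad functions.

```python
from typing import Dict


def evaluate(expr: str, regs: Dict[str, int], memory: Dict[int, int]) -> int:
    """Evaluate a postfix rule such as '.cfa 4 - ^'."""
    stack = []
    for token in expr.split():
        if token == "^":
            stack.append(memory[stack.pop()])  # memory fetch
        elif token in ("+", "-"):
            rhs, lhs = stack.pop(), stack.pop()
            stack.append(lhs + rhs if token == "+" else lhs - rhs)
        elif token in regs:
            stack.append(regs[token])
        else:
            stack.append(int(token))  # assumed decimal literal
    if len(stack) != 1:
        raise ValueError("malformed expression: " + expr)
    return stack[0]


def caller_registers(rules: Dict[str, str], regs: Dict[str, int],
                     memory: Dict[int, int]) -> Dict[str, int]:
    """Apply a rule set to the callee's registers to recover the caller's."""
    cfa = evaluate(rules[".cfa"], regs, memory)
    scope = {**regs, ".cfa": cfa}
    caller = {reg: evaluate(expr, scope, memory)
              for reg, expr in rules.items() if reg != ".cfa"}
    caller.setdefault("$esp", cfa)  # no explicit $esp rule: use the .cfa
    return caller


# State just after "push $ebp": saved $ebp at $esp, return address above it.
rules = {".cfa": "$esp 8 +", "$eip": ".cfa 4 - ^", "$ebp": ".cfa 8 - ^"}
regs = {"$esp": 0x1000, "$ebp": 0x2000, "$eip": 0x804C4B1}
memory = {0x1000: 0x2000, 0x1004: 0x8048123}
caller = caller_registers(rules, regs, memory)
```

Here the .cfa works out to 0x1008, so the caller's $eip is fetched from 0x1004, its $ebp from 0x1000, and its $esp is the .cfa itself.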

This is pretty straightforward, and very general. So it's worth considering whether the data we parse out of the .pdata section, or whatever, can be put in this form.
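
As a purely hypothetical illustration of what "put in this form" could mean on ARM, suppose the prolog parser has determined that a function begins with a single "stmdb sp!, {r4, lr}". The sketch below turns that into two records. The register names (sp, lr, r4), the use of a .ra pseudo-register for the return address, and the function name stack_cfi_for_push are all assumptions made for this example; nothing in this bug settles what the ARM records would actually look like.

```python
from typing import List


def stack_cfi_for_push(func_start: int, func_size: int, pushed: List[str],
                       insn_size: int = 4) -> List[str]:
    """Emit STACK CFI text for a function whose first instruction pushes
    `pushed` (listed lowest register first, i.e. lowest address first)."""
    # At entry nothing has been pushed: the CFA is sp, return address is in lr.
    init = f"STACK CFI INIT {func_start:x} {func_size:x} .cfa: sp 0 + .ra: lr"

    count = len(pushed)
    rules = [f".cfa: sp {4 * count} +"]
    for index, reg in enumerate(pushed):
        name = ".ra" if reg == "lr" else reg
        # stmdb stores the lowest-numbered register at the lowest address,
        # so the first register ends up furthest below the CFA.
        rules.append(f"{name}: .cfa {4 * (count - index)} - ^")

    # The new rules take effect after the push instruction has executed.
    delta = f"STACK CFI {func_start + insn_size:x} " + " ".join(rules)
    return [init, delta]
```

For a function at 0x11000 of size 0x40, stack_cfi_for_push(0x11000, 0x40, ["r4", "lr"]) yields an INIT record for the entry state and one delta at 0x11004 saying that r4 and the return address now live at .cfa-8 and .cfa-4.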

If there are difficulties, it will be because the .pdata doesn't provide enough detail.  For example, the Windows x86 unwinding data, which Breakpad symbol files store as STACK WIN records, describes only the layout of the fixed section of the frame, as created by the prologue. It doesn't describe how the stack pointer changes as the function runs; you need to use data from the callee to find that, and we can't know the callee at dump time.
Not now.
Status: NEW → RESOLVED
Closed: 14 years ago
Resolution: --- → WONTFIX