Closed Bug 464750 Opened 11 years ago Closed 10 years ago

Breakpad should use DWARF CFI for stack unwinding

Categories

(Toolkit :: Crash Reporting, defect)

x86
Linux
defect
Not set

Tracking

()

RESOLVED FIXED

People

(Reporter: jimb, Assigned: jimb)

References

Details

We'd like to be able to build with -fomit-frame-pointer on x86, because this frees up a register in functions that have fixed stack frame sizes.  However, breakpad can't generate backtraces through functions compiled with -fomit-frame-pointer.

DWARF CFI (Call Frame Information) is instruction-by-instruction annotation of the machine code indicating how to unwind the stack.  We should be able to convert DWARF CFI to the format breakpad uses for this stuff (the STACK WIN records, I guess).
a) Gonna need to get this upstream
b) There's no DWARF support at all yet on Linux (but maybe you don't care?)
c) You volunteering? :)
(In reply to comment #1)
> a) Gonna need to get this upstream

That happens, right? I mean, we can assume we'll get breakpad patches upstream in general -- I hope.

> b) There's no DWARF support at all yet on Linux (but maybe you don't care?)

We would use this on Mac even if Linux lagged (like that never happens! :-P)

> c) You volunteering? :)

I hope Jim is, but it's up to him and mgmt.

/be
Is there a way to add this to a try server build to see what the upside is here?
You should be able to test the vomit-frame-pointer performance impact without caring about the breakpad symbols. And yeah we can easily upstream stuff, I'm just opposed to taking local patches was my point.
Yeah, I'm asking if the try servers will clobber for us.
Work upstream, definitely.
Changeset a8fc7f522e6e is in the Try server; it uses -fomit-frame-pointer throughout (or at least, widely; I don't know that I caught every case).
Assignee: nobody → jim
Fwiw, I'm reasonably familiar with DWARF CFI stuff too and would probably "enjoy" writing the part of the symbol-dumper required. Assuming we want it and Jim didn't feel inclined.
I'm pretty sure we do want it, because it's a significant speed increase; you drop three instructions and free up a register in most functions.

I'd enjoy doing this, too.  I think whichever one of us has free bandwidth first should take it.
I'm available for moral support and guidance into the breakpad code, as always.
I've started work on this.  Status quo, at start of project:
- Linux is stabs-only.
- Breakpad DWARF code is Mac-only.
- Breakpad DWARF code maps PCs to functions and source locations, but has no CFI parsing.

So it seems like the agenda is:

* Make the DWARF reader code shareable between Mac and Linux.  At the moment, the DWARF code is in a mac subdirectory, and only used on Mac.

* Lightly refactor Linux code to separate STABS parsing from Breakpad symbol file writing.  This separation almost exists already.

* Get the Linux dump_syms program to write pc-to-func/source mapping data based on DWARF.

* Add CFI parsing code to the DWARF reader, and code to convert it to Breakpad's format and write it out.
Status: I've posted a bunch of cleanup patches to breakpad (some of which speed up symbol dumping on Linux roughly 20x).  I'm currently adapting the Linux code to use a representation for the debugging information that it can share with the DWARF reader, so that we can read both STABS and DWARF information from an executable, and dump a single breakpad-format file.
I've posted a 16-patch series to the breakpad list for review.
http://groups.google.com/group/google-breakpad-dev
Over the weekend, I got the DWARF-and-STABS-on-Linux reader working.  This is an additional ten-patch series.  It's not thoroughly tested, but it produces plausible-looking output.  There are no special flags to pass: it just aggregates whatever debugging information it finds and writes it all out as a single breakpad symbol file.

Note that this does not handle Call Frame Information yet, so we still can't use -fomit-frame-pointer.  But it does bring us to the point where we can begin work on that.

What this does let us do now (well, once it has been reviewed and baked) is stop using STABS on Linux.  Of course, except for CFI, the breakpad symbol format is even less expressive than STABS, so switching from STABS to DWARF won't improve our crash dump reporting in any material way.  But at least we can stop passing funny compiler flags.

Full DWARF-and-STABS-on-Linux patch series at:
http://hg.mozilla.org/users/jblandy_mozilla.com/breakpad-mq/

Of the original patch series mentioned in comment 14:

- All but the last two --- the most substantial --- have been reviewed, revised, and landed in the Google breakpad repository.

- Google's Neal Sidhwaney has reviewed the last two; I've made the revisions he suggested, except for one naming suggestion still being discussed.

- Neal pointed out the lack of unit tests for the new code; rectifying that will take a few days.
I think I noticed you fixing some weird stabs bugs along the way, and I know we picked up a few fixes for free when we switched from stabs to dwarf on OS X, things like inline handling and some weird missing filenames, off the top of my head.
That being said, I don't see any real pressing reason to sync with the upstream Breakpad code until you get your CFI work done.
(In reply to comment #16)
> I think I noticed you fixing some weird stabs bugs along the way

Yes, there were three bug fixes.  I was doing before/after comparisons as I worked, so whenever I'd do something right that the old code got wrong, I'd just fix it in the old code to have a minimally disturbed basis for comparison. 

Probably the most important bug fix gave us mangled names with some STABS junk attached for certain functions.  I can backport that if you think it would make sense.

(In reply to comment #17)
> That being said, I don't see any real pressing reason to sync with the upstream
> Breakpad code until you get your CFI work done.

I don't see one either.
The original sixteen-patch series (Linux dumper restructuring, no DWARF) is now fully committed to Google's breakpad SVN repo.
Status: NEW → ASSIGNED
Depends on: 517822
I'm going to break down the remaining work on this bug into separate blocking bugs.
Depends on: 517824
Depends on: 517825
Depends on: 517826
Depends on: 517827
Depends on: 517829
Depends on: 517830
Okay.  The "show dependency graph" function might be useful on this one. :)
Depends on: 543542
Blocks: 543843
Blocks: 543844
Blocks: 517832
Blocks: 548113
Status: ASSIGNED → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Blocks: 554364
You need to log in before you can comment on or make changes to this bug.