Closed
Bug 421534
Opened 17 years ago
Closed 16 years ago
Mac: generate breakpad .sym from dwarf symbols instead of stabs
Categories
(Toolkit :: Crash Reporting, defect)
Tracking
()
RESOLVED
FIXED
mozilla1.9.2a1
People
(Reporter: benjamin, Assigned: ted)
References
Details
(Keywords: fixed1.9.1)
Attachments
(3 files, 4 obsolete files)
2.73 KB,
patch
|
benjamin
:
review+
benjamin
:
approval1.9.1+
|
Details | Diff | Splinter Review |
2.78 KB,
patch
|
ted
:
review+
|
Details | Diff | Splinter Review |
7.13 KB,
patch
|
ted
:
review+
|
Details | Diff | Splinter Review |
Dwarf is the new good thing, and we've encountered bugs in GCC where building with stabs just won't work. So we should bite the bullet and get the ability to get crash-reporting data out of dwarf instead of stabs. Licensing is an issue if we're going to commit this upstream in breakpad, so use of GPL dwarf-reading libraries is not so good.
Shebs has been looking into this.
Assignee | ||
Comment 1•17 years ago
|
||
Comment 2•17 years ago
|
||
Just a little status update - I'm now finding and opening the .o files from the main executable, and the output is the cumulative result. However, I still need to come up with the correct base address for each .o, since all the lines are jumbled together. (Also I regressed stabs handling, so need to start cleaning up my hacks so I have the base case to compare against.)
Comment 3•17 years ago
|
||
OK, my read-.o-files-recursively hack doesn't actually work, things aren't happening in the right order to get the not-relocated symbols in the .o files to be moved to their real addresses. So gotta do a little rewrite, basically cache the .o files until we're done with the main symbols, then suck in the .o symbols and match each one with the main program version. (This is what Apple GDB has to do, fortunately we don't have to also support partial symbol tables!)
Comment 4•17 years ago
|
||
I think we need this before we can switch Mac nightlies/releases to Leopard, or we'll have to take the hit of using -save-temps (which bloats out the build time and tree).
Flags: blocking1.9?
Comment 5•17 years ago
|
||
Here is an ultra-hacky first-draft version of Apple-style dwarf2 handling for breakpad. Its only virtues are it gets through assorted Mozilla executables and dylibs without dying (usually), and produces plausible-looking output. There are hordes of omissions and limitations.
Assignee | ||
Comment 6•17 years ago
|
||
Already marked as blocking a blocker...
Flags: blocking1.9? → blocking1.9+
Priority: -- → P1
Comment 7•17 years ago
|
||
Current status is that it's run as part of "make buildsymbols" on a regular trunk build, but there are several serious problems that will have to be resolved before it's an adequate replacement for stabs, such as the need to produce full pathes to source files, mistaken address offsets in separate .o files, and the need to get symbols from objects in .a archives. At least another week will be needed to fix all these.
Assignee | ||
Comment 8•17 years ago
|
||
Looks like we're not going to get bug 417045 regardless, and we don't have a week to do this, so minusing now. I really think we'll want this in the Moz 2 timeframe. Shebs: could you just post a patch of what you've got done, with some comments on what still needs to be finished? Hopefully you or someone else will be able to pick this up and finish it off later.
Flags: wanted-next+
Flags: blocking1.9-
Flags: blocking1.9+
Comment 9•17 years ago
|
||
I've got some half-written pieces to finish off while they're flying around in my head, can take a couple days on those and then put in upstream while moz trunk is frozen.
Updated•17 years ago
|
Target Milestone: --- → mozilla1.9beta5
Assignee | ||
Updated•17 years ago
|
Priority: P1 → --
Target Milestone: mozilla1.9beta5 → ---
Comment 10•17 years ago
|
||
Hey Stan any updates here? We'd like to do this switch quickly after the release of FF3...
Comment 11•17 years ago
|
||
I managed to create a terrible muddle in my code trying to handle rearrangement of functions in object files (not hypothetical, one of the things ICC does differently from GCC), and set it aside to work on blockers and dehydra. I'll bring it forward again.
Comment 12•17 years ago
|
||
FWIW, we just enabled breakpad symbols on mozilla-central builds and discovered that with -gstabs -gfull there is problems...there is little to no source information and Ted said something about a visibility issue. (For reference here's the crash report and the symbols: http://crash-stats.mozilla.com/report/index/08cae0d7-06fc-11dd-85ea-001b78bc73ea http://symbols.mozilla.org/firefox/XUL/D3DEE43FA991794589AE5146C580E5110/XUL.sym)
Comment 13•17 years ago
|
||
OK, translation of relative addresses in object files is basically working, though the code is somewhat hacky and not yet efficient enough to use on the browser.
Comment 14•17 years ago
|
||
WIP version of changes, just as a checkpoint.
Attachment #309018 -
Attachment is obsolete: true
Comment 15•17 years ago
|
||
Address translation is now producing completely correct results, and efficiently enough to use with "make buildsymbols". Unfortunately, dump_syms crashes on seven of the files, with what looks like memory corruption. There are still some discrepancies between stabs and dwarf dumps, though some are unavoidable, and other may be desirable; the stabs dump will show some inline bodies that the dwarf dump skips (this is a good thing because otherwise the .h file of the inline gets listed in the stack dump, and the calling function never appears).
Comment 16•17 years ago
|
||
The crashes are now fixed, and I've added the heuristic that recognizes a random symbol as a function if there is any line number info at all in its address range (the hybrid stabs/dwarf scheme used by OS X does not have N_FUN stabs giving a length, so we get "PUBLIC" instead of "FUNC" entries). Remaining key issues are to get info from .a archives, and to figure out what to do about multiple source files apparently given the same address.
Comment 17•17 years ago
|
||
Reading from archives is working, and the multiple source files at one address is a fundamental design flaw that affects stabs also (the existing code simply uses the last file it found, whether or not it's the right choice), so I'm punting on it. I'm now cleaning out the temporary hacks in my code and putting it into "Google style". Also I went to check that the dwarf bits worked OK in 64-bit, but it's not even recognizing the file as valid, so need to debug that.
Comment 18•17 years ago
|
||
This is the code I plan to submit for incorporation into Google's Breakpad sources. I started on 64-bit support, but it's going to need more before that is functional. I did regression-test with "make buildsymbols", and there is no effect on stabs, so no regression for the existing 10.4 builds.
Attachment #315623 -
Attachment is obsolete: true
Comment 19•17 years ago
|
||
The patch is currently being reviewed by Mark Mentovai and others.
Comment 20•17 years ago
|
||
Xref to the Google project issue: http://code.google.com/p/google-breakpad/issues/detail?id=236
Assignee | ||
Comment 21•17 years ago
|
||
I applied this patch, built a whole universal Firefox with libmozjs.dylib from ICC, then ran buildsymbols on it, crashed the resulting Firefox, and tried running minidump_stackwalk on the minidump using the symbols. I didn't get a stack trace, because apparently the symbol reader is unhappy with having line number data for PUBLIC functions:
2008-05-15 09:35:50: basic_source_line_resolver.cc:736: ERROR: Found source line data without a function at /Users/luser/build/obj-firefox-icc/ppc/dist/crashreporter-symbols/2008051508//libmozjs.dylib/21FEC0F63F7344B9928E28074CF30AD50/libmozjs.dylib.sym:118
The offending line is the second line here:
PUBLIC c60 0 math_static_methods.0
c62 1f 5803 1
c81 5 5808 1
This happens for every symbol file it tries to load, so I don't get a real stack trace.
Comment 22•17 years ago
|
||
Hmm, there's a heuristic for attaching a line number to a random symbol, necessary because in the dwarf2 format symbols don't get line numbers directly like is done in stabs - but the heuristics requires address of symbol and line to be the same, and here we have symbol at 0xc60 but the first line is at 0xc62. Probably the compiler should have issued line number info for the function. I think it's safe to have the dumper change PUBLIC to a FUNC if line number info follows, but we have no way to guess at the line number, so the stack walker just needs to cope with that (which it should, since a stack dump missing only a couple frames of symbol info is still useful).
Assignee | ||
Comment 23•17 years ago
|
||
Right, I was probably misleading above, I got a partial stack trace, I just didn't get any symbol information, because the symbol file reader aborts when it hits an instance of this.
Assignee | ||
Comment 24•17 years ago
|
||
Oh, there's also this:
008-05-27 16:08:46: basic_source_line_resolver.cc:736: ERROR: Found source line data without a function at /Users/luser/build/obj-firefox-icc/ppc/dist/crashreporter-symbols/2008051508//XUL/8D61558EF045AFE2F6CA72CAACADF0340/XUL.sym:340034
That's the first line after the FILE lines, it just goes right into line number data, without a FUNC or a PUBLIC.
Comment 25•16 years ago
|
||
Is this still being worked on? Symbols are still eating up a _ton_ of space on our Mac m-c/actionmonkey/tracemonkey slaves.
Comment 26•16 years ago
|
||
This is still hurting really bad on mac m-c/actionmonkey/tracemonkey slaves. Any update here?
Assignee | ||
Comment 27•16 years ago
|
||
We have a contractor lined up to do some work, but I haven't had time to interface with him enough. Hopefully we'll get to that very soon.
Comment 28•16 years ago
|
||
we discovered over in bug 448616 that upgrading to Xcode 3.1 will let us build without -save-temps, so this issue is going to be less critical from a build-infra perspective.
Comment 29•16 years ago
|
||
It looks like breakpad got dwarf support added on Oct 8. Is there anything holding this up now?
Assignee | ||
Comment 30•16 years ago
|
||
Need to sync up with upstream Breakpad code, make some local changes to match the upstream changes, and then change the symbol build script to produce .dSYM files, as that's what the Breakpad DWARF code expects. Probably ~1 week worth of work. I plan to make that happen once I finish off a few other things.
Assignee | ||
Updated•16 years ago
|
Assignee | ||
Updated•16 years ago
|
Attachment #317448 -
Attachment is obsolete: true
Assignee | ||
Comment 31•16 years ago
|
||
Depends on the patches from bug 465329. I'm still testing the output of this. It looks sort of ok, but might have some issues.
Assignee | ||
Comment 32•16 years ago
|
||
This definitely has some bugs. Two I've found so far:
1) Doesn't support DW_AT_specification, so it completely misses some functions. (Although it has the line number data for them, it doesn't get the function name.)
2) Doesn't provide the full function name for functions.
I'll file these both upstream and fix them there, then sync up again.
Assignee | ||
Comment 33•16 years ago
|
||
Assignee | ||
Comment 34•16 years ago
|
||
I have patches for both issues, should get them reviewed and landed upstream very soon, then I'll some rs to get them in our repo, and we'll be set. With those two issues fixed and the patch here, I'm able to build symbols from dwarf debuginfo and get a valid stack out the other end.
Assignee | ||
Comment 35•16 years ago
|
||
This works great with the breakpad sync + new patches that need to land upstream. I was getting tripped up when I ran buildsymbols twice without cleaning up the .dSYM files in between, so I added some code to work around that situation. Shouldn't be needed in the usual case, but I know I'll hit this again in future testing, so I'll save myself some hassle.
Attachment #350821 -
Attachment is obsolete: true
Attachment #351432 -
Flags: review?(benjamin)
Reporter | ||
Updated•16 years ago
|
Attachment #351432 -
Flags: review?(benjamin) → review+
Comment 36•16 years ago
|
||
Attachment #351887 -
Flags: review?(ted.mielczarek)
Assignee | ||
Comment 37•16 years ago
|
||
Comment on attachment 351887 [details] [diff] [review]
patch for nightly/release mozconfigs [checked in]
I'm r+ing this so much!
Attachment #351887 -
Flags: review?(ted.mielczarek) → review+
Assignee | ||
Comment 38•16 years ago
|
||
Comment on attachment 351432 [details] [diff] [review]
with a little protection if you run buildsymbols twice [checked in]
Pushed:
http://hg.mozilla.org/mozilla-central/rev/ede5b83266d3
bhearsum: close this bug when you push that mozconfig change, please.
Attachment #351432 -
Attachment description: with a little protection if you run buildsymbols twice → with a little protection if you run buildsymbols twice [checked in]
Comment 39•16 years ago
|
||
Comment on attachment 351887 [details] [diff] [review]
patch for nightly/release mozconfigs [checked in]
changeset: 587:b76bbdd5af57
Attachment #351887 -
Attachment description: patch for nightly/release mozconfigs → patch for nightly/release mozconfigs [checked in]
Updated•16 years ago
|
Status: NEW → RESOLVED
Closed: 16 years ago
Resolution: --- → FIXED
Assignee | ||
Updated•16 years ago
|
Attachment #351432 -
Flags: approval1.9.1?
Assignee | ||
Comment 40•16 years ago
|
||
Comment on attachment 351432 [details] [diff] [review]
with a little protection if you run buildsymbols twice [checked in]
See my long justification in bug 465329 comment 11.
Reporter | ||
Updated•16 years ago
|
Attachment #351432 -
Flags: approval1.9.1? → approval1.9.1+
Comment 41•16 years ago
|
||
So, I've just symlinked mozilla-1.9.1 and tracemonkey to mozilla-central -- the only difference between these directories was stabs vs. dwarf. May as well symlink to reduce maintenance burden.
Attachment #354187 -
Flags: review?(ted.mielczarek)
Assignee | ||
Updated•16 years ago
|
Attachment #354187 -
Flags: review?(ted.mielczarek) → review+
Updated•16 years ago
|
Attachment #354187 -
Attachment description: switch to dwarf on tracemonkey and mozilla-1.9.1 → [checked in] switch to dwarf on tracemonkey and mozilla-1.9.1
Comment 42•16 years ago
|
||
Comment on attachment 354187 [details] [diff] [review]
[checked in] switch to dwarf on tracemonkey and mozilla-1.9.1
changeset: 608:b836180f8eb8
Assignee | ||
Comment 43•16 years ago
|
||
Pushed to 1.9.1:
http://hg.mozilla.org/releases/mozilla-1.9.1/rev/2385ebb971f8
Keywords: fixed1.9.1
Updated•16 years ago
|
Target Milestone: --- → mozilla1.9.1b3
Version: unspecified → 1.9.1 Branch
Updated•16 years ago
|
Target Milestone: mozilla1.9.1b3 → mozilla1.9.2a1
You need to log in
before you can comment on or make changes to this bug.
Description
•