Closed Bug 528092 Opened 10 years ago Closed 5 years ago

Supply binaries (firefox.exe and DLLs) from symbol server

Categories

(Toolkit :: Crash Reporting, defect)

x86
Windows Vista
defect
Not set

RESOLVED FIXED
mozilla41
Tracking Status
firefox41 --- fixed

People

(Reporter: dmandelin, Assigned: ted)

Attachments

(1 file, 3 obsolete files)

I use the symbol server to debug minidumps from topcrashes. It doesn't work (that is, symbols are not loaded for Firefox binaries) if I just configure Visual Studio to connect to our symbol server and load the dump. But I'm told that it is possible to set up the symbol server so that this just works.

Currently, I instead have to separately download the binaries (and it's not always easy to identify which build the binary is, or even if it is an official build), and then add a "MODPATH=PATH_TO_BINARIES" to the project properties in MSVC.
It would be infinitely helpful if you include who told you that this was possible, so that we can find out what needs doing.
(In reply to comment #1)
> It would be infinitely helpful if you include who told you that this was
> possible, so that we can find out what needs doing.

OK. :-) I personally don't understand it, so I just hoped that someone would magically know. But dvander is the one who told me it can work, cc'ing him to ask for more info.
Since our symbol server only has .pdbs, it's definitely not possible to download binaries from it. Are you asking for an automated way to download the binaries, or some way to debug these minidumps without having the binaries at all?
The "symstore" program can add .dlls along with .pdbs, and VS will fetch them normally.
Yeah, I can't find any real documentation about this, but I know it exists. From fiddling around with symstore.exe, it looks like the binaries are stored in a different path from the PDBs. For example, my firefox.exe wound up at:
firefox.exe/4AC5E3D017000/firefox.exe

The hex string there is the combination of timestamp+file size from the PE headers (checked with dumpbin, confirmed by this blog post: http://www.differentpla.net/content/2009/04/symbol-store-how-are-exe-files-stored)
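To illustrate the layout described above, here is a minimal Python sketch that derives the store path from the PE headers. The function names are hypothetical and not part of symbolstore.py or symstore.exe. It assumes the identifier is the TimeDateStamp as 8 uppercase hex digits followed by SizeOfImage in unpadded hex, which is consistent with the 4AC5E3D017000 example but not confirmed by any official documentation.

```python
import struct


def code_id_from_pe(data):
    """Return timestamp+size identifier for a PE image given its raw bytes.

    Hypothetical helper: the formatting (8-digit uppercase timestamp,
    unpadded size) is an assumption based on the example in this bug.
    """
    if data[:2] != b"MZ":
        raise ValueError("not a PE file: missing MZ signature")
    # e_lfanew at offset 0x3C of the DOS header points to the PE signature.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("not a PE file: missing PE signature")
    # COFF header follows the 4-byte signature; TimeDateStamp is 4 bytes in.
    (timestamp,) = struct.unpack_from("<I", data, pe_offset + 8)
    # The optional header starts after the 20-byte COFF header; SizeOfImage
    # is at offset 56 within it for both PE32 and PE32+.
    (size_of_image,) = struct.unpack_from("<I", data, pe_offset + 24 + 56)
    return "%08X%x" % (timestamp, size_of_image)


def binary_store_path(filename, data):
    """Build the <name>/<code id>/<name> path seen in the symbol store."""
    return "/".join([filename, code_id_from_pe(data), filename])
```

Calling binary_store_path("firefox.exe", open("firefox.exe", "rb").read()) on the build from this comment should yield the same kind of path as shown above; the values can be cross-checked against dumpbin /headers.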

We currently store the pdb files in symbolstore.py's CopyDebug method:
http://mxr.mozilla.org/mozilla-central/source/toolkit/crashreporter/tools/symbolstore.py#599
we could feasibly copy the binaries there too, with a little tweaking.

We might want to make this conditional, since we're packaging and uploading zip files of debug symbols for every single build we do nowadays (so we can get stack traces from unit tests and Talos), and sticking all the binaries in there will make those zips considerably bigger at no benefit to anyone. We could probably work things so that we only add binaries for nightlies and release builds.
Component: Release Engineering → Breakpad Integration
Product: mozilla.org → Toolkit
QA Contact: release → breakpad.integration
Version: other → Trunk
And to answer bsmedberg's question, this is a convenience for postmortem debugging of minidumps. The debugger will fetch the matching binaries from the symbol server so you don't have to find and install the binaries that were in use when the dump was produced.
(In reply to comment #5)
> We could
> probably work things so that we only add binaries for nightlies and release
> builds.

Totally agree with this, shouldn't have that big an impact on the space used on the symbol server.
Related thing that should be done while fixing this bug. We've seen a few bugs (like bug 575100) where the CodeView record for a module is missing, so Breakpad can't find debug symbols for that module, so we get a crummy stack. Opening the minidump in a local debugger works, because it has access to the original binaries, and the minidump contains the module name + code identifier, so the debugger verifies that it has the right binary module, then reads the CodeView record out of the binary and locates the symbols.

If, while dumping symbols, we also left a breadcrumb trail in the symbol store using the code identifier, pointing to the debug symbols, we could teach Breakpad to follow this trail in the absence of a CV record.
I filed an issue upstream with Breakpad for supporting such a breadcrumb trail:
http://code.google.com/p/google-breakpad/issues/detail?id=389
Blocks: 559661
I've got an upstream patch that will make dump_syms output the info needed to get the directory names mentioned in comment 5:
http://breakpad.appspot.com/180001

I should be able to whip up a Mozilla build system patch on top of that that fixes this bug.
Attached patch: Store binaries in symbol package (obsolete)
Turns out to be pretty simple on top of the upstream patch. Not requesting review yet, since I want feedback on the Breakpad change first.
Assignee: nobody → ted.mielczarek
Status: NEW → ASSIGNED
Attachment #470615 - Attachment is obsolete: true
I'd love to have support for this, not just for the better symbols, but also to be able to see the disassembly surrounding each callsite on the stack. Any chance of reviving the patch?
The biggest blocker for this is storage on the symbol server. If we can get that sorted out this should be easy to get landed.
It's about +12% to add the binaries. Who can I talk to about the storage space?
Start with Lonnen.
It doesn't look like we have space for this yet. After we move our primary data store away from hbase to a plain hdfs we could potentially host symbols from there, which would unblock this.
tmary was looking into that, FWIW.
(In reply to Ted Mielczarek [:ted.mielczarek] from comment #18)
> tmary was looking into that, FWIW.
(In reply to Chris Lonnen :lonnen from comment #17)
> It doesn't look like we have space for this yet. After we move our primary
> data store away from hbase to a plain hdfs we could potentially host symbols
> from there, which would unblock this.

How much space is required to store the symbols, binaries, etc. (uncompressed)?
(It might be possible to store them in existing PHX1 clusters, even before migration of raw crashes away from HBase - if that isn't possible, it should certainly be possible to store them in SCL3 cluster)

We could put it into hbase, but I don't think we want to spend the engineering effort to read symbols from a deprecated data store instead of netapp. It looks to be ~6 TB of symbols, uncompressed.
Once we fix bug 1071724 I'll get this landed since we won't be space constrained any more.
Depends on: 1071724
Yeah, we don't actually need to wait on bug 1071724; the important bits for this are done now that we'll stop uploading Windows desktop Firefox symbols via SSH.
Depends on: 1153799, 1085557
No longer depends on: 1071724
Attached file: MozReview Request: bz://528092/ted (obsolete)
/r/8339 - bug 528092 - Supply binaries from the symbol server. r?gps

Pull down this commit:

hg pull -r e63833370cc58a3068a9f314ac25f3644016cb53 https://reviewboard-hg.mozilla.org/gecko/
Attachment #8602741 - Flags: review?(gps)
Attachment #471901 - Attachment is obsolete: true
https://reviewboard.mozilla.org/r/8339/#review7179

::: toolkit/crashreporter/tools/symbolstore.py:771
(Diff revision 1)
> +                except OSError:

if e.errno != errno.EEXIST:
    raise
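For context, the review suggestion above is the usual pattern for creating a directory that may already exist: swallow only EEXIST and re-raise everything else. A minimal standalone sketch follows; ensure_dir is a hypothetical name and this is not the code from the patch itself.

```python
import errno
import os


def ensure_dir(path):
    """Create path (and parents), ignoring only 'already exists' errors."""
    try:
        os.makedirs(path)
    except OSError as e:
        # Another process or an earlier call may have created the
        # directory; any other failure (permissions, bad path) is real.
        if e.errno != errno.EEXIST:
            raise
```

On Python 3.2 and later, os.makedirs(path, exist_ok=True) covers the same case without the explicit errno check.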
Comment on attachment 8602741 [details]
MozReview Request: bz://528092/ted

https://reviewboard.mozilla.org/r/8337/#review7181

Ship It!
Attachment #8602741 - Flags: review?(gps) → review+
The reason for the "make check" hang that caused the backout was that I neglected to update the mocked version of copy_debug in one unit test:
https://hg.mozilla.org/mozilla-central/annotate/1fab94ad196c/toolkit/crashreporter/tools/unit-symbolstore.py#l169

Unfortunately symbolstore.py will hang if a background task raises an uncaught exception. I filed bug 1164816 on that, and wrote some patches to fix it.
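As a general illustration of avoiding this class of hang (not the actual fix from bug 1164816, and not how symbolstore.py is structured), one option is to keep a handle on every background task and ask each one for its result, so that a worker's exception is re-raised in the caller instead of being lost. The sketch below uses concurrent.futures; run_all is a hypothetical helper.

```python
from concurrent.futures import ThreadPoolExecutor


def run_all(tasks, max_workers=4):
    """Run zero-argument callables in a pool and return their results.

    Future.result() re-raises any exception raised inside the task, so a
    failing background job surfaces here rather than leaving the caller
    waiting for a completion that never arrives.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(task) for task in tasks]
        return [future.result() for future in futures]
```

With this shape, a unit test that mocks a task with a function raising an exception fails promptly instead of hanging.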
(I also wrote a unit test for the changes in this patch; it's included in that try push.)
https://hg.mozilla.org/mozilla-central/rev/677684d28f2e
Status: ASSIGNED → RESOLVED
Closed: 5 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla41
Attachment #8602741 - Attachment is obsolete: true
Attachment #8617979 - Flags: review+