dump_syms doesn't produce symbol information for executables
Categories
(Toolkit :: Crash Reporting, defect)
Tracking
()
| Tracking | Status | |
|---|---|---|
| firefox-esr68 | --- | unaffected |
| firefox-esr78 | --- | wontfix |
| firefox78 | --- | wontfix |
| firefox79 | --- | wontfix |
| firefox80 | --- | fixed |
People
(Reporter: froydnj, Assigned: froydnj)
References
(Regression)
Details
(Keywords: regression)
Attachments
(1 file)
TL; DR: we handle cpusubtype information badly and it causes us to silently drop symbol information for executables.
I attempted to rebase my patch in bug 1576748 so that moving to the new dump_syms implementation would be easier on try -- failures would be swift and immediate, and we wouldn't have to worry about silently breaking crash reporting...which my initial push with the new dump_syms implementation for Mac did, and I might have wound up pushing it to central if I hadn't inspected more closely.
I thought I remember having things working with the patch -- it went through several rounds of being backed out the first time around, but pushing it to try met with errors like:
Unable to calculate UUID of mach-o binary /builds/worker/workspace/obj-build/dist/bin/firefox!
which is...uh, unexpected, because on a local OS X cross build that I have lying about, one can see that there is indeed a UUID load command:
froydnj@hawkeye:~/src/dump_syms$ ~/.mozbuild/clang/bin/llvm-objdump -p /opt/build/froydnj/build-macosx-cross/dist/bin/firefox |grep -B 1 -A 2 LC_UUID
Load command 8
cmd LC_UUID
cmdsize 24
uuid 302C8E84-80E4-3587-94AA-8DA186A126CF
but for whatever reason breakpad isn't able to access it.
Looking a little further afield, our crashreporter symbols for Mac don't feature any symbols for any of the executables we produce during the build process; we only produce symbols for dynamic libraries. Presumably this is because (currently) we are silently erroring out on executables and we just don't notice.
So what is the difference, exactly, between what executables look like and dynamic libraries look like? The first clue lies in the file header, which for firefox looks like:
/opt/build/froydnj/build-macosx-cross/dist/bin/firefox: file format Mach-O 64-bit x86-64
Mach header
magic cputype cpusubtype caps filetype ncmds sizeofcmds flags
MH_MAGIC_64 X86_64 ALL LIB64 EXECUTE 22 2256 NOUNDEFS DYLDLINK TWOLEVEL PIE
and for, to pick an example at random, libnss3.dylib:
/opt/build/froydnj/build-macosx-cross/dist/bin/libnss3.dylib: file format Mach-O 64-bit x86-64
Mach header
magic cputype cpusubtype caps filetype ncmds sizeofcmds flags
MH_MAGIC_64 X86_64 ALL 0x00 DYLIB 19 2296 NOUNDEFS DYLDLINK TWOLEVEL NO_REEXPORTED_DYLIBS
This output is courtesy of llvm-objdump, and the code that prints that header comes from here:
The cpusubtype field in the above output comes from consulting the cpusubtype field of the file header:
The corresponding code that prints ALL comes from here:
Notice that we are masking off CPU_SUBTYPE_MASK to switch on the value; that's important, and we'll come back to it later.
Where is the caps field coming from? Well, the upper eight bits of the cpusubtype header field actually store various interesting things; you can see the llvm-objdump code consulting that field to print out LIB64 for the firefox binary:
The LIB64 bit is defined as 64 bit libraries here:
Why, exactly, firefox gets a bit indicating 64-bit libraries and the actual dynamic library does not is a question better left unexplored, I think. (I guess this simplifies the lookup or loading process for dependent libraries of a binary inside userspace, perhaps?)
Anyway, those upper eight bits for cpu subtypes are really all that differs between firefox and libnss3.dylib...and they make a difference hereabouts:
because at that point, when we're walking over the headers for firefox, we have a header.cpusubtype field of 0x80000003 (CPU_SUBTYPE_LIB64 | CPU_SUBTYPE_X86_64_ALL) and the passed in cpu_substype originally came from the file header itself. So before that line, header.cpusubtype == cpu_subtype. But after that line, we've masked off the upper bits and we've guaranteed that we cannot pass this test:
And because we don't pass that test, the MachoWalker thinks that we can't hope to find any relevant load commands in the file, and ultimately fails to produce symbols.
I don't know how to convince treeherder to show me jobs from eight months ago, but I bet we'd see that we silently stopped collecting symbols for executables when bug 1371390 landed.
I don't know what the right fix is. If Steven's bug 1371390 comment 31 is correct, I think we should be masking with ~CPU_SUBTYPE_MASK at least here:
because selected_object_file_->cpusubtype comes directly from the object itself, and possibly here:
We might never actually hit that second case, though, because I think we explicitly pass in an arch.
The alternate fix is to declare that we just don't care about executable symbols -- and we don't want to have to store anything more than we have to -- and just filter out those files in symbolstore.py so that we never have to worry about dump_syms getting invalid input. WDYT?
| Assignee | ||
Updated•5 years ago
|
Comment 1•5 years ago
|
||
In bug 1371390 comment 31 I promised to open another bug about fixing the other references to mach_header.cpusubtype in Breakpad code. I'd forgotten about that. But I could do it, if there's any interest.
| Assignee | ||
Comment 2•5 years ago
|
||
The cpusubtype fix seems to work:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=bf6d98fbda4a3017b1bfb2fd40410bd75fe30e09
The target.crashreporter-symbols.zip archives have symbols for the executables in them!
Updated•5 years ago
|
| Assignee | ||
Comment 3•5 years ago
|
||
If we don't do this, we run into problems when walking over the Mach-O load
commands, where we do mask off cpusubtype bits.
Updated•5 years ago
|
Comment 5•5 years ago
|
||
| bugherder | ||
Updated•5 years ago
|
Updated•5 years ago
|
Description
•