Bug 855466
Opened 12 years ago
Closed 8 years ago
Build Linux Nightly to support Breakpad Unwinding
Categories
(Core :: Gecko Profiler, defect)
Tracking
()
RESOLVED
WORKSFORME
People
(Reporter: BenWa, Unassigned)
References
Details
We need Linux Nightly to support Breakpad unwinding. We likely want these builds to ship with CFI information, but anything that makes Breakpad unwinding work efficiently will do.
Comment 1•12 years ago
The x86-64 nightlies should already be fine, since they ship with .eh_frame and function symbols.
I'm not sure what we'd need to do to make the 32-bit nightlies usable. Shipping them completely unstripped isn't workable because libxul is huge, and we don't actually need all that info. I'm not sure if there's a way to either produce the equivalent of .eh_frame on x86 or strip everything but the CFI info.
Updated•12 years ago
OS: Mac OS X → Linux
Hardware: x86 → x86_64
Comment 2•12 years ago
x86-64 should be fine, right? It's just x86 that we need to do something special for.
Comment 3•12 years ago
I just tried this out, with both 32- and 64-bit nightlies. Starting
them thusly:
MOZ_PROFILER_NEW=1 MOZ_PROFILER_INTERVAL=100 MOZ_PROFILER_MODE=native \
./firefox/firefox -P dev -no-remote
in both cases they appear to load CFI and start native unwinds, etc.
In particular, in both cases I get
dump_symbols.cc:533: INFO: LoadSymbols: BEGIN /path/to/firefox/libxul.so
dump_symbols.cc:633: INFO: LoadSymbols: read CFI from .eh_frame
dump_symbols.cc:713: INFO: LoadSymbols: SUCCESS /path/to/firefox/libxul.so
In the 64-bit case, I can then back up in Cleopatra in the inverted
callstack after a bit of idling, ending up at XRE_Main::XRE_Main,
which is pretty convincing. Unfortunately not so in the 32-bit case;
the stack seems pretty trashy. I think the first priority is to
disable stack scanning (bug 855977) so we can see when the "reliable"
schemes (CFI, frame-pointer) are failing.
It might also be really useful to improve breakpad's logging, so
instead of saying just
read CFI from .eh_frame
it says something like
read CFI from .eh_frame, covering 1234567 text bytes
so we can get some idea of whether any useful amount of CFI was
obtained, or not.
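A minimal sketch of how such a coverage figure could be computed, assuming the CFI reader can hand over one (start address, length) pair per FDE. This is not Breakpad code; `covered_text_bytes` and `log_line` are hypothetical names used only for illustration.

```python
def covered_text_bytes(fde_ranges):
    """Total bytes covered by (start_address, length) ranges.

    Overlapping or duplicated ranges are counted once.
    """
    total = 0
    current_start = current_end = None
    for start, length in sorted(fde_ranges):
        end = start + length
        if current_end is None or start > current_end:
            # Gap before this range: close the previous merged run.
            if current_end is not None:
                total += current_end - current_start
            current_start, current_end = start, end
        else:
            # Overlaps or touches the current run: extend it.
            current_end = max(current_end, end)
    if current_end is not None:
        total += current_end - current_start
    return total


def log_line(section, fde_ranges):
    """Format the proposed message (hypothetical helper)."""
    covered = covered_text_bytes(fde_ranges)
    return f"read CFI from {section}, covering {covered} text bytes"
```

Comparing that number against the size of .text would show at a glance whether a library such as the 32-bit libxul.so carries CFI for most of its code or for almost none of it.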
Comment 4•12 years ago
Exciting!
(In reply to Julian Seward from comment #3)
> It might also be really useful to improve breakpad's logging, so
> instead of saying just
>
> read CFI from .eh_frame
>
> it says something like
>
> read CFI from .eh_frame, covering 1234567 text bytes
>
> so we can get some idea of whether any useful amount of CFI was
> obtained, or not.
I think such changes would be readily approved.
Comment 5•12 years ago
Oh, I didn't realize we had a .eh_frame section on x86. On x86-64 that's part of the ABI. I wonder if the compiler just isn't producing CFI in there for all the functions we care about, since it's probably only required to do so for things that could throw C++ exceptions.
Comment 6•12 years ago
Here's readelf -S output for both 64- and 32-bit Linux nightlies:
64-bit
[Nr] Name Type Address Off Size ES Flg Lk Inf Al
[14] .eh_frame_hdr PROGBITS 0000000002359cb0 2359cb0 18b244 00 A 0 0 4
[15] .eh_frame PROGBITS 00000000024e4ef8 24e4ef8 6ea8cc 00 A 0 0 8
32-bit
[Nr] Name Type Addr Off Size ES Flg Lk Inf Al
[14] .eh_frame_hdr PROGBITS 020886e4 20886e4 0000b4 00 A 0 0 4
[15] .eh_frame PROGBITS 02089798 2088798 00040c 00 WA 0 0 4
So (as expected) the 32 bit .eh_frame section is almost empty. Looks
like we'll need to build it with -fasynchronous-unwind-tables.
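The same check can be scripted. The sketch below parses rows of `readelf -S -W` output like those above and reports the section sizes in bytes. The names `section_sizes` and `cfi_looks_empty`, and the 64 KiB threshold, are assumptions made for illustration and are not part of any existing tool.

```python
import re

# Matches one section row of `readelf -S -W` output, e.g.
#   [15] .eh_frame PROGBITS 02089798 2088798 00040c 00 WA 0 0 4
# Captures the section name and the hex Size column.
SECTION_ROW = re.compile(
    r"\[\s*\d+\]\s+(\S+)\s+\S+\s+[0-9a-fA-F]+\s+[0-9a-fA-F]+\s+([0-9a-fA-F]+)"
)


def section_sizes(readelf_output):
    """Map section name -> size in bytes for each row that parses."""
    sizes = {}
    for line in readelf_output.splitlines():
        match = SECTION_ROW.search(line)
        if match:
            sizes[match.group(1)] = int(match.group(2), 16)
    return sizes


def cfi_looks_empty(readelf_output, threshold=0x10000):
    """Hypothetical heuristic: treat an .eh_frame under 64 KiB as near-empty."""
    return section_sizes(readelf_output).get(".eh_frame", 0) < threshold
```

Fed the two listings above, this gives 0x6ea8cc bytes of .eh_frame for the 64-bit build and 0x40c bytes for the 32-bit one.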
Comment 7•12 years ago
I was just about to tell you that you could slip that compiler flag into:
http://mxr.mozilla.org/mozilla-central/source/build/autoconf/frameptr.m4
when I suddenly remembered that we're already building our nightlies with --enable-profiling, which means they should have usable frame pointers which should be enough for unwinding. Are they not working?
I would assume that Linux system libraries would be built with frame pointers as well on x86.
Comment 8•12 years ago
So, the 32-bit story gets stranger. Here's what there is for libxul
for nightlies, which IIUC are compiled with gcc-4.5:
> 32-bit
> [Nr] Name Type Addr Off Size ES Flg Lk Inf Al
> [14] .eh_frame_hdr PROGBITS 020886e4 20886e4 0000b4 00 A 0 0 4
> [15] .eh_frame PROGBITS 02089798 2088798 00040c 00 WA 0 0 4
So .eh_frame is pretty much empty.
But for a build on 32-bit Ubuntu 12.04, using
browser/config/mozconfigs/linux32/nightly (is that the right one to
use?) then I get an .eh_frame size of a6a834, which is way more
convincing. And the stack traces from the profiler look plausible.
Ubuntu 12.04 uses gcc-4.6.3. According to
http://gcc.gnu.org/gcc-4.6/changes.html, 4.6 started to use
-fomit-frame-pointer (and CFI). Implication therefore is that the
nightlies don't unwind properly because they don't contain CFI, but
they do contain frame pointers, but breakpad isn't using them for some
reason.
Comment 9•12 years ago
Having dug around some more in this, I am of the view that we can't
properly assess what's going on until we have a way to disable stack
scanning. Problem is that if better unwind methods fail, then it
always falls back to stack scanning, which provides semi-random
semi-unreproducible stacks, which make it nearly impossible to
assess how well the non-scan methods have done.
Depends on: 855977
Comment 10•12 years ago
FTR, here's what is going on for the unwind data for gcc 4.5 vs 4.6 on
32-bit x86 Linux.
* In both cases, the code is built with frame pointers in. That is
the default for gcc 4.5, but only happens with 4.6, IIUC, because
the nightly configs specify -fno-omit-frame-pointer.
* For both gcc 4.5 and 4.6, the libxul.so created (in the objdir)
probably contains complete CFI. But the placement in sections is
different. Here's 4.6:
Name Type Addr Off Size ES Flg Lk Inf Al
.eh_frame_hdr PROGBITS 01d661e0 1d661e0 14648c 00 A 0 0 4
.eh_frame PROGBITS 01ead46c 1ead46c a6a834 00 WA 0 0 4
and 4.5:
.eh_frame_hdr PROGBITS 01e50694 1e50694 0000e4 00 A 0 0 4
.eh_frame PROGBITS 01e513c8 1e513c8 0004d0 00 WA 0 0 4
.debug_frame PROGBITS 00000000 22456a1c b88ba8 00 0 0 4
Hence 4.6 puts all the CFI in .eh_frame, whereas 4.5 puts almost all
of it in .debug_frame and only a tiny bit in .eh_frame.
* 'make package' nukes .debug_frame, leaving only .eh_frame_hdr and
.eh_frame. Hence the 4.6 build is left with full CFI and the 4.5
build is left with almost none:
4.6:
.eh_frame_hdr PROGBITS 01d661e0 1d661e0 14648c 00 A 0 0 4
.eh_frame PROGBITS 01ead46c 1ead46c a6a834 00 WA 0 0 4
4.5:
.eh_frame_hdr PROGBITS 01e50694 1e50694 0000e4 00 A 0 0 4
.eh_frame PROGBITS 01e513c8 1e513c8 0004d0 00 WA 0 0 4
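To put numbers on that, here is a small sketch using the section sizes from the readelf output above. It assumes packaging drops every `.debug_*` section and leaves the other sections untouched, which is a simplification of what 'make package' does. The function names are hypothetical.

```python
# Section sizes (hex, from the readelf listings) in the objdir libxul.so.
OBJDIR = {
    "gcc-4.6": {".eh_frame_hdr": 0x14648C, ".eh_frame": 0xA6A834},
    "gcc-4.5": {
        ".eh_frame_hdr": 0xE4,
        ".eh_frame": 0x4D0,
        ".debug_frame": 0xB88BA8,
    },
}


def after_packaging(sections):
    """Assumption: packaging removes .debug_* sections and nothing else."""
    return {
        name: size
        for name, size in sections.items()
        if not name.startswith(".debug_")
    }


def cfi_bytes(sections):
    """CFI payload: .eh_frame plus .debug_frame (the lookup header is excluded)."""
    return sections.get(".eh_frame", 0) + sections.get(".debug_frame", 0)


def retained_fraction(sections):
    """Fraction of the objdir CFI bytes that survive packaging."""
    return cfi_bytes(after_packaging(sections)) / cfi_bytes(sections)


if __name__ == "__main__":
    for compiler, sections in OBJDIR.items():
        print(compiler, f"{retained_fraction(sections):.4%}")
```

Under that assumption the 4.6 build keeps all of its CFI, while the 4.5 build keeps only the 0x4d0 bytes that were in .eh_frame, a tiny fraction of what the objdir binary had.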
Comment 11•12 years ago
Did you try -fasynchronous-unwind-tables on 4.5?
Comment 12•12 years ago
Yeah, -fasynchronous-unwind-tables at least gives plausible .eh_frame
with gcc-4.5:
sewardj@u1204x86:~/MOZ/TEST$ readelf -S -W firefox-GCC462/libxul.so | grep frame
[14] .eh_frame_hdr PROGBITS 01d661e0 1d661e0 14648c 00 A 0 0 4
[15] .eh_frame PROGBITS 01ead46c 1ead46c a6a834 00 WA 0 0 4
sewardj@u1204x86:~/MOZ/TEST$ readelf -S -W firefox-GCC453/libxul.so | grep frame
[14] .eh_frame_hdr PROGBITS 01e50694 1e50694 0000e4 00 A 0 0 4
[15] .eh_frame PROGBITS 01e513c8 1e513c8 0004d0 00 WA 0 0 4
sewardj@u1204x86:~/MOZ/TEST$ readelf -S -W firefox-GCC453-AUT/libxul.so | grep frame
[14] .eh_frame_hdr PROGBITS 01e50694 1e50694 17892c 00 A 0 0 4
[15] .eh_frame PROGBITS 01fca330 1fc9330 b76558 00 WA 0 0 4
Comment 13•12 years ago
At least to a first approximation, native unwind now works on 32 bit
nightlies, although it's hard to tell whether the stack traces are
bogus or not.
Reporter
Comment 14•12 years ago
I just tried Linux x64. We get about 80% unwind (this will vary a lot across machines, system libs and use cases). The quality was excellent, except that I couldn't unwind past: (1) my driver, (2) system libs like gtk2 and libc assembly functions, (3) JS JIT frames. Once we support CFI+frame pointers this should solve (3) at least.
Comment 15•12 years ago
(In reply to Benoit Girard (:BenWa) from comment #14)
> Once we support CFI+frame pointers this should solve (3) at least.
breakpad on Linux x64 appears to support only CFI or stack scanning,
but no frame pointers. Perhaps unsurprisingly as the ELF x86_64 ABI
doesn't use frame pointers. So, I don't think we can do much better
here. You could maybe try incrementally enabling stack-scanning
by setting MOZ_PROFILER_STACK_SCAN=1, =2, etc.
Comment 16•12 years ago
It wouldn't be hard to add frame pointer support to the x64 stackwalker, but I'm not sure how useful it'd be, since -fomit-frame-pointer is the ABI default.
Comment 17•8 years ago
Julian, does any of this apply to LUL? Can this bug be closed because it was about the breakpad unwinder which we're not using for the profiler anymore?
Flags: needinfo?(jseward)
Comment 18•8 years ago
Markus, this can be closed. This is unrelated to LUL.
Flags: needinfo?(jseward)
Comment 19•8 years ago
And Linux Nightlies support LUL unwinding by default. Closing this.
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → WORKSFORME