See https://bugzilla.mozilla.org/show_bug.cgi?id=746243#c52 In order to consolidate our infrastructure around a common language, python, it'd be nice to rewrite fix-linux-stack.pl in python.
I posted a hacky version of this in bug 948718 if anybody wants to look into it. Not that my script is necessarily where you want to start.
I'm going to try improving on that, since the external debuginfo bits need updating to support .note.gnu.build-id as described in https://sourceware.org/gdb/current/onlinedocs/gdb/Separate-Debug-Files.html and https://sourceware.org/binutils/docs/ld/Options.html#Options .
Not sure how much effort it's worth, but it would be nice if we could share some logic b/w the osx and linux versions.
So I started from mccr8's version and added the feature that I needed (the new way of doing external debuginfo, at so that I can use dbgsym packages on Ubuntu 14.04 with it), and just landed what I have so far: remote: https://hg.mozilla.org/integration/mozilla-inbound/rev/cfb6de8ff92c remote: https://hg.mozilla.org/integration/mozilla-inbound/rev/9ef4e1a017e9 remote: https://hg.mozilla.org/integration/mozilla-inbound/rev/0f99aee39def remote: https://hg.mozilla.org/integration/mozilla-inbound/rev/1acb2c5d135c remote: https://hg.mozilla.org/integration/mozilla-inbound/rev/89a0fa83c6d7 remote: https://hg.mozilla.org/integration/mozilla-inbound/rev/e8b07667f3cc I agree there could probably be some better sharing of the main loop and the buffering, although the rest of the code differs between platforms. (I also realize, in hindsight, that I hooked up the address adjustment stuff differently from Mac, since mccr8 had removed that bit from his Linux version.) There's definitely some cleanup, review, and cross-system testing to add here, but now there's something in the tree (and possibly something that works better for developers on current Linux distros than the perl version, too).
One bug of note: the python version is not as good: __restore_rt (??:?) as the perl version was: __restore_rt (sigaction.c:?) at dealing with __retore_rt in signal handlers. Possibly because it insists on file *and* line rather than taking what it can get.
I compared the output from a DMD run when put through the Perl and Python versions. Two kinds of differences dominate: > Perl: __clone (/build/buildd/eglibc-2.19/misc/../sysdeps/unix/sysv/linux/x86_64/clone.S:113) 0xd9e2138d > Python: clone (/build/buildd/eglibc-2.19/misc/../sysdeps/unix/sysv/linux/x86_64/clone.S:113) 0xd9e2138d > Perl: ??? (/usr/lib/x86_64-linux-gnu/gstreamer-0.10/libgstaudioparsers.so) 0x90f092a4 > Python: ?? (??:0) 0x90f092a4 The latter difference is more significant, though still not a big problem. The time to run was similar; 1m18s for Perl, 1m22 for Python, though I was compiling at the same time in another window so the results may be imprecise.
(In reply to Nicholas Nethercote [:njn] from comment #6) > I compared the output from a DMD run when put through the Perl and Python > versions. Two kinds of differences dominate: > > > Perl: __clone (/build/buildd/eglibc-2.19/misc/../sysdeps/unix/sysv/linux/x86_64/clone.S:113) 0xd9e2138d > > Python: clone (/build/buildd/eglibc-2.19/misc/../sysdeps/unix/sysv/linux/x86_64/clone.S:113) 0xd9e2138d I'm not sure what would cause this difference. Although I remember there being something in the Mac OS X version to trim extra __s, I don't see anything in the Linux version. Is it possible you have two separate external debuginfo sources for glibc, and the patch switched between the two? > > Perl: ??? (/usr/lib/x86_64-linux-gnu/gstreamer-0.10/libgstaudioparsers.so) 0x90f092a4 > > Python: ?? (??:0) 0x90f092a4 > > The latter difference is more significant, though still not a big problem. What does the input line for this look like?
Actually, never mind. The perl version had code to explicitly ignore ??:0 from addr2line; we need that in the python version.
So, I think the big reason for differences is a bug in the perl version that led its first two checks for an external debug file to fail all the time because it was missing a "/" between the directory name and the filename. This matters because the debuglink filename might be the *same* as the original filename. So if there's a debuglink whose name is the same as the file, the python version is preferring the file itself over external debuginfo in the last two locations to check. That doesn't seem right, though it is following the documentation at https://sourceware.org/gdb/current/onlinedocs/gdb/Separate-Debug-Files.html .
(And I think the best way to fix that is to implement the CRC checking.)
Three more changes landed: remote: https://hg.mozilla.org/integration/mozilla-inbound/rev/00907b88a3c1 remote: https://hg.mozilla.org/integration/mozilla-inbound/rev/ed2a7702b5e1 remote: https://hg.mozilla.org/integration/mozilla-inbound/rev/0dc5563f5874
Actually 4, since I left some debugging in: https://hg.mozilla.org/integration/mozilla-inbound/rev/336b8cc31768
With the latest changes, the Python version is now better than the Perl version, because of improvements like these (from the diff) > - g_slice_alloc (??:?) 0xd1677707 > - gst_registry_get_feature_list_cookie (??:?) 0x9b788148 > - gst_registry_binary_read_cache (??:?) 0x9b7ab278 > - gst_update_registry (??:?) 0x9b7847ed > + g_slice_alloc (/lib/x86_64-linux-gnu/libglib-2.0.so.0) 0xd1677707 > + gst_registry_get_feature_list_cookie (/usr/lib/x86_64-linux-gnu/libgstreamer-0.10.so.0) 0x9b788148 > + gst_registry_binary_read_cache (/usr/lib/x86_64-linux-gnu/libgstreamer-0.10.so.0) 0x9b7ab278 > + gst_update_registry (/usr/lib/x86_64-linux-gnu/libgstreamer-0.10.so.0) 0x9b7847ed
https://hg.mozilla.org/mozilla-central/rev/cfb6de8ff92c https://hg.mozilla.org/mozilla-central/rev/9ef4e1a017e9 https://hg.mozilla.org/mozilla-central/rev/0f99aee39def https://hg.mozilla.org/mozilla-central/rev/1acb2c5d135c https://hg.mozilla.org/mozilla-central/rev/89a0fa83c6d7 https://hg.mozilla.org/mozilla-central/rev/e8b07667f3cc
(In reply to Nicholas Nethercote [:njn] from comment #13) > With the latest changes, the Python version is now better than the Perl > version, because of improvements like these (from the diff) Yeah, at some point addr2line changed to output "??:?" instead of "??:0", so I made the python version handle either in https://hg.mozilla.org/integration/mozilla-inbound/rev/ed2a7702b5e1
Anybody have thoughts on whether I should mark this bug fixed? The new version should perhaps be code-reviewed, although I don't think the old one ever was, and NPOTB code doesn't strictly require it. There's also the question of switching automation to use the new one, which could perhaps be done in bug 948718 although could perhaps happen sooner by just invoking the script.
> Anybody have thoughts on whether I should mark this bug fixed? Sounds ok to me. We still need to remove fix-linux-stack.pl, but this bug has had enough patches land that it's probably best to do that in a follow-up bug.
I filed bug 1050601 for removing fix-linux-stack.pl.
(In reply to David Baron [:dbaron] (UTC-7) (needinfo? for questions) from comment #17) > The new version should perhaps be code-reviewed, although I don't think the > old one ever was, and NPOTB code doesn't strictly require it. I can take a look (or just fix things up and get review for them). > > There's also the question of switching automation to use the new one, which > could perhaps be done in bug 948718 although could perhaps happen sooner by > just invoking the script. I'll do this as well, I'm in the process of reviving the patches to port runreftest.py to mozbase, so I'll need to use this.