rewrite fix-linux-stack.pl in python

RESOLVED FIXED

Status

Testing
General
RESOLVED FIXED
4 years ago
3 years ago

People

(Reporter: Jeff Hammel, Assigned: dbaron)

Tracking

Firefox Tracking Flags

(Not tracked)

Details

(Reporter)

Description

4 years ago
See https://bugzilla.mozilla.org/show_bug.cgi?id=746243#c52

In order to consolidate our infrastructure around a common language,
python, it'd be nice to rewrite fix-linux-stack.pl in python.
(Assignee)

Updated

3 years ago
Blocks: 948718
I posted a hacky version of this in bug 948718 if anybody wants to look into it.  Not that my script is necessarily where you want to start.
(Assignee)

Comment 2

3 years ago
I'm going to try improving on that, since the external debuginfo bits need updating to support .note.gnu.build-id as described in https://sourceware.org/gdb/current/onlinedocs/gdb/Separate-Debug-Files.html and https://sourceware.org/binutils/docs/ld/Options.html#Options .
Not sure how much effort it's worth, but it would be nice if we could share some logic b/w the osx and linux versions.
(Assignee)

Comment 4

3 years ago
So I started from mccr8's version and added the feature that I needed (the new way of doing external debuginfo, at so that I can use dbgsym packages on Ubuntu 14.04 with it), and just landed what I have so far:

remote:   https://hg.mozilla.org/integration/mozilla-inbound/rev/cfb6de8ff92c
remote:   https://hg.mozilla.org/integration/mozilla-inbound/rev/9ef4e1a017e9
remote:   https://hg.mozilla.org/integration/mozilla-inbound/rev/0f99aee39def
remote:   https://hg.mozilla.org/integration/mozilla-inbound/rev/1acb2c5d135c
remote:   https://hg.mozilla.org/integration/mozilla-inbound/rev/89a0fa83c6d7
remote:   https://hg.mozilla.org/integration/mozilla-inbound/rev/e8b07667f3cc

I agree there could probably be some better sharing of the main loop and the buffering, although the rest of the code differs between platforms.  (I also realize, in hindsight, that I hooked up the address adjustment stuff differently from Mac, since mccr8 had removed that bit from his Linux version.)


There's definitely some cleanup, review, and cross-system testing to add here, but now there's something in the tree (and possibly something that works better for developers on current Linux distros than the perl version, too).
Keywords: leave-open
(Assignee)

Comment 5

3 years ago
One bug of note: the python version is not as good:

__restore_rt (??:?)

as the perl version was:

__restore_rt (sigaction.c:?)

at dealing with __retore_rt in signal handlers.  Possibly because it insists on file *and* line rather than taking what it can get.
(Assignee)

Updated

3 years ago
Assignee: nobody → dbaron
I compared the output from a DMD run when put through the Perl and Python versions. Two kinds of differences dominate:

> Perl:    __clone (/build/buildd/eglibc-2.19/misc/../sysdeps/unix/sysv/linux/x86_64/clone.S:113) 0xd9e2138d
> Python:  clone (/build/buildd/eglibc-2.19/misc/../sysdeps/unix/sysv/linux/x86_64/clone.S:113) 0xd9e2138d

> Perl:    ??? (/usr/lib/x86_64-linux-gnu/gstreamer-0.10/libgstaudioparsers.so) 0x90f092a4
> Python:  ?? (??:0) 0x90f092a4

The latter difference is more significant, though still not a big problem.

The time to run was similar; 1m18s for Perl, 1m22 for Python, though I was compiling at the same time in another window so the results may be imprecise.
(Assignee)

Comment 7

3 years ago
(In reply to Nicholas Nethercote [:njn] from comment #6)
> I compared the output from a DMD run when put through the Perl and Python
> versions. Two kinds of differences dominate:
> 
> > Perl:    __clone (/build/buildd/eglibc-2.19/misc/../sysdeps/unix/sysv/linux/x86_64/clone.S:113) 0xd9e2138d
> > Python:  clone (/build/buildd/eglibc-2.19/misc/../sysdeps/unix/sysv/linux/x86_64/clone.S:113) 0xd9e2138d

I'm not sure what would cause this difference.  Although I remember there being something in the Mac OS X version to trim extra __s, I don't see anything in the Linux version.  Is it possible you have two separate external debuginfo sources for glibc, and the patch switched between the two?

> > Perl:    ??? (/usr/lib/x86_64-linux-gnu/gstreamer-0.10/libgstaudioparsers.so) 0x90f092a4
> > Python:  ?? (??:0) 0x90f092a4
> 
> The latter difference is more significant, though still not a big problem.

What does the input line for this look like?
Flags: needinfo?(n.nethercote)
(Assignee)

Comment 8

3 years ago
Actually, never mind.  The perl version had code to explicitly ignore ??:0 from addr2line; we need that in the python version.
Flags: needinfo?(n.nethercote)
(Assignee)

Comment 9

3 years ago
So, I think the big reason for differences is a bug in the perl version that led its first two checks for an external debug file to fail all the time because it was missing a "/" between the directory name and the filename.  This matters because the debuglink filename might be the *same* as the original filename.  So if there's a debuglink whose name is the same as the file, the python version is preferring the file itself over external debuginfo in the last two locations to check.  That doesn't seem right, though it is following the documentation at https://sourceware.org/gdb/current/onlinedocs/gdb/Separate-Debug-Files.html .
(Assignee)

Comment 10

3 years ago
(And I think the best way to fix that is to implement the CRC checking.)
(Assignee)

Comment 11

3 years ago
Three more changes landed:

remote:   https://hg.mozilla.org/integration/mozilla-inbound/rev/00907b88a3c1
remote:   https://hg.mozilla.org/integration/mozilla-inbound/rev/ed2a7702b5e1
remote:   https://hg.mozilla.org/integration/mozilla-inbound/rev/0dc5563f5874
(Assignee)

Comment 12

3 years ago
Actually 4, since I left some debugging in:
https://hg.mozilla.org/integration/mozilla-inbound/rev/336b8cc31768
With the latest changes, the Python version is now better than the Perl version, because of improvements like these (from the diff)

> -    g_slice_alloc (??:?) 0xd1677707
> -    gst_registry_get_feature_list_cookie (??:?) 0x9b788148
> -    gst_registry_binary_read_cache (??:?) 0x9b7ab278
> -    gst_update_registry (??:?) 0x9b7847ed
> +    g_slice_alloc (/lib/x86_64-linux-gnu/libglib-2.0.so.0) 0xd1677707
> +    gst_registry_get_feature_list_cookie (/usr/lib/x86_64-linux-gnu/libgstreamer-0.10.so.0) 0x9b788148
> +    gst_registry_binary_read_cache (/usr/lib/x86_64-linux-gnu/libgstreamer-0.10.so.0) 0x9b7ab278
> +    gst_update_registry (/usr/lib/x86_64-linux-gnu/libgstreamer-0.10.so.0) 0x9b7847ed
https://hg.mozilla.org/mozilla-central/rev/cfb6de8ff92c
https://hg.mozilla.org/mozilla-central/rev/9ef4e1a017e9
https://hg.mozilla.org/mozilla-central/rev/0f99aee39def
https://hg.mozilla.org/mozilla-central/rev/1acb2c5d135c
https://hg.mozilla.org/mozilla-central/rev/89a0fa83c6d7
https://hg.mozilla.org/mozilla-central/rev/e8b07667f3cc
(Assignee)

Comment 15

3 years ago
(In reply to Nicholas Nethercote [:njn] from comment #13)
> With the latest changes, the Python version is now better than the Perl
> version, because of improvements like these (from the diff)

Yeah, at some point addr2line changed to output "??:?" instead of "??:0", so I made the python version handle either in https://hg.mozilla.org/integration/mozilla-inbound/rev/ed2a7702b5e1
https://hg.mozilla.org/mozilla-central/rev/00907b88a3c1
https://hg.mozilla.org/mozilla-central/rev/ed2a7702b5e1
https://hg.mozilla.org/mozilla-central/rev/0dc5563f5874
https://hg.mozilla.org/mozilla-central/rev/336b8cc31768
(Assignee)

Comment 17

3 years ago
Anybody have thoughts on whether I should mark this bug fixed?

The new version should perhaps be code-reviewed, although I don't think the old one ever was, and NPOTB code doesn't strictly require it.

There's also the question of switching automation to use the new one, which could perhaps be done in bug 948718 although could perhaps happen sooner by just invoking the script.
Keywords: leave-open
> Anybody have thoughts on whether I should mark this bug fixed?

Sounds ok to me. We still need to remove fix-linux-stack.pl, but this bug has had enough patches land that it's probably best to do that in a follow-up bug.
Blocks: 1050601
I filed bug 1050601 for removing fix-linux-stack.pl.
Status: NEW → RESOLVED
Last Resolved: 3 years ago
Resolution: --- → FIXED
(In reply to David Baron [:dbaron] (UTC-7) (needinfo? for questions) from comment #17)
> The new version should perhaps be code-reviewed, although I don't think the
> old one ever was, and NPOTB code doesn't strictly require it.

I can take a look (or just fix things up and get review for them).

> 
> There's also the question of switching automation to use the new one, which
> could perhaps be done in bug 948718 although could perhaps happen sooner by
> just invoking the script.

I'll do this as well, I'm in the process of reviving the patches to port runreftest.py to mozbase, so I'll need to use this.
(Assignee)

Comment 21

3 years ago
https://hg.mozilla.org/integration/mozilla-inbound/rev/c621d0e5e5e2
https://hg.mozilla.org/mozilla-central/rev/c621d0e5e5e2
You need to log in before you can comment on or make changes to this bug.