Closed Bug 647469 Opened 13 years ago Closed 13 years ago

Linux debug builds on 1.9.2 and 1.9.1 broken in fix-linux-stack.pl

Categories

(Testing :: General, defect)

x86
Linux
defect
Not set
critical

Tracking

(blocking1.9.2 .17+, status1.9.2 .17-fixed, blocking1.9.1 .19+, status1.9.1 .20-fixed)

RESOLVED FIXED
mozilla6

People

(Reporter: philor, Assigned: karlt)

Details

Attachments

(3 files)

Sometime between a green build on 20110329 (08:07 for 1.9.2, 10:21 for 1.9.1) and the next builds on 20110331 (13:00 on 1.9.2; 15:34 on 1.9.1, from a different push that didn't include the 1.9.2 push, so code changes don't explain it), the Linux32 debug builds started failing in the leak test step that runs

/bin/bash -c 'perl build/tools/rb/fix-linux-stack.pl sdleak.tree.raw > sdleak.tree'
...
program finished with exit code 141
I just ran it manually on a slave that failed recently and got the same error, with no other error message.

I'm attaching sdleak.tree.raw and sdleak.tree in case they're useful to somebody.

Exit code 141 could be SIGPIPE, if that's helpful
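For reference, shells report a process killed by signal N as exit status 128 + N, so 141 corresponds to signal 13, which is SIGPIPE on Linux. A minimal sketch of decoding such a status follows; the helper name `signal_for_status` is made up for illustration and is not part of the build tooling.

```shell
# Map a shell exit status to the signal that ended the process, if any.
# Statuses of 128 or below are treated as ordinary exit codes.
signal_for_status() {
  status=$1
  if [ "$status" -gt 128 ]; then
    # With an exit-status argument, kill -l prints the matching signal name.
    kill -l "$status"
  else
    echo "none"
  fi
}

signal_for_status 141   # the status seen in the failing build step
signal_for_status 0
```

A status above 128 can also be a plain exit code chosen by the program, so this is a hint rather than proof.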
Attached file sdleak logs
It sounds like this needs developer eyes?
Should we move this out of our component to get those, or cc people?
Easy enough to cc both everyone who has ever touched fix-linux-stack.pl, and everyone else who knows that it exists, but I'd think the important first step would be to answer the question "what changed between the morning of the 29th and the afternoon of the 31st, that could break a Perl script?"
The only pipe that fix-linux-stack.pl writes to is a /usr/bin/addr2line process.

Would be interesting to know the exit status of echo during the following command:
/bin/echo 0x6F96 | /usr/bin/addr2line -C -f -e /usr/lib/debug/lib/libresolv-2.*.so.debug

In bash that information is in the PIPESTATUS array (until the next command is executed).  I'm guessing the debuginfo filename here.
[cltbld@moz2-linux-slave47 ~]$ /bin/echo 0x6F96 | /usr/bin/addr2line -C -f -e /usr/lib/debug/lib/libresolv-2.*.so.debug
Segmentation fault
[cltbld@moz2-linux-slave47 ~]$ echo ${PIPESTATUS[*]}
0 139
Thanks.  So addr2line is crashing (with SIGSEGV).
(/bin/echo is not aborting on SIGPIPE, but fix-linux-stack.pl would get SIGPIPE if the crash happened before fix-linux-stack.pl performed the write.)
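The same `0 139` pattern can be reproduced without the real addr2line. This is only a sketch: the second pipeline stage is a stand-in that reads its input and then kills itself with SIGSEGV, and the inner script goes through `bash -c` because PIPESTATUS is a bash feature.

```shell
# Stand-in for the crashing reader: consume one line, then die from SIGSEGV.
# This is NOT addr2line, just a hypothetical substitute for illustration.
statuses=$(bash -c '
  ulimit -c 0   # do not leave a core file behind
  echo 0x6F96 | sh -c "read -r addr; kill -s SEGV \$\$"
  echo "${PIPESTATUS[*]}"
' 2>/dev/null)

# One status per pipeline stage, writer first, reader second.
echo "$statuses"
```

Because the stand-in reads before it dies, the writer always finishes its write first; that matches the case above where /bin/echo exited with 0.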

Possible solutions are:

1) Find an addr2line that doesn't crash.

2) Set "$SIG{"PIPE"} = 'IGNORE';" in fix-linux-stack.pl (before the while loop).
   That's probably good enough as this seems to be a rare crash, though it
   would be worth checking that perl handles an empty <$in> ok:
             chomp(my $symbol = <$in>);
             chomp(my $fileandline = <$in>);
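Option 2 can be illustrated in shell terms, as an analogue rather than the Perl change itself. With SIGPIPE ignored, a write to a pipe whose reader has already exited fails with an error the writer can check, instead of killing the writer. The reader here is just `true`, a stand-in that exits without reading anything.

```shell
# Writer survives a broken pipe once SIGPIPE is ignored.
log=$(bash -c '
  trap "" PIPE   # ignore SIGPIPE; the setting is inherited by the writer below
  {
    sleep 1      # give the reader time to exit first
    echo 0x6F96 2>/dev/null || echo "write failed, writer still running" >&2
  } | true
' 2>&1)

echo "$log"
```

The trade-off is that the writer now has to notice the failed write (and the empty reads that follow) itself, which is the point of the question about `<$in>` above.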
It's only a rare crash insofar as builds on 1.9.2 and 1.9.1 are rare; within that small set, it's 100%.
Yes, strange that 1.9.1 and 1.9.2 would start to hit this at the same time, which suggests either something changed in the OS or that the change in date/time triggered the issue.

What I meant by rare though was that 20k invocations of addr2line have succeeded during the fix-linux-stack.pl run, so it seems that particular input to addr2line is necessary to produce the crash.
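If a particular input is what triggers the crash, one way to narrow it down is to feed candidate addresses to the tool one at a time and report those that end in a signal. A rough sketch, assuming the tool reads addresses on stdin; `find_crashing_inputs` is a hypothetical helper, and it starts a fresh process per address, so it will be slow for 20k inputs.

```shell
# find_crashing_inputs CMD [ARGS...]
# Reads one input per line on stdin, feeds each line to its own run of CMD,
# and prints "<input> <status>" whenever the status is above 128, which
# usually means CMD was killed by a signal.
find_crashing_inputs() {
  while IFS= read -r input; do
    # CMD reads from the pipe, so it cannot consume the loop's own stdin.
    printf '%s\n' "$input" | "$@" >/dev/null 2>&1
    status=$?
    if [ "$status" -gt 128 ]; then
      printf '%s %s\n' "$input" "$status"
    fi
  done
}
```

It could be invoked along the lines of `find_crashing_inputs /usr/bin/addr2line -C -f -e <debug file> < addresses.txt`, where both the debug file and `addresses.txt` are placeholders.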
blocking1.9.1: --- → ?
blocking1.9.2: --- → ?
blocking1.9.1: ? → .19+
blocking1.9.2: ? → .17+
This needs to be out of the release engineering component and in a developer bug queue to get the attention it needs if it's a blocker.
Component: Release Engineering → General
Product: mozilla.org → Firefox
QA Contact: release → general
Version: other → unspecified
Can anyone from release engineering confirm that there were no changes to the test environment between 20110329 and 20110331, or point to a list of any changes that did happen?
Product: Firefox → Testing
QA Contact: general → general
Version: unspecified → 1.9.2 Branch
Assignee: nobody → karlt
Attachment #525756 - Flags: review?
Attachment #525756 - Flags: review? → review?(dbaron)
http://hg.mozilla.org/releases/mozilla-1.9.2/rev/fea6d8a4c4e1

Doesn't seem to apply cleanly to 1.9.1...
Attached patch 1.9.1 patch
http://hg.mozilla.org/releases/mozilla-1.9.1/rev/6035826050f4

Probably not much point putting this on 1.9.1.19 unless there are further changes to land there.

I'll leave the bug open to land for prophylaxis on m-c.
Version: 1.9.2 Branch → unspecified
http://hg.mozilla.org/mozilla-central/rev/d77a997768a9
Status: NEW → RESOLVED
Closed: 13 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla6