
crazyhorse (Mozilla1.8) went red for no apparent reason

RESOLVED FIXED

Status

Release Engineering
General
P1
blocker
RESOLVED FIXED
9 years ago
4 years ago

People

(Reporter: Samuel Sidler (old account; do not CC), Assigned: bhearsum)

Tracking

Firefox Tracking Flags

(Not tracked)

Details

Attachments

(2 attachments)

The build that started at ~8:05am today went red. We really need these machines green this weekend (whether the fix comes from IT or build) if we plan on releasing 2.0.0.15 on schedule.

The error I see is: Error: CVS checkout failed.

So, possibly part of bug 435134?

Updated

9 years ago
Assignee: server-ops → thardcastle

Comment 1

9 years ago
Filesystem corruption, waiting on the second run of fsck.
Caused by the ongoing NetApp woes; see bug 435134 for details.
Status: NEW → RESOLVED
Last Resolved: 9 years ago
Resolution: --- → DUPLICATE
Duplicate of bug: 435134
(Assignee)

Comment 3

9 years ago
Turns out this system got hit extra-hard by bug 435134. It's going to need some additional coaxing to get working again. Re-opening to track it separately.
Status: RESOLVED → REOPENED
Resolution: DUPLICATE → ---
(Assignee)

Updated

9 years ago
Assignee: thardcastle → bhearsum
Status: REOPENED → NEW
Component: Server Operations: Tinderbox Maintenance → Release Engineering: Maintenance
Priority: -- → P1
(Assignee)

Updated

9 years ago
Status: NEW → ASSIGNED
(Assignee)

Comment 4

9 years ago
For the record, the current issue is ld crashing with SIGSEGV while linking in nspr:
gcc -o plgetopt.o -c      -pipe -Wall -pthread -O2 -gstabs+ -fPIC  -UDEBUG  -DMOZILLA_CLIENT=1 -DNDEBUG=1 -DHAVE_VISIBILITY_HIDDEN_ATTRIBUTE=1 -DXP_UNIX=1 -D_GNU_SOURCE=1 -DHAVE_FCNTL_FILE_LOCKING=1 -DLINUX=1 -Di386=1 -DHAVE_LCHOWN=1 -DHAVE_STRERROR=1 -D_REENTRANT=1  -DFORCE_PR_LOG -D_PR_PTHREADS -UHAVE_CVAR_BUILT_ON_SEM -I/builds/tinderbox/Tb-Mozilla1.8/Linux_2.4.18-14_Depend/mozilla/dist/include/nspr  plgetopt.c
rm -f libplc4.a
/usr/bin/ar cr libplc4.a ./plvrsion.o ./strlen.o ./strcpy.o ./strdup.o ./strcat.o ./strcmp.o ./strccmp.o ./strchr.o ./strpbrk.o ./strstr.o ./strcstr.o ./strtok.o ./base64.o ./plerror.o ./plgetopt.o  
ranlib libplc4.a
rm -f libplc4.so
gcc -shared -Wl,-soname -Wl,libplc4.so -o libplc4.so ./plvrsion.o ./strlen.o ./strcpy.o ./strdup.o ./strcat.o ./strcmp.o ./strccmp.o ./strchr.o ./strpbrk.o ./strstr.o ./strcstr.o ./strtok.o ./base64.o ./plerror.o ./plgetopt.o   -L/builds/tinderbox/Tb-Mozilla1.8/Linux_2.4.18-14_Depend/mozilla/dist/lib -lnspr4
collect2: ld terminated with signal 11 [Segmentation fault]

Clobbering doesn't fix it.
(Assignee)

Comment 5

9 years ago
Created attachment 324480 [details]
output from rpm -qa | xargs -n1 rpm -V | grep -v ' c ' | grep -v '.pyc$'
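The pipeline in the attachment title verifies every installed package and then hides lines for config files and compiled Python files, which are expected to differ from what the package shipped. A minimal sketch of just the filtering stage, wrapped in a function so it can be reused; the name `filter_rpm_verify` is hypothetical, and the dot in `.pyc` is escaped here, which the original command did not do:

```shell
#!/bin/sh
# Filter `rpm -V`-style lines read from stdin.
# Drops lines flagged as config files (" c ") and paths ending in ".pyc".
# filter_rpm_verify is an illustrative name, not part of rpm.
filter_rpm_verify() {
    grep -v ' c ' | grep -v '\.pyc$'
}

# Intended use on an RPM-based host (not executed here):
#   rpm -qa | xargs -n1 rpm -V | filter_rpm_verify
```

Whatever survives the filter is a packaged file whose size, checksum, or timestamp no longer matches the RPM database, which is the kind of evidence that matters after filesystem corruption.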
(Assignee)

Comment 6

9 years ago
I'm still investigating this, but do we have a backup of this VM anywhere?
(Assignee)

Comment 7

9 years ago
I've tried:
* clobbering
* manually compiling on a different partition (fails in a different place, same ld crash)
* rebooting
* doing a 'e2fsck -f' on all of the drives
* re-installing binutils (using the exact same version, of course)

Heading out for the day, will try more tomorrow.
(Assignee)

Comment 8

9 years ago
Created attachment 324618 [details]
strace of ld crash (in a different place, this time)
(Assignee)

Comment 9

9 years ago
The strace crash log showed warnings about -j3. I turned -j3 off and it made no difference.
(Assignee)

Comment 10

9 years ago
Mrz/Justin, do we have a backup of this VM?

I also noticed a second copy, from June 9, on d-fcal-build-002. Any chance we can try booting this up?
(Assignee)

Comment 11

9 years ago
Tried the image on d-fcal-build-002 -- same problems.
(Assignee)

Updated

9 years ago
Blocks: 435134

Comment 12

9 years ago
There are no backups; none was ever requested. What's the action here?
What's the issue? Is it with / or /builds?

I can create a new / and you can try to copy tools off the old drive to the new.
This box is using ccache for some reason, unlike every other Linux box still in operation. Renaming the cache in /builds/ccache to cache.borked got us past the error creating libplc4.so in nspr. The build continues ...
Build ran clean, and machine is now reporting "green" on tinderbox. Looks like corrupted ccache was the last problem with crazyhorse. Now very happy to close this bug. :-) 
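The fix described above amounts to moving the suspect cache directory aside so the next build starts from an empty cache while the old contents stay available for inspection. A minimal sketch of that step; the function name `quarantine_cache` is hypothetical, and the exact directory name under /builds/ccache is an assumption, since the comment only says "the cache":

```shell
#!/bin/sh
# Move a suspect cache directory aside by appending ".borked" to its name.
# quarantine_cache is an illustrative helper, not a ccache command.
quarantine_cache() {
    # Strip one trailing slash so the new name sits next to the old one.
    cache_dir=${1%/}
    if [ ! -d "$cache_dir" ]; then
        echo "no such directory: $cache_dir" >&2
        return 1
    fi
    # Refuse to overwrite an earlier quarantined copy.
    if [ -e "$cache_dir.borked" ]; then
        echo "already exists: $cache_dir.borked" >&2
        return 1
    fi
    mv "$cache_dir" "$cache_dir.borked"
}

# Assumed invocation on this machine (path is a guess, not executed here):
#   quarantine_cache /builds/ccache/cache
```

To confirm a cache is at fault before renaming anything, ccache also honors the CCACHE_DISABLE environment variable, which makes it call the real compiler directly for that build.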

(While investigating, Justin also fixed some VM config silliness. We agree it would not have caused the problems seen above, but it was good to fix once discovered.)
Status: ASSIGNED → RESOLVED
Last Resolved: 9 years ago
Resolution: --- → FIXED

Updated

9 years ago
Component: Release Engineering: Maintenance → Release Engineering
QA Contact: justin → release
Product: mozilla.org → Release Engineering