Please run diagnostics on bld-lion-r5-045

RESOLVED FIXED

Status

7 years ago
4 years ago

People

(Reporter: coop, Assigned: vhua)

Details

(Whiteboard: scl3)

(Reporter)

Description

7 years ago
This slave has failed many integrity checks lately while working with archive files (specifically bz2). Note that this is happening again *after* it was re-imaged just last week.

Can we please run some diagnostics on it, and possibly send it for repair?
Assignee: server-ops-releng → server-ops
Component: Server Operations: RelEng → Server Operations: DCOps
QA Contact: arich → dmoore

Comment 1

7 years ago
chris,

do you need to back up any data on this host? it appears there is an issue with the hard drive and we would like to replace the drive and re-image the host. please let me know when we can proceed.


thanks,
van
(Reporter)

Comment 2

7 years ago
(In reply to Van Le from comment #1) 
> do you need to back up any data on this host? it appears there is an issue
> with the hard drive and we would like to replace the drive and re-image the
> host. please let me know when we can proceed.

No, I don't. The only thing I was worried about was the ssh keys on the disk, and I've already removed those.
(Assignee)

Comment 3

7 years ago
Chris,
Can I reboot the bld-lion-r5-045 to run the diagnostics right now?
(Assignee)

Updated

7 years ago
Assignee: server-ops → vhua
Status: NEW → ASSIGNED
(Reporter)

Comment 4

7 years ago
(In reply to Vinh Hua [:vinh] from comment #3)
> Can I reboot the bld-lion-r5-045 to run the diagnostics right now?

The machine is disabled in our build environment, so diagnose away!

Updated

7 years ago
Whiteboard: scl3
(Assignee)

Comment 5

7 years ago
Chris,
I've run the comprehensive diagnostics twice on this Mac Mini. The hardware shows up as "passed".
(Assignee)

Comment 6

7 years ago
Chris,
Please see comment #5.

Also, I've noticed from the error log that you had included in bug 758605:

bzip2: Data integrity error when decompressing.
	Input file = Contents/MacOS/XUL.bz2, output file = Contents/MacOS/XUL

It is possible that the compressed file(s) have become corrupted.
You can use the -tvv option to test integrity of such files.

You can use the `bzip2recover' program to attempt to recover
data from undamaged sections of corrupted files.

bzip2: Deleting output file Contents/MacOS/XUL, if it exists.
Couldn't decompress "Contents/MacOS/XUL" at /builds/slave/m-cen-osx64-l10n-ntly/build/mozilla-central/tools/update-packaging/unwrap_full_update.pl line 64.
program finished with exit code 2


Could it be that the file itself is corrupted?
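One way to answer that question on the slave itself is to test the downloaded archive directly, much as `bzip2 -tvv` does. The following is a minimal sketch in Python, not part of the build tooling; the function name `bz2_is_intact` and the chunk size are assumptions made for illustration.

```python
import bz2


def bz2_is_intact(path, chunk_size=1 << 20):
    """Return True if the whole .bz2 file at `path` decompresses cleanly.

    Hypothetical helper: it reads the archive in chunks and discards the
    output, so only the integrity of the stream is checked.
    """
    try:
        with bz2.open(path, "rb") as fh:
            while fh.read(chunk_size):
                pass
    except (OSError, EOFError):
        # OSError: invalid or corrupted stream (or unreadable file).
        # EOFError: the file ends before the end-of-stream marker.
        return False
    return True
```

Running the same check repeatedly on the same file can also be informative: if the result changes between runs, the file on disk is probably not the only problem.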
(Reporter)

Comment 7

7 years ago
The nature of l10n repacks is that many slaves download the same file and repack the language strings in it. If the archive file itself were corrupt, I would expect all repacks to fail.
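That reasoning can be checked by hashing the downloaded file on each slave and looking for hosts that disagree with the majority. The sketch below assumes the digests have already been collected somehow; `sha256_of`, `outlier_hosts`, and the host-to-digest mapping are hypothetical names for illustration, not existing RelEng tooling.

```python
import hashlib
from collections import Counter


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of the file at `path`."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def outlier_hosts(digests_by_host):
    """Return the hosts whose digest differs from the most common one.

    `digests_by_host` maps a host name to the digest of its local copy.
    An empty result means every host saw the same bytes, so a failure
    seen everywhere would point at the archive rather than at one host.
    """
    if not digests_by_host:
        return []
    majority, _ = Counter(digests_by_host.values()).most_common(1)[0]
    return sorted(h for h, d in digests_by_host.items() if d != majority)
```

If a single slave is the only outlier, and keeps being one across different downloads, that supports the hardware theory over a corrupt source file.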
(Assignee)

Comment 8

7 years ago
Chris,
Sorry for the delay on this. Is this still problematic? If so, I'll just send it to Apple and see what they diagnose. I will also need to power down "bld-lion-r5-046.build.releng.scl3" in order to disassemble the chassis.
(Reporter)

Comment 9

7 years ago
(In reply to Vinh Hua [:vinh] from comment #8)
> Sorry for the delay on this. Is this still problematic?  If so I'll just
> send it to Apple and see what they diagnose. Will also need to power down
> "bld-lion-r5-046.build.releng.scl3" in order to disassemble the chassis.

The slave has been out of service since this bug was filed pending diagnostic resolution.

I've now disabled bld-lion-r5-046 as well so that bld-lion-r5-045 can be unracked.

Updated

7 years ago
Blocks: 768530
(Assignee)

Comment 10

7 years ago
Mac Mini has been sent to Apple for repair.  Here's the bug ticket for tracking:

https://bugzilla.mozilla.org/show_bug.cgi?id=768122
Ben, it just came back from repair yesterday. It should be in SCL3 by Monday.
(Assignee)

Comment 13

7 years ago
Just got the Mac Mini back today from the desktop team. Will rack it back on the chassis in the next 2 hours.

FYI - The OS was probably erased by Apple during repair.
When you power it on, please hold down the N key so it re-installs the OS.
(Assignee)

Comment 15

7 years ago
Both "bld-lion-r5-045" and "bld-lion-r5-046" have been racked and powered on. It looks like the OS is still intact, but per Amy's instructions I will netboot "bld-lion-r5-045" to re-install the OS.

Here is the note that I got back from desktop team:

"Returned from Computer Care and placed on Jake's desk. Unit passed all diags and performed to factory specs."
(Assignee)

Comment 16

7 years ago
bld-lion-r5-045 has finished rebuilding itself.  Let me know if the original issue still occurs.
(Assignee)

Comment 17

6 years ago
Hi all,
Just wanted to check whether this bug can be closed.
Sent for repair, reimaged.  Thanks Vinh!
Status: ASSIGNED → RESOLVED
Last Resolved: 6 years ago
Resolution: --- → FIXED
Product: mozilla.org → Infrastructure & Operations