Closed
Bug 652966
Opened 13 years ago
Closed 13 years ago
decommission bm-xserve21 (remove from releng configs & slavealloc)
Categories
(Release Engineering :: General, defect, P2)
Tracking
(Not tracked)
RESOLVED
DUPLICATE
of bug 688548
People
(Reporter: bear, Assigned: armenzg)
References
()
Details
(Whiteboard: [badslave?][hardware] DNR)
In #developers, the sheriff pinged me to look at
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1303846981.1303851389.5655.gz&fulltext=1
which has this output:
Processing file: ./dist/bin/WriteArgument.dSYM
Processing file: ./dist/bin/xpcshell.dSYM
Processing file: ./dist/bin/xpidl.dSYM
error: Invalid argument - unable to create './dist/bin/XUL.dSYM' bundle directory.
error: Invalid argument - unable to create './dist/bin/components/libalerts_s.dylib.dSYM' bundle directory.
error: Invalid argument - unable to create './dist/bin/crashreporter.app/Contents/MacOS/crashreporter.dSYM' bundle directory.
After talking in IRC, where dustin said that this slave sometimes has issues, and not being able to ssh to it, I marked it as disabled and filed this bug.
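For triage, the failing bundle paths can be pulled out of a log like the one quoted above. This is a minimal sketch, assuming the error lines always follow the exact wording shown in the excerpt; the function name `failed_bundles` and the `sample` list are hypothetical and not part of any releng tooling.

```python
import re

# Matches the "unable to create ... bundle directory" error lines
# quoted in the log excerpt (assumed to be the only wording in use).
ERROR_RE = re.compile(
    r"^error: Invalid argument - unable to create '(?P<path>[^']+)' bundle directory\.$"
)


def failed_bundles(log_lines):
    """Return the bundle paths named in 'unable to create' error lines."""
    paths = []
    for line in log_lines:
        match = ERROR_RE.match(line.strip())
        if match:
            paths.append(match.group("path"))
    return paths


# Hypothetical sample built from two lines of the excerpt.
sample = [
    "Processing file: ./dist/bin/xpidl.dSYM",
    "error: Invalid argument - unable to create './dist/bin/XUL.dSYM' bundle directory.",
]
print(failed_bundles(sample))
```

Lines that only report progress ("Processing file: ...") are ignored, so the result lists just the bundles that could not be created.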
Reporter
Updated•13 years ago
QA Contact: zandr → dustin
Comment 1•13 years ago
This was toyed with in bug 644364, and seemed to be running fine, but wasn't. So it at least needs a reimage, and any diagnostics we have for Macs would be great too.
Updated•13 years ago
Assignee: server-ops-releng → zandr
Updated•13 years ago
colo-trip: --- → sjc1
Updated•13 years ago
Assignee: zandr → mlarrain
Comment 2•13 years ago
Dustin and I went onsite yesterday. Here are the notes from our findings:
There are three failed temp sensors with ridiculously hot values (one
was at 184C). They are the three fans on the fanboard that are aimed at
the DIMMs. I didn't remove the fanboard, but I did feel around the
location of these sensors and the temperature is certainly not above
boiling, so these are bad sensors.
This wouldn't cause failures, so I went on to run the HD diagnostics.
They weren't finished by the time I wandered off, but had already
detected three errors. The crash-cart is still hooked up, so there
should be more to see when you're back.
IMHO, assuming the disk failures are real failures, we should report
this to releng and ask for their prescription - DNR or repair (where the
latter will likely be expensive).
I put the machine back in position, but the network is not re-connected.
I will check the rest of the notes later today to verify HDD issues.
Comment 3•13 years ago
So Matt didn't get to look at the test results here, but assuming that they do show HDD failures as well as temp sensor failures, what's the plan?
Assignee: mlarrain → nobody
Component: Server Operations: RelEng → Release Engineering
QA Contact: dustin → release
Comment 4•13 years ago
Let's pull this one and retire it. Can be used for parts if we need to repair others. We don't use these for releases anyway.
Assignee: nobody → server-ops-releng
Component: Release Engineering → Server Operations: RelEng
Priority: -- → P3
QA Contact: release → zandr
Whiteboard: [badslave?][hardware]
Updated•13 years ago
Summary: bm-xserve21 showing signs of hardware/drive issues → decommission bm-xserve21
Comment 5•13 years ago
Added to the decommissioning spreadsheet, removed from nagios.
Assignee: server-ops-releng → mlarrain
Updated•13 years ago
Severity: normal → minor
Updated•13 years ago
Status: NEW → RESOLVED
Closed: 13 years ago
Resolution: --- → DUPLICATE
Assignee
Comment 7•13 years ago
Will be making sure it doesn't show up on slavealloc.
Assignee: mlarrain → armenzg
Component: Server Operations: RelEng → Release Engineering
QA Contact: zandr → release
Summary: decommission bm-xserve21 → decommission bm-xserve21 (remove from releng configs & slavealloc)
Whiteboard: [badslave?][hardware] → [badslave?][hardware] DNR
Assignee
Updated•13 years ago
Priority: P3 → P2
Assignee
Comment 8•13 years ago
(In reply to Armen Zambrano G. [:armenzg] - Release Engineer from comment #7)
> Will be making sure it doesn't show up on slavealloc.
I will take care of it in bug 700705.
Updated•11 years ago
Product: mozilla.org → Release Engineering