Decommission or repair talos-r4-snow-014

RESOLVED FIXED

Status

Infrastructure & Operations
DCOps
RESOLVED FIXED
5 years ago
3 years ago

People

(Reporter: coop, Unassigned)

Details

(Whiteboard: ETA - 1/11)

Description

5 years ago
We've recently decided to repair a few other r4 minis, so we should add this mini (talos-r4-snow-014) to the list if it is cost-effective to do so. If not, let's just decommission it.

Updated

5 years ago
colo-trip: --- → scl1
coop, can you quantify "cost-effective?"  If it's a bad logic board, it's ~$600 to repair.  Memory and disk are likely cheaper.

Comment 2

5 years ago
Hi,
What's the verdict on this host?

Comment 3

5 years ago
(In reply to Amy Rich [:arich] [:arr] from comment #1)
> coop, can you quantify "cost-effective?"  If it's a bad logic board, it's
> ~$600 to repair.  Memory and disk are likely cheaper.

Sorry, was on vacation.

tl;dr - yes, let's repair this particular slave, even if that means replacing the logic board.

We'll always pursue the cheaper repair (memory and disk) when that's the fix. We have already agreed to replace the logic board on a handful of machines (maybe 5?). I think once we have repaired (or are looking to repair) the logic board on 10% of the total pool, we should revisit this decision.

Slavealloc tells me we have 169 r4 minis split between 10.6 and 10.7, so that means my current threshold for changing the full repair criteria is once we've had to touch 17 machines.
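
For reference, the threshold arithmetic above can be sketched as follows. This is a minimal illustration, not an existing tool: the function names are hypothetical, the pool size (169) and the 10% figure come from this comment, and rounding up to a whole machine is an assumption that matches the 17 quoted here.

```python
import math

# Figures taken from the comment above.
R4_POOL_SIZE = 169        # r4 minis reported by slavealloc (10.6 + 10.7)
THRESHOLD_PERCENT = 10    # share of the pool that triggers a policy review


def repair_threshold(pool_size: int, percent: int) -> int:
    """Number of logic-board repairs at which the policy should be revisited.

    Rounds up to a whole machine (an assumption; 16.9 becomes 17).
    """
    return math.ceil(pool_size * percent / 100)


def should_revisit(boards_replaced: int, pool_size: int = R4_POOL_SIZE,
                   percent: int = THRESHOLD_PERCENT) -> bool:
    """True once the number of logic-board repairs reaches the threshold."""
    return boards_replaced >= repair_threshold(pool_size, percent)


if __name__ == "__main__":
    print(repair_threshold(R4_POOL_SIZE, THRESHOLD_PERCENT))  # 17
    print(should_revisit(5))   # False: "a handful" is still below the threshold
```

With roughly 5 boards already approved, another 12 or so could be replaced before the decision would need revisiting.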

Standard wishful thinking also applies: we may also stop supporting 10.6 at some point in the future, which would allow us to reallocate any remaining r4 machines to 10.7.
Currently running TechTool Pro diagnostics on this host. Will check on this tomorrow once it finishes all the tests.
The results from the diagnostics show that the host has memory issues.

Need to order new memory modules for this Mac mini.

Comment 6

5 years ago
Memory has been ordered. Will arrive in a few days.
Just checking in to see if this fix has been completed.

Comment 8

5 years ago
The new memory that was ordered does not work with these Mac minis. The minis only boot with 4 GB (one stick) installed. With 8 GB installed, the mini does not boot.

Ann is on PTO, so I'm waiting to hear back from her on whether this RAM has the exact same specs as last time.

Comment 9

5 years ago
Three 8.0GB PC8500 DDR3 1066MHz SO-DIMM Memory Upgrade Kits have been ordered.

Comment 10

5 years ago
Snow-014 is still rebooting after the RAM was replaced. I can't get a full diagnostic test to complete without the host rebooting itself.

Amy - Do you want to order a new logic board or just decommission?
Coop said yes, replace the logic board, in comment 3.
The logic board has been ordered. 

Confirmation # for the order: 101448645

ETA - 1/11
Whiteboard: ETA - 1/11
Motherboard is waiting in SFO

Comment 14

5 years ago
Peter,
Can you have the courier bring it down to MV office please?
It's on its way.

Comment 16

5 years ago
Talos-r4-snow-014 has been racked and inventory updated with the new MAC address. Will need releng's help adding the host to the correct imaging group.
That's relops, not releng, and I added it to the right group. You're good to netboot it. :D

Comment 18

5 years ago
Host has been reimaged and is back online.

talos-r4-snow-014.build.scl1.mozilla.com is alive
Status: NEW → RESOLVED
Last Resolved: 5 years ago
Resolution: --- → FIXED
Product: mozilla.org → Infrastructure & Operations