Replace RAM in vm1-2.phy.labs.scl3.mozilla.com

Status

Product: Infrastructure & Operations
Component: DCOps
Status: RESOLVED FIXED
Opened: 5 years ago
Updated: 3 years ago

People

Reporter: gozer
Assignee: Unassigned

Description

5 years ago
The machine panicked and, on reboot, the BIOS reported:

Un-Correctable DRAM ECC Error Detected at CPU02/DIMM1B

That RAM stick needs replacing.

Updated

5 years ago
colo-trip: --- → scl3

Comment 1

5 years ago
RMA case opened with iX Systems.

Case ID: SZB-907513

Comment 2

5 years ago
Update from iX Systems:

Unfortunately, we are deadlocked with the manufacturer: the only memory they can provide has not yet been qualified for the affected board. We are now trying Supermicro directly, however, and will have a solution by tomorrow.

If it turns out to be the quickest route, we may have to exchange all of the DIMMs in the system, and we hope this will not be an inconvenience. If there is anything else we can help address before our next update, please do not hesitate to ask.

Comment 3

5 years ago
:gozer
The replacement RAM came in. Let us know when we can power down the host to make the swap.

Comment 4

5 years ago
iX Systems delivered the wrong DIMMs. We are waiting for them to ship us the ECC DIMMs.
Good catch, Vinh.

Comment 6

5 years ago
:gozer
The correct RAM came in. Let me know when you are available to take the host offline.
(Reporter)

Comment 7

5 years ago
Today is not very convenient. How about sometime tomorrow, Wednesday? As before, I need enough time to migrate some VMs off the host before we can perform the maintenance.

It should take about as long as it did last time, approximately 30 minutes.

Just let me know when you'll be in SCL3 and I'll adapt.
OS: Mac OS X → All
Hardware: x86 → All

Comment 8

5 years ago
:gozer

One of us will be at SCL3 tomorrow. You can ping us in #dcops when ready.

Comment 9

5 years ago
Worked with :gozer to replace the bad DIMM.
Status: NEW → RESOLVED
Last Resolved: 5 years ago
Resolution: --- → FIXED
Product: mozilla.org → Infrastructure & Operations