sync1.db.scl2.svc: RAM, SATA issues

RESOLVED FIXED

Status

Product: Cloud Services
Component: Operations
Status: RESOLVED FIXED
Opened: 6 years ago
Last modified: 2 years ago

People

(Reporter: atoll, Assigned: jlaz)

Tracking

Firefox Tracking Flags

(Not tracked)

Details

(Whiteboard: [RAM swapped, check back in a week])

(Reporter)

Description

6 years ago
Either all four RAM sticks are bad, or this server needs a chassis swap.  I'm restarting MySQL to make sure it can still start up at all, but we should fix this ASAP before it destroys itself unexpectedly.

# dmesg | grep --only-matching 'DRAM-Bank=[0-9]' | sort | uniq -c
    133 DRAM-Bank=0
     96 DRAM-Bank=1
    123 DRAM-Bank=2
    147 DRAM-Bank=3

# dmesg | grep ^ata | grep 'failed' | sort | uniq -c
     36 ata3.00: failed command: READ DMA EXT
      1 ata4.00: failed command: FLUSH CACHE EXT
      1 ata4: reset failed, giving up
      4 ata4: softreset failed (device not ready)
     31 ata7.00: failed command: READ FPDMA QUEUED
      1 ata7: failed to read log page 10h (errno=-5)
      1 ata9.00: failed command: WRITE FPDMA QUEUED
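
The SATA failures above are grouped by message, which makes it hard to see at a glance how many distinct ports are affected. A minimal sketch that groups the same lines by port instead is shown here. The helper name `ata_failures_by_port` is hypothetical, and it assumes, as the command above does, that kernel log lines start with `ata` and carry no timestamp prefix.

```shell
# Sketch: count "failed" lines per ATA port from kernel log text on stdin.
# ata_failures_by_port is a hypothetical helper name, not part of any tool.
# Assumes lines begin with "ataN" (no "[ 123.456]" timestamp prefix).
ata_failures_by_port() {
    grep '^ata' | grep 'failed' \
        | sed 's/^\(ata[0-9][0-9]*\).*/\1/' \
        | sort | uniq -c
}
```

Usage would be `dmesg | ata_failures_by_port`. Device suffixes such as `.00` are folded into the port, so `ata4` and `ata4.00` count together.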
(Reporter)

Comment 1

6 years ago
We'll do a full repair pass tomorrow, but for now we're leaving it in service with MySQL freshly restarted. If this goes down or starts flapping horrifically, escalate and we'll migrate.
(Reporter)

Updated

6 years ago
Component: Operations → Operations: Hardware
(Assignee)

Comment 2

6 years ago
RAM replaced; will check back in a week to see whether the SATA/RAM issues persist.
Whiteboard: [RAM swapped, check back in a week]
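
For the one-week check, a single count covering both error patterns from the description may be enough. This is only a sketch: `hw_error_count` is a hypothetical helper name, and it assumes the host was rebooted for the RAM swap, so that the kernel ring buffer holds only messages logged after the swap (and has not wrapped since).

```shell
# Sketch: count kernel log lines on stdin that match either pattern
# from the description (DRAM bank errors or failed ATA commands).
# hw_error_count is a hypothetical helper name.
# Note: grep -c prints 0 and exits non-zero when nothing matches.
hw_error_count() {
    grep -c -e 'DRAM-Bank=[0-9]' -e '^ata.*failed'
}
```

Usage would be `dmesg | hw_error_count`. A result of 0 suggests no new errors of either kind were logged since boot.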
(Assignee)

Comment 3

6 years ago
No RAM issues since, closing out
Assignee: nobody → jlaz
Status: NEW → RESOLVED
Last Resolved: 6 years ago
Resolution: --- → FIXED
Component: Operations: Hardware → Operations
Product: Cloud Services → Cloud Services