Closed Bug 896203 Opened 12 years ago Closed 12 years ago

pdu1.r102-2.build.scl1.mozilla.com flapping

Categories

(Infrastructure & Operations :: DCOps, task)

x86_64
Windows 7
task
Not set
normal

Tracking

(Not tracked)

RESOLVED DUPLICATE of bug 895895

People

(Reporter: Callek, Unassigned)

Details

My #buildduty scrollback only goes back so far [Thu 15:42:18 PDT], but we have an issue with one of our PDUs constantly flapping:
Fri 00:41:12 PDT [465] pdu1.r102-2.build.scl1.mozilla.com is DOWN :PING CRITICAL - Packet loss = 100%
Fri 00:44:22 PDT [472] pdu1.r102-2.build.scl1.mozilla.com is UP :PING OK - Packet loss = 0%, RTA = 6.17 ms
Fri 06:40:23 PDT [437] pdu1.r102-2.build.scl1.mozilla.com is DOWN :PING CRITICAL - Packet loss = 100%
Fri 06:43:33 PDT [442] pdu1.r102-2.build.scl1.mozilla.com is UP :PING OK - Packet loss = 0%, RTA = 5.56 ms
Fri 12:39:26 PDT [406] pdu1.r102-2.build.scl1.mozilla.com is DOWN :PING CRITICAL - Packet loss = 100%
Fri 12:45:15 PDT [415] pdu1.r102-2.build.scl1.mozilla.com is UP :PING OK - Packet loss = 0%, RTA = 20.76 ms
Fri 18:41:58 PDT [487] pdu1.r102-2.build.scl1.mozilla.com is DOWN :PING CRITICAL - Packet loss = 100%
Fri 18:45:08 PDT [494] pdu1.r102-2.build.scl1.mozilla.com is UP :PING OK - Packet loss = 0%, RTA = 3.70 ms
Sat 00:41:12 PDT [407] pdu1.r102-2.build.scl1.mozilla.com is DOWN :PING CRITICAL - Packet loss = 100%
Sat 00:44:22 PDT [411] pdu1.r102-2.build.scl1.mozilla.com is UP :PING OK - Packet loss = 0%, RTA = 59.58 ms
Sat 06:40:26 PDT [488] pdu1.r102-2.build.scl1.mozilla.com is DOWN :PING CRITICAL - Packet loss = 100%
Sat 06:43:36 PDT [489] pdu1.r102-2.build.scl1.mozilla.com is UP :PING OK - Packet loss = 0%, RTA = 49.05 ms
Sat 12:42:46 PDT [434] pdu1.r102-2.build.scl1.mozilla.com is DOWN :PING CRITICAL - Packet loss = 100%
As I was filing this, it recovered. I then set it to downtime to avoid the constant nag:
Downtime for talos-r4-snow-045.build.scl1.mozilla.com scheduled for 1 day, 12:00:00
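To check whether the flapping is periodic, the alerts quoted above can be parsed and the gaps between successive DOWN events computed. The following is a minimal sketch, not part of any Nagios tooling: the names `parse_alerts` and `down_gaps` are hypothetical, the sample holds only the first few alerts from this report, and it assumes the alerts are in chronological order and span less than one week.

```python
import re
from datetime import timedelta

DAYS = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]
WEEK = 7 * 24 * 3600  # seconds in a week, used to handle Sun -> Mon wraparound

# Matches alerts shaped like the ones pasted in this bug, e.g.
# "Fri 00:41:12 PDT [465] <host> is DOWN :PING CRITICAL - ..."
ALERT_RE = re.compile(
    r"(Mon|Tue|Wed|Thu|Fri|Sat|Sun) (\d{2}):(\d{2}):(\d{2}) PDT "
    r"\[\d+\] (\S+) is (DOWN|UP)"
)

HOST = "pdu1.r102-2.build.scl1.mozilla.com"

# First five alerts from the report, run together as in the IRC paste.
SAMPLE = (
    "Fri 00:41:12 PDT [465] pdu1.r102-2.build.scl1.mozilla.com is DOWN "
    ":PING CRITICAL - Packet loss = 100% "
    "Fri 00:44:22 PDT [472] pdu1.r102-2.build.scl1.mozilla.com is UP "
    ":PING OK - Packet loss = 0%, RTA = 6.17 ms "
    "Fri 06:40:23 PDT [437] pdu1.r102-2.build.scl1.mozilla.com is DOWN "
    ":PING CRITICAL - Packet loss = 100% "
    "Fri 06:43:33 PDT [442] pdu1.r102-2.build.scl1.mozilla.com is UP "
    ":PING OK - Packet loss = 0%, RTA = 5.56 ms "
    "Fri 12:39:26 PDT [406] pdu1.r102-2.build.scl1.mozilla.com is DOWN "
    ":PING CRITICAL - Packet loss = 100%"
)


def parse_alerts(text):
    """Return (seconds_into_week, host, state) for every alert found in text."""
    alerts = []
    for day, hh, mm, ss, host, state in ALERT_RE.findall(text):
        offset = DAYS.index(day) * 86400 + int(hh) * 3600 + int(mm) * 60 + int(ss)
        alerts.append((offset, host, state))
    return alerts


def down_gaps(text, host):
    """Return the time between each pair of consecutive DOWN alerts for host."""
    downs = [t for t, h, state in parse_alerts(text) if h == host and state == "DOWN"]
    return [timedelta(seconds=(b - a) % WEEK) for a, b in zip(downs, downs[1:])]


if __name__ == "__main__":
    for gap in down_gaps(SAMPLE, HOST):
        print(gap)
```

Run against the full list of alerts, this should show the DOWN events spaced roughly six hours apart, which may point to a scheduled event rather than random packet loss. That reading is an inference from the timestamps, not something confirmed in this bug.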
I can confirm we see the PDU management interface flapping on the switch side, as well. Assuming that this isn't directly impacting the hosts connected to it, we'll investigate what's happening with the management module on Monday.
colo-trip: --- → scl1
<nagios-releng> Mon 12:39:34 PDT [475] pdu1.r102-2.build.scl1.mozilla.com is DOWN :PING CRITICAL - Packet loss = 100%
<nagios-releng> Mon 12:45:04 PDT [482] pdu1.r102-2.build.scl1.mozilla.com is UP :PING OK - Packet loss = 0%, RTA = 5.09 ms
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → DUPLICATE
Product: mozilla.org → Infrastructure & Operations