Status

Infrastructure & Operations
MOC: Problems
RESOLVED WORKSFORME
a year ago

People

(Reporter: jedi, Unassigned)

Tracking

Details

Description

a year ago
02:10 <@nagios-scl3> Tue 23:10:10 PDT [5638] admin1a.private.tpe1.mozilla.com:Time Sync is CRITICAL: CHECK_NRPE: Socket timeout after 15 seconds.  (http://m.mozilla.org/Time+Sync)


This was the first in a series of alerts; just about everything in tpe1 is now reported as down.

No planned maintenance in Whistlepig.

CAN ping fw1.tpe1.mozilla.net
CAN ping fw1.tier2.tpe1.mozilla.net

Checked Observium: 
Both checks for fw1.ops.tpe1.mozilla.net are red - last changed 3h8m ago.  

This doesn't appear to be a telco thing, as I can ping both fw1 connections.  Paging netops.
(Reporter)

Comment 1

a year ago
Came back up.
Final up alert:
02:31 <@nagios-scl3> (IRC) Tue 23:31:46 PDT [5760] endpoint1.kenya.av.tpe1.mozilla.com:Check vidyo port 80 is OK: TCP OK - 0.152 second response time on 10.247.48.46 port

Total downtime: 21 minutes.  
NOT following up with netops.
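
For reference, the downtime figure above can be reproduced from the two alert timestamps quoted in this bug (23:10:10 PDT for the first CRITICAL, 23:31:46 PDT for the final OK). This is only a rough sketch: the function name `downtime_minutes` is made up for illustration, and it assumes both timestamps fall on the same day and that the first alert marks the start of the outage, which it may not exactly.

```python
from datetime import datetime

# Timestamps copied from the alerts quoted in this bug (both PDT, same day).
FIRST_ALERT = "23:10:10"     # first CRITICAL (Time Sync on admin1a)
FINAL_RECOVERY = "23:31:46"  # final OK (vidyo port 80 check)


def downtime_minutes(start: str, end: str) -> int:
    """Whole minutes between two HH:MM:SS times on the same day (hypothetical helper)."""
    fmt = "%H:%M:%S"
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return int(delta.total_seconds() // 60)


print(downtime_minutes(FIRST_ALERT, FINAL_RECOVERY))  # 21
```

The exact gap is 21 minutes 36 seconds; the sketch rounds down to match the figure reported above.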
Status: NEW → RESOLVED
Last Resolved: a year ago
Resolution: --- → WORKSFORME
(Assignee)

Updated

a year ago
Component: MOC: Incidents → MOC: Problems
Product: Infrastructure & Operations → Infrastructure & Operations