Closed Bug 1089152 Opened 10 years ago Closed 10 years ago

https - datazilla.mozilla.org on datazilla-zlb.vips.scl3.mozilla.com is CRITICAL: CRITICAL - Socket timeout after 10 seconds

Categories

(Infrastructure & Operations :: MOC: Problems, task)


Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: nagiosapi, Unassigned)

Details

(Whiteboard: [id=nagios1.private.scl3.mozilla.com:454574])

Automated alert report from nagios1.private.scl3.mozilla.com:

Hostname: datazilla-zlb.vips.scl3.mozilla.com
Service:  https - datazilla.mozilla.org
State:    CRITICAL
Output:   CRITICAL - Socket timeout after 10 seconds

Runbook:  http://m.allizom.org/https+-+datazilla.mozilla.org
nagios-scl3: Sat 07:36:58 PDT [5061] datazilla-zlb.vips.scl3.mozilla.com:https - datazilla.mozilla.org is CRITICAL: CRITICAL - Socket timeout after 10 seconds (http://m.mozilla.org/https+-+datazilla.mozilla.org)
[4:41pm] dividehex joined the chat room.
[4:45pm] nagios-scl3: Sat 07:45:16 PDT fw1.par1.mozilla.net (64.214.196.170) is DOWNTIMEEND (UP) :PING OK - Packet loss = 0%, RTA = 162.58 ms
[4:45pm] nagios-scl3: Sat 07:45:20 PDT admin1.par1.mozilla.com (64.213.97.196) is DOWNTIMEEND (UP) :PING OK - Packet loss = 0%, RTA = 163.52 ms
[4:45pm] nagios-scl3: Sat 07:45:29 PDT fw1.tier1.par1.mozilla.net (64.214.196.170) is DOWNTIMEEND (UP) :PING OK - Packet loss = 0%, RTA = 167.18 ms
[4:49pm] Usul: nagios-scl3: oncall dba
[4:49pm] nagios-scl3: Usul: sheeri currently has the pager
[4:50pm] Usul: sheeri: any idea how datazilla.mozilla.org works ?
[4:56pm] Usul: nagios-scl3: page sheeri can you look at the datazilla dbs, as the vip is down I suspect the db.
[4:56pm] nagios-scl3: Usul: sheeri has been paged with the message "can you look at the datazilla dbs, as the vip is down I suspect the db.(Usul)"
[5:06pm] nagios-scl3: SMS from sheeri: Pls text cyborgshadow, I'm afk
[5:09pm] Usul: nagios-scl3: page cyborgshadow can you look at the datazilla-db the service is down and I suspect db playing games ?
[5:09pm] nagios-scl3: Usul: cyborgshadow has been paged with the message "can you look at the datazilla-db the service is down and I suspect db playing games ?(Usul)"
[5:46pm] nagios-scl3: Sat 08:46:45 PDT [5088] datazilla-zlb.vips.scl3.mozilla.com:https - datazilla.mozilla.org is CRITICAL: HTTP CRITICAL: HTTP/1.1 500 Internal Server Error - 630 bytes in 0.024 second response time (http://m.mozilla.org/https+-+datazilla.mozilla.org)
[5:47pm] Usul: nagios-phx1: page cyborgshadow Can you look at datazilla , sheeri is afk ?
[5:47pm] nagios-phx1: Usul: cyborgshadow has been paged with the message "Can you look at datazilla , sheeri is afk ?(Usul)"
nagios-phx1: Usul: sheeri has been paged with the message "cyborgshadow is not answering …(Usul)"
[6:17pm] nagios-scl3: SMS from sheeri: Can you login to the machine as root and type mysql and see what happens?
[6:17pm] nagios-scl3: SMS from sheeri: If it doesn't get you to a MySQL prompt, restart the db with /etc/init.d/mysql restart

Both machines seem to have mysql running.
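
The triage sheeri describes over SMS (see whether mysql answers, restart it only if it does not) can be sketched in Python as below. This is a rough sketch under assumptions, not a tool anyone in this bug used: `mysql_answers`, `next_step` and the injectable `run` parameter are hypothetical names, and the non-interactive `mysql -e "SELECT 1"` stands in for typing `mysql` and looking for a prompt. The restart command is the one quoted in the SMS; the sketch only returns it and never runs it.

```python
import subprocess

# Restart command quoted from the SMS in the log above.
RESTART_CMD = ["/etc/init.d/mysql", "restart"]


def mysql_answers(run=subprocess.run, timeout=10):
    """Return True if the local mysql client can run a trivial query.

    Assumption: `mysql -e "SELECT 1"` is used as a non-interactive
    stand-in for "type mysql and see if you get a prompt".
    """
    try:
        result = run(["mysql", "-e", "SELECT 1"],
                     capture_output=True, timeout=timeout)
    except (OSError, subprocess.TimeoutExpired):
        # Client missing, not executable, or hung: treat as "no answer".
        return False
    return result.returncode == 0


def next_step(run=subprocess.run):
    """Return None if mysql answers, else the restart command to consider."""
    if mysql_answers(run):
        return None
    return RESTART_CMD
```

Here `next_step()` would have returned `None` on both machines, since mysql was running, which points the investigation away from the database.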
Automated alert acknowledgement: (Usul)dba knows
Status: NEW → ASSIGNED
Automated alert recovery:

Hostname: datazilla-zlb.vips.scl3.mozilla.com
Service:  https - datazilla.mozilla.org
State:    OK
Output:   HTTP OK: HTTP/1.1 200 OK - 8558 bytes in 0.721 second response time
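
The three check outputs seen in this bug (socket timeout, HTTP 500, HTTP 200) correspond to Nagios plugin states roughly as in the Python sketch below. It is an illustration only, not the plugin that nagios1 ran: `check_https`, the `opener` parameter and the exact message strings are assumptions. The exit codes 0 for OK and 2 for CRITICAL follow the usual Nagios plugin convention.

```python
import socket
import urllib.error
import urllib.request

OK, CRITICAL = 0, 2  # conventional Nagios plugin exit codes


def check_https(url, timeout=10.0, opener=urllib.request.urlopen):
    """Fetch url and return (state, message) in a Nagios-like shape.

    `opener` is injectable so the logic can be exercised without a network.
    """
    try:
        with opener(url, timeout=timeout) as resp:
            body = resp.read()
            return OK, f"HTTP OK: {resp.status} - {len(body)} bytes"
    except urllib.error.HTTPError as err:
        # urlopen raises HTTPError for 4xx/5xx responses such as the 500 above.
        return CRITICAL, f"HTTP CRITICAL: {err.code} {err.reason}"
    except socket.timeout:
        # Timeout while reading the response.
        return CRITICAL, f"CRITICAL - Socket timeout after {timeout:g} seconds"
    except urllib.error.URLError as err:
        # Connect-phase timeouts arrive wrapped in URLError.
        if isinstance(err.reason, socket.timeout):
            return CRITICAL, f"CRITICAL - Socket timeout after {timeout:g} seconds"
        return CRITICAL, f"CRITICAL - {err.reason}"


if __name__ == "__main__":
    state, message = check_https("https://datazilla.mozilla.org/")
    print(message)
    raise SystemExit(state)
```

Note that the timeout and the 500 are both CRITICAL, but they suggest different failures: the 500 came back in 0.024 seconds, so something was answering quickly and failing, which fits the eventual httpd restart.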
Status: ASSIGNED → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
With help from tmary, we restarted httpd on @datazilla1.webapp.scl3 (datazilla.mozilla.org), and that fixed the issue.
Component: MOC: Incidents → MOC: Problems