crash-reports-xpsp2.mozilla.com flapping

RESOLVED FIXED

Status

Infrastructure & Operations
MOC: Problems
RESOLVED FIXED
2 years ago
2 years ago

People

(Reporter: ashlee, Unassigned)

(Reporter)

Description

2 years ago
crash-reports started flapping tonight @:

Tue 16:54:36 PST [1010] crash-reports-xpsp2.mozilla.com (54.218.3.252) is DOWN :HTTP CRITICAL: HTTP/1.1 500 Internal Server Error - 181 bytes in 2.490 second response time

and again @:

Tue 18:58:56 PST [1024] crash-reports-xpsp2.mozilla.com (52.10.144.3) is DOWN :HTTP CRITICAL: HTTP/1.1 500 Internal Server Error - 181 bytes in 0.706 second response time

Pinged in #breakpad, cc'd KaiRo and jp.
(Reporter)

Updated

2 years ago
Summary: crash-reports-xpsp2.mozilla.com (52.10.144.3) is DOWN → crash-reports-xpsp2.mozilla.com flapping
(Reporter)

Comment 1

2 years ago
Downtimed host for 2d.

Comment 2

2 years ago
An alert came in for the following:

Dec 30, 2015 at 08:34 AM: Pingdom Alert: Incident #20223 for crash-stats.mozilla.org (https://crash-stats.mozilla.org/home/products/Firefox), has been assigned to you.

Comment 3

2 years ago
Reopen again if necessary.
Status: NEW → RESOLVED
Last Resolved: 2 years ago
Resolution: --- → WORKSFORME
(Reporter)

Comment 4

2 years ago
Started flapping again tonight:

Thu 20:24:05 PST [1255] crash-reports-xpsp2.mozilla.com (52.10.144.3) is DOWN :HTTP CRITICAL: HTTP/1.1 500 Internal Server Error - 181 bytes in 0.479 second response time

Notified jp-away that this is continuing. He mentioned that everything on his end looked fine: https://www.irccloud.com/pastebin/OKA6NAsf/
jp-away> we've got 10+ servers available and in rotation, i'm not seeing any 500's


Could this be an error in the alert config that someone from the MOC set up?

Updated

2 years ago
Status: RESOLVED → REOPENED
Resolution: WORKSFORME → ---

Comment 5

2 years ago
3:04 AM <nagios-phx1> Fri 03:04:10 PST [1016] crash-reports-xpsp2.mozilla.com (54.218.3.252) is DOWN :HTTP CRITICAL: HTTP/1.1 500 Internal Server Error - 181 bytes in 0.431 second response time

3:06 AM <nagios-phx1> Fri 03:06:20 PST [1017] crash-reports-xpsp2.mozilla.com (54.218.3.252) is UP :HTTP OK: HTTP/1.1 200 OK - 164 bytes in 0.118 second response time

Comment 6

2 years ago
More flaps and hiccups:


4:09 AM <nagios-phx1> Fri 04:09:55 PST [1020] crash-reports-xpsp2.mozilla.com (52.27.26.102) is DOWN :HTTP CRITICAL: HTTP/1.1 500 Internal Server Error - 181 bytes in 0.372 second response time


4:11 AM <nagios-phx1> Fri 04:11:15 PST [1021] crash-reports-xpsp2.mozilla.com (52.27.26.102) is UP :HTTP OK: HTTP/1.1 200 OK - 164 bytes in 0.120 second response time

5:32 AM <nagios-phx1> Fri 05:32:30 PST [1022] crash-reports-xpsp2.mozilla.com (52.10.144.3) is DOWN :HTTP CRITICAL: HTTP/1.1 500 Internal Server Error - 181 bytes in 0.570 second response time


5:36 AM <nagios-phx1> Fri 05:36:10 PST [1023] crash-reports-xpsp2.mozilla.com (52.10.144.3) is UP :HTTP OK: HTTP/1.1 200 OK - 164 bytes in 0.123 second response time

6:01 AM <nagios-phx1> Fri 06:01:14 PST [1024] crash-reports-xpsp2.mozilla.com (52.10.144.3) is DOWN :HTTP CRITICAL: HTTP/1.1 500 Internal Server Error - 181 bytes in 0.515 second response time

Comment 7

2 years ago
7:36 AM <nagios-phx1> Fri 07:36:19 PST [1035] crash-reports-xpsp2.mozilla.com (52.10.144.3) is DOWN :HTTP CRITICAL: HTTP/1.1 500 Internal Server Error - 181 bytes in 0.556 second response time
7:39 AM <nagios-phx1> Fri 07:39:20 PST [1036] crash-reports-xpsp2.mozilla.com (52.10.144.3) is UP :HTTP OK: HTTP/1.1 200 OK - 164 bytes in 0.198 second response time



8:07 AM <nagios-phx1> Fri 08:07:56 PST [1037] crash-reports-xpsp2.mozilla.com (54.218.3.252) is DOWN :HTTP CRITICAL: HTTP/1.1 500 Internal Server Error - 181 bytes in 0.251 second response time
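
To see whether the 500s cluster on one backend or are spread across the rotation, the notification lines pasted above can be tallied per IP. This is only a rough sketch: the regular expression is inferred from the lines quoted in this bug, and `ALERT_RE`, `tally_down_by_ip`, and `SAMPLE` are hypothetical names made up for illustration.

```python
import re
from collections import Counter

# Field layout inferred from the notification lines quoted in this bug
# (an assumption, not taken from any Nagios documentation).
ALERT_RE = re.compile(
    r"\[(?P<id>\d+)\]\s+(?P<host>\S+)\s+\((?P<ip>[\d.]+)\)\s+is\s+(?P<state>UP|DOWN)\b"
)


def tally_down_by_ip(lines):
    """Count DOWN notifications per backend IP; lines that do not match are ignored."""
    counts = Counter()
    for line in lines:
        match = ALERT_RE.search(line)
        if match and match.group("state") == "DOWN":
            counts[match.group("ip")] += 1
    return counts


# Three lines copied from comment 7.
SAMPLE = [
    "7:36 AM <nagios-phx1> Fri 07:36:19 PST [1035] crash-reports-xpsp2.mozilla.com (52.10.144.3) is DOWN :HTTP CRITICAL: HTTP/1.1 500 Internal Server Error - 181 bytes in 0.556 second response time",
    "7:39 AM <nagios-phx1> Fri 07:39:20 PST [1036] crash-reports-xpsp2.mozilla.com (52.10.144.3) is UP :HTTP OK: HTTP/1.1 200 OK - 164 bytes in 0.198 second response time",
    "8:07 AM <nagios-phx1> Fri 08:07:56 PST [1037] crash-reports-xpsp2.mozilla.com (54.218.3.252) is DOWN :HTTP CRITICAL: HTTP/1.1 500 Internal Server Error - 181 bytes in 0.251 second response time",
]

if __name__ == "__main__":
    for ip, count in tally_down_by_ip(SAMPLE).most_common():
        print(ip, count)
```

If the DOWN counts are spread over several IPs, as they are in the lines pasted so far, a single bad backend is a less likely explanation than intermittent errors or an overly sensitive check.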
(Reporter)

Updated

2 years ago
Group: infra

Comment 8

2 years ago
Peter,

I noticed that the crash-reports.mozilla.com entry has max_check_attempts set, so I added the same setting to the crash-reports-xpsp2.mozilla.com entry.

This seems to have stopped the flapping alerts. Can you confirm this looks good?
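
For anyone reading later, here is a rough sketch of why max_check_attempts quiets this kind of flap. It is a simplified model of the soft/hard state behaviour, not Nagios itself: a DOWN notification goes out only after that many consecutive failed checks, so a single 500 followed by a 200 never pages. The function name `notifications`, the sample check sequence, and the value 3 are assumptions for illustration; the value actually used in the config is not recorded in this bug.

```python
def notifications(results, max_check_attempts):
    """Return the notifications a simplified soft/hard state machine would send.

    results: iterable of booleans, True for an OK check, False for a failed one.
    max_check_attempts: consecutive failures required before DOWN is sent.
    """
    sent = []
    failures = 0       # consecutive failed checks so far
    hard_down = False  # True once a DOWN notification has been sent
    for ok in results:
        if ok:
            if hard_down:
                sent.append("UP")  # recovery is only announced after a hard DOWN
            failures = 0
            hard_down = False
        else:
            failures += 1
            if failures >= max_check_attempts and not hard_down:
                sent.append("DOWN")
                hard_down = True
    return sent


# Hypothetical check history: two isolated failures, then a real outage.
checks = [True, False, True, True, False, True, False, False, False, True]

if __name__ == "__main__":
    print(notifications(checks, 1))  # every single failure pages
    print(notifications(checks, 3))  # only the sustained failure pages
```

With a threshold of 1 the sample history produces three DOWN/UP pairs; with a threshold of 3 only the sustained outage is reported. The trade-off is that a real outage is reported a few check intervals later.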
Flags: needinfo?(pradcliffe+bugzilla)
Looks reasonable to me. Good catch.
Status: REOPENED → RESOLVED
Last Resolved: 2 years ago
Flags: needinfo?(pradcliffe+bugzilla)
Resolution: --- → FIXED