Closed
Bug 1872150
Opened 2 years ago
Closed 2 years ago
crash stats outage (15 minutes) due to elasticsearch cluster going unresponsive
Categories
(Socorro :: Webapp, defect, P1)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: wsmwk, Unassigned)
Details
All URLs starting with https://crash-stats.mozilla.org/ fail
Comment 1•2 years ago
Seems better now
Comment 2•2 years ago
Looks like between 13:28 UTC and 13:47 UTC, the Elasticsearch cluster became unresponsive: CPU spiked and the node heap usage dropped. Everything looks OK now. I'll leave this open, and if everything's still fine later, I'll close it out.
Comment 3•2 years ago
Everything continues to work, so I'm going to close this out.
We declared an incident for the outage and I wrote up my observations in the incident document:
https://docs.google.com/document/d/1Hm2WWwSKmTARn_vBJakEpYjYUm4M6LzXkTJivb8liMc/edit
We'll do a retrospective in the next couple of weeks.
Status: NEW → RESOLVED
Closed: 2 years ago
Priority: -- → P1
Resolution: --- → FIXED
Summary: Internal Server Error at https://crash-stats.mozilla.org/ → crash stats outage (15 minutes) due to elasticsearch cluster going unresponsive