Closed Bug 1293652 Opened 8 years ago Closed 8 years ago

stage submitter down (august 9th)

Categories

(Socorro :: Infra, task, P1)

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: willkg, Assigned: jschneider)

Today (August 9th, 2016) at around 3:30am in whatever timezone Datadog displays for me (it looks like my local time, which is GMT-4), the stage submitter appears to have stopped sending crashes to the stage infrastructure. https://app.datadoghq.com/monitors#238219?group=all&from_ts=1470714598104&to_ts=1470742108552 This bug covers kicking the submitter to start it up again.
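For anyone unsure which timezone the graph is showing, the window in the monitor link can be decoded by hand. The following is a minimal sketch, not anything Datadog ships: it assumes `from_ts` and `to_ts` are Unix epoch milliseconds (their magnitude suggests so) and takes the reporter's GMT-4 offset at face value. The names `monitor_window` and `LOCAL_TZ` are made up for this illustration.

```python
from datetime import datetime, timedelta, timezone
from urllib.parse import parse_qs, urlsplit

MONITOR_URL = (
    "https://app.datadoghq.com/monitors#238219"
    "?group=all&from_ts=1470714598104&to_ts=1470742108552"
)

# Assumption: the reporter's local offset is GMT-4, as stated in the report.
LOCAL_TZ = timezone(timedelta(hours=-4))
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def monitor_window(url, tz=LOCAL_TZ):
    """Return (start, end) of the graph window as datetimes in tz."""
    # The parameters come after the '#', so they are part of the URL
    # fragment rather than the regular query string.
    fragment = urlsplit(url).fragment
    params = parse_qs(fragment.partition("?")[2])

    def to_local(key):
        # Assumption: the value is milliseconds since the Unix epoch.
        millis = int(params[key][0])
        return (EPOCH + timedelta(milliseconds=millis)).astimezone(tz)

    return to_local("from_ts"), to_local("to_ts")


start, end = monitor_window(MONITOR_URL)
print(start.strftime("%Y-%m-%d %H:%M:%S %z"))  # 2016-08-08 23:49:58 -0400
print(end.strftime("%Y-%m-%d %H:%M:%S %z"))    # 2016-08-09 07:28:28 -0400
```

Under those assumptions the linked window runs from just before midnight to about 7:30am GMT-4 on August 9th, so a drop at around 3:30am local time would fall inside it.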
Making this a critical P1 since it blocks development and pushing and all that. Assigning to JP since it's infrastructure. Drinking some coffee because that's what coffee is for.
Assignee: nobody → jschneider
Severity: normal → critical
Priority: -- → P1
JP restarted the stage submitter. The Datadog graph looked good for about 15 minutes, then took a nosedive. Maybe it was working through what was already in the queue to be submitted? It seems to be submitting crashes now, but at a rate well below normal. I think we should wait a bit and, if things don't improve, restart the submitter again.
There were two screens running the submitter, and one had errored. The other had a permissions error, presumably because the first process had files locked. I recycled the first, killed the second, and happiness has once again entered these lands.
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
Summary: stage submitter is dead → stage submitter down (august 9th)