Closed Bug 1146974 Opened 9 years ago Closed 9 years ago

Add automated alerting for runner abnormally high retries.

Categories

(Release Engineering :: General, defect)

x86_64
Linux
defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: mrrrgn, Assigned: mrrrgn)

References

Details

We can make use of the InfluxDB data (the same data that powers https://stats.taskcluster.net/grafana/#/dashboard/db/runner).
Summary: Add automated alerting for runner high retries. → Add automated alerting for runner abnormally high retries.
It looks like the way to go here is via integration with Nagios.
Assignee: nobody → winter2718
Come to think of it, it would be even simpler to do this via Papertrail.
Depends on: 1139034
buildduty and I are signed up for alerts. An alert is triggered if more than 100 retries (across all machines) are seen within a one-minute period.
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Component: Tools → General