Closed Bug 1466889 Opened 7 years ago Closed 7 years ago

Add metric to a10n poller for use as health check

Categories

(Localization Infrastructure and Tools :: Automation, enhancement)

Type: enhancement
Priority: Not set
Severity: normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: brian, Assigned: Pike)

Details

If the a10n poller is not running, or is running but locked up, we need an alert so we can investigate and fix it. The ideal way to do this is for the poller to submit one or more metrics to Datadog at a regular interval, so that we can use Datadog's ability to alert when a metric has been missing data for more than N minutes.

My suggestion is to take the cycle time that you already log and also publish it as a gauge under the name 'poller.cycle_time'. That would publish roughly every 2 minutes.

https://github.com/mozilla-services/elmo-automation/blob/master/twisted/plugins/pushes_plugin.py#L209
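For illustration only, here is a minimal sketch of what publishing the cycle time as a gauge could look like. It is not the code from pushes_plugin.py: the helper names (format_gauge, send_gauge, run_cycle) are hypothetical, and it assumes a StatsD-compatible agent such as the Datadog agent listening on UDP 127.0.0.1:8125. It uses only the standard library and the plain StatsD gauge datagram format, `<name>:<value>|g`.

```python
import socket
import time


def format_gauge(name, value):
    """Build a StatsD gauge datagram of the form b'<name>:<value>|g'."""
    return "{}:{}|g".format(name, value).encode("ascii")


def send_gauge(name, value, host="127.0.0.1", port=8125):
    """Send one gauge datagram over UDP (fire and forget).

    Assumption: a StatsD-compatible agent listens on host:port.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sock.sendto(format_gauge(name, value), (host, port))
    finally:
        sock.close()


def run_cycle(work, report=send_gauge):
    """Run one poll cycle, then report how long it took as a gauge.

    `work` stands in for the poller's real cycle; `report` can be
    swapped out, e.g. for a stub in tests.
    """
    start = time.monotonic()
    work()
    elapsed = time.monotonic() - start
    report("poller.cycle_time", elapsed)
    return elapsed
```

Because the gauge is only emitted when a cycle completes, a poller that is stopped or hung stops producing data points, which is the condition a "missing data for more than N minutes" alert would catch.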
Flags: needinfo?(l10n)
sgtm, taking a crack at that.
Assignee: nobody → l10n
Flags: needinfo?(l10n)
I've verified this is working as intended in stage and prod.
Marking FIXED then. Thanks for checking.
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED