[crontabber] Flag to run for nagios alerts

RESOLVED FIXED

Status

Socorro
General
5 years ago
5 years ago

People

(Reporter: peterbe, Unassigned)

Tracking

Firefox Tracking Flags

(Not tracked)

Details

(Reporter)

Description

5 years ago
Add a flag, called something like `--has-errors`, that makes crontabber exit with code 0 if there are no errors and with a higher exit code (a la the way Nagios escalates warnings and errors) if there are errors.

With this in place, we can decide on some higher abstractions, such as only returning an exit code >0 if a backfill-based job has failed more than once.
(Reporter)

Updated

5 years ago
Blocks: 818736
The only docs I can find are http://nagios.sourceforge.net/docs/nrpe/NRPE.pdf
For plugins, Nagios calls the script and checks its return code: 0 means everything is OK, 1 indicates a WARNING, and 2 indicates CRITICAL. Whatever is printed to STDOUT will be available in Nagios as the reason it is WARNING or CRITICAL. So basically, this seems reasonable to me if you can reduce the amount of alerting it will do with logic in crontabber.
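To make the return-code convention above concrete, here is a minimal sketch of a plugin-style reporter. The `report` function and the `(level, message)` tuple shape are hypothetical, made up for illustration; this is not crontabber's actual code.

```python
import sys

# Nagios plugin return codes as described in the comment above.
OK, WARNING, CRITICAL = 0, 1, 2
LABELS = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL"}


def report(problems, stream=sys.stdout):
    """Write one status line to `stream` and return the exit code.

    `problems` is a hypothetical list of (level, message) tuples.
    Only the messages at the worst level are printed, since that
    line is what Nagios shows as the reason for the alert.
    """
    worst = max((level for level, _ in problems), default=OK)
    messages = "; ".join(msg for level, msg in problems if level == worst)
    stream.write(f"{LABELS[worst]} - {messages or 'all jobs healthy'}\n")
    return worst
```

A script would typically end with something like `sys.exit(report(problems))` so that Nagios sees the code as the process exit status.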
(Reporter)

Comment 3

5 years ago
:rhelmer
I'm still curious about what specific business logic to apply.

Here's one possible solution::

1. If there is any error with a count > 1, yield a CRITICAL
2. If there is any error with count == 1, yield a WARNING

Another solution is::

1. Same as 1 above but...
2. If the app is NOT backfill based, yield a CRITICAL 

Any thoughts?
(In reply to Peter Bengtsson [:peterbe] from comment #3)
> :rhelmer
...
> Another solution is::
> 
> 1. Same as 1 above but...
> 2. If the app is NOT backfill based, yield a CRITICAL 
> 
> Any thoughts?

I like this one ^ 

We want to know right away if anything requires manual intervention, and this should just about cover it I think.
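For reference, the second solution from comment 3 could be sketched roughly as below. `JobState`, `classify` and `overall_status` are hypothetical names, and I'm assuming a backfill-based app with exactly one failure still yields a WARNING (carried over from the first solution); the actual patch may differ.

```python
from collections import namedtuple

# Hypothetical per-job record; crontabber's real state storage may look different.
JobState = namedtuple("JobState", "app_name error_count is_backfill")

OK, WARNING, CRITICAL = 0, 1, 2  # Nagios return codes


def classify(job):
    """Map one job's error state to a Nagios level."""
    if job.error_count == 0:
        return OK
    if job.error_count > 1:
        # Rule 1: any error with a count > 1 is CRITICAL.
        return CRITICAL
    # Exactly one failure. Rule 2: a non-backfill app will not catch up
    # on its own, so it needs manual intervention right away.
    # Assumption: a backfill-based app gets a WARNING on its first failure.
    return WARNING if job.is_backfill else CRITICAL


def overall_status(jobs):
    """Return the worst level across all jobs (OK when there are none)."""
    return max((classify(job) for job in jobs), default=OK)
```

With this shape, a single failure of a backfill-based job only warns, while anything that cannot recover by itself escalates immediately.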
(Reporter)

Comment 5

5 years ago
Pull request: https://github.com/mozilla/socorro/pull/1074

Comment 6

5 years ago
Commit pushed to master at https://github.com/mozilla/socorro

https://github.com/mozilla/socorro/commit/fd35afc6aa9f537c7475c2de8163640887274fc7
bug 836425 - nagios alerts introspection, r=rhelmer
(Reporter)

Updated

5 years ago
Status: NEW → RESOLVED
Last Resolved: 5 years ago
Resolution: --- → FIXED