nagios_blocker_checker.pl doesn't fail NRPE gracefully with bad inputs.

RESOLVED FIXED

Status

bugzilla.mozilla.org
General
P5
minor
RESOLVED FIXED
2 years ago
6 months ago

People

(Reporter: gcox, Unassigned)

Tracking

Production

Details

https://github.com/mozilla-bteam/bmo/blob/master/contrib/nagios_blocker_checker.pl

================================
$ sudo /data/bugzilla/www/bugzilla.mozilla.org/scripts/nagios_blocker_checker.pl --product 'Infrastructure & Operations' --component 'MOC: Incidents' --severity blocker

There is no component named 'MOC: Incidents' in the 'Infrastructure &
Operations' product.
================================

This causes "NRPE: Unable to read output" in nagios, because the text the check returns is not formatted the way nagios expects, rather than an error message you can quickly figure out.

My quick diagnosis: this appears to be a by-product of the script calling Bugzilla::Product->check, which leads to Bugzilla::Error->ThrowUserError and a pretty generic, web-friendly-but-nagios-unfriendly template.  That is about when I start saying "I have no graceful patch here without breaking the API into a lot of different calls", and asking whether you do.

The fix for this is not too bad:

Bugzilla->error_mode(ERROR_MODE_DIE) will cause that to be thrown as a real exception, which can be caught with try, something like:

# after https://github.com/mozilla-bteam/bmo/blob/master/contrib/nagios_blocker_checker.pl#L20
use Try::Tiny; # bmo ships with this nowadays
Bugzilla->error_mode(ERROR_MODE_DIE);

try {
    # all lines from
    # https://github.com/mozilla-bteam/bmo/blob/master/contrib/nagios_blocker_checker.pl#L119-L196
} catch {
    # Try::Tiny puts the caught error in $_.
    # print meaningful output.
};
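
To make the "print meaningful output" step concrete, here is a minimal, self-contained sketch of the pattern. It is not the merged patch. It uses core eval instead of Try::Tiny so it runs without extra modules, and lookup_component, run_check, %KNOWN, and the 'Example Product' / 'Example Component' names are hypothetical stand-ins for Bugzilla::Product->check and real BMO data. The only facts it relies on are the standard Nagios plugin exit codes (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN).

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Standard Nagios plugin exit codes.
use constant {
    NAGIOS_OK       => 0,
    NAGIOS_WARNING  => 1,
    NAGIOS_CRITICAL => 2,
    NAGIOS_UNKNOWN  => 3,
};

# Hypothetical data standing in for the products/components Bugzilla knows.
my %KNOWN = ('Example Product' => { 'Example Component' => 1 });

# Hypothetical stand-in for Bugzilla::Product->check under ERROR_MODE_DIE:
# it dies with a multi-line, user-facing message on bad input.
sub lookup_component {
    my ($product, $component) = @_;
    die "There is no product named '$product'.\n"
        unless exists $KNOWN{$product};
    die "There is no component named '$component' in the\n'$product' product.\n"
        unless $KNOWN{$product}{$component};
    return 1;
}

# Run the check body and always return (exit_code, one_line_message),
# even when the body throws.
sub run_check {
    my ($product, $component) = @_;
    my $ok = eval { lookup_component($product, $component); 1 };
    if (!$ok) {
        my $error = $@ || 'unknown error';
        $error =~ s/\s+/ /g;        # collapse newlines: status must be one line
        $error =~ s/^\s+|\s+$//g;   # trim leading/trailing whitespace
        return (NAGIOS_UNKNOWN, "UNKNOWN: $error");
    }
    return (NAGIOS_OK, "OK: '$component' exists in '$product'");
}

# Demo: one good input, one bad input.
for my $component ('Example Component', 'No Such Component') {
    my ($code, $message) = run_check('Example Product', $component);
    print "exit $code: $message\n";
}
```

In an actual check script the last step would be to print $message on stdout and exit with $code, so that nagios sees a one-line status and a defined exit code even for bad --product or --component values.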

See Also: → bug 1329995

Aaaand, :dylan merged that PR.  This bug was about future-prevention, and the issue only crops up when BMO components/products change, and none are planned in the areas IT is watching.  So I'm calling this done, even though the fix is presently only committed upstream and not deployed on the admin host: it'll get to us eventually on a future deploy, and we shouldn't notice it anyway.

Thanks!
Status: NEW → RESOLVED
Last Resolved: 6 months ago
Resolution: --- → FIXED

Updated

6 months ago
Blocks: 1329995