Per bug 933334#c7, we should add a Nagios check that checks the contents of a certain file and pages certain people in certain ways at certain times. This would be for the host sumotools1.webapp.phx1.
Based on the current process (send an email when things are bad, otherwise don't send an email), this should probably be a logfile check. I would advise that whatever process is currently sending emails-on-CRITICAL should additionally append a single line to a logfile somewhere. That way the Nagios check can simply take that line and emit it via IRC/email/SMS.
Should Nagios deliver alerts on IRC, by email, and/or by SMS? (needinfo :cww)
Does the above logfile check sound reasonable? (needinfo :ashish)
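For illustration only, the logfile check proposed above could look roughly like the sketch below. Nothing here comes from the actual setup: the function name `check_logfile`, the log path, and the state file used to remember how far the log has already been read are all assumptions. The only fixed convention is the set of standard Nagios plugin exit codes (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN).

```python
import os

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def check_logfile(log_path, state_path):
    """Return (exit_code, message) for lines appended since the last run.

    log_path and state_path are hypothetical; the state file stores the
    byte offset already processed, so each log line alerts only once.
    """
    offset = 0
    if os.path.exists(state_path):
        with open(state_path) as f:
            try:
                offset = int(f.read().strip() or 0)
            except ValueError:
                offset = 0  # unreadable state: start from the beginning

    try:
        size = os.path.getsize(log_path)
    except OSError as exc:
        return UNKNOWN, "UNKNOWN: cannot read %s (%s)" % (log_path, exc)

    if size < offset:
        offset = 0  # log was truncated or rotated

    with open(log_path, "rb") as f:
        f.seek(offset)
        data = f.read()
        new_offset = f.tell()

    # Remember where we stopped for the next run.
    with open(state_path, "w") as f:
        f.write(str(new_offset))

    lines = [l for l in data.decode("utf-8", "replace").splitlines() if l.strip()]
    if not lines:
        return OK, "OK: no new entries in %s" % log_path
    # Report the most recent new line as the alert text.
    return CRITICAL, "CRITICAL: %s" % lines[-1]
```

A small wrapper would print the returned message and call `sys.exit()` with the returned code, since Nagios reads the plugin's first line of output and its exit status. How the alert is then delivered (IRC, email, SMS) would be configured on the Nagios side, not in the plugin.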
Alerts via email. We can have it sent to a mailing list which subscribers can be added to.
Sorry for the late response, but a logfile check sounds doable. I'm not sure whether the application already does that, but do let me know once that is set up and we can fix up the Nagios side of things.
Component: Server Operations → MOC: Service Requests
Product: mozilla.org → Infrastructure & Operations
Just checking in - is this request still relevant?
I'm going to say no for now, since I can't discern what log entries I cared about in 2013. Sorry for the noise!
Status: NEW → RESOLVED
Last Resolved: 3 years ago
Resolution: --- → INVALID