Closed
Bug 625143
Opened 14 years ago
Closed 14 years ago
nagios gets failing PINGs that magically come back
Categories
(mozilla.org Graveyard :: Server Operations, task)
Tracking
(Not tracked)
RESOLVED
DUPLICATE
of bug 625867
People
(Reporter: dustin, Assigned: dustin)
Details
I'm not sure whether these are a nagios failure (nagios isn't really built to ping boxes that restart all the time) or a problem with the slaves.
I'll track the alerts in the Google spreadsheet in the URL to see if I can find a pattern.
Comment 1•14 years ago
I can't tell if these alerts are bogus or not - too much other chaos. I'll comment on that in the parent bug, and hopefully work it out in person tomorrow.
Comment 2•14 years ago
Sometimes these look like:
15:08 < nagios> [42] try-linux64-slave09.build:buildbot is CRITICAL: Connection refused by host
and they seem to happen while the slave is restarting - this one did. I checked the web interface and saw the CRITICAL I expected. A few moments later I navigated back to the same page and saw "NRPE: Unable to read output"
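To look for a pattern in alerts like the one quoted above, the IRC lines can be split into fields before they go into a spreadsheet. The following is a minimal sketch only: the regex and the field names (`alert_id`, `host`, `service`, `state`, `message`) are assumptions inferred from that single quoted line, not a documented format of the nagios bot, and the optional service part is a guess at how host-level alerts might look.

```python
import re
from typing import NamedTuple, Optional


class NagiosAlert(NamedTuple):
    """Fields of one IRC alert line (names are hypothetical)."""
    time: str
    alert_id: int
    host: str
    service: Optional[str]
    state: str
    message: str


# Pattern inferred from the one alert line quoted in this bug:
#   HH:MM < nagios> [N] host:service is STATE: message
ALERT_RE = re.compile(
    r"^(?P<time>\d{2}:\d{2})\s+<\s*nagios>\s+"
    r"\[(?P<alert_id>\d+)\]\s+"
    r"(?P<host>[^\s:]+)(?::(?P<service>\S+))?\s+is\s+"
    r"(?P<state>[A-Z]+):\s*(?P<message>.*)$"
)


def parse_alert(line: str) -> Optional[NagiosAlert]:
    """Return the parsed alert, or None if the line does not match."""
    m = ALERT_RE.match(line.strip())
    if m is None:
        return None
    return NagiosAlert(
        time=m["time"],
        alert_id=int(m["alert_id"]),
        host=m["host"],
        service=m["service"],  # None when no ":service" part is present
        state=m["state"],
        message=m["message"],
    )


sample = ("15:08 < nagios> [42] try-linux64-slave09.build:buildbot "
          "is CRITICAL: Connection refused by host")
alert = parse_alert(sample)
```

Grouping the parsed rows by `host` and `time` would make it easier to see whether alerts cluster around slave restarts.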
What I don't understand is that in the web interface this service (indeed, all of the services for this host, and for a few other hosts I've checked) is marked as passive, with active checks disabled. I didn't think that was possible with NRPE. Aren't NRPE checks triggered when the master connects to the slave and requests the check?
There's something here I don't understand that's blocking my ability to diagnose further.
Comment 3•14 years ago
I was confusing some "Connection refused" alerts (which were due to a typo in my puppet deployment of the nrpe.cnf changes) with the ping failures, which are better described in bug 625867. So, dup'ing.
Status: NEW → RESOLVED
Closed: 14 years ago
Resolution: --- → DUPLICATE
Updated•10 years ago
Product: mozilla.org → mozilla.org Graveyard