Closed
Bug 746396
Opened 12 years ago
Closed 12 years ago
add nagios monitoring of the opsi master process to staging-opsi and production-opsi
Categories
(Infrastructure & Operations :: RelOps: General, task)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: bear, Assigned: arich)
References
Details
(Whiteboard: [scl3][opsi])
According to https://nagios.mozilla.org/nagios/cgi-bin/status.cgi?navbarsearch=1&host=staging-opsi, nagios is not monitoring the OPSI master process on staging-opsi.
Comment 1•12 years ago
Okay, what should the check be?
Updated•12 years ago
Assignee: server-ops-releng → arich
Comment 2•12 years ago
Checking back on this to see if there's more information about what this check should look like.
Comment 3•12 years ago
The opsi people have a project to write a nagios plugin that does many things, including making sure that the process is responding, but their 'someone pays for this development and then it's free for everyone' model hasn't attracted any funding yet. You can find the source via Google, but its licensing excludes us from using it. We have '/usr/bin/python /usr/sbin/opsiconfd -D' in the process list, so let's just go with a simple process check for now: exactly 1 instance of opsiconfd should be running.
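For illustration only, a simple process check along these lines could be sketched in Python as below. This is not the check that was deployed; it assumes a Linux `ps` that accepts `-eo args`, follows the usual nagios plugin exit-code convention (0 = OK, 2 = CRITICAL, 3 = UNKNOWN), and the function names are hypothetical.

```python
import subprocess

# Conventional nagios plugin exit codes.
OK, CRITICAL, UNKNOWN = 0, 2, 3


def count_matching(cmdlines, needle="/usr/sbin/opsiconfd"):
    """Count command lines that have the daemon path as one of their arguments."""
    return sum(1 for line in cmdlines if needle in line.split())


def evaluate(count, expected=1):
    """Map a process count to an (exit_code, message) pair."""
    if count == expected:
        return OK, f"PROCS OK: {count} opsiconfd process"
    return CRITICAL, f"PROCS CRITICAL: {count} opsiconfd processes (expected {expected})"


def list_cmdlines():
    """Return the full command line of every running process (assumes Linux ps)."""
    result = subprocess.run(
        ["ps", "-eo", "args"], capture_output=True, text=True, check=True
    )
    return result.stdout.splitlines()[1:]  # drop the header row


def main():
    """Print a one-line status and return the plugin exit code."""
    try:
        code, message = evaluate(count_matching(list_cmdlines()))
    except (OSError, subprocess.CalledProcessError) as exc:
        code, message = UNKNOWN, f"PROCS UNKNOWN: {exc}"
    print(message)
    return code
```

To use it as a plugin, a script would end with `raise SystemExit(main())`. Matching on the argument list rather than the process name matters here because the process shows up as `/usr/bin/python` with opsiconfd as an argument.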
Assignee: arich → nobody
Component: Server Operations: RelEng → Release Engineering: Machine Management
QA Contact: arich → armenzg
Comment 4•12 years ago
I've added a check for /usr/sbin/opsiconfd to the new opsi servers in scl3. I also had to:
* modify the allowed hosts in /etc/nagios.nrpe.cfg on both machines so that admin1.infra.scl1.mozilla.com and nagios1.private.releng.scl3.mozilla.com could talk to them
* add the check definitions for swap and procs_regex to /etc/nagios/nrpe_local.cfg
I didn't even see puppet installed, so I don't think these changes will get overwritten.
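Since these edits were made by hand on each machine, a small sanity check can confirm they are still in place. The sketch below is an assumption-laden illustration, not something from this bug: it parses NRPE-style `key=value` text, relies on the standard `allowed_hosts=` and `command[name]=` directives, and uses hypothetical command names (`check_swap`, `check_procs_regex`) and placeholder plugin paths in the sample.

```python
def parse_nrpe_config(text):
    """Parse 'key=value' lines from NRPE config text, skipping comments and blanks."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)  # split on the first '=' only
        settings[key.strip()] = value.strip()
    return settings


def missing_items(settings, hosts, commands):
    """Return the hosts absent from allowed_hosts plus the commands with no definition."""
    allowed = {
        h.strip() for h in settings.get("allowed_hosts", "").split(",") if h.strip()
    }
    missing = [h for h in hosts if h not in allowed]
    missing += [c for c in commands if f"command[{c}]" not in settings]
    return missing


# Hypothetical merged contents of the two config files; paths are placeholders.
SAMPLE = """
# allowed hosts
allowed_hosts=127.0.0.1,admin1.infra.scl1.mozilla.com,nagios1.private.releng.scl3.mozilla.com
command[check_swap]=/path/to/check_swap -w 20% -c 10%
command[check_procs_regex]=/path/to/check_procs -c 1:1 --ereg-argument-array=opsiconfd
"""

HOSTS = ["admin1.infra.scl1.mozilla.com", "nagios1.private.releng.scl3.mozilla.com"]
COMMANDS = ["check_swap", "check_procs_regex"]
```

Calling `missing_items(parse_nrpe_config(SAMPLE), HOSTS, COMMANDS)` returns an empty list when everything is present; on a real machine the text of both config files would be read and concatenated first.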
Assignee: nobody → arich
Status: NEW → RESOLVED
Closed: 12 years ago
Component: Release Engineering: Machine Management → Server Operations: RelEng
QA Contact: armenzg → arich
Resolution: --- → FIXED
Updated•11 years ago
Component: Server Operations: RelEng → RelOps
Product: mozilla.org → Infrastructure & Operations