Closed
Bug 651558
Opened 13 years ago
Closed 13 years ago
Re-enable nagios checks for geriatric Mac slaves
Categories
(Infrastructure & Operations :: RelOps: General, task, P3)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: coop, Assigned: arich)
References
Details
I just compiled and installed PPC versions of the nagios plugins on the geriatric Mac slaves in bug 578234, so we should re-enable whatever checks we can for these slaves.

The affected slaves are:
* bm-xserve0[1-5]
* g4-leopard01
Assignee
Updated•13 years ago
Assignee: server-ops-releng → arich
Assignee
Comment 1•13 years ago
I've added the checks for bm-xserve0[1-5] back in, but the DNS information for g4-leopard01 disagrees when it comes to PTR and A record, so the nagios config build scripts will not handle it correctly.

host g4-leopard01.build
g4-leopard01.build.mozilla.org is an alias for g4-leopard1.build.mtv1.mozilla.com.
g4-leopard1.build.mtv1.mozilla.com has address 10.250.48.73

host 10.250.48.73
73.48.250.10.in-addr.arpa domain name pointer g4-leopard01.mv.mozilla.com.

The A and PTR need to match. I'm not sure what the correct hostname/datacenter designator should be for this host. Does someone have more information?
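For reference, the A/PTR mismatch described above can be checked with a few lines of Python. This is only a minimal sketch: the function names `normalize`, `records_match`, and `check_host` are made up for illustration, and it is not how the nagios config build scripts actually do the comparison. `check_host` needs working DNS from wherever it runs.

```python
import socket


def normalize(name):
    """Lower-case a DNS name and drop any trailing dot."""
    return name.rstrip(".").lower()


def records_match(canonical_name, ptr_name):
    """Return True when the name holding the A record equals the PTR target."""
    return normalize(canonical_name) == normalize(ptr_name)


def check_host(hostname):
    """Resolve hostname forward, then reverse each address (hypothetical helper).

    Returns a dict mapping each address to True/False depending on whether
    its PTR record points back at the canonical forward name.
    """
    canonical, _aliases, addresses = socket.gethostbyname_ex(hostname)
    results = {}
    for address in addresses:
        ptr_name = socket.gethostbyaddr(address)[0]
        results[address] = records_match(canonical, ptr_name)
    return results
```

With the names from the `host` output above, `records_match("g4-leopard1.build.mtv1.mozilla.com", "g4-leopard01.mv.mozilla.com.")` is false, which is the disagreement the config build scripts trip over.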
Status: NEW → ASSIGNED
Assignee
Comment 2•13 years ago
Many of the checks failed due to lack of NRPE definitions. I've acked them for now until we can decide what to do with them.
Comment 3•13 years ago
As for g4-leopard01, let's follow the usual slave pattern:

$ORIGIN build.mozilla.org.
g4-leopard01 IN CNAME g4-leopard01.build.mtv1.mozilla.com.

$ORIGIN build.mtv.mozilla.com.
g4-leopard01 IN A 10.250.48.73

$ORIGIN 48.250.10.in-addr.arpa.
73 IN PTR g4-leopard01.build.mtv1.mozilla.com.

As for the NRPE checks, here's the list for future reference:

bm-xserve01.build:buildbot is CRITICAL: CHECK_NRPE: Error - Could not complete SSL handshake.
bm-xserve01.build:disk - / is CRITICAL: CHECK_NRPE: Error - Could not complete SSL handshake.
bm-xserve02.build:hung slave is CRITICAL: NRPE: Command check_file_age not defined
bm-xserve03.build:buildbot is WARNING: PROCS WARNING: 0 processes with command name python, args buildbot.tac
bm-xserve04.build:hung slave is CRITICAL: NRPE: Command check_file_age not defined
bm-xserve05.build:hung slave is CRITICAL: NRPE: Command check_file_age not defined

It looks like these systems aren't using runslave.py, either (hence the PROCS WARNING). We should probably fix that - bug 652125.

These boxes don't run puppet. That's probably not worth fixing.
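To see at a glance which failures share a cause, the alert lines above can be grouped by error message. The following is a rough sketch that assumes every line has the `host:service is STATE: message` shape seen in this comment; `ALERT_RE` and `group_alerts` are hypothetical names, not part of nagios or NRPE.

```python
import re
from collections import defaultdict

# Assumed line shape, taken from the list in this comment:
#   <host>:<service> is <STATE>: <message>
ALERT_RE = re.compile(
    r"^(?P<host>[^:]+):(?P<service>.+?) is "
    r"(?P<state>CRITICAL|WARNING): (?P<message>.*)$"
)

ALERTS = [
    "bm-xserve01.build:buildbot is CRITICAL: CHECK_NRPE: Error - Could not complete SSL handshake.",
    "bm-xserve01.build:disk - / is CRITICAL: CHECK_NRPE: Error - Could not complete SSL handshake.",
    "bm-xserve02.build:hung slave is CRITICAL: NRPE: Command check_file_age not defined",
    "bm-xserve03.build:buildbot is WARNING: PROCS WARNING: 0 processes with command name python, args buildbot.tac",
    "bm-xserve04.build:hung slave is CRITICAL: NRPE: Command check_file_age not defined",
    "bm-xserve05.build:hung slave is CRITICAL: NRPE: Command check_file_age not defined",
]


def group_alerts(lines):
    """Map each distinct message to the (host, service) pairs reporting it."""
    groups = defaultdict(list)
    for line in lines:
        match = ALERT_RE.match(line.strip())
        if match is None:
            continue  # skip anything that does not look like an alert line
        groups[match.group("message")].append(
            (match.group("host"), match.group("service"))
        )
    return dict(groups)


if __name__ == "__main__":
    for message, sources in group_alerts(ALERTS).items():
        print(message)
        for host, service in sources:
            print("   ", host, "/", service)
```

Grouped this way, the list falls into three buckets: the SSL handshake failure on bm-xserve01, the missing `check_file_age` command definition on bm-xserve02, 04, and 05, and the buildbot process warning on bm-xserve03.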
Assignee
Comment 4•13 years ago
Coop, I've fixed all of these up except for g4-leopard01. It doesn't appear to be running nrpe, and I don't know the root or cltbld passwd, so I can't get in to take a look. Do you have a way I can log in?
Assignee
Comment 5•13 years ago
This is all set.
Status: ASSIGNED → RESOLVED
Closed: 13 years ago
Resolution: --- → FIXED
Updated•11 years ago
Component: Server Operations: RelEng → RelOps
Product: mozilla.org → Infrastructure & Operations