Closed Bug 1207411 Opened 9 years ago Closed 9 years ago

nagios alerts for bm07, bm08, bm124, bm125

Categories

(Infrastructure & Operations :: RelOps: General, task)

task
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: kmoir, Assigned: arich)

Not sure if this is the right bucket.

Can you add the following 4 masters to nagios? Buildbot is not currently up on them, as I'm still going through the steps to add them.

buildbot-master07.bb.releng.usw2.mozilla.com
buildbot-master08.bb.releng.use1.mozilla.com
buildbot-master124.bb.releng.use1.mozilla.com
buildbot-master125.bb.releng.usw2.mozilla.com

The following checks are needed:
 Command Queue
 MySQL Connectivity
 PING
 Pulse Queue
 buildbot
 disk - /
 load
 procs - command_runner
 procs - pulse_publisher

No need to check swap.
Anyone in releng has the ability to add basic nagios checks like this now. I've added the above hosts to the use1 and usw2 buildbot-master groups, which gives them all the same checks as the other buildbot masters in their region. If you want me to show you how to do this in the future, let me know.
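
For illustration only, the region-to-group mapping described above can be sketched in Python. The group name pattern (`<region>-buildbot-masters`) and the `group_by_region` helper are assumptions made for this example; the real group names and the tooling used to edit them are not stated in this bug.

```python
from collections import defaultdict

# The four masters requested in this bug.
HOSTS = [
    "buildbot-master07.bb.releng.usw2.mozilla.com",
    "buildbot-master08.bb.releng.use1.mozilla.com",
    "buildbot-master124.bb.releng.use1.mozilla.com",
    "buildbot-master125.bb.releng.usw2.mozilla.com",
]


def group_by_region(hosts):
    """Map each host to a per-region group, using the region label in its FQDN.

    The group name format is hypothetical; only the regions (use1, usw2)
    come from the bug itself.
    """
    groups = defaultdict(list)
    for host in hosts:
        # FQDN layout: <name>.bb.releng.<region>.mozilla.com
        region = host.split(".")[3]
        groups[f"{region}-buildbot-masters"].append(host)
    return dict(groups)


if __name__ == "__main__":
    for group, members in sorted(group_by_region(HOSTS).items()):
        print(group, members)
```

Because checks are attached to the group rather than to each host, adding a host to its region's group is enough for it to inherit the same checks as the existing masters.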

As you point out, several things are not configured/running yet, so the hosts have been downtimed for 7 days. Delete the downtime when you're ready to put them in production (or extend it if you need more than 7 days).
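
As a rough sketch of what a 7-day downtime looks like, the snippet below builds Nagios `SCHEDULE_HOST_DOWNTIME` external command lines for the four hosts. How the downtime was actually set here (web UI, bot, or command file) is not stated in this bug; the `downtime_commands` helper, the author and the comment text are hypothetical, and the host names are assumed to match what Nagios has configured.

```python
from datetime import datetime, timedelta, timezone

# The four masters requested in this bug.
HOSTS = [
    "buildbot-master07.bb.releng.usw2.mozilla.com",
    "buildbot-master08.bb.releng.use1.mozilla.com",
    "buildbot-master124.bb.releng.use1.mozilla.com",
    "buildbot-master125.bb.releng.usw2.mozilla.com",
]


def downtime_commands(hosts, start, days=7,
                      author="relops", comment="not yet in production"):
    """Build one SCHEDULE_HOST_DOWNTIME line per host.

    Field order follows the Nagios external command format:
    host;start;end;fixed;trigger_id;duration;author;comment
    The author and comment defaults are placeholders for this example.
    """
    start_ts = int(start.timestamp())
    end_ts = int((start + timedelta(days=days)).timestamp())
    duration = end_ts - start_ts  # seconds; 604800 for 7 days
    return [
        f"[{start_ts}] SCHEDULE_HOST_DOWNTIME;{host};{start_ts};{end_ts};"
        f"1;0;{duration};{author};{comment}"
        for host in hosts
    ]


if __name__ == "__main__":
    for line in downtime_commands(HOSTS, datetime.now(timezone.utc)):
        print(line)
```

Note that this command covers host checks only. Service checks on the same host would need a separate command such as `SCHEDULE_HOST_SVC_DOWNTIME`, which takes the same fields.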
Assignee: relops → arich
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED