nagios alerts for bm07, bm08, bm124, bm125

RESOLVED FIXED

Status

Infrastructure & Operations
RelOps
2 years ago

People

(Reporter: kmoir, Assigned: arr)

(Reporter)

Description

2 years ago
Not sure if this is the right bucket.

Can you add the following 4 masters to nagios? Buildbot is not currently up on them, as I'm still going through the steps to add them.

buildbot-master07.bb.releng.usw2.mozilla.com
buildbot-master08.bb.releng.use1.mozilla.com
buildbot-master124.bb.releng.use1.mozilla.com
buildbot-master125.bb.releng.usw2.mozilla.com

The following checks are needed:
 Command Queue	
 MySQL Connectivity
 PING
 Pulse Queue
 buildbot
 disk - /
 load
 procs - command_runner
 procs - pulse_publisher

No need to check swap.

(Assignee)

Comment 1

2 years ago
Anyone in releng has the ability to add basic nagios checks like this now. I've added the above hosts to the use1 and usw2 buildbot-master groups, which gives them all the same checks as the other buildbot masters in their region. If you want me to show you how to do this in the future, let me know.
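For anyone following along later, here is a rough sketch of the grouping idea: hosts are sorted into a per-region group by the region label in their hostname, and group membership is what pulls in the checks. This is only an illustration. The hostgroup names and the helper functions are made up for the example, and the real configuration files are not shown in this bug.

```python
from collections import defaultdict

# The four masters from the description.
HOSTS = [
    "buildbot-master07.bb.releng.usw2.mozilla.com",
    "buildbot-master08.bb.releng.use1.mozilla.com",
    "buildbot-master124.bb.releng.use1.mozilla.com",
    "buildbot-master125.bb.releng.usw2.mozilla.com",
]


def region_of(fqdn):
    """Return the region label, assuming <name>.bb.releng.<region>.mozilla.com."""
    return fqdn.split(".")[3]


def group_by_region(hosts):
    """Map each region to the list of hosts in it, preserving input order."""
    groups = defaultdict(list)
    for host in hosts:
        groups[region_of(host)].append(host)
    return dict(groups)


def render_hostgroup(region, members):
    """Render a Nagios hostgroup object definition.

    The hostgroup_name used here is hypothetical, not the real group name.
    """
    name = f"{region}-buildbot-masters"
    return (
        "define hostgroup {\n"
        f"    hostgroup_name  {name}\n"
        f"    members         {','.join(members)}\n"
        "}\n"
    )


if __name__ == "__main__":
    for region, members in sorted(group_by_region(HOSTS).items()):
        print(render_hostgroup(region, members))
```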

As you point out, several things are not configured/running yet, so the hosts have been downtimed for 7 days. Delete the downtime when you're ready to put them in production (or extend it if you need more than 7 days).
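
A 7-day downtime like the one above can also be expressed as a Nagios external command. The sketch below only builds the command line for SCHEDULE_HOST_DOWNTIME (fixed downtime, no trigger); the author and comment values are placeholders, and how the downtime was actually set for these hosts is not stated in this bug.

```python
import time

SECONDS_PER_DAY = 86400


def host_downtime_command(host, start, days, author, comment):
    """Build a SCHEDULE_HOST_DOWNTIME external command line.

    Field order: host;start;end;fixed;trigger_id;duration;author;comment.
    fixed=1 means the downtime runs exactly from start to end.
    """
    end = start + days * SECONDS_PER_DAY
    duration = end - start
    return (
        f"[{start}] SCHEDULE_HOST_DOWNTIME;"
        f"{host};{start};{end};1;0;{duration};{author};{comment}"
    )


if __name__ == "__main__":
    now = int(time.time())
    # Placeholder author and comment, for illustration only.
    print(host_downtime_command(
        "buildbot-master07.bb.releng.usw2.mozilla.com",
        now, 7, "someuser", "not yet in production",
    ))
```

Note that this command covers the host itself; SCHEDULE_HOST_SVC_DOWNTIME takes the same fields and covers the services on the host.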
Assignee: relops → arich
Status: NEW → RESOLVED
Last Resolved: 2 years ago
Resolution: --- → FIXED