Closed Bug 1156773 Opened 6 years ago Closed 5 years ago

figure out nagios checks for buildbot bridge

Categories

(Release Engineering :: General, defect)

Type: defect
Priority: Not set
Severity: normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: bhearsum, Assigned: bhearsum)

References

Details

(Whiteboard: [bbb])

Attachments

(2 files, 2 obsolete files)

Buildbot bridge is being deployed in bug 1154423, but if any of its services get stuck we'll have no way of knowing. At the very least, I think we need alive checks on the services.
Whiteboard: [bbb]
Attachment #8610216 - Flags: review?(dustin) → review+
Attachment #8610216 - Flags: checked-in+
I pretty much just copied how the selfserve-agent checks work to make this patch. It seemed to have the exact same requirements (a custom check configured on a small set of buildbot masters).
Attachment #8610698 - Flags: review?(arich)
Comment on attachment 8610698
add buildbot bridge nagios checks to nagios server

It looks like this is just a general check_procs command, so I don't think there's a need for a custom check definition on the client or server.

check_procs_regex is already defined as:

$plugins_dir/check_procs -c \$ARG2\$:\$ARG3\$ --ereg-argument-array=\$ARG1\$

Can we use the regex match in this case?
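For readers unfamiliar with the check, here is a minimal Python sketch of what a `check_procs -c MIN:MAX --ereg-argument-array=REGEX` call appears to do: count the processes whose argument string matches the regex, and go critical when the count falls outside the range. It is an approximation, not the plugin's implementation. The function name, the sample argument strings, and the 1:3 thresholds are all hypothetical, and Python's `re` syntax is not identical to the POSIX extended regexes the plugin uses.

```python
import re


def check_procs_regex(process_args, pattern, crit_min, crit_max):
    """Approximate check_procs with a critical range and an argument regex.

    process_args: one string per running process (its full argument list).
    Returns (exit_code, status_line) following Nagios plugin conventions:
    0 = OK, 2 = CRITICAL.
    """
    regex = re.compile(pattern)
    # Count processes whose argument string matches the regex anywhere.
    count = sum(1 for args in process_args if regex.search(args))
    # A MIN:MAX range alerts when the count falls outside [MIN, MAX].
    in_range = crit_min <= count <= crit_max
    state, code = ("OK", 0) if in_range else ("CRITICAL", 2)
    return code, f"PROCS {state}: {count} processes with regex args '{pattern}'"


# Hypothetical process table: three bridge processes plus an unrelated one.
sample_processes = [
    "/builds/bbb/bin/python /builds/bbb/bin/buildbot-bridge service-a",
    "/builds/bbb/bin/python /builds/bbb/bin/buildbot-bridge service-b",
    "/builds/bbb/bin/python /builds/bbb/bin/buildbot-bridge service-c",
    "/usr/sbin/sshd -D",
]

# Hypothetical thresholds: critical unless between 1 and 3 matches.
exit_code, status = check_procs_regex(
    sample_processes, "/builds/bbb/bin/buildbot-bridge", 1, 3
)
print(exit_code, status)
```

Because the match is on the argument string rather than the process name, one generic command can cover any service just by changing the regex, which is why no custom check definition is needed.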
Attachment #8610698 - Flags: review?(arich) → review-
(In reply to Amy Rich [:arich] [:arr] from comment #3)
> Comment on attachment 8610698
> add buildbot bridge nagios checks to nagios server
> 
> It looks like this is just a general check_procs command, so I don't think
> there's a need for a custom check definition on the client or server.
> 
> check_procs_regex is already defined as:
> 
> $plugins_dir/check_procs -c \$ARG2\$:\$ARG3\$ --ereg-argument-array=\$ARG1\$
> 
> Can we use the regex match in this case?

Hm, yeah, that should be fine. I'm not sure why I did it any other way now!
Attached patch: use check procs instead (obsolete)
Gotta switch up the PuppetAgain configs to fix the nagios config on the masters.
Attachment #8610745 - Flags: review?(arich)
Comment on attachment 8610745
use check procs instead

Er, wait...I don't even need this now, I think...
Attachment #8610745 - Attachment is obsolete: true
Attachment #8610745 - Flags: review?(arich)
Okay! After much back and forth I think I finally have this right. If I've understood correctly, I should be able to back out my PuppetAgain patch as well.
Attachment #8610698 - Attachment is obsolete: true
Attachment #8610761 - Flags: review?(arich)
Comment on attachment 8610761
use check procs, really

Yes, this is a much cleaner way to do it, and it means you can back out the puppet change that copies a custom check out to the releng machines.
Attachment #8610761 - Flags: review?(arich) → review+
Attachment #8610216 - Flags: checked-in+ → checked-in-
Attachment #8610761 - Flags: checked-in+
Amy, it looks like this needs to be deployed manually (based on my read of https://mana.mozilla.org/wiki/display/SYSADMIN/Nagios#Nagios-OntheNagiosserver) - could you do that?
Flags: needinfo?(arich)
Comment on attachment 8610761
use check procs, really

Needed a bustage fix because:
08:43 < arr> it's check_procs_regex on the local machine, but for whatever reason, the command listed in manifests/mozilla/checkcommands.pp is called check_nrpe_procs_regex
Also Amy told me that this sort of thing autodeploys!
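To illustrate the naming mismatch behind the bustage: the Nagios server typically reaches a client-side command through NRPE, so the server-side command (`check_nrpe_procs_regex`) and the command on the local machine (`check_procs_regex`) carry different names. The sketch below builds the kind of `check_nrpe` call such a wrapper would presumably make. The actual definition in manifests/mozilla/checkcommands.pp is not shown in this bug, so the function, hostname, and thresholds are assumptions; only the `-H`, `-c`, and `-a` options of `check_nrpe` are taken as given.

```python
def build_nrpe_call(host, regex, crit_min, crit_max, check_nrpe="check_nrpe"):
    """Build an argv list asking the NRPE agent on `host` to run its
    locally defined check_procs_regex command.

    The three values after -a are passed to the remote command as
    $ARG1$ (regex), $ARG2$ (minimum count), and $ARG3$ (maximum count).
    """
    return [
        check_nrpe,
        "-H", host,                 # client machine running the NRPE agent
        "-c", "check_procs_regex",  # command name as defined on the client
        "-a", regex, str(crit_min), str(crit_max),
    ]


# Hypothetical hostname and thresholds, for illustration only.
argv = build_nrpe_call(
    "buildbot-master.example.com", "/builds/bbb/bin/buildbot-bridge", 1, 3
)
print(" ".join(argv))
```

A service definition on the server has to reference the server-side name, while the name passed with `-c` has to match what the client defines. Mixing the two up is the sort of error the bustage fix addressed.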
Flags: needinfo?(arich)
And the new checks are passing \o/:
PROCS OK: 3 processes with regex args '/builds/bbb/bin/buildbot-bridge'
Status: NEW → RESOLVED
Closed: 5 years ago
Resolution: --- → FIXED
Component: General Automation → General