Closed Bug 649007 Opened 14 years ago Closed 13 years ago

allow for setting downtimes through the nagios bot by alert number

Categories

(mozilla.org Graveyard :: Server Operations, task)

x86_64
Linux
task
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: bhearsum, Assigned: rtucker)

Details

The ability to set downtimes through the nagios bot by service name is really awesome. In addition to that, it would be helpful to set them by alert number, too, as that's often the thing that prompts a person to set one.
Assignee: server-ops → justdave
Assignee: justdave → rtucker
justdave: Going to punt this to you. If you want to give me instructions, I'll happily take care of it. I don't want to mess with the bots since you just rewrote them. It may well take longer for you to tell me what to do than to just do it yourself, though.
Assignee: rtucker → justdave
All that stuff about me taking over the bots turned out to be short-lived.
Assignee: justdave → rtucker
I've implemented this as requested. Here is the usage and output:
rtucker> nagios-phx1-dev: downtime 100 1m testing quick downtime
nagios-phx1-dev> rtucker: Downtime for sp-processor05.phx1:File Age - /var/log/socorro/socorro-processor.log scheduled for 0:01:00
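For readers unfamiliar with the command shape, here is a minimal sketch of how a bot might handle a downtime request keyed by alert number. It is not the actual bot code: the `recent_alerts` table, the `UNITS` suffix map, and the function names are all hypothetical, and the real bot's accepted duration suffixes are not stated in this bug. It only assumes the `downtime <alert number> <duration> <comment>` form shown in the usage above.

```python
import re
from datetime import timedelta

# Hypothetical suffix table; the real bot's accepted units are an assumption here.
UNITS = {"m": "minutes", "h": "hours", "d": "days", "w": "weeks"}

# Hypothetical lookup from alert number (the bracketed number the bot prints
# in channel) to the "host:service" string that alert referred to.
recent_alerts = {
    100: "sp-processor05.phx1:File Age - /var/log/socorro/socorro-processor.log",
}

# downtime <alert number> <amount><unit> <comment>
COMMAND = re.compile(r"^downtime\s+(\d+)\s+(\d+)([mhdw])\s+(.+)$")


def parse_duration(amount, unit):
    """Turn an amount and a one-letter unit into a timedelta."""
    return timedelta(**{UNITS[unit]: int(amount)})


def handle_downtime(message, alerts):
    """Resolve the alert number and build the bot's reply text."""
    match = COMMAND.match(message.strip())
    if match is None:
        return "Usage: downtime <alert number> <duration> <comment>"
    number, amount, unit, _comment = match.groups()
    target = alerts.get(int(number))
    if target is None:
        return f"Unknown alert number {number}"
    duration = parse_duration(amount, unit)
    # A real bot would submit the downtime to Nagios here; this sketch only replies.
    return f"Downtime for {target} scheduled for {duration}"


print(handle_downtime("downtime 100 1m testing quick downtime", recent_alerts))
```

In this sketch the reply ends in `0:01:00` because that is how Python renders a one-minute `timedelta` as a string.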
Status: NEW → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED
Awesome, thank you!
Hmm, this doesn't seem to be working in #build:
[07:04] <nagios-sjc1> [89] dc01.winbuild.scl1:Windows Services is CRITICAL: CRITICAL: clr_optimization_v4.0.30319_32: stopped (critical), clr_optimization_v4.0.30319_64: stopped (critical), ShellHWDetection: stopped (critical), sppsvc: stopped (critical)
[07:04] <bhearsum|buildduty> nagios-sjc1: downtime 89 1w check not fully working yet
I waited a couple of minutes, and there was no response. Not sure if it's related, but this happened shortly after:
[07:06] <nagios-sjc1> dc01.winbuild.scl1:Windows Services is ACKNOWLEDGEMENT (CRITICAL): CRITICAL: clr_optimization_v4.0.30319_32: stopped (critical), clr_optimization_v4.0.30319_64: stopped (critical), ShellHWDetection: stopped (critical), sppsvc: stopped (critical);arr;check not working yet
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
ben: It's fixed in the new release of the bot, which hasn't been propagated to your channel yet. Notice in my usage example that I'm calling nagios-phx1-dev. We're doing some final testing of the new bot and it will be deployed soon. Sorry that I didn't make that clear when I closed it.
No worries, thanks for the info!
this is complete since the python bots are in prod
Status: REOPENED → RESOLVED
Closed: 14 years ago → 13 years ago
Resolution: --- → FIXED
Product: mozilla.org → mozilla.org Graveyard