Closed Bug 565156 Opened 14 years ago Closed 14 years ago

Add rabbitmq to munin

Categories

(Infrastructure & Operations Graveyard :: WebOps: Other, task)


Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: clouserw, Assigned: oremj)

Details

RabbitMQ is running on the gearman boxes (or at least one of them). We should send its stats to munin so we can monitor it over time:

http://github.com/ask/rabbitmq-munin/

(I'm assuming nagios is already making sure it's running, but if that's not the case, please set that up too.)
Assignee: server-ops → jeremy.orem+bugs
Installed munin plugins.  What about rabbitmq should I be monitoring, just a process or tcp check?
(In reply to comment #1)
> Installed munin plugins.  What about rabbitmq should I be monitoring, just a
> process or tcp check?

I think nagios should be monitoring the same stuff munin is. Munin is just execing rabbitmqctl, and the scripts it's running are almost good enough for nagios as they are; they even have warn/crit thresholds. They are all at http://github.com/ask/rabbitmq-munin
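To make the "almost good enough for nagios" idea concrete, here is a minimal sketch of a nagios-style check built on the same rabbitmqctl call. It is not taken from the rabbitmq-munin scripts: the function names, the thresholds, and the assumption that `rabbitmqctl list_queues` prints tab-separated `name<TAB>messages` rows between banner lines are all illustrative.

```python
import subprocess

# Standard nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def parse_queue_depths(output):
    """Parse `name<TAB>messages` rows; banner lines without a tab are skipped."""
    depths = {}
    for line in output.splitlines():
        parts = line.split("\t")
        if len(parts) == 2 and parts[1].strip().isdigit():
            depths[parts[0]] = int(parts[1])
    return depths


def classify(total, warn, crit):
    """Upper-bound thresholds: more queued messages is worse."""
    if total >= crit:
        return CRITICAL
    if total >= warn:
        return WARNING
    return OK


def check_queue_depth(vhost, warn, crit):
    """Hypothetical check: returns (exit_code, one-line status message)."""
    try:
        output = subprocess.check_output(
            ["rabbitmqctl", "list_queues", "-p", vhost, "name", "messages"],
            universal_newlines=True,
        )
    except (OSError, subprocess.CalledProcessError) as exc:
        return UNKNOWN, "UNKNOWN - rabbitmqctl failed: %s" % exc
    total = sum(parse_queue_depths(output).values())
    status = classify(total, warn, crit)
    label = ["OK", "WARNING", "CRITICAL"][status]
    return status, "%s - %d messages queued in vhost %s" % (label, total, vhost)
```

A wrapper script would print the message and exit with the returned code, which is all nagios needs. The warn and crit values would have to be chosen per queue.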
What should the action be if nagios goes off? I need to write docs for the other admins.
If the worker count is below the threshold, start more workers and then figure out why they disappeared.

If the queue is too long, order more hardware, I guess. Also let webdev know so we can throttle back unimportant jobs.

I think amo-developers should get these pages too.
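As a possible starting point for the admin docs, the responses above can be written down as a small lookup. This is only a sketch: the threshold names and numbers are made up, and measuring "workers" as the consumer count reported by `rabbitmqctl list_queues name consumers` is an assumption.

```python
OK, WARNING, CRITICAL = 0, 1, 2

# Runbook text paraphrased from the comment above.
ACTIONS = {
    "workers": "Start more workers, then figure out why they disappeared.",
    "backlog": "Let webdev know so unimportant jobs can be throttled; "
               "consider more hardware.",
}


def classify_workers(consumers, warn_below, crit_below):
    """Lower-bound thresholds: fewer consumers is worse."""
    if consumers < crit_below:
        return CRITICAL
    if consumers < warn_below:
        return WARNING
    return OK


def classify_backlog(messages, warn_at, crit_at):
    """Upper-bound thresholds: more queued messages is worse."""
    if messages >= crit_at:
        return CRITICAL
    if messages >= warn_at:
        return WARNING
    return OK


def advise(consumers, messages, limits):
    """Return the runbook lines that apply to the current readings."""
    advice = []
    if classify_workers(consumers, limits["workers_warn"],
                        limits["workers_crit"]) != OK:
        advice.append(ACTIONS["workers"])
    if classify_backlog(messages, limits["backlog_warn"],
                        limits["backlog_crit"]) != OK:
        advice.append(ACTIONS["backlog"])
    return advice
```

Keeping the two checks separate matters because they fail in opposite directions: too few workers versus too many messages.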
Only the connections graph is working in munin; all the rest are blank. If you run the commands manually, do they execute? If you're running them through sudo, don't forget to add rabbitmqctl to the commands it is allowed to run.
I didn't have "env.vhost vhostname" set. I was hoping that by default it would just graph all vhosts. Kind of lame that it will only do one.
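One way around the single-vhost limit would be to enumerate the vhosts first and handle each one separately. The sketch below assumes that `rabbitmqctl list_vhosts` prints one vhost per line between a "Listing vhosts ..." banner and a "...done." trailer; that output shape varies by RabbitMQ version, and the function names are hypothetical.

```python
import subprocess


def parse_vhosts(output):
    """Return vhost names, skipping blank lines and the assumed banner lines."""
    names = []
    for line in output.splitlines():
        line = line.strip()
        if line and not line.startswith("Listing ") and line != "...done.":
            names.append(line)
    return names


def list_vhosts():
    """Ask the local broker for its vhosts (needs permission to run rabbitmqctl)."""
    output = subprocess.check_output(
        ["rabbitmqctl", "list_vhosts"], universal_newlines=True
    )
    return parse_vhosts(output)
```

Each returned name could then get its own plugin configuration with a matching env.vhost value.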
Turns out these plugins don't work with celery at all. They expect just a couple of queues to exist, and celery has created over 7,000 queues.
Can we try these again now that we're not creating tons of result queues? (bug 567932)
Status: NEW → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED
Component: Server Operations: Web Operations → WebOps: Other
Product: mozilla.org → Infrastructure & Operations
Product: Infrastructure & Operations → Infrastructure & Operations Graveyard