Closed Bug 714451 Opened 13 years ago Closed 13 years ago

MOAR monitoring for dm-symbolpush01

Categories

(Infrastructure & Operations :: RelOps: General, task)

Type: task
Priority: Not set
Severity: normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: catlee, Assigned: arich)

Something funky is going on when builds try to unpack symbols onto dm-symbolpush01. It could be disk related, or perhaps the host is running out of memory, or something else entirely.

It's not currently monitored by ganglia, and the only nagios check is for /mnt/netapp/breakpad.

Can we hook this up to ganglia and add more nagios checks, e.g. for free space on other drives, load, and free memory/swap?

Thanks!
Assignee: server-ops-releng → arich
I've added ganglia support for it at:

https://ganglia.mozilla.org/sjc1/?c=Webtools&m=load_one&r=hour&s=descending&hc=4&mc=2


It had checks for fileage, ping, disk / and disk /mnt/netapp/breakpad. There were no other drives to check, so it was already checking free space on all the available mount points.

I've removed the / disk check, set the host to use the generic check (root partition + avg_load), and added a swap check.
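
For anyone wanting to spot-check the same three signals by hand (load, disk usage, swap), here is a rough Python sketch in the style of a Nagios plugin. It is not the check that was actually deployed: the function names and all thresholds are made up for illustration, and it assumes a Linux host with /proc/meminfo. The only parts taken from convention are the plugin exit codes (0 OK, 1 WARNING, 2 CRITICAL).

```python
import os
import shutil

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL = 0, 1, 2


def classify(value, warn, crit):
    """Map a metric where higher is worse onto a Nagios status."""
    if value >= crit:
        return CRITICAL
    if value >= warn:
        return WARNING
    return OK


def disk_used_percent(path):
    """Percentage of the filesystem containing `path` that is in use."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total


def swap_used_percent(meminfo_text):
    """Compute swap usage from the text of /proc/meminfo (Linux only)."""
    fields = {}
    for line in meminfo_text.splitlines():
        key, _, rest = line.partition(":")
        parts = rest.split()
        if parts:
            fields[key.strip()] = int(parts[0])
    total = fields.get("SwapTotal", 0)
    if total == 0:
        return 0.0  # no swap configured, so nothing can be in use
    return 100.0 * (total - fields.get("SwapFree", 0)) / total


def run_checks(mount="/mnt/netapp/breakpad"):
    """Return (worst_status, per_check_status). Thresholds are hypothetical."""
    with open("/proc/meminfo") as f:
        swap = swap_used_percent(f.read())
    results = {
        "load1": classify(os.getloadavg()[0], 8.0, 16.0),
        "disk": classify(disk_used_percent(mount), 85.0, 95.0),
        "swap": classify(swap, 50.0, 80.0),
    }
    return max(results.values()), results
```

A wrapper script would call run_checks(), print a one-line summary, and exit with the returned status so that nagios can pick it up.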
Status: NEW → RESOLVED
Closed: 13 years ago
Resolution: --- → FIXED
Component: Server Operations: RelEng → RelOps
Product: mozilla.org → Infrastructure & Operations