reduce threshold at which nagios warns about disk free space

RESOLVED FIXED

Status

mozilla.org Graveyard
Server Operations
--
major
9 years ago
3 years ago

People

(Reporter: joduinn, Assigned: chizu)

Currently, nagios warns us when free disk space on a build slave drops to about 2.19G (8%).

Can you lower that threshold so that it only warns us when we have 0.5 GB free?

It's awfully close to the wire, so we'll have to react quickly if we hit this. Once we have a real fix for all the files left behind after each build on the slaves, we can revert this nagios setting.
There are two thresholds, warning and critical. I think they are set at 10% and 5% of the total disk size. What new values are you proposing?

Comment 2

9 years ago
I would say 5% and 1%, so that current critical messages become warnings, and new critical messages will actually be critical.
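
For reference, a rough sketch of what the old and proposed thresholds mean in absolute terms. The total disk size is not stated anywhere in this bug; it is an assumption inferred from the "2.19G (8%)" reading in the description, and the function names are made up for illustration. The comparison assumes an alert fires when free space is strictly below the threshold.

```python
# Assumption: total size is inferred from the "2.19G (8%)" reading in the
# description; the real disk size is not given in this bug.
FREE_GB = 2.19
FREE_FRACTION = 0.08
TOTAL_GB = FREE_GB / FREE_FRACTION  # roughly 27.4 GB


def free_gb_at(percent, total_gb=TOTAL_GB):
    """Return the free space in GB that corresponds to a percentage threshold."""
    return total_gb * percent / 100.0


def classify(free_percent, warn_percent, crit_percent):
    """Classify a free-space reading against warning/critical percentages.

    Hypothetical helper: assumes an alert fires when free space is strictly
    below the threshold.
    """
    if free_percent < crit_percent:
        return "CRITICAL"
    if free_percent < warn_percent:
        return "WARNING"
    return "OK"


if __name__ == "__main__":
    for label, warn, crit in (("old", 10, 5), ("proposed", 5, 1)):
        print(label,
              "warning below %.2f GB," % free_gb_at(warn),
              "critical below %.2f GB" % free_gb_at(crit))
    # The 8% reading from the description under each scheme.
    print(classify(8, 10, 5), classify(8, 5, 1))
```

Under that assumed disk size, the requested 0.5 GB figure works out to a bit under 2% free, so it falls between the proposed 5% warning and 1% critical levels.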
(In reply to comment #2)
> I would say 5% and 1%, so that current critical messages become warnings, and
> new critical messages will actually be critical.

wfm

Comment 4

9 years ago
The release team is OK with the new proposed thresholds (5% and 1%). 

In the interest of saving our inboxes and making nagios warnings more relevant, can we get this change sooner rather than later?
Severity: normal → major
(In reply to comment #3)
> (In reply to comment #2)
> > I would say 5% and 1%, so that current critical messages become warnings, and
> > new critical messages will actually be critical.
> 
> wfm

wfm
Do you want this done for all "disks" (/builds, C:\, D:\, E:\) nagios monitors or just certain ones?
* for win32, it's the "disk - E" check that needs adjusting, on moz2-win32-slave01 thru 20
* for linux it's "disk - /builds", on moz2-linux-slave01 thru 16 and moz2-linux64-slave01 
* for mac it's "root partition", on bm-xserve16/17/18/19/22, plus moz2-darwin9-slave02/05/06/07

Some of these machines may not be set up in nagios yet, so please copy over a set of checks from a machine with a similar host name.
(Assignee)

Updated

9 years ago
Assignee: server-ops → thardcastle
Updated machine list:
 win32: moz2-win32-slave01 thru 23
 linux: moz2-linux-slave01 thru 19, moz2-linux64-slave01
 mac:   bm-xserve16/17/18/19/22, moz2-darwin9-slave01 thru 08

Do you need any further info here, Trevor?
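
In case it helps when checking coverage, here is a minimal sketch that expands the "thru" ranges in the updated machine list into explicit host names. The helper name and the two-digit zero padding are assumptions based on the names as written in this bug, not anything taken from the nagios configuration.

```python
def expand(prefix, first, last, width=2):
    """Expand e.g. ("moz2-win32-slave", 1, 3) into slave01, slave02, slave03.

    Hypothetical helper: two-digit zero padding is assumed from the host
    names used in this bug.
    """
    return ["%s%0*d" % (prefix, width, n) for n in range(first, last + 1)]


# Hosts per platform, as given in the updated machine list.
win32 = expand("moz2-win32-slave", 1, 23)
linux = expand("moz2-linux-slave", 1, 19) + ["moz2-linux64-slave01"]
mac = ["bm-xserve%d" % n for n in (16, 17, 18, 19, 22)] \
    + expand("moz2-darwin9-slave", 1, 8)

if __name__ == "__main__":
    for platform, hosts in (("win32", win32), ("linux", linux), ("mac", mac)):
        print(platform, len(hosts), "hosts:", ", ".join(hosts))
```

The resulting lists can be compared against the hosts that currently have a disk check defined, to spot any machine that still needs one copied over.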
(Assignee)

Comment 9

9 years ago
These should all be updated/added now.
Status: NEW → RESOLVED
Last Resolved: 9 years ago
Resolution: --- → FIXED
Product: mozilla.org → mozilla.org Graveyard