Automated alert report from nagios1.private.phx1.mozilla.com: Hostname: dxradm.private.phx1.mozilla.com Service: Disk - All State: CRITICAL Output: DISK CRITICAL - free space: /data_staging 0 MB (0% inode=73%): Runbook: http://m.allizom.org/Disk+-+All
Automated alert acknowledgement: (Usul)bug 1060296
Status: NEW → ASSIGNED
This is on staging, not prod, so leaving it until the next business day. Something looks incorrect in the storage configuration. The more useful alert was:

nagios-phx1: Fri 03:07:39 PDT  dxr-processor1.private.phx1.mozilla.com:Disk - All is CRITICAL: DISK CRITICAL - free space: /data/www_staging 0 MB (0% inode=73%): (http://m.mozilla.org/Disk+-+All)

And that's a tiny 60G volume mounted over a 6T volume:

$ df -h /data /data/www_*
Filesystem                    Size  Used Avail Use% Mounted on
/dev/sda1                     6.0T  100G  5.6T   2% /data
10.8.75.249:/dxr_data/prod    140G   34G  107G  24% /data/www_prod
10.8.75.249:/dxr_data/stage    60G   60G   64K 100% /data/www_staging

So I'm assuming it's supposed to fill up and alert. Not in graphite. Over to :fubar
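For anyone wanting to reproduce the comparison in the df output above outside of Nagios, here is a minimal sketch in Python. The names `DiskState`, `classify`, and `check_mounts`, and the 10%/5% thresholds, are assumptions made for illustration; they are not the thresholds or logic of the actual Nagios check.

```python
import shutil
from typing import NamedTuple


class DiskState(NamedTuple):
    path: str
    free_mb: int
    free_pct: int
    status: str


def classify(path, total_bytes, free_bytes, warn_pct=10, crit_pct=5):
    """Classify a mount by percentage of free space.

    warn_pct and crit_pct are hypothetical thresholds, not the real check's.
    """
    free_pct = int(100 * free_bytes / total_bytes) if total_bytes else 0
    if free_pct <= crit_pct:
        status = "CRITICAL"
    elif free_pct <= warn_pct:
        status = "WARNING"
    else:
        status = "OK"
    return DiskState(path, free_bytes // (1024 * 1024), free_pct, status)


def check_mounts(paths):
    """Read live usage for each path and classify it."""
    results = []
    for p in paths:
        usage = shutil.disk_usage(p)  # fields: total, used, free (bytes)
        results.append(classify(p, usage.total, usage.free))
    return results
```

Calling `check_mounts(["/data", "/data/www_prod", "/data/www_staging"])` on the processor host would report each mount separately, which makes it obvious when a small NFS volume is full while the large local volume underneath it is nearly empty.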
Assignee: nobody → server-ops-devservices
Component: Server Operations: MOC → Server Operations: Developer Services
QA Contact: dmoore → nmaul
Created attachment 8481268
graphite of dxr_stage

Spiked after being well behaved at steady state.
I re-enabled all of the tree builds on staging, forgetting we were shorter on disk space there. Due to how the indexes are copied onto NFS and swapped, and how large the m-c and c-c indexes are, we ran out of room. I've disabled the m-c and c-c builds on staging and removed the new index from the NFS volume, so we're all set.
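A pre-flight check before the copy would catch this kind of failure earlier. The sketch below is an illustration only, assuming the swap needs the new index to sit on the NFS volume alongside the current one until the switch happens; `tree_size`, `has_room_for_swap`, and the 5% headroom are hypothetical and not part of the DXR build scripts.

```python
import os


def tree_size(root):
    """Total size in bytes of regular files under root (symlinks skipped)."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            fp = os.path.join(dirpath, name)
            if not os.path.islink(fp):
                total += os.path.getsize(fp)
    return total


def has_room_for_swap(new_index_bytes, free_bytes, headroom=0.05):
    """Return True if the target volume can hold the new index.

    The old index is still in place during the copy, so only the free
    space counts. headroom is a hypothetical safety margin.
    """
    required = int(new_index_bytes * (1 + headroom))
    return free_bytes >= required
```

A build job could call `has_room_for_swap(tree_size(new_index_dir), shutil.disk_usage(target).free)` and skip the copy with a warning when it returns False, rather than filling the volume and paging on-call.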
Assignee: server-ops-devservices → klibby
Status: ASSIGNED → RESOLVED
Last Resolved: 4 years ago
Resolution: --- → FIXED
Component: Server Operations: Developer Services → General
Product: mozilla.org → Developer Services