Socorro collectors should alert at 30% disk usage. This should give us a buffer to investigate what is causing collectors to fill up.
Alerts should report to #socorro-alerts and email@example.com.
Disk alerts for the collectors already fire earlier than the typical 10%/5% free-space thresholds used for the rest of the infra, and they already go to #socorro-alerts and to firstname.lastname@example.org.
How much earlier? We'd like to give ourselves more lead time if they're filling up.
They currently warn at 15% free space and go critical at 11% free. We can raise those thresholds if you'd like; just let us know what values to use.
Alerting at 70% free space (30% used) should do it. That may sound aggressive, but it should be enough to avoid false positives and give us a long lead time to diagnose and fix the issue.
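For reference, the "used" and "free" figures in this thread are two views of the same number: alerting at 30% used is the same as alerting at 70% free. A minimal Python sketch of that conversion, and of how a warn/critical pair expressed as free space would classify a reading, is below. The function names are illustrative only and are not part of any existing check, and the strict "below the threshold" comparison is an assumption.

```python
def used_to_free(used_pct: float) -> float:
    """Convert a 'percent used' threshold into the equivalent 'percent free'."""
    if not 0 <= used_pct <= 100:
        raise ValueError("percentage must be between 0 and 100")
    return 100 - used_pct


def classify(free_pct: float, warn_free: float, crit_free: float) -> str:
    """Classify a free-space reading against warn/critical free-space thresholds.

    Assumption: a threshold trips when free space drops strictly below it.
    """
    if free_pct < crit_free:
        return "CRITICAL"
    if free_pct < warn_free:
        return "WARNING"
    return "OK"
```

With the thresholds quoted earlier in the thread (warn at 15% free, critical at 11% free), `classify(20, 15, 11)` returns "OK" and `classify(12, 15, 11)` returns "WARNING".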
As per :ashish, I need to make check_disk_early take arguments in order to support this.
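As a rough illustration of what "taking arguments" could look like, here is a small Python sketch of a disk check whose thresholds are passed on the command line. This is not the actual check_disk_early script; the flag names (`-w`, `-c`, `-p`), the function names, and the free-space semantics are assumptions made for the example, and the defaults simply reuse the 15%/11% values quoted above. The exit codes follow the usual Nagios plugin convention (0 OK, 1 WARNING, 2 CRITICAL).

```python
import argparse
import shutil

# Conventional Nagios plugin exit codes.
OK, WARNING, CRITICAL = 0, 1, 2


def parse_args(argv=None):
    """Parse thresholds, expressed as percent free (hypothetical flags)."""
    parser = argparse.ArgumentParser(
        description="Disk check with configurable thresholds (sketch)")
    parser.add_argument("-w", "--warning", type=float, default=15.0,
                        help="warn when percent free drops below this value")
    parser.add_argument("-c", "--critical", type=float, default=11.0,
                        help="go critical when percent free drops below this value")
    parser.add_argument("-p", "--path", default="/",
                        help="filesystem path to check")
    return parser.parse_args(argv)


def check(free_pct, warn, crit):
    """Map a free-space percentage to an exit code."""
    if free_pct < crit:
        return CRITICAL
    if free_pct < warn:
        return WARNING
    return OK


def main(argv=None):
    args = parse_args(argv)
    usage = shutil.disk_usage(args.path)  # named tuple: total, used, free (bytes)
    free_pct = usage.free / usage.total * 100
    status = check(free_pct, args.warning, args.critical)
    label = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL"}[status]
    print(f"DISK {label} - {free_pct:.1f}% free on {args.path}")
    return status
```

A script entry point would call `sys.exit(main())` so the monitoring system sees the status as the process exit code. Passing the thresholds as arguments lets one check definition serve hosts that need different lead times.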
Modified check_disk_all_early; it is now set to warn at 65% full and alert critical at 70% full for the socorro processors.