Crontabber is failing on stage: https://sentry.prod.mozaws.net/operations/socorro-stage/issues/355891/

▶ awsin ec2-54-200-76-190.us-west-2.compute.amazonaws.com # stage admin
Last login: Fri Oct 7 19:31:24 2016 from 64-179-205-217.hartcom.net
grep: write error
[centos@ip-172-31-10-19 ~]$ df -h
Filesystem      Size  Used Avail Use% Mounted on
/dev/xvda1      8.0G  8.0G   20K 100% /
devtmpfs        480M     0  480M   0% /dev
tmpfs           497M     0  497M   0% /dev/shm
tmpfs           497M   57M  440M  12% /run
tmpfs           497M     0  497M   0% /sys/fs/cgroup
[centos@ip-172-31-10-19 ~]$

20K free.
[centos@ip-172-31-10-19 socorro]$ ls -lh /var/log/socorro/crontabber.log
-rw-r--r-- 1 socorro socorro 5.4G Oct 10 17:35 /var/log/socorro/crontabber.log
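The df and ls checks above can be approximated in a script. The following is a minimal sketch using only the Python standard library, not something that was run on this box; the function names disk_usage_percent and largest_files are made up for illustration.

```python
import os
import shutil


def disk_usage_percent(path="/"):
    """Return the percentage of the filesystem holding `path` that is in use."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total


def largest_files(root, limit=5):
    """Return up to `limit` (size_in_bytes, path) pairs under `root`, largest first."""
    sizes = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            try:
                sizes.append((os.path.getsize(full), full))
            except OSError:
                # The file vanished or is unreadable; skip it.
                continue
    sizes.sort(reverse=True)
    return sizes[:limit]
```

Calling disk_usage_percent("/") and largest_files("/var/log") would be the rough equivalent of the two commands in this comment, assuming the logs live under /var/log as they do here.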
Tweaking summary to talk about disk--not memory. If we were rotating that log file, we'd probably be fine.
Summary: Stage admin box out of memory → Stage admin box out of disk
This node does not receive automated updates, and thus never picked up the changes in puppet. I'll go ahead and manually set the logrotate scripts, like it's 1999.
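For readers unfamiliar with what rotation buys here, the idea is to cap the live log file and keep only a fixed number of old copies, so total disk use is bounded. The fix on this box used logrotate; the sketch below illustrates the same idea with a different tool, Python's standard-library RotatingFileHandler. It is not how crontabber or the puppet-managed logrotate config is set up, and the function name, logger name, and default sizes are assumptions chosen for illustration.

```python
import logging
import logging.handlers


def make_rotating_logger(path, name="rotation_sketch",
                         max_bytes=10 * 1024 * 1024, backup_count=5):
    """Return a logger that writes to `path` and rotates it by size.

    The live file is rolled over before it would exceed max_bytes, and at
    most backup_count old files (path.1, path.2, ...) are kept, so disk use
    stays near max_bytes * (backup_count + 1).
    """
    logger = logging.getLogger(name)  # hypothetical logger name
    logger.setLevel(logging.INFO)
    logger.propagate = False
    handler = logging.handlers.RotatingFileHandler(
        path, maxBytes=max_bytes, backupCount=backup_count
    )
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(levelname)s %(message)s")
    )
    logger.addHandler(handler)
    return logger
```

With the defaults above, the log and its backups would top out at roughly 60 MiB rather than growing to 5.4G on an 8.0G root volume.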
This chapter in "Why we love long-lived instances" has been completed! I manually updated the logrotate files on that server to match what we have in real socorro these days. The root cause, manually managed long-lived instances, is not yet resolved.
Status: NEW → RESOLVED
Last Resolved: 2 years ago
Resolution: --- → FIXED