Sat 09:05:44 PST  socorro-collector4.webapp.phx1.mozilla.com:Disk - All Early is WARNING: DISK WARNING - free space: / 67708 MB (15% inode=82%): /dev/shm 5947 MB (100% inode=99%): /boot 870 MB (93% inode=99%)
Sat 08:53:35 PST  socorro-collector3.webapp.phx1.mozilla.com:Disk - All Early is WARNING: DISK WARNING - free space: / 66572 MB (15% inode=82%): /dev/shm 5947 MB (100% inode=99%): /boot 870 MB (93% inode=99%)
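For context, the Disk - All Early check is alerting on percent free space per filesystem. A minimal sketch of the figure it reports (illustrative only; the actual Nagios check_disk plugin applies its own thresholds and options):

```shell
# free_pct: print the percentage of free space on the filesystem
# holding PATH, roughly the number in the DISK WARNING line above.
# (Hand-rolled approximation, not the real check_disk plugin.)
free_pct() {
    # df -P gives POSIX columns: blocks, used, available, capacity, mount
    df -P "$1" | awk 'NR == 2 { printf "%d\n", ($4 / $2) * 100 }'
}
```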
Assignee: tmeyarivan → nobody
Component: Hadoop/HBase Operations → Infra
Product: Mozilla Metrics → Socorro
Target Milestone: Unreviewed → ---
Crashes are backed up on disk again, e.g.:

[email@example.com primaryCrashStore]# find . -type f -name '*.dump' | wc -l
3335144

I'm not sure how to get crashmover to go back and resubmit those crashes; lars is on the road. mburns is paging jakem for assistance since he was in the trenches on this last time.
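One caveat on that one-liner: an unquoted *.dump glob is expanded by the shell before find sees it, which silently breaks if a matching file exists in the current directory. A safer sketch for counting pending dumps (the store root is passed as an argument; the real on-disk layout isn't shown in this bug):

```shell
# count_dumps: count pending crash dump files under a crash store root.
count_dumps() {
    # Quote the pattern so find, not the shell, expands it.
    find "$1" -type f -name '*.dump' | wc -l
}
```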
Commit pushed to master at https://github.com/mozilla/socorro https://github.com/mozilla/socorro/commit/c6833b40e079c56c8118e2e23535d02dc399e2ea Merge pull request #1128 from twobraids/delete_all_dumps Fixes Bug 849529
Status: NEW → RESOLVED
Last Resolved: 6 years ago
Resolution: --- → FIXED
Lars: are these actually deleted? This bug should remain open until the piled up dumps have been removed.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Related alert:
Sat 19:54:14 PST  sp-admin01.phx1.mozilla.com:Socorro Admin - cron_submitter-crash-reports.allizom.org.log is CRITICAL: FILE_AGE CRITICAL: /var/log/socorro/cron_submitter-crash-reports.allizom.org.log is 4295 seconds old and 85385563 bytes
Another (possibly) related alert:
Sun 04:27:01 PDT  sp-admin01.phx1.mozilla.com:Socorro Admin - cron_create_partitions.log is CRITICAL: FILE_AGE CRITICAL: /var/log/socorro/cron_create_partitions.log is 605152 seconds old and 0 bytes
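The FILE_AGE alerts above boil down to comparing the log's mtime against the clock; a minimal sketch of that check (hand-rolled here, not the actual Nagios plugin):

```shell
# file_age_seconds: seconds since FILE was last modified, i.e. the
# number the FILE_AGE check reports. Tries GNU stat (-c %Y) first,
# then falls back to BSD stat (-f %m).
file_age_seconds() {
    mtime=$(stat -c %Y "$1" 2>/dev/null || stat -f %m "$1")
    echo $(( $(date +%s) - mtime ))
}
```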
(In reply to Adrian Fernandez [:Aj] from comment #5)
> Another (possibly) related alert:
> Sun 04:27:01 PDT  sp-admin01.phx1.mozilla.com:Socorro Admin - cron_create_partitions.log is CRITICAL: FILE_AGE CRITICAL: /var/log/socorro/cron_create_partitions.log is 605152 seconds old and 0 bytes

This is not related. I made a mistake in the ownership of a new table. I will investigate why this is still a problem -- I was able to run the partitioner function today without an error:

breakpad=# SELECT * FROM weekly_report_partitions();
weekly_report_partitions | t
Time: 5747.353 ms
socorro-collector4.webapp.phx1 has been taken out of the Zeus pools to be cleaned up by lars.
socorro-collector1 undrained in Zeus, socorro-collector2 is drained while :lars works on it.
socorro-collector2 undrained in Zeus, socorro-collector3 is drained now.
socorro-collector3 undrained and socorro-collector5 drained, per lars's request on IRC.
socorro-collector5 undrained and socorro-collector6 drained, per lars's request on IRC.
Status: REOPENED → RESOLVED
Last Resolved: 6 years ago → 6 years ago
Resolution: --- → FIXED