Closed
Bug 1196641
Opened 9 years ago
Closed 9 years ago
node1.bagheera.metrics.scl3.mozilla.com:Disk - /data is WARNING:
Categories
(Infrastructure & Operations :: MOC: Problems, task)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: sal, Unassigned)
Details
No description provided.
Reporter
Comment 1•9 years ago
nagios-scl3> (IRC) Wed 23:45:58 PDT [5477] node1.bagheera.metrics.scl3.mozilla.com:Disk - /data is WARNING: DISK WARNING - free space: /data 513007 MB (14% inode=99%):
[root@node1.bagheera.metrics.scl3 sespinoza]# du -chx --max-depth=1 /
6.7G /usr
5.0G /var
4.0G /home
17G /
17G total
Also, :sheeri, your home directory is the largest one under /home:
[root@node1.bagheera.metrics.scl3 home]# du -hsc * | grep G | sort -nr | grep -v total
4.0G scabral
Summary: disk → node1.bagheera.metrics.scl3.mozilla.com:Disk - /data is WARNING:
Reporter
Updated•9 years ago
Flags: needinfo?(scabral)
Comment 2•9 years ago
The problem is in /data, not in /.
I removed logs older than 14 days from /data; some of them dated back to February and March 2014.
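For anyone repeating this cleanup, here is a minimal sketch of how files older than 14 days could be listed before anything is removed. The function name `list_old_files` is hypothetical, and the comment above does not say which command was actually used. It assumes GNU find.

```shell
#!/usr/bin/env bash
# Sketch only: list (do not delete) regular files under a directory that
# were last modified more than N days ago. The function name and the
# default cutoff are illustrative, not taken from the host in this bug.
list_old_files() {
    local dir="$1"
    local days="${2:-14}"
    # -xdev keeps find on a single filesystem, like du -x.
    # -mtime +N matches files whose age in whole 24-hour periods exceeds N.
    find "$dir" -xdev -type f -mtime +"$days" -print
}
```

Something like `list_old_files /data 14` would print the candidates. Once the list has been reviewed, swapping `-print` for `-delete` would remove them.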
Flags: needinfo?(scabral)
Comment 3•9 years ago
This alerted again today:
Thu 12:32:23 PDT [5258] node1.bagheera.metrics.scl3.mozilla.com:Disk - All is WARNING: DISK WARNING - free space: /data 378340 MB (10% inode=99%)
Talked with sheeri:
[12:37:05] <sheeri> ashlee: it's one node in a cluster, and other servers have more space, so I think this might just be regular growth
[12:37:29] <sheeri> I'm thinking of ack'ing the warning and worrying if it goes critical, I think this is just normal growth, and we need to be off these servers by the end of the year
Comment 4•9 years ago
It's getting a bit cozy.
Fri 01:57:39 PDT [5000] node1.bagheera.metrics.scl3.mozilla.com:Disk - /data is CRITICAL: DISK CRITICAL - free space: /data 180054
[rchilds@node1.bagheera.metrics.scl3 ~]$ sudo du -chx --max-depth=1 /data/kafka-logs/ | grep G
796G /data/kafka-logs/metrics-1
794G /data/kafka-logs/metrics-3
796G /data/kafka-logs/metrics-2
796G /data/kafka-logs/metrics-0
1.1G /data/kafka-logs/sslreports-0
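Filtering human-readable du output with `grep G` drops anything reported in T or M, and it also matches paths that merely contain a "G". A sturdier sketch, assuming GNU du and sort, ranks subdirectories by numeric size instead. The function name `largest_dirs` is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only: show the largest entries one level below a directory.
# Sizes are printed in KiB so that sort -n compares real numbers.
largest_dirs() {
    local dir="$1"
    local count="${2:-10}"
    # -x: stay on one filesystem; -k: report sizes in 1 KiB blocks.
    # The first output line is the total for the directory itself.
    du -xk --max-depth=1 "$dir" 2>/dev/null | sort -rn | head -n "$count"
}
```

For example, `largest_dirs /data/kafka-logs 5` would list the directory total followed by its four largest subdirectories.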
Comment 5•9 years ago
Alerting again:
nagios-scl3> (IRC) Fri 09:27:38 PDT [5149] node1.bagheera.metrics.scl3.mozilla.com:Disk - /data is CRITICAL: DISK CRITICAL - free space: /data 25596 MB (0% inode=99%): (http://m.mozilla.org/Disk+-+/data)
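To follow how quickly free space is shrinking across alerts like the ones pasted in this bug, the figures can be pulled out of the alert text. This is a rough sketch that assumes the "free space: <mount> <N> MB (<P>% ...)" wording seen in the alerts here. The function name `free_space_from_alerts` is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only: read alert lines on stdin and print "<free MB> <free %>"
# for each line that carries both values. Lines that are cut off before
# the "MB (..%" part are skipped rather than guessed at.
free_space_from_alerts() {
    sed -n 's/.*free space: [^ ]* \([0-9][0-9]*\) MB (\([0-9][0-9]*\)%.*/\1 \2/p'
}
```

Feeding it the alert lines saved from IRC, one per line, gives a small two-column table that is easy to compare by eye.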
Comment 6•9 years ago
Had to restart bagheera and kafka to release some of the space. Took the opportunity to patch software and firmware. Everything came up OK and things are great now:
-bash-4.1$ df -h /data
Filesystem      Size  Used Avail Use% Mounted on
/dev/sda4       3.5T  2.1T  1.3T  63% /data
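The comment above does not say why a restart released the space. One common cause, which is only an assumption for this incident, is that a process still holds deleted files open, so their blocks are not freed until it exits. A minimal Linux-only sketch for checking that before restarting; the function name `deleted_open_files` is hypothetical:

```shell
#!/usr/bin/env bash
# Sketch only: list open file descriptors whose target has been deleted
# and lived under the given path prefix. Relies on Linux /proc, where the
# symlink target of such a descriptor ends in " (deleted)".
deleted_open_files() {
    local prefix="$1"
    local fd target
    for fd in /proc/[0-9]*/fd/*; do
        # Descriptors we cannot read (other users' processes) are skipped.
        target=$(readlink "$fd" 2>/dev/null) || continue
        case "$target" in
            "$prefix"/*" (deleted)") printf '%s -> %s\n' "$fd" "$target" ;;
        esac
    done
}
```

Run as root, `deleted_open_files /data` would show which PIDs are pinning deleted files. Where lsof is installed, `lsof +L1` reports the same kind of unlinked-but-open files.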
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Assignee
Updated•8 years ago
Component: MOC: Incidents → MOC: Problems