Low disk space on admin1b.private.tpe1.mozilla.com

RESOLVED FIXED

People

(Reporter: fauweh, Assigned: fauweh)

Description

2 years ago
<@nagios-scl3> Sat 14:16:03 PDT [5317] admin1b.private.tpe1.mozilla.com:Disk - All is WARNING: DISK WARNING - free space: / 3670 MB (10% inode=95%): (http://m.mozilla.org/Disk+-+All)

I compressed a single /var/log/messages.* backup file to clear the alert temporarily.
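
The stopgap above (compressing one rotated log) could be scripted roughly as follows. This is a minimal sketch, not what was actually run on the host: `compress_file` is a hypothetical helper name, and it assumes the target is a rotated file that syslog is no longer writing to.

```python
import gzip
import os
import shutil


def compress_file(path):
    """Gzip `path` to `path + '.gz'` and remove the original.

    Returns the path of the compressed file. Intended for rotated logs
    only; compressing a file that is still being written would lose data.
    """
    gz_path = path + ".gz"
    with open(path, "rb") as src, gzip.open(gz_path, "wb") as dst:
        shutil.copyfileobj(src, dst)
    # Only drop the original once the compressed copy is fully written.
    os.remove(path)
    return gz_path
```

Note that gzip needs room for the compressed copy before the original is removed, so on a nearly full disk it makes sense to start with a file small enough to fit in the remaining free space.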

du -sh /var/log/
14G     /var/log/

du -sh /var/lib/ldap/backup*
604M    /var/lib/ldap/backup.1341247976
651M    /var/lib/ldap/backup.1346368986
758M    /var/lib/ldap/backup.1361484013
780M    /var/lib/ldap/backup.1362593162
860M    /var/lib/ldap/backup.1369151946
1.5G    /var/lib/ldap/backup.1415938512
1.8G    /var/lib/ldap/backup.1441246929
These LDAP backups need cleaning up. Can we limit how many are kept to prevent this from recurring?
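
One way to limit how many backups are kept is sketched below. It is an illustration only: `backups_to_prune` and the `keep=3` value are hypothetical, and it assumes the numeric suffix on each `backup.*` name is a Unix epoch timestamp. It only reports what would be removed and deletes nothing.

```python
def backups_to_prune(names, keep):
    """Return the backup names to delete, oldest first, keeping the `keep` newest.

    Names are expected to look like '.../backup.<epoch seconds>'; anything
    else is ignored rather than guessed at.
    """
    if keep < 0:
        raise ValueError("keep must be non-negative")
    dated = []
    for name in names:
        prefix, _, stamp = name.rpartition(".")
        if prefix.endswith("backup") and stamp.isdigit():
            dated.append((int(stamp), name))
    dated.sort()  # oldest timestamp first
    cutoff = max(len(dated) - keep, 0)
    return [name for _, name in dated[:cutoff]]


# Directory names taken from the du output above.
names = [
    "/var/lib/ldap/backup.1341247976",
    "/var/lib/ldap/backup.1346368986",
    "/var/lib/ldap/backup.1361484013",
    "/var/lib/ldap/backup.1362593162",
    "/var/lib/ldap/backup.1369151946",
    "/var/lib/ldap/backup.1415938512",
    "/var/lib/ldap/backup.1441246929",
]
for name in backups_to_prune(names, keep=3):
    print("would remove", name)
```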
Component: MOC: Incidents → Infrastructure: LDAP
QA Contact: lypulong
Mmm, no, my mistake.

Log files are full of this at high rates:

Jul 10 10:06:19 admin1b.private.tpe1.mozilla.com kernel: pciehp 0000:00:1c.4:pcie04: Card present on Slot(0-1)
Jul 10 10:06:19 admin1b.private.tpe1.mozilla.com kernel: pciehp 0000:00:1c.4:pcie04: Card not present on Slot(0-1)
Jul 10 10:06:19 admin1b.private.tpe1.mozilla.com kernel: pciehp 0000:00:1c.4:pcie04: Card present on Slot(0-1)
Jul 10 10:06:19 admin1b.private.tpe1.mozilla.com kernel: pciehp 0000:00:1c.4:pcie04: Card not present on Slot(0-1)
Jul 10 10:06:19 admin1b.private.tpe1.mozilla.com kernel: pciehp 0000:00:1c.4:pcie04: Card present on Slot(0-1)
Jul 10 10:06:19 admin1b.private.tpe1.mozilla.com kernel: pciehp 0000:00:1c.4:pcie04: Card not present on Slot(0-1)
Jul 10 10:06:19 admin1b.private.tpe1.mozilla.com kernel: pciehp 0000:00:1c.4:pcie04: Card present on Slot(0-1)
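
A flood like this can be spotted by counting messages with the timestamp and hostname stripped. The sketch below assumes the traditional syslog layout seen above (`Mon DD HH:MM:SS host tag: message`); `top_repeated_messages` is a hypothetical helper, not an existing tool.

```python
from collections import Counter


def top_repeated_messages(lines, n=5):
    """Count syslog messages ignoring timestamp and host; return the n most common."""
    counts = Counter()
    for line in lines:
        # Split off month, day, time and host; keep the rest as the message.
        parts = line.rstrip("\n").split(None, 4)
        if len(parts) < 5:
            continue  # not a line in the expected layout
        counts[parts[4]] += 1
    return counts.most_common(n)


# Reading the live log would look something like this (path as on the host):
#     with open("/var/log/messages", errors="replace") as fh:
#         for message, count in top_repeated_messages(fh):
#             print(count, message)
```

On a log dominated by one or two messages, the top entries and their counts make it clear whether the disk is filling because of a single noisy source.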

At minimum the box needs a reboot; it may also have more serious hardware problems.

:ashish, can we kick this host please?
Component: Infrastructure: LDAP → MOC: Incidents
Flags: needinfo?(ashish)
QA Contact: lypulong
Scheduled CHG0010545 for early Thu local time.
Flags: needinfo?(ashish)
Host was updated (OS-wise) and kicked.
Updates and reboot seem to have fixed the continuous log messages. Ta.
Status: ASSIGNED → RESOLVED
Last Resolved: 2 years ago
Resolution: --- → FIXED
Component: MOC: Incidents → MOC: Problems
Product: Infrastructure & Operations → Infrastructure & Operations

Updated

2 years ago
See Also: → bug 1326430