Closed Bug 930216 Opened 11 years ago Closed 10 years ago

Alert on/auto-delete stale cleanup lock files on buildbot masters

Categories

(Infrastructure & Operations Graveyard :: CIDuty, task)

x86
Linux
task
Not set
normal

Tracking

(Not tracked)

RESOLVED WORKSFORME

People

(Reporter: jhopkins, Unassigned)

Details

buildbot-master10 did not run its disk cleanup task for several months because of a stale lock file referenced by /etc/cron.d/bm10-tests1-tegra, which led to bug 930021.

Two possible options:
- alert on such stale lock files via Nagios
- auto-delete stale lock files
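
For the alerting option, a minimal sketch of a Nagios-style check could look like the following. The function name check_stale_locks, the 'lockfile.*' name pattern, and the age threshold are assumptions for illustration, not anything deployed on the masters; the return codes follow the usual Nagios plugin convention (0 OK, 2 CRITICAL, 3 UNKNOWN). It relies on find's -maxdepth and -mmin, which GNU find on Linux supports.

```shell
# Nagios-style check: report lock files older than a threshold.
# Usage: check_stale_locks LOCK_DIR MAX_AGE_MINUTES
# The 'lockfile.*' pattern is an assumption based on the names seen on the masters.
check_stale_locks() {
    lock_dir=$1
    max_age_min=$2

    if [ ! -d "$lock_dir" ]; then
        echo "UNKNOWN: $lock_dir is not a directory"
        return 3
    fi

    # List regular files directly in lock_dir whose mtime is older than the threshold.
    stale=$(find "$lock_dir" -maxdepth 1 -type f -name 'lockfile.*' -mmin +"$max_age_min")

    if [ -n "$stale" ]; then
        echo "CRITICAL: stale lock file(s): $(echo "$stale" | tr '\n' ' ')"
        return 2
    fi

    echo "OK: no lock files older than $max_age_min minutes in $lock_dir"
    return 0
}
```

A wrapper script run by the monitoring agent would call something like check_stale_locks /var/lock/cltbld 360 and exit with the function's return code.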
We have 
 @hourly cltbld find /var/lock/cltbld -name lockfile.bbdb -mmin +360 -delete
on both bm10 and a more modern master like bm52. However, I think this is a leftover from before the queue system for log upload + status db insert + pulse message, and it no longer has any effect.

On bm10, the master cleanup cron uses $HOME/lockfile.bm10-tests1-tegra_cleanup, while on bm52 it's /var/lock/cltbld/lockfile.bm52-tests1-linux_cleanup.
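
Since the cleanup locks live in different directories on the two masters and are not matched by the lockfile.bbdb pattern, an auto-delete job would have to cover both locations. A rough sketch, assuming that age alone is enough to call a lock stale (it does not check whether a cleanup is still running) and using a hypothetical function name and 'lockfile.*_cleanup' pattern:

```shell
# Delete cleanup lock files older than a threshold in each given directory.
# Usage: delete_stale_cleanup_locks MAX_AGE_MINUTES DIR...
# Assumption: a lock older than the threshold is stale; no check is made
# for a cleanup process that is still running.
delete_stale_cleanup_locks() {
    max_age_min=$1
    shift
    for lock_dir in "$@"; do
        # Skip directories that do not exist on this master.
        [ -d "$lock_dir" ] || continue
        # Print each matching lock file, then remove it.
        find "$lock_dir" -maxdepth 1 -type f -name 'lockfile.*_cleanup' \
            -mmin +"$max_age_min" -print -delete
    done
}
```

Called as delete_stale_cleanup_locks 360 "$HOME" /var/lock/cltbld, this would cover both the bm10 and bm52 layouts while leaving other lock files untouched.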
Component: Other → Platform Support
QA Contact: joduinn → coop
This only affected the older masters: those set up by hand for the mobile devices. These old masters have all since been replaced by ones set up from Puppet, and they all have the buildmaster-cron entries ported by Massimo:

http://hg.mozilla.org/build/puppet/diff/e1c695967cc0/modules/buildmaster/templates/buildmaster-cron.erb
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → WORKSFORME
Component: Platform Support → Buildduty
Product: Release Engineering → Infrastructure & Operations
Product: Infrastructure & Operations → Infrastructure & Operations Graveyard