Closed Bug 861733 Opened 12 years ago Closed 11 years ago

Clean up old release dirs on slaves

Categories

(Infrastructure & Operations Graveyard :: CIDuty, task)

x86
All
task
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: nthomas, Assigned: jhopkins)

References

Details

Attachments

(2 files)

Adding 00000 suffixes to build directories (bug 827306) also changed several release builders. The old dirs aren't cleared up by purge_builds.py because we specifically exempt them. We've now gone a full release cycle, so any old directories are taking up space and causing more churn than necessary in non-release builds. E.g. bld-centos6-hp-017 happened to run out of space today, and 66GB was freed by running

  cd /builds/slave
  ls -d rel-m-rel* | grep -v 0$ | xargs rm -r
  ls -d rel-m-beta* | grep -v 0$ | xargs rm -r

That's nearly 30% of the disk. A sample size of 1 is totally legit, right?
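For anyone wanting to preview what those two pipelines would select before deleting anything, here is a minimal Python sketch of the same filter. The function name `select_old_release_dirs` and the constant `PREFIX_PATTERNS` are hypothetical, not part of purge_builds.py; the sketch only mirrors the `ls -d ... | grep -v 0$` logic (match the two prefixes, drop names ending in `0`) and does not remove anything.

```python
import fnmatch

# Glob patterns taken from the two shell pipelines in the comment above.
PREFIX_PATTERNS = ("rel-m-rel*", "rel-m-beta*")


def select_old_release_dirs(names, patterns=PREFIX_PATTERNS):
    """Return names matching a release pattern that do NOT end in '0'.

    Hypothetical helper: equivalent to `ls -d PATTERN | grep -v 0$`,
    i.e. it keeps the old-style directories that lack the new suffix.
    """
    selected = []
    for name in names:
        matches = any(fnmatch.fnmatch(name, p) for p in patterns)
        if matches and not name.endswith("0"):
            selected.append(name)
    return selected
```

Feeding it `os.listdir("/builds/slave")` would give a dry-run list to review by hand before running the `rm -r` step.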
What if we had some code that looked in the twistd.log for lines like

  2013-04-28 14:32:57-0700 [Broker,client] I have a leftover directory 'rel-m-beta-xr-psh-mrrrs' that is not being used by the buildmaster: you can delete it now
  2013-04-28 14:32:57-0700 [Broker,client] I have a leftover directory 'rel-m-esr17-lnx-update-verify-4' that is not being used by the buildmaster: you can delete it now

from the last buildbot startup? I'm not sure if this is best in runslave or purge_builds, or even if we should patch the buildbot slave to (optionally) delete those directories.
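A rough sketch of that log-scanning idea, in Python. It assumes the message wording is exactly as in the two sample lines quoted above; `find_leftover_dirs` is a hypothetical name, not an existing function in runslave or purge_builds. It scans every line it is given, so restricting it to the last buildbot startup would still need some way of locating that point in the log, which is not attempted here.

```python
import re

# Pattern built from the sample twistd.log lines quoted in this bug.
LEFTOVER_RE = re.compile(
    r"I have a leftover directory '([^']+)' "
    r"that is not being used by the buildmaster"
)


def find_leftover_dirs(lines):
    """Return leftover directory names in first-seen order, without duplicates.

    Hypothetical helper: `lines` is any iterable of log lines, for
    example an open twistd.log file object.
    """
    seen = set()
    leftovers = []
    for line in lines:
        match = LEFTOVER_RE.search(line)
        if match and match.group(1) not in seen:
            seen.add(match.group(1))
            leftovers.append(match.group(1))
    return leftovers
```

The result is only a list of candidate names; whether to delete them from runslave, purge_builds, or the slave itself is the open question above.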
catlee suggested we teach purge_builds.py to expire rel-* after N days, say 40 or 60, so that we still have cleanup while retaining release builds for post-release debugging. With that, directories left behind by a rename are eventually cleaned up without human intervention.
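To make the expiry idea concrete, here is a minimal sketch in Python. It is not the patch attached to this bug: the function name `find_expired_dirs`, its parameters, and the use of directory mtime as the age signal are all assumptions made for illustration. It only reports candidates and deletes nothing.

```python
import fnmatch
import os
import time

SECONDS_PER_DAY = 24 * 60 * 60


def find_expired_dirs(base_dir, pattern="rel-*", max_age_days=45, now=None):
    """Return paths of directories under base_dir that match `pattern`
    and whose mtime is older than `max_age_days`.

    Hypothetical helper; `now` can be passed in to make testing easier.
    """
    if now is None:
        now = time.time()
    cutoff = now - max_age_days * SECONDS_PER_DAY
    expired = []
    for name in sorted(os.listdir(base_dir)):
        path = os.path.join(base_dir, name)
        # Skip plain files and anything outside the release pattern.
        if not os.path.isdir(path) or not fnmatch.fnmatch(name, pattern):
            continue
        if os.path.getmtime(path) < cutoff:
            expired.append(path)
    return expired
```

One caveat with this choice: a directory's mtime only changes when entries directly inside it are added, removed, or renamed, so a real implementation might prefer a different age signal.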
Assignee: nobody → jhopkins
This will require a separate buildbotcustom patch that makes use of the expiry syntax. Tested script locally on its own.
Attachment #770875 - Flags: review?(aki)
Comment on attachment 770875 [details] [diff] [review] [tools] support time-based expiry for skipped directories I'm finding the second lambda fairly opaque...
Attachment #770875 - Flags: review?(aki) → review+
Comment on attachment 770875 [details] [diff] [review] [tools] support time-based expiry for skipped directories https://hg.mozilla.org/build/tools/rev/d1e287abf65c
Attachment #770875 - Flags: checked-in+
I'm assuming a 45 day expiry for release dirs - please let me know if you have another preference. Test build in staging: http://dev-master01.build.scl1.mozilla.com:8900/builders/Linux%20mozilla-central%20build/builds/0/steps/clean_old_builds/logs/stdio
Attachment #770954 - Flags: review?(catlee)
Attachment #770954 - Flags: review?(catlee) → review+
Comment on attachment 770954 [details] [diff] [review] [buildbotcustom] call purge_builds with expiry time for release dirs https://hg.mozilla.org/build/buildbotcustom/rev/edf2327007cf
Attachment #770954 - Flags: checked-in+
Comment on attachment 770954 [details] [diff] [review] [buildbotcustom] call purge_builds with expiry time for release dirs Cherry-picked onto production-0.8 branch: http://hg.mozilla.org/build/buildbotcustom/rev/359b0c3d670f
Confirmed that release directories are being cleaned up as expected.
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Product: mozilla.org → Release Engineering
QA Contact: other → armenzg
Product: Release Engineering → Infrastructure & Operations
Product: Infrastructure & Operations → Infrastructure & Operations Graveyard