Closed Bug 1038063 Opened 11 years ago Closed 10 years ago

add a cron to clean up build space in dev-stage01:/builds

Categories

(Infrastructure & Operations :: RelOps: Puppet, task)

task
Not set
normal

Tracking

(Not tracked)

RESOLVED INVALID

People

(Reporter: pmoore, Assigned: catlee)

References

Details

(Whiteboard: [kanban:engops:https://mozilla.kanbanize.com/ctrl_board/6/826] )

Only 39GB free on /builds:
[root@dev-stage01 ~]# df -h
Filesystem                       Size  Used Avail Use% Mounted on
/dev/mapper/VolGroup00-LogVol00   28G  3.8G   23G  15% /
/dev/sdc1                        493G  429G   39G  92% /builds
/dev/sda1                         99M   16M   78M  17% /boot
tmpfs                            502M     0  502M   0% /dev/shm
This is causing nagios alerts in #buildduty:
nagios-releng 10:45:52 Mon 01:45:52 PDT [4605] dev-stage01.srv.releng.scl3.mozilla.com:disk - /builds is WARNING: DISK WARNING - free space: /builds 39179 MB (8% inode=98%): (http://m.mozilla.org/disk+-+/builds)
Looks like a replacement of dev-stage01 is in progress: https://bugzilla.mozilla.org/show_bug.cgi?id=808025#c34
In the meantime, we should probably clean up /builds. I'll send an email out to release@mozilla.com to ask people what can be cleaned up (don't want to break anybody's staging tests by removing stuff randomly).
This may be unrelated but FYI - earlier today the / partition alerted for being full, it's only 28G. I made /pvt a symlink to /builds/data/pvt_builds/pvt to free a few GB (the bits were all stale so they got deleted). I cleaned something else up, but it also recovered a lot of space by itself. Or someone else was around too and I didn't spot them.
From Nick: Looks like a /builds/data/ftp/pub/firefox/releases/31.0b9 dir was recently added. Unless the files need modifying (very unusual!), I suggest proxying is used instead: https://wiki.mozilla.org/ReleaseEngineering/How_To/Mirror_Releases_on_dev-stage01
Sorry guys! This was me last night. I had to run the beta32 staging release and I could not get the proxying to work. I'll remove the 31.0b9 files as soon as the staging release completes.
I've removed files from firefox/{bundles,releases,candidates} which were more than a couple of weeks old (ie old staging releases), plus assorted other strays from around the place. We don't have any cron jobs in place for this, so this is a periodic maintenance task.
# df -h /builds
Filesystem  Size  Used Avail Use% Mounted on
/dev/sdc1   493G  254G  215G  55% /builds
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Also took an axe to /etc/httpd/conf.d/release-candidates.conf to reduce the # of old releases in there.
(In reply to Nick Thomas [:nthomas] from comment #4)
> We don't have any cron jobs in place for this, so this is a periodic maintenance task.
I'd like to set up a cron for this (ideally in puppet, if this machine is puppet-managed) - can you share the details of the specific cleanup command(s) that should be run? I had a look at the bash history of the root user on the machine, but could not quite work out how to translate this into something cronable. For example, should we:
a) delete files with last modified timestamp older than X,
b) keep X most recent releases, sort by release number
c) keep cleaning out releases until X disk space available
etc
Also which directories should we run these cleanup commands against? For example, are there multiple entry points into the file system where cleanup of files and subdirectories is needed? Also would be cool to know how to axe /etc/httpd/conf.d/release-candidates.conf so I can script this up too. If you have example specific commands, that would be best, to avoid any misunderstandings.
Thanks Nick!
Pete
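For illustration only, options (b) and (c) from the comment above could be sketched roughly as follows. None of this comes from the thread: the helper names `list_stale_releases` and `avail_kb` are hypothetical, and the sketch assumes GNU `sort -V` and release directory names without whitespace. Version sort may also not order beta names such as 31.0b9 the way release order would suggest, so the output would need checking before anything is deleted.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, not taken from this bug.

# Option (b): print the release directories that fall outside the N newest,
# ordered by version number. Prints names only; nothing is deleted.
# Assumes GNU `sort -V` and directory names without whitespace.
list_stale_releases() {
    releases_dir=$1
    keep=$2
    # Newest first, then skip the first $keep entries.
    ls -1 "$releases_dir" | sort -V -r | tail -n +"$((keep + 1))"
}

# Option (c): print the free space, in KB, of the filesystem holding a path,
# so a caller can loop "delete oldest, re-check" until a target is reached.
avail_kb() {
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}
```

Option (a) needs no helper, since `find -mtime` covers it directly. Listing candidates instead of deleting them keeps the sketch safe to try against a directory such as /builds/data/ftp/pub/firefox/releases before wiring anything into cron.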
Status: RESOLVED → REOPENED
Flags: needinfo?(nthomas)
Resolution: FIXED → ---
We don't have puppet for this box (the old puppet system was disabled) but could still do a cron. It's a little tricky to know what can be deleted from /pub/mozilla.org/{firefox,mobile,thunderbird}/{candidates,releases}/ because it depends what's been going on in staging. With the proxying we should only be storing bits we're producing in staging releases though. If we did
find /pub/mozilla.org/{firefox,mobile,thunderbird,xulrunner}/{candidates,releases}/ \
  -mindepth 1 -maxdepth 1 ! -type l -mtime +60 -print0 | xargs -0 rm -rfv
symlinks -d /pub/mozilla.org/{firefox,mobile,thunderbird,xulrunner}/{nightly,releases}
that should be plenty long enough. Those are both tested if you want to just plug them into a cron. If that turns out to be too short, then we could probably go longer without hitting disk issues.
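A minimal sketch of how the two commands above might be wrapped for cron, under some assumptions: the function name `clean_staging`, the script path in the crontab comment, and the weekly schedule are all hypothetical. The sketch uses `find -exec ... +` in place of the `xargs` pipeline and loops over the directories instead of relying on brace expansion, so it also runs under plain `sh`.

```shell
#!/bin/sh
# Sketch only: clean_staging and the script path in the crontab line are
# hypothetical. The find expression mirrors the one quoted above, but uses
# -exec ... + instead of piping to xargs.

clean_staging() {
    base=$1
    max_age_days=${2:-60}
    for product in firefox mobile thunderbird xulrunner; do
        for kind in candidates releases; do
            dir="$base/$product/$kind"
            [ -d "$dir" ] || continue
            # Top-level entries only, skip symlinks, older than the cutoff.
            find "$dir" -mindepth 1 -maxdepth 1 ! -type l \
                -mtime +"$max_age_days" -exec rm -rfv {} +
        done
    done
    # Remove dangling symlinks, if the symlinks utility is installed.
    if command -v symlinks >/dev/null 2>&1; then
        for product in firefox mobile thunderbird xulrunner; do
            for kind in nightly releases; do
                if [ -d "$base/$product/$kind" ]; then
                    symlinks -d "$base/$product/$kind"
                fi
            done
        done
    fi
}

# Run only when a base directory is given explicitly, e.g. from cron:
#   0 4 * * 0  /usr/local/bin/clean-staging.sh /pub/mozilla.org 60
if [ "$#" -gt 0 ]; then
    clean_staging "$@"
fi
```

Requiring the base directory as an argument is a deliberate choice in this sketch: running the script with no arguments does nothing, which makes it harder to delete from the wrong tree by accident.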
Flags: needinfo?(nthomas)
I have removed the 31.0b9-candidates and releases/31.0b9, now we have about 280GB free.
Sounds like this is no longer a buildduty issue. Pete, judging by comment 6, you're re-opening this in the hope of adding a cron. Changing bug summary and assigning you in puppet component. Please feel free to edit my changes if I am interpreting incorrectly.
Assignee: nobody → relops
Component: Buildduty → RelOps: Puppet
Product: Release Engineering → Infrastructure & Operations
QA Contact: bugspam.Callek → dustin
Summary: Running out of space on dev-stage01:/builds → add a cron to clean up build space in dev-stage01:/builds
Assignee: relops → pmoore
Whiteboard: [kanban:engops:https://kanbanize.com/ctrl_board/6/354]
Assignee: pmoore → gmiroshnykov
Assignee: gmiroshnykov → relops
Assignee: relops → gmiroshnykov
Whiteboard: [kanban:engops:https://kanbanize.com/ctrl_board/6/354] → [kanban:engops:https://mozilla.kanbanize.com/ctrl_board/6/826] [kanban:engops:https://kanbanize.com/ctrl_board/6/354]
Whiteboard: [kanban:engops:https://mozilla.kanbanize.com/ctrl_board/6/826] [kanban:engops:https://kanbanize.com/ctrl_board/6/354] → [kanban:engops:https://mozilla.kanbanize.com/ctrl_board/6/826]
Assignee: gmiroshnykov → catlee
Blocks: 1129974
catlee, is this still needed?
Flags: needinfo?(catlee)
probably not. hopefully the need for this machine goes away soon.
Flags: needinfo?(catlee)
Status: REOPENED → RESOLVED
Closed: 11 years ago → 10 years ago
Resolution: --- → INVALID