Closed
Bug 464093
Opened 16 years ago
Closed 16 years ago
Builds on Mac take too much space
Categories
(Release Engineering :: General, defect)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: catlee, Assigned: catlee)
References
Details
Attachments
(6 files, 4 obsolete files)
874 bytes, patch | bhearsum: review+
3.53 KB, patch | bhearsum: review+
1.93 KB, patch | bhearsum: review+
1.38 KB, patch | coop: review+, bhearsum: checked-in+
3.03 KB, patch | bhearsum: review+, bhearsum: checked-in+
2.82 KB, patch | nthomas: review+, nthomas: checked-in+
We've run into a problem a few times on Mac slaves on staging where a build fails to clean up after itself properly, causing later builds to run out of disk space and fail as well. This requires manual intervention to fix.
We haven't (yet) run into this problem on Linux or Windows; the Mac builds are larger because they contain both PPC and x86 code. However, there are several new product branches coming down the pipe, each of which will require significant disk space.
There are a few issues here:
- Mac builds are done with -save-temps which causes gcc to leave all its internal temporary files on disk. This is to work around a bug where gcc will crash with a bus error when using '-gstabs' (which we do).
- Builds should be cleaned up regardless of if the rest of the build steps complete successfully or not.
Comment 1•16 years ago
(In reply to comment #0)
> - Builds should be cleaned up regardless of if the rest of the build steps
> complete successfully or not.
This isn't really solvable right now. When a BuildStep with haltOnFailure=True fails, the build just stops. It'd be nice to give Buildbot support for a set of steps that always get run, sort of like a "finally" statement.
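To make the "finally" idea concrete, here is a minimal Python sketch. It is not Buildbot code and does not use Buildbot's API; `run_build`, `steps` and `always_run` are hypothetical names, and each step is assumed to be a plain callable that returns True on success.

```python
def run_build(steps, always_run):
    """Run `steps` in order, stopping at the first failure, then run
    every step in `always_run` no matter how the build ended.

    Illustration only: steps are plain callables returning True on
    success, not Buildbot BuildStep objects.
    """
    ok = True
    try:
        for step in steps:
            if not step():
                ok = False
                break  # behaves like a step with haltOnFailure=True
    finally:
        # Cleanup runs whether the build passed, failed, or raised.
        for step in always_run:
            step()
    return ok
```

With something like this, a cleanup step placed in `always_run` would still execute after a failed compile, which is the behaviour the bug description asks for.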
Blocks: 417045
Comment 2•16 years ago (Assignee)
Could we add an additional command as part of the compile step to do the cleanup?
e.g.
make -f client.mk; make cleanup
Comment 3•16 years ago
I have hit this same problem when the l10n repackages happened on staging-master.
For instance, moz2-darwin09-slave04's df -hi shows:
Filesystem      Size   Used  Avail Capacity     iused   ifree %iused  Mounted on
/dev/disk0s2    74Gi   74Gi   55Mi     100%  19439562   14144   100%  /
devfs          105Ki  105Ki    0Bi     100%       600       0   100%  /dev
fdesc          1.0Ki  1.0Ki    0Bi     100%         4     253     2%  /dev
map -hosts       0Bi    0Bi    0Bi     100%         0       0   100%  /net
map auto_home    0Bi    0Bi    0Bi     100%         0       0   100%  /home
Comment 4•16 years ago
I have filed bug 464103 to investigate how this high disk usage is going to affect us in the long run.
Comment 5•16 years ago (Assignee)
`du -msc * | sort -n` in /builds/moz2_slave yields the following:
0 macosx_build
0 macosx_update_verify
1 buildbot.tac
1 info
1 twistd.log
1 twistd.log.1
1 twistd.log.2
1 twistd.log.3
1 twistd.log.4
1 twistd.pid
105 mozilla-central-macosx-l10n-nightly
933 mozilla-central-macosx-unittest
1680 mozilla-central-macosx-debug
7533 tracemonkey-macosx-nightly
11681 mozilla-central-macosx-nightly
16468 tracemonkey-macosx
16560 mozilla-central-macosx
54962 total
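As a rough check on how much space the nightly directories account for, the figures above can be summed in a few lines of Python. This is only a sketch: the input is a copy of the larger entries from the listing, `parse_du` is a hypothetical helper, and the result is an upper bound on what a cleanup could reclaim, since it counts whole `*-nightly` directories rather than just their build/ subdirectories.

```python
# Larger entries copied from the du listing above (sizes in MB).
DU_OUTPUT = """\
105 mozilla-central-macosx-l10n-nightly
933 mozilla-central-macosx-unittest
1680 mozilla-central-macosx-debug
7533 tracemonkey-macosx-nightly
11681 mozilla-central-macosx-nightly
16468 tracemonkey-macosx
16560 mozilla-central-macosx
54962 total
"""


def parse_du(text):
    """Map each name in `du -m` style output to its size in MB."""
    sizes = {}
    for line in text.splitlines():
        size, name = line.split(None, 1)
        sizes[name] = int(size)
    return sizes


sizes = parse_du(DU_OUTPUT)
# Sum only the directories whose names end in "-nightly".
nightly_mb = sum(mb for name, mb in sizes.items() if name.endswith("-nightly"))
share = 100.0 * nightly_mb / sizes["total"]
print("%d MB in nightly dirs (%.1f%% of total)" % (nightly_mb, share))
```

The nightly directories come to 19319 MB here, a bit over a third of the reported total.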
Comment 6•16 years ago
As a workaround, perhaps we can make the _start_ of all build runs clean up all the nightly directories.
Comment 7•16 years ago (Assignee)
(In reply to comment #6)
> As a work around, perhaps we can make the _start_ of all build runs clean up
> all the nightly directories.
Yeah, that would work, but it means that each builder needs to know about all the others in order to clean them up. It could get tricky to keep those lists up to date as we add and remove branches.
Comment 8•16 years ago
bash -c rm -rf *-nightly/build/
except that it's brutal, and will fail on new slaves.
Comment 9•16 years ago
(In reply to comment #2)
> Could we add an additional command as part of the compile step to do the
> cleanup?
>
> e.g.
>
> make -f client.mk; make cleanup
This is a kludge, but it would help in some cases. Obviously, when the failure comes somewhere after the Compile step we'd still be out of luck.
(In reply to comment #8)
> bash -c rm -rf *-nightly/build/
> except that it's brutal, and will fail on new slaves.
We could do something like...
bash -c 'for dir in ../*-nightly; do rm -rf "$dir/build"; done'
It irks me when there are BuildSteps that step outside their own build directory, but in the name of not getting people paged I think I could live with something like this :).
Comment 10•16 years ago
(In reply to comment #8)
> bash -c rm -rf *-nightly/build/
> except that it's brutal, and will fail on new slaves.
The '-rf' should make the command oblivious to failure, no? Even so, couldn't we just warnOnFailure there?
Comment 11•16 years ago (Assignee)
Comment 12•16 years ago
(In reply to comment #10)
> (In reply to comment #8)
> > bash -c rm -rf *-nightly/build/
> > except that it's brutal, and will fail on new slaves.
>
> The '-rf' should make the command oblivious to failure, no? Even so, couldn't
> we just warnOnFailure there?
The command will fail completely if it can't glob anything, because rm -rf will be called with no arguments. But we can get Buildbot to ignore it.
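One way to sidestep the question of what the shell does with an unmatched glob is to expand the pattern in Python instead. The sketch below is not the patch attached to this bug; `clean_nightly_builds` and its `slave_dir` argument are hypothetical, and it assumes the slave layout discussed above, with one directory per builder, each holding a build/ subdirectory.

```python
import glob
import os
import shutil


def clean_nightly_builds(slave_dir):
    """Delete the build/ directory of every *-nightly builder under slave_dir.

    Returns the sorted list of directories it tried to remove. When the
    pattern matches nothing (for example on a freshly set up slave), the
    list is empty and nothing is attempted.
    """
    pattern = os.path.join(slave_dir, "*-nightly", "build")
    attempted = []
    for path in sorted(glob.glob(pattern)):
        if os.path.isdir(path):
            # ignore_errors keeps one stubborn directory from aborting the rest.
            shutil.rmtree(path, ignore_errors=True)
            attempted.append(path)
    return attempted
```

Because `glob.glob` returns an empty list when nothing matches, a new slave simply does no work, and there is no exit code for Buildbot to ignore.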
Comment 13•16 years ago (Assignee)
Attachment #347553 -
Attachment is obsolete: true
Comment 14•16 years ago
We could also investigate if we still need --save-temps now that we switched to XCode 3.1 - Apple might have fixed gcc.
Comment 15•16 years ago
I have hit this again during the l10n repackages, but I was fast enough to "rm -rf tracemonkey*" and allow the rest of the Mac builds after "ru" to complete properly.
/me in love with the patch and even more that it tackles all 3 platforms
Comment 16•16 years ago (Assignee)
(In reply to comment #14)
> We could also investigate if we still need --save-temps now that we switched to
> XCode 3.1 - Apple might have fixed gcc.
I tried on my laptop which has XCode 3.1, and it crashed out pretty soon with a bus error.
Does anybody know if we have contacted Apple with this problem? We should be able to send a copy of the preprocessed source, along with the gcc flags used to reproduce the error.
Comment 17•16 years ago (Assignee)
Updated•16 years ago (Assignee)
Attachment #347773 -
Flags: review?(bhearsum)
Updated•16 years ago (Assignee)
Assignee: nobody → catlee
Priority: -- → P2
Comment 18•16 years ago
Looks like we'll get dwarf support in breakpad in the near future (bug 421534), so we can probably ditch --save-temps then.
Blocks: 421534
Comment 19•16 years ago
Comment on attachment 347773
[checked in] changes to unittest, mobile and l10n masters to delete previous nightly builds
We should probably add this to MercurialBuildFactory in factory.py, too.
Attachment #347773 -
Flags: review?(bhearsum) → review+
Comment 20•16 years ago
Comment on attachment 347773
[checked in] changes to unittest, mobile and l10n masters to delete previous nightly builds
changeset: 509:e05ba6596c28
still need a follow-up for MercurialBuildFactory
Attachment #347773 -
Attachment description: changes to unittest, mobile and l10n masters to delete previous nightly builds → [checked in] changes to unittest, mobile and l10n masters to delete previous nightly builds
Updated•16 years ago
Attachment #347556 -
Flags: review+
Comment 21•16 years ago
Comment on attachment 347556
[checked in] remove previous nightly builds before every build
Checking in factory.py;
/cvsroot/mozilla/tools/buildbotcustom/process/factory.py,v <-- factory.py
new revision: 1.33; previous revision: 1.32
done
Attachment #347556 -
Attachment description: remove previous nightly builds before every build → [checked in] remove previous nightly builds before every build
Updated•16 years ago
Comment 22•16 years ago
Comment on attachment 347556
[checked in] remove previous nightly builds before every build
I just noticed that production-master never picked this change up. I just reconfig'ed it so that it does.
Comment 23•16 years ago (Assignee)
Attachment #348762 -
Flags: review?(bhearsum)
Comment 24•16 years ago
Comment on attachment 348762
changes to mobile and l10n factories to clean up old nightly builds on production
changeset: 512:bf93ed906fcd
Feel free to update & reconfig the master with this.
Attachment #348762 -
Flags: review?(bhearsum) → review+
Updated•16 years ago (Assignee)
Priority: P2 → --
Comment 25•16 years ago
This patch will take us down to approx. 10GB per build directory at the end of a successful build w/ save-temps. Currently, we're at 15GB per dir.
Attachment #349464 -
Flags: review?(ccooper)
Updated•16 years ago
Attachment #349464 -
Flags: review?(ccooper) → review+
Comment 26•16 years ago
Comment on attachment 349464
cleanup both the i386 and ppc objdirs, delete even more temp files
Checking in factory.py;
/cvsroot/mozilla/tools/buildbotcustom/process/factory.py,v <-- factory.py
new revision: 1.38; previous revision: 1.37
done
Attachment #349464 -
Attachment description: cleanup both the i386 and ppc objdirs, deleted even more temp files → cleanup both the i386 and ppc objdirs, delete even more temp files
Attachment #349464 -
Flags: checked‑in+
Comment 27•16 years ago
Can you spot what's wrong with this snippet:
bash -c rm -rf ../*-nightly/build
in dir /builds/moz2_slave/mozilla-1.9.1-linux-nightly/build
It's missing another ../ on the path to delete
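To see why one `../` is not enough, the relative pattern can be resolved against the working directory by hand. The Python sketch below does this lexically with `posixpath`; `resolve` is a hypothetical helper, and the working directory is the one quoted above.

```python
import posixpath


def resolve(workdir, pattern):
    """Return the absolute pattern a shell started in workdir would expand.

    Purely lexical: symlinks are not followed and nothing is globbed.
    """
    return posixpath.normpath(posixpath.join(workdir, pattern))


workdir = "/builds/moz2_slave/mozilla-1.9.1-linux-nightly/build"

# One "../" only climbs out of build/, so the pattern stays inside the
# current builder's directory and matches nothing useful.
print(resolve(workdir, "../*-nightly/build"))
# /builds/moz2_slave/mozilla-1.9.1-linux-nightly/*-nightly/build

# Two "../" reach the slave directory that holds all the builders.
print(resolve(workdir, "../../*-nightly/build"))
# /builds/moz2_slave/*-nightly/build
```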
Attachment #349700 -
Flags: review?(bhearsum)
Updated•16 years ago
Attachment #349700 -
Flags: checked‑in+
Comment 28•16 years ago
Comment on attachment 349700
Fix up path to nightly dirs
changeset: 531:aef8d5499c06
Updated•16 years ago
Attachment #349700 -
Flags: review?(bhearsum) → review+
Comment 29•16 years ago (Assignee)
Attachment #349803 -
Flags: review?(bhearsum)
Comment 30•16 years ago
Actually, the same thing happens to other nightly builds. E.g.
C:\WINDOWS\system32\cmd.exe /c bash -c rm -rf ../../*-nightly/build
in dir e:\builds\moz2_slave\mozilla-central-win32-nightly\build
rm: cannot remove directory `../../mozilla-central-win32-nightly/build': Permission denied
program finished with exit code 1
What if we move the rm -rf ../../*-nightly/build into an else block at line 215?
http://mxr.mozilla.org/seamonkey/source/tools/buildbotcustom/process/factory.py#208
Comment 31•16 years ago
Except that won't clear up other failed nightlies when trying to do a new one. Teh suck.
Comment 32•16 years ago
Why don't we set the workdir to '.' instead of leaving it unspecified (in which case it defaults to "build")?
That way "rm" is not executed from within "build".
Comment 33•16 years ago
This nightly cleanup job is turning the nightly builds orange - we need to fix this before the next set of nightlies comes out.
Comment 34•16 years ago (Assignee)
Attachment #349803 -
Attachment is obsolete: true
Attachment #349957 -
Flags: review?(bhearsum)
Attachment #349803 -
Flags: review?(bhearsum)
Comment 35•16 years ago (Assignee)
Attachment #349957 -
Attachment is obsolete: true
Attachment #349967 -
Flags: review?(nthomas)
Attachment #349957 -
Flags: review?(bhearsum)
Comment 36•16 years ago
(In reply to comment #33)
> This nightly cleanup job is turning the nightly builds orange - we need to fix
> this before the next set of nightlies comes out.
Is this orange because of the windows mozilla-central nightly build?
"NMAKE : fatal error U1052: file 'makefile.sub' not found"?
What is this file for?
Comment 37•16 years ago (Assignee)
(In reply to comment #36)
> (In reply to comment #33)
> > This nightly cleanup job is turning the nightly builds orange - we need to fix
> > this before the next set of nightlies comes out.
>
> Is this orange because of the windows mozilla-central nightly build?
> "NMAKE : fatal error U1052: file 'makefile.sub' not found"?
> What is this file for?
No, I think they turned orange because of this:
http://production-master.build.mozilla.org:8010/builders/WINNT%205.2%20mozilla-1.9.1%20nightly/builds/8/steps/shell_3/logs/stdio
rm: cannot remove directory `../../mozilla-1.9.1-win32-nightly/build': Permission denied
This is because it's trying to remove the directory that the shell is currently in. So your suggestion of running with workdir='.' should work.
Comment 38•16 years ago
Comment on attachment 349967
Fix cleaning up other nightly builds
r+, switched the workDir lines to use single quotes on checkin (rev 1.43)
Updated moz2-master on staging and production before 2am PST.
Attachment #349967 -
Flags: review?(nthomas) → review+
Updated•16 years ago
Attachment #349967 -
Flags: checked‑in+
Comment 39•16 years ago
I am making progress on bug 421534, so don't lose hope!
Comment 40•16 years ago (Assignee)
This is on staging right now
Attachment #351251 -
Flags: review?(bhearsum)
Updated•16 years ago
Attachment #351251 -
Attachment is obsolete: true
Attachment #351251 -
Flags: review?(bhearsum)
Comment 41•16 years ago
This will probably be WFM now. I intend to try to get bug 421534 landed on the 1.9.1 branch, which should eliminate the problem.
Comment 42•16 years ago (Assignee)
The new dwarf support fixes this for the Mac, and bug 464103 will handle other platforms.
Status: NEW → RESOLVED
Closed: 16 years ago
Resolution: --- → FIXED
Updated•11 years ago
Product: mozilla.org → Release Engineering