Closed Bug 930826 Opened 9 years ago Closed 6 years ago

Don't clobber unrelated trees

Categories

(Release Engineering :: General, defect)

defect
Not set
normal

Tracking

(Not tracked)

RESOLVED WONTFIX

People

(Reporter: glandium, Unassigned)

Attachments

(1 file)

I was surprised to see a Windows build take about 150 minutes on birch yesterday, when the usual build time is under 110 minutes. It turns out this was due to:
   Finished checking clobber times (results: 0, elapsed: 37 mins, 24 secs)
   (in https://tbpl.mozilla.org/php/getParsedLog.php?id=29610865&tree=Birch )

Looking at the log, it spent that time clobbering completely unrelated trees (fx-team). That particular slave would need a clobber for those trees if it ever built them, but unless there is too little disk space for the current build, there is no reason to spend time on those clobbers now. In fact, as of now, that slave hasn't done a single fx-team build yet, so those 37 minutes have been completely wasted.
We already skip periodic clobbers for unrelated trees. I can understand that we may want to use clobberer.py for separate bookkeeping of slaves (will file a bug for that), so I kept an opt-in for scanning all trees. But the default should be to clobber only my_builddir, and only if necessary.
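To illustrate the default-versus-opt-in behaviour described above, here is a minimal sketch. It is not the attached patch: the function name `dirs_to_clobber`, the `scan_all_trees` flag, and the `(builddir, needs_clobber)` pair format are all hypothetical stand-ins for whatever clobberer.py actually receives from the server.

```python
def dirs_to_clobber(clobber_results, my_builddir, scan_all_trees=False):
    """Pick which build directories to clobber for the current job.

    clobber_results is an iterable of (builddir, needs_clobber) pairs.
    This shape is an assumption made for the example, not the real
    clobberer response format.
    """
    selected = []
    for builddir, needs_clobber in clobber_results:
        if not needs_clobber:
            continue
        # Default: only touch the directory the current build uses.
        # Opt-in: also act on clobbers requested for other trees.
        if scan_all_trees or builddir == my_builddir:
            selected.append(builddir)
    return selected
```

With this shape, a birch job would skip a pending fx-team clobber unless the caller explicitly passes `scan_all_trees=True`.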
Attachment #822063 - Flags: review?(catlee)
Assignee: nobody → mh+mozilla
Status: NEW → ASSIGNED
It wouldn't have had to clobber those directories if it hadn't previously done fx-team builds.

https://secure.pub.build.mozilla.org/buildapi/recent/w64-ix-slave122 shows several fx-team builds recently.

The rationale here is that by deleting stuff we *know* we want to delete (from clobberer), we can avoid deleting stuff we may want to keep around. The next step, purge_builds, tries to make sure there's enough free space on disk before starting the build. It doesn't know which directories are set for a clobber; it just deletes directories in order from oldest to newest until it has freed enough space.
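The oldest-to-newest purge policy described above can be sketched as a pure planning function. This is only an illustration of the policy, not the real purge_builds code: the function name, the `(name, mtime, size_bytes)` tuples, and the byte counts are assumptions made for the example.

```python
def purge_oldest_first(dirs, free_bytes, required_bytes):
    """Return directory names to delete, oldest first, until enough
    space would be free.

    dirs is a list of (name, mtime, size_bytes) tuples (a hypothetical
    format). Nothing is deleted here; the function only plans.
    """
    to_delete = []
    # Sort by modification time so the oldest directory goes first.
    for name, _mtime, size in sorted(dirs, key=lambda d: d[1]):
        if free_bytes >= required_bytes:
            break
        to_delete.append(name)
        free_bytes += size
    return to_delete
```

Note that the function has no notion of which directories are pending a clobber, which is why running clobberer first keeps it from deleting directories that are still useful.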

I think better improvements could be had by combining clobberer/purge behaviour, and/or running them before starting buildbot so that the slave generally has enough disk space before being considered ready to run jobs. This latter idea would be part of bug 712206.
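One possible reading of "combining clobberer/purge behaviour" is a single pass that deletes only when space is needed, but prefers directories already flagged for a clobber. The sketch below is an assumption about what such a combination could look like, not what bug 712206 or any existing tool implements; `plan_deletions`, `flagged`, and the tuple format are hypothetical.

```python
def plan_deletions(dirs, flagged, free_bytes, required_bytes):
    """Plan deletions until required_bytes would be free.

    dirs is a list of (name, mtime, size_bytes) tuples and flagged is a
    set of names with a pending clobber (both hypothetical formats).
    Flagged directories are taken first, then the rest; each group is
    ordered oldest first.
    """
    by_age = sorted(dirs, key=lambda d: d[1])
    candidates = ([d for d in by_age if d[0] in flagged] +
                  [d for d in by_age if d[0] not in flagged])
    plan = []
    for name, _mtime, size in candidates:
        if free_bytes >= required_bytes:
            break
        plan.append(name)
        free_bytes += size
    return plan
```

Under this policy a flagged but unrelated tree costs nothing when the disk already has room, and it is the first thing to go when it does not.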
Bug 930831 will help with this.
QA Contact: john+bugzilla
Depends on: 1131790
Comment on attachment 822063 [details] [diff] [review]
Don't clobber unrelated trees by default

Review of attachment 822063 [details] [diff] [review]:
-----------------------------------------------------------------

flipping over to Morgan now that she's been working on clobberer
Attachment #822063 - Flags: review?(catlee) → review?(winter2718)
Comment on attachment 822063 [details] [diff] [review]
Don't clobber unrelated trees by default

Review of attachment 822063 [details] [diff] [review]:
-----------------------------------------------------------------

This won't change the behavior of the Windows builds, and it will break bulk clobbers on our Linux and Mac machines. The reason is that Windows machines currently still rely on the "legacy" clobbering mode. The legacy clobber only ever returns a single result: http://hg.mozilla.org/build/tools/file/2e284f1cc424/clobberer/clobberer.py#l195

See also the clobberer url that we're using: http://hg.mozilla.org/build/buildbot-configs/file/dd86cb2037f6/mozilla/production_config.py#l45
and the endpoint: https://github.com/mozilla/build-relengapi/blob/master/relengapi/blueprints/clobberer/__init__.py#L210

My objectives for the quarter are nearly finished, and I have a working example machine which takes care of this clobbering issue (via bulk clobbers). I'm going to attach that bug ("use runner on windows").
Attachment #822063 - Flags: review?(winter2718) → review-
Depends on: 1055794
QA Contact: mshal
Now that we've got bulk clobbers on all platforms, can we revisit this?
Assignee: mh+mozilla → nobody
Status: ASSIGNED → NEW
:mrrrgn, is there anything we still need to do here or did bug 1055794 take care of it?
Flags: needinfo?(winter2718)
(In reply to Michael Shal [:mshal] from comment #7)
> :mrrrgn, is there anything we still need to do here or did bug 1055794 take
> care of it?

I'm inclined to say no. Bulk clobbers should have solved it.
Flags: needinfo?(winter2718)
Thanks!
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → WONTFIX
Correction: if I remember correctly, bulk clobbers don't so much solve the problem as make it highly unlikely. If we're still clobbering in the bb jobs themselves, I think we would still encounter this on occasion, just far less frequently.

Modifying in-job clobbers is probably not worth the trouble at this point. I'd be in favor of closing the bug.