Tryserver should perform incremental builds using the patch set's tip from its respective repository

NEW
Unassigned

Status

Release Engineering
General
2 years ago
a year ago

People

(Reporter: jaws, Unassigned)

Tracking

Firefox Tracking Flags

(Not tracked)

Details

In an effort to make try builds complete much faster, we should do the following:

When a patch (or series of patches) is pushed to tryserver, tryserver should look at the base of the series and see what the parent changeset was. It should then request the obj-dir of the build for that changeset from its respective build server. Using that obj-dir, tryserver can apply those changesets and get an incremental build.
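
A minimal sketch of the lookup described above, assuming a single linear patch series and a hypothetical in-memory map from changeset to stored obj-dir archive. The data model and the names `base_parent`, `find_objdir`, and `objdir_store` are illustrative assumptions, not anything tryserver provides:

```python
from typing import Dict, List, Optional, Tuple


def base_parent(series: List[Tuple[str, str]]) -> str:
    """Return the parent changeset of the base of a pushed series.

    `series` is a list of (changeset, parent) pairs. The base is the one
    changeset whose parent is not itself part of the push.
    """
    pushed = {node for node, _ in series}
    outside = [parent for _, parent in series if parent not in pushed]
    if len(outside) != 1:
        raise ValueError("expected a single linear series with one base")
    return outside[0]


def find_objdir(series: List[Tuple[str, str]],
                objdir_store: Dict[str, str]) -> Optional[str]:
    """Look up a stored obj-dir for the base's parent.

    Returns None when nothing is stored, in which case the caller would
    fall back to a full build.
    """
    return objdir_store.get(base_parent(series))
```

With a push of `c1`, `c2`, `c3` on top of `base0`, `find_objdir` would return whatever archive is recorded for `base0`, or `None` if no build of `base0` kept its obj-dir.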

Are we storing obj-dirs of builds? This would take up a lot more server space but storage is cheap, right?

Secondly, the obj-dirs would need to be transferable to the tryserver machines, which would mean they need the same hardware setup so the machine code is compatible/build-resumable.

Joel, is something like this possible?
Flags: needinfo?(jmaher)
This is a fabulous idea. I recall some talk of this in the past, but I am not sure whether the objdir is stored and can be reused. Let me ask some folks.
Flags: needinfo?(mshal)
Flags: needinfo?(jmaher)
Flags: needinfo?(gps)

Comment 2

2 years ago
This has been discussed before.

AFAIK, we are not storing the objdir after builds. We arguably should - at least for failed builds, so we can debug why a build failure occurred. Of course, this is only realistically doable on TaskCluster with its Docker-based approach. Snapshotting the entire objdir on a traditional filesystem would add a lot of overhead. Even with Docker, I worry the overhead might be prohibitive.

It's not enough to just snapshot the objdir - you also need to snapshot the source checkout. Otherwise, mtimes could be out of sync and you build more than you need to or fail to pick up changed dependencies. Of course, we have to assume clocks are relatively consistent between machines or again, mtimes could cause havoc.

Another problem with filesystem snapshots is they assume incremental builds work. We have way too many clobbers. We don't do incremental builds on Try at all because of clobber concerns. We need a build system more robust against clobbers before we can attempt this.

My inclination for this bug is WONTFIX. I think resources are better spent making the build system faster and more robust. We're already doing work around using pre-built artifacts to speed up builds. In automation, we already have caching of compiler invocations. And we're working towards a non-make build backend. When we put more effort into these areas, I think the benefits will be just as good if not better than exchanging filesystem snapshots in automation.
Flags: needinfo?(gps)
For a build system like make where you just need mtimes of the objdir to be newer than the srcdir, I think you could do:

1) hg update baserev
2) unpack objdir for baserev
3) hg update tryrev

If each step sets the mtime of the files to the current time, then everything in the objdir is newer than the srcdir, except for the things that were touched by the new patches.
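
The three steps above can be sketched as a small simulation of a make-style mtime check. This is only an illustration under stated assumptions: the file names (`a.cpp`, `b.cpp`, `a.o`, `b.o`), the one-source-per-object dependency map, and the explicit timestamps are all hypothetical, and real make rules involve many more dependencies.

```python
import os
import tempfile


def stale_targets(srcdir, objdir, deps):
    """Return the targets a make-style mtime check would rebuild.

    `deps` maps an object file name to the source file name it depends on.
    A target is stale if it is missing or older than its source.
    """
    stale = []
    for target, source in deps.items():
        tpath = os.path.join(objdir, target)
        spath = os.path.join(srcdir, source)
        if (not os.path.exists(tpath)
                or os.path.getmtime(tpath) < os.path.getmtime(spath)):
            stale.append(target)
    return sorted(stale)


def touch(path, mtime):
    """Create the file if needed and set its mtime explicitly."""
    with open(path, "a"):
        pass
    os.utime(path, (mtime, mtime))


def simulate():
    """Replay the three steps with increasing timestamps per step."""
    deps = {"a.o": "a.cpp", "b.o": "b.cpp"}  # hypothetical files
    t0 = 1_000_000_000  # arbitrary starting timestamp
    with tempfile.TemporaryDirectory() as root:
        srcdir = os.path.join(root, "src")
        objdir = os.path.join(root, "obj")
        os.mkdir(srcdir)
        os.mkdir(objdir)
        # 1) "hg update baserev": every source file gets the current time.
        for source in deps.values():
            touch(os.path.join(srcdir, source), t0)
        # 2) unpack the objdir for baserev: objects are newer than sources.
        for target in deps:
            touch(os.path.join(objdir, target), t0 + 100)
        # 3) "hg update tryrev": only the patched file is rewritten.
        touch(os.path.join(srcdir, "a.cpp"), t0 + 200)
        return stale_targets(srcdir, objdir, deps)
```

In this simulation only `a.o` ends up stale, because `a.cpp` is the only source whose mtime is newer than the unpacked objdir.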

But as gps said, you still have to rely on incremental builds to work properly for this to save us any time. Otherwise you'd have an extra cycle there trying to figure out if your try push failed because of an incremental build issue, and then do another try push with the objdir caching disabled, etc.

If we could rely on incremental builds, then this approach might be faster than using sccache, since the latter still has to preprocess each source file and fetch its cached result separately. However, that is likely a ways off.
Flags: needinfo?(mshal)
And FWIW, I do think it is worthwhile to actually benchmark this approach against sccache once we can rely on incremental builds working properly.
Duplicate of this bug: 1234887
I'm not sure that transferring objdirs around is the best way to solve this - while storage is relatively cheap, the time to archive and transfer the object directories would be pretty significant.

Would the new 'artifact' builds be a better way of solving this in general?
Improving sccache would be a better generic way of solving this.
(Assignee)

Updated

a year ago
Component: Tools → General
Product: Release Engineering → Release Engineering