Closed Bug 431905 Opened 16 years ago Closed 14 years ago

Change build process to generate consistent BuildIDs

Categories

(Release Engineering :: General, defect, P5)

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: joduinn, Assigned: joduinn)

Details

(Whiteboard: [automation])

Separating out from bug#

Currently, the buildid is generated some way into the build, and is based on
the current time when that part of the build process is reached. It sometimes,
but not always, represents the hour in which the build started. This means that
as builds on each O.S. run at different speeds, it's possible to start the
linux+mac+win32 builds all at the same time, and end up with these builds
containing different buildIDs, even though it might be expected they have the
same buildID.

Instead of calculating BuildID using current time when part-way through a build, how about calculating BuildID at the start of the build process? ...or calculating BuildID part-way through the build using a pre-determined value, so the speed of the build does not impact the value of the BuildID. It seems important to me that builds starting at the same time, using the same source repo cutoff, should have the same BuildID. 

One possible solution is discussed in:
https://bugzilla.mozilla.org/show_bug.cgi?id=431270#c2
Right, and I think using the checkout timestamp or the build start time as the build ID would be sane.
To split hairs, the current buildID does a pretty good job of capturing the start of the _compilation_ process. Tinderbox just does a bunch of I/O dependent tasks before that, so it isn't the same as the tinderbox-cycle start-time. 
Priority: -- → P3
Yeah, splitting hairs is actually quite useful here. The underlying problem is that there are lots of different uses and understandings of "BuildID".

As best as I understand it (and please correct me if I'm missing something), we have three different understandings of time here:

1) cvs-timestamp-cutoff: 
This is the CVS cutoff or mercurial changeset. This is not currently recorded anywhere in the product, but adding this to application.ini was suggested in https://bugzilla.mozilla.org/show_bug.cgi?id=431270#c6. 

2) build process start time: 
The time that tinderbox and buildbot display on waterfall for the start of the entire build process, including any steps before the start of compilation. This is not currently recorded anywhere in the product.

3) compile start time: 
The time that compilation starts within the overall build process. Each o.s. takes a different amount of time to reach this point, so we can have different BuildIDs for different o.s. even though the build processes were all started at the same time on all o.s. This is what the BuildID currently tracks, to the current hour.

I am proposing that BuildID be changed to (2) "build process start time".
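
To illustrate the difference between (2) and (3), here is a minimal sketch. It assumes an hour-granularity ID of the form YYYYMMDDHH; the function name, the start time, and the per-platform delays are all hypothetical values chosen for illustration, not measurements from real builds.

```python
from datetime import datetime, timedelta


def hourly_build_id(moment: datetime) -> str:
    """Format a timestamp as an hour-granularity ID (assumed YYYYMMDDHH)."""
    return moment.strftime("%Y%m%d%H")


# Hypothetical: every platform starts the build process at the same moment...
process_start = datetime(2008, 5, 1, 13, 50)

# ...but reaches compilation after a different amount of pre-compile work.
minutes_to_compile = {"linux": 5, "mac": 8, "win32": 15}

# (3) ID taken when compilation starts: depends on how fast each platform is.
compile_ids = {
    name: hourly_build_id(process_start + timedelta(minutes=delay))
    for name, delay in minutes_to_compile.items()
}

# (2) ID taken at build process start: identical on every platform.
start_ids = {name: hourly_build_id(process_start) for name in minutes_to_compile}

print(compile_ids)
print(start_ids)
```

With these made-up timings the win32 build crosses an hour boundary before compiling, so under (3) it ends up with a different ID from linux and mac, while under (2) all three match.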

This change would mean we can quickly identify corresponding builds started at the same time on all o.s. with the same BuildID. All release builds would have the same BuildID for example. Currently, if we are looking at the BuildID for a linux build, and want to find the corresponding win32 build, we have to do some manual detective work to figure it out. That's annoying for Release folks, but doable. However, once we start using BuildID on the graphserver, developers/QA who see a change in a linux build will have to jump through some hoops to find the equivalent win32 build for comparison. Making this change would fix those problems.

Ted's suggestion in https://bugzilla.mozilla.org/show_bug.cgi?id=431270#c2 seems a small safe change to make. 

Does all that make sense? Can Ted go ahead with this change?
> I am proposing that BuildID be changed to (2) "build process start time".
> 
> This change would mean we can quickly identify corresponding builds started at
> the same time on all o.s. with the same BuildID. All release builds would have
> the same BuildID for example. 
What about nightlies?

In nightlies we don't have builds starting at the same time.

Two builds with different BuildIDs don't necessarily have to be different as long as between them there have been no commits. Are we keeping the "checkout timestamp" somewhere inside the app?

BuildId1 == BuildId2 ---> "the build process start time is the same"
BuildId1 != BuildId3 ---> "the build process start time is NOT the same" but they MIGHT have been built from the exact same source code. How can I (for instance) tell if they have been built from the same source code? Looking at the build logs in tinderbox or buildbot?
(In reply to comment #3)
> I am proposing that BuildID be changed to (2) "build process start time".

Did I understand all the comments in bug 431270 correctly, i.e. that the change Ted described in bug 431270 comment 2, coupled with the start-time/MOZ_CO_DATE stuff Nick described in bug 431270 comment 1, would have the effect of making the BuildID reflect both the build process start time *and* the cvs timestamp of the pulled code (whereas currently it does neither)?

If so, I think that's great (and would get you both 1 and 2 from John's list).  It's long been received wisdom in the QA world that you can take the BuildID and do regression range searches off of it and find the patches that landed.  It turns out that's not the case, though it's usually been close enough; however, if we have the chance to make it so (or closer, given the remaining hour/minute issue), that's a win.

(In reply to comment #4)
> In nightlies we don't have builds starting at the same time.
> 
> Two builds with different BuildIDs don't necessarily have to be different as
> long as between them there have been no commits. Are we keeping the "checkout
> timestamp" somewhere inside the app?

We're not, as comment 3 point 1 notes, but if we implement the proposed fix, (as I understand it), for all practical purposes, we won't need to.  When we get a full start time in the BuildID (bug 431270), we really won't need to (again assuming I've understood correctly).

> BuildId1 == BuildId2 ---> "the build process start time is the same"
> BuildId1 != BuildId3 ---> "the build process start time is NOT the same" but
> they MIGHT have been built from the exact same source code. How can I (for
> instance) tell if they have been built from the same source code? Looking at
> the build logs in tinderbox or buildbot?

You could use those, or you could look in bonsai and see that there are no checkins between the two timestamps, which is easier for most people.
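
The bonsai check amounts to asking whether any checkin falls between the two build timestamps. A minimal sketch, assuming the checkin times have already been collected into a list (no bonsai query API is shown or implied; the function name and all dates are hypothetical):

```python
from datetime import datetime


def checkins_between(checkins, first_build, second_build):
    """Return the checkin times that fall after one build time and up to the other."""
    start, end = sorted((first_build, second_build))
    return [t for t in checkins if start < t <= end]


# Hypothetical checkin times and build timestamps.
checkins = [datetime(2008, 5, 1, 9, 0), datetime(2008, 5, 1, 16, 30)]
build_a = datetime(2008, 5, 1, 10, 0)
build_b = datetime(2008, 5, 1, 14, 0)

# No checkin landed between the two builds, so they likely share the same source.
same_source_likely = not checkins_between(checkins, build_a, build_b)
print(same_source_likely)
```

This only works as well as the timestamps do: with hour-granularity BuildIDs, a checkin inside the same hour as a build cannot be placed reliably on either side of it.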
Now that Firefox3.0 has shipped, and we've opened mozilla-central on hg, I'm revisiting this.

Armen is experimenting with Ted's suggestion in bug 431270 comment 2 about MOZ_BUILD_DATE. This would make all builds that are started at the same time each have the same BuildID, even if each of those builds reach make-platformini.py at slightly different times. This should make the builds have consistent BuildIDs, but stay tuned...
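
A minimal sketch of the idea, assuming the build automation exports MOZ_BUILD_DATE once at the start of the build and the ID-generating step prefers that value over the clock. This is not the actual contents of make-platformini.py; the function name, the injectable `environ` and `now` parameters, and the YYYYMMDDHH format are assumptions made for illustration.

```python
import os
from datetime import datetime


def resolve_build_id(environ=os.environ, now=datetime.now):
    """Prefer a pre-determined MOZ_BUILD_DATE; otherwise fall back to the clock."""
    preset = environ.get("MOZ_BUILD_DATE")
    if preset:
        # Value decided once, up front, and shared by every platform's build.
        return preset
    # Fallback: depends on when this step happens to run (assumed YYYYMMDDHH).
    return now().strftime("%Y%m%d%H")


# With the variable set, the time at which this step runs no longer matters.
print(resolve_build_id({"MOZ_BUILD_DATE": "2008061712"}))
```

The fallback branch keeps local developer builds working when nothing sets the variable.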
Removed blocking bug 434878 since we are adding SourceStamp in the application.ini, which is what I need for L10n repackages; we do not need BuildID for that anymore (unless I am missing anything).
No longer blocks: 434878
Component: Release Engineering → Release Engineering: Future
I think we want 1, even for the waterfall page.

The most useful thing on the waterfall is seeing what code was used for a build, not when which server started doing what.
Mass move of bugs from Release Engineering:Future -> Release Engineering. See
http://coop.deadsquid.com/2010/02/kiss-the-future-goodbye/ for more details.
Component: Release Engineering: Future → Release Engineering
We ran into a build id collision on February 27th when two builds started at slightly different times ended up with the same build id.

644d92ef8781 and 6d21a2262e80

both got build id 20100227132903

This could be very confusing if consumers of the binaries on ftp don't check to make sure the source stamp matches what they are expecting.
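
One way a consumer could guard against this kind of collision is sketched below. It is only an illustration: the function name and the (build id, source stamp) pair layout are assumptions, not part of any existing tool. The two changesets and the build id are the ones reported above.

```python
from collections import defaultdict


def find_build_id_collisions(builds):
    """Return build IDs that map to more than one source stamp."""
    stamps_by_id = defaultdict(set)
    for build_id, source_stamp in builds:
        stamps_by_id[build_id].add(source_stamp)
    return {
        build_id: sorted(stamps)
        for build_id, stamps in stamps_by_id.items()
        if len(stamps) > 1
    }


builds = [
    ("20100227132903", "644d92ef8781"),
    ("20100227132903", "6d21a2262e80"),
]
collisions = find_build_id_collisions(builds)
print(collisions)
```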
Whiteboard: [automations]
Whiteboard: [automations] → [automation]
Priority: P3 → P5
If we have a consistent build number/buildid for all builds in a build set, the consistency of directory names will follow.  Marking blocking.
Blocks: 449607, 538540
The only way this is going to be solved going forward is that a GUID (to just borrow that term) is generated and that needs to be passed everywhere.

Otherwise you're going to be battling each piece
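
A rough sketch of what "generate once, pass everywhere" could look like. Everything here is hypothetical: the function names and the BUILD_SET_ID / BUILD_PLATFORM keys are invented for illustration and do not correspond to any existing buildbot or tinderbox setting.

```python
import uuid


def new_build_set_id() -> str:
    """Generate one opaque identifier for a whole set of builds."""
    return uuid.uuid4().hex


def step_environment(build_set_id: str, platform: str) -> dict:
    """Environment handed to a build step; the ID is passed in, never regenerated."""
    return {"BUILD_SET_ID": build_set_id, "BUILD_PLATFORM": platform}


# The scheduler creates the ID once and hands the same value to every platform.
build_set_id = new_build_set_id()
environments = [
    step_environment(build_set_id, platform)
    for platform in ("linux", "mac", "win32")
]
print(environments)
```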
Assignee: nobody → joduinn
Catlee is doing work on this in "old" bug#570814. Don't want to DUP forward, so leaving open for tracking.
See Also: → 570814
Fixed by bug 570814.
Status: NEW → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED
Product: mozilla.org → Release Engineering