Closed Bug 861573 Opened 12 years ago Closed 12 years ago

Find a way to acceptably deal with the use of external repos in b2g builds

Categories

(Release Engineering :: General, defect)

defect
Not set
normal

Tracking

(Not tracked)

RESOLVED DUPLICATE of bug 868597

People

(Reporter: philor, Unassigned)

Details

b2g desktop builds apparently pull the tip of the gaia repo. That means that when someone pushes something to gaia which breaks the build, it's visible to sheriffs as "huh, I wonder how that mozilla-inbound push broke b2g? well, I'll back it out, since it did even though I don't know how." That's not acceptable for a visible-on-tbpl job. I think they used to build from a curated "gaia-nightly" repo; that worked. Before it landed in mozilla-central, Jetpack used to have a file in mozilla-central that said what curated revision of their repo to pull; that worked. Talos points to a zip built from a particular rev of the talos repo; that works. Having builds from the tip of an external repo, one that the sheriffs have no access to to back out bustage even if they could recognize it as the source of the bustage, one which doesn't show up in any way other than careful examination of the full build log, which is capable of breaking the build but committers to it apparently have no idea it does and do not watch the effects of their commits, is not acceptable, and so the b2g desktop builds on all trunk trees are now hidden, until we can find a better way.
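The Jetpack-style approach mentioned above (a file in the tree that names the curated revision to pull) could look roughly like the sketch below. The JSON layout, the field names `repo` and `revision`, and the function `load_pin` are all hypothetical, not an existing b2g or Jetpack format. The one idea it illustrates is refusing anything that is not a full changeset hash, so that a branch name or "tip" can never slip into a visible build.

```python
import json
import re

# Hypothetical pin-file format (an assumption, not an existing format):
#   {"repo": "<repository URL>", "revision": "<40-character changeset hash>"}
_FULL_HASH = re.compile(r"[0-9a-f]{40}")


def load_pin(text):
    """Return (repo, revision) from a pin file, rejecting non-hash revisions."""
    pin = json.loads(text)
    revision = pin["revision"]
    # A branch name such as "master" would make the build float again,
    # so only a full hexadecimal hash is accepted.
    if not _FULL_HASH.fullmatch(revision):
        raise ValueError(
            "revision must be a 40-character hex hash, got %r" % (revision,)
        )
    return pin["repo"], revision
```

With a file like this in the tree, updating the external repo becomes an ordinary commit that shows up in the pushlog and can be backed out like any other.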
I agree this is a big problem. Gaia's "nightly" branch wasn't a very good solution to this, as it required a lot of manual testing to promote changes from nightly to master. And even then, there's no guarantee that manual testing would be sufficient to catch all breaking changes. The only real solution would be to run all our automated tests against gaia-master as well as mozilla-central; sheriffs could then compare those views and determine whether something in gaia or something in gecko had broken the build. I believe something like this may be in the works. Clint and John would know more.
> Having builds from the tip of an external repo, one that the sheriffs have
> no access to to back out bustage even if they could recognize it as the
> source of the bustage, one which doesn't show up in any way other than
> careful examination of the full build log
gaia_revision does show up in the summary of each applicable build in TBPL doesn't it? Does comparing gaia_revision between two builds help at all for diagnosing bustage?
I note that this problem is going to get much worse on trunk when we land bug 838321, since more than just gaia will then point to a branch name rather than a specific revision.
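Comparing revisions between two builds, as asked about above, can be sketched as a diff over each build's revision properties. Only `gaia_revision` is named in this bug; the `gecko_revision` key in the usage note and the function name `changed_repos` are assumptions made for illustration.

```python
def changed_repos(last_green, first_orange):
    """Return the sorted property names whose values differ between two builds.

    Each argument is a dict mapping a revision property name to the revision
    that build used. A property missing from one build counts as a difference.
    """
    keys = set(last_green) | set(first_orange)
    return sorted(
        key for key in keys if last_green.get(key) != first_orange.get(key)
    )
```

For example, calling it with `{"gecko_revision": "g1", "gaia_revision": "a1"}` and `{"gecko_revision": "g1", "gaia_revision": "a2"}` reports only `gaia_revision`, which would point away from the mozilla-inbound push. This only works where the property is actually recorded for the build in question.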
What if we build both the previous and the current changeset? I know it is crazy, but it simplifies the detection problem at the cost of increased load.
A - build A + tests = Orange
    build B + tests = Orange (this indicates that an external element changed)
B - build B + tests = Green
    build C + tests = Green
C - build C + tests = Green
This would nevertheless only help to catch the issue; it would not tackle the control issue. Should we change how sources.xml (or something else) controls which external changes get built in? I would be scared of having to run CI for external projects.
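The decision rule behind building the previous changeset alongside the current one can be written down as a small function. This is one reading of the proposal, under the assumption that the previous changeset was green when it was first built; the function name and the result strings are hypothetical.

```python
def classify(previous_result, current_result):
    """Classify a push from two builds made at the same time.

    previous_result: result of rebuilding the previous changeset now.
    current_result:  result of building the current changeset now.
    Both are "green" or "orange". Assumes the previous changeset was green
    when it was originally built.
    """
    if current_result == "green":
        return "ok"
    if previous_result == "orange":
        # The previous changeset did not change, yet it now fails too, so
        # something outside the tree (for example gaia tip) must have moved.
        return "external change suspected"
    return "current changeset suspected"
```

As the comment says, this only detects the problem. It gives sheriffs nothing to back out, and it doubles the build load for every push.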
Mostly, we don't need to worry about the external projects much; the ones that are important are gecko, gaia, gonk-misc, platform_build, plus a handful of other mozilla-b2g repos. If each time someone committed to one of those repos, we built + tested, using the previous sources.xml, except for a one-line difference corresponding to the one repo that had changed, we would get good coverage and would be able to determine which commit(s) in which repo(s) were responsible for new oranges. I think this is maybe what you were suggesting, armen.
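The "previous sources.xml with a one-line difference" idea can be sketched as follows. It assumes sources.xml is a repo-style manifest whose `<project>` elements carry `name` and `revision` attributes; the function name and the sample values in the usage note are hypothetical.

```python
import xml.etree.ElementTree as ET


def manifest_with_one_change(manifest_xml, project_name, new_revision):
    """Return a copy of the manifest with exactly one project's revision changed."""
    root = ET.fromstring(manifest_xml)
    matches = [p for p in root.iter("project") if p.get("name") == project_name]
    # Refuse to guess when the project is missing or listed more than once.
    if len(matches) != 1:
        raise ValueError(
            "expected exactly one project named %r, found %d"
            % (project_name, len(matches))
        )
    matches[0].set("revision", new_revision)
    return ET.tostring(root, encoding="unicode")
```

Each commit to a watched repo would then produce one manifest that differs from the last known-good one in a single `revision` attribute, so a new orange can be attributed to that repo and that commit.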
The 'correct' solution here is treeherder; so I guess the question then becomes what we do until then...
(In reply to Chris AtLee [:catlee] from comment #2)
> gaia_revision does show up in the summary of each applicable build in TBPL
> doesn't it?
Nope, it's shown for Panda and Unagi builds, but not for desktop b2g. I did look for it, but in the heat of battle I forgot that it doesn't apply to emulator builds, and that was where I looked for it after seeing that it wasn't shown on the desktop builds.
And it might have helped me this time, but the set of sheriffs is actually the set of people with L3 commit access, and the set of affected people is the set of people with >= L1 access. We have professional sheriff coverage from 2am to 2pm, amateur sheriff coverage from around 6-8pm to midnight, and whoever-might-take-care-of-the-tree the rest of the time, and no coverage that helps people with no idea why their try push broke b2g Windows desktop.
(Well, helped me other than the fact that having a build that we just hide when it gets broken externally, and wait for it to be fixed days later, doesn't please me much.)
No longer blocks: 861571
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → DUPLICATE
Product: mozilla.org → Release Engineering
Component: General Automation → General