Closed Bug 1399679 Opened 7 years ago Closed 7 years ago

Switch build environment from CentOS to Debian

Categories

(Firefox Build System :: General, enhancement)

enhancement
Not set
normal

Tracking

(firefox57 wontfix, firefox60 fixed)

RESOLVED FIXED
mozilla60

People

(Reporter: gps, Assigned: glandium)

References

(Blocks 2 open bugs)

Attachments

(10 files)

We currently build Firefox on Linux using CentOS 6 as the base operating system.

Because Debian is investing a lot in reproducible builds and because I feel that most Linux developers are on Debian-based distros (like Ubuntu), I think it makes sense for us to transition the official build environment from CentOS to Debian. Plus, CentOS 6 is a bit long in the tooth and it is consuming a non-trivial amount of engineering effort to continue to work around its age. If we switch to a modern distro, we get a lot of nice things out-of-the-box and don't have to spend as much effort massaging the environment.

Since the build environment is controlled via Docker images whose population is defined in-tree, we can make this change and it will "ride the trains" just like everything else.

When we change the build environment, we may also change run-time dependencies. For example, newer distros have a newer glibc, and this prevents binaries compiled on new distros from running on older ones. CentOS 6's libraries are old enough that binaries built against them run in a lot of places.
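To make the glibc concern concrete, here is a minimal sketch of how one might check which glibc symbol versions a binary requires. It assumes you have already captured the text output of a tool such as `objdump -T` for the binary. The function names, the sample text, and the (2, 12) baseline are illustrative assumptions, not anything taken from the actual build tooling.

```python
import re

# Matches versioned glibc symbol references such as "GLIBC_2.17".
# Non-numeric tags (e.g. GLIBC_PRIVATE) are deliberately ignored.
GLIBC_RE = re.compile(r"GLIBC_(\d+(?:\.\d+)+)")


def max_glibc_version(objdump_output):
    """Return the highest GLIBC_x.y version referenced in the text, or None."""
    versions = [
        tuple(int(part) for part in match.split("."))
        for match in GLIBC_RE.findall(objdump_output)
    ]
    return max(versions) if versions else None


def is_compatible(objdump_output, baseline):
    """True if no referenced glibc symbol version is newer than baseline."""
    required = max_glibc_version(objdump_output)
    return required is None or required <= baseline


# Hypothetical excerpt in the style of dynamic symbol table output.
SAMPLE = """
0000000000000000      DF *UND*  0000000000000000  GLIBC_2.2.5 memcpy
0000000000000000      DF *UND*  0000000000000000  GLIBC_2.14  memcpy
0000000000000000      DF *UND*  0000000000000000  GLIBC_2.17  clock_gettime
"""

# Assumed baseline for illustration: a distro shipping glibc 2.12.
print(max_glibc_version(SAMPLE))
print(is_compatible(SAMPLE, (2, 12)))
```

Versions are compared as integer tuples rather than strings so that 2.14 sorts above 2.2.5. A binary that fails this kind of check against the oldest supported distro's glibc would not load there.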

glandium: could you please brain dump and/or set up bug dependencies to track what needs to be configured on Debian (presumably Stretch) to build Firefox so that produced binaries have acceptable binary compatibility with various Linux distributions?
Flags: needinfo?(mh+mozilla)
I was actually considering working on this next quarter.

> (presumably Stretch)

That's way too recent. The best we can do at the moment is wheezy. And that will allow us to drop the gtk3 tooltool package.
Flags: needinfo?(mh+mozilla)
What are the exact reasons we can't use stretch (or newer)? I assume it has something to do with library compatibility, including glibc. Could we compile appropriate versions of the run-time dependencies from source so "modern" dependencies don't leak into shipped builds? I'm thinking of something like having a shadow /usr in /mozilla or something. As long as configure is picking up the proper dependencies, we should be good, right?

I *really* don't want the general execution environment beholden to compatibility with the oldest distro we need run-time support for. i.e. I'd love if we could standardize on stretch (or later) so there is greater coherency across images and we don't have to manage configuring so many one-offs.
Assignee: nobody → mh+mozilla
(In reply to Gregory Szorc [:gps] (away until ~October 23) from comment #2)
> What are the exact reasons we can't use stretch (or newer)? I assume it has
> something to do with library compatibility, including glibc. Could we
> compile appropriate versions of the run-time dependencies from source so
> "modern" dependencies don't leak into shipped builds? I'm thinking of
> something like having a shadow /usr in /mozilla or something. As long as
> configure is picking up the proper dependencies, we should be good, right?

Maintaining that and ensuring it doesn't screw things up randomly in the future is much more work than maintaining a base image of an old debian, even if the latter means we need to backport a few packages (at least python, thanks to https://bz.mercurial-scm.org/show_bug.cgi?id=5710 ; as for mercurial, we already don't install it from packages).

A lower maintenance solution would be to use chroots of an older debian in a stretch image, but then you end up having to switch between running things on the host and running others in the chroot. Remember mock? Not really appealing.

FWIW, I built a wheezy based image by barely changing the dockerfile from the stretch-based android-build image.
Depends on: 1409276
Depends on: 1419269
Depends on: 1419638
Blocks: fastci
Depends on: 1426283
Depends on: 1426321
Depends on: 1426553
Depends on: 1426785
Depends on: 1427150
Depends on: 1427232
Depends on: 1427266
Depends on: 1427312
Depends on: 1427316
Depends on: 1427326
Depends on: 1427339
Depends on: 1427340
No longer depends on: 1409276
Depends on: 1427344
Depends on: 1427145
Depends on: 1429669
Depends on: 1429670
Depends on: 1430005
Depends on: 1393119
Depends on: 1430036
Depends on: 1430270
No longer depends on: 1393119
Depends on: 1430504
Depends on: 1430506
Blocks: 1431251
- Here's a push with the first patch, which does a raw comparison between builds on CentOS and builds on Debian:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=9dff3d5f42beb1080642e481eb33c7ac6bdbc5bb

Those have large linux diffs (follow the diff.html links in the various "diff opt" jobs) and minimal macosx diffs. The macosx diffs only show a difference in buildconfig.html, where the path to GNU make changed, and that's expected. See the commit message of that patch for some of the tweaks that were done to reduce the raw diff size (e.g. stripping binaries).

- Here's a push with all the patches up to "Disable math inlines on Linux x86":
https://treeherder.mozilla.org/#/jobs?repo=try&revision=14ca230ca6105e67833fb2ff84aaa6640cf84bc5
Each of those commits explains what is being changed and how it affects code. They are not meant to land; they are only there to document the differences that switching the builds to Debian will inevitably introduce, allowing an assessment of the risk.

Here again, follow the diff.html links in the various "diff opt" jobs. The macosx diff is the same, unsurprisingly, and the linux* diffs have the same buildconfig.html difference in the GNU make path, plus build-id differences in all binaries.

- Here's a push with all the patches applied:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=f616ed03ee0fea7a9441955c0c3a120a198034a1

It enabled all the jobs that are currently built with the desktop-build docker image. Considering the "Revert everything" patch, it's equivalent to only the last two patches being applied, the last one being the try selection. How the try selection was done is documented in the corresponding commit message (taking the full-task-graph.json from the mozilla-central changeset this is all based on).

As of writing there is only one failed job: osx searchfox idx, but AFAICT, it's been busted for a while. I expect everything else to go green (as you can expect, this is not the first time I've pushed a full-blown test, and it's been all green, except for that searchfox job, multiple times already).

As noted in bug 1430506, I don't want to land this before central goes to 60.
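For readers who want a feel for what a raw comparison between two builds involves, here is a minimal sketch that hashes every file in two unpacked build directories and reports what differs. This is not the tooling behind the "diff opt" jobs; the function names are hypothetical, and a real comparison would also need to look inside archives and binaries rather than only comparing whole-file digests.

```python
import hashlib
from pathlib import Path


def tree_digests(root):
    """Map each file's path (relative to root) to the SHA-256 of its contents."""
    root = Path(root)
    return {
        path.relative_to(root).as_posix(): hashlib.sha256(path.read_bytes()).hexdigest()
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def diff_trees(old_root, new_root):
    """Return (only_in_old, only_in_new, changed) as sorted lists of relative paths."""
    old, new = tree_digests(old_root), tree_digests(new_root)
    only_old = sorted(old.keys() - new.keys())
    only_new = sorted(new.keys() - old.keys())
    changed = sorted(p for p in old.keys() & new.keys() if old[p] != new[p])
    return only_old, only_new, changed
```

Calling `diff_trees("build-centos", "build-debian")` on two hypothetical unpacked build directories returns three lists. With whole-file hashing, a binary whose only difference is its build-id still shows up as changed, which is why reducing such noise (e.g. by stripping binaries) matters before reading the diff.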
Comment on attachment 8943536 [details]
Bug 1399679 - Use debian7-*-build instead of desktop-build.

https://reviewboard.mozilla.org/r/213886/#review220404

glandium: the work you've done to get us to this point is amazing. The Try pushes showing binary diffs and you showing the series of patches required to minimize those diffs is excellent, thorough engineering. Switching the official build environment from CentOS to Debian is not something that should be undertaken lightly. But your work demonstrates that very little actually changes as a result of that switch. Of course, you put in so much effort to reduce those differences to get us to this point. You deserve commendation for your work on this project.

This change is a massive step forward in making Firefox builds deterministic and reproducible. That's not only a marketable product feature that can improve the security and trust of Firefox, but it is also a property that will foster more effective build caching and make it easier for others to build Firefox like our automation does (by making the toolchains we use for Firefox builds readily reusable).

FWIW, one of the lesser-noticed improvements of this change is that the zstd compressed Docker image for linux64 is now ~177 MB instead of ~650 MB. That means faster task starts when the worker doesn't have the image cached.

mozilla-central is Firefox 60 as of a few hours ago. So you are clear to land this.

Review granted with delight. Congratulations on a project very well done.

::: taskcluster/ci/spidermonkey/linux.yml:44
(Diff revision 1)
>      treeherder:
>          symbol: SM-tc(rust)
>          tier: 2
>          platform: linux64/debug
> +    worker:
> +        docker-image: {in-tree: desktop-build}

It would be *really* nice to nuke the `desktop-build` image from the source repository. Would you mind filing a bug so we can track this?
Attachment #8943536 - Flags: review?(gps) → review+
(In reply to Gregory Szorc [:gps] from comment #14)
> ::: taskcluster/ci/spidermonkey/linux.yml:44
> (Diff revision 1)
> >      treeherder:
> >          symbol: SM-tc(rust)
> >          tier: 2
> >          platform: linux64/debug
> > +    worker:
> > +        docker-image: {in-tree: desktop-build}
> 
> It would be *really* nice to nuke the `desktop-build` image from the source
> repository. Would you mind filing a bug so we can track this?

We unfortunately can't just yet. I have to file a series of bugs for the steps required to do so, which essentially means moving the valgrind and mingw builds off of it (the valgrind builds don't use the desktop-build image, but use an image that is almost exactly it, except for a few details). There are a few things I want to do first to make that transition easier, though, one of which is to allow actually building derivative docker images without relying on the hub.
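As a rough illustration of how one could track which tasks still reference the old image, here is a small sketch that scans task definitions for a given in-tree docker image. It assumes the task YAML has already been parsed into plain dicts shaped like the `worker: docker-image: {in-tree: ...}` snippet quoted above. The task names and the `debian7-amd64-build` image name are hypothetical examples, not taken from the tree.

```python
def tasks_using_image(tasks, image_name):
    """Return the sorted names of tasks whose worker uses the given in-tree image."""
    matches = []
    for name, task in tasks.items():
        image = task.get("worker", {}).get("docker-image")
        # Only in-tree image references of the form {"in-tree": "<name>"} count.
        if isinstance(image, dict) and image.get("in-tree") == image_name:
            matches.append(name)
    return sorted(matches)


# Hypothetical task definitions, already parsed from YAML into dicts.
TASKS = {
    "sm-rust-bindings": {"worker": {"docker-image": {"in-tree": "desktop-build"}}},
    "linux64-opt": {"worker": {"docker-image": {"in-tree": "debian7-amd64-build"}}},
    "docs": {"worker": {}},
}

print(tasks_using_image(TASKS, "desktop-build"))
```

Once a scan like this comes back empty for `desktop-build`, the image definition itself could be removed.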
Pushed by mh@glandium.org:
https://hg.mozilla.org/integration/mozilla-inbound/rev/8664edd863ff
Use debian7-*-build instead of desktop-build. r=gps
Oh, and the mozjs rust bindings, too, because they use cmake.
Comment on attachment 8944647 [details]
Bug 1399679 - Add a version string to cache names;

https://reviewboard.mozilla.org/r/214804/#review220424
Attachment #8944647 - Flags: review?(mh+mozilla) → review+
Pushed by gszorc@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/e331a3b9fae2
Add a version string to cache names; r=glandium
Blocks: 1432392
Blocks: 1432395
Blocks: 1432397
Blocks: 1432398
https://hg.mozilla.org/mozilla-central/rev/8664edd863ff
https://hg.mozilla.org/mozilla-central/rev/e331a3b9fae2
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla60
(In reply to Gregory Szorc [:gps] from comment #14)
> Comment on attachment 8943536 [details]
> Bug 1399679 - Use debian7-*-build instead of desktop-build.
> 
> https://reviewboard.mozilla.org/r/213886/#review220404
> 
> glandium: the work you've done to get us to this point is amazing. The Try
> pushes showing binary diffs and you showing the series of patches required
> to minimize those diffs is excellent, thorough engineering. Switching the
> official build environment from CentOS to Debian is not something that
> should be undertaken lightly. But your work demonstrates that very little
> actually changes as a result of that switch. Of course, you put in so much
> effort to reduce those differences to get us to this point. You deserve
> commendation for your work on this project.
> 
> This change is a massive step forward in making Firefox builds deterministic
> and reproducible. That's not only a marketable product feature that can
> improve the security and trust of Firefox, but it is also a property that
> will foster more effective build caching and make it easier for others to
> build Firefox like our automation does (by making the toolchains we use for
> Firefox builds readily reusable).
> 
> FWIW, one of the lesser-noticed improvements of this change is that the zstd
> compressed Docker image for linux64 is now ~177 MB instead of ~650 MB. That
> means faster task starts when the worker doesn't have the image cached.
> 
> mozilla-central is Firefox 60 as of a few hours ago. So you are clear to
> land this.
> 
> Review granted with delight. Congratulations on a project very well done.

I just want to pile on here -- this has been a strong effort that I've watched with pleasure and learned from.  Great work!
Pushed by mozilla@hocat.ca:
https://hg.mozilla.org/comm-central/rev/5fad234e2e84
Use posix shell in comm-confvars.sh; r=me
https://hg.mozilla.org/comm-central/rev/32a25748090a
Port Bug 1399679: Use debian7-*-build instead of desktop-build; r=me
Blocks: 1433033
Depends on: 1433688
Depends on: 1433703
No longer depends on: 1433688
Product: Core → Firefox Build System