Closed Bug 1189892 Opened 4 years ago Closed 4 years ago

Switch TaskCluster linux builds to use CentOS 6.<latest> instead of Ubuntu 14.04

Categories

(Release Engineering :: General, defect)

Tracking

(firefox44 fixed)

RESOLVED FIXED

People

(Reporter: dustin, Assigned: dustin)

Attachments

(3 files, 1 obsolete file)

Per discussions on Linux, our Linux compatibility promise is "back to CentOS 6", and the best way to guarantee that is to build on that platform.
Assignee: nobody → dustin
Blocks: 1187047
This will track the "latest" 6.x repos upstream, eventually with some kind of automatic bumping.  RHEL's support timeline is
  https://access.redhat.com/support/policy/updates/errata#Life_Cycle_Dates
which lists RHEL6 support extending through November 30, 2020 (although just for critical patches).  Likely after that time, public mirrors of CentOS 6 will start to be taken down, and we'll be unable to generate updated docker images.  But that's five years off!

Ideally we can avoid referring to "custom" repositories (mockbuild-repos) at all and just use upstream packages, with anything else we require brought in from tooltool.
For my reference, https://reviewboard.mozilla.org/r/7805/diff/1#6 was my last attempt at setting up a centos6 docker image.  We want to use mozbootstrap, though!
:glandium has just indicated that mozbootstrap for CentOS 6 is broken, and anyway tries to install a bunch of packages that should be installed from tooltool in automation, and fails to install other release requirements.

But, I had gotten quite a way toward running mozbootstrap, including some docker backflips to solve bug 1188851.  Here it is for my reference, in patch form.

I'm going to start over, building a simple shell script to install a list of packages gleaned from the mock configurations in puppet and mozharness, but trimmed of the obvious cruft, such as a slew of extra copies of gcc.
Notes to self on progress
 - need to figure out how best to install a newer hg (emailed gps)
 - need to install python2.7
I'd like to avoid building a custom RPM repository, since then getting things uploaded to that repo is a hurdle to updating versions.

I will try scripting out builds from source for these two items as part of the `docker build` process.
That worked; now

17:06:41     INFO -  checking for gtk+-3.0 >= 3.4.0 gtk+-unix-print-3.0 glib-2.0 gobject-2.0 ... Package freetype2 was not found in the pkg-config search path. Perhaps you should add the directory containing `freetype2.pc' to the PKG_CONFIG_PATH environment variable Package 'freetype2', required by 'cairo', not found
17:06:41     INFO -  configure: error: Library requirements (gtk+-3.0 >= 3.4.0 gtk+-unix-print-3.0 glib-2.0 gobject-2.0 ) not met; consider adjusting the PKG_CONFIG_PATH environment variable if your libraries are in a nonstandard prefix so pkg-config can find them.

Probably missing freetype-devel.
21:41:32     INFO -  checking for gtk+-3.0 >= 3.4.0 gtk+-unix-print-3.0 glib-2.0 gobject-2.0 ... Package libpng was not found in the pkg-config search path. Perhaps you should add the directory containing `libpng.pc' to the PKG_CONFIG_PATH environment variable Package 'libpng', required by 'cairo', not found

(sorry for the blow-by-blow -- with the long iteration time this is just the easiest way for me to keep state)
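Since each missing -devel package costs a full build cycle to discover, a pre-flight check run inside the image might shorten the loop.  This is only a minimal sketch, assuming pkg-config is available in the image; the helper name `missing_pc_modules` is made up for illustration and is not part of the patch.  The module names are the ones from the configure errors above.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the patch): print every pkg-config
# module that cannot be found.  If pkg-config itself is absent, every
# module is reported as missing.
missing_pc_modules() {
    local mod
    for mod in "$@"; do
        pkg-config --exists "$mod" 2>/dev/null || echo "$mod"
    done
}

# Modules named in the configure errors above.
missing_pc_modules freetype2 libpng
```

An empty result means pkg-config can see all of the listed modules; anything printed still needs a -devel package or a PKG_CONFIG_PATH adjustment.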
xrender.. I wonder if I have an old copy of the mock configs?
From http://ftp.mozilla.org/pub/mozilla.org/firefox/tinderbox-builds/mozilla-inbound-linux64/1438942541/mozilla-inbound-linux64-bm71-build1-build935.txt.gz

03:18:07     INFO -  INFO: installing package(s): autoconf213 python mozilla-python27 zip mozilla-python27-mercurial git ccache perl-Test-Simple perl-Config-General yasm wget mpfr xorg-x11-font* imake gcc45_0moz3 gcc454_0moz1 gcc472_0moz1 gcc473_0moz1 yasm ccache valgrind dbus-x11 glibc-static libstdc++-static gtk2-devel libnotify-devel alsa-lib-devel libcurl-devel wireless-tools-devel libX11-devel libXt-devel mesa-libGL-devel gnome-vfs2-devel GConf2-devel gcc45_0moz3 gcc454_0moz1 gcc472_0moz1 gcc473_0moz1 yasm ccache pulseaudio-libs-devel gstreamer-devel gstreamer-plugins-base-devel freetype-2.3.11-6.el6_1.8.x86_64 freetype-devel-2.3.11-6.el6_1.8.x86_64

At a guess, libxrandr is a dependency of one of those.  I've transcribed this (except the wrong freetype version) into my script and I'm trying again.

Note to self: use tooltool to download the required RPMs (freetype, at least).
17:53:58     INFO -  checking for gstreamer-0.10 >= 0.10.25
17:53:58     INFO -                        gstreamer-app-0.10
17:53:58     INFO -                        gstreamer-plugins-base-0.10... configure: error: gstreamer and gstreamer-plugins-base development packages are needed to build gstreamer backend. Install them or disable gstreamer support with --disable-gstreamer

gstreamer-devel was included but not gstreamer-plugins-base-devel.  Another cycle :)
OK, that actually got past ./configure.

The patch so far is https://hg.mozilla.org/users/dmitchell_mozilla.com/mozilla-central/rev/d49b122ea877.  Specifically, system-setup.sh is

----
#!/usr/bin/env bash

set -ve

test `whoami` == 'root'

# lots of goodies in EPEL
yum install -y epel-release

# this sometimes fails to fetch the EPEL metadata, so we repeat it
yum makecache || yum makecache

yum shell -y <<'EOF'
### This list is a cargo-cult holdover :(
### The packages listed here have been culled from the existing build process, as
### run by Buildbot.  Some may not be necessary, and some may be more easily
### expressed in a different way.

install diffutils
install findutils
install gawk
install gmp
install ppl
install cpp
install grep
install gzip
install info
install sed
install tar
install util-linux
install autoconf213
install yasm
install zlib-devel
install mpfr
install perl-Test-Simple
install perl-Config-General
# are these necessary?
install libstdc++
install libstdc++.i686
install zlib.i686
install xorg-x11-font*  # fonts required for PGO
install imake  # required for makedepend!?!
install valgrind
install dbus-x11
install libnotify-devel
install alsa-lib-devel
install libcurl-devel
install wireless-tools-devel
install libX11-devel
install libXt-devel
install mesa-libGL-devel
install gnome-vfs2-devel
install GConf2-devel
install pulseaudio-libs-devel
install gstreamer-devel
install gstreamer-plugins-base-devel

### differences from Buildbot builds

# This covers a bunch of requirements
groupinstall Base

# Prerequisites for GNOME that are not included in tooltool
install freetype  # TODO: need to pin this version, or is it in tooltool now?
install freetype-devel
install libpng-devel
install libXrandr-devel

# not in CentOS anyway
# install fedora-release

# only for Android; now from tooltool
# install java-1.7.0-openjdk-devel

# glibc is now installed from tooltool
# install glibc-static
# install glibc.i686

# gtk is now in tooltool as well

# Installed by the 'Core' group (included in the centos6 docker image)
# install bash
# install coreutils
# install cpio
# install shadow-utils

# Installed by the 'Base' group
# install bzip2
# install bc
# install openssh-clients
# install unzip
# install which
# install xz
# install zip

# Installed by the "Development Tools" group
# install gcc
# install gcc-c++
# install make
# install patch
# install redhat-rpm-config
# install rpm-build

# compilers are installed from tooltool
# install gcc472_0moz1
# install gcc473_0moz1

### stuff with a rationale

# required for the Python build, below
install zlib-devel
install bzip2-devel
install openssl-devel
install xz-libs

# required for the git build, below
install autoconf
install perl-ExtUtils-MakeMaker
install gettext-devel

# build utilities
install ccache

# a basic node environment so that we can run TaskCluster tools
install nodejs
install npm

run
EOF

BUILD=/root/build
mkdir $BUILD

# For a few packages, we want to run the very latest, which is hard to find
# for stable old CentOS 6.  So we build from source.

# Python
PYTHON_VERSION=2.7.10
cd $BUILD
wget https://www.python.org/ftp/python/2.7.10/Python-${PYTHON_VERSION}.tar.xz
tar -xf Python-${PYTHON_VERSION}.tar.xz
cd Python-${PYTHON_VERSION}
./configure --prefix=/usr
make
# `altinstall` means that /usr/bin/python still points to CentOS's Python 2.6 install.
# If you want Python 2.7, use `python2.7`
make altinstall

# Setuptools
SETUPTOOLS_VERSION=18.1
cd $BUILD
wget https://pypi.python.org/packages/source/s/setuptools/setuptools-${SETUPTOOLS_VERSION}.tar.gz
tar -xf setuptools-${SETUPTOOLS_VERSION}.tar.gz
cd setuptools-${SETUPTOOLS_VERSION}
python2.7 setup.py install

# Pip (latest)
cd $BUILD
curl https://raw.githubusercontent.com/pypa/pip/master/contrib/get-pip.py | python2.7 -

# Virtualenv (latest)
cd $BUILD
pip2.7 install virtualenv

# Git
GIT_VERSION=2.5.0
cd $BUILD
wget https://www.kernel.org/pub/software/scm/git/git-${GIT_VERSION}.tar.xz
tar -xf git-${GIT_VERSION}.tar.xz
cd git-${GIT_VERSION}
make configure
./configure --prefix=/usr --without-tcltk
make all install
git config --global user.email "nobody@mozilla.com"
git config --global user.name "mozilla"

# Mercurial
MERCURIAL_VERSION=3.5
cd $BUILD
pip2.7 install mercurial==${MERCURIAL_VERSION}

# TC-VCS
npm install -g taskcluster-vcs@2.3.6

# clean up caches from all that downloading and building
cd /
rm -rf $BUILD ~/.ccache ~/.cache ~/.npm

# note that TC will replace workspace with a cache mount; there's no sense
# creating anything inside there
mkdir -p /home/worker/workspace
chown worker:worker /home/worker/workspace

# /builds is *not* replaced with a mount in the docker container. The worker
# user writes to lots of subdirectories, though, so it's owned by that user
mkdir -p /builds
chown worker:worker /builds

rm /tmp/system-setup.sh
----

That ended up with

21:37:11     INFO -  process 21700: D-Bus library appears to be incorrectly set up; failed to read machine uuid: Failed to open "/var/lib/dbus/machine-id": No such file or directory

That's about the last I'll be able to do before PTO (until Aug 20).
No longer blocks: 1185643
With the addition of uuid creation, I now get

17:20:22     INFO -  /home/worker/workspace/build/src/obj-firefox/_virtualenv/bin/python /home/worker/workspace/build/src/build/gen_mach_buildprops.py --complete-mar-file dist/update/firefox-43.0a1.en-US.linux-x86_64.complete.mar  --upload-properties dist/upload-properties.json
17:20:22     INFO -  Traceback (most recent call last):
17:20:22     INFO -    File "/home/worker/workspace/build/src/build/gen_mach_buildprops.py", line 65, in <module>
17:20:22     INFO -      with open(args.upload_properties) as f:
17:20:22     INFO -  IOError: [Errno 2] No such file or directory: 'dist/upload-properties.json'
17:20:22     INFO -  gmake: *** [automation/build] Error 1

which may or may not be part of this bug.  It could also be something added to the tree or mozharness recently.
Indeed, that's bug 1197293.  So to the extent that anything's working in TC right now, this works.
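The thread does not show how the uuid creation was done, so the following is only a sketch of one way to satisfy the D-Bus library.  It assumes `dbus-uuidgen` (shipped with D-Bus) is present, and otherwise falls back to writing 32 random hex digits; the helper name `ensure_machine_id` and the fallback are assumptions for illustration, not taken from the patch.

```shell
#!/usr/bin/env bash
# Hypothetical helper: make sure a D-Bus machine id exists at PATH
# (default: /var/lib/dbus/machine-id, the file named in the error above).
ensure_machine_id() {
    local path="${1:-/var/lib/dbus/machine-id}"
    # Nothing to do if a non-empty id file is already there.
    [ -s "$path" ] && return 0
    mkdir -p "$(dirname "$path")"
    if command -v dbus-uuidgen >/dev/null 2>&1; then
        dbus-uuidgen > "$path"
    else
        # Fallback (assumption): 128 random bits as 32 lowercase hex digits.
        head -c 16 /dev/urandom | od -An -tx1 | tr -d ' \n' > "$path"
        echo >> "$path"
    fi
}

# In the image build this would be called with no argument, as root:
#   ensure_machine_id
```

The function is idempotent, so it should be safe to call both at image-build time and again at task start.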
For the record, some of the packages that are present in the mock configs which aren't present in the new docker image:

# not in CentOS anyway
# install fedora-release

# only for Android; now from tooltool
# install java-1.7.0-openjdk-devel

# glibc is now installed from tooltool
# install glibc-static
# install glibc.i686

# gtk is now in tooltool as well

# Installed by the 'Core' group (included in the centos6 docker image)
# install bash
# install coreutils
# install cpio
# install shadow-utils

# Installed by the 'Base' group
# install bzip2
# install bc
# install openssh-clients
# install unzip
# install which
# install xz
# install zip

# Installed by the "Development Tools" group
# install gcc
# install gcc-c++
# install make
# install patch
# install redhat-rpm-config
# install rpm-build

# compilers are installed from tooltool
# install gcc472_0moz1
# install gcc473_0moz1
Bug 1189892: build on CentOS 6.<latest>; r?glandium r?mrrrgn

Introduces a centos6-builder image and refactors desktop-build to use it.
Attachment #8651160 - Flags: review?(winter2718)
Attachment #8651160 - Flags: review?(mh+mozilla)
(note that this only fixes the 64-bit version -- we can replicate for 32-bit once 64-bit is green)
Comment on attachment 8644027 [details] [diff] [review]
mozbootstrap-approach.patch

Review of attachment 8644027 [details] [diff] [review]:
-----------------------------------------------------------------

Thank you for the namespacing (-c6-). My only complaint is that I'm still listed as the maintainer. ^.^
Attachment #8644027 - Flags: review+
Comment on attachment 8651160 [details]
MozReview Request: Bug 1189892: add linux32 support; r?glandium

I plussed the wrong patch before.
Attachment #8651160 - Flags: review?(winter2718) → review+
Keywords: leave-open
Comment on attachment 8651160 [details]
MozReview Request: Bug 1189892: add linux32 support; r?glandium

Bug 1189892: build on CentOS 6.<latest>; r?glandium r=mrrrgn

Introduces a centos6-builder image and refactors desktop-build to use it.
Attachment #8651160 - Attachment description: MozReview Request: Bug 1189892: build on CentOS 6.<latest>; r?glandium r?mrrrgn → MozReview Request: Bug 1189892: build on CentOS 6.<latest>; r?glandium r=mrrrgn
Attachment #8651160 - Flags: review+
Comment on attachment 8651160 [details]
MozReview Request: Bug 1189892: add linux32 support; r?glandium

https://reviewboard.mozilla.org/r/16857/#review15195

::: testing/docker/centos6-build/system-setup.sh:17
(Diff revision 2)
> +install diffutils

I don't think we use diffutils provided tools.

::: testing/docker/centos6-build/system-setup.sh:25
(Diff revision 2)
> +install info

I doubt we need info.

::: testing/docker/centos6-build/system-setup.sh:30
(Diff revision 2)
> +install yasm

IIRC, the version of yasm in CentOS 6 is not recent enough.

::: testing/docker/centos6-build/system-setup.sh:32
(Diff revision 2)
> +install mpfr

I'm dubious about mpfr and gmp being required.

::: testing/docker/centos6-build/system-setup.sh:35
(Diff revision 2)
> +install zlib.i686

I doubt you need this.

::: testing/docker/centos6-build/system-setup.sh:37
(Diff revision 2)
> +install imake  # required for makedepend!?!

This is probably not needed anymore.

::: testing/docker/centos6-build/system-setup.sh:38
(Diff revision 2)
> +install valgrind

I'm pretty sure the valgrind we install is not the version in centos 6.

::: testing/docker/centos6-build/system-setup.sh:49
(Diff revision 2)
> +install gnome-vfs2-devel

You can remove gnome-vfs2-devel, we've not required it for a while.

::: testing/docker/centos6-build/system-setup.sh:55
(Diff revision 2)
> +# may not be necessary, if these come from tooltool

They are.

::: testing/docker/centos6-build/system-setup.sh:60
(Diff revision 2)
> +install freetype  # TODO: need to pin this version, or is it in tooltool now?

you shouldn't need freetype when you install freetype-devel.
(freetype is not in tooltool)

::: testing/docker/centos6-build/system-setup.sh:65
(Diff revision 2)
> +# required for the Python build, below

I'm not a big fan of installing development things to build stuff that will be installed, and leaving them there. Why not build packages for python and git, and put them in the releng centos repo (where they, in fact, already are)?

::: testing/docker/centos6-build/system-setup.sh:66
(Diff revision 2)
> +install zlib-devel

You already have zlib-devel above

::: testing/docker/centos6-build/system-setup.sh:95
(Diff revision 2)
> +wget https://www.python.org/ftp/python/2.7.10/Python-${PYTHON_VERSION}.tar.xz

Please avoid downloading tarballs from non-mozilla property.

::: testing/docker/centos6-build/system-setup.sh:114
(Diff revision 2)
> +curl https://raw.githubusercontent.com/pypa/pip/master/contrib/get-pip.py | python2.7 -

huh

::: testing/docker/centos6-build/system-setup.sh:129
(Diff revision 2)
> +git config --global user.email "nobody@mozilla.com"
> +git config --global user.name "mozilla"

This shouldn't be needed unless we plan to commit things from this environment.

::: testing/docker/centos6-build/system-setup.sh:138
(Diff revision 2)
> +npm install -g taskcluster-vcs@2.3.6

is it wise to have tc-vcs, mercurial, and maybe git require docker image updates whenever we need to update them?

::: testing/docker/centos6-build/system-setup.sh:154
(Diff revision 2)
> +rm /tmp/system-setup.sh

rm $0 ?

::: testing/docker/desktop-build/Dockerfile:26
(Diff revision 2)
> +RUN chmod +x /builds/tooltool.py

Same remark as for tc-vcs.
Attachment #8651160 - Flags: review?(mh+mozilla)
Thanks for the updates on what is and is not required - I'll update accordingly.

If at all possible, I'd like to avoid hosting any custom yum repositories.  Including a repository in the image-building process means that we have binary blobs of usually-unknown provenance involved, in an un-version-controlled environment (no history), with the risk that just adding a new package can cause existing processes to break.  Such a repo also makes it difficult to self-serve updates to the docker images, as the updates require write access to the yum repo (with attendant risks).  It's been kind of a nightmare so far!

As for rebuilding the docker image to upgrade anything -- yes, I expect docker images to be rebuilt roughly weekly, and more frequently when needed.  That achieves the goal of keeping images up-to-date with upstream updates, while allowing older jobs to be re-run in correspondingly old images.  We will eventually have automation in place so that a change to testing/docker/<imgname>/VERSION triggers a build of the new version of <imgname> before executing tasks that will depend on that new image, so common tasks like upgrading mercurial will be a one-liner.

As for downloading tarballs hosted outside Mozilla, there are three reasons not to do it, and one reason to do it:
 1. Possibility that the remote tarball is corrupted, trojaning our image
 2. Possibility that the remote tarball is unavailable, preventing building new images
 3. Possibility that the file disappears forever, rendering us unable to rebuild the image
vs.
 A. Allows self-serve changes to the docker image specification without access to some protected file storage (similar to the reasoning about yum repositories above)

I'll also note that the docker images themselves are not hosted at Mozilla, so requiring that the sources for those images be hosted at Mozilla seems to be asking a lot.  And finally, the upstream yum repositories we're using are not hosted locally (and for many of the reasons above, plus the nuisance of constantly re-mirroring, we don't want to mirror those locally).

We can fix #1 easily enough by verifying a hash embedded in the source file.  #2 isn't particularly problematic, as we can continue to build with the existing image.  #3 may be an issue, although it's unlikely for the tarballs used so far.

Probably the best fix is just to use tooltool to store and download these tarballs -- it satisfies 1, 2, 3, and A.  So I'll take that approach.
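For illustration, the "verify a hash embedded in the source file" idea from #1 can be sketched as below.  The helper name `verify_sha512` is hypothetical, and SHA-512 is chosen only because it is widely available via coreutils `sha512sum`; nothing here is taken from the actual patch.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed only if FILE has the expected SHA-512 digest.
verify_sha512() {
    local file="$1" expected="$2" actual
    # sha512sum prints "<digest>  <filename>"; keep only the digest.
    actual=$(sha512sum "$file" 2>/dev/null | cut -d' ' -f1)
    if [ -z "$actual" ] || [ "$actual" != "$expected" ]; then
        echo "digest mismatch for $file" >&2
        return 1
    fi
}

# Usage in a setup script (the digest variable is a placeholder that would be
# pinned in the script itself):
#   wget https://www.python.org/ftp/python/2.7.10/Python-2.7.10.tar.xz
#   verify_sha512 Python-2.7.10.tar.xz "$PYTHON_SHA512"
```

With `set -e` in effect, as in system-setup.sh, a mismatch would stop the image build before the tarball is unpacked.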

I can remove the packages required for the Python, git, etc. builds after those builds take place.

I believe that the build process *does* perform git commits, although I don't know for sure.  Defaults don't hurt.
A few other things to add:

 * Xvfb and xvinfo to support make test and PGO builds
 * a bit more cleanup -- `yum clean all` saves 240MB
 * installing Python stuff using peep, which gets the download verification we want
Hm, with Xvfb and xvinfo, I get a different 'make check' failure:

https://tools.taskcluster.net/task-inspector/#bpAoT6V9QFS1vpKa9BXgEg/0
22:33:43     INFO - TEST-START | GeckoMediaPlugins.GMPTestCodec
22:33:43     INFO - [28858] WARNING: pipe error (31): Connection reset by peer: file /home/worker/workspace/build/src/ipc/chromium/src/chrome/common/ipc_channel_posix.cc, line 459
22:38:43  WARNING - gtest TEST-UNEXPECTED-FAIL | gtest | timed out after 1200 seconds

This rings a bell -- I remember chasing down failures in GMP before.. ah, right, bug 1162965.
Sadly, peep is currently broken:
  https://github.com/erikrose/peep/pull/94
what a mess :(
Peep is working now.  And the GMP tests pass on my system, which makes reproduction pretty difficult :(

Also:

17:30:36     INFO -   File "/home/worker/workspace/build/src/build/mach_bootstrap.py", line 356, in __call__
17:30:36     INFO -     module = self._original_import(name, globals, locals, fromlist, level)
17:30:36     INFO - ImportError: No module named _sqlite3

so the Python build needs some work.
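A failure like this could be caught at image-build time rather than mid-build.  CPython's build generally skips optional extension modules when their development headers are absent, which would point at a missing sqlite-devel here, though nothing in this bug confirms that.  The sketch below uses a made-up helper name, `check_python_modules`, and a module list that is only a guess at what the build needs.

```shell
#!/usr/bin/env bash
# Hypothetical helper: report modules that INTERPRETER cannot import.
# Returns non-zero if any module is missing.
check_python_modules() {
    local interp="$1" mod status=0
    shift
    for mod in "$@"; do
        if ! "$interp" -c "import $mod" >/dev/null 2>&1; then
            echo "missing: $mod" >&2
            status=1
        fi
    done
    return "$status"
}

# After `make altinstall` in the image build, something like:
#   check_python_modules python2.7 sqlite3 zlib bz2 ssl
```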
FWIW, I don't think updating docker images every week is a practical approach from the perspective of making those images usable by local developers, especially for bisections that span multiple weeks.
It's also probably not very nice for try, where people usually push trees of varying freshness, forcing build slaves to clone many different images.
The docker images used are tied to the tree (they're named and specified in-tree) so bisection should be straightforward, and may determine that the image regeneration was at fault just as it might determine that a particular cset was at fault.  Cloning different images is unfortunate, but ultimately not that problematic (the images are cached).  So basically, I don't see an issue here, and the alternative (never updating build images) is not a mistake I am interested in repeating.

I'm still working on chasing down the GeckoMediaPlugins error.  So far, we have realpath(3) failing with EACCES on a fully qualified path to plugin-container within the objdir.
(In reply to Dustin J. Mitchell [:dustin] from comment #29)
> I'm still working on chasing down the GeckoMediaPlugins error.  So far, we
> have realpath(3) failing with EACCES on a fully qualified path to
> plugin-container within the objdir.

…because, ironically, it was running as root instead of the "worker" user.

The plugin-container unshares the user namespace early in startup (see the call to SandboxEarlyInit in plugin-container.cpp), which implicitly drops all capabilities it may have had in the outer namespace.  In this case, that means that when the process tries to realpath() itself, it accesses the filesystem as uid 0 without any of the usual "superuser" overrides.  /home/worker is owned by "worker" and mode 0700; root is not the owner, so it has no permission to access anything inside there.

It turns out that MOZ_DISABLE_GMP_SANDBOX doesn't disable that part of sandboxing, which made debugging unnecessarily confusing (and is not the intended behavior!); I've filed bug 1199413.

And this doesn't happen if the kernel doesn't support user namespaces, which is why Dustin couldn't reproduce this locally (and why it *does* happen in a CentOS 6 container hosted on Ubuntu 14.04, even though actual CentOS 6 doesn't support any of our sandboxing).
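Given the failure mode described above, a guard at the top of whatever script launches the tests could make the mistake obvious early.  This is a minimal sketch under that assumption; `require_non_root` is a hypothetical name and is not part of the patch.

```shell
#!/usr/bin/env bash
# Hypothetical guard: fail when the (effective) uid is 0.
# The uid can be passed explicitly, which makes the check easy to exercise.
require_non_root() {
    local uid="${1:-$(id -u)}"
    if [ "$uid" -eq 0 ]; then
        echo "running as root; switch to the worker user first" >&2
        return 1
    fi
}

# At the top of a test-running script:
#   require_non_root || exit 1
```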
Depends on: 1199379
(In reply to Dustin J. Mitchell [:dustin] from comment #29)
> The docker images used are tied to the tree (they're named and specified
> in-tree) so bisection should be straightforward, and may determine that the
> image regeneration was at fault just as it might determine that a particular
> cset was at fault.  Cloning different images is unfortunate, but ultimately
> not that problematic (the images are cached).

That's exactly the problem. Images are big. People will have to download dozens of gigabytes just to bisect over a few weeks. And they will have to download a new image every week, regardless of bisection. Well, they won't /have to/; thankfully they will still be able to build without the images, but docker images are being advertised as something that's supposed to make their lives easier...

> So basically, I don't see an issue here, and the alternative (never updating
> build images) is not a mistake I am interested in repeating.

Surely, there has to be a sweet spot between never updating and updating every day. Every week doesn't seem to be that.
This bug probably isn't the right place for this particular conversation.
It's not the right place for this discussion, but this discussion has an impact on this bug.
OK! A successful build with a CentOS 6 image
  https://tools.taskcluster.net/task-inspector/#QDmERbcFS52R67_6YmNgxA/0

The job didn't handle artifacts correctly, but that's another bug.  I may hack around that temporarily in try just so I can get a working binary out to smoke-test on my laptop.

Remaining clean-up:

  install yasm            # probably need a newer/pinned version
  install valgrind        # probably need a newer/pinned version
  install freetype-devel  # probably need an older/pinned version

for all three, I will put the relevant RPMs (from the repos used for mock) into tooltool and install them directly, avoiding all of the horrors of yum repositories.
(In reply to Dustin J. Mitchell [:dustin] from comment #34)
> The job didn't handle artifacts correctly, but that's another bug.  I may
> hack around that temporarily in try just so I can get a working binary out
> to smoke-test on my laptop.

There's bug 1198179 on part of that being broken, but I don't know what's up with the x-test stuff.
No longer depends on: 1199379
Great success with the RPMs in tooltool.  I'll squash everything down and put it up for review again.
Comment on attachment 8651160 [details]
MozReview Request: Bug 1189892: add linux32 support; r?glandium

Bug 1189892: build on CentOS 6.<latest>; r?glandium r=mrrrgn

Introduces a centos6-builder image and refactors desktop-build to use it.
Attachment #8651160 - Flags: review?(winter2718)
Attachment #8651160 - Flags: review?(mh+mozilla)
Attachment #8651160 - Flags: review?(winter2718)
Comment on attachment 8651160 [details]
MozReview Request: Bug 1189892: add linux32 support; r?glandium

https://reviewboard.mozilla.org/r/16857/#review16037

::: testing/docker/centos6-build/system-setup.sh:87
(Diff revision 3)
> +# For a few packges, we want to run the very latest, which is hard to find for

I like it!
Comment on attachment 8651160 [details]
MozReview Request: Bug 1189892: add linux32 support; r?glandium

https://reviewboard.mozilla.org/r/16857/#review16039

Dang, looks good. Dig the use of tooltool + rpms.
Attachment #8651160 - Flags: review+
Attachment #8651160 - Flags: review?(mh+mozilla) → review+
Comment on attachment 8651160 [details]
MozReview Request: Bug 1189892: add linux32 support; r?glandium

https://reviewboard.mozilla.org/r/16857/#review16249

The previous comments about the frequency of image updates still apply (and I'm not particularly fond of the tooltool stuff being installed as part of the image as such) but meh.

::: testing/docker/centos6-build/system-setup.sh:96
(Diff revisions 2 - 3)
> +tooltool_fetch <<'EOF'

Those single quotes are not necessary. First time I've seen someone use quotes in this construct. Not that it doesn't work, just that it's not idiomatic shell imho.

::: testing/docker/centos6-build/system-setup.sh:102
(Diff revisions 2 - 3)
> +"visibility": "public",

you shouldn't need to specify the visibility in the manifest.

::: testing/docker/centos6-build/system-setup.sh:140
(Diff revisions 2 - 3)
> +"filename": "yasm-1.1.0-1.x86_64.rpm"

Any reason yasm and valgrind are in separate tooltool invocations?

::: testing/docker/centos6-build/system-setup.sh:203
(Diff revisions 2 - 3)
>  make

Why not build rpms for python and git and install them the same way as the others?

::: testing/docker/centos6-build/system-setup.sh:290
(Diff revisions 2 - 3)
> +rm -rf $BUILD ~/.ccache ~/.cache ~/.npm

Note that if ccache is being used during the docker image build and actively fills ~/.ccache, that seems like a waste of resources and you might as well explicitly disable ccache.

::: testing/docker/centos6-build/system-setup.sh:293
(Diff revisions 2 - 3)
>  rm /tmp/system-setup.sh

rm $0 ?
I dropped the ball on the image updates, sorry -- I'd like to get you and other build-team members more involved in the wider discussion, rather than just you and me.  That decision is orthogonal to this patch, anyway.

The single quotes avoid variable substitution in the heredoc body.  Not necessary here, but good practice in general.
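To make the difference concrete, here is a small self-contained demonstration (the variable names are arbitrary):

```shell
#!/usr/bin/env bash
NAME=world

# Unquoted delimiter: the shell expands $NAME inside the body.
expanded=$(cat <<EOF
hello $NAME
EOF
)

# Quoted delimiter: the body is passed through literally.
literal=$(cat <<'EOF'
hello $NAME
EOF
)

printf '%s\n' "$expanded" "$literal"
```

The first line printed is `hello world` and the second is `hello $NAME`.  For a JSON manifest or a `yum shell` script fed through a heredoc, the quoted form means a stray `$` or backtick in the body cannot be reinterpreted by the shell.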

I used the RPMs from tooltool, rather than a from-source build, because that's what was available.  The build instructions for all of those aren't in the tree, so it's not clear how they were built.  We have SRPMs for valgrind and freetype, so those are probably reproducible, but figuring out how they were built requires some extra work.  And there's no SRPM for yasm, so I really have *no* idea how that was built.  In general, I'd prefer an in-tree build script because it means that changes can be easily tested out in try and the provenance of the binary is clear.

The separate tooltool invocations don't really make a difference -- tooltool downloads sequentially, anyway -- but it might read better with those rearranged (and with consistent JSON indentation!).

I didn't do anything to *enable* ccache, so without digging I don't know how to disable it -- got a quick hint?
OK, that wasn't much digging:

export CCACHE_DISABLE=1
It remains to fix up linux32 and asan builds here, but I'm going to put that on hold until we can get linux64 to tier 2.
As I was embarking on the construction of a 32-bit centos 6 image, catlee pointed out that we no longer use the mock-mozilla-i386 mock environment, preferring to build in the 64-bit environment with multilib packages installed.
I rebuilt the centos6-build image to include those multilib packages, per https://dxr.mozilla.org/mozilla-central/source/testing/mozharness/configs/builds/releng_base_linux_32_builds.py#37

The good news is, the Linux64 builds continue to work, so using a single image is still a possibility.

The bad news is, the Linux32 builds failed:

22:01:07     INFO - Calling ['python2.7', 'mach', '--log-no-times', 'build', '-v'] with output_timeout 2400
22:01:07     INFO -  Error loading mozconfig: /home/worker/workspace/build/src/.mozconfig
22:01:07     INFO -  Evaluation of your mozconfig exited with an error. This could be triggered
22:01:07     INFO -  by a command inside your mozconfig failing. Please change your mozconfig
22:01:07     INFO -  to not error and/or to catch errors in executed commands.
22:01:07     INFO -  mozconfig output:
22:01:07     INFO -  ------BEGIN_AC_OPTION
22:01:07     INFO -  --enable-update-channel=
22:01:07     INFO -  ------END_AC_OPTION
22:01:07     INFO -  ------BEGIN_AC_OPTION
22:01:07     INFO -  --enable-update-packaging
22:01:07     INFO -  ------END_AC_OPTION
22:01:07     INFO -  ------BEGIN_AC_OPTION
22:01:07     INFO -  --with-google-api-keyfile=/builds/gapi.data
22:01:07     INFO -  ------END_AC_OPTION
22:01:07     INFO -  ------BEGIN_AC_OPTION
22:01:07     INFO -  --with-google-oauth-api-keyfile=/builds/google-oauth-api.key
22:01:07     INFO -  ------END_AC_OPTION
22:01:07     INFO -  ------BEGIN_AC_OPTION
22:01:07     INFO -  --with-mozilla-api-keyfile=/builds/mozilla-desktop-geoloc-api.key
22:01:07     INFO -  ------END_AC_OPTION
22:01:07     INFO -  ------BEGIN_MK_OPTION
22:01:07     INFO -  AUTOCLOBBER=1
22:01:07     INFO -  ------END_MK_OPTION
22:01:07     INFO -  ------BEGIN_AC_OPTION
22:01:07     INFO -  --enable-crashreporter
22:01:07     INFO -  ------END_AC_OPTION
22:01:07     INFO -  ------BEGIN_AC_OPTION
22:01:07     INFO -  --enable-release
22:01:07     INFO -  ------END_AC_OPTION
22:01:07     INFO -  ------BEGIN_MK_OPTION
22:01:07     INFO -  export MOZ_AUTOMATION_BUILD_SYMBOLS=1
22:01:07     INFO -  ------END_MK_OPTION
22:01:07     INFO -  ------BEGIN_MK_OPTION
22:01:07     INFO -  export MOZ_AUTOMATION_L10N_CHECK=1
22:01:07     INFO -  ------END_MK_OPTION
22:01:07     INFO -  ------BEGIN_MK_OPTION
22:01:07     INFO -  export MOZ_AUTOMATION_PACKAGE=1
22:01:07     INFO -  ------END_MK_OPTION
22:01:07     INFO -  ------BEGIN_MK_OPTION
22:01:07     INFO -  export MOZ_AUTOMATION_PACKAGE_TESTS=1
22:01:07     INFO -  ------END_MK_OPTION
22:01:07     INFO -  ------BEGIN_MK_OPTION
22:01:07     INFO -  export MOZ_AUTOMATION_INSTALLER=0
22:01:07     INFO -  ------END_MK_OPTION
22:01:07     INFO -  ------BEGIN_MK_OPTION
22:01:07     INFO -  export MOZ_AUTOMATION_UPDATE_PACKAGING=0
22:01:07     INFO -  ------END_MK_OPTION
22:01:07     INFO -  ------BEGIN_MK_OPTION
22:01:07     INFO -  export MOZ_AUTOMATION_UPLOAD=1
22:01:07     INFO -  ------END_MK_OPTION
22:01:07     INFO -  ------BEGIN_MK_OPTION
22:01:07     INFO -  export MOZ_AUTOMATION_UPLOAD_SYMBOLS=0
22:01:07     INFO -  ------END_MK_OPTION
22:01:07     INFO -  ------BEGIN_MK_OPTION
22:01:07     INFO -  export MOZ_AUTOMATION_SDK=0
22:01:07     INFO -  ------END_MK_OPTION
22:01:07     INFO -  ------BEGIN_MK_OPTION
22:01:07     INFO -  PATH=/home/worker/workspace/build/src/gcc/bin:/tools/buildbot/bin:/usr/local/bin:/usr/lib/ccache:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/tools/git/bin:/tools/python27/bin:/tools/python27-mercurial/bin:/home/cltbld/bin
22:01:07     INFO -  ------END_MK_OPTION
22:01:07     INFO -  ------BEGIN_AC_OPTION
22:01:07     INFO -  --enable-elf-hack
22:01:07     INFO -  ------END_AC_OPTION
22:01:07     INFO -  ------BEGIN_AC_OPTION
22:01:07     INFO -  --enable-stdcxx-compat
22:01:07     INFO -  ------END_AC_OPTION
22:01:07     INFO -  ------BEGIN_AC_OPTION
22:01:07     INFO -  --enable-default-toolkit=cairo-gtk3
22:01:07     INFO -  ------END_AC_OPTION
22:01:07     INFO -  ------BEGIN_MK_OPTION
22:01:07     INFO -  export FONTCONFIG_PATH=/home/worker/workspace/build/src/gtk3/usr/local/etc/fonts
22:01:07     INFO -  ------END_MK_OPTION
22:01:07     INFO -  ------BEGIN_MK_OPTION
22:01:07     INFO -  export PANGO_SYSCONFDIR=/home/worker/workspace/build/src/gtk3/usr/local/etc
22:01:07     INFO -  ------END_MK_OPTION
22:01:07     INFO -  ------BEGIN_MK_OPTION
22:01:07     INFO -  export PANGO_LIBDIR=/home/worker/workspace/build/src/gtk3/usr/local/lib
22:01:07     INFO -  ------END_MK_OPTION
22:01:07     INFO -  ------BEGIN_MK_OPTION
22:01:07     INFO -  export GDK_PIXBUF_MODULE_FILE=/home/worker/workspace/build/src/gtk3/usr/local/lib/gdk-pixbuf-2.0/2.10.0/loaders.cache
22:01:07     INFO -  ------END_MK_OPTION
22:01:07     INFO -  ------BEGIN_MK_OPTION
22:01:07     INFO -  export GDK_PIXBUF_MODULEDIR=/home/worker/workspace/build/src/gtk3/usr/local/lib/gdk-pixbuf-2.0/2.10.0/loaders
22:01:07     INFO -  ------END_MK_OPTION
22:01:07     INFO -  ------BEGIN_MK_OPTION
22:01:07     INFO -  export LD_LIBRARY_PATH=/home/worker/workspace/build/src/gtk3/usr/local/lib
22:01:07     INFO -  ------END_MK_OPTION
22:01:07     INFO -  ./usr/local/bin/pango-querymodules: error while loading shared libraries: libXft.so.2: cannot open shared object file: No such file or directory
22:01:07     INFO -  ./usr/local/bin/gdk-pixbuf-query-loaders: error while loading shared libraries: libpng12.so.0: cannot open shared object file: No such file or directory
22:01:07     INFO -  ./usr/local/bin/fc-cache: error while loading shared libraries: libfontconfig.so.1: cannot open shared object file: No such file or directory
22:01:07    ERROR - Return code: 1
22:01:07  WARNING - setting return code to 2
22:01:07    FATAL - 'mach build' did not run successfully. Please check log for errors.
Is that build using the same gtk3 package as the 64-bit build?
It shouldn't be -- it's using the 32-bit mozharness script, which points to
  /home/worker/workspace/build/src/browser/config/tooltool-manifests/linux32/releng.manifest
You're likely missing a few 32-bit libraries; from the log, at least libXft, libpng and libfontconfig.
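For reference, a minimal sketch of pulling the missing library names out of a build log instead of reading them off by eye. The helper name `missing_libs` is made up for illustration; it only assumes the loader's usual "error while loading shared libraries" wording seen in the log above.

```shell
# Sketch: print the shared libraries the dynamic loader could not find,
# given a build log on stdin. `missing_libs` is a hypothetical helper name.
missing_libs() {
    # Keep only the library name between the two fixed phrases of the
    # loader's error message, then de-duplicate in a locale-independent order.
    sed -n 's/.*error while loading shared libraries: \([^:]*\): cannot open shared object file.*/\1/p' \
        | LC_ALL=C sort -u
}
```

Usage would be something like `missing_libs < build.log`; each name it prints still has to be mapped to the package (and architecture) that provides it.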
There are some conflicts over freetype version.

From a Buildbot 32-bit build:

09:18:28     INFO -   fontconfig              i686   2.8.0-3.el6         centos6               186 k
09:18:28     INFO -   fontconfig              x86_64 2.8.0-3.el6         centos6               186 k
09:18:28     INFO -   fontconfig-devel        i686   2.8.0-3.el6         centos6               209 k

09:18:28     INFO -   libXft                  i686   2.1.13-4.1.el6      centos6                48 k
09:18:28     INFO -   libXft                  x86_64 2.1.13-4.1.el6      centos6                49 k
09:18:28     INFO -   libXft-devel            i686   2.1.13-4.1.el6      centos6                18 k

09:18:28     INFO -   freetype                i686   2.3.11-6.el6_1.8    centos6-updates       362 k
09:18:28     INFO -   freetype                x86_64 2.3.11-6.el6_1.8    centos6-updates       358 k
09:18:28     INFO -   freetype-devel          i686   2.3.11-6.el6_1.8    centos6-updates       364 k

09:18:28     INFO -   libpng                  i686   2:1.2.46-2.el6_2    centos6-updates       184 k
09:18:28     INFO -   libpng                  x86_64 2:1.2.46-2.el6_2    centos6-updates       181 k
09:18:28     INFO -   libpng-devel            i686   2:1.2.46-2.el6_2    centos6-updates       112 k

From a 64-bit build of the same rev:

09:21:29     INFO -   fontconfig              x86_64 2.8.0-3.el6         centos6               186 k
09:21:29     INFO -   fontconfig-devel        x86_64 2.8.0-3.el6         centos6               209 k

09:21:29     INFO -   libXft                  x86_64 2.1.13-4.1.el6      centos6                49 k
09:21:29     INFO -   libXft-devel            x86_64 2.1.13-4.1.el6      centos6                18 k

09:21:29     INFO -   freetype                x86_64 2.3.11-6.el6_1.8    centos6-updates       358 k
09:21:29     INFO -   freetype-devel          x86_64 2.3.11-6.el6_1.8    centos6-updates       364 k

09:21:29     INFO -   libpng                  x86_64 2:1.2.46-2.el6_2    centos6-updates       181 k
09:21:29     INFO -   libpng-devel            x86_64 2:1.2.46-2.el6_2    centos6-updates       112 k

Yet when installing freetype from tooltool:

Setting up Install Process
Examining freetype-2.3.11-6.el6_1.8.i686.rpm: freetype-2.3.11-6.el6_1.8.i686
freetype-2.3.11-6.el6_1.8.i686.rpm: does not update installed package.
freetype-2.3.11-6.el6_1.8.i686.rpm: does not update installed package.
Examining freetype-2.4.12-6.el6.1.x86_64.rpm: freetype-2.4.12-6.el6.1.x86_64
Marking freetype-2.4.12-6.el6.1.x86_64.rpm as an update to freetype-2.3.11-15.el6_6.1.x86_64
Loading mirror speeds from cached hostfile
 * base: centos-mirror.jchost.net
 * epel: mirror.cogentco.com
 * extras: centos.mirror.constant.com
 * updates: mirrors.advancedhosters.com
Marking freetype-2.4.12-6.el6.1.x86_64.rpm as an update to freetype-2.3.11-15.el6_6.1.i686
Examining freetype-devel-2.3.11-6.el6_1.8.i686.rpm: freetype-devel-2.3.11-6.el6_1.8.i686
freetype-devel-2.3.11-6.el6_1.8.i686.rpm: does not update installed package.
Examining freetype-devel-2.4.12-6.el6.1.x86_64.rpm: freetype-devel-2.4.12-6.el6.1.x86_64
Marking freetype-devel-2.4.12-6.el6.1.x86_64.rpm as an update to freetype-devel-2.3.11-15.el6_6.1.x86_64
Resolving Dependencies
--> Running transaction check
---> Package freetype.x86_64 0:2.3.11-15.el6_6.1 will be updated
---> Package freetype.x86_64 0:2.4.12-6.el6.1 will be an update
---> Package freetype-devel.x86_64 0:2.3.11-15.el6_6.1 will be updated
---> Package freetype-devel.x86_64 0:2.4.12-6.el6.1 will be an update
--> Finished Dependency Resolution
Error:  Multilib version problems found. This often means that the root
       cause is something else and multilib version checking is just
       pointing out that there is a problem. Eg.:
       
         1. You have an upgrade for freetype which is missing some
            dependency that another package requires. Yum is trying to
            solve this by installing an older version of freetype of the
            different architecture. If you exclude the bad architecture
            yum will tell you what the root cause is (which package
            requires what). You can try redoing the upgrade with
            --exclude freetype.otherarch ... this should give you an error
            message showing the root cause of the problem.
       
         2. You have multiple architectures of freetype installed, but
            yum can only see an upgrade for one of those arcitectures.
            If you don't want/need both architectures anymore then you
            can remove the one with the missing update and everything
            will work.
       
         3. You have duplicate versions of freetype installed already.
            You can use "yum check" to get yum show these errors.
       
       ...you can also use --setopt=protected_multilib=false to remove
       this checking, however this is almost never the correct thing to
       do as something else is very likely to go wrong (often causing
       much more problems).
       
       Protected multilib versions: freetype-2.4.12-6.el6.1.x86_64 != freetype-2.3.11-15.el6_6.1.i686
So we need to be using 2.3.11 on both architectures, not 2.4.12.  Oops!
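A rough way to catch this kind of mismatch before yum does, assuming the input is one "name version-release arch" line per installed package. The function name `multilib_mismatches` is hypothetical, and this is only a sketch, not what the image setup script does.

```shell
# Sketch: read "name version-release arch" lines on stdin and print every
# package name that is installed with differing version-release strings.
# `multilib_mismatches` is a hypothetical helper name.
multilib_mismatches() {
    awk '
        # First time we see a package: remember its version-release.
        !($1 in ver) { ver[$1] = $2; next }
        # Later sightings (other architectures): compare as strings.
        (ver[$1] "") != ($2 "") { bad[$1] = 1 }
        END { for (name in bad) print name }
    ' | LC_ALL=C sort
}
```

The input can be produced with `rpm -qa --qf '%{NAME} %{VERSION}-%{RELEASE} %{ARCH}\n'`. Anything this prints is a candidate for the "Protected multilib versions" error above.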
And CentOS 6.7 ships with 2.3.11-15, which is probably close enough to 2.3.11-6 for our purposes.  Mike, can you confirm?
Flags: needinfo?(mh+mozilla)
Putting that many multilib packages on the system seems to make "normal" compilation a lot harder.  I managed to make Git and Python build by installing glibc-devel.x86_64.  NPM builds were failing due to the lack of libstdc++-devel.x86_64.

With that in place, I pushed to try:
  https://treeherder.mozilla.org/#/jobs?repo=try&revision=de0aa47e9fcd&exclusion_profile=false
Any version of 2.3 should work.
Flags: needinfo?(mh+mozilla)
That failed with a missing Xrender on the Linux64 build, but succeeded for Linux32.

[root@taskcluster-worker ~]# rpm -qa | grep Xrender
libXrender-devel-0.9.8-2.1.el6.i686
libXrender-0.9.8-2.1.el6.x86_64
libXrender-0.9.8-2.1.el6.i686

so we need to install libXrender-devel.x86_64 explicitly.
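A sketch of spotting this class of gap mechanically: given "name arch" lines, report every -devel package that is installed for one architecture but missing for another architecture where the base library is present. `missing_devel` is a made-up name, and the "base name plus -devel" pairing is an assumption that will not hold for every package.

```shell
# Sketch: read "name arch" lines on stdin and print "<pkg>-devel.<arch>" for
# each architecture that has the base package but lacks the -devel package.
# Only -devel packages installed for at least one architecture are considered.
# `missing_devel` is a hypothetical helper name.
missing_devel() {
    awk '
        { have[$1 "." $2] = 1; arch[$2] = 1 }
        $1 ~ /-devel$/ { devel[$1] = 1 }
        END {
            for (d in devel) {
                base = substr(d, 1, length(d) - 6)   # strip the "-devel" suffix
                for (a in arch)
                    if (((base "." a) in have) && !((d "." a) in have))
                        print d "." a
            }
        }
    ' | LC_ALL=C sort
}
```

Feeding it `rpm -qa --qf '%{NAME} %{ARCH}\n'` output on the image should flag the package above.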
Well, that's an odd failure:

19:35:00     INFO -  gmake[5]: Entering directory `/home/worker/workspace/build/src/obj-firefox/addon-sdk'
19:35:00     INFO -  mkdir -p '/home/worker/workspace/build/src/obj-firefox/addon-sdk/source/test/addons/'
19:35:00     INFO -  rm -f '../../dist/bin/modules/services-sync/constants.js'
19:35:00     INFO -  mkdir: cannot create directory `/home/worker/workspace/build/src/obj-firefox/addon-sdk/source/test/addons/': No such file or directory

It's possible I just pulled a bad repo from mozilla-incoming

https://treeherder.mozilla.org/#/jobs?repo=try&revision=1ba5d6ac40a4
Attachment #8651160 - Attachment description: MozReview Request: Bug 1189892: build on CentOS 6.<latest>; r?glandium r=mrrrgn → MozReview Request: Bug 1189892: add linux32 support; r?glandium
Comment on attachment 8651160 [details]
MozReview Request: Bug 1189892: add linux32 support; r?glandium

Bug 1189892: add linux32 support; r?glandium

Add new tasks for the "Linux" platform.  These run on the same docker image as
the Linux64 builds, but that image has been modified to contain a bunch of
*.i686 packages required to cross-compile for i686.  Due to yum's propensity
for resolving dependencies without regard to architecture, with this patch the
system-setup.sh script lists both architectures of each file explicitly.

This also leaves `gcc` installed for user convenience in installing Python
extensions, NPM modules, etc.
Blocks: 1208031
I made one last push to make sure this didn't kill Mac OS X 64:
  https://treeherder.mozilla.org/#/jobs?repo=try&revision=b6207cce0b9a&exclusion_profile=false
'course, the decision task hasn't run yet...
Blocks: 1208029
Looks like mozreview didn't recognize this as a distinct patch, and didn't give any in-bug indication that it required further review :(

This is leaving a tier-2 job (android) orange, and blocking ehsan's work on bug 1208029, so I'm redirecting r? to ted in hopes we can land this before the weekend.
Attachment #8651160 - Attachment is obsolete: true
Attachment #8665909 - Flags: review?(ted)
Duplicate of this bug: 1208029
Comment on attachment 8665909 [details] [diff] [review]
bug1189892-linux32.patch

Review of attachment 8665909 [details] [diff] [review]:
-----------------------------------------------------------------

::: testing/docker/centos6-build/system-setup.sh
@@ +31,5 @@
> +install xorg-x11-font*
> +
> +# lots of required packages that we build against.  We need the i686 and x86_64
> +# versions of each, along with -devel packages, and yum does a poor job of
> +# figuring out the interdependencies so we list all four.

Boy, that sure is terrible. Does yum not at least support specifying these like `alsa-lib-devel.{i686,x86_64}`?

@@ +436,5 @@
>  chown worker:worker /builds
>  
>  # remove packages installed for the builds above
>  yum shell -y <<'EOF'
> +remove gcc

Your patch description claims "leaves gcc installed", what's up with this?

::: testing/docker/desktop-build/REGISTRY
@@ +1,1 @@
> +quay.io/djmitche

Intentional?
Attachment #8665909 - Flags: review?(ted) → review+
Sadly, no:
  No package zlib.{i686,x86_64} available.

The gcc bit was a merge error of some sort -- thanks for spotting that!

And no, the use of quay.io was only for testing.
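Since the brace form is rejected, one alternative to hand-writing all four lines per library would be to generate them. This is only a sketch: `multilib_install_lines` is a hypothetical helper, and it assumes every listed package has a matching -devel package in both architectures, which is not true in general.

```shell
# Sketch: print one "install" line per package/architecture combination,
# in the form a `yum shell` script accepts.
# `multilib_install_lines` is a hypothetical helper name.
multilib_install_lines() {
    for pkg in "$@"; do
        for name in "$pkg" "$pkg-devel"; do
            for arch in i686 x86_64; do
                echo "install $name.$arch"
            done
        done
    done
}
```

The output could be piped into `yum shell -y`, followed by a `run` line to execute the transaction. The explicit list in system-setup.sh has the advantage of being greppable, so this may not be worth the indirection.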
I made some errors in the last push, around task configuration.  My bad, but this stuff is rather non-obvious!

This:
 - adds the 'linux' platform to `testing/taskcluster/tasks/branches/base_jobs.yml` so that it exists for all branches
 - removes the redundant `linux64` and `macosx64` entries in `testing/taskcluster/tasks/branches/try/job_flags.yml` (these might have been a merge error??)
 - refers to `linux` rather than `linux32` in `testing/taskcluster/tasks/branches/base_job_flags.yml`, causing it to run for all branches.

Sorry for the extra review.  On the plus side, green builds from the new image!
Attachment #8666758 - Flags: review?(ted)
Attachment #8666758 - Flags: review?(ted) → review+
Comment on attachment 8666758 [details] [diff] [review]
bug1189892-activate.patch

Review of attachment 8666758 [details] [diff] [review]:
-----------------------------------------------------------------

::: testing/taskcluster/tasks/branches/try/job_flags.yml
@@ -135,5 @@
> -    types:
> -      opt:
> -        task: tasks/builds/opt_linux64_clobber.yml
> -      debug:
> -        task: tasks/builds/dbg_linux64_clobber.yml

This hunk makes these builds not be clobber any more, AFAICT.  Is that desired?
Keywords: leave-open
(In reply to Ehsan Akhgari (don't ask for review please) from comment #68)
> Comment on attachment 8666758 [details] [diff] [review]
> bug1189892-activate.patch
> 
> Review of attachment 8666758 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> ::: testing/taskcluster/tasks/branches/try/job_flags.yml
> @@ -135,5 @@
> > -    types:
> > -      opt:
> > -        task: tasks/builds/opt_linux64_clobber.yml
> > -      debug:
> > -        task: tasks/builds/dbg_linux64_clobber.yml
> 
> This hunk makes these builds not be clobber any more, AFAICT.  Is that
> desired?

Needinfoing since this landed...
Flags: needinfo?(dustin)
Comment on attachment 8666758 [details] [diff] [review]
bug1189892-activate.patch

Review of attachment 8666758 [details] [diff] [review]:
-----------------------------------------------------------------

::: testing/taskcluster/tasks/branches/try/job_flags.yml
@@ -135,5 @@
> -    types:
> -      opt:
> -        task: tasks/builds/opt_linux64_clobber.yml
> -      debug:
> -        task: tasks/builds/dbg_linux64_clobber.yml

It doesn't; you can't see it in the patch, but there's another identical 'linux64' entry in this dictionary later in the YML file.  I know, it's a mess.
(for some reason my review didn't post the first time, sorry)
Flags: needinfo?(dustin)
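Duplicate keys like this are easy to miss, since many YAML loaders quietly keep only one of them. A naive sketch for flagging them follows; `dup_keys_at_indent` is a hypothetical helper, and the indentation depth to check (2 in the usage note) is an assumption about how job_flags.yml is laid out.

```shell
# Sketch: read YAML on stdin and print each mapping key that appears more
# than once at exactly N spaces of indentation under the same parent.
# Line-based and naive: no flow syntax, multi-line scalars, or tabs.
# `dup_keys_at_indent` is a hypothetical helper name.
dup_keys_at_indent() {
    awk -v n="$1" '
        /^ *(#|$)/ { next }                        # skip comments and blank lines
        {
            s = $0; sub(/^ +/, "", s)
            indent = length($0) - length(s)
            if (indent < n) split("", seen)        # left the parent mapping: reset
            if (indent == n && s ~ /^[^-:#][^:]*:/) {
                key = substr(s, 1, index(s, ":") - 1)
                if (key in seen) print key
                seen[key] = 1
            }
        }
    '
}
```

For example, `dup_keys_at_indent 2 < job_flags.yml` would list platform names defined twice, provided the platforms really sit at two spaces of indentation.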
(In reply to Dustin J. Mitchell [:dustin] from comment #71)
> Comment on attachment 8666758 [details] [diff] [review]
> bug1189892-activate.patch
> 
> Review of attachment 8666758 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> ::: testing/taskcluster/tasks/branches/try/job_flags.yml
> @@ -135,5 @@
> > -    types:
> > -      opt:
> > -        task: tasks/builds/opt_linux64_clobber.yml
> > -      debug:
> > -        task: tasks/builds/dbg_linux64_clobber.yml
> 
> It doesn't; you can't see it in the patch, but there's another identical
> 'linux64' entry in this dictionary later in the YML file.  I know, it's a
> mess.

Oops!  Thanks for looking anyway.  :-)
Component: General Automation → General