Closed Bug 1125973 Opened 8 years ago Closed 7 years ago

Docker images for Android builds

Categories

(Infrastructure & Operations :: RelOps: General, task)

task
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: dustin, Assigned: dustin)

References

Details

Attachments

(4 files, 4 obsolete files)

Let's land some Docker profiles.
Attached patch base-android-build.patch (obsolete) — Splinter Review
OK, this is a *really* rough draft, with lots of TODO and XXX and stuff that's just not done.

Here's what it does:
 * introduce two new docker "profiles" for Android images; one (base-android-build) installs a lot of packages (and is thus a pretty big layer) and the other (android-build) installs some small stuff (scripts, etc.) for doing android builds.  The idea is that the stuff in android-build will change a lot more often, so we can avoid a proliferation of big layers full of packages by shipping that in a small layer.
 * add build.sh, which knows how to check out mozilla-central, tools (although this is currently unused), and mozharness, then run a mozharness script.  This is similar to, although quite a bit simpler than, the B2G stuff.
 * add build.py.  This is a mozharness script that I hope to eventually store in-tree under builds/mozharness, but putting it here lets me start drafting it now.

I want to do all of this in the most natural, straightforward way.  I also want to avoid cargo-culting as much as possible -- stuff should have a reason and not just be copy/pasted because that's what we did with Buildbot.

Some of the things I'm interested in getting feedback on:
 * Organization of the docker profiles -- should I shuffle things between the two, or into base-build?
 * Build shell scripts - are the differences from how this is done in B2G OK?
 * Mozharness script -- I'm a total mozharness n00b and all of the existing Firefox mozharness scripts seemed deeply integrated with Buildbot.  There are no Android mozharness scripts (at least, not used in production - there are scripts/*android*.py but they're not used by Buildbot).  So I wrote one from scratch based on the stuff I saw Buildbot do, all of it plain and logical.  Should I continue to develop this script, or try to base my script on some existing scripts and/or mixins?  If so, which?

Generally, I think I've made enough stabs in the dark here that I need an injection of expertise before proceeding.
Attachment #8554772 - Flags: feedback?(jlal)
Attachment #8554772 - Flags: feedback?(catlee)
Attachment #8554772 - Flags: feedback?(armenzg)
Comment on attachment 8554772 [details] [diff] [review]
base-android-build.patch

Review of attachment 8554772 [details] [diff] [review]:
-----------------------------------------------------------------

Nothing outrageous. Thanks Dustin!

::: testing/docker/android-build/bin/build.sh
@@ +18,5 @@
> +tc-vcs checkout-revision build $GECKO_HEAD_REPOSITORY $GECKO_HEAD_REV $GECKO_HEAD_REV
> +
> +# TODO: don't use this?
> +tc-vcs clone $TOOLS_REPOSITORY tools
> +tc-vcs checkout-revision tools $TOOLS_REPOSITORY $TOOLS_REV $TOOLS_REV

You can let mozharness checkout tools if you actually need it.
Attachment #8554772 - Flags: feedback?(armenzg) → feedback+
Comment on attachment 8554772 [details] [diff] [review]
base-android-build.patch

Review of attachment 8554772 [details] [diff] [review]:
-----------------------------------------------------------------

overall looks great, I think you're on the right path so far.

::: testing/docker/android-build/bin/build.py
@@ +69,5 @@
> +
> +    def build(self):
> +        env = self.query_env()
> +        make = self.query_exe("make")
> +        # TODO: add MOZ_BUILD_DATE to make invocation

hmmm...this is going to be tricky. what we do right now in buildbot is to create the buildid in the scheduler so all builds triggered by that scheduler share the same buildid, which is passed in as a property and used to set MOZ_BUILD_DATE. Can we do that in TC?

@@ +71,5 @@
> +        env = self.query_env()
> +        make = self.query_exe("make")
> +        # TODO: add MOZ_BUILD_DATE to make invocation
> +        if self.run_command(
> +                [make, "-f", "client.mk", "build"],

We should aim to use mach for this (and the steps below). I'm not sure what state mach support for android builds is, please check with Jordan and Mike.

@@ +79,5 @@
> +            self.fatal("Build failed!")
> +
> +        env = self.query_build_env()
> +        make = self.query_exe("make")
> +        for target in ['buildsymbols', 'package-test', 'package']:

In buildbot the packages get signed at this stage. We'll need to figure out how to handle signing in TC.

::: testing/docker/android-build/bin/build.sh
@@ +20,5 @@
> +# TODO: don't use this?
> +tc-vcs clone $TOOLS_REPOSITORY tools
> +tc-vcs checkout-revision tools $TOOLS_REPOSITORY $TOOLS_REV $TOOLS_REV
> +
> +# TODO: this build.py should be in-tree, perhaps under build/mozharness?

there are ongoing discussions about where mozharness should live. Jordan and Armen probably have more context than I do, but I think at the very least having build.py in-tree is a good idea.

::: testing/docker/base-android-build/system-setup.sh
@@ +47,5 @@
> +groupinstall Base
> +
> +# android ndk/sdk are installed from tooltool
> +# install android-sdk16
> +# install android-ndk8

yeah, definitely not needed from rpms; we just got rid of them in bug 1125354

@@ +91,5 @@
> +install ccache
> +
> +# the mozilla build of vcs tools (just newer versions than in centos)
> +install mozilla-git
> +install mozilla-python27-mercurial

are these still required if we're using tc-vcs?
Attachment #8554772 - Flags: feedback?(catlee) → feedback+
Comment on attachment 8554772 [details] [diff] [review]
base-android-build.patch

Review of attachment 8554772 [details] [diff] [review]:
-----------------------------------------------------------------

::: testing/docker/android-build/bin/build.py
@@ +69,5 @@
> +
> +    def build(self):
> +        env = self.query_env()
> +        make = self.query_exe("make")
> +        # TODO: add MOZ_BUILD_DATE to make invocation

Not completely sure if this answers the question but there are a few options:

 - the push has access to the pushid and project (which could potentially be used)
 - the taskGroupId is the graph id, which is shared by all builds in a push; this can be fetched using the $TASK_ID environment variable and then
   parsing the value out of the JSON response from queue.taskcluster.net/v1/task/$TASK_ID (we can make this easier by exposing $TASK_GROUP_ID, perhaps?)
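To make that suggestion concrete, here is a rough Python sketch of parsing the taskGroupId out of that JSON response; the helper names and the use of urllib are my own, not anything in the patch (and 2015-era code would have used urllib2 on Python 2):

```python
import json
import os
import urllib.request  # urllib2 on Python 2


def get_task_group_id(task_json):
    """Parse the taskGroupId out of a queue task definition (a JSON string)."""
    return json.loads(task_json)["taskGroupId"]


def fetch_task_group_id():
    """Fetch this task's definition from the queue and return its taskGroupId."""
    task_id = os.environ["TASK_ID"]
    url = "https://queue.taskcluster.net/v1/task/%s" % task_id
    with urllib.request.urlopen(url) as resp:
        return get_task_group_id(resp.read())
```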

::: testing/docker/android-build/bin/build.sh
@@ +11,5 @@
> +test $TARGET
> +
> +# get our repos installed
> +tc-vcs clone $MOZHARNESS_REPOSITORY mozharness
> +tc-vcs checkout-revision mozharness $MOZHARNESS_REPOSITORY $MOZHARNESS_REV $MOZHARNESS_REV

Note: mozharness has support for tc-vcs as a vcs tool that you can use if that is somehow easier than the shell scripts
Attachment #8554772 - Flags: feedback?(jlal) → feedback+
Yep, we can set MOZ_BUILD_DATE in the decision task and pass that along, or as James said base that on some of the other good identifiers TC gives us.

tc-vcs will use whatever 'git' and 'hg' it finds in PATH, which includes those in /tools (we set up the PATH correctly elsewhere).

I don't mind shell scripts at all, and IMHO mozharness is useful for stuff that changes frequently and needs to be in-tree.  Checking out the tree itself doesn't qualify.

Thanks for the fb+'s -- I'll redraft and come back looking for r+'s :)
neato! bear with me as I braindump. please argue where you disagree :)

> ::: testing/docker/android-build/bin/build.py

I'm just sanity checking the path setup for this. if I can read diffs correctly, you are going to create a root 'testing' dir in mh? Any reason to break away from convention of mozharness/mozharness/mozilla/docker/* (for internal bits) and mozharness/scripts/android_build.py (for the script) ?


> @@ +69,5 @@
> > +
> > +    def build(self):
> > +        env = self.query_env()
> > +        make = self.query_exe("make")
> > +        # TODO: add MOZ_BUILD_DATE to make invocation
> 
> hmmm...this is going to be tricky. what we do right now in buildbot is to
> create the buildid in the scheduler so all builds triggered by that
> scheduler share the same buildid, which is passed in as a property and used
> to set MOZ_BUILD_DATE. Can we do that in TC?


buildbase.py holds a BuildScript class that desktop ff uses heavily. Where possible, we should re-use or tweak this class so as not to fragment or re-invent the wheel.

the logic flow here in this case is: we allow for some sort of buildid to be passed down (in the buildbot case, that's via a buildbot prop stemming from the scheduler).  There are some scenarios where the buildid may be None (maybe a manual force build without passing a custom buildid?).  In that case, I get mh to create one: http://mxr.mozilla.org/build/source/mozharness/mozharness/mozilla/building/buildbase.py?rev=af1461521190#662

we could modify that to allow for tc to inject a push-wide id or else just allow mh to create different ones for each job. You can see how I pass that to MOZ_BUILD_DATE here: http://mxr.mozilla.org/build/source/mozharness/mozharness/mozilla/building/buildbase.py?rev=af1461521190#737 (note: query_build_env has also already been done here)
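A minimal standalone sketch of that fallback logic (reading the injected id from the MOZ_BUILD_DATE environment variable is my assumption here, as a stand-in for the buildbot property the real buildbase.py consults):

```python
import os
import time


def query_buildid():
    """Return the buildid injected by the scheduler/decision task (read from
    MOZ_BUILD_DATE in the environment, standing in for a buildbot property),
    or generate a fresh one in the usual YYYYMMDDHHMMSS format."""
    buildid = os.environ.get("MOZ_BUILD_DATE")
    if buildid:
        return buildid
    return time.strftime("%Y%m%d%H%M%S", time.localtime())
```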




> 
> @@ +71,5 @@
> > +        env = self.query_env()
> > +        make = self.query_exe("make")
> > +        # TODO: add MOZ_BUILD_DATE to make invocation
> > +        if self.run_command(
> > +                [make, "-f", "client.mk", "build"],
> 
> We should aim to use mach for this (and the steps below). I'm not sure what
> state mach support for android builds is, please check with Jordan and Mike.

from mh side, this is the easy part:
s/'make -f client.mk build'/http://mxr.mozilla.org/build/source/mozharness/mozharness/mozilla/building/buildbase.py?rev=af1461521190#1388

we should also get mach to take care of the other make targets below: 'buildsymbols', 'package-test', and 'package'. That is done (in desktop world) by passing a 'MOZ_AUTOMATION': 1 in the build env.

From there, you'll need to beg mshal to help you set up the in tree parts. It should just be some mozconfig tweaking in a similar fashion to desktop.
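In other words, the desktop-style approach is to flip on MOZ_AUTOMATION in the build environment and let mach drive the extra targets itself. A hedged sketch of what that could look like outside a real mozharness class (the function names are mine):

```python
import os
import subprocess


def automation_env(base_env):
    """Copy the environment and set MOZ_AUTOMATION so the build system
    runs the buildsymbols/package-test/package steps itself."""
    env = dict(base_env)
    env["MOZ_AUTOMATION"] = "1"
    return env


def run_automation_build(srcdir):
    """Run `mach build` in the source directory with automation enabled."""
    return subprocess.call(["./mach", "build"], cwd=srcdir,
                           env=automation_env(os.environ))
```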

> 
> @@ +79,5 @@
> > +            self.fatal("Build failed!")
> > +
> > +        env = self.query_build_env()
> > +        make = self.query_exe("make")
> > +        for target in ['buildsymbols', 'package-test', 'package']:
> 
> In buildbot the packages get signed at this stage. We'll need to figure out
> how to handle signing in TC.

expanding on this, for mh jobs that need signing we use a subclass of ScriptFactory: http://mxr.mozilla.org/build/source/buildbotcustom/process/factory.py?rev=52856513ac37#5667

mh can determine all the bits, signing_server + cmd: but I guess we will need to figure out the token part in tc: http://mxr.mozilla.org/build/source/mozharness/mozharness/mozilla/signing.py?rev=36c2a89a53bf#126


> 
> ::: testing/docker/android-build/bin/build.sh
> @@ +20,5 @@
> > +# TODO: don't use this?
> > +tc-vcs clone $TOOLS_REPOSITORY tools
> > +tc-vcs checkout-revision tools $TOOLS_REPOSITORY $TOOLS_REV $TOOLS_REV
> > +
> > +# TODO: this build.py should be in-tree, perhaps under build/mozharness?
> 
> there are ongoing discussions about where mozharness should live. Jordan and
> Armen probably have more context than I do, but I think at the very least
> having build.py in-tree is a good idea.

so far, 0 scripts live in tree. That doesn't mean we should avoid the extra effort; you'll just have nothing to go by.

IIUC, if you have a mh checkout outside of the tree on a tc slave, we could in theory just download the script from in-tree and call it (you'll just have to play with sys.path and let the script know where to find the mh internals).  Armen has many plans set in stages for this, so we should check with him to stick with an agreed convention.
The thing that makes moving mozharness configs and scripts in-tree hard is that Buildbot builds expect mozharness to do the checkouts.  So there's a chicken-and-egg problem.

Within taskcluster, that's not a problem -- tc-vcs does a darn good job of checking out repos, using all manner of resiliency and caching.  So we just check out the appropriate MH repo and the appropriate mozilla-central repo, and invoke the MH script.
(In reply to Chris AtLee [:catlee] from comment #3)
> > +# the mozilla build of vcs tools (just newer versions than in centos)
> > +install mozilla-git
> > +install mozilla-python27-mercurial
> 
> are these still required if we're using tc-vcs?


Yes, tc-vcs relies on reasonably up-to-date vc tools.
I meant more the mozilla-specific packages, rather than git/python2.7/hg themselves.
Attached file MozReview Request: bz://1125973/dustin (obsolete) —
/r/7807 - Bug 1125973: create a base docker image for android builds
/r/7901 - Bug 1125973: create a build docker image for android

Pull down these commits:

hg pull -r 36f98d1f52f7bc51981cf2fbd4ae0e2a69fcaeeb https://reviewboard-hg.mozilla.org/gecko/
Comment on attachment 8599510 [details]
MozReview Request: bz://1125973/dustin

Here's what I have going so far, just as an FYI.  As we just discussed, I'm going to try to implement something similar based on https://github.com/mrrrgn/ff64-build/blob/master/Dockerfile instead.
Attachment #8599510 - Flags: feedback?(winter2718)
https://reviewboard.mozilla.org/r/7805/#review6657

::: testing/docker/android-build/bin/build.py:1
(Diff revision 1)
> +#!/usr/bin/env python

Ignore this file - it shouldn't be here.

::: testing/docker/android-build/bin/build.sh:18
(Diff revision 1)
> +PYTHONPATH=mozharness python build/build/mozharness/build_android.py \

This refers to the in-tree build script (not included in this review)
(In reply to Chris AtLee [:catlee] from comment #9)
> I meant more the mozilla-specific packages, rather than git/python2.7/hg
> themselves.

If, as Morgan's working on, we switch to building on up-to-date Ubuntu, then this shouldn't be a problem.
Comment on attachment 8599510 [details]
MozReview Request: bz://1125973/dustin

I'm following your work on bitbucket, and using it for my own things. Looks great so far, I'll send over any changes I make.
Attachment #8599510 - Flags: feedback?(winter2718) → feedback+
Attached file MozReview Request: bz://1125973/dustin (obsolete) —
/r/8149 - Bug 1125973: don't sign if MOZ_SIGNING_SERVERS is not set

Pull down this commit:

hg pull -r 1f710a97bae6c161324988d64f68d8f3dfb1c49d https://reviewboard-hg.mozilla.org/build-mozharness
Attachment #8601554 - Flags: review?(mshal)
This was necessary to be able to disable signing in the docker image (where it's not currently available).
Comment on attachment 8601554 [details]
MozReview Request: bz://1125973/dustin

https://reviewboard.mozilla.org/r/8147/#review6915

Seems like we should fix query_moz_sign_cmd to use self.warning instead of self.fatal, though that would require fixing all of the callers to handle query_moz_sign_cmd returning an empty command. I don't want to block you on fixing all those scripts, though :)
Attachment #8601554 - Flags: review?(mshal) → review+
https://reviewboard.mozilla.org/r/8147/#review6921

What if it returned ['true', 'fake-signing-command'] in those cases?
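Sketching that idea as a standalone function (the configured-signing branch is a hypothetical placeholder, not the real mixin code):

```python
def query_moz_sign_cmd(signing_servers):
    """Return a signing command, or a harmless no-op when signing is not
    configured (e.g. MOZ_SIGNING_SERVERS unset in the docker image)."""
    if not signing_servers:
        # 'true' exits 0 and ignores its arguments, so callers that blindly
        # append file paths and run the command still succeed, signing nothing
        return ["true", "fake-signing-command"]
    # hypothetical placeholder: the real mozharness code assembles the
    # signtool command line from the configured servers here
    return ["signtool", "--server", signing_servers]
```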
Comment on attachment 8601554 [details]
MozReview Request: bz://1125973/dustin

remote:   https://hg.mozilla.org/build/mozharness/rev/fe52c16631e4
Attachment #8601554 - Flags: checked-in+
I just had a successful build of Fennec in Taskcluster:
  http://ec2-52-7-104-61.compute-1.amazonaws.com:49157/log
There's lots still to do to make this work better (caching, artifacts, symbols, encrypt the relengapi token, etc.), but it's a nice milestone.
\o/ this is awesome
Attached patch bug1125973-images.patch (obsolete) — Splinter Review
Please feel free to spread the review love around here.  I'd like to get something landed so that mrrrgn and I can both iterate on it, so please also be explicit about things to fix before landing vs. things that could be done post-landing.
Attachment #8604920 - Flags: review?(garndt)
Comment on attachment 8604920 [details] [diff] [review]
bug1125973-images.patch

Review of attachment 8604920 [details] [diff] [review]:
-----------------------------------------------------------------

Looks good!  Just had a couple of comments, mostly for my own curiosity.  There might be some instances where the ordering in the dockerfile could prevent new layers from being created, but definitely nothing that would need to be fixed now.

::: testing/docker/desktop-build/bin/build.sh
@@ +82,5 @@
> +    # display could be opened. As long as we can open the display
> +    # tests should work.
> +    sleep 2 # we need to sleep so that Xvfb has time to startup
> +    xvinfo || if [ $? == 255 ]; then exit 255; fi
> +    export DISPLAY=:2

Curious, why is this done here and a few lines above?  Probably something subtle I'm missing.

::: testing/docker/desktop-build/oauth.txt
@@ +1,1 @@
> +taskcluster_clientId = None

Is this never populated?  Not sure what it was used for.

::: testing/docker/ubuntu-build/Dockerfile
@@ +10,5 @@
> +
> +# configure git and install tc-vcs
> +RUN git config --global user.email "nobody@mozilla.com" && \
> +    git config --global user.name "mozilla"
> +RUN npm install -g taskcluster-vcs || true

Not sure if you want to pin the tc-vcs version here.  Things are not changing as much, especially breaking changes, but we pin it in our docker images just for the sake of making sure things behave consistently.

@@ +17,5 @@
> +ADD           build-setup.sh   /tmp/build-setup.sh
> +RUN           bash /tmp/build-setup.sh
> +
> +# Builds need the share module enabled
> +ADD           hgrc /home/worker/.hgrc

Perhaps since this probably won't change at all, or at least not frequently, it should be one of the first lines of the dockerfile so its layer is never invalidated by changes above it.
Attachment #8604920 - Flags: review?(garndt) → review+
Just for clarity, regarding the review: anything I asked questions about or pointed out does not need to be fixed in this iteration and could be done later if you see benefits.
Thanks!

double-DISPLAY= is a merge error on my part, I think.  I'll fix up before merging.

oauth.txt is used for taskcluster credentials to upload the build results.  It's not needed if we're running inside TC, but the file must exist.  It's something we can remove once nothing runs in Buildbot.

I'll pin the taskcluster-vcs version before merging.

As for the order of the dockerfile -- we don't use the docker cache anyway (since it caches non-functional operations like 'apt-get update'), so the precise order doesn't matter.  I have designs on shifting around what happens in desktop-build vs. ubuntu-build, but that can wait until this has landed.

I'll carry forward the r+ for the two minor changes.
r=garndt carried forward
Attachment #8554772 - Attachment is obsolete: true
Attachment #8604920 - Attachment is obsolete: true
Attachment #8605375 - Flags: review+
Still lots to do (including the dockerfile reorg), but I'll be filing new bugs.
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Attachment #8601554 - Attachment is obsolete: true
Attachment #8599510 - Attachment is obsolete: true
Attachment #8619222 - Flags: review+