Closed Bug 1302763 Opened 8 years ago Closed 7 years ago

move docker images out of testing/docker into taskcluster/docker

Categories

(Firefox Build System :: Task Configuration, task)

task
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED
mozilla53

People

(Reporter: dustin, Assigned: CuriousLearner, Mentored)

Details

(Keywords: good-first-bug)

Attachments

(2 files, 3 obsolete files)

In the gecko source tree, the dockerfiles are described under testing/docker, but they are fairly taskcluster-specific, so let's move those to taskcluster/docker and adjust the image generation code to find them there.
Hi Dustin!

I'd like to take on this bug. Can you please let me know the IRC channel I can find you on?
Hi Sanyam!

Sure, you can find me in #tc-contributors, and I (or some of the others in there) would be happy to talk about this bug.
Assignee: nobody → sanyam.khurana01
Comment on attachment 8798149 [details]
Bug 1302763 - Move docker images out of testing/docker into taskcluster/docker;

https://reviewboard.mozilla.org/r/83686/#review82254

A few minor bits here:

  * All of the stuff under mobile/ should be reverted.
  * gecko.log should be removed
  * let's find out about AUTHORS (so maybe drop that bit for now..)
  * finally, you missed a spot in taskcluster/taskgraph/task/docker_image.py because it uses `os.path.join`, but that spot will be critical!

I triggered a try job for this, just for reference.  We should run another try job on the next revision too, before landing it.

::: AUTHORS:882
(Diff revision 1)
>  Ryan Cassin <rcassin@supernova.org>
>  Ryan Flint <rflint@dslr.net>
>  Ryan Jones <sciguyryan@gmail.com>
>  Ryan VanderMeulen <ryanvm@gmail.com>
>  Ryoichi Furukawa <oliver@1000cp.com>
> +Sanyam Khurana <sanyam.khurana01@gmail.com>

Huh, I don't know if I can r+ this.. I am not even in this file, so I'm not sure what the requirements are to be added.

::: mobile/android/tests/background/junit4/resources/feed_atom_planetmozilla.xml:682
(Diff revision 1)
>              <p>In a continuing effort to enable faster, more reliable, and more easily-run tests for TaskCluster components, Dustin landed support for an in-memory, credential-free mock of Azure Table Storage in the <a href="https://www.npmjs.com/package/azure-entities" target="_blank">azure-entities</a> package.  Together with the fake mock support he added to <a href="https://github.com/djmitche/taskcluster-lib-testing" target="_blank">taskcluster-lib-testing</a>, this allows tests for components like taskcluster-hooks to run without network access and without the need for any credentials, substantially decreasing the barrier to external contributions.</p>
>  
>              <p>All release promotion tasks are now signed by default. Thanks to Rail for his work here to help improve verifiability and chain-of-custody in our upcoming release process. (<a href="https://bugzil.la/1239682" target="_blank">https://bugzil.la/1239682</a>)
>                  Beetmover has been spotted in the wild! Jordan has been working on this new tool as part of our release promotion project. Beetmover helps move build artifacts from one place to another (generally between S3 buckets these days), but can also be extended to perform validation actions inline, e.g. checksums and anti-virus. (<a href="https://bugzil.la/1225899" target="_blank">https://bugzil.la/1225899</a>)</p>
>  
> -            <p>Dustin configured the “desktop-test” and “desktop-build” docker images to build automatically on push.  That means that you can modify the Dockerfile under `testing/docker`, push to try, and have the try job run in the resulting image, all without pushing any images.  This should enable much quicker iteration on tweaks to the docker images.  Note, however, that updates to the base OS images (ubuntu1204-build and centos6-build) still require manual pushes.</p>
> +            <p>Dustin configured the “desktop-test” and “desktop-build” docker images to build automatically on push.  That means that you can modify the Dockerfile under `taskcluster/docker`, push to try, and have the try job run in the resulting image, all without pushing any images.  This should enable much quicker iteration on tweaks to the docker images.  Note, however, that updates to the base OS images (ubuntu1204-build and centos6-build) still require manual pushes.</p>

I think this is testing data, so it probably shouldn't change -- it's representing a historical snapshot of that file.

::: mobile/android/tests/background/junit4/resources/feed_rss10_planetmozilla.xml:638
(Diff revision 1)
>              &lt;p&gt;In a continuing effort to enable faster, more reliable, and more easily-run tests for TaskCluster components, Dustin landed support for an in-memory, credential-free mock of Azure Table Storage in the &lt;a href=&quot;https://www.npmjs.com/package/azure-entities&quot; target=&quot;_blank&quot;&gt;azure-entities&lt;/a&gt; package.  Together with the fake mock support he added to &lt;a href=&quot;https://github.com/djmitche/taskcluster-lib-testing&quot; target=&quot;_blank&quot;&gt;taskcluster-lib-testing&lt;/a&gt;, this allows tests for components like taskcluster-hooks to run without network access and without the need for any credentials, substantially decreasing the barrier to external contributions.&lt;/p&gt;
>  
>              &lt;p&gt;All release promotion tasks are now signed by default. Thanks to Rail for his work here to help improve verifiability and chain-of-custody in our upcoming release process. (&lt;a href=&quot;https://bugzil.la/1239682&quot; target=&quot;_blank&quot;&gt;https://bugzil.la/1239682&lt;/a&gt;)
>              Beetmover has been spotted in the wild! Jordan has been working on this new tool as part of our release promotion project. Beetmover helps move build artifacts from one place to another (generally between S3 buckets these days), but can also be extended to perform validation actions inline, e.g. checksums and anti-virus. (&lt;a href=&quot;https://bugzil.la/1225899&quot; target=&quot;_blank&quot;&gt;https://bugzil.la/1225899&lt;/a&gt;)&lt;/p&gt;
>  
> -            &lt;p&gt;Dustin configured the “desktop-test” and “desktop-build” docker images to build automatically on push.  That means that you can modify the Dockerfile under `testing/docker`, push to try, and have the try job run in the resulting image, all without pushing any images.  This should enable much quicker iteration on tweaks to the docker images.  Note, however, that updates to the base OS images (ubuntu1204-build and centos6-build) still require manual pushes.&lt;/p&gt;
> +            &lt;p&gt;Dustin configured the “desktop-test” and “desktop-build” docker images to build automatically on push.  That means that you can modify the Dockerfile under `taskcluster/docker`, push to try, and have the try job run in the resulting image, all without pushing any images.  This should enable much quicker iteration on tweaks to the docker images.  Note, however, that updates to the base OS images (ubuntu1204-build and centos6-build) still require manual pushes.&lt;/p&gt;

same here

::: testing/marionette/client/gecko.log:1
(Diff revision 1)
> +1472031320219	Marionette	INFO	Listening on port 2828

I don't think you meant to include this?
Attachment #8798149 - Flags: review?(dustin) → review-
Comment on attachment 8798149 [details]
Bug 1302763 - Move docker images out of testing/docker into taskcluster/docker;

https://reviewboard.mozilla.org/r/83686/#review82254

> Huh, I don't know if I can r+ this.. I am not even in this file, so I'm not sure what the requirements are to be added.

It is still there as you told.
Comment on attachment 8798149 [details]
Bug 1302763 - Move docker images out of testing/docker into taskcluster/docker;

https://reviewboard.mozilla.org/r/83686/#review82448

This much looks good, just one minor detail to fix up.  I've also triggered a try run, so we should make sure that succeeds before pushing.  The last one failed, as predicted:

https://treeherder.mozilla.org/#/jobs?repo=try&revision=c926278ac16d&selectedJob=28627208

because of the missing `docker_image.py` change.

::: taskcluster/taskgraph/transforms/task.py:135
(Diff revision 3)
>          # in-tree, then a dependency will be created automatically.  This is
>          # generally `desktop-test`, or an image that acts an awful lot like it.
>          Required('docker-image'): Any(
>              # a raw Docker image path (repo/image:tag)
>              basestring,
> -            # an in-tree generated docker image (from `testing/docker/<name>`)
> +            # an in-tree generated docker image (from `taskcluster/docker<name>`)

you dropped a `/` here
Attachment #8798149 - Flags: review?(dustin) → review+
Comment on attachment 8798149 [details]
Bug 1302763 - Move docker images out of testing/docker into taskcluster/docker;

https://reviewboard.mozilla.org/r/83686/#review82566

Just submitted the try job -- sorry for the delay.
Comment on attachment 8798149 [details]
Bug 1302763 - Move docker images out of testing/docker into taskcluster/docker;

https://reviewboard.mozilla.org/r/83686/#review82640

Same issue as before:

https://treeherder.mozilla.org/logviewer.html#?job_id=28721167&repo=try#L235
IOError: [Errno 2] No such file or directory: u'/home/worker/checkouts/gecko/testing/docker/REGISTRY'
Attachment #8798149 - Flags: review+
Comment on attachment 8798149 [details]
Bug 1302763 - Move docker images out of testing/docker into taskcluster/docker;

https://reviewboard.mozilla.org/r/83686/#review82640

Hi Dustin, I've fixed the issue. Can you please trigger a push to try.
Comment on attachment 8798837 [details]
Bug 1302763 - Move docker images out of testing/docker into taskcluster/docker;

https://reviewboard.mozilla.org/r/84220/#review82846

The decision task works, but as we discussed in irc, now all those python files are under taskcluster/ and need to be de-linted
Attachment #8798837 - Flags: review?(dustin)
Comment on attachment 8798973 [details]
Bug 1302763 - Linting test files under taskcluster/docker;

https://reviewboard.mozilla.org/r/84284/#review82918

Looks good.  I've triggered another try job, so let's see how that turns out.

::: taskcluster/docker/phone-builder/tests/test_validation.py:8
(Diff revision 1)
>  import unittest
>  import sys
>  import yaml
> -sys.path.append('../bin')
> +
>  from validate_task import check_task
> +sys.path.append('../bin')

This will probably be a problem, as `validate_task` is in `../bin`.  On the other hand, I believe bug 1306641 deletes this file, so let's keep it.

::: taskcluster/docker/rust-build/tcbuild.py:187
(Diff revision 1)
>      finally:
>          os.chdir(oldcwd)
>  
> +
>  def update_manifest(artifact, manifest, local_gecko_clone):
> -    platform = linux
> +    platform = 'linux'

Guh, I'm guessing that never worked..
Attachment #8798973 - Flags: review?(dustin)
> ::: taskcluster/docker/rust-build/tcbuild.py:187
> (Diff revision 1)
> >      finally:
> >          os.chdir(oldcwd)
> >  
> > +
> >  def update_manifest(artifact, manifest, local_gecko_clone):
> > -    platform = linux
> > +    platform = 'linux'
> 
> Guh, I'm guessing that never worked..

Rillian, mind having a look?
Flags: needinfo?(giles)
Oh, I just saw your irc comment that there's still one more flake8 error, sorry..
Yeah, looks like that is broken. The code is never called, so I hadn't tripped over it before. Your change in correct.
Flags: needinfo?(giles)
Hi Dustin & Ralph,


Only one file is left which is: `taskcluster/docker/rust-build/tcbuild.py` which is giving linting errors as mentioned below:

```
TEST-UNEXPECTED-ERROR | /Users/sanyam/personal/mozilla/mozilla-central/taskcluster/docker/rust-build/tcbuild.py:14:1 | module level import not at top of file (E402)
TEST-UNEXPECTED-ERROR | /Users/sanyam/personal/mozilla/mozilla-central/taskcluster/docker/rust-build/tcbuild.py:15:1 | module level import not at top of file (E402)
TEST-UNEXPECTED-ERROR | /Users/sanyam/personal/mozilla/mozilla-central/taskcluster/docker/rust-build/tcbuild.py:16:1 | module level import not at top of file (E402)
TEST-UNEXPECTED-ERROR | /Users/sanyam/personal/mozilla/mozilla-central/taskcluster/docker/rust-build/tcbuild.py:17:1 | module level import not at top of file (E402)
TEST-UNEXPECTED-ERROR | /Users/sanyam/personal/mozilla/mozilla-central/taskcluster/docker/rust-build/tcbuild.py:18:1 | module level import not at top of file (E402)
TEST-UNEXPECTED-ERROR | /Users/sanyam/personal/mozilla/mozilla-central/taskcluster/docker/rust-build/tcbuild.py:19:1 | module level import not at top of file (E402)
TEST-UNEXPECTED-ERROR | /Users/sanyam/personal/mozilla/mozilla-central/taskcluster/docker/rust-build/tcbuild.py:20:1 | module level import not at top of file (E402)
TEST-UNEXPECTED-ERROR | /Users/sanyam/personal/mozilla/mozilla-central/taskcluster/docker/rust-build/tcbuild.py:21:1 | module level import not at top of file (E402)
TEST-UNEXPECTED-ERROR | /Users/sanyam/personal/mozilla/mozilla-central/taskcluster/docker/rust-build/tcbuild.py:22:1 | module level import not at top of file (E402)
TEST-UNEXPECTED-ERROR | /Users/sanyam/personal/mozilla/mozilla-central/taskcluster/docker/rust-build/tcbuild.py:23:1 | module level import not at top of file (E402)
TEST-UNEXPECTED-ERROR | /Users/sanyam/personal/mozilla/mozilla-central/taskcluster/docker/rust-build/tcbuild.py:200:100 | line too long (173 > 99 characters) (E501)
TEST-UNEXPECTED-ERROR | /Users/sanyam/personal/mozilla/mozilla-central/taskcluster/docker/rust-build/tcbuild.py:202:100 | line too long (115 > 99 characters) (E501)
TEST-UNEXPECTED-ERROR | /Users/sanyam/personal/mozilla/mozilla-central/taskcluster/docker/rust-build/tcbuild.py:204:100 | line too long (130 > 99 characters) (E501)
```

This file uses first import statement as `from __future__ import print_function` and this should be the first line for any python file (if it is there)

So, I'm confused as how should I clear all the linting errors for this file. Can you suggest something, or we should turn linting off for this file?
Flags: needinfo?(giles)
Flags: needinfo?(dustin)
I answered this in irc -- the issue is not `from __future__ import print_function` -- that's an import statement, so it's fine.  The issue is the `requests.packages.urllib3.disable_warnings()`, which is not an import statement but comes before other import statements.  I expect that can simply be moved after the import statements.
Flags: needinfo?(dustin)
You can just remove the disable_warnings() call. It doesn't seem to be needed any more.
Flags: needinfo?(giles)
Attachment #8798837 - Attachment is obsolete: true
Attachment #8798973 - Attachment is obsolete: true
Comment on attachment 8798149 [details]
Bug 1302763 - Move docker images out of testing/docker into taskcluster/docker;

https://reviewboard.mozilla.org/r/83686/#review86412

Nice work!
Attachment #8798149 - Flags: review?(dustin) → review+
Pushed by dmitchell@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/d15798b73b9e
Move docker images out of testing/docker into taskcluster/docker; r=dustin
I suspect what happened here is that the generated docker images were invalid:
  exec: "/home/worker/bin/run-task": stat /home/worker/bin/run-task: no such file or directory

I'm not sure why that would be the case!  You can use 
  ./mach taskcluster-load-image --task-id YonAc5GRSk6aXBNk_6pSEg
to load that docker image locally and take a look at it.

I don't think the errors have anything to do with your hg configuration, but re-landing the patch can be a little tricky (and explains why you are getting errors from mozreview).
Sanyam, any chance you can take a look at this?
Hi Dustin!

Yeah, I was just learning Docker to analyse this! I'll be reaching over to you on IRC if I'm stuck :)
Flags: needinfo?(sanyam.khurana01)
Hi Dustin,

when I try to load the docker image; I get 404. Here are the logs: https://pastebin.mozilla.org/8932284
Try using a newer taskId?  That task executed before we switched from .tar to .zst files.

For example, VPgQBKdYSuiE_yXoGsopww
Sorry we keep missing each other in irc!

I think I haven't fully focused on the question, so I'm maybe not helping as much as I could.  The last comment was especially silly because that's not the task you were interested in!

I downloaded the image in task YonAc5GRSk6aXBNk_6pSEg, and it appears to be almost entirely empty.  Its contents are:

dev/
dev/console
dev/pts/
dev/shm/
etc/
etc/hostname
etc/hosts
etc/mtab
etc/resolv.conf
proc/
repositories
sys/

which is to say, basically empty.  Weird!  Looking at the logs from the task, that makes no sense :(

I'll keep working..
So, I rebased this patch to the latest from mozilla-central, and will give it a try push to see what happens.
Comment on attachment 8817478 [details]
Bug 1302763 - Move docker images out of testing/docker into taskcluster/docker

Do you see anything I missed or did wrong in the rebase?
Attachment #8817478 - Flags: review?(dustin) → review?(sanyam.khurana01)
Attachment #8817478 - Flags: review?(dustin)
Seems like the try push was not successful. It is showing:

> Exception: image directory does not exist: /home/worker/checkouts/gecko/testing/docker/desktop-test

I'm looking into it.
Did you look at the latest try push?

  https://treeherder.mozilla.org/#/jobs?repo=try&revision=dc575b886e98

It has some lint errors (that I already fixed) but all of the images built correctly.  Now you have me worried that I'm missing something :)
Comment on attachment 8817478 [details]
Bug 1302763 - Move docker images out of testing/docker into taskcluster/docker

https://reviewboard.mozilla.org/r/97726/#review99896

Sorry, it was confusing since the try push was not reflected here. Now it looks good to me :)
Attachment #8817478 - Flags: review?(sanyam.khurana01) → review+
Attachment #8819970 - Attachment is obsolete: true
Comment on attachment 8817478 [details]
Bug 1302763 - Move docker images out of testing/docker into taskcluster/docker

https://reviewboard.mozilla.org/r/97726/#review99952
Attachment #8817478 - Flags: review?(dustin) → review+
Pushed by dmitchell@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/fe7303de56be
Move docker images out of testing/docker into taskcluster/docker; r=CuriousLearner
Backed out for failing webdriver tests on Linux x64 debug (geckodriver.manifest fetch failed):

https://hg.mozilla.org/integration/autoland/rev/9d0c5328c89782f32bb458487278944bf219a417

Push with failures: https://treeherder.mozilla.org/#/jobs?repo=autoland&revision=fe7303de56be76914f0eebe008b0ef02c803b6cd
Failure log: https://treeherder.mozilla.org/logviewer.html#?job_id=8124094&repo=autoland

[task 2016-12-19T19:59:24.695329Z] 19:59:24     INFO - Calling ['/usr/bin/python2.7', '/home/worker/checkouts/gecko/testing/docker/recipes/tooltool.py', '--url', 'https://api.pub.build.mozilla.org/tooltool/', 'fetch', '-m', '/home/worker/workspace/build/tests/config/tooltool-manifests/linux64/geckodriver.manifest', '-o', '-c', '/builds/tooltool_cache'] with output_timeout 600
[task 2016-12-19T19:59:24.707374Z] 19:59:24     INFO -  /usr/bin/python2.7: can't open file '/home/worker/checkouts/gecko/testing/docker/recipes/tooltool.py': [Errno 2] No such file or directory
[task 2016-12-19T19:59:24.713803Z] 19:59:24    ERROR - Return code: 2
[task 2016-12-19T19:59:24.714230Z] 19:59:24    FATAL - Tooltool /home/worker/workspace/build/tests/config/tooltool-manifests/linux64/geckodriver.manifest fetch failed!
Flags: needinfo?(sanyam.khurana01)
Boy, this is a hard one to get landed!

testing/mozharness/scripts/web_platform_tests.py
  192     def fetch_geckodriver(self):
..
  217         self.tooltool_fetch(
  218             manifest=tooltool_path,
  219             output_dir=dirs['abs_work_dir'],
  220             cache=c.get('tooltool_cache')
  221         )

testing/mozharness/mozharness/mozilla/tooltool.py
 47     def tooltool_fetch(self, manifest,
 48                        output_dir=None, privileged=False, cache=None):
 49         """docstring for tooltool_fetch"""
 50         # Use vendored tooltool.py if available.
 51         if self.topsrcdir:
 52             cmd = [
 53                 sys.executable,
 54                 os.path.join(self.topsrcdir, 'testing', 'docker', 'recipes',
 55                                 'tooltool.py')
 56             ]
 57         elif self.config.get("download_tooltool"):
 58             cmd = [sys.executable, self._fetch_tooltool_py()]
 59         else:
 60             cmd = self.query_exe('tooltool.py', return_type='list')

Sanyam, do you want to make up a new patch with that fixed?
Sorry, I'll take care of that (I'd rather get this landed sooner than later)

(sandbox) dustin@lamport ~/p/m-c (bug1302763) $ grep -r 'testing.*docker' testing/ taskcluster/
(sandbox) dustin@lamport ~/p/m-c (bug1302763) $ 

so hopefully we've gotten everything!
Flags: needinfo?(sanyam.khurana01)
Ugh, and then I forgot to land it.  I just rebased, and I'm doing one more try run.
OK, that's green.  SHIP IT!
https://hg.mozilla.org/integration/mozilla-inbound/rev/f374e4333d3a17f72d89063670115d71a9643a5f
Bug 1302763 - Move docker images out of testing/docker into taskcluster/docker; r=dustin r=CuriousLearner
https://hg.mozilla.org/mozilla-central/rev/f374e4333d3a
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla53
Product: TaskCluster → Firefox Build System
You need to log in before you can comment on or make changes to this bug.