Factor builds et al. out of legacy kind

RESOLVED FIXED in mozilla51

Status

task
RESOLVED FIXED
3 years ago
Last year

People

(Reporter: dustin, Assigned: dustin)

Tracking

unspecified
mozilla51
Dependency tree / graph

Firefox Tracking Flags

(Not tracked)

Details

Attachments

(53 attachments, 6 obsolete attachments)

58 bytes, text/x-review-board-request
gps
: review+
Details
58 bytes, text/x-review-board-request
gps
: review+
Details
58 bytes, text/x-review-board-request
glandium
: review+
Details
58 bytes, text/x-review-board-request
Callek
: review+
Details
58 bytes, text/x-review-board-request
maja_zf
: review+
Details
58 bytes, text/x-review-board-request
mossop
: review+
Details
58 bytes, text/x-review-board-request
nalexander
: review+
Details
58 bytes, text/x-review-board-request
gps
: review+
Details
58 bytes, text/x-review-board-request
nalexander
: review+
Details
58 bytes, text/x-review-board-request
gps
: review+
Details
58 bytes, text/x-review-board-request
gps
: review+
Details
58 bytes, text/x-review-board-request
gps
: review+
Details
58 bytes, text/x-review-board-request
armenzg
: review+
Details
58 bytes, text/x-review-board-request
gps
: review+
Details
58 bytes, text/x-review-board-request
mshal
: review+
Details
58 bytes, text/x-review-board-request
gps
: review+
Details
58 bytes, text/x-review-board-request
garndt
: review+
Details
58 bytes, text/x-review-board-request
ochameau
: review+
Details
58 bytes, text/x-review-board-request
jmaher
: review+
Details
58 bytes, text/x-review-board-request
mshal
: review+
Details
58 bytes, text/x-review-board-request
wcosta
: review+
Details
58 bytes, text/x-review-board-request
gerard-majax
: review+
Details
58 bytes, text/x-review-board-request
gerard-majax
: review+
Details
58 bytes, text/x-review-board-request
wcosta
: review+
Details
58 bytes, text/x-review-board-request
gps
: review+
Details
58 bytes, text/x-review-board-request
mshal
: review+
Details
58 bytes, text/x-review-board-request
gps
: review+
Details
58 bytes, text/x-review-board-request
Callek
: review+
Details
58 bytes, text/x-review-board-request
gps
: review+
Details
58 bytes, text/x-review-board-request
gps
: review+
Details
58 bytes, text/x-review-board-request
jlund
: review+
Details
58 bytes, text/x-review-board-request
chmanchester
: review+
gps
: review+
Details
58 bytes, text/x-review-board-request
sfink
: review+
Details
58 bytes, text/x-review-board-request
Callek
: review+
Details
58 bytes, text/x-review-board-request
ahal
: review+
gps
: review+
Details
58 bytes, text/x-review-board-request
ted
: review+
Details
58 bytes, text/x-review-board-request
mshal
: review+
Details
58 bytes, text/x-review-board-request
Ehsan
: review+
Details
58 bytes, text/x-review-board-request
gps
: review+
Details
58 bytes, text/x-review-board-request
sfink
: review+
Details
58 bytes, text/x-review-board-request
gerard-majax
: review+
Details
58 bytes, text/x-review-board-request
maja_zf
: review+
Details
58 bytes, text/x-review-board-request
nalexander
: review+
Details
58 bytes, text/x-review-board-request
mshal
: review+
Details
58 bytes, text/x-review-board-request
pmoore
: review+
gps
: review+
Details
58 bytes, text/x-review-board-request
sfink
: review+
Details
58 bytes, text/x-review-board-request
sfink
: review+
Details
58 bytes, text/x-review-board-request
gerard-majax
: review+
Details
58 bytes, text/x-review-board-request
sfink
: review+
Details
58 bytes, text/x-review-board-request
mshal
: review+
Details
58 bytes, text/x-review-board-request
Callek
: review+
Details
58 bytes, text/x-review-board-request
maja_zf
: review+
gps
: review+
Details
58 bytes, text/x-review-board-request
sfink
: review+
Details
The follow-on to bug 1281004 -- factor out the rest of the legacy kind.
Blocks: 1277682
Depends on: 1286652
Progress is being made:

  https://hg.mozilla.org/users/dmitchell_mozilla.com/mozilla-central/rev/c4617e936bb5

this is currently generating almost all builds that actually build something, with the exception of Android x86 (I just haven't gotten to it yet).  I'll probably also roll mulet in there.  Note that "all" includes win32/64, including support for generic-worker and the mozharness-invoked-directly approach.

The remaining tasks that aren't included are as follows.  I will probably break most of these out into distinct kinds, since they are not builds, per se, but build-like things.

Android x86
 tc(B) android-x86/opt

Mulet
 ?(B) linux64-mulet/debug
 ?(B) linux64-mulet/opt

Artifact Builds
 ?(AB) linux64-artifact/opt

Static Analysis
 ?(S) linux64-st-an/opt

Devices
 Aries(Bd) aries-eng/debug
 Aries(Be) aries-eng/opt
 Nexus 5-L(Bd) nexus-5l-eng/debug
 Nexus 5-L(Be) nexus-5l-eng/opt

Spidermonkey
 SM-tc(H) linux64-shell-haz/debug
 SM-tc(arm) sm-arm-sim/debug
 SM-tc(arm64) sm-arm64-sim/debug
 SM-tc(asan) sm-asan/opt
 SM-tc(cgc) sm-compacting/debug
 SM-tc(msan) sm-msan/opt
 SM-tc(nu) sm-nonunified/debug
 SM-tc(p) sm-plain/debug
 SM-tc(p) sm-plain/opt
 SM-tc(pkg) sm-package/opt
 SM-tc(r) sm-rootanalysis/debug
 SM-tc(tsan) sm-tsan/opt

Android Partner
 tc(B) android-partner-sample1/opt

Hazard
 tc(H) linux64-haz/debug
 tc(H) linux64-mulet-haz/debug

L10n
 tc(L10n) linux-l10n/opt
 tc(L10n) linux64-l10n/opt

Valgrind
 tc(V) linux64-valgrind/opt


Any feedback is appreciated, but at this point the patch is small enough that most of you (the folks I just cc'd) can probably take a glance at the diff and get a sense for where I'm headed, which is all I really want right now :)
I'm leaning toward liking things like:

builds:
  linux64:
      opt:
         ...
      debug:
         ...
      pgo:
         ...

Instead of:

builds:
   linux64/opt:
      ...

But I can see value in the less-indentation.

(The former method can have a `<default>` key too, for stuff shared between all/most types if you go that way)
Thanks, yeah, that's a structure I hadn't thought of yet.  I want to get everything implemented and then refactor the heck out of it until I'm happy.  I'll give that a shot!
Depends on: 1290523
Depends on: 1290602
Depends on: 1291473
Blocks: 1284602
Quick status update here:

I have all but about a half-dozen jobs ported to new kinds.  The plan is to leave those jobs as legacy and work separately with their owners to port them to new kinds.  This is in the interest of getting this patch landed soon, rather than spending time fiddling in the corner.  The jobs are

    * Device builds
    * tc(Mn-h) marionette-harness/opt
    * tc(Deps) android-api-15-gradle-dependencies/opt
    * tc(checkstyle) android-checkstyle/opt
    * tc(lint) android-lint/opt
    * tc(test) android-test/opt

Now that I've replicated all of these tasks, I'm going to refactor the new stuff until it makes sense.  Currently the transformation structure I'm thinking about is this:

  Test Descriptions (specify what test to run in a platform-independent way, with chunking, etc.)
   -generate-
  Job Descriptions (specify what to run, e.g., "this mozharness script" or "that shell script")
   -generate-
  Task Descriptions (specify a gecko task)
   -generate-
  Task Definitions (data structures that are passed to queue.createTask)

So the test kind would generate test descriptions, while builds and other kinds of jobs would generate job descriptions.  Particularly weird jobs can generate task descriptions directly if desired (for example, upload-symbols).

The job descriptions are parameterized on a "using" field which defines how the particular run is performed -- for example, we have seven or so ways to run mozharness, so each of those will be a distinct "using" value.
I've got most of this refactor done, although I still need to fix up tests to map to jobs.  With lots of minor TODOs, what I have right now is at https://hg.mozilla.org/users/dmitchell_mozilla.com/mozilla-central/rev/bug1286075@7

This is now near enough to its final form that those who are desperately curious can start having a look :)
WARNING! HUGE PATCH SET INBOUND :)

I'm about to land a "sneak preview" of the work on this bug.  It's basically done, but has some TODO's and needs more testing and documentation and whatnot.  And probably a few rounds of review.  The patchset consists of a bunch of small commits that modify existing behavior, and then one final, large commit to get the job done.

Here's what I'd like at this point:

 * Where you're r?'d, please review -- that's one or more of the "small commits".  If the change doesn't seem justified, please say so, as these are more like a permission request than a review request.

 * Please have a look at the final patch in the series, and feel free to focus on the parts that are important to you (especially if those parts have ended up in their own kind).  Give it a rather high-level review at this point: has this design made something important impossible?  Are common things easy?  Have I built more equipment than I need (YAGNI)?  Is some of the code particularly impenetrable or difficult to understand?

I will come back with a better version of that last commit, taking into account the comments I get in the interim, and hopefully broken down into a few smaller changesets.
For those of you just copied, see comment 8 regarding the review request that's about to land.
I have to confess I've only had a superficial look - but off the bat, here are my concerns (which maybe you can address quickly as non-issues).

Previously, each job was defined declaratively by a yaml file, which could inherit from other yaml files recursively.

This meant it was pretty trivial to see what a job definition would be, or to affect changes at an appropriate level of the hierachy. It was pretty trivial to know which jobs you would affect by changing a particular yaml file. We could have had better tooling to help with this, but there was not a lot of code to interpret, in order to understand how the task definitions were built up.

What it looks like to me from the outside, is this is now replaced by rather a complex codebase and programmatic logic to generate task definitions. This appears, at least from a cursory inspection, to be much more impenetrable than it previously was. I'm also a little wary that it might be more difficult to "fork" definitions where previously this was quite trivial so that if one project wants to do things a little differently, you don't have to refactor the codebase.

Like I say, I've only had a cursory glance, so maybe this is all quite trivial and possible, but since it is my end-of-day, I wanted to at least raise the question before I sign off.

I'm worried that we might end up maintaining something like buildbot-configs for taskcluster. I loved the flexibility of the previous approach, combined with great simplicity. I'm worried the new approach might be considerably less simple, and soon turn to spaghetti. Hopefully this doesn't come across as provocative, I'm quite happy to be corrected.

Maybe it would be useful to put together a simple guide first, before this lands, with some examples about setting up new jobs. And apologies if I've misjudged after my cursory glance. The words "HUGE SET INBOUND" also scared me a little, as before this was not a lot of code, it was for the most part a set of static configs that could be independently edited on a task-by-task basis. Maybe it still is. :-)
Comment hidden (mozreview-request)
Comment hidden (mozreview-request)
Comment hidden (mozreview-request)
Comment hidden (mozreview-request)
Comment hidden (mozreview-request)
Comment hidden (mozreview-request)
Comment hidden (mozreview-request)
Comment hidden (mozreview-request)
Comment hidden (mozreview-request)
Comment hidden (mozreview-request)
Comment hidden (mozreview-request)
Comment hidden (mozreview-request)
Comment hidden (mozreview-request)
Comment hidden (mozreview-request)
Comment hidden (mozreview-request)
Comment hidden (mozreview-request)
Comment hidden (mozreview-request)
Comment hidden (mozreview-request)
Attachment #8786802 - Attachment is obsolete: true
Attachment #8786802 - Flags: review?(ted)
Attachment #8786803 - Attachment is obsolete: true
Attachment #8786803 - Flags: review?(gps)
Attachment #8786804 - Attachment is obsolete: true
Attachment #8786804 - Flags: review?(mshal)
Attachment #8786805 - Attachment is obsolete: true
Attachment #8786805 - Flags: review?(garndt)
Attachment #8786806 - Attachment is obsolete: true
Attachment #8786806 - Flags: review?(mh+mozilla)
Attachment #8786807 - Attachment is obsolete: true
Attachment #8786807 - Flags: review?(bugspam.Callek)
Attachment #8786808 - Attachment is obsolete: true
Attachment #8786808 - Flags: review?(mjzffr)
Attachment #8786809 - Attachment is obsolete: true
Attachment #8786809 - Flags: review?(dtownsend)
Attachment #8786810 - Attachment is obsolete: true
Attachment #8786810 - Flags: review?(nalexander)
Attachment #8786811 - Attachment is obsolete: true
Attachment #8786811 - Flags: review?(gps)
Attachment #8786812 - Attachment is obsolete: true
Attachment #8786812 - Flags: review?(nalexander)
Attachment #8786813 - Attachment is obsolete: true
Attachment #8786813 - Flags: review?(gps)
Attachment #8786814 - Attachment is obsolete: true
Attachment #8786814 - Flags: review?(gps)
Attachment #8786815 - Attachment is obsolete: true
Attachment #8786815 - Flags: review?(gps)
Attachment #8786816 - Attachment is obsolete: true
Attachment #8786816 - Flags: review?(armenzg)
Attachment #8786817 - Attachment is obsolete: true
Attachment #8786817 - Flags: review?(gps)
Attachment #8786818 - Attachment is obsolete: true
Attachment #8786818 - Flags: review?(mshal)
Attachment #8786819 - Attachment is obsolete: true
Attachment #8786819 - Flags: review?(rthijssen)
Attachment #8786820 - Attachment is obsolete: true
Attachment #8786820 - Flags: review?(lissyx+mozillians)
Attachment #8786821 - Attachment is obsolete: true
Attachment #8786821 - Flags: review?(garndt)
Attachment #8786823 - Attachment is obsolete: true
Attachment #8786824 - Attachment is obsolete: true
Attachment #8786825 - Attachment is obsolete: true
Attachment #8786826 - Attachment is obsolete: true
Attachment #8786822 - Attachment is obsolete: true

Comment 36

3 years ago
mozreview-review
Comment on attachment 8786805 [details]
Bug 1286075: always set groupSymbol to something for lint jobs;

https://reviewboard.mozilla.org/r/75672/#review73610

::: taskcluster/ci/legacy/tasks/lint.yml:45
(Diff revision 1)
>      build_product: '{{build_product}}'
>      build_name: {{build_name}}
>      build_type: {{build_type}}
>      treeherder:
>        jobKind: test
> +      groupSymbol: "?"

I actually think you don't need this.  taskcluster-treeherder will default it for you if you're happy with accepting the defaults.

https://github.com/taskcluster/taskcluster-treeherder/blob/master/src/handler.js#L207
Attachment #8786805 - Flags: review+
Attachment #8786802 - Attachment is obsolete: false
Attachment #8786803 - Attachment is obsolete: false
Attachment #8786804 - Attachment is obsolete: false
Attachment #8786805 - Attachment is obsolete: false
Attachment #8786806 - Attachment is obsolete: false
Attachment #8786807 - Attachment is obsolete: false
Attachment #8786808 - Attachment is obsolete: false
Attachment #8786809 - Attachment is obsolete: false
Attachment #8786810 - Attachment is obsolete: false
Attachment #8786811 - Attachment is obsolete: false
Attachment #8786812 - Attachment is obsolete: false
Attachment #8786813 - Attachment is obsolete: false
Attachment #8786814 - Attachment is obsolete: false
Attachment #8786815 - Attachment is obsolete: false
Attachment #8786816 - Attachment is obsolete: false
Attachment #8786817 - Attachment is obsolete: false
Attachment #8786818 - Attachment is obsolete: false
Attachment #8786819 - Attachment is obsolete: false
Attachment #8786820 - Attachment is obsolete: false
Attachment #8786821 - Attachment is obsolete: false
Attachment #8786822 - Attachment is obsolete: false
Attachment #8786823 - Attachment is obsolete: false
Attachment #8786824 - Attachment is obsolete: false
Attachment #8786825 - Attachment is obsolete: false
Attachment #8786826 - Attachment is obsolete: false
Attachment #8786802 - Flags: review?(ted)
Attachment #8786803 - Flags: review?(gps)
Attachment #8786804 - Flags: review?(mshal)
Attachment #8786806 - Flags: review?(mh+mozilla)
Attachment #8786807 - Flags: review?(bugspam.Callek)
Attachment #8786809 - Flags: review?(dtownsend)
Attachment #8786810 - Flags: review?(nalexander)
Attachment #8786811 - Flags: review?(gps)
Attachment #8786812 - Flags: review?(nalexander)
Attachment #8786813 - Flags: review?(gps)
Attachment #8786814 - Flags: review?(gps)
Attachment #8786815 - Flags: review?(gps)
Attachment #8786816 - Flags: review?(armenzg)
Attachment #8786817 - Flags: review?(gps)
Attachment #8786818 - Flags: review?(mshal)
Attachment #8786819 - Flags: review?(rthijssen)
Attachment #8786820 - Flags: review?(lissyx+mozillians)
Attachment #8786821 - Flags: review?(garndt)
Attachment #8786822 - Flags: review?(poirot.alex)

Comment 38

3 years ago
mozreview-review
Comment on attachment 8786821 [details]
Bug 1286075: use pushdate from params for docker images;

https://reviewboard.mozilla.org/r/75704/#review73614

::: taskcluster/taskgraph/task/docker_image.py:42
(Diff revision 1)
>                 self.index_paths == other.index_paths
>  
>      @classmethod
>      def load_tasks(cls, kind, path, config, params, loaded_tasks):
>          # TODO: make this match the pushdate (get it from a parameter rather than vcs)
> -        pushdate = time.strftime('%Y%m%d%H%M%S', time.gmtime())
> +        pushdate = time.strftime('%Y%m%d%H%M%S', time.gmtime(params['pushdate']))

I think the comment above coudl be removed now.
Attachment #8786821 - Flags: review?(garndt) → review+

Comment 39

3 years ago
mozreview-review
Comment on attachment 8786802 [details]
Bug 1286075: set ARTIFACT_TASKID correctly for upload-symbol jobs;

https://reviewboard.mozilla.org/r/75664/#review73616

::: taskcluster/taskgraph/task/legacy.py:106
(Diff revision 1)
> -    if 'treeherder' not in task['task']['extra']:
> -        task['task']['extra']['treeherder'] = {}
> +    if 'extra' not in task['task']:
> +        task['task']['extra'] = {}

task['task'].setdefault('extra', {})
Attachment #8786802 - Flags: review+

Comment 40

3 years ago
mozreview-review
Comment on attachment 8786825 [details]
Bug 1286075: rename taskgraph.transforms.make_task;

https://reviewboard.mozilla.org/r/75710/#review73622
Attachment #8786825 - Flags: review+

Comment 41

3 years ago
mozreview-review
Comment on attachment 8786820 [details]
Bug 1286075: add tier for simulator jobs;

https://reviewboard.mozilla.org/r/75702/#review73626
Attachment #8786820 - Flags: review?(lissyx+mozillians) → review+
(In reply to Pete Moore [:pmoore][:pete] from comment #10)
> I have to confess I've only had a superficial look - but off the bat, here
> are my concerns (which maybe you can address quickly as non-issues).

Please do have a closer look -- I'm not sure what you were looking at since your comment landed before the review request, but I think a patch I've been working on for 2+ months deserves more than a few minutes :)

> I'm worried that we might end up maintaining something like buildbot-configs
> for taskcluster. I loved the flexibility of the previous approach, combined
> with great simplicity. I'm worried the new approach might be considerably
> less simple, and soon turn to spaghetti. Hopefully this doesn't come across
> as provocative, I'm quite happy to be corrected.

In fact, the old approach is already impenetrable spaghetti which is preventing people from doing the things they want to do, and is littered with bugs -- most of the patches in this patchset are fixing bugs which I encountered.  So in fact I'm fixing the issues you're identifying .. perhaps you read the patch in reverse? :)

> Maybe it would be useful to put together a simple guide first, before this
> lands, with some examples about setting up new jobs. And apologies if I've
> misjudged after my cursory glance. The words "HUGE SET INBOUND" also scared
> me a little, as before this was not a lot of code, it was for the most part
> a set of static configs that could be independently edited on a task-by-task
> basis. Maybe it still is. :-)

This is already quite a bit of code to support tests, which are already ported.  This extends that model a little bit (job descriptions and some optimization support, mostly) and uses it to implement builds and other ephemera that remained in the legacy type.

Please do have another look -- I think you've badly mischaracterized my work based on your cursory look.
Assignee

Comment 43

3 years ago
mozreview-review-reply
Comment on attachment 8786805 [details]
Bug 1286075: always set groupSymbol to something for lint jobs;

https://reviewboard.mozilla.org/r/75672/#review73610

> I actually think you don't need this.  taskcluster-treeherder will default it for you if you're happy with accepting the defaults.
> 
> https://github.com/taskcluster/taskcluster-treeherder/blob/master/src/handler.js#L207

All of the other legacy stuff generates ? for the group in this case, so this is just making them consistent.  However, I'll add some code later to leave out the groupSymbol if it's "?"..

Comment 44

3 years ago
mozreview-review
Comment on attachment 8786802 [details]
Bug 1286075: set ARTIFACT_TASKID correctly for upload-symbol jobs;

https://reviewboard.mozilla.org/r/66914/#review73648

::: taskcluster/ci/build/android.yml:20
(Diff revision 1)
> +        max-run-time: 7200
> +    run:
> +        using: mozharness
> +        actions: [get-secrets build multi-l10n update]
> +        config:
> +            - builds/releng_base_android_64_builds.py 

nit: trailing whitespace (throughout).

::: taskcluster/docs/attributes.rst:27
(Diff revision 1)
> +
> +run_on_projects
> +===============
> +
> +The projects where this task should be in the target task set.  This is how
> +requireements like "only run this on inbound" get implemented.  These are

nit: "requireements" has an extra 'e'.

::: taskcluster/docs/transforms.rst:128
(Diff revision 1)
>     still a test description, just with defaults and policies applied, and
>     per-platform options resolved.  So transforms up to this point do not modify
>     the "shape" of the test description, and are still governed by the schema in
>     ``test_description.py``.
>  
>   * The ``taskgraph.transforms.tests.make_task_description:transforms`` then

Why did you s/make_task/task/, but not s/make_task_description/task_description/?

::: taskcluster/taskgraph/jobs/mozharness.py:202
(Diff revision 1)
> +        mh_command.append(r"--config " + cfg.replace('/', '\\'))
> +    mh_command.append(r"--branch " + config.params['project'])
> +    mh_command.append(r"--skip-buildbot-actions --work-dir %cd:Z:=z:%\build")
> +    worker['command'] = [
> +        r"mkdir .\build\src",
> +        r"hg share c:\builds\hg-shared\mozilla-central .\build\src",

This seems so unlikely: Windows paths?  How can this be `mozharness_on_generic_worker_setup`?  Is "generic" code for "non-Docker", which is code for "Windows"?

::: taskcluster/taskgraph/jobs/mulet.py:57
(Diff revision 1)
> +    }]
> +
> +    add_workspace_cache(config, taskdesc, worker)
> +
> +    env = worker.setdefault('env', {})
> +    env.update({

The repetition around setting up these repositories, and the translation from lower-case to UPPER-CASE, sets my teeth on edge a little.  Why can't we keep the same key names?

::: taskcluster/taskgraph/jobs/mulet.py:139
(Diff revision 1)
> +
> +    taskdesc.setdefault('routes', []).extend([
> +        'index.gecko.v1.{project}.latest.simulator.opt'.format(**config.params),
> +    ])
> +
> +    # TODO(taksdiff): has the scope for this cache, but not the cache

nit: s/taksdiff/taskdiff/, for future grepping.  (Throughout.)

::: taskcluster/taskgraph/task/post_build.py:19
(Diff revision 1)
> +logger = logging.getLogger(__name__)
> +
> +
> +class PostBuildTask(transform.TransformTask):
> +    """
> +    A task implementing a post-build jobs.  These depend on jobs and perform

nit: s/a post-build jobs/a post-build job/.

::: taskcluster/taskgraph/task/post_build.py:33
(Diff revision 1)
> +    platform.
> +    """
> +
> +    @classmethod
> +    def get_inputs(cls, kind, path, config, params, loaded_tasks):
> +        if config.get('kind-dependencies', []) != ["build"]:

I don't understand why the division between build and non-build.  Is it easy to make a post-test task?  Is there an example of that in tree?

::: taskcluster/taskgraph/try_option_syntax.py
(Diff revision 1)
>          'sm-arm64-sim',
>          'sm-compacting',
>          'sm-rootanalysis',
>          'sm-package',
>          'sm-tsan',
> -        'sm-asan',

nit: superfluous rearrangement.

::: taskcluster/taskgraph/try_option_syntax.py:525
(Diff revision 1)
>              return match_test(self.unittests, 'unittest_try_name')
> +        elif attr('kind') in BUILD_KINDS:
> +            if attr('build_type') not in self.build_types:
> +                return False
> +            elif self.platforms is None:
> +                # for "-p all", look fro try in the 'run_on_projects' attribute

nit: "look fro" should be "look for".

Comment 45

3 years ago
mozreview-review
Comment on attachment 8786810 [details]
Bug 1286075: add TOOLTOOL_CACHE for gradle builds;

https://reviewboard.mozilla.org/r/75682/#review73716
Attachment #8786810 - Flags: review?(nalexander) → review+

Comment 46

3 years ago
mozreview-review
Comment on attachment 8786812 [details]
Bug 1286075: use a distinct symbol for Android gradle builds;

https://reviewboard.mozilla.org/r/75686/#review73720

Thanks, meant to get to this myself.
Attachment #8786812 - Flags: review?(nalexander) → review+

Comment 47

3 years ago
mozreview-review
Comment on attachment 8786806 [details]
Bug 1286075: simplify toolchain build tasks;

https://reviewboard.mozilla.org/r/75674/#review73738

::: taskcluster/ci/legacy/tasks/builds/linux64_clang.yml:53
(Diff revision 1)
>        - production
>      treeherder:
>        groupSymbol: Cc
>        groupName: Compilers, submitted by taskcluster
>        symbol: Clang
> +      tier: 2

Seems to me this should be tier 1. But it depends if this has the same meaning as other uses of tiers at Mozilla in relation to Firefox.

::: taskcluster/ci/legacy/tasks/builds/linux64_gcc.yml:53
(Diff revision 1)
>        - production
>      treeherder:
>        groupSymbol: Cc
>        groupName: Compilers, submitted by taskcluster
>        symbol: GCC
> +      tier: 2

Same here.
Attachment #8786806 - Flags: review?(mh+mozilla) → review+

Comment 48

3 years ago
mozreview-review
Comment on attachment 8786822 [details]
Bug 1286075: remove mulet simulator builds;

https://reviewboard.mozilla.org/r/75714/#review73778

Given the current state of b2g, I don't think anyone care about having Firefox OS simulator builds.
At least, we are not releasing any new version of the simulator via WebIDE...
Attachment #8786822 - Flags: review?(poirot.alex) → review+

Comment 49

3 years ago
mozreview-review
Comment on attachment 8786807 [details]
Bug 1286075: set tier, drop empty product for tc(Mh);

https://reviewboard.mozilla.org/r/75676/#review73956
Attachment #8786807 - Flags: review?(bugspam.Callek) → review+

Comment 50

3 years ago
mozreview-review
Comment on attachment 8786814 [details]
Bug 1286075: improve dict merging support;

https://reviewboard.mozilla.org/r/75690/#review74036

I see `merge` used exactly once in the final mega-patch, and `merge_to` used 0 times. Is the latter necessary, is the former worth it?
Assignee

Comment 51

3 years ago
mozreview-review-reply
Comment on attachment 8786814 [details]
Bug 1286075: improve dict merging support;

https://reviewboard.mozilla.org/r/75690/#review74036

`merge_to` is used twice -- once in the implementation of `merge` and once in the implementation of the template inheritance behavior in the same file.  `merge` is only used in one place (to apply defaults in TransformTask) but could be uesful other places as well for similar reasons.  The refactor into two functions was to avoid the modify-in-place behavior of `merge_to` without risking breaking the subtlties of template inheritance.
Assignee

Comment 52

3 years ago
mozreview-review-reply
Comment on attachment 8786806 [details]
Bug 1286075: simplify toolchain build tasks;

https://reviewboard.mozilla.org/r/75674/#review73738

> Seems to me this should be tier 1. But it depends if this has the same meaning as other uses of tiers at Mozilla in relation to Firefox.

Yes, same notion.  I'll change it to 1.

Comment 53

3 years ago
mozreview-review
Comment on attachment 8786816 [details]
Bug 1286075: allow optimization of tasks whose dependencies have not been optimized;

https://reviewboard.mozilla.org/r/75694/#review74170
Attachment #8786816 - Flags: review?(armenzg) → review+

Comment 54

3 years ago
mozreview-review
Comment on attachment 8786818 [details]
Bug 1286075: use regular cache names for various builds;

https://reviewboard.mozilla.org/r/75698/#review74184

::: taskcluster/ci/legacy/tasks/builds/android_api_15.yml:18
(Diff revision 1)
>    routes:
>      - 'index.buildbot.branches.{{project}}.android-api-15'
>      - 'index.buildbot.revisions.{{head_rev}}.{{project}}.android-api-15'
>  
>    scopes:
> -    - 'docker-worker:cache:level-{{level}}-{{project}}-build-android-api-15-workspace'
> +    - 'docker-worker:cache:level-{{level}}-{{project}}-build-android-api-15-opt-workspace'

Why not switch these to be {{build_name}}-{{build_type}}? From the base_macosx64.yml it looks like that should work. If you used that for everything I think a lot of the other issues would be automatically resolved.

::: taskcluster/ci/legacy/tasks/builds/android_api_15_partner_sample1.yml:18
(Diff revision 1)
>    routes:
>      - 'index.buildbot.branches.{{project}}.android-api-15-partner-sample1'
>      - 'index.buildbot.revisions.{{head_rev}}.{{project}}.android-api-15-partner-sample1'
>  
>    scopes:
> -    - 'docker-worker:cache:level-{{level}}-{{project}}-build-android-api-15-workspace'
> +    - 'docker-worker:cache:level-{{level}}-{{project}}-build-android-api-15-opt-workspace'

Is this intentionally not supposed to match the build_name / build_type?

::: taskcluster/ci/legacy/tasks/builds/haz_linux.yml:16
(Diff revision 1)
>    workerType: 'gecko-{{level}}-b-linux'
>  
>    scopes:
>      - 'docker-worker:cache:tooltool-cache'
>      - 'docker-worker:relengapi-proxy:tooltool.download.public'
> -    - 'docker-worker:cache:level-{{level}}-{{project}}-build-linux64-haz-workspace'
> +    - 'docker-worker:cache:level-{{level}}-{{project}}-build-linux64-haz-debug-workspace'

Shouldn't this be 'browser-haz-debug' instead of 'linux64-haz-debug'?

::: taskcluster/ci/legacy/tasks/builds/haz_shell_linux.yml:16
(Diff revision 1)
>    workerType: 'gecko-{{level}}-b-linux'
>  
>    scopes:
>      - 'docker-worker:cache:tooltool-cache'
>      - 'docker-worker:relengapi-proxy:tooltool.download.public'
> -    - 'docker-worker:cache:level-{{level}}-{{project}}-build-linux64-haz-workspace'
> +    - 'docker-worker:cache:level-{{level}}-{{project}}-build-linux64-shell-haz-debug-workspace'

'shell-has-debug' instead of 'linux64-shell-haz-debug'?

::: taskcluster/ci/legacy/tasks/builds/mulet_linux.yml:18
(Diff revision 1)
>    routes:
>      - 'index.buildbot.branches.{{project}}.linux64-mulet'
>      - 'index.buildbot.revisions.{{head_rev}}.{{project}}.linux64-mulet'
>  
>    scopes:
> -    - 'docker-worker:cache:level-{{level}}-{{project}}-build-mulet-linux-workspace'
> +    - 'docker-worker:cache:level-{{level}}-{{project}}-build-linux64-mulet-opt-workspace'

'mulet-opt' instead of 'linux64-mulet-opt'?

::: taskcluster/ci/legacy/tasks/builds/mulet_linux_dbg.yml:18
(Diff revision 1)
>    routes:
>      - 'index.buildbot.branches.{{project}}.linux64-mulet'
>      - 'index.buildbot.revisions.{{head_rev}}.{{project}}.linux64-mulet'
>  
>    scopes:
> -    - 'docker-worker:cache:level-{{level}}-{{project}}-build-mulet-dbg-linux-workspace'
> +    - 'docker-worker:cache:level-{{level}}-{{project}}-build-linux64-mulet-debug-workspace'

'mulet-dbg' instead of 'linux64-mulet-debug'? Though it would be nice to 'dbg' -> 'debug' this, but I'm not sure if we ever called mulet 'debug' if it was always in Taskcluster.
Assignee

Comment 55

3 years ago
mozreview-review-reply
Comment on attachment 8786818 [details]
Bug 1286075: use regular cache names for various builds;

https://reviewboard.mozilla.org/r/75698/#review74184

So there are two primary, and a few secondary places that we currently describe platforms.  In most cases I've based my determination of the build platform and type on the try names (job flags in the legacy config), and those usually agree with the taskcluster.extra.treeherder.machine.platform.  Neither of those is always aligned with the build_name/build_type template variables, nor with the build_platform.  And those variables aren't always well-aligned with related things like cache names and buidbot and gecko.v1 index routes.

I've worked around those differences in places where changing them would have an external impact -- index routes, treeherder platform, and so on.  These hacks usually involve a dedicated transform.  In places where the impact was not visible (cache names) I elected to change the legacy stuff instead to make it consistent with what I'd selected as the platform.

> Why not switch these to be {{build_name}}-{{build_type}}? From the base_macosx64.yml it looks like that should work. If you used that for everything I think a lot of the other issues would be automatically resolved.

Well, in the final form, the cache name is generated with

    'level-{}-{}-build-{}-{}-workspace'.format(
                config.params['level'], config.params['project'],
                taskdesc['attributes']['build_platform'],
                taskdesc['attributes']['build_type'],
                )

So yes, in most of these cases I'm adding the build_type.  Since this code is deleted a few commits later, I don't think it's too important how that's achieved.  However, build_name is not the same as build_platform in many cases, so while using those variables might work in this file, it won't work in all of them.  In fact, the remainder of your comments are great examples of this.

> Is this intentionally not supposed to match the build_name / build_type?

Hm, actually this should be ..build-android-partner-sample1-opt-workspace.  It looks like I don't have any examples that actually create this task.  I'll fix that.

The build platform is defined at
  https://hg.mozilla.org/users/dmitchell_mozilla.com/mozilla-central/file/f066e2b75fe7/taskcluster/ci/build/android-partner.yml#l1

> Shouldn't this be 'browser-haz-debug' instead of 'linux64-haz-debug'?

The build platform is linux64-haz-debug
  https://hg.mozilla.org/users/dmitchell_mozilla.com/mozilla-central/file/f066e2b75fe7/taskcluster/ci/hazard/kind.yml#l48

> 'shell-has-debug' instead of 'linux64-shell-haz-debug'?

https://hg.mozilla.org/users/dmitchell_mozilla.com/mozilla-central/file/f066e2b75fe7/taskcluster/ci/hazard/kind.yml#l20

> 'mulet-opt' instead of 'linux64-mulet-opt'?

https://hg.mozilla.org/users/dmitchell_mozilla.com/mozilla-central/file/f066e2b75fe7/taskcluster/ci/build/mulet.yml#l23

(ok, this is a weird one -- its treeherder platform is mulet-linux64/opt but it's referred to as linux64-mulet in try syntax, and that seems to make more sense; this is one of the weird things to straighten out in bug 1286086)

> 'mulet-dbg' instead of 'linux64-mulet-debug'? Though it would be nice to 'dbg' -> 'debug' this, but I'm not sure if we ever called mulet 'debug' if it was always in Taskcluster.

https://hg.mozilla.org/users/dmitchell_mozilla.com/mozilla-central/file/f066e2b75fe7/taskcluster/ci/build/mulet.yml#l1
(In reply to Dustin J. Mitchell [:dustin] from comment #55)
> > Why not switch these to be {{build_name}}-{{build_type}}? From the base_macosx64.yml it looks like that should work. If you used that for everything I think a lot of the other issues would be automatically resolved.

My point here was just that we should be re-using existing variables for the cache name, rather than re-typing them and potentially have them mismatch. I don't think it matters whether it's build_name or build_platform. The base_macosx64.yml file was an example where it already used {{build_name}}-{{build_type}}, so it appears that it already works.

> Well, in the final form, the cache name is generated with
> 
>     'level-{}-{}-build-{}-{}-workspace'.format(
>                 config.params['level'], config.params['project'],
>                 taskdesc['attributes']['build_platform'],
>                 taskdesc['attributes']['build_type'],
>                 )

That sounds like what I'm suggesting.

> So yes, in most of these cases I'm adding the build_type.  Since this code
> is deleted a few commits later, I don't think it's too important how that's
> achieved.

What code is deleted a few commits later? I tried looking through the rest of your patchset but I'm not sure exactly what you're referring to. (It's possible I didn't pull the patchset correctly, since it's the first time I tried pulling directly from mozreview into a git repo.)

> However, build_name is not the same as build_platform in many
> cases, so while using those variables might work in this file, it won't work
> in all of them.  In fact, the remainder of your comments are great examples
> of this.

Since it sounds like the cache name is just used & has to match internally, I don't think it really matters which variable it is based on.

> In most cases I've based my determination of the build platform and type on the try names (job
> flags in the legacy config), and those usually agree with the
> taskcluster.extra.treeherder.machine.platform.
> > Shouldn't this be 'browser-haz-debug' instead of 'linux64-haz-debug'?
> 
> The build platform is linux64-haz-debug

So the new canonical platform name ("build_platform") is based on what we have from https://dxr.mozilla.org/mozilla-central/rev/d5f20820c80514476f596090292a5d77c4b41e3b/taskcluster/ci/legacy/tasks/branches/base_jobs.yml ? Is this eventually going to be used for the one-true-platform name in bug 1286086? If so, we probably want to show that explicitly to a wider audience.

If the ultimate goal in bug 1286086 is to say that we call a linux64 hazard build "linux64-haz-debug" everywhere, rather than "linux64-haz-debug" in try and "browser-haz-debug" in the index, I would suggest that it is probably easier to go with the indexing name as the source. Although they are changeable, you do have to backfill the index to make it useful. I think the try platform name just has to be consistent with itself (since that's all in-tree, right?), whereas the index needs to be consistent across time. IOW, "give me a linux64 hazard debug build from 6 months ago" should be the same as "give me a linux64 hazard debug build from yesterday". If someone checks out a tree from 6 months ago and has to do "try -p linux64-haz-debug" instead of "try -p browser-haz-debug" today, I think that is less of an issue.

I am obviously not as familiar with the try/treeherder side of things, so if there's a lot more details to it than that, please let me know :)

Comment 57

3 years ago
mozreview-review
Comment on attachment 8786804 [details]
Bug 1286075: do not add index routes for jobs;

https://reviewboard.mozilla.org/r/75670/#review74432

::: taskcluster/taskgraph/task/legacy.py:496
(Diff revision 1)
>              task_extra = build_task['task']['extra']
>              build_parameters['build_name'] = task_extra['build_name']
>              build_parameters['build_type'] = task_extra['build_type']
>              build_parameters['build_product'] = task_extra['build_product']
>  
> -            if 'treeherder' in task_extra:
> +            # jobs don't have indexes

Why don't we want lint jobs indexed? They upload their logs as artifacts, at least. Is it not useful to go back in time and look at logs for different types of lint jobs? (I don't know much about them, so maybe this is a question for those interested in those jobs).
The files in taskcluster/ci/legacy aren't deleted in this review request, just unused.  By the time I replied I had already deleted them locally, so I forgot and I can see how that was confusing!  There are a few files remaining (for about 12 tasks that haven't been ported yet), but all of the android, hazard, macosx, mulet, etc. have been ported and the legacy files will appear as deleted in the next revision of this review request.  Sorry about that confusion.

So the overall organization to this review request is:

 * tweak legacy \
 * tweak legacy  |      - changes that affect the created tasks in nontrivial ways 
 * tweak legacy  |
 * ...          /
 * scripts for testing  - scripts used to compare legacy-generated taskgraphs to new taskgraphs
 * replace almost all.. - implementation of the new taskgraphs

My testing consists of generating taskgraphs using scripts/gather_old.py without the final cset applied, then running scripts/compare.py with the final cset applied.  The latter script compares the resulting taskgraphs, discarding irrelevant differences and collating tasks by their label and treeherder platform, and shows any differences.

So attachment 8786818 [details] is really just asking "is it OK to change these cache names to this string which, I promise, avoids the need for special-cases later?"  If it's not OK, then I'll add an extra transform, including a comment describing your reasoning, to implement the special cases in the new taskgraph generation code.  Hopefully from that perspective, this is an easy r+?

Regarding which string to use for which platform -- I have my hands with this bug, so let's not start working on bug 1286086 yet.  I'd ask that you please reserve judgement on that bug until work actually begins :)

Comment 59

3 years ago
mozreview-review
Comment on attachment 8786809 [details]
Bug 1286075: set tier for eslint;

https://reviewboard.mozilla.org/r/75680/#review74468
Attachment #8786809 - Flags: review?(dtownsend) → review+
(In reply to Dustin J. Mitchell [:dustin] from comment #58)
> So attachment 8786818 [details] is really just asking "is it OK to change
> these cache names to this string which, I promise, avoids the need for
> special-cases later?"  If it's not OK, then I'll add an extra transform,
> including a comment describing your reasoning, to implement the special
> cases in the new taskgraph generation code.  Hopefully from that
> perspective, this is an easy r+?

Ahh, that makes more sense. Thanks for the explanation!

> Regarding which string to use for which platform -- I have my hands with
> this bug, so let's not start working on bug 1286086 yet.  I'd ask that you
> please reserve judgement on that bug until work actually begins :)

I guess that was part of my concern - by making a decision on what to use for build_platform now, it seems like we are already deciding some direction for bug 1286086 without directly working on it yet? Or is it more like "we already have this as a platform name for try syntax, so call that 'build_platform'" without any broader implications?

Comment 61

3 years ago
mozreview-review
Comment on attachment 8786818 [details]
Bug 1286075: use regular cache names for various builds;

https://reviewboard.mozilla.org/r/75698/#review74470
Attachment #8786818 - Flags: review?(mshal) → review+
Assignee

Comment 62

3 years ago
mozreview-review-reply
Comment on attachment 8786804 [details]
Bug 1286075: do not add index routes for jobs;

https://reviewboard.mozilla.org/r/75670/#review74432

> Why don't we want lint jobs indexed? They upload their logs as artifacts, at least. Is it not useful to go back in time and look at logs for different types of lint jobs? (I don't know much about them, so maybe this is a question for those interested in those jobs).

Here's the full set of jobs which have index routes:

flake8-gecko
sphinx
mozharness
wptlint-gecko
taskgraph-tests
eslint-gecko
linux64-gcc
android-api-15-gradle-dependencies
android-checkstyle
linux64-clang
marionette-harness
android-lint
android-test

based on a run of ./mach taskgraph tasks against a mozilla-central push, having disabled the functionality that skips jobs when matching files have not changed.  The routes in this case are gecko.v1 only, and things like

index.gecko.v1.mozilla-central.latest.linux.flake8-gecko

These routes are added in taskcluster/ci/legacy/tasks/lint.yml, introduced by :gps in http://hg.mozilla.org/mozilla-central/rev/623765c2381e.  Calling out the last paragraph of the commit message:

    There is definitely some wonkiness in this implementation. For example,
    there are references to "build_name," "build_type," and "build_product,"
    which arguably are no longer relevant to generic tasks. However, they
    appear to be so integrated into task processing (including route names)
    that I'm a bit scared to change them.

I could equally well have removed the gecko.v1 routes from taskcluster/ci/legacy/tasks/lint.yml, but that still would have left index.rank = 0 which is superfluous when there are no index routes.

Given that these are only gecko.v1 routes, which you suggested earlier are soon to be deleted, maybe we can just make the call to remove them?
(In reply to Michael Shal [:mshal] from comment #60)
> I guess that was part of my concern - by making a decision on what to use
> for build_platform now, it seems like we are already deciding some direction
> for bug 1286086 without directly working on it yet? Or is it more like "we
> already have this as a platform name for try syntax, so call that
> 'build_platform'" without any broader implications?

I see, the question makes a lot of sense now.  There are probably two parts to this concern.  The first is that landing this patchset would create some lasting state that we will be committed to -- index routes, try syntax, artifact names, etc.  I think this is categorically avoided my testing approach: I'm deliberately trying not to change any of those things outside of the small modifications in the "tweak legacy" csets, none of which is a large-scale renaming of platforms.

The second part is about "deciding some direction", meaning that this code strongly hints that there is one 'right' answer for what constitutes a platform, by naming it build_platform.  I'm probably guilty of that.  My only counter-argument is that I'm more concerned right now that we have a platform name against which other platformish names are compared, and less concerned about what that name is.  I've chosen one here (based on the try name), but if we determine that it's best to change that name in bug 1286086, that doesn't bother me.  We'll need to do some analysis and think carefully about the implications of any changes, and of electing to leave things uncoordinated, in that bug.
(In reply to Dustin J. Mitchell [:dustin] from comment #62)
> > Why don't we want lint jobs indexed? They upload their logs as artifacts, at least. Is it not useful to go back in time and look at logs for different types of lint jobs? (I don't know much about them, so maybe this is a question for those interested in those jobs).
> 
> Here's the full set of jobs which have index routes:
> 
> flake8-gecko
> sphinx
> mozharness
> wptlint-gecko
> taskgraph-tests
> eslint-gecko
> linux64-gcc
> android-api-15-gradle-dependencies
> android-checkstyle
> linux64-clang
> marionette-harness
> android-lint
> android-test
> 
> based on a run of ./mach taskgraph tasks against a mozilla-central push,
> having disabled the functionality that skips jobs when matching files have
> not changed.  The routes in this case are gecko.v1 only, and things like
> 
> index.gecko.v1.mozilla-central.latest.linux.flake8-gecko

It appears we do currently have routes for this in gecko.v2: https://tools.taskcluster.net/index/artifacts/#gecko.v2.mozilla-central.latest.lint/gecko.v2.mozilla-central.latest.lint.flake8-gecko-opt

I'm wondering whether or not it is useful to anyone to have the logs for these jobs in the index.

> These routes are added in taskcluster/ci/legacy/tasks/lint.yml, introduced
> by :gps in http://hg.mozilla.org/mozilla-central/rev/623765c2381e.  Calling
> out the last paragraph of the commit message:
> 
>     There is definitely some wonkiness in this implementation. For example,
>     there are references to "build_name," "build_type," and "build_product,"
>     which arguably are no longer relevant to generic tasks. However, they
>     appear to be so integrated into task processing (including route names)
>     that I'm a bit scared to change them.

:gps, can you provide some additional clarification here? Is the wonkiness because the parameters are called "build_*" when these are non-build jobs? Or would you have preferred if the lint jobs weren't indexed at all? If we didn't want them indexed in the first place, then I'm fine with removing them from gecko.v2.

> I could equally well have removed the gecko.v1 routes from
> taskcluster/ci/legacy/tasks/lint.yml, but that still would have left
> index.rank = 0 which is superfluous when there are no index routes.
> 
> Given that these are only gecko.v1 routes, which you suggested earlier are
> soon to be deleted, maybe we can just make the call to remove them?

Yeah, the gecko.v1 routes can go away (bug 1277881) - feel free to delete those :).
(In reply to Dustin J. Mitchell [:dustin] from comment #63)
> (In reply to Michael Shal [:mshal] from comment #60)
> > I guess that was part of my concern - by making a decision on what to use
> > for build_platform now, it seems like we are already deciding some direction
> > for bug 1286086 without directly working on it yet? Or is it more like "we
> > already have this as a platform name for try syntax, so call that
> > 'build_platform'" without any broader implications?
> 
> I see, the question makes a lot of sense now.  There are probably two parts
> to this concern.  The first is that landing this patchset would create some
> lasting state that we will be committed to -- index routes, try syntax,
> artifact names, etc.  I think this is categorically avoided my testing
> approach: I'm deliberately trying not to change any of those things outside
> of the small modifications in the "tweak legacy" csets, none of which is a
> large-scale renaming of platforms.
> 
> The second part is about "deciding some direction", meaning that this code
> strongly hints that there is one 'right' answer for what constitutes a
> platform, by naming it build_platform.  I'm probably guilty of that.  My
> only counter-argument is that I'm more concerned right now that we have a
> platform name against which other platformish names are compared, and less
> concerned about what that name is.  I've chosen one here (based on the try
> name), but if we determine that it's best to change that name in bug
> 1286086, that doesn't bother me.  We'll need to do some analysis and think
> carefully about the implications of any changes, and of electing to leave
> things uncoordinated, in that bug.

That makes sense - thanks!
Oh, I keep forgetting that the gecko.v2 routes are added "automatically" by legacy.py, whereas buildbot and gecko.v1 are included in the .yml files.  So all of those jobs also have v2 routes as you've shown.  I can try to track down the authors of those tasks and see if they want them indexed, if that helps.
Unless there is some explicit reason to *not* index them, I think it might be easier to just leave them indexed. Perhaps someone would want to go back and look at lint logs over the past year to do some analysis, or something...
But, definitely worth checking with the task authors.
The following taskcluster "jobs" are currently indexed, meaning that you can look up a particular task id by revision or pushdate or just get the most recent one on a particular branch.  It has nothing to do with whether the job appears on treeherder.

flake8-gecko (now mozlint-flake8) - ahal
sphinx - gps
mozharness - Callek
wptlint-gecko - jgraham
taskgraph-tests - me
eslint-gecko - Mossop
linux64-gcc - glandium
android-api-15-gradle-dependencies - mcomella/nalexander
android-checkstyle - mcomella/nalexander
linux64-clang - glandium
marionette-harness - maja_zf
android-lint - mcomella/nalexander
android-test - mcomella/nalexander

Typically, index routes are used to look up build artifacts, for example as part of regression hunting or to find a suitable artifact build.  My understanding is that of the above are, instead, either checks which do not produce output aside form logging, or one-off tasks triggered on demand with a try push.  I think they were indexed accidentally, due to the way tasks are generated in-tree, without any intent of actually using the index to find tasks.  If this is the case, I'd like to stop indexing the tasks, if for no other reason than to halt the cargo cult.  So, for those of you flagged for info here, please let me know if you do need the index routes or not.
Flags: needinfo?(nalexander)
Flags: needinfo?(mjzffr)
Flags: needinfo?(michael.l.comella)
Flags: needinfo?(mh+mozilla)
Flags: needinfo?(james)
Flags: needinfo?(gps)
Flags: needinfo?(dtownsend)
Flags: needinfo?(bugspam.Callek)
Flags: needinfo?(ahalberstadt)
The linting tasks were almost for sure cargo culted, feel free to remove routes for flake8, eslint and wptlint. Also for future reference you should probably ping me for eslint related stuff instead of Mossop.
Flags: needinfo?(james)
Flags: needinfo?(dtownsend)
Flags: needinfo?(ahalberstadt)
(In reply to Dustin J. Mitchell [:dustin] from comment #69)
> The following taskcluster "jobs" are currently indexed, meaning that you can
> look up a particular task id by revision or pushdate or just get the most
> recent one on a particular branch.  It has nothing to do with whether the
> job appears on treeherder.
> 
> android-api-15-gradle-dependencies - mcomella/nalexander

This one produces toolchain binaries for use in subsequent jobs; if I understand correctly, it needs to be indexed to get at those artifacts. 

> android-checkstyle - mcomella/nalexander
> android-lint - mcomella/nalexander
> android-test - mcomella/nalexander

These are test jobs, like Robocop.  They produce logging output.  I don't think they need to be indexed to preserve those artifacts.
Flags: needinfo?(nalexander)
Flags: needinfo?(michael.l.comella)
The marionette-harness job does not need to be indexed.
Flags: needinfo?(mjzffr)
Assignee

Comment 73

3 years ago
mozreview-review-reply
Comment on attachment 8786818 [details]
Bug 1286075: use regular cache names for various builds;

https://reviewboard.mozilla.org/r/75698/#review74184

> Hm, actually this should be ..build-android-partner-sample1-opt-workspace.  It looks like I don't have any examples that actually create this task.  I'll fix that.
> 
> The build platform is defined at
>   https://hg.mozilla.org/users/dmitchell_mozilla.com/mozilla-central/file/f066e2b75fe7/taskcluster/ci/build/android-partner.yml#l1

OK, this one puzzled me for a long time.  In fact, this cache is never used, because the job is only defined on try, but workspace caches aren't used on try.  So I'll drop this hunk.
> linux64-gcc - glandium
> linux64-clang - glandium

These are meant to provide the compiler for other TC builds after this bug lands, replacing tooltool.
Flags: needinfo?(mh+mozilla)

Comment 75

3 years ago
mozreview-review
Comment on attachment 8786803 [details]
Bug 1286075: refactor test.py to use TransformTask;

https://reviewboard.mozilla.org/r/75666/#review74554

The overall deletion of code is a good sign that your abstractions are correct!
Attachment #8786803 - Flags: review?(gps) → review+

Comment 76

3 years ago
mozreview-review
Comment on attachment 8786804 [details]
Bug 1286075: do not add index routes for jobs;

https://reviewboard.mozilla.org/r/75670/#review74556
Attachment #8786804 - Flags: review+

Comment 77

3 years ago
mozreview-review
Comment on attachment 8786811 [details]
Bug 1286075: set workerType correctly for linux64 asan;

https://reviewboard.mozilla.org/r/75684/#review74570
Attachment #8786811 - Flags: review?(gps) → review+

Comment 78

3 years ago
mozreview-review
Comment on attachment 8786813 [details]
Bug 1286075: set tier explicitly for tc(Doc);

https://reviewboard.mozilla.org/r/75688/#review74572
Attachment #8786813 - Flags: review?(gps) → review+

Comment 79

3 years ago
mozreview-review
Comment on attachment 8786814 [details]
Bug 1286075: improve dict merging support;

https://reviewboard.mozilla.org/r/75690/#review74574
Attachment #8786814 - Flags: review?(gps) → review+

Comment 80

3 years ago
mozreview-review
Comment on attachment 8786815 [details]
Bug 1286075: if <kind>/kind.yml doesn't exist, skip it;

https://reviewboard.mozilla.org/r/75692/#review74576
Attachment #8786815 - Flags: review?(gps) → review+

Comment 81

3 years ago
mozreview-review
Comment on attachment 8786817 [details]
Bug 1286075: include redo in mach python path;

https://reviewboard.mozilla.org/r/75696/#review74578
Attachment #8786817 - Flags: review?(gps) → review+

Comment 82

3 years ago
mozreview-review
Comment on attachment 8786819 [details]
Bug 1286075: use a per-level build for windows;

https://reviewboard.mozilla.org/r/75700/#review74580
Attachment #8786819 - Flags: review+
(In reply to Dustin J. Mitchell [:dustin] from comment #69)
> The following taskcluster "jobs" are currently indexed, meaning that you can
> look up a particular task id by revision or pushdate or just get the most
> recent one on a particular branch.  It has nothing to do with whether the
> job appears on treeherder.
> 
> mozharness - Callek

I'm guilty of cargo-culting the index stuff (and the v2 index stuff even tripped me up because I didn't even have some of the build type et-all vars defined at first). I have no current need.
Flags: needinfo?(bugspam.Callek)
(In reply to Mike Hommey [:glandium] from comment #74)
> These are meant to provide the compiler for other TC builds after this bug
> lands, replacing tooltool.

For the record, I'm reading this as "these will be hooked up as build prerequisites eventually, and don't need indexing"

Comment 85

3 years ago
mozreview-review
Comment on attachment 8786819 [details]
Bug 1286075: use a per-level build for windows;

https://reviewboard.mozilla.org/r/75700/#review74596
Attachment #8786819 - Flags: review?(rthijssen) → review+

Comment 86

3 years ago
mozreview-review
Comment on attachment 8786804 [details]
Bug 1286075: do not add index routes for jobs;

https://reviewboard.mozilla.org/r/75670/#review75054
Attachment #8786804 - Flags: review?(mshal) → review+
Blocks: 1300901
Blocks: 1278402
Blocks: 1296397
Comment hidden (mozreview-request)
Comment hidden (mozreview-request)
Comment hidden (mozreview-request)
Comment hidden (mozreview-request)
Comment hidden (mozreview-request)
Comment hidden (mozreview-request)
Comment hidden (mozreview-request)
Comment hidden (mozreview-request)
Comment hidden (mozreview-request)
Comment hidden (mozreview-request)
Comment hidden (mozreview-request)
Comment hidden (mozreview-request)
Comment hidden (mozreview-request)
Comment hidden (mozreview-request)
Comment hidden (mozreview-request)
Comment hidden (mozreview-request)
Comment hidden (mozreview-request)
Comment hidden (mozreview-request)
Comment hidden (mozreview-request)
Comment hidden (mozreview-request)
Comment hidden (mozreview-request)
Comment hidden (mozreview-request)
Comment hidden (mozreview-request)
Comment hidden (mozreview-request)
Comment hidden (mozreview-request)
Comment hidden (mozreview-request)
Comment hidden (mozreview-request)
Comment hidden (mozreview-request)
Comment hidden (mozreview-request)
Comment hidden (mozreview-request)
Comment hidden (mozreview-request)
Comment hidden (mozreview-request)
Comment hidden (mozreview-request)
Comment hidden (mozreview-request)
Comment hidden (mozreview-request)
Comment hidden (mozreview-request)
Comment hidden (mozreview-request)
Comment hidden (mozreview-request)
Comment hidden (mozreview-request)
Comment hidden (mozreview-request)
Comment hidden (mozreview-request)
Comment hidden (mozreview-request)
Comment hidden (mozreview-request)
Attachment #8786804 - Attachment is obsolete: true
Attachment #8786805 - Attachment is obsolete: true
Attachment #8786820 - Attachment is obsolete: true
Attachment #8786826 - Attachment is obsolete: true

Comment 141

3 years ago
mozreview-review
Comment on attachment 8789140 [details]
Bug 1286075: factor load_yaml into a util module;

https://reviewboard.mozilla.org/r/77418/#review75708

::: taskcluster/taskgraph/test/test_util_yaml.py:18
(Diff revision 1)
> +prop:
> +    - val1
> +"""
> +
> +
> +class TestDocker(unittest.TestCase):

TestDocker?
Attachment #8789140 - Flags: review?(gps) → review+
Attachment #8789146 - Flags: review?(gps)
Attachment #8789147 - Flags: review?(gps)

Comment 142

3 years ago
mozreview-review
Comment on attachment 8789164 [details]
Bug 1286075: use the default gecko-b-* workerType for all haz builds;

https://reviewboard.mozilla.org/r/77464/#review75710
Attachment #8789164 - Flags: review+
Attachment #8789141 - Flags: review?(gps)

Comment 143

3 years ago
mozreview-review
Comment on attachment 8789156 [details]
Bug 1286075: add a b2g-device kind;

https://reviewboard.mozilla.org/r/77450/#review75842

Well I think you know those changes more than I do, since it is all new, but it looks to be much more simple now that it was before. I guess you took the current descriptions and you moved that to the new system, and I saw nothing that would raise a warning :)

Thanks for doing this!

::: taskcluster/ci/b2g-device/kind.yml:37
(Diff revision 1)
> +        kind: build
> +        tier: 3
> +    run-on-projects:
> +      - try
> +      - mozilla-central
> +      - integration

I guess we will have to add pine too? But maybe that should be done only on pine tree :)
Attachment #8789156 - Flags: review?(lissyx+mozillians) → review+

Comment 144

3 years ago
mozreview-review
Comment on attachment 8789165 [details]
Bug 1286075: clean up mulet builds;

https://reviewboard.mozilla.org/r/77466/#review75846
Attachment #8789165 - Flags: review?(lissyx+mozillians) → review+

Comment 145

3 years ago
mozreview-review
Comment on attachment 8789137 [details]
Bug 1286075: use cache names based on build_platfom+build_type;

https://reviewboard.mozilla.org/r/77412/#review75848
Attachment #8789137 - Flags: review?(lissyx+mozillians) → review+
Comment on attachment 8786823 [details]
Bug 1286075: never coalesce on try;

https://reviewboard.mozilla.org/r/75706/#review75850

thanks!
Attachment #8786823 - Flags: review?(jmaher) → review+

Comment 147

3 years ago
mozreview-review
Comment on attachment 8789136 [details]
Bug 1286075: all device builds are at tier 3;

https://reviewboard.mozilla.org/r/77410/#review75854
Attachment #8789136 - Flags: review?(lissyx+mozillians) → review+

Comment 148

3 years ago
mozreview-review
Comment on attachment 8789143 [details]
Bug 1286075: fix target task generation, including try;

https://reviewboard.mozilla.org/r/77424/#review75878

Overall looks good, the majority of this code goes unused until later patches though, but thats ok.

::: taskcluster/taskgraph/target_tasks.py:19
(Diff revision 1)
> +
> +RELEASE_PROJECTS = set([
> +    'mozilla-central',
> +    'mozilla-aurora',
> +    'mozilla-beta',
> +    'mozilla-release',

add esr now? or only when we get ready to branch for esr with this code involved?

::: taskcluster/taskgraph/target_tasks.py:20
(Diff revision 1)
> +RELEASE_PROJECTS = set([
> +    'mozilla-central',
> +    'mozilla-aurora',
> +    'mozilla-beta',
> +    'mozilla-release',
>  ])

I'm also tempted to bikeshed this and say "mainline" or such, and then leave "RELEASE" for either the stuff in hg.m.o/releases/ or just being beta/release/esr.

Instead I'm just going to leave this "color square" here and let you run with it as-is or change, at your discretion.

::: taskcluster/taskgraph/target_tasks.py:70
(Diff revision 1)
>                  task.attributes['task_duplicates'] = options.trigger_tests
>  
>      return target_tasks_labels
>  
>  
> -@_target_task('all_builds_and_tests')
> +@_target_task('all_builds_and_tests')  # (old name)

I'm curious why the # (old name) needs to stay here. But I suspect there is some sort of backwards-compat or chicken-egg thing involved.

::: taskcluster/taskgraph/target_tasks.py:107
(Diff revision 1)
> -            'linux64-pgo',
> -        ])):
> -            return False
> +                return False
> -        if not attrmatch(attrs, e10s=True):
> +        # don't upload symbols
> +        if task.attributes['kind'] == 'upload-symbols':
>              return False

huh -- did we not upload symbols before? (I'd imagine having symbols from symbol server for these builds could have been wanted by the devs.

I'd be happy with an additional review on <human who checked out ash> if you don't have a good answer on the need or no-need here.

::: taskcluster/taskgraph/test/test_target_tasks.py:43
(Diff revision 1)
> +            'b': TestTask(kind='build', label='b',
> +                          attributes={'run_on_projects': ['integration']}),
>              'boring': TestTask(kind='docker', label='boring'),
>          }, graph=Graph(nodes={'a', 'b', 'boring'}, edges=set()))
> -        self.assertEqual(sorted(method(graph, {})), sorted(['a', 'b']))
> +        parameters = {'project': 'mozilla-inbound'}
> +        self.assertEqual(sorted(method(graph, parameters)), sorted(['a', 'b']))

I'd love moar tests here, e.g. same tasks with project': m-c

and similarly a test for beta/m-c with run_on_projects matching that.

maybe adding a third task to all of them with run_on for try?
Attachment #8789143 - Flags: review?(bugspam.Callek) → review+

Comment 149

3 years ago
mozreview-review
Comment on attachment 8789150 [details]
Bug 1286075: add source-check kind;

https://reviewboard.mozilla.org/r/77438/#review75892

Thanks this looks great! r-'ing for now, I'd like to re-review once the issues are fixed.

::: taskcluster/ci/source-check/eslint.yml:1
(Diff revision 1)
> +mozlint-eslint/opt:

The eslint, flake8 and wpt linters are all mozlint based linters. I'd recommend putting them all in a single 'mozlint.yml' file for now. There's probably an argument to be made that these should even be their own kind, but this is good for now. We can re-evaluate later if need be.

Also if there's a way we can share configs between them (without making a new kind) we should as the number of mozlint based linters are likely to grow. Otherwise a bit of duplication is fine for now.

::: taskcluster/ci/source-check/eslint.yml:30
(Diff revision 1)
> +    when:
> +        files-changed:
> +            # Files that are likely audited.

Nice, glad this is defined in task config instead of branch config now.

::: taskcluster/ci/source-check/python-lint.yml:33
(Diff revision 1)
> +            - '**/.flake8'
> +            - 'python/mozlint/**'
> +            - 'tools/lint/**'
> +            - 'testing/docker/lint/**'
> +
> +wptlint-gecko/opt:

This isn't a python linter, but if you follow the advice of my previous issue, then this issue is moot anyway.

::: taskcluster/ci/source-check/python-lint.yml:50
(Diff revision 1)
> +        using: run-task
> +        command: >
> +            cd /home/worker/checkouts/gecko &&
> +            ./mach lint -l wpt -f treeherder

Should these three lint tasks use the new 'mach' transform you added? I guess the eslint one is tricky because it does some extra stuff. The flake8 and wpt ones should work though.

::: taskcluster/ci/source-check/python-lint.yml:58
(Diff revision 1)
> +        files-changed:
> +        - 'testing/web-platform/tests/**'
> +        - 'python/mozlint/**'
> +        - 'tools/lint/**'
> +        - 'testing/docker/lint/**'

Missing indent

::: taskcluster/ci/source-check/python-tests.yml:1
(Diff revision 1)
> +taskgraph-tests/opt:

Buildtypes don't make much sense for jobs that don't require a build, but I guess the /opt everywhere is needed to appease treeherder? Maybe we can at least specify it behind the scenes so it's not needed in the task configs.

::: taskcluster/docs/kinds.rst:38
(Diff revision 1)
> +------------
> +
> +Source-checks are tasks that look at the Gecko source directly to check
> +correctness.  This can include linting, Python unit tests, source-code
> +analysis, or measurement work -- basically anything that does not require a
> +full ``./mach build`` or equivalent.

I would just replace this whole line with 'build'. Saying 'full' might make people think e.g artifact builds don't count.
Attachment #8789150 - Flags: review?(ahalberstadt) → review-
Comment on attachment 8789157 [details]
Bug 1286075: add a (temporary) marionette-harness kind;

https://reviewboard.mozilla.org/r/77452/#review75944

r+wc (One question.)

::: taskcluster/ci/marionette-harness/kind.yml:37
(Diff revision 1)
> +        worker:
> +            implementation: docker-worker
> +            docker-image: {in-tree: desktop-build}  # NOTE: better to use the lint image
> +            env:
> +                JOB_SCRIPT: "taskcluster/scripts/tester/harness-test-linux.sh"
> +                MOZHARNESS_SCRIPT: "testing/mozharness/scripts/marionette_harness_tests.py\n" # NOTE: remove trailing newline

Why can't the newline be removed now? (The original task definition did not include a newline.)
Attachment #8789157 - Flags: review?(mjzffr) → review+

Comment 151

3 years ago
mozreview-review
Comment on attachment 8789148 [details]
Bug 1286075: add hazard kind;

https://reviewboard.mozilla.org/r/77434/#review75948

::: taskcluster/ci/hazard/kind.yml:41
(Diff revision 1)
> +        run-on-projects:
> +            - integration
> +            - release

This one should be run everywhere, so I guess you can just delete run-on-projects

::: taskcluster/ci/hazard/kind.yml:81
(Diff revision 1)
> +        run-on-projects:
> +            - try

It's not in this patch, but I don't really want linux64-mulet-haz to ever run unless someone explicitly requests it.

It's broken now anyway, and I have no intention of fixing it myself, but it may still be useful for whoever is doing b2g work? I don't know the state of that.
Attachment #8789148 - Flags: review?(sphink) → review-

Comment 152

3 years ago
mozreview-review
Comment on attachment 8789155 [details]
Bug 1286075: add a spidermonkey kind;

https://reviewboard.mozilla.org/r/77448/#review75950

It all looks pretty much good, but there were a couple things that I think need to be fixed. Some may stem from my own misunderstandings, though.

::: taskcluster/ci/spidermonkey/kind.yml:27
(Diff revision 1)
> +        max-run-time: 36000
> +        docker-image: {in-tree: desktop-build}
> +
> +jobs:
> +    sm-package/opt:
> +        description: "Spidermonkey Plain"

"Spidermonkey source package and test", I guess?

::: taskcluster/ci/spidermonkey/kind.yml:30
(Diff revision 1)
> +jobs:
> +    sm-package/opt:
> +        description: "Spidermonkey Plain"
> +        index:
> +            job-name:
> +                buildbot: sm-plain

This particular job never existed on buildbot, though I can believe it might have had a buildbot index entry anyway.

Nobody uses this. Please remove it from all of these jobs.

::: taskcluster/ci/spidermonkey/kind.yml:38
(Diff revision 1)
> +        run-on-projects:
> +            - integration
> +            - release

I definitely want this on try as well.

::: taskcluster/ci/spidermonkey/kind.yml:42
(Diff revision 1)
> +            files-changed:
> +                - build/**
> +                - configure.py
> +                - dom/bindings/**
> +                - intl/icu/**
> +                - js/moz.configure
> +                - js/public/**
> +                - js/src/**
> +                - layout/tools/reftest/reftest/**
> +                - media/webrtc/trunk/tools/gyp/**
> +                - memory/**
> +                - mfbt/**
> +                - modules/fdlibm/**
> +                - modules/zlib/src/**
> +                - mozglue/**
> +                - moz.configure
> +                - nsprpub/**
> +                - python/**
> +                - taskcluster/moz.build
> +                - testing/mozbase/**
> +                - test.mozbuild
> +                - toolkit/mozapps/installer/package-name.mk
> +                - toolkit/mozapps/installer/upload-files.mk

Oh wow, I hadn't realized all this was here already. RyanVM recently opened bug 1301122 to add things to this list... but they already seem to be here. Well, with the exception of Makefile.in, moz.build, and config/\*\*.

I was planning on waiting until this landed to add those in.

::: taskcluster/ci/spidermonkey/kind.yml:77
(Diff revision 1)
> +                gecko-v2: sm-plaindebug-debug
> +        treeherder:
> +            platform: linux64/debug
> +            symbol: SM-tc(p)
> +        run:
> +            using: spidermonkey

This could be in job-defaults, if it can handle non-toplevel keys. I don't know how the merging works, though, so maybe specifying the spidermonkey-variant would wipe out the using? Or maybe you just prefer this to be explicit? I'm fine either way.

::: taskcluster/ci/spidermonkey/kind.yml:79
(Diff revision 1)
> +        run-on-projects:
> +            - integration
> +            - release

All of these spidermonkey jobs should run on try too. Assuming I understand what that means -- I mean, they should run on try with try: -p linux64 if they are listed in the ridealongs, and they should be individually selectable on try no matter what.

::: taskcluster/ci/spidermonkey/kind.yml:83
(Diff revision 1)
> +            files-changed:
> +                - js/public/**
> +                - js/src/**

It would be nice for this to be set in the job-defaults, as it should be the same for all spidermonkey jobs except for sm-package. But again, I don't know whether that's possible.

::: taskcluster/ci/spidermonkey/kind.yml:194
(Diff revision 1)
> +                buildbot: sm-plain
> +                gecko-v1: sm-msan.opt
> +                gecko-v2: sm-msan-opt
> +        treeherder:
> +            symbol: SM-tc(msan)
> +            tier: 1

Why an explicit tier setting for this one but not the rest? They should all be tier 1 except for sm-tsan. (Which reminds me, I really ought to fix that up to be tier 1 too.)

::: taskcluster/ci/spidermonkey/kind.yml:217
(Diff revision 1)
> +                gecko-v1: sm-tsan.opt
> +                gecko-v2: sm-tsan-opt
> +        treeherder:
> +            symbol: SM-tc(tsan)
> +            tier: 3
> +        run-on-projects: []

So... what does that mean? I want it to be selectable on try. I guess I don't really want it to run if you don't specifically request it.
Attachment #8789155 - Flags: review?(sphink) → review-

Comment 153

3 years ago
mozreview-review
Comment on attachment 8789158 [details]
Bug 1286075: add a (temporary) android-stuff kind;

https://reviewboard.mozilla.org/r/77454/#review75962

I'm fine with this if you are, but I feel like the abstraction point should be "higher up".  These jobs are supposed to run in the same environment as "regular build jobs", but with Android bits included, and these were the set of things I cargo culted in the legacy .yml files.  In any case, I trust your judgement.
Attachment #8789158 - Flags: review?(nalexander) → review+

Comment 154

3 years ago
mozreview-review
Comment on attachment 8789149 [details]
Bug 1286075: add l10n kind;

https://reviewboard.mozilla.org/r/77436/#review75970

Looks good to me.

When you rebase this please add:

diff --git a/taskcluster/ci/l10n/kind.yml b/taskcluster/ci/l10n/kind.yml
--- a/taskcluster/ci/l10n/kind.yml
+++ b/taskcluster/ci/l10n/kind.yml
@@ -80,8 +80,31 @@ jobs:
             options:
                 - environment-config=single_locale/production.py
                 - branch-config=single_locale/try.py
                 - platform-config=single_locale/linux64.py
                 - total-chunks=1
                 - this-chunk=1
             tooltool-downloads: public
             need-xvfb: true
+    android-api-15-l10n/opt:
+        description: "Single Locale Repack"
+        index:
+            product: mobile
+            job-name:
+                gecko-v2: android-l10n-opt
+        treeherder:
+            platform: android-api-15/opt
+            symbol: tc(L10n)
+        worker-type: aws-provisioner-v1/android-api-15
+        worker:
+            max-run-time: 18000
+        run:
+            using: mozharness
+            script: mozharness/scripts/mobile_l10n.py
+            actions: [clone-locales list-locales setup repack upload-repacks summary]
+            config:  # NOTE: this will need to be modified in a transform..
+                - single_locale/try_android-api-15.py single_locale/tc_android-api-15.py
+            options:
+                - total-chunks=1
+                - this-chunk=1
+            tooltool-downloads: internal
+            need-xvfb: true
Attachment #8789149 - Flags: review?(bugspam.Callek) → review+
(In reply to Justin Wood (:Callek) from comment #154)
> +        index:
> +            product: mobile
> +            job-name:
> +                gecko-v2: android-l10n-opt

Currently we index these as:

{index}.gecko.v2.{project}.latest.{build_product}-l10n.{build_name}-{build_type}.{locale}

eg:

gecko.v2.mozilla-central.latest.mobile-l10n.android-api-11-opt.ar

It seems like your suggestion would make this:

gecko.v2.mozilla-central.latest.mobile.android-l10n-opt (I'm not sure how the locale fits in here?)

Is this intended?
Flags: needinfo?(bugspam.Callek)

Comment 156

3 years ago
mozreview-review
Comment on attachment 8786824 [details]
Bug 1286075: drop gecko.v1 routes for lint tasks;

https://reviewboard.mozilla.org/r/75708/#review75972

::: taskcluster/ci/legacy/tasks/builds/linux64_gcc.yml:4
(Diff revision 2)
>  $inherits:
>    from: 'tasks/build.yml'
>    variables:
> -    build_name: 'linux64-clang'
> +    build_product: firefox

It looks like these are just fixes from an earlier commit, right? ("simplify toolchain build tasks"). It doesn't seem related to gecko.v1 lint routes.

Comment 157

3 years ago
mozreview-review
Comment on attachment 8789151 [details]
Bug 1286075: add upload-symbols kind; .mielczarek

https://reviewboard.mozilla.org/r/77440/#review75976

We discussed this on IRC a little. We should only be uploading symbols for Nightly or Release builds, which we aren't doing from TC right now. It appears that we will wind up doing Nightly builds in TC in the near future, so having this in-place seems desirable, but we should not actually hook it up to anything yet.
Attachment #8789151 - Flags: review?(ted) → review+
(In reply to Michael Shal [:mshal] from comment #155)
> (In reply to Justin Wood (:Callek) from comment #154)
> > +        index:
> > +            product: mobile
> > +            job-name:
> > +                gecko-v2: android-l10n-opt
> 
> Currently we index these as:
> 
> {index}.gecko.v2.{project}.latest.{build_product}-l10n.{build_name}-
> {build_type}.{locale}

These index's are from buildbot's upload to taskcluster tasks, individual tasks that each have their own single locale on them.

> eg:
> 
> gecko.v2.mozilla-central.latest.mobile-l10n.android-api-11-opt.ar
> 
> It seems like your suggestion would make this:
> 
> gecko.v2.mozilla-central.latest.mobile.android-l10n-opt (I'm not sure how
> the locale fits in here?)
> 
> Is this intended?

This specific index is based on what I already had active for android-l10n in the initial-landing bug, which I put here in the new world so it doesn't change. Its similar to how I setup the TC linux index's for l10n as well.

I think the locale-specific way to index is desired somehow, I'm not sure how we'll do those *yet* but its on my list to figure out.
Flags: needinfo?(bugspam.Callek)
(In reply to Justin Wood (:Callek) from comment #158)
> (In reply to Michael Shal [:mshal] from comment #155)
> > (In reply to Justin Wood (:Callek) from comment #154)
> > > +        index:
> > > +            product: mobile
> > > +            job-name:
> > > +                gecko-v2: android-l10n-opt
> > 
> > Currently we index these as:
> > 
> > {index}.gecko.v2.{project}.latest.{build_product}-l10n.{build_name}-
> > {build_type}.{locale}
> 
> These index's are from buildbot's upload to taskcluster tasks, individual
> tasks that each have their own single locale on them.

Correct. For android we also put the multi-l10n build also goes here under the "multi" locale.

> This specific index is based on what I already had active for android-l10n
> in the initial-landing bug, which I put here in the new world so it doesn't
> change. Its similar to how I setup the TC linux index's for l10n as well.

In general, we don't want to change how things are indexed just because the build is moving from buildbot to Taskcluster. Is there a reason for the reorganization? Which bug is the initial landing bug? We can move this discussion over there.

> I think the locale-specific way to index is desired somehow, I'm not sure
> how we'll do those *yet* but its on my list to figure out.

Yes, I think a locale-specific indexing is required. Without the {local} in the route, one locale will overwrite the indexing for another locale, so you could only access locales based on which group finished first (or last?)
Flags: needinfo?(bugspam.Callek)

Comment 160

3 years ago
mozreview-review
Comment on attachment 8789153 [details]
Bug 1286075: add a static-analysis kind;

https://reviewboard.mozilla.org/r/77444/#review75984

r=me with the comments below addressed.

::: taskcluster/ci/static-analysis/kind.yml:34
(Diff revision 1)
> +        treeherder:
> +            platform: osx-10-7/opt
> +        worker-type: aws-provisioner-v1/opt-macosx64
> +        run:
> +            using: mozharness
> +            actions: [get-secrets build generate-build-stats update]

Not pretending in the slightest that I understand what's going on here, but this set of actions has "update" which is missing from the other one.  Is that intentional?

::: taskcluster/ci/static-analysis/kind.yml:62
(Diff revision 1)
> +                - builds/releng_sub_linux_configs/64_stat_and_opt.py
> +                - balrog/production.py
> +            script: "mozharness/scripts/fx_desktop_build.py"
> +            secrets: true
> +            tooltool-downloads: public
> +            need-xvfb: true

We shouldn't need xvfb for these builds (unless if that's needed by something outside of the Firefox build system.)

::: taskcluster/docs/kinds.rst:55
(Diff revision 1)
>  
> +static-analysis
> +---------------
> +
> +Static analysis builds use the compiler to perform some detailed analysis of
> +the source code without actually producing a build.  The useful output from

The "without actually producing a build" part is false -- it's just that we don't upload those builds.  Perhaps you'd want to reword this paragraph.
Attachment #8789153 - Flags: review?(ehsan) → review+
(In reply to Michael Shal [:mshal] from comment #159)
> (In reply to Justin Wood (:Callek) from comment #158)
> > I think the locale-specific way to index is desired somehow, I'm not sure
> > how we'll do those *yet* but its on my list to figure out.
> 
> Yes, I think a locale-specific indexing is required. Without the {local} in
> the route, one locale will overwrite the indexing for another locale, so you
> could only access locales based on which group finished first (or last?)

Can we make this a new bug, assigned to me, that we then work on hashing out.

I can do the linkage of it all too. I'd rather avoid conflating this bug with discussion on the index here, since this bug isn't changing the index of existing tasks related to this...
Flags: needinfo?(bugspam.Callek)

Comment 162

3 years ago
mozreview-review
Comment on attachment 8789152 [details]
Bug 1286075: add a valgrind kind;

https://reviewboard.mozilla.org/r/77442/#review75990

LGTM - the naming issue is more curiosity than anything.

::: taskcluster/ci/valgrind/kind.yml:20
(Diff revision 1)
> +        index:
> +            product: firefox
> +            job-name: linux64-valgrind-opt
> +        treeherder:
> +            platform: linux64/opt
> +            symbol: tc(V)

Should this just be "V" now instead of "tc(V)"? I thought the tc prefix was primarily for when things were both in buildbot and Taskcluster.
Attachment #8789152 - Flags: review?(mshal) → review+
(In reply to Justin Wood (:Callek) from comment #161)
> (In reply to Michael Shal [:mshal] from comment #159)
> > (In reply to Justin Wood (:Callek) from comment #158)
> > > I think the locale-specific way to index is desired somehow, I'm not sure
> > > how we'll do those *yet* but its on my list to figure out.
> > 
> > Yes, I think a locale-specific indexing is required. Without the {local} in
> > the route, one locale will overwrite the indexing for another locale, so you
> > could only access locales based on which group finished first (or last?)
> 
> Can we make this a new bug, assigned to me, that we then work on hashing out.
> 
> I can do the linkage of it all too. I'd rather avoid conflating this bug
> with discussion on the index here, since this bug isn't changing the index
> of existing tasks related to this...

Now bug 1301495, for anyone who wants to follow along.
Comment on attachment 8789161 [details]
Bug 1286075: make redo logging prettier;

Redirecting to bhearsum since he authored redo (I just imported it into m-c). Maybe this should be upstreamed?
Attachment #8789161 - Flags: review?(mshal) → review?(bhearsum)

Comment 165

3 years ago
mozreview-review
Comment on attachment 8789167 [details]
Bug 1286075: disable indexing for jobs that do not need it;

https://reviewboard.mozilla.org/r/77470/#review76006
Attachment #8789167 - Flags: review?(mshal) → review+

Comment 166

3 years ago
mozreview-review
Comment on attachment 8789161 [details]
Bug 1286075: make redo logging prettier;

https://reviewboard.mozilla.org/r/77460/#review76008

::: python/redo/redo/__init__.py:70
(Diff revision 1)
>  
>          if _ == attempts - 1:
>              # Don't need to sleep the last time
>              break
>          log.debug("sleeping for %.2fs (attempt %i/%i)", sleeptime, _ + 1, attempts)
>          time.sleep(sleeptime)

If you have time, could you send a PR to https://github.com/bhearsum/redo as well?
Attachment #8789161 - Flags: review?(bhearsum)

Comment 168

3 years ago
mozreview-review
Comment on attachment 8786824 [details]
Bug 1286075: drop gecko.v1 routes for lint tasks;

https://reviewboard.mozilla.org/r/75708/#review75974
Attachment #8786824 - Flags: review?(mshal) → review+

Comment 169

3 years ago
mozreview-review
Comment on attachment 8789163 [details]
Bug 1286075: do not use secrets for hazard jobs;

https://reviewboard.mozilla.org/r/77462/#review76024

Rubber stamp.
Attachment #8789163 - Flags: review?(sphink) → review+

Comment 170

3 years ago
mozreview-review
Comment on attachment 8789164 [details]
Bug 1286075: use the default gecko-b-* workerType for all haz builds;

https://reviewboard.mozilla.org/r/77464/#review76026
Attachment #8789164 - Flags: review?(sphink) → review+

Comment 171

3 years ago
mozreview-review
Comment on attachment 8789166 [details]
Bug 1286075: remove unused secrets access from spidermonkey;

https://reviewboard.mozilla.org/r/77468/#review76028
Attachment #8789166 - Flags: review?(sphink) → review+

Comment 172

3 years ago
mozreview-review
Comment on attachment 8789159 [details]
Bug 1286075: delete the legacy kind;

https://reviewboard.mozilla.org/r/77456/#review76032

This looks almost ready, though r- for now because I believe moving routes.json will break buildbot+mozharness builds. I'd prefer if we can still maintain sharing the gecko.v2 route templates between Taskcluster and mozharness anyway (more details in another upcoming review), at least until all our builds are moved off of buildbot. Otherwise I think indexing styles will drift over time.

::: taskcluster/taskgraph/task/nightly_fennec.py:62
(Diff revision 1)
> +            'gaia_rev': gaia['git']['git_revision'],
> +            'gaia_ref': gaia['git']['branch'],
> +        }
> +
> +
> +def decorate_task_json_routes(task, json_routes, parameters):

How come nightly fennec can't use the same route logic that all the other tasks do? Is this just a placeholder?

::: taskcluster/taskgraph/task/nightly_fennec.py:69
(Diff revision 1)
> +
> +    :param dict task: task definition.
> +    :param json_routes: the list of routes to use from routes.json
> +    :param parameters: dictionary of parameters to use in route templates
> +    """
> +    routes = task.get('routes', [])

Since this is presumably a nightly build, I'd expect this to use the "nightly" routes in addition to the regular "routes" from routes.json
Attachment #8789159 - Flags: review?(mshal) → review-

Comment 173

3 years ago
mozreview-review
Comment on attachment 8789141 [details]
Bug 1286075: add more functionality to the task description;

https://reviewboard.mozilla.org/r/77420/#review76034

r- for now since I would like to see the gecko.v2 routes defined in a way that can be shared with mozharness. I think addressing this will also r+ the "delete the legacy kind" patch.

::: taskcluster/docs/transforms.rst:157
(Diff revision 1)
> +The parts of the task description that are specific to a worker implementation
> +are isolated in a ``worker`` object which has an ``implementation`` property
> +naming the worker implementation.  Thus the transforms that produce a task
> +description must be aware of the worker implementation to be used, but need not
> +be aware of the details of its payload format.
> +  

nit: whitespace

::: taskcluster/taskgraph/transforms/task.py:249
(Diff revision 1)
> +V1_ROUTE_TEMPLATES = [
> +    "index.gecko.v1.{project}.latest.linux.{job-name-gecko-v1}",
> +    "index.gecko.v1.{project}.revision.linux.{head_rev}.{job-name-gecko-v1}",
> +]
> +
> +V2_ROUTE_TEMPLATES = [

Is there a way we can share the V2 templates with mozharness? Perhaps just a gecko_v2.py that can be imported instead of routes.json. I'm fine with the other changes here (renaming build_product to product, and consolidating {build_name}-{build_type} into {job-name-gecko-v2}

Changing mozharness to use these template names instead of the old ones should be straightforward. I am primarily concerned with adding new v2 routes that will only be used by some jobs rather than all jobs.

::: taskcluster/taskgraph/transforms/task.py:251
(Diff revision 1)
> +    "index.gecko.v1.{project}.revision.linux.{head_rev}.{job-name-gecko-v1}",
> +]
> +
> +V2_ROUTE_TEMPLATES = [
> +    "index.gecko.v2.{project}.latest.{product}.{job-name-gecko-v2}",
> +    "index.gecko.v2.{project}.pushdate.{pushdate_long}.{product}.{job-name-gecko-v2}",

Actually, 'job-name-gecko-v2' (and others) should probably be 'job_name_gecko_v2' to match the style of 'pushdate_long' and 'head_rev'. (Or change pushdate_long and head_rev to pushdate-long and head_rev).

::: taskcluster/taskgraph/transforms/task.py:253
(Diff revision 1)
> +
> +V2_ROUTE_TEMPLATES = [
> +    "index.gecko.v2.{project}.latest.{product}.{job-name-gecko-v2}",
> +    "index.gecko.v2.{project}.pushdate.{pushdate_long}.{product}.{job-name-gecko-v2}",
> +    "index.gecko.v2.{project}.revision.{head_rev}.{product}.{job-name-gecko-v2}",
> +]

The v2 route templates seem to omit the routes used for nightly builds and l10n jobs that were present in routes.json. How are you planning to handle those?
Attachment #8789141 - Flags: review?(mshal) → review-
Comment on attachment 8789146 [details]
Bug 1286075: add a build kind, modify tests to use it;

apologies for delay in review. will look tomorrow morning PT

Comment 175

3 years ago
mozreview-review
Comment on attachment 8786802 [details]
Bug 1286075: set ARTIFACT_TASKID correctly for upload-symbol jobs;

https://reviewboard.mozilla.org/r/75664/#review76194

::: taskcluster/taskgraph/task/legacy.py:110
(Diff revision 2)
>  
> -    if 'treeherder' not in task['task']['extra']:
> -        task['task']['extra']['treeherder'] = {}
> +    if 'extra' not in task['task']:
> +        task['task']['extra'] = {}
>  
> +    # only set up treeherder information if the task contained any to begin with
> +    if 'treeherder' in task['task'].setdefault('extra', {}):

Given the code just above, this should be just `task['task']['extra']`.
Assignee

Comment 176

3 years ago
mozreview-review-reply
Comment on attachment 8789143 [details]
Bug 1286075: fix target task generation, including try;

https://reviewboard.mozilla.org/r/77424/#review75878

> add esr now? or only when we get ready to branch for esr with this code involved?

That will need to be a regex, right?  Is that similar for all release branches?

> I'm also tempted to bikeshed this and say "mainline" or such, and then leave "RELEASE" for either the stuff in hg.m.o/releases/ or just being beta/release/esr.
> 
> Instead I'm just going to leave this "color square" here and let you run with it as-is or change, at your discretion.

I'd like to call them whatever everyone else calls them.  Can you break that down for me and I'll reflect it here?

> huh -- did we not upload symbols before? (I'd imagine having symbols from symbol server for these builds could have been wanted by the devs.
> 
> I'd be happy with an additional review on <human who checked out ash> if you don't have a good answer on the need or no-need here.

From everything I can see, no, no symbols.
Assignee

Comment 177

3 years ago
mozreview-review-reply
Comment on attachment 8789150 [details]
Bug 1286075: add source-check kind;

https://reviewboard.mozilla.org/r/77438/#review75892

> The eslint, flake8 and wpt linters are all mozlint based linters. I'd recommend putting them all in a single 'mozlint.yml' file for now. There's probably an argument to be made that these should even be their own kind, but this is good for now. We can re-evaluate later if need be.
> 
> Also if there's a way we can share configs between them (without making a new kind) we should as the number of mozlint based linters are likely to grow. Otherwise a bit of duplication is fine for now.

I'll move them to another file.  The commonalities might be handled with a "mozlint" run-using, rather than a new kind.  I'll leave that to you in a followup :)

> Buildtypes don't make much sense for jobs that don't require a build, but I guess the /opt everywhere is needed to appease treeherder? Maybe we can at least specify it behind the scenes so it's not needed in the task configs.

Yes, treeherder and try syntax require it.  This is something I'd like to address in bug 1286086.  Maybe we allow jobs with no labels, or invent a label for this sort of job.
Assignee

Comment 178

3 years ago
mozreview-review-reply
Comment on attachment 8789157 [details]
Bug 1286075: add a (temporary) marionette-harness kind;

https://reviewboard.mozilla.org/r/77452/#review75944

> Why can't the newline be removed now? (The original task definition did not include a newline.)

It did, actually; the ">" syntax always includes a trailing newline:

      MOZHARNESS_SCRIPT: >
          testing/mozharness/scripts/marionette_harness_tests.py
          
I didn't remove it since that would have shown up as a diff in my tests, but I'll add an extra commit to remove it after the tests have run.
Assignee

Comment 179

3 years ago
mozreview-review-reply
Comment on attachment 8789148 [details]
Bug 1286075: add hazard kind;

https://reviewboard.mozilla.org/r/77434/#review75948

Both of these are changes from the existing functionality, so I'll make them in a new cset that occurs after the point where I'm testing for differences in tasks.  If you can flag this one as r+ (since it accurately reflects the existing taskgraph), that'd be great.

> It's not in this patch, but I don't really want linux64-mulet-haz to ever run unless someone explicitly requests it.
> 
> It's broken now anyway, and I have no intention of fixing it myself, but it may still be useful for whoever is doing b2g work? I don't know the state of that.

I will set `run-on-projects: []` to achieve that, with a suitable comment.
(In reply to Dustin J. Mitchell [:dustin] from comment #176)
> Comment on attachment 8789143 [details]
> Bug 1286075: fix target task generation, including try;
> 
> https://reviewboard.mozilla.org/r/77424/#review75878
> 
> > add esr now? or only when we get ready to branch for esr with this code involved?
> 
> That will need to be a regex, right?  Is that similar for all release
> branches?

Regex would be easiest, but the esr branches are predictable, so maybe regex complexity isn't needed. Either way, the other 'release branches' are stable names.

> > I'm also tempted to bikeshed this and say "mainline" or such, and then leave "RELEASE" for either the stuff in hg.m.o/releases/ or just being beta/release/esr.
> > 
> > Instead I'm just going to leave this "color square" here and let you run with it as-is or change, at your discretion.
> 
> I'd like to call them whatever everyone else calls them.  Can you break that
> down for me and I'll reflect it here?

They are called different things by different people, for different contexts, if there was one tried and true way I'd just say to use that instead :-). I'm happy to proceed as is as well, fwiw.


> > huh -- did we not upload symbols before? (I'd imagine having symbols from symbol server for these builds could have been wanted by the devs.
> > 
> > I'd be happy with an additional review on <human who checked out ash> if you don't have a good answer on the need or no-need here.
> 
> From everything I can see, no, no symbols.

Yesterday i saw :ted say that symbol uploads should actually happen only on nightlies/release jobs in theory anyway, so its likely that even if you're wrong (and we have symbols now) that the state you're doing here is actually wants wanted!

Comment 181

3 years ago
mozreview-review
Comment on attachment 8789160 [details]
Bug 1286075: add how-tos covering new functionality;

https://reviewboard.mozilla.org/r/77458/#review76244

::: taskcluster/docs/how-tos.rst:128
(Diff revision 1)
> +If you are adding a new task that is not a test suite, there are a number of
> +options.  A few questions to consider:
> +
> + * Is this a new build platform or variant that will produce an artifact to
> +   be run through the usual test suites?
> + 

boo a space!

::: taskcluster/docs/how-tos.rst:182
(Diff revision 1)
> +If your task depends on other tasks, then the decision of which tasks to create
> +may require some code.  For example, the ``upload-symbols`` kind iterates over
> +the builds in the graph, generating a task for each one.  This specific
> +post-build behavior is implemented in the general
> +``taskgraph.task.post_build:PostBuildTask`` kind implementation.  If your task
> +needs something more purpose-specific, then it may be time to write a new task

write a new task implementation => write a new kind implementation
Attachment #8789160 - Flags: review?(pmoore)
Assignee

Comment 182

3 years ago
mozreview-review-reply
Comment on attachment 8789155 [details]
Bug 1286075: add a spidermonkey kind;

https://reviewboard.mozilla.org/r/77448/#review75950

> I definitely want this on try as well.

(dropped; see below)

> Oh wow, I hadn't realized all this was here already. RyanVM recently opened bug 1301122 to add things to this list... but they already seem to be here. Well, with the exception of Makefile.in, moz.build, and config/\*\*.
> 
> I was planning on waiting until this landed to add those in.

Sorry, I'm not sure what you're asking me to do here.  Should I land as-is and you can make the list what you want in bug 1301122?

> This could be in job-defaults, if it can handle non-toplevel keys. I don't know how the merging works, though, so maybe specifying the spidermonkey-variant would wipe out the using? Or maybe you just prefer this to be explicit? I'm fine either way.

I have a light prefernce for being explicit (since the other attributes here are only valid for using=spidermonkey), but it does get repetitive.  I'll move it.

> All of these spidermonkey jobs should run on try too. Assuming I understand what that means -- I mean, they should run on try with try: -p linux64 if they are listed in the ridealongs, and they should be individually selectable on try no matter what.

`try` in `run-on-projects` means that they are selected with `-p all`.  These jobs are are already running on `-p linux64` due to the ridealong behavior, and are at any rate individually selectable.  So I think this is already doing what you want, and there's nothing to change here (same for your comment about sm-package/opt).  I do know that this patchset doesn't change these behaviors, so if you are happy with how things work now then I think that confirms there's nothing to change.

> It would be nice for this to be set in the job-defaults, as it should be the same for all spidermonkey jobs except for sm-package. But again, I don't know whether that's possible.

Merging allows appending to lists, but not editing them (it's the same merge we've been using for inheritance..).  But in this case, that's adequate, so I'll make this change.

> So... what does that mean? I want it to be selectable on try. I guess I don't really want it to run if you don't specifically request it.

That's exactly what it means.  I'll add a note to the docs to indicate that.
Assignee

Comment 183

3 years ago
mozreview-review-reply
Comment on attachment 8789158 [details]
Bug 1286075: add a (temporary) android-stuff kind;

https://reviewboard.mozilla.org/r/77454/#review75962

I don't know if there was much judgement here -- this is basically cargo-culting what I found in the legacy files :)

I agree that these shouldn't be their own kind.  I'll file a followup bug for you where we can work together to merge these into the "build" kind (or whatever we decide is best).  I couldn't do it myself because there were a number of significant differences from other tasks that I couldn't abstract because I didn't understand them.  It shouldn't be too hard for us to straighten out in isolation -- just too complicated for this bug.
Assignee

Comment 184

3 years ago
mozreview-review-reply
Comment on attachment 8789151 [details]
Bug 1286075: add upload-symbols kind; .mielczarek

https://reviewboard.mozilla.org/r/77440/#review75976

Indeed -- and I'll leave that to you to adjust in a followup bug, if you'd like to file one.
Assignee

Comment 185

3 years ago
mozreview-review-reply
Comment on attachment 8789153 [details]
Bug 1286075: add a static-analysis kind;

https://reviewboard.mozilla.org/r/77444/#review75984

> Not pretending in the slightest that I understand what's going on here, but this set of actions has "update" which is missing from the other one.  Is that intentional?

It matches what the old tasks do, which is my goal here.  Whether that's what *should* be done, I leave up to you!  Same for xvfb.  It seems this new way of writing task-graphs surfaces these differences better than the old!

If you'd like to adjust this, that'll be good to do in a followup bug.
Assignee

Comment 186

3 years ago
mozreview-review-reply
Comment on attachment 8786824 [details]
Bug 1286075: drop gecko.v1 routes for lint tasks;

https://reviewboard.mozilla.org/r/75708/#review75972

> It looks like these are just fixes from an earlier commit, right? ("simplify toolchain build tasks"). It doesn't seem related to gecko.v1 lint routes.

Ugh, looks like I messed up the mozreview ID's and conflated this diff with "scripts for testing (DO NOT LAND)".  I'll try to straighten that out on the next review request (no promises though.. mozreview is confusing!!)
Assignee

Comment 187

3 years ago
mozreview-review-reply
Comment on attachment 8789152 [details]
Bug 1286075: add a valgrind kind;

https://reviewboard.mozilla.org/r/77442/#review75990

> Should this just be "V" now instead of "tc(V)"? I thought the tc prefix was primarily for when things were both in buildbot and Taskcluster.

There's a project afoot to remove the `tc(..)`, but it hasn't landed yet.  When it does, yes, this will change to `symbol: V`.

Comment 188

3 years ago
mozreview-review
Comment on attachment 8789139 [details]
Bug 1286075: dump data that fails validation;

https://reviewboard.mozilla.org/r/77416/#review76276
Attachment #8789139 - Flags: review?(wcosta) → review+
Blocks: 1301720
Assignee

Comment 189

3 years ago
mozreview-review-reply
Comment on attachment 8789141 [details]
Bug 1286075: add more functionality to the task description;

https://reviewboard.mozilla.org/r/77420/#review76034

> Actually, 'job-name-gecko-v2' (and others) should probably be 'job_name_gecko_v2' to match the style of 'pushdate_long' and 'head_rev'. (Or change pushdate_long and head_rev to pushdate-long and head_rev).

I'll take care of this in the fixes in bug 1301720.

> The v2 route templates seem to omit the routes used for nightly builds and l10n jobs that were present in routes.json. How are you planning to handle those?

That's up to Jordan and Callek, respectively (and I see a bug is open already for the latter).

Comment 190

3 years ago
mozreview-review-reply
Comment on attachment 8789148 [details]
Bug 1286075: add hazard kind;

https://reviewboard.mozilla.org/r/77434/#review75948

I am confused by this. I am confident that I can get a try build right now with try: -b d -p linux64-shell-haz. Proof: https://treeherder.mozilla.org/#/jobs?repo=try&revision=e049586da4bc

What are you comparing against?

Comment 193

3 years ago
mozreview-review-reply
Comment on attachment 8789155 [details]
Bug 1286075: add a spidermonkey kind;

https://reviewboard.mozilla.org/r/77448/#review75950

> Sorry, I'm not sure what you're asking me to do here.  Should I land as-is and you can make the list what you want in bug 1301122?

Yes, sorry. There are a couple of things to add, but I wouldn't want to hold up this bug for them. I should have unchecked the "open an issue" button for this comment.

> `try` in `run-on-projects` means that they are selected with `-p all`.  These jobs are are already running on `-p linux64` due to the ridealong behavior, and are at any rate individually selectable.  So I think this is already doing what you want, and there's nothing to change here (same for your comment about sm-package/opt).  I do know that this patchset doesn't change these behaviors, so if you are happy with how things work now then I think that confirms there's nothing to change.

Ah! Ok, I didn't realize that run-on-projects was for -p all. Yes, you are correct, the current behavior is exactly what I want.

> Merging allows appending to lists, but not editing them (it's the same merge we've been using for inheritance..).  But in this case, that's adequate, so I'll make this change.

Yeah, I can easily imagine wanting to add some things to this list (eg mfbt/**), and I'll always want those additions to be applied to everything.

Comment 194

3 years ago
mozreview-review
Comment on attachment 8789155 [details]
Bug 1286075: add a spidermonkey kind;

https://reviewboard.mozilla.org/r/77448/#review76348
Attachment #8789155 - Flags: review- → review+
Assignee

Comment 195

3 years ago
mozreview-review-reply
Comment on attachment 8789148 [details]
Bug 1286075: add hazard kind;

https://reviewboard.mozilla.org/r/77434/#review75948

I think this confusion was over what run-on-projects means (and that it applies to -p all, rather than disallowing runs on try at all)?  I agree that `-p linux64-shell-haz` works now and will continue to work after this patchset lands.
Blocks: 1301762

Comment 196

3 years ago
mozreview-review
Comment on attachment 8789141 [details]
Bug 1286075: add more functionality to the task description;

https://reviewboard.mozilla.org/r/77420/#review76378

This seems mostly good.

I'm not an expert on the index bits. So I didn't scrutinize those too closely. I defer to mshal for that part of the review.

I'll almost certainly rubber stamp r+ the next version of this.

::: taskcluster/docs/attributes.rst:23
(Diff revision 1)
> +run_on_projects
> +===============
> +
> +The projects where this task should be in the target task set.  This is how
> +requirements like "only run this on inbound" get implemented.  These are
> +either project names or the aliases
> +
> + * `integration` -- integration branches
> + * `release` -- release branches including mozilla-central
> + * `all` -- everywhere (the default)

I've never been a fan of the "project branches" terminology because in the context of Firefox there is only a single "project" and "branch" really means "repository."

What are your thoughts on calling this `run_on_repos` instead?

Also, what tasks do we have that run on "integration" but not "release" repos? Isn't "release" a superset of "integration?"

::: taskcluster/taskgraph/transforms/task.py:385
(Diff revision 1)
>  @transforms.add
> +def add_index_routes(config, tasks):
> +    for task in tasks:
> +        index = task.get('index')
> +        routes = task.setdefault('routes', [])
> +        if index:

Consider inverting this and doing an early yield to cut down on the pyramid indentation pattern.

::: taskcluster/taskgraph/transforms/task.py:396
(Diff revision 1)
> +                    'buildbot': base_name,
> +                    'gecko-v1': '{}.{}'.format(base_name, type_name),
> +                    'gecko-v2': '{}-{}'.format(base_name, type_name),
> +                }
> +            vars = config.params.copy()
> +            for n in job_name:

`vars` is a Python builtin. Please use a different variable.

::: taskcluster/taskgraph/transforms/task.py:415
(Diff revision 1)
> +            try:
> +                tier = task['treeherder']['tier']
> +            except KeyError:
> +                tier = 3  # default

tier = task['treeherder'].get('tier', 3)

::: taskcluster/taskgraph/transforms/task.py:466
(Diff revision 1)
>                                          config.params['pushlog_id'])
> -                for root in 'tc-treeherder', 'tc-treeherder-stage'
> +                for env in task_th['environments']
>              ])
>  
> +        if 'expires-after' not in task:
> +            task['expires-after'] = '14 days' if config.params['project'] == 'try' else '1 year'

I'm not a fan of comparing to "try" like this. Could we look at `level == 1` instead?
Attachment #8789141 - Flags: review?(gps) → review-

Comment 197

3 years ago
mozreview-review
Comment on attachment 8789142 [details]
Bug 1286075: introduce job descriptions and implementations;

https://reviewboard.mozilla.org/r/77422/#review76380

This is mostly good. My biggest concern is around the assumption that `run-task` implies VCS checkout.

::: taskcluster/docs/transforms.rst:132
(Diff revision 1)
> +A job description says what to run in the task.  It is a combination of a "run"
> +section and all of the fields from a task description.  The run section has a
> +"using" property that defines how this task should be run; for example,
> +"mozharness" to run a mozharness script, or "mach" to run a mach command.  The

Should probably use double backticks for key names and values so they are rendered as monospace code.

::: taskcluster/taskgraph/transforms/job/__init__.py:86
(Diff revision 1)
> +    """Given a build description, create a task description"""
> +    # import plugin modules first, before iterating over jobs
> +    import_all()
> +    for job in jobs:
> +        if 'label' not in job:
> +            assert 'name' in job, "job has neither a name nor a label"

`assert` gets optimized out when running `python -O`. So you want to raise an exception anytime you want to be sure you raise.

::: taskcluster/taskgraph/transforms/job/__init__.py:91
(Diff revision 1)
> +        taskdesc = copy.deepcopy(job)
> +
> +        # fill in some empty defaults to make run implementations easier
> +        taskdesc.setdefault('attributes', {})
> +        taskdesc.setdefault('dependencies', {})
> +        taskdesc.setdefault('routes', [])
> +        taskdesc.setdefault('scopes', [])
> +        taskdesc.setdefault('extra', {})
> +
> +        # give the function for job.run.using on this worker implementation a
> +        # chance to set up the task description.
> +        configure_taskdesc_for_run(config, job, taskdesc)
> +        del taskdesc['run']

I'm a fan of returning new objects instead of transforming in place. Is there a reason we can't do this?

::: taskcluster/taskgraph/transforms/job/__init__.py:113
(Diff revision 1)
> +    """Register the decorated function to set up a task description this task
> +    using for the given kind of run (`run_using`) on the given worker

Nit: Engrish fail

::: taskcluster/taskgraph/transforms/job/__init__.py:118
(Diff revision 1)
> +    The decorated function should have the signature `using_foo(config, job,
> +    taskdesc) and should modify the task description in-place.  The skeleton of

If you wanted to, you could use the `inspect` module to see if the wrapped function accepts said arguments. However, GIGO (garbage in garbage out) should apply and result in a run-time exception calling the function, which I think is acceptable.

::: taskcluster/taskgraph/transforms/job/__init__.py:158
(Diff revision 1)
> +def import_all():
> +    """Import all modules that are siblings of this one, triggering the decorator
> +    above in the process."""
> +    for f in os.listdir(os.path.dirname(__file__)):
> +        if f.endswith('.py') and f not in ('commmon.py', '__init__.py'):
> +            __import__('taskgraph.transforms.job.' + f[:-3])

The Python purist in me does not approve since this assumes modules are being loaded from a filesystem (as opposed to say a zip file). But we shouldn't have to support alternate module sources, so we should be fine.

::: taskcluster/taskgraph/transforms/job/run_task.py:35
(Diff revision 1)
> +    worker['caches'] = [{
> +        'type': 'persistent',
> +        'name': 'level-{}-hg-shared'.format(config.params['level']),
> +        'mount-point': "/home/worker/hg-shared",
> +    }, {
> +        'type': 'persistent',
> +        'name': 'level-{}-checkouts'.format(config.params['level']),
> +        'mount-point': "/home/worker/checkouts",
> +    }]

Not all `run-task` based tasks have a source checkout. So it seems wrong to do all this VCS foo for all docker-worker based tasks using run-task. It seems like you want 2 variations: one just doing `run-task` and another that is `run-task`+vcs?

Or you could add an attribute to the schema to control whether a VCS checkout is performed. Basically what I did in the latest commit in bug 1296397.
Attachment #8789142 - Flags: review?(gps) → review-

Comment 198

3 years ago
mozreview-review
Comment on attachment 8789143 [details]
Bug 1286075: fix target task generation, including try;

https://reviewboard.mozilla.org/r/77424/#review76384

::: taskcluster/taskgraph/target_tasks.py:15
(Diff revision 1)
> -BUILD_AND_TEST_KINDS = set([
> -    'legacy',  # builds
> -    'desktop-test',
> -    'android-test',
> +INTEGRATION_PROJECTS = set([
> +    'mozilla-inbound',
> +    'autoland',
> +])
> +
> +RELEASE_PROJECTS = set([

Again, I'd prefer to call these "repos" because that's what they are (for now).

Comment 199

3 years ago
mozreview-review
Comment on attachment 8789144 [details]
Bug 1286075: add support for optimizing based on files changed in the push;

https://reviewboard.mozilla.org/r/77426/#review76388

This is mostly good. r- mainly for the test hitting a remote server. Writing tests is hard :/

::: taskcluster/taskgraph/files_changed.py:19
(Diff revision 1)
> +from mozpack.path import match as mozpackmatch
> +
> +logger = logging.getLogger(__name__)
> +
> +
> +def get_changed_files(repository, revision, _cache={}):

Nit: specifying non-scalar types as default argument values in Python doesn't do what you expect. You probably want something like:

   def get_changed_files(repository, revision, _cache=None):
       _cache = _cache or {}
       
If you want to rely on the default behavior (which may be the case here because a cache), you should document it because an experienced Python programmer is conditioned to red flag code like this.

::: taskcluster/taskgraph/files_changed.py:27
(Diff revision 1)
> +        response = retry(requests.get, attempts=2, sleeptime=10,
> +                         args=(url,), kwargs={'timeout': 5})
> +        contents = response.json()

You probably want the `response.json()` in the code that is retried because if you get back a HTTP 503 the `response.json()` call will be the thing that fails.

::: taskcluster/taskgraph/files_changed.py:40
(Diff revision 1)
> +        _cache[repository, revision] = changed_files
> +    return _cache[repository, revision]

I don't think I've ever seen someone access keys like this without including the parens to indicate a tuple. I'm, uh, kinda surprised that works!

Still, I would create a local `key` variable to cut down on the redundant code.

::: taskcluster/taskgraph/files_changed.py:61
(Diff revision 1)
> +    changed_files = get_changed_files(repository, revision)
> +
> +    for pattern in file_patterns:
> +        for path in changed_files:
> +            if mozpackmatch(path, pattern):
> +                return True

Do you think it is worth logging which file matched which pattern?

::: taskcluster/taskgraph/optimize.py:17
(Diff revision 1)
>  
>  logger = logging.getLogger(__name__)
>  TASK_REFERENCE_PATTERN = re.compile('<([^>]+)>')
>  
>  
> -def optimize_task_graph(target_task_graph, do_not_optimize, existing_tasks=None):
> +def optimize_task_graph(target_task_graph, params, do_not_optimize, existing_tasks=None):

In the future, please split changes like this into their own commit.

::: taskcluster/taskgraph/test/test_files_changed.py:33
(Diff revision 1)
> +
> +class TestFilesChanged(unittest.TestCase):
> +
> +    def test_get_changed_files(self):
> +        """Get_changed_files correctly gets the list of changed files in a push.
> +        This tests against the production hg.mozilla.org so that it will detect

We don't want tests hitting remote servers.

Please mock this somehow, possibly by monkeypatching.
Attachment #8789144 - Flags: review?(gps) → review-

Comment 200

3 years ago
mozreview-review
Comment on attachment 8789145 [details]
Bug 1286075: allow defaults in TransformTask;

https://reviewboard.mozilla.org/r/77428/#review76398

It would be nice to have test coverage of this. But I assume future commits will add functionality that relies on this, implicitly testing it.
Attachment #8789145 - Flags: review?(gps) → review+

Comment 201

3 years ago
mozreview-review
Comment on attachment 8789146 [details]
Bug 1286075: add a build kind, modify tests to use it;

https://reviewboard.mozilla.org/r/77430/#review76400

Good stuff.

::: taskcluster/ci/build/android-partner.yml:15
(Diff revision 1)
> +    run-on-projects:
> +        - try
> +    worker-type: aws-provisioner-v1/android-api-15
> +    worker:
> +        implementation: docker-worker
> +        max-run-time: 36000

10 hours?!

::: taskcluster/ci/build/android.yml:18
(Diff revision 1)
> +    worker:
> +        implementation: docker-worker
> +        max-run-time: 7200
> +    run:
> +        using: mozharness
> +        actions: [get-secrets build multi-l10n update]

Any reason to use the single line list syntax instead of putting entries on multiple lines like `config` below?

::: taskcluster/ci/build/kind.yml:14
(Diff revision 1)
> +    - linux.yml
> +    - windows.yml
> +    - macosx.yml
> +    - mulet.yml
> +    - android.yml
> +    - android-partner.yml

Let's alphabetize these.

::: taskcluster/ci/build/linux.yml:13
(Diff revision 1)
> +        symbol: tc(B)
> +        tier: 2
> +    worker-type: aws-provisioner-v1/gecko-{level}-b-linux
> +    worker:
> +        implementation: docker-worker
> +        max-run-time: 36000

10 hours?!

::: taskcluster/ci/build/linux.yml:23
(Diff revision 1)
> +            - builds/releng_base_linux_64_builds.py
> +            - balrog/production.py
> +        script: "mozharness/scripts/fx_desktop_build.py"
> +        secrets: true
> +        tooltool-downloads: public
> +        need-xvfb: true

I'm pretty sure we don't need xvfb on Linux builds.

::: taskcluster/ci/build/linux.yml:78
(Diff revision 1)
> +        secrets: true
> +        custom-build-variant-cfg: debug
> +        tooltool-downloads: public
> +        need-xvfb: true
> +
> +linux/opt:

Can we put "32" in this name so there is symmetry in naming?

::: taskcluster/ci/build/mulet.yml:10
(Diff revision 1)
> +        job-name:
> +            buildbot: linux64-mulet
> +            gecko-v1: mulet.dbg
> +            gecko-v2: mulet-dbg
> +    treeherder:
> +        platform: mulet-linux64/debug # ?!?%

?!?% ???

::: taskcluster/docs/kinds.rst:6
(Diff revision 1)
> -Builds
> +build
>  ------
>  
> -Builds are currently implemented by the ``legacy`` kind.
> +Builds are tasks that produce an installer or other output that can be run by
> +users or automated tests.  This is more restrictive than most definitions of
> +"build" in a Mozilla context: it does not include tasks that run build-like
> +actions for static analysis or to produce instrumented artifacts.

I'm somewhat tempted to call this kind a "distribution." But that's probably a bit too radical. And I don't want to hold landing this up on a bikeshed.

::: taskcluster/taskgraph/transforms/job/common.py:18
(Diff revision 1)
> +
> +
> +def docker_worker_add_workspace_cache(config, job, taskdesc):
> +    """Add the workspace cache based on the build platform/type and level,
> +    except on try where workspace caches are not used."""
> +    if config.params['project'] != 'try':

Again, it is tempting to use `level == 1` instead. Can we?

Also, use early return to avoid extra indentation.

::: taskcluster/taskgraph/transforms/job/mozharness.py:6
(Diff revision 1)
> +Support for running jobs via mozharness.  Ideally, most stuff gets run this
> +way, and certainly anything using mozharness should use this approach.

Please nix the "ideally" part of this. We have crazy plans to start moving away from mozharness for some things. Hit me up and I'll tell you about it :)

::: taskcluster/taskgraph/transforms/job/mozharness.py:56
(Diff revision 1)
> +        'public',
> +        'internal',
> +    ),
> +
> +    # The set of secret names to which the task has access; these are prefixed
> +    # with `project/releng/gecko/{treeherder.kind}/level-{level}/`.   Setting

I've been silent on the double spaces after periods so far in this series. But this triple space is just wrong, even if you cling to typewriter era formatting conventions :)

::: taskcluster/taskgraph/transforms/job/mozharness.py:153
(Diff revision 1)
> +        env['TOOLTOOL_REPO'] = 'https://github.com/mozilla/build-tooltool'
> +        env['TOOLTOOL_REV'] = 'master'

Hulk angry at external dependency. But fixing this requires injecting sanity into the desktop-test Docker image, which we can't do due to weird Valgrind failures from regenerating the Docker image :/

::: taskcluster/taskgraph/transforms/job/mozharness.py:196
(Diff revision 1)
> +    mh_command = [r"c:\mozilla-build\python\python.exe"]
> +    mh_command.append(".\\build\\src\\testing\\" + run['script'].replace('/', '\\'))

Not sure why you use a raw string once and a non-raw string later. Stick to one or the other.

::: taskcluster/taskgraph/transforms/job/mozharness.py:199
(Diff revision 1)
> +        mh_command.append(r"--config " + cfg.replace('/', '\\'))
> +    mh_command.append(r"--branch " + config.params['project'])
> +    mh_command.append(r"--skip-buildbot-actions --work-dir %cd:Z:=z:%\build")

The r'' strings here feel like a cargo cult. You only need the `r` prefix if you want to avoid escaping characters like `\`.

::: taskcluster/taskgraph/transforms/job/mozharness.py:202
(Diff revision 1)
> +    worker['command'] = [
> +        r"mkdir .\build\src",
> +        r"hg share c:\builds\hg-shared\mozilla-central .\build\src",
> +        r"hg pull -u -R .\build\src --rev %GECKO_HEAD_REV% %GECKO_HEAD_REPOSITORY%",
> +        " ".join(mh_command),

Your mixing of single and double quotes upsets my brain. There is no difference between single and double quotes in Python other than whether the string inside can contain unescaped single or double quotes. I prefer single quotes for everything because they are less noisy. But as long as things are consistent, I don't really care that much.

::: taskcluster/taskgraph/transforms/job/mulet.py:33
(Diff revision 1)
> +    Required('tooltool-manifest'): basestring,
> +})
> +
> +
> +@run_job_using("docker-worker", "mach-via-build-mulet-linux.sh", schema=build_mulet_linux_schema)
> +def docker_worker_make_via_build_mulet_linux_sh(config, job, taskdesc):

This function seems similar to the one above. Could you factor out common bits into a shared function?
Attachment #8789146 - Flags: review?(gps) → review-
See Also: → 1301785

Comment 202

3 years ago
mozreview-review
Comment on attachment 8789147 [details]
Bug 1286075: add artifact-build kind

https://reviewboard.mozilla.org/r/77432/#review76416

I trust you can fix the small issues in flight.

::: taskcluster/ci/artifact-build/kind.yml:27
(Diff revision 1)
> +            tier: 2
> +        worker-type: aws-provisioner-v1/gecko-{level}-b-linux
> +        worker:
> +            implementation: docker-worker
> +            docker-image: {in-tree: desktop-build}
> +            max-run-time: 36000

This can be jacked down to something like 15 minutes. Artifact builds should take less than 60s once the build starts.

::: taskcluster/ci/artifact-build/kind.yml:37
(Diff revision 1)
> +                - builds/releng_sub_linux_configs/64_artifact.py
> +                - balrog/production.py
> +            script: "mozharness/scripts/fx_desktop_build.py"
> +            secrets: true
> +            tooltool-downloads: public
> +            need-xvfb: true

We don't need xvfb for artifact builds.

::: taskcluster/docs/kinds.rst:17
(Diff revision 1)
> +Artifact builds produce "artifacts" that can be used by Firefox devs to quickly
> +test non-compiled changes without waiting for a slow compile run.

The artifact build task/configuration actually tests that artifact builds work: it has nothing to do with producing said artifacts.
Attachment #8789147 - Flags: review?(gps) → review+
Assignee

Comment 203

3 years ago
mozreview-review-reply
Comment on attachment 8789161 [details]
Bug 1286075: make redo logging prettier;

https://reviewboard.mozilla.org/r/77460/#review76008

In fact, the upstream version of redo already has some even better improvements to logging, so I'll drop this patch in favor of bug 1301785.

Comment 204

3 years ago
mozreview-review
Comment on attachment 8789154 [details]
Bug 1286075: add a toolchain kind;

https://reviewboard.mozilla.org/r/77446/#review76422

::: taskcluster/ci/toolchain/kind.yml:22
(Diff revision 1)
> +        platform: linux64/opt
> +    worker-type: aws-provisioner-v1/gecko-{level}-b-linux
> +    worker:
> +        implementation: docker-worker
> +        docker-image: {in-tree: desktop-build}
> +        max-run-time: 36000

10 hours?

::: taskcluster/taskgraph/transforms/job/toolchain.py:62
(Diff revision 1)
> +    env['TOOLTOOL_CACHE'] = '/home/worker/tooltool-cache'
> +    env['TOOLTOOL_REPO'] = 'https://github.com/mozilla/build-tooltool'
> +    env['TOOLTOOL_REV'] = 'master'

At least in the case of build-clang-linux.sh, it fetches tooltool on its own. So this may not be needed.

::: taskcluster/taskgraph/transforms/job/toolchain.py:66
(Diff revision 1)
> +    command = ' && '.join([
> +        "cd /home/worker/",
> +        "./bin/checkout-sources.sh",
> +        "./workspace/build/src/taskcluster/scripts/misc/" + run['script'],
> +    ])

Can you please file a bug and add a TODO to convert these to `run-task`?
Status: NEW → ASSIGNED
Flags: needinfo?(gps)
Assignee

Comment 205

3 years ago
mozreview-review-reply
Comment on attachment 8789141 [details]
Bug 1286075: add more functionality to the task description;

https://reviewboard.mozilla.org/r/77420/#review76378

> I've never been a fan of the "project branches" terminology because in the context of Firefox there is only a single "project" and "branch" really means "repository."
> 
> What are your thoughts on calling this `run_on_repos` instead?
> 
> Also, what tasks do we have that run on "integration" but not "release" repos? Isn't "release" a superset of "integration?"

The in-tree stuff is very consistent about calling these "project", dating to February when I started this journey.  I don't mind changing that (I think "repo" is probably the best alternative), but I think we should change it everywhere at once.  Maybe now, or at least soon before we get used to "run_on_projects" is the right time?

> I'm not a fan of comparing to "try" like this. Could we look at `level == 1` instead?

This also happens a lot of places.  Let's consider changing it wholesale in a new bug?

Comment 206

3 years ago
mozreview-review-reply
Comment on attachment 8789141 [details]
Bug 1286075: add more functionality to the task description;

https://reviewboard.mozilla.org/r/77420/#review76378

> The in-tree stuff is very consistent about calling these "project", dating to February when I started this journey.  I don't mind changing that (I think "repo" is probably the best alternative), but I think we should change it everywhere at once.  Maybe now, or at least soon before we get used to "run_on_projects" is the right time?

I'm fine deferring to follow-ups. I don't want to delay this series on account of bikeshedding.

> This also happens a lot of places.  Let's consider changing it wholesale in a new bug?

Follow-up is acceptable.

Comment 207

3 years ago
mozreview-review
Comment on attachment 8789146 [details]
Bug 1286075: add a build kind, modify tests to use it;

https://reviewboard.mozilla.org/r/77430/#review76412

wow, it's amazing that the whole mozreview squashed diff takes up 3 pages. I can't imagine how much planning and strategy has gone into this work. Impressive!

This revision only encapsulates a part of the whole so I am missing some context. Saying that, from what I can see, everything seems to make sense. While I still think that creating new tasks with out of ordinary, custom behaviour will be difficult for developers, e.g. something similar or worse to mulet, the overall scalability and expressive power this provides will be much appreciated by maintainers. Not to mention better tree health.

worth noting: dustin mentioned that he has tools to test legacy vs this patch's output so I am not diving that deep.

I am putting this as r- as I imagine some of my issues will turn out to be correct and require follow up patches. e.g. android partner description

::: taskcluster/ci/build/android-partner.yml:2
(Diff revision 1)
> +android-partner-sample1/opt:
> +    description: "Android 4.0 API15+ Debug"

hm, this might be a lie. I don't even think we enable debug for these varients.

it used to be: "Android armv7 API 15+ partner sample 1"

::: taskcluster/ci/build/linux.yml:16
(Diff revision 1)
> +    worker:
> +        implementation: docker-worker
> +        max-run-time: 36000
> +    run:
> +        using: mozharness
> +        actions: [get-secrets build check-test generate-build-stats update]

I don't think we need to run the `update` action for any CI jobs. Even though we have safe guards[1] from jobs that accidently try to run update, I was never sure why legacy taskcluster explicitily said try to update.


[1] https://dxr.mozilla.org/mozilla-central/source/testing/mozharness/mozharness/mozilla/building/buildbase.py?q=path%3Abuildbase.py&redirect_type=single#2036

::: taskcluster/ci/build/linux.yml:19
(Diff revision 1)
> +    run:
> +        using: mozharness
> +        actions: [get-secrets build check-test generate-build-stats update]
> +        config:
> +            - builds/releng_base_linux_64_builds.py
> +            - balrog/production.py

likewise with the `update` comment, we don't need to pass balrog config items

::: taskcluster/ci/build/linux.yml:41
(Diff revision 1)
> +        implementation: docker-worker
> +        max-run-time: 36000
> +    coalesce-name: linux64-pgo
> +    run:
> +        using: mozharness
> +        actions: [get-secrets build check-test generate-build-stats update]

rather than polluting the review with comments under every time you have 'update' in your list of actions or include 'balrog/production.py' in your config list, you can assume that my one comment applies to every platform+variant

::: taskcluster/ci/build/linux.yml:42
(Diff revision 1)
> +        max-run-time: 36000
> +    coalesce-name: linux64-pgo
> +    run:
> +        using: mozharness
> +        actions: [get-secrets build check-test generate-build-stats update]
> +        options: [enable-pgo]

ah, so this is how --nightly should be passed. neat

::: taskcluster/ci/build/linux.yml:136
(Diff revision 1)
> +    index:
> +        product: firefox
> +        job-name: linux64-asan-opt
> +    treeherder:
> +        platform: linux64/asan
> +        symbol: tc(Bo)

so generic variants, e.g. 'linux64', we use 'tc(B)' for the symbol regardless if it is opt or debug but for variants like 'linux64 asan', we use 'tc(Bd)' and 'tc(Bd)'?

I get that unlike generic variants, they share the same platform so we need to distinguish. However I wonder if we should use the 'Bo' 'Bd' everywhere even on 'opt'/'debug' platform rows?

::: taskcluster/taskgraph/transforms/build_attrs.py:22
(Diff revision 1)
> +    appropriately for that purpose.
> +    """
> +    for job in jobs:
> +        build_platform, build_type = job['name'].split('/')
> +
> +        # pgo builds are represented as a different platform, type opt

so I learn, why do we do this?

::: taskcluster/taskgraph/transforms/job/mozharness.py:25
(Diff revision 1)
> +    docker_worker_setup_secrets
> +)
> +
> +COALESCE_KEY = 'builds.{project}.{name}'
> +
> +mozharness_run_schema = Schema({

TIL voluptuous. pretty neat way to define schemas

::: taskcluster/taskgraph/transforms/job/mozharness.py:75
(Diff revision 1)
> +    # supported on Windows.
> +    Required('keep-artifacts', default=True): bool,
> +})
> +
> +# can probably use the same transform for docker-engine, too:
> +# @run_job_using("docker-engine", "mozharness-via-build.sh")

keeping this in?

::: taskcluster/taskgraph/transforms/job/mozharness.py:130
(Diff revision 1)
> +
> +    # if we're not keeping artifacts, set some env variables to empty values
> +    # that will cause the build process to skip copying the results to the
> +    # artifacts directory.  This will have no effect for operations that are
> +    # not builds.
> +    if not run['keep-artifacts']:

so I understand, where does keep-artifacts definition come from?

::: taskcluster/taskgraph/transforms/job/mozharness.py:168
(Diff revision 1)
> +    for prop in ['actions', 'options', 'custom-build-variant-cfg',
> +                 'tooltool-downloads', 'secrets', 'taskcluster-proxy',
> +                 'need-xvfb']:

hm, how does this work? it looks like we validate mozharness_on_windows against mozharness_run_schema. 

The mozharness_on_windows schema defines many props as `Required` from this list?

::: taskcluster/taskgraph/transforms/job/mozharness.py:196
(Diff revision 1)
> +    mh_command = [r"c:\mozilla-build\python\python.exe"]
> +    mh_command.append(".\\build\\src\\testing\\" + run['script'].replace('/', '\\'))
> +    for cfg in run['config']:
> +        mh_command.append(r"--config " + cfg.replace('/', '\\'))
> +    mh_command.append(r"--branch " + config.params['project'])
> +    mh_command.append(r"--skip-buildbot-actions --work-dir %cd:Z:=z:%\build")
> +    worker['command'] = [
> +        r"mkdir .\build\src",
> +        r"hg share c:\builds\hg-shared\mozilla-central .\build\src",
> +        r"hg pull -u -R .\build\src --rev %GECKO_HEAD_REV% %GECKO_HEAD_REPOSITORY%",
> +        " ".join(mh_command),
> +    ]

wow.... :)

::: taskcluster/taskgraph/transforms/job/mulet.py:37
(Diff revision 1)
> +@run_job_using("docker-worker", "mach-via-build-mulet-linux.sh", schema=build_mulet_linux_schema)
> +def docker_worker_make_via_build_mulet_linux_sh(config, job, taskdesc):
> +    run = job['run']
> +    worker = taskdesc.get('worker')
> +
> +    # assumes desktop-build (which contains the gecko checkout command)

sanity check. in the mulet.yml you have a custom, non default, docker-image defined: `docker-image: {in-tree: builder}`. The default is desktop-build.

Is this dependancy correct?

::: taskcluster/taskgraph/transforms/job/mulet.py:106
(Diff revision 1)
> +def docker_worker_mulet_simulator(config, job, taskdesc):
> +    run = job['run']
> +    worker = taskdesc.get('worker')
> +
> +    # assumes desktop-build (which contains the gecko checkout command)
> +    taskdesc['dependencies']['docker-image'] = 'build-docker-image-desktop-build'

same question re: desktop-build vs build for dep docker-image
Attachment #8789146 - Flags: review?(jlund) → review-
Assignee

Comment 208

3 years ago
mozreview-review-reply
Comment on attachment 8789142 [details]
Bug 1286075: introduce job descriptions and implementations;

https://reviewboard.mozilla.org/r/77422/#review76380

> I'm a fan of returning new objects instead of transforming in place. Is there a reason we can't do this?

In this case the two have identical structure except for the `run` attribute, so it seems silly to do a deep copy (especially when we're generating thousands of tasks).

> The Python purist in me does not approve since this assumes modules are being loaded from a filesystem (as opposed to say a zip file). But we shouldn't have to support alternate module sources, so we should be fine.

I'm open to a better way to do this.  I wanted to avoid enumerating files somewhere else.

> Not all `run-task` based tasks have a source checkout. So it seems wrong to do all this VCS foo for all docker-worker based tasks using run-task. It seems like you want 2 variations: one just doing `run-task` and another that is `run-task`+vcs?
> 
> Or you could add an attribute to the schema to control whether a VCS checkout is performed. Basically what I did in the latest commit in bug 1296397.

All of the tasks in inbound when I wrote this did, but I bet when I do the rebase I'll need to add an attribute.
Assignee

Comment 209

3 years ago
mozreview-review-reply
Comment on attachment 8789144 [details]
Bug 1286075: add support for optimizing based on files changed in the push;

https://reviewboard.mozilla.org/r/77426/#review76388

> Nit: specifying non-scalar types as default argument values in Python doesn't do what you expect. You probably want something like:
> 
>    def get_changed_files(repository, revision, _cache=None):
>        _cache = _cache or {}
>        
> If you want to rely on the default behavior (which may be the case here because a cache), you should document it because an experienced Python programmer is conditioned to red flag code like this.

I'd argue an experienced Python programmer should recognize the idiom :)

I'll add a comment.

> In the future, please split changes like this into their own commit.

You just want me to get to 100 csets in this patch request.. seriously though, this is related because we need to pass the parameters along so we have the repo and revision to look at.

> We don't want tests hitting remote servers.
> 
> Please mock this somehow, possibly by monkeypatching.

I think that's the majority of the value of this test. Of course, the test won't run if the server-side response changes, since the decision task will fail.  Arguably then there's no need for a test at all!  But I'll do some basic mocking.
Assignee

Comment 210

3 years ago
mozreview-review-reply
Comment on attachment 8789146 [details]
Bug 1286075: add a build kind, modify tests to use it;

https://reviewboard.mozilla.org/r/77430/#review76412

> I don't think we need to run the `update` action for any CI jobs. Even though we have safe guards[1] from jobs that accidently try to run update, I was never sure why legacy taskcluster explicitily said try to update.
> 
> 
> [1] https://dxr.mozilla.org/mozilla-central/source/testing/mozharness/mozharness/mozilla/building/buildbase.py?q=path%3Abuildbase.py&redirect_type=single#2036

Let's leave that to a followup.  My goal here is to replicate the existing taskgraph, and mixing that with too many wide-ranging changes will confuse the issue (and add more csets to an already huge patchset).  Please file a followup bug to make this change?

> likewise with the `update` comment, we don't need to pass balrog config items

Sounds like another good followup bug :)

> so generic variants, e.g. 'linux64', we use 'tc(B)' for the symbol regardless if it is opt or debug but for variants like 'linux64 asan', we use 'tc(Bd)' and 'tc(Bd)'?
> 
> I get that unlike generic variants, they share the same platform so we need to distinguish. However I wonder if we should use the 'Bo' 'Bd' everywhere even on 'opt'/'debug' platform rows?

Again, just replicating what exists, since I figure nobody wants symbols to suddenly change on them.  This sounds like a good followup.  It may also relate to bug 1286086, since it involves builds with multiple labels (linux64/asan/opt, linux64/asan/debug).

> so I learn, why do we do this?

I think because someone wrote it that way once.  Platforms are kind of crazy.  I'm hoping to clean that up in bug 1286086.

> so I understand, where does keep-artifacts definition come from?

It's part of the run definition, so from a job description like:

    somejob:
      ...
      run:
        using: mozharness
        keep-artifacts: false
        ...

The schema for that run section is above.

> hm, how does this work? it looks like we validate mozharness_on_windows against mozharness_run_schema. 
> 
> The mozharness_on_windows schema defines many props as `Required` from this list?

They both have the same run schema, since I want a mozharness run to look the same regardless of platform.  It's just that some of the features aren't (yet) implemented on Windows, so this code is a guard to make sure we don't try to use those features on that platform.  We eventually will want to, and this exception will be raised, and whoever wants to use those features will need to implement them.

`Required` properties with a default have the default applied, so the only properties that must be specified in the job description are `script` and `config`.

> sanity check. in the mulet.yml you have a custom, non default, docker-image defined: `docker-image: {in-tree: builder}`. The default is desktop-build.
> 
> Is this dependancy correct?

Oh, good point -- I'll adjust the comment.  The `builder` image is basically an old old version of `desktop-build`.
Assignee

Comment 211

3 years ago
mozreview-review-reply
Comment on attachment 8789147 [details]
Bug 1286075: add artifact-build kind

https://reviewboard.mozilla.org/r/77432/#review76416

> This can be jacked down to something like 15 minutes. Artifact builds should take less than 60s once the build starts.

Let's do this and the other changes to the task graph in a followup bug.  I don't want to pollute this bug with fixing the whole world.
Assignee

Comment 212

3 years ago
mozreview-review-reply
Comment on attachment 8789154 [details]
Bug 1286075: add a toolchain kind;

https://reviewboard.mozilla.org/r/77446/#review76422

> 10 hours?

Another good followup bug :)
Assignee

Comment 213

3 years ago
mozreview-review-reply
Comment on attachment 8789156 [details]
Bug 1286075: add a b2g-device kind;

https://reviewboard.mozilla.org/r/77450/#review75842

> I guess we will have to add pine too? But maybe that should be done only on pine tree :)

I will add it.  I do not have pine in my list of tested branches, so I missed that.

As long as such changes are present on the tree where they're required, that is enough.  But letting the changes merge around is fine, too.
Whew, thanks everyone for the reviews!

Since it came up a bunch of times: this bug is primarily about refactoring task-graph generation, and my tests reflect that: I'm checking that the taskgraphs before and after this bug are identical across a variety of parameters.  I've added a few changesets to the beginning where the old taskgraphs showed unnecessary irregularities that I didn't want to model with the new system, and I added a few changesets to the end making simple, uncontroversial changes.  However, I'd like to stop even making uncontroversial changes in this bug -- I'd rather land the patchset, and have all of you lovely people make those changes in followup bugs.  I've indicated that in the individual reviews above, but did not file the followup bugs except where I was asked to.

Especially in a big patchset like this, "r-" is a much clearer signal than clearing the review flag or leaving it at "r?".  I *think* everyone called on to review something here has looked, but neither bugzilla nor mozreview tells me for sure.

I should have a rebased version up today with issues marked "Fixed" actually fixed.
Assignee

Comment 215

3 years ago
mozreview-review-reply
Comment on attachment 8789159 [details]
Bug 1286075: delete the legacy kind;

https://reviewboard.mozilla.org/r/77456/#review76032

Ah, I didn't mean to move routes.json.  That will definitely break mozharness!  Will fix.

We discussed this offline and there are two issues to contend with:

 1. New route prefixes (maybe something like `gecko.v2.revision.<revision>` without the intervening branch) should appear at the same time in both BB/TC
 2. Porting jobs from BB to TC should generate the same index routes, and this is something most people are unlikely to think about or worry about.

For the first, the plan is to modify `routes.json` such that both tools can use it (so, same variable names), and pull that data into both tools.  I'll probably change to something other than JSON so I can include comments, and those comments will include a note that once BB is gone, the file can be removed and its functionality folded into `task.py`.

For the second, the plan is to include a whitelist of v2 job names in `task.py`, and if a job comes through that's not on the whitelist, throw an exception sugesting double-checking that the index routes haven't changed from what buildbot generates, and assuming that checks out to add the new job-name to the whitelist.  Again, the whitelist can be deleted once we're completely migrated and will have comments to that effect.

Since this may be a bit of work, and involves tweaking mozharness, I'll do it in bug 1301720 once this one is landed.

> Since this is presumably a nightly build, I'd expect this to use the "nightly" routes in addition to the regular "routes" from routes.json

The `nightly_fennec` is a total hack and is very temporary (I'm hoping it will get deleted within a few days of this patchset landing).  It's a hacked-up copy of the `legacy` kind, bringing along all of its legacy-ness, including this.
Blocks: 1302133
Assignee

Comment 216

3 years ago
mozreview-review-reply
Comment on attachment 8789146 [details]
Bug 1286075: add a build kind, modify tests to use it;

https://reviewboard.mozilla.org/r/77430/#review76400

> 10 hours?!

I don't make the rules, I just re-implement them..

> Any reason to use the single line list syntax instead of putting entries on multiple lines like `config` below?

Brevity.  The shorter (in lines) I can make each build description, the easier it is to read.

> I'm pretty sure we don't need xvfb on Linux builds.

I seem to recall it was added for a specific reason.  I know PGO needs it, and it may have been difficult to add the option just for PGO using the old system.  At any rate, I'm duplicating what's in place -- someone can easily delete this line later if it is clear it's unnecessary.

> Can we put "32" in this name so there is symmetry in naming?

I think that would make a bunch of people scream bloody murder.  The place to try is bug 1286086.  The easier solution is to wait until we drop the linux(32) platform entirely.

> ?!?% ???

I should put "bug 1286086" at the end of that.  It's expressing frustration that the platform is sometimes "linux64-mulet" and sometimes "mulet-linux64".

> I'm somewhat tempted to call this kind a "distribution." But that's probably a bit too radical. And I don't want to hold landing this up on a bikeshed.

It would be nice to avoid the term "build" everywhere, since at Mozilla at least its meaning has been diluted to, roughly, "noun, thing".  But yeah, I think that's going to be a hard change to pull off, and just renaming it one place is not going to work (look how well jlal's renaming "repo" to "project" worked..)

> Not sure why you use a raw string once and a non-raw string later. Stick to one or the other.

I did this because raw strings don't let you end the string with a backslash.

    >>> print(r"foo\")
      File "<stdin>", line 1
        print(r"foo\")
                     ^
    SyntaxError: EOL while scanning string literal
     >>> print(r"foo\\")
    foo\\

I suppose I could use `"\\".join(..)`.

Comment 217

3 years ago
mozreview-review
Comment on attachment 8789147 [details]
Bug 1286075: add artifact-build kind

https://reviewboard.mozilla.org/r/77432/#review76706

I would just re-iterate gps' comment about the docs being a little off.
Attachment #8789147 - Flags: review?(cmanchester) → review+
Comment hidden (mozreview-request)