Closed Bug 1277579 Opened 3 years ago Closed 3 years ago

Create task graph for Linux 64 nightlies

Categories

(Firefox Build System :: Task Configuration, task)

task
Not set

Tracking

(Not tracked)

RESOLVED FIXED
mozilla51

People

(Reporter: coop, Assigned: kmoir)

References

Details

Attachments

(5 files, 28 obsolete files)

58 bytes, text/x-review-board-request
kmoir
: review+
Details
1.28 KB, patch
philor
: review+
Details | Diff | Splinter Review
6.39 KB, patch
dustin
: review-
Details | Diff | Splinter Review
11.16 KB, patch
Details | Diff | Splinter Review
13.07 KB, patch
dustin
: feedback+
Details | Diff | Splinter Review
We should translate Aki's nightly one graph doc into a task graph and start plugging in the pieces:

https://docs.google.com/document/d/112mRn5ZntSYjr53M5iJAYDBxVjs8__Pzn379e3z8Ecs/edit#
FYI: We're most likely abandoning the audit service as a part of the automation.
https://docs.google.com/a/mozilla.com/document/d/1wEqD6vlQ_i0OAirBFJSzIRK94drZQDpgYvGBdcuIsqo/edit?usp=sharing is the new proposal, and we're still working on the threat modeling.

However, agreed, we can implement the graph in parallel.
Just to note in case there is something to obey, whenever we will have this graph live I would like to extend it with our firefox ui tests which are getting run on the en-US and all other generated locales. For now I implement my own tc task handling over on bug 1272236.
Status: NEW → RESOLVED
Closed: 3 years ago
Resolution: --- → DUPLICATE
Duplicate of bug: 1277595
Chris, bug 1277595 is about Fennec while this bug handles Firefox. Do we use the same graph for both products from the very beginning?
Flags: needinfo?(coop)
(In reply to Henrik Skupin (:whimboo) from comment #4)
> Chris, bug 1277595 is about Fennec while this bug handles Firefox. Do we use
> the same graph for both products from the very beginning?

The graphs aren't sufficiently different to warrant separate efforts, so yes.
Flags: needinfo?(coop)
(In reply to Chris Cooper [:coop] from comment #5) 
> The graphs aren't sufficiently different to warrant separate efforts, so yes.

After a week's worth of development on the Fennec graph, it turns out I lied. Re-opening.
Status: RESOLVED → REOPENED
Resolution: DUPLICATE → ---
Have been running try runs for this all week, am making progress, or at least the error messages are changing :-)
I'm stuck on the same issue for the last day,

https://treeherder.mozilla.org/#/jobs?repo=try&revision=ada2151d9b5e&selectedJob=24463056

:amiyaguchi do you have any suggestions on what might be wrong?  I think there is something with the taskgraph generation that isn't working.  The error is ImportError: No module named kind.nightly_desktop

 https://public-artifacts.taskcluster.net/V9LyjJySRHmk-pLIqbsKTg/0/public/logs/live_backing.log
Flags: needinfo?(amiyaguchi)
Task kind implementations are now under taskgraph.task.<module> instead of taskgraph.kind.<module>.
I haven't rebased on top of the changes from bug 1281004 yet if you've been referencing my patches. The files in taskgraph/kind would need to be moved to taskgraph/task, and the kind.yml would have to be modified accordingly. `mach taskgraph target` with parameters.yml from the try push can be used as a sanity check for these changes.
Flags: needinfo?(amiyaguchi)
Thanks Dustin and Anthony, now making more progress
now mach taskgraph decision works when I run it against my local and up to date clone on m-c but fails on try so debugging that
Figured out what was wrong, my patch queue was missing a file and now the decision task is green in try.
Now compiling the build, just working through an issue with symbols and uploading running another try run.
Now there is an issue with balrog submission which I'm investigating

15:43:58     INFO - [mozharness: 2016-07-29 15:43:58.467249Z] Finished update step (failed)
15:43:58    FATAL - Uncaught exception: Traceback (most recent call last):
15:43:58    FATAL -   File "/home/worker/workspace/build/src/testing/mozharness/mozharness/base/script.py", line 1770, in run
15:43:58    FATAL -     self.run_action(action)
15:43:58    FATAL -   File "/home/worker/workspace/build/src/testing/mozharness/mozharness/base/script.py", line 1709, in run_action
15:43:58    FATAL -     self._possibly_run_method(method_name, error_if_missing=True)
15:43:58    FATAL -   File "/home/worker/workspace/build/src/testing/mozharness/mozharness/base/script.py", line 1649, in _possibly_run_method
15:43:58    FATAL -     return getattr(self, method_name)()
15:43:58    FATAL -   File "/home/worker/workspace/build/src/testing/mozharness/mozharness/mozilla/building/buildbase.py", line 2053, in update
15:43:58    FATAL -     if self.submit_balrog_updates():
15:43:58    FATAL -   File "/home/worker/workspace/build/src/testing/mozharness/mozharness/mozilla/updates/balrog.py", line 22, in submit_balrog_updates
15:43:58    FATAL -     product = self.buildbot_config["properties"]["product"]
15:43:58    FATAL - KeyError: 'product'
15:43:58    FATAL - Running post_fatal callback...
15:43:58    ERROR - setting return code to 2 because fatal was called
15:43:58  WARNING - setting return code to 2
15:43:58    FATAL - Exiting -1
Linux gpg signing can happen in a task like this:  https://tools.taskcluster.net/task-inspector/#b3c1YMXRS52fRd5kbWt37g/
Anthony and I are debating whether the build creates a signingManifest or if we embed the links to the signed files in the task definition.

signingManifest:
* already supported in signing scriptworker
* requires build-side work to support (mozharness?)
* adds a sha, which adds reliability until we have a Chain of Trust artifact
* the files can be named whatever we want, so pretty names with versions are fine

links in task payload:
* easy enough to support at graph generation time
* signing scriptworker needs some changes to support it
* has no sha, so we'll just be going by file download exit codes and expect the link to not point to malicious bits until we have a Chain of Trust artifact
* the files need to be named in a predictable way, so likely we won't have pretty names in automation

I don't have a strong opinion; I think right now we're leaning towards task payload links.
The manifest contains shas, but would not itself have a sha, right?

Note that the naming issue can be fixed in the "links in task payload" case, too.  You need to know the name at graph generation time, but since that's running in-tree it shouldn't have trouble finding that information.
(In reply to Dustin J. Mitchell [:dustin] from comment #17)
> The manifest contains shas, but would not itself have a sha, right?

If you're talking about the build-generated signingManifest,

* mozharness creates the signingManifest at build time
* docker-worker creates the artifact sha for the chain of trust post-task

so yes, it would have a sha.
Is that what you meant?

> Note that the naming issue can be fixed in the "links in task payload" case,
> too.  You need to know the name at graph generation time, but since that's
> running in-tree it shouldn't have trouble finding that information.

This doesn't solve the naming issue if the decision task can't easily determine the name of the binaries without asking the build system for each build type (pretty names with version numbers in them, for example), which is what I'm referring to.  I'm a fan of not using those pretty names in automation, but I'm not sure what that will affect: tests may use them, and something inside the binaries may expect pretty naming.  Ideally we can use predictable names with little or no fallout, though.
Regarding the sha's, I didn't realize docker-worker was hashing anything.  I was suggesting that the "signingManifest" option gets you an artifact of the build task (the manifest) containing hashes of build outputs, but that artifact is itself un-hashed; while the "links in task payload" option gets you un-hashed build outputs.  In order to trust the hashes in the manifest, you need to trust an unhashed artifact, which is no worse than just trusting un-hashed build outputs.  Sorry if I'm not being clear -- the tl;dr is, I don't think there is much difference between the options as far as hash safety.
(In reply to Dustin J. Mitchell [:dustin] from comment #19)
> Regarding the sha's, I didn't realize docker-worker was hashing anything.

Not yet, but we have bug 1284991 for that work.

> I
> was suggesting that the "signingManifest" option gets you an artifact of the
> build task (the manifest) containing hashes of build outputs, but that
> artifact is itself un-hashed;

Correct.  signingManifest would only contain hashes of the binaries-to-sign, which doesn't include the signingManifest.
Since the manifest can be altered, this isn't a significant security win, but can help with verifying the download happened successfully.

> while the "links in task payload" option gets
> you un-hashed build outputs.  In order to trust the hashes in the manifest,
> you need to trust an unhashed artifact, which is no worse than just trusting
> un-hashed build outputs.  Sorry if I'm not being clear -- the tl;dr is, I
> don't think there is much difference between the options as far as hash
> safety.

Makes sense.  I also prefer the links-in-manifest as long as decision-time predictable names aren't difficult.
Once https://reviewboard.mozilla.org/r/71746 lands, the signing scriptworkers will use links in payload.  Sorry for hijacking this bug for the past few comments!
Attached patch desktop.patch (obsolete) — Splinter Review
wip patch, some hard coded values to get it to work in try.  Still working on getting the balrog submission working.
Attached patch desktop-2.patch (obsolete) — Splinter Review
taskgraph is generated but tasks are not added to the task queue to be run.  I recently changed the patches to enable signing. Debugging the problem.
Attachment #8783068 - Attachment is obsolete: true
Comment on attachment 8784067 [details] [diff] [review]
desktop-2.patch

>diff --git a/taskcluster/ci/signing/signing.yml b/taskcluster/ci/signing/signing.yml
>new file mode 100644
>--- /dev/null
>+++ b/taskcluster/ci/signing/signing.yml

Now I'm unclear: is the *build* task not generating artifacts, or is the *signing* task not generating artifacts?

>@@ -0,0 +1,19 @@
>+task:
>+  provisionerId: "scriptworker-prov-v1"
>+  workerType: "signing-linux-v1"
>+  scopes:
>+    - "project:releng:signing:cert:dep-signing"
>+    - "project:releng:signing:format:jar"

If you're using the same template as :amiyaguchi for signing, you want project:releng:signing:format:gpg (gpg signing) instead of project:releng:signing:format:jar (apk signing).

>+  created:
>+    relative-datestamp: "0 seconds"
>+  deadline:
>+    relative-datestamp: "24 hours"
>+  payload:
>+    unsignedArtifacts: []

The unsignedArtifacts will need to point at the links to the dependent tasks' artifacts, e.g. https://queue.taskcluster.net/v1/task/Bfg4fVweRNWz2JkVQtHiCQ/artifacts/public/build/target.tar.bz2
The build was not generating artifacts - because the target task set was not being generated.  I have fixed I think and can generate the correct task graph locally. I'm waiting for try to reopen so I can test my fix and then will upload new patches.
Attached patch another wip patch (obsolete) — Splinter Review
the taskgraph generation works locally but fails on try due to scopes which I assume are expected due to the issue that aki mentioned where tc containers don't have appropriate permissions to access signing.  I have to refactor the patches to include the comments from the latest patches in bug 1277595, and refactor them to accommodate fennec  (right now some fennec bits are commented out).  I also have to refactor to accomodate balrog submission which is still failing.
Attachment #8784067 - Attachment is obsolete: true
Depends on: 1298495
Attached patch wip patches (obsolete) — Splinter Review
patches rebased on Anthony's patches in bug 1277595 landing in tree
Attachment #8785100 - Attachment is obsolete: true
I was able to build a nightly desktop build using the hook like the instructions here indicated. 

https://gist.github.com/acmiyaguchi/f20c4f47e723b3a94ad9a3f82da457f9

I'll clean up my patches and ask for feedback/review.  

https://tools.taskcluster.net/task-inspector/#dw8JQs62Syya1VK6-w206A/0
Attached patch wip patches (obsolete) — Splinter Review
generates artifacts, trying to fix signing.py so it doesn't try to sign fennec when a desktop build is requested
Attachment #8788553 - Attachment is obsolete: true
For signing.yml, we'll need to change the "project:releng:signing:format:jar" scope to "project:releng:signing:format:gpg" for linux, or it'll expect an APK.  We'll also want the cert scope to be "project:releng:signing:cert:nightly-signing", although that won't break anything atm since they all point to the dep key currently.

Dustin said we'd probably need a different signing kind for l10n; this suggests to me that we may need to have an android-signing kind, a linux64-signing kind, an l10n-PLATFORM-signing kind, etc.  We may be able to use the same template.  I'm a little fuzzy on details on how that would work.

For Try, whatever works, works.  When you're ready for review, we'll have to deal with how we deal with the different kinds.  Let me know if you have any questions or want any help?
Comment on attachment 8789474 [details] [diff] [review]
wip patches

asking for feedback on the existing patches as I address the other ongoing issues
Attachment #8789474 - Flags: feedback?(aki)
The nightly-fennec as is very temporary and probably best not duplicated.  I'll be filing a bug to delete it as soon as 1286075 lands.  In general, I think it would be easiest at this point for you to base your work on bug 1286075, as it will see only minor changes before landing.

Aki, the reason I suggested two kinds is that they fit at different points in the dependency hierarchy

  build -> sign -> l10n-repack -> sign

This is necessary because otherwise you have a chicken-and-egg problem: do you define the l10n tasks first and then have the signing tasks record dependencies on them, in which case the l10n tasks have no tasks to depend on?  Or do you define signing tasks first, in which case the signing tasks on the right don't have any l10n tasks to depend on?  Defining two kinds that share an implementation but differ in the dependencies they search for is a good way around this.

I don't think that necessitates having multiple signing kinds for different platforms -- I'm regretting defining "android-test" and "desktop-test" kinds already, and planning to merge those into just "test" before someone invents "macosx-test" and "windows-test".  In general, I think platform-specific kinds are probably a red flag.
Dustin: awesome, good to hear.
Comment on attachment 8789474 [details] [diff] [review]
wip patches

I *think* our conversations in this bug + irc cover my thoughts on this, so unflagging myself for now.  I'm not an expert on the in-tree graph generation, but if you want me to take a closer look at the patch, please reflag me.
Attachment #8789474 - Flags: feedback?(aki)
Depends on: 1301740
Attached patch desktop-11.patch (obsolete) — Splinter Review
patches now use correct signing format, tried to create a new desktop hook with scopes to reflect that and ran into bug 1301740.  Will try another build once this is resolved.
Attachment #8789474 - Attachment is obsolete: true
The permissions error was a different one.  To create a task you need
   hooks:modify-hook:<hookGroupId>/<hookId> and assume:hook-id:<hookGroupId>/<hookId>
and the project-admin:releng role only contained the first of those.  So, an oversight on my part which I've remedied.
No longer depends on: 1301740
Thanks Dustin!

Now I have a new build running that I triggered manually via my hook.  

https://tools.taskcluster.net/task-inspector/#BkRrsG5PQvCiRzyvhOLXwg/0

https://xkcd.com/303/

They use the latest patched that I've attached.
Attachment #8789872 - Flags: feedback?(jlund)
The build compiled compiling and looks good.  I will test that these patches don't break anything when triggering with the nightly-fennec hook to create a nightly-fennec build
Attachment #8789872 - Flags: feedback?(jlund)
Attached patch desktop-12.patch (obsolete) — Splinter Review
nightly build worked here
https://tools.taskcluster.net/task-inspector/#JUxMD1myQMivdh1ImKovjw/0

One thing I'm not sure about is how to refactor taskcluster/taskgraph/task/signing.py so that only run nightly fennec or nightly desktop but not both.  Both tasks seem to be always be in the loaded_tasks. 

Also, 
def get_dependencies should to return the correct dependencies depending if they are fennec or nightly, but this depends on my question above
Attachment #8789872 - Attachment is obsolete: true
Attachment #8790401 - Flags: feedback?(jlund)
(In reply to Kim Moir [:kmoir] from comment #39)
> Created attachment 8790401 [details] [diff] [review]
> desktop-12.patch
> 
> nightly build worked here
> https://tools.taskcluster.net/task-inspector/#JUxMD1myQMivdh1ImKovjw/0
> 
> One thing I'm not sure about is how to refactor
> taskcluster/taskgraph/task/signing.py so that only run nightly fennec or
> nightly desktop but not both.  Both tasks seem to be always be in the
> loaded_tasks. 
> 
> Also, 
> def get_dependencies should to return the correct dependencies depending if
> they are fennec or nightly, but this depends on my question above



I'm not sure if anthony's logic here was taking into account extending this with linux64: https://dxr.mozilla.org/mozilla-central/source/taskcluster/taskgraph/task/signing.py?q=taskgraph%2Ftask%2Fsigning.py&redirect_type=direct#52

so it probably needs a lot of glue. one question to figure out is whether it is worth investigating fixing the deps as is with nightly-fennec kind or else port nightly-fennec kind over to 'build' kind with bug 1286075

iow - fix once or fix twice. dustin's bug looks like it is about to land real soon as patches are reviewed
Comment on attachment 8790401 [details] [diff] [review]
desktop-12.patch

Review of attachment 8790401 [details] [diff] [review]:
-----------------------------------------------------------------

::: .taskcluster.yml
@@ +111,5 @@
>                --head-repository='{{{url}}}'
>                --head-ref='{{revision}}'
>                --head-rev='{{revision}}'
>                --revision-hash='{{revision_hash}}'
> +              --triggered-by='nightly'

just for testing?
Comment on attachment 8790401 [details] [diff] [review]
desktop-12.patch

dustin is going to come to the tcmigration nightly meeting tomorrow to discuss how to proceed

we may need more [sub]kinds to support the differences in deps and tasks between fennec and desktop
Attachment #8790401 - Flags: feedback?(jlund)
great, looking forward to the discussion
From our meeting today, jlund will refactor fennec to use new build kind and then I will base desktop off his patches.

3:37 PM <jlund> kmoir: did you get my earlier messages of what next steps I was going to do? we are 1) dropping nightly-fennec kind altogether. 2) adding linux and fennec nightlies tasks to the build kind; probably via a flag in:
3:37 PM <jlund> http://hg.mozilla.org/integration/mozilla-inbound/file/2494e1f32e55/taskcluster/ci/build/android.yml#l51
3:38 PM <kmoir> no I didn;t see that, that makes sense
3:39 PM <jlund> + some added build kind logic to create both nightly and opt build tasks in full task set. removing them at the target layer
3:39 PM <Callek> jlund: I expect to do my patch similar to *existing* signing kind dep stuff, and then port it to the new way once your work removing nightly-fennec is done, so I know I'll have a bit of overlap...
3:39 PM <kmoir> jlund: what do you mean by flag?
3:40 PM <jlund> then, overhauling signing.py to be more like: https://dxr.mozilla.org/mozilla-central/source/taskcluster/taskgraph/task/test.py#25,154
3:41 PM <kmoir> okay
3:42 PM <dustin> kmoir: in tests, we have an "e10s: both" option
3:42 PM <dustin> which causes that single YAML stanza to create two tasks, one with e10s enabled, one with it disabled
3:42 PM <jlund> kmoir: by flag I mean we say whether or not a build task can be a nightly. so we can say, asan will never be a nightly but opt can so create a nightly for that. and that nightly may be used depending on the target
3:42 PM <dustin> the idea would be "nightly: both" that would generate two build tasks, one with nightly stuff, one without
3:42 PM <kmoir> okay
3:43 PM <dustin> one benefit: you can run nightly builds in try (gps will love that)
3:43 PM <jlund> this is what I messaged you earlier re: 'what now' https://irccloud.mozilla.com/pastebin/rCnALvmm
 Plain Text • 2 lines raw | line numbers 
3:44 PM <kmoir> okay, well if you want to look at fennec I can look at signing and desktop
3:44 PM <jlund> k. I'm just conscious of overlap.
3:44 PM <jlund> but we can try to ||
3:45 PM <Callek> jlund: I too have some overlap since I'll need similar task-finding logic for repack stuff
3:45 PM <Callek> (my repack stuff is likely to want to dep on signing though, but I can get it tied in without signing for starters anyway)
3:45 PM <kmoir> yeah, we don't want to waste time with different implementations in parallel
3:46 PM <Callek> my initial work steps will be basing on the existing nightly-fennec kind and current signing's ability to find it, since I can mostly copy/paste for those parts
3:47 PM <kmoir> jlund: how long do you think it will take to refactor the fennec to use the build kind?
3:47 PM <Callek> but yea, I'll be sure to see how far along any of you are before I dive on the other bits
3:48 PM <jlund> assuming I don't get too confused, I can have a patch up today that we can analyze and work against in ||
3:48 PM <jlund> does that work?
3:48 PM <kmoir> sounds great!
3:48 PM <kmoir> jlund++
Attached patch new patches (obsolete) — Splinter Review
These are new patches which require the patches Jordan wrote in bug 1302590 to work.  

The task tree generation works

However, I'm running into a problem like this when I trigger it via the hook

https://tools.taskcluster.net/task-inspector/#X1DlHzfnTGaNeSanEOB2ew/0

[task 2016-09-16T20:12:37.163871Z] KeyError: u"task 'signing-nightly-linux' has no dependency with label 'build-linux64-nightly/opt'" 


The same error occurs when I try to trigger the fennec build via the hook 

Also, I'm not sure if ci/build-signing/kind.yml

should include both android and linux kinds 

# This Source Code Form is subject to the terms of the Mozilla Public
# License, v. 2.0. If a copy of the MPL was not distributed with this
# file, You can obtain one at http://mozilla.org/MPL/2.0/.

implementation: 'taskgraph.task.signing:SigningTask'

jobs-from:
  - android-signing.yml
  - linux-signing.yml

kind-dependencies:
  - build
Attachment #8790401 - Attachment is obsolete: true
Attachment #8792101 - Flags: feedback?(jlund)
"task 'signing-nightly-linux' has no dependency with label 'build-linux64-nightly/opt'" should probably read "task 'signing-nightly-linux' has no dependency named 'build-linux64-nightly/opt'".  Dependencies are named, so for example there's a dependency named "docker-image" and tests have a depenendency named "build" that points to the build task.  Those names are what you refer to in task-references: things like {"task-reference": "<docker-image>"} or {"task-reference": "--installer-url=...<build>..."}.

So in this case you're referring to a dependency by its label instead of its name.

The use of the word "label" in the error message, where it should be "name", is definitely confusing!
Comment on attachment 8792107 [details]
Bug 1277579: use 'name' in dependency error message;

https://reviewboard.mozilla.org/r/79322/#review77866
Attachment #8792107 - Flags: review?(kmoir) → review+
Attachment #8792101 - Flags: feedback?(jlund)
thanks Dustin, I'll try that
Pushed by dmitchell@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/0b5473d5b966
use 'name' in dependency error message; r=kmoir
Attached patch test.patchSplinter Review
Attachment #8792173 - Flags: review?(kmoir)
Attachment #8792173 - Flags: review?(kmoir) → review+
Pushed by philringnalda@gmail.com:
https://hg.mozilla.org/integration/autoland/rev/ef3c1ce0c1fb
fix taskgraph tests after code change; r=philor
https://hg.mozilla.org/mozilla-central/rev/0b5473d5b966
https://hg.mozilla.org/mozilla-central/rev/ef3c1ce0c1fb
Status: REOPENED → RESOLVED
Closed: 3 years ago3 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla51
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
(In reply to Kim Moir [:kmoir] from comment #45)
> Created attachment 8792101 [details] [diff] [review]

> 
> However, I'm running into a problem like this when I trigger it via the hook
> 
> https://tools.taskcluster.net/task-inspector/#X1DlHzfnTGaNeSanEOB2ew/0
> 
> [task 2016-09-16T20:12:37.163871Z] KeyError: u"task 'signing-nightly-linux'
> has no dependency with label 'build-linux64-nightly/opt'" 
> 

whoops, I was translating the signing dep (unsigned artifact) task's taskid by the label not the name defined in get_dependencies. I didn't catch this originally as I only tested up to `taskgraph target` and optimize is when we translate all the 'task-reference' locations.

here is the latest patch: https://reviewboard.mozilla.org/r/79178/diff/1-2/

> Also, I'm not sure if ci/build-signing/kind.yml
> 
> should include both android and linux kinds 

at this point, unless we need too much glue, I was thinking we could have them share the same build-signing kind and implementation: SigningTask but differ on the android-signing.yml and linux-signing.yml
Attached patch desktop-14.patch (obsolete) — Splinter Review
These new patches are on top of Jordan's patches in bug 1302590 to use the build kind instead of a special kind for nightlies.  It's building here

https://tools.taskcluster.net/task-inspector/#T35Qb8y2ReaQrsaql4imaQ/0

The only issue that remains is that the decision task includes both linux64 and android builds when it is triggered via the hook. I think I have to filter by platform or something so that only linux64 is included because we want to be able to trigger then automatically so I'm investigating that.
Attachment #8792101 - Attachment is obsolete: true
Attached patch desktop-15.patch (obsolete) — Splinter Review
I added a new attribute to the linux and android signing yml files to indicate if they were signing tasks for linux vs android.  Otherwise, both would be filtered into the taskgraph for either platform by virtue of the nightly attribute.  Not sure if this is the right approach or if I should filter on the task scopes instead (gpg vs jar)
Attachment #8792903 - Attachment is obsolete: true
Attached patch desktop-16.patch (obsolete) — Splinter Review
added a platform to the signing ymls at dustin's suggestion
Attachment #8792954 - Attachment is obsolete: true
Attachment #8792962 - Flags: review?(jlund)
Comment on attachment 8792962 [details] [diff] [review]
desktop-16.patch

Review of attachment 8792962 [details] [diff] [review]:
-----------------------------------------------------------------

lgtm. This along with the fennec equivalent will eventually have to address dustin's concerns: Bug 1303894 - clean up build-signing kind, create nightly build based on opt equivalent. but I think it's okay to leave for now. At least until we have a staging nightly with these two platforms

my only main comment below that is a concern is whether or not we need run-on-projects: [], in the linux64 nightly build.

::: taskcluster/ci/build-signing/linux-signing.yml
@@ +1,1 @@
> +signing-nightly-linux:

should linux-signing.yml be linux64-signing.yml ?

::: taskcluster/ci/build/linux.yml
@@ +175,5 @@
>          need-xvfb: true
>  
> +linux64-nightly/opt:
> +    description: "Linux64 Nightly"
> +    attributes: 

\s

@@ +181,5 @@
> +    index:
> +        product: firefox
> +        job-name: linux64-nightly-opt
> +    treeherder:
> +        platform: linux64/opt 

\s

@@ +194,5 @@
> +        actions: [get-secrets build check-test generate-build-stats update]
> +        config:
> +            - builds/releng_base_linux_64_builds.py
> +            - disable_signing.py
> +            #- balrog/production.py

yeah, this should be safe to remove completely.

@@ +199,5 @@
> +            - taskcluster_nightly.py
> +        script: "mozharness/scripts/fx_desktop_build.py"
> +        secrets: true
> +        tooltool-downloads: public
> +        need-xvfb: true

I'm surprised you don't need this after here: https://dxr.mozilla.org/mozilla-central/source/taskcluster/ci/build/android.yml?q=path%3Aandroid.yml&redirect_type=single#101

does this job show up by default on inbound/central parameters.yml? Or when you do 'try -p all' for try?

::: taskcluster/taskgraph/target_tasks.py
@@ +108,5 @@
>      and, eventually, uploading the tasks to balrog."""
>      def filter(task):
> +        platform = task.attributes.get('build_platform')
> +        if platform in ('android-api-15-nightly', ):
> +            return (task.attributes.get('nightly', False))

ooc, how come you are wrapping the return value in parenthesis?

::: taskcluster/taskgraph/transforms/gecko_v2_whitelist.py
@@ +21,5 @@
>      'android-api-15-gradle-opt',
>      'android-api-15-opt',
>      'android-api-15-partner-sample1-opt',
>      'android-l10n-opt',
> +    'android-api-15-nightly-opt',

this should be on central already
Comment on attachment 8792962 [details] [diff] [review]
desktop-16.patch

clearing r? for now
Attachment #8792962 - Flags: review?(jlund)
Comment on attachment 8792962 [details] [diff] [review]
desktop-16.patch

Review of attachment 8792962 [details] [diff] [review]:
-----------------------------------------------------------------

::: taskcluster/ci/build-signing/android-signing.yml
@@ +18,5 @@
>        owner: "jlund@mozilla.com"
>        source: "https://tools.taskcluster.net/task-creator/"
>    attributes:
>      nightly: true
> +    build_platform: android-api-15-nightly

Eventually (as part of the cleanups Jordan mentioned) this should be copied from the build task on which the signing depends.  That's `android-api-15-nightly/opt`, so I think to get the equivalent here you need to add `build_type: opt` -- unless that's defaulted somewhere.

::: taskcluster/ci/build-signing/linux-signing.yml
@@ +1,1 @@
> +signing-nightly-linux:

For builds, we put both linux and linux64 in the same file -- I figure that's the intent here, too.
Attached patch desktop-17.patchSplinter Review
patch that address dustin and jlund's comments.  No, this job doesn't show up in parameters.yml when I push to try


https://treeherder.mozilla.org/#/jobs?repo=try&revision=95e0af186711

however, I'm not sure how to triggers nightly builds on try
Attachment #8792962 - Attachment is obsolete: true
Attachment #8794221 - Flags: review?(jlund)
Attachment #8794221 - Flags: review?(jlund) → review?(dustin)
Attachment #8794221 - Flags: review?(jlund)
From irc
1:06 PM <dustin> kmoir, jlund: I'd like to r- https://bugzilla.mozilla.org/page.cgi?id=splinter.html&bug=1277579&attachment=8794221 pending bug 1303894
1:06 PM <dustin> I think that patch can unblock downstream work without landing (since it only operates in try anyway)
1:07 PM <dustin> is that OK? or is there a strong reason to land that immediately (other than the end of the quarter..)
1:10 PM <jlund> I'm okay with not landing and just using it against try+date till we are ready. depends on how kim feels
1:10 PM <kmoir> well, co.op wanted to have nightlies working on date this week, I just landed that patch there
1:12 PM <kmoir> I didn't realize that more refactoring was required because bug  1303894 isn't a dependency on my nightly bug
1:13 PM <dustin> that seems fine
1:13 PM <jlund> it wasn't a dep. we were going to defer that cleanup and get linux/fennec nightlies at least running
1:14 PM <kmoir> and yes it is end of quarter and that is one of my goals 
1:14 PM <dustin> I hate quarters
1:14 PM <kmoir> anyways, I 'm working on getting them working on date, fixing scopes
1:14 PM <jlund> I mentioned that bug here: https://bugzilla.mozilla.org/show_bug.cgi?id=1277579#c59
1:14 PM <dustin> yeah
1:14 PM <dustin> I just don't want that to be an "always next quarter" bug
1:14 PM <jlund> dustin: tell that to the managers above :)
1:14 PM <dustin> I know :(
1:15 PM <dustin> but the only leverage I have is r-
1:15 PM <jlund> hahhaaha
1:15 PM <dustin> yet, I think product quality is more important than september 30
1:15 PM <dustin> and I think keeping the quality of the taskgraph generation high is a part of that
1:15 PM — dustin stuck :(
Comment on attachment 8794221 [details] [diff] [review]
desktop-17.patch

Review of attachment 8794221 [details] [diff] [review]:
-----------------------------------------------------------------

The idea with bug 1303894 was to get that patch landed so that we could get rid of the nightly-fennec kind, which was causing issues, then quickly address bug 1303894.  I don't really consider this stuff "production-ready" without those changes, and I'd rather not see more un-ready code hit the tree.

I'm trying to create a strong culture of "right the first time" around the task-graph implementation, so that we don't find ourselves eager to rip out its crufty carcass in two years' time.  So I apologize if I come across as dictatorial.. hopefully benevolent?

Since this is try-only, I don't think there's a great advantage to landing it quickly.  When pushed to try, this patch produces tasks that look the way you want them to (metadata aside), so I think it can be used as a skeleton to start work on the next critical steps in nightlies.  However, I'd like to see bug 1303894 fixed before this actually lands -- either as part of this bug, or separately and with this patch rebased on top of it.

I know this causes issues with quarterly goals.  Hopefully the powers that be can see a way to both reward everyone for their hard work and land patches only when they're ready :)

::: taskcluster/ci/build-signing/kind.yml
@@ +5,5 @@
>  implementation: 'taskgraph.task.signing:SigningTask'
>  
>  jobs-from:
>    - android-signing.yml
> +  - linux-signing.yml

I think Jordan asked the same -- shouldn't this be `linux64-signing.yml`?

::: taskcluster/ci/build-signing/linux-signing.yml
@@ +15,5 @@
> +    metadata:
> +      name: "Signing Scriptworker Task"
> +      description: "Sign Linux Build Tasks"
> +      owner: "kmoir@mozilla.com"
> +      source: "https://tools.taskcluster.net/task-creator/"

These should match the metadata for other Gecko tasks: `name` should match the label, `owner` should be the push user, and `source` should point to the `kind.yml` file at the pushed revision.  It may be worth factoring task descriptions (`taskcluster/taskgraph/transforms/task.py`) to support scriptworker tasks, which would get you this behavior for free.

@@ +23,5 @@
> +    build_type: opt
> +  unsigned-task:
> +    label: "build-linux64-nightly/opt"
> +    artifacts:
> +      - "public/build/target.tar.bz2"

These attributes, and the upstream tasks, should be generated by the kind implementation as it enumerates the build tasks that have `nightly: true`.  IIRC this was part of my "clean up when adding a second task" notes when Jordan added `android-signing.yml`.

::: taskcluster/ci/build/linux.yml
@@ +183,5 @@
> +    index:
> +        product: firefox
> +        job-name: linux64-nightly-opt
> +    treeherder:
> +        platform: linux64/opt 

trailing whitespace

::: taskcluster/taskgraph/transforms/gecko_v2_whitelist.py
@@ +34,5 @@
>      'linux64-ccov-opt',
>      'linux64-debug',
>      'linux64-jsdcov-opt',
>      'linux64-l10n-opt',
> +    'linux64-nightly-opt',

Is this where the Buildbot builds are indexed under gecko.v2?  We want to make sure that ported jobs have matching routes.
Attachment #8794221 - Flags: review?(dustin) → review-
Comment on attachment 8794221 [details] [diff] [review]
desktop-17.patch

from irc and most recent bug comments, we are going to work off date and try until we have end2end tc nightlies
Attachment #8794221 - Flags: review?(jlund)
Depends on: 1305505
I have a nightly fennec build running on date too.

https://tools.taskcluster.net/task-inspector/#PWKQNH-AQs25DziI3dGLVA/0

I adjusted to hook without taskcluster to run on a schedule but that is failing so I'm investigating that.
dustin: 

I need some clarification. You state

:: taskcluster/ci/build-signing/linux-signing.yml
@@ +15,5 @@
> +    metadata:
> +      name: "Signing Scriptworker Task"
> +      description: "Sign Linux Build Tasks"
> +      owner: "kmoir@mozilla.com"
> +      source: "https://tools.taskcluster.net/task-creator/"

These should match the metadata for other Gecko tasks: `name` should match the label, `owner` should be the push user, and `source` should point to the `kind.yml` file at the pushed revision.  It may be worth factoring task descriptions (`taskcluster/taskgraph/transforms/task.py`) to support scriptworker tasks, which would get you this behavior for free.

@@ +23,5 @@
> +    build_type: opt
> +  unsigned-task:
> +    label: "build-linux64-nightly/opt"
> +    artifacts:
> +      - "public/build/target.tar.bz2"

These attributes, and the upstream tasks, should be generated by the kind implementation as it enumerates the build tasks that have `nightly: true`.  IIRC this was part of my "clean up when adding a second task" notes when Jordan added `android-signing.yml`.

--

My question is: 
So it seems like I should be using transforms for the signing task instead of stating some of the metadata explicitly in the yml files for both android and linux64.  So if I look at 

Kims-MacBook-Pro:taskcluster kmoir$ cat ci/build/kind.yml
# This Source Code Form is subject to the terms of the Mozilla Public
# License, v. 2.0. If a copy of the MPL was not distributed with this
# file, You can obtain one at http://mozilla.org/MPL/2.0/.

implementation: taskgraph.task.transform:TransformTask

transforms:
   - taskgraph.transforms.build:transforms
   - taskgraph.transforms.build_attrs:transforms
   - taskgraph.transforms.job:transforms
   - taskgraph.transforms.task:transforms

jobs-from:
    - android-partner.yml
    .....


It's implementation is from taskgraph.task.transform:TransformTask

but for build-signing the implementation is taskgraph.task.signing:SigningTask

Kims-MacBook-Pro:taskcluster kmoir$ cat ci/build-signing/kind.yml
# This Source Code Form is subject to the terms of the Mozilla Public
# License, v. 2.0. If a copy of the MPL was not distributed with this
# file, You can obtain one at http://mozilla.org/MPL/2.0/.

implementation: 'taskgraph.task.signing:SigningTask'

jobs-from:
  - android-signing.yml
  - linux64-signing.yml

so are you suggesting that the code in taskgraph.task.signing:SigningTask be refactored to use transforms and that taskcluster/taskgraph/transforms/task.py also be modified to work with scriptworker tasks?
Flags: needinfo?(dustin)
Yes, that's about it :)

SigningTask would be a subclass TransformTask and override the task-enumeration logic (`get_inputs`).

As for task.py, you can implement scriptworker as a separate worker-implementation (`@payload_builder("scriptworker")`).  You may want to go a step further and implement specific scriptworker scripts as different payload builders (`@payload_builder("scriptworker-signing")`), since I imagine they will have somewhat different payloads.  We're doing this with the taskcluster-worker, for example -- rather than `@payload_builder("taskcluster-worker")`, we're using `@payload_builder("docker-engine")` and similarly for qemu-engine and windows-engine and any other taskcluster-worker engines we invent, each of which will have a different payload format.

Some of the task.py stuff may provide useful bells and whistles.  For example, if you let it fill in task.extra.treeherder you can have the tasks show up on treeherder.  You'll also automatically get task.expires set correctly on try pushes.
Flags: needinfo?(dustin)
So I am in a place where I'm stuck. 

Should signing.py also override load_tasks from TransformTask?  Or should that not matter anymore because I've set the special artifacts payload in the @payload_builder("scriptworker")` in task.py.  The problem I have is that if I don't override load_tasks in signing.py, the implementation from transform.py is used and it expects to have transforms in place.  From transform.py

63     @classmethod
 64     def load_tasks(cls, kind, path, config, params, loaded_tasks):
 65         inputs = cls.get_inputs(kind, path, config, params, loaded_tasks)
 66 
 67         transforms = TransformSequence()
 68         for xform_path in config['transforms']:
 69             transform = find_object(xform_path)
 70             transforms.add(transform)
 71 
 72         # perform the transformations
 73         trans_config = TransformConfig(kind, path, config, params)
 74         tasks = [cls(kind, t) for t in transforms(trans_config, inputs)]
 75         return tasks


However, I don't have any transforms defined in the kind.yml for build-signing because I'm subclassing TransformsTask in SigningTask.
Flags: needinfo?(dustin)
I think the fix is to define transforms in kind.yml.  Something like

implementation: taskgraph.task.signing:SigningTask

transforms:
   - taskgraph.transforms.signing:transforms
   - taskgraph.transforms.task:transforms

Here SigningTask would be responsible for the enumeration of tasks.  Basically it would look for tasks of the build kind with nightly attribute true, and build a build-signing task for them.  It would probably start by creating a dictionary you could call a "signing description" that basically says what to sign, with whatever other information is required (build platform?).  Something like:

  {
      'build-label': ..
      'build-platform': ..
  }

then the transforms in taskcluster/taskgraph/transforms/signing.py would start with that basic info and create a task description with worker.implementation = 'signing' and any relevant treeherder config, index config, etc.  Then your payload builder in taskgraph/taskcluster/transforms/tasks.py will convert that task description into a signing payload.

I raised the question of using @payload_builder('scriptworker') or @payload_builder('scriptworker-signing').  In the former case, the contents of the task description's `worker` property would need to be flexible enough to specify any scriptworker payload.  In the latter case, the schema for the task description's `worker` property would have some signing-specific fields, and the payload builder would do some smart things (apply defaults, etc.) to turn those into a scriptworker payload.

Sorry if this answer misses the point -- it's hard to be sure I'm answering the right question without seeing your code.  Feel free to schedule a meeting with me to talk about it if I have indeed missed the point or not answered the question.
Flags: needinfo?(dustin)
thanks Dustin, I understand what was wrong with my earlier approach, I have been refactoring my patches all day with the suggestions you made
Attached patch refactor.patch (obsolete) — Splinter Review
wip patches, not working yet

There are a lot of errors here
Traceback (most recent call last):
  File "/Users/kmoir/hg/date/taskcluster/mach_commands.py", line 228, in show_taskgraph
    tg = getattr(tgg, graph_attr)
  File "/Users/kmoir/hg/date/taskcluster/taskgraph/generator.py", line 103, in target_task_set
    return self._run_until('target_task_set')
  File "/Users/kmoir/hg/date/taskcluster/taskgraph/generator.py", line 219, in _run_until
    k, v = self._run.next()
  File "/Users/kmoir/hg/date/taskcluster/taskgraph/generator.py", line 168, in _run
    new_tasks = kind.load_tasks(self.parameters, list(all_tasks.values()))
  File "/Users/kmoir/hg/date/taskcluster/taskgraph/generator.py", line 36, in load_tasks
    parameters, loaded_tasks)
  File "/Users/kmoir/hg/date/taskcluster/taskgraph/task/transform.py", line 74, in load_tasks
    tasks = [cls(kind, t) for t in transforms(trans_config, inputs)]
  File "/Users/kmoir/hg/date/taskcluster/taskgraph/transforms/task.py", line 499, in build_task
    for task in tasks:
  File "/Users/kmoir/hg/date/taskcluster/taskgraph/transforms/task.py", line 444, in add_index_routes
    for task in tasks:
  File "/Users/kmoir/hg/date/taskcluster/taskgraph/transforms/task.py", line 439, in validate
    "In task {!r}:".format(task.get('label', '?no-label?')))
  File "/Users/kmoir/hg/date/taskcluster/taskgraph/transforms/base.py", line 73, in validate_schema
    raise Exception('\n'.join(msg) + '\n' + pprint.pformat(obj))
Exception: In task u'signing-android-api-15-nightly/opt':
extra keys not allowed @ data[u'provisionerId']
extra keys not allowed @ data[u'worker-implementation']
required key not provided @ data[u'description']
{u'dependencies': {u'build': u'signing-android-api-15-nightly/opt'},
 u'label': u'signing-android-api-15-nightly/opt',
 u'provisionerId': u'scriptworker-prov-v1',
 u'scopes': [u'project:releng:signing:cert:nightly-signing',
             u'project:releng:signing:format:jar'],
 u'worker-implementation': u'scriptworker-signing'}
 
will look at it with fresh eyes in the morning, ran this against date branch
kmoir$ ./mach taskgraph target -p ~/Downloads/parameters.yml 

Kims-MacBook-Pro:date kmoir$ cat ~/Downloads/parameters.yml 
base_repository: https://hg.mozilla.org/mozilla-central
head_ref: tip
head_repository: https://hg.mozilla.org/try/
head_rev: 843bd791654730bc34a38f58e26b0c7f2bfb585e
level: '1'
message: 'try: -b o -p foo -u none -t none'
optimize_target_tasks: true
owner: kmoir@mozilla.com
project: try
pushdate: 0
pushlog_id: '0'
target_tasks_method: nightly_linux
triggered_by: nightly
Looks like good progress!  Let me know if/how I can help!
Attached patch refactor.patch (obsolete) — Splinter Review
patches from this morning

it is failing like this
 File "/Users/kmoir/hg/date/taskcluster/taskgraph/transforms/task.py", line 571, in build_task
    logging.info(worker)
NameError: global name 'worker' is not defined


which is here taskcluster/taskgraph/transforms/task.py

# add the payload and adjust anything else as required (e.g., scopes)
570         logging.info("worker!!!")
571         logging.info(worker)
572         payload_builders[task['worker']['implementation']](config, task, task_def)


I'm not sure where worker should be defined among the transforms or the signing task.  suggestions please :-)
Attachment #8797352 - Attachment is obsolete: true
Flags: needinfo?(dustin)
I think it's `task['worker']` you're looking for there, although that's going to give you a lot of logging (about 900 tasks)!  It might be better to log it in `build_scriptworker_signing_payload` so you only see task['worker'] for the tasks you're interested in.

As for where the worker should be defined, let me back up a little.  The transforms you've defined are (correctly, IMHO):

   - taskgraph.transforms.signing:transforms
   - taskgraph.transforms.task:transforms

so if you think of each of those as transforming {something} into {something else}:

taskgraph.task.signing:SigningTask.get_inputs
   -> {signing description}
      -> taskgraph.transforms.signing:transforms
         -> {task description}
            -> taskgraph.transforms.task:transforms
               -> {task definition}

From the patch, a signing description looks something like this:

signing_description_schema = Schema({
    'provisionerId': basestring,
    'build-label': basestring,
    'build-platform': basestring,
    'description': basestring,
    'build-type': basestring,
    'scopes': [basestring],
    'worker-type': basestring,
})

you may eventually find you want to change that -- for example, the description is not really necessary since it's just built from the label.

The task description schema is at https://dxr.mozilla.org/mozilla-central/source/taskcluster/taskgraph/transforms/task.py#31 but the relevant bits for your question are:

task_description_schema = Schema({
    # ...
    # information specific to the worker implementation that will run this task
    'worker': Any({
        Required('implementation'): Any('docker-worker', 'docker-engine'),
        # ..
    }, {
        Required('implementation'): 'generic-worker',
        # ..
    }, {
        Required('implementation'): 'buildbot-bridge',
        # ..
    }),
})

So taskgraph.transforms.task:transforms turns a task description into a {task definition}, which is exactly what gets passed to the queue.createTask(..) API and appears in the task inspector.  Aside from the task['worker'] stuff, the task transform takes care of lots of other things like indexing, treeherder, task expiration, etc.

The "Any" there means that task_description['worker'] can match any of these alternatives.  The idea is that, based on task_description['worker']['implementation'], the task transform can create a payload type appropriate to that worker.  The first alternative is for docker-worker (and later taskcluster-worker's docker-engine, which behaves about the same).  The second is for generic-worker, and the third for buildbot-bridge.  So for scriptworker-signing, you'll want to add a fresh new alternative to that list, rather than tacking on to docker-worker.

The contents of that worker schema are usually pretty close to the payload format for the worker, but simplified a little bit to make it easy to generate them.  Judging from https://dxr.mozilla.org/mozilla-central/source/taskcluster/ci/build-signing/android-signing.yml, that might be (comments omitted for brevity, please add them!!):

        # ..
    }, {
        Required('implementation'): 'scriptworker-signing',
        Required('max-run-time', default=600): int,
        Required('unsigned-artifacts'): [basestring],
    }, {
        # ..

the `build_scriptworker_signing_payload` function would then basically copy `task['worker']` into `task_def['payload']`, stripping the `implementation` key and re-capitalizing `max-run-time` to `maxRunTime` (I've tried to be consistent that all YAML files use lower-case, dashed identifiers).

The last bit that I haven't addressed here is taskgraph.transforms.signing:transforms.  Its job is to take a signing description and yield a task description.  That task description should have 

{
    'treeherder': {
        # ...
    },
    'worker': {
        'implementation': 'scriptworker-signing',
        'unsigned-artifacts': [..],
    },
}

along with any other task-description properties you might like to add.

As for actual implementation, it is up to you whether you want to try to "transform" one dictionary from a signing description into a task description, by adding and deleting keys, or to just create a fresh new dictionary representing the task description.  The latter option might be easier to code (and read), but either one is OK.

I hope that helps.  I'm happy to talk more.
Flags: needinfo?(dustin)
Attached patch refactor.patch (obsolete) — Splinter Review
thanks Dustin, that was very helpful.  I understand a lot more now.  

So I have implemented some of the changes you suggested in the attached patch.  However, it seems like it still fails 

Kims-MacBook-Pro:date kmoir$ ./mach taskgraph target -p ~/Downloads/parameters.yml 
 0:00.47 Loading kinds
 0:00.63 Generating full task set
 0:00.77 task_def!!!
 0:00.77 {u'workerType': u'android-api-15', u'scopes': ['docker-worker:relengapi-proxy:tooltool.download.internal', 'docker-worker:relengapi-proxy:tooltool.download.public'], u'tags': {u'createdForUser': 'amiyaguchi@mozilla.com'}, u'deadline': {u'relative-datestamp': u'1 day'}, u'created': {u'relative-datestamp': u'0 seconds'}, u'routes': [u'tc-treeherder.v2.try.843bd791654730bc34a38f58e26b0c7f2bfb585e.0', u'tc-treeherder-stage.v2.try.843bd791654730bc34a38f58e26b0c7f2bfb585e.0'], u'expires': {u'relative-datestamp': u'14 days'}, u'extra': {u'treeherderEnv': [u'production', u'staging'], u'treeherder': {u'jobKind': 'build', u'groupSymbol': 'tc', u'collection': {u'opt': True}, u'machine': {u'platform': u'android-4-0-armv7-api15'}, u'groupName': u'Executed by TaskCluster', u'tier': 2, u'symbol': 'lint'}}, u'provisionerId': u'aws-provisioner-v1', u'metadata': {u'owner': 'amiyaguchi@mozilla.com', u'source': u'https://hg.mozilla.org/try//file/843bd791654730bc34a38f58e26b0c7f2bfb585e/taskcluster/ci/android-stuff', u'description': 'Android lint', u'name': 'android-lint'}}
 0:00.77 worker!!!
Traceback (most recent call last):
  File "/Users/kmoir/hg/date/taskcluster/mach_commands.py", line 228, in show_taskgraph
    tg = getattr(tgg, graph_attr)
  File "/Users/kmoir/hg/date/taskcluster/taskgraph/generator.py", line 103, in target_task_set
    return self._run_until('target_task_set')
  File "/Users/kmoir/hg/date/taskcluster/taskgraph/generator.py", line 219, in _run_until
    k, v = self._run.next()
  File "/Users/kmoir/hg/date/taskcluster/taskgraph/generator.py", line 168, in _run
    new_tasks = kind.load_tasks(self.parameters, list(all_tasks.values()))
  File "/Users/kmoir/hg/date/taskcluster/taskgraph/generator.py", line 36, in load_tasks
    parameters, loaded_tasks)
  File "/Users/kmoir/hg/date/taskcluster/taskgraph/task/transform.py", line 74, in load_tasks
    tasks = [cls(kind, t) for t in transforms(trans_config, inputs)]
  File "/Users/kmoir/hg/date/taskcluster/taskgraph/transforms/task.py", line 577, in build_task
    logging.info(task_def['worker'])
KeyError: u'worker'

it doesn't get to the implementation specific payload_builder, it fails when it gets to here

taskcluster/taskgraph/transforms/task.py

in


payload_builders[task['worker']['implementation']](config, task, task_def)

and for some reason on the android-lint job

any suggestions?
Attachment #8797619 - Attachment is obsolete: true
Flags: needinfo?(dustin)
talked in vidyo to dustin
Flags: needinfo?(dustin)
Attached patch refactor.patch (obsolete) — Splinter Review
updated patches with more success

I'm not sure how to get the value of the unsigned artifact (it's not correct now)

from the build job to put into the payload of the signing job. Should I just look at the artifacts in the payload for the build job?  What's the best way to do that?
Attachment #8797715 - Attachment is obsolete: true
Flags: needinfo?(dustin)
figured it out
Flags: needinfo?(dustin)
Attached patch updated patches, testing on try (obsolete) — Splinter Review
Attachment #8798210 - Attachment is obsolete: true
Attached patch refactor.patch (obsolete) — Splinter Review
https://treeherder.mozilla.org/#/jobs?repo=try&revision=0d70d8b728de06bf9353f0424661bc24daafeed3

Dustin, could you provide feedback on my patches so I can conduct more testing on the date branch?
Attachment #8798606 - Attachment is obsolete: true
Attachment #8798675 - Flags: feedback?(dustin)
Comment on attachment 8798675 [details] [diff] [review]
refactor.patch

Review of attachment 8798675 [details] [diff] [review]:
-----------------------------------------------------------------

This looks good -- I've gotten into some nitty-gritty with Python style etc., so don't take the quantity of comments as indicative of big issues :)

The only thing I see that's definitely not correct is the generation of the artifact URLs.  There is some detail about task-reference here;
  http://gecko.readthedocs.io/en/latest/taskcluster/taskcluster/taskgraph.html#task-parameterization

Other than that, I think you can shorten up the `get_inputs` function quite a lot, and move most of the interesting stuff into the transform, based on comments below.

Finally, I think the worker implementation schema has a bunch of copy/pasta from the BBB schema :)

::: taskcluster/taskgraph/task/signing.py
@@ +1,3 @@
> +# This Source Code Form is subject to the terms of the Mozilla Public License,
> +# v. 2.0. If a copy of the MPL was not distributed with this file, You can
> +# obtain one at http://mozilla.org/MPL/2.0/.

gps dinged me for word-wrapping this once -- `file` should be at the beginning of the 3rd line

@@ +19,2 @@
>  
> +    The `signing-template' kind configuration points to a yaml file which will

I don't see this referenced anywhere?

@@ +35,5 @@
> +                continue
> +            is_nightly = False
> +            is_nightly = task.attributes.get('nightly')
> +            if not is_nightly:
> +                continue

these four lines could be two (`if not task.attributes.get('nightly'): continue`)

@@ +36,5 @@
> +            is_nightly = False
> +            is_nightly = task.attributes.get('nightly')
> +            if not is_nightly:
> +                continue
> +            signing_task = copy.deepcopy(signing_description)

You end up setting every key in the dictionary anyway, so why not just start with an empty dictionary?  Or signing_task = { .. } with all of those keys listed explicitly?

@@ +38,5 @@
> +            if not is_nightly:
> +                continue
> +            signing_task = copy.deepcopy(signing_description)
> +            label = ((str(task.label)).replace("build-", "signing-"))
> +            signing_task['build-label'] = label

I like the generation of the label with a substitution, but at this point, it's not the build-label anymore, it's the signing label!  You could just pass `task.label` here, and do the `.replace` in the transform

@@ +43,5 @@
> +            signing_task['description'] = label
> +            signing_task['dependencies'] = {'build': task.label}
> +            signing_task['build-platform'] = task.attributes.get('build_platform')
> +            signing_task['build-type'] = task.attributes.get('build_type')
> +            signing_task['provisionerId'] = "scriptworker-prov-v1"

I don't think you need this?  It's embedded in `worker-type`

@@ +44,5 @@
> +            signing_task['dependencies'] = {'build': task.label}
> +            signing_task['build-platform'] = task.attributes.get('build_platform')
> +            signing_task['build-type'] = task.attributes.get('build_type')
> +            signing_task['provisionerId'] = "scriptworker-prov-v1"
> +            signing_task['artifactid'] = task.task_id

This is going to be `None` at this point -- tasks don't get taskIds until the optimization stage.

@@ +51,5 @@
> +            if "linux64" in label:
> +                signing_format = "gpg"
> +            signing_format_scope = "project:releng:signing:format:" + signing_format
> +            signing_task['scopes'] = ["project:releng:signing:cert:nightly-signing",
> +                                      signing_format_scope]

All of this signing-format and scope-related work would be better handled in a transform, since it's calculated based on the build platform (preferably, rather than based on the label)

::: taskcluster/taskgraph/transforms/signing.py
@@ +20,5 @@
> +transforms = TransformSequence()
> +
> +
> +@transforms.add
> +def fill_template(config, tasks):

maybe call the transform "make_task_description"?

@@ +25,5 @@
> +    for task in tasks:
> +        # Fill out the dynamic fields in the task description
> +        artifacts = ['public/build/target.tar.bz2']
> +        if 'android' in task['build-platform']:
> +            artifacts = ['public/build/target.apk', 'public/build/en-US/target.apk']

It's kind of a style thing, but I'd rather see this as

if 'android' in task['build-platform']:
    artifacts = ['public/build/target.apk', 'public/build/en-US/target.apk']
else:
    artifacts = ['public/build/target.tar.bz2']

since it nicely puts the alternatives side-by-side

@@ +28,5 @@
> +        if 'android' in task['build-platform']:
> +            artifacts = ['public/build/target.apk', 'public/build/en-US/target.apk']
> +        unsigned_artifacts = []
> +        for artifact in artifacts:
> +            url = ARTIFACT_URL.format(str(task['artifactid'])

This URL will need to use task-reference:

ARTIFACT_URL = 'https://queue.taskcluster.net/v1/task/<{}>/artifacts/{}'
#                                               NOTE: ^  ^
...
  url = {"task-reference": ARTIFACT_URL.format(task['build-label'], artifact)
#                                                   ^^^^^^^^^^^^^ need this to actually be the label of the build!

::: taskcluster/taskgraph/transforms/task.py
@@ +245,5 @@
> +            'branch': basestring,
> +            Optional('revision'): basestring,
> +            Optional('repository'): basestring,
> +            Optional('project'): basestring,
> +        },

buildername? sourcestamp?? this looks a lot like a BBB payload...

@@ +422,5 @@
> +def build_scriptworker_signing_payload(config, task, task_def):
> +    worker = task['worker']
> +
> +    task_def['payload'] = {
> +        'implementation': worker['implementation'],

I don't think `implementation` needs to be in the payload, right?
Attachment #8798675 - Flags: feedback?(dustin) → feedback+
Attached patch refactor.patch (obsolete) — Splinter Review
Attachment #8798675 - Attachment is obsolete: true
Attached patch refactor.patch (obsolete) — Splinter Review
updated patches with your suggestions, not sure if the the task reference in the payload for the unsigned artifacts is correct

https://treeherder.mozilla.org/#/jobs?repo=try&revision=c7f05f470b766460d8ad0466765cf3b774cea277
Attachment #8799833 - Attachment is obsolete: true
Attachment #8799858 - Flags: feedback?(dustin)
Attached patch refactor.patch (obsolete) — Splinter Review
https://treeherder.mozilla.org/#/jobs?repo=try&revision=0c28be99de87e68d634ae256ca2a73e8e7fe26fa
Attachment #8799858 - Attachment is obsolete: true
Attachment #8799858 - Flags: feedback?(dustin)
Attachment #8799936 - Flags: review?(dustin)
Attachment #8799936 - Flags: review?(dustin) → feedback?(dustin)
Attached patch bug1277579refactor.patch (obsolete) — Splinter Review
Attachment #8799936 - Attachment is obsolete: true
Attachment #8799936 - Flags: feedback?(dustin)
Comment on attachment 8800030 [details] [diff] [review]
bug1277579refactor.patch

Review of attachment 8800030 [details] [diff] [review]:
-----------------------------------------------------------------

\o/
Attachment #8800030 - Flags: review+
Attached patch bug1277579refactor-2.patch (obsolete) — Splinter Review
patch I actually checked in (removed logging imports that I no longer need to debug)
Attached patch bug1277579dt.patch (obsolete) — Splinter Review
fix to address issue where signing tasks are included in the decision task when they shouldn't be by setting the nightly attribute
Attached patch bug1277579dt2.patch (obsolete) — Splinter Review
add build-type and build-platform to signing task
Attachment #8800325 - Flags: review+
Attached patch bug1277579dt3.patch (obsolete) — Splinter Review
fix dependencies so it is name, not label
Attached patch wip patches (obsolete) — Splinter Review
will test on try

other patches backed out due to issues on date that did not reveal themselves on try
Attachment #8800030 - Attachment is obsolete: true
Attachment #8800216 - Attachment is obsolete: true
Attachment #8800304 - Attachment is obsolete: true
Attachment #8800325 - Attachment is obsolete: true
Attachment #8800375 - Attachment is obsolete: true
So I ran it on try

https://treeherder.mozilla.org/#/jobs?repo=try&revision=aea0ad71331b0824b65b185bd6ddc92dae46aba3&selectedJob=29057944

https://public-artifacts.taskcluster.net/FP633ekIT0KbHiiY2RQLWw/0/public/full-task-graph.json


the taskgraph still shows 

"signing-linux64-nightly/opt": {
    "attributes": {
      "build_platform": "linux64-nightly",
      "build_type": "opt",
      "kind": "build-signing",
      "nightly": true,
      "run_on_projects": [
        "all"
      ]
    },

run_on_projects as all. I tried to change the run_on_projects to [] but that is just overwritten since the default is all if has an empty value.  Not sure how to overwrite this without making all not the default.
Attached patch bug1277579wip.patch (obsolete) — Splinter Review
Attachment #8800427 - Attachment is obsolete: true
Woo, 100 comments!  Now we are having fun! ;)

It looks like you're modifying run-on-projects: [] on the build tasks, but it's a signing task you excerpted in comment 98.

Probably the most future-proof approach is to copy the run_on_projects attribute from the upstream build task into the signing task.  I'll update attachment 8800438 [details] [diff] [review] to do that.
Attached patch bug1277579refactor-3.patch (obsolete) — Splinter Review
Dustin, thanks for your help.  I think the patch you attached is the one from yesterday.

I tried the approach you suggested in the latest patch but since value is [] it is overwritten in 

task.py to  all 

576         attributes = task.get('attributes', {})
577         attributes['run_on_projects'] = task.get('run-on-projects', ['all'])
Attachment #8800438 - Attachment is obsolete: true
The problem with my earlier patch that I was setting the task attribute in the signing transform which was being overwritten because the task transform is looking for a value of run-on-projects in the task itself, not the attributes.  i.e. attributes['run_on_projects'] = task.get('run-on-projects', ['all'])

The taskgraph for this run on try looks good

https://public-artifacts.taskcluster.net/FP633ekIT0KbHiiY2RQLWw/0/public/full-task-graph.json

https://treeherder.mozilla.org/#/jobs?repo=try&revision=aea0ad71331b0824b65b185bd6ddc92dae46aba3
Attachment #8800467 - Attachment is obsolete: true
Attachment #8800691 - Flags: feedback?(dustin)
Comment on attachment 8800691 [details] [diff] [review]
bug1277579refactor-4.patch

Review of attachment 8800691 [details] [diff] [review]:
-----------------------------------------------------------------

Awesome!

::: taskcluster/ci/build/android.yml
@@ -96,5 @@
>              - taskcluster_nightly.py
>          script: "mozharness/scripts/fx_desktop_build.py"
>          custom-build-variant-cfg: api-15
>          tooltool-downloads: internal
> -    run-on-projects: []

No need for this, but of course it doesn't hurt (similar in linux.yml)
Attachment #8800691 - Flags: feedback?(dustin) → feedback+
bug to land this on m-c is bug 1309947.  Currently this is just landed on date.
Status: REOPENED → RESOLVED
Closed: 3 years ago3 years ago
Resolution: --- → FIXED
Product: TaskCluster → Firefox Build System
You need to log in before you can comment on or make changes to this bug.