Closed Bug 1486970 Opened Last year Closed 8 months ago

Create revision and pushlog-id index routes for cron decision tasks

Categories

(Firefox Build System :: Task Configuration, task)


Tracking

(geckoview62 fixed, firefox-esr60 fixed, firefox62 fixed, firefox63 fixed)

RESOLVED FIXED
Tracking Status
geckoview62 --- fixed
firefox-esr60 --- fixed
firefox62 --- fixed
firefox63 --- fixed

People

(Reporter: Callek, Assigned: Callek)

Details

Attachments

(6 files)

No description provided.
Comment on attachment 9004741 [details]
Bug 1486970 - Create revision and pushlog-id index routes for cron decision tasks. r=dustin r=aki

Dustin J. Mitchell [:dustin] pronoun: he has approved the revision.
Attachment #9004741 - Flags: review+
Pushed by jwood@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/3db3193ec938
Create revision and pushlog-id index routes for cron decision tasks. r=dustin
https://hg.mozilla.org/mozilla-central/rev/3db3193ec938
Status: NEW → RESOLVED
Closed: Last year
Resolution: --- → FIXED
Target Milestone: --- → mozilla63
Backed out for breaking the nightlies with signing issues.

Push with failures: https://treeherder.mozilla.org/#/jobs?repo=mozilla-central&filter-resultStatus=pending&filter-resultStatus=running&filter-resultStatus=exception&filter-searchStr=nightly&fromchange=2b50a2ad969a326c3d066426d6e823c44de5b7d4&selectedJob=196541308

Failure log: https://tools.taskcluster.net/groups/EmFetL3aRzuX55Teie-z_w/tasks/CxbKW1BYRY6NuH7pQ4H1zQ/runs/1/logs/public%2Flogs%2Fchain_of_trust.log#L61

2018-08-29T23:46:35 CRITICAL - Can't find task signing CxbKW1BYRY6NuH7pQ4H1zQ in signing:parent EmFetL3aRzuX55Teie-z_w task-graph.json!
2018-08-29T23:46:35 CRITICAL - Chain of Trust verification error!
Traceback (most recent call last):
  File "/builds/scriptworker/lib/python3.6/site-packages/scriptworker/cot/verify.py", line 1815, in verify_chain_of_trust
    task_count = await verify_task_types(chain)
  File "/builds/scriptworker/lib/python3.6/site-packages/scriptworker/cot/verify.py", line 1570, in verify_task_types
    await valid_task_types[task_type](chain, obj)
  File "/builds/scriptworker/lib/python3.6/site-packages/scriptworker/cot/verify.py", line 1351, in verify_parent_task
    verify_link_in_task_graph(chain, link, target_link)
  File "/builds/scriptworker/lib/python3.6/site-packages/scriptworker/cot/verify.py", line 921, in verify_link_in_task_graph
    task_link.name, task_link.task_id, decision_link.name, decision_link.task_id
  File "/builds/scriptworker/lib/python3.6/site-packages/scriptworker/cot/verify.py", line 288, in raise_on_errors
    raise CoTError("\n".join(errors))
scriptworker.exceptions.CoTError: "Can't find task signing CxbKW1BYRY6NuH7pQ4H1zQ in signing:parent EmFetL3aRzuX55Teie-z_w task-graph.json!"
2018-08-29T23:46:35    ERROR - Hit ScriptWorkerException: "Can't find task signing CxbKW1BYRY6NuH7pQ4H1zQ in signing:parent EmFetL3aRzuX55Teie-z_w task-graph.json!"
2018-08-29T23:46:35    DEBUG -            "/builds/scriptworker/artifacts/public/logs/chain_of_trust.log" is encoded with "None" and has mime/type "text/plain"
2018-08-29T23:46:35     INFO - "/builds/scriptworker/artifacts/public/logs/chain_of_trust.log" can be gzip'd. Compressing...
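
The failing check in the log above can be pictured roughly as follows. This is a simplified sketch, not scriptworker's actual code: `find_task_in_graph`, the local `CoTError` stand-in, and the dict-shaped graph are assumptions for illustration. Only the wording of the error message is taken from the log.

```python
class CoTError(Exception):
    """Stand-in for scriptworker.exceptions.CoTError (illustration only)."""


def find_task_in_graph(task_name, task_id, decision_name, decision_id, task_graph):
    """Return the definition recorded for task_id in a decision task's graph.

    task_graph is assumed to be the parsed task-graph.json, keyed by task ID.
    A task that the named decision task did not schedule is rejected.
    """
    if task_id not in task_graph:
        raise CoTError(
            "Can't find task {} {} in {} {} task-graph.json!".format(
                task_name, task_id, decision_name, decision_id
            )
        )
    return task_graph[task_id]
```

Under this reading, the nightly signing task was checked against a decision task whose graph did not contain it, which would fit verification resolving to a different parent task than the one that scheduled the signing job.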
Status: RESOLVED → REOPENED
Flags: needinfo?(bugspam.Callek)
Resolution: FIXED → ---
Target Milestone: mozilla63 → ---
Backout by csabou@mozilla.com:
https://hg.mozilla.org/mozilla-central/rev/e547a1a4ac86
Backed out changeset 3db3193ec938 for braking the nightlies with signing issues. a=backout
Does this affect all operating systems?
:aki, can you provide any insight on how we can solve the error in https://bugzilla.mozilla.org/show_bug.cgi?id=1486970#c5 without also breaking other trees at the same time?

I'm not sure I'm following the call graph well enough to identify this.. :(

https://github.com/mozilla-releng/scriptworker/blob/master/scriptworker/cot/verify.py#L1020
Flags: needinfo?(bugspam.Callek) → needinfo?(aki)
Flags: needinfo?(aki) → needinfo?(bugspam.Callek)
Flags: needinfo?(aki)
Aiui, all trees expect these hardcoded values. At first blush, it seems like a difficult problem -- you want to change the hardcoded values on a per-tree basis, but we have nothing to base that logic on.

We might be able to do a couple things:

First option,
0. Try this on try or something with `verify_cot` locally to make sure it works
1. Get .taskcluster.yml to provide the default hardcodes for the push, if the push context isn't specified.
2. Uplift that change to all trees.
3. When that change is uplifted to all trees, remove this block https://github.com/mozilla-releng/scriptworker/blob/master/scriptworker/cot/verify.py#L1025-L1032
4. Roll out new scriptworker with that change
5. On autoland, land your patch which removes the need for the default push hardcodes for cron tasks

Second option, crashland everywhere, knowing that tasks will break until the change is uplifted on every branch and scriptworker is rolled out.

I'd lean towards the first option.
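
As a rough illustration of step 1 in the first option, the fallback could look like the sketch below. It is written in Python rather than the JSON-e used by `.taskcluster.yml`, and `DEFAULT_PUSH`, its field names, and its values are hypothetical placeholders; the real hardcoded values are the ones in the scriptworker block linked above.

```python
# Hypothetical placeholder defaults; NOT the real hardcoded values.
DEFAULT_PUSH = {"owner": "cron", "pushlog_id": "-1", "comment": ""}


def resolve_push(context):
    """Return the push section of a render context, filling in defaults.

    If the caller (for example a cron hook) supplies no push information,
    the defaults are used as-is; any field the caller does supply wins.
    """
    push = dict(DEFAULT_PUSH)                 # copy so the defaults stay untouched
    push.update(context.get("push") or {})    # caller-provided fields override
    return push
```

The point of moving the defaults into the tree is that scriptworker would no longer need to know them, so the block in `verify.py` could be removed once every branch carries the change.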
Flags: needinfo?(aki)
Attachment #9004741 - Attachment description: Bug 1486970 - Create revision and pushlog-id index routes for cron decision tasks. r=dustin → Bug 1486970 - Create revision and pushlog-id index routes for cron decision tasks. r=dustin! r=aki!
This changeset actually creates the new routes we want for decision tasks, and must land after scriptworker is updated
and deployed. This has the affect of also un-hardcoding the .taskcluster.yml fields applied in the first part of this bug.

This changeset in particular will ride trains and not be uplifted (outside of compelling reasons).
Attached file GitHub Pull Request
This particular PR was tested along with https://github.com/mozilla-releng/scriptworker/pull/252
Flags: needinfo?(bugspam.Callek)
Attachment #9007035 - Flags: review?(aki)
For testing:

bare m-c-based push:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=dbfb07c24d5a102c807c143b01aca913d660ae0c
  (broken N's were due to a check for public csets, which is ok in this case)
  - Passes CoT validation against the D and Nd for unpatched scriptworker
  - Fails CoT validation against the D and Nd for patched scriptworker

m-c based push with first patch (+ a minor bug that broke the main D, said issue was fixed in patches applied to this bug)
https://treeherder.mozilla.org/#/jobs?repo=try&revision=05631aab184a2d82997b998a59b52c3bec407d28
   - Passes validation of unpatched scriptworker against the Nd
   - Passes validation of patched scriptworker against the Nd

m-c based push with both patches:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=c91baa29f68721be466af9c847747174c7af39bf
   - Passes validation of patched scriptworker against both D and Nd
   - Passes validation of unpatched scriptworker against D
   - Fails validation of unpatched scriptworker against Nd
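
The results above can be written down as a small lookup table to sanity-check a rollout order. This is only a sketch: the names `RESULTS` and `failing_steps` are made up here, "D" and "Nd" are read as the on-push and nightly (cron) decision tasks, and combinations that were not tested are left out rather than guessed.

```python
# (tree state, scriptworker state, decision type) -> did CoT validation pass?
RESULTS = {
    ("bare", "unpatched", "D"): True,
    ("bare", "unpatched", "Nd"): True,
    ("bare", "patched", "D"): False,
    ("bare", "patched", "Nd"): False,
    ("first", "unpatched", "Nd"): True,
    ("first", "patched", "Nd"): True,
    ("both", "patched", "D"): True,
    ("both", "patched", "Nd"): True,
    ("both", "unpatched", "D"): True,
    ("both", "unpatched", "Nd"): False,
}


def failing_steps(rollout, decision="Nd"):
    """Return the (tree, scriptworker) steps with a reported failure.

    Untested combinations are skipped instead of being assumed to pass or fail.
    """
    return [step for step in rollout if RESULTS.get(step + (decision,)) is False]


# Tree patch first, then scriptworker, then the index patch.
staged = [
    ("bare", "unpatched"),
    ("first", "unpatched"),
    ("first", "patched"),
    ("both", "patched"),
]
```

With this table, `failing_steps(staged)` is empty, while rolling out scriptworker before the first tree patch, or landing both tree patches before scriptworker, each hits a reported Nd failure.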
Marking leave-open since this bug will land in multiple parts.
Keywords: leave-open
Comment on attachment 9004741 [details]
Bug 1486970 - Create revision and pushlog-id index routes for cron decision tasks. r=dustin r=aki

Dustin J. Mitchell [:dustin] pronoun: he has been removed from the revision.
Attachment #9004741 - Flags: review+
(In reply to Justin Wood (:Callek) from comment #13)
> m-c based push with first patch (+ a minor bug that broke the main D, said
> issue was fixed in patches applied to this bug)
> https://treeherder.mozilla.org/#/jobs?repo=try&revision=05631aab184a2d82997b998a59b52c3bec407d28
>    - Passes validation of unpatched scriptworker against the Nd
>    - Passes validation of patched scriptworker against the Nd

I think it's worthwhile to verify that the first patch doesn't break cot verification for the main decision task as well. Otherwise a) the patch will bounce on autoland due to busted dep-signing tasks, and b) that will probably break cot verification for action tasks based on the busted decision task.
Thanks Callek! I've left comments in the various PRs and phab.

Once we get a good 1st in-tree patch, I'm guessing a good course of action might be:

1. autoland, verify nothing breaks in either cron or on-push decision tasks.
2. uplift to the various release branches including esr60.
3. wait for several days, so we know we're not going to try to verify a revision without the change in #2.
4. land and roll out the scriptworker fix.
5. land the index fix.

If we're confident our changes are limited to cron only, then (3) can be shorter, as long as no cron graphs need additional verification (e.g. retriggers/reruns in nightly graphs). If the changes might affect hg-push or action tasks, then a very conservative approach could be to wait until the next dot release or next release cycle to update scriptworker, to avoid breaking partner repack respins for the latest release.
Comment on attachment 9007033 [details]
Bug 1486970 - Create revision and pushlog-id index routes for cron decision tasks.

Dustin J. Mitchell [:dustin] pronoun: he has approved the revision.
Attachment #9007033 - Flags: review+
Attachment #9007035 - Flags: review?(aki) → review-
Attachment #9004741 - Attachment description: Bug 1486970 - Create revision and pushlog-id index routes for cron decision tasks. r=dustin! r=aki! → Bug 1486970 - Create revision and pushlog-id index routes for cron decision tasks. r=dustin r=aki
Comment on attachment 9004741 [details]
Bug 1486970 - Create revision and pushlog-id index routes for cron decision tasks. r=dustin r=aki

Aki Sasaki [:aki] has approved the revision.
Attachment #9004741 - Flags: review+
Comment on attachment 9004741 [details]
Bug 1486970 - Create revision and pushlog-id index routes for cron decision tasks. r=dustin r=aki

Dustin J. Mitchell [:dustin] pronoun: he has approved the revision.
Attachment #9004741 - Flags: review+
Pushed by jwood@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/79d78bba6f68
Create revision and pushlog-id index routes for cron decision tasks. r=dustin,aki
Backout by csabou@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/32089e2911b7
Backed out changeset 79d78bba6f68 for braking backfiling and add new jobs tasks. CLOSED TREE
When trying to add new jobs or backfill a job, this is the message I received:
 Taskcluster: While firing hook: InterpreterError at template["_pushDate"]: object has no property pushdate --- * method: triggerHook * errorCode: InputError * statusCode: 400 * time: 2018-09-13T18:32:34.744Z 

https://irccloud.mozilla.com/file/UgBDdF4i/image.png

eg this job: https://treeherder.mozilla.org/#/jobs?repo=autoland&selectedJob=199138891&searchStr=linux,opt,mochitests,with,e10s,test-linux32%2Fopt-mochitest-browser-chrome-e10s-2,m-e10s(bc2)&tochange=8eeaf62be0dc07c0710dc8b1f8442ede33adaedb&fromchange=ee4b2d3fd97ab313008588dd78d0c2766158836b

tomprince hinted that this bug might be what triggered the error, hence the backout.
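
The message suggests a template expression read a `pushdate` property from an object that did not carry one. A minimal way to catch that class of problem before rendering is sketched below in plain Python (not JSON-e); `REQUIRED_PUSH_FIELDS`, `missing_push_fields`, and the context shape are assumptions for illustration, not the hook's real schema.

```python
# Assumed set of fields a template might read from the push object.
REQUIRED_PUSH_FIELDS = ("revision", "pushlog_id", "pushdate")


def missing_push_fields(context, required=REQUIRED_PUSH_FIELDS):
    """List the required push fields that the given context does not provide."""
    push = context.get("push") or {}
    return [name for name in required if name not in push]
```

For a context such as `{"push": {"revision": "abc", "pushlog_id": "1"}}` this reports `pushdate` as missing, which is the same gap the hook error describes.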
Flags: needinfo?(bugspam.Callek)
Comment on attachment 9007033 [details]
Bug 1486970 - Create revision and pushlog-id index routes for cron decision tasks.

Aki Sasaki [:aki] has approved the revision.
Attachment #9007033 - Flags: review+
Pushed by jwood@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/85bab1e29962
Create revision and pushlog-id index routes for cron decision tasks. r=dustin,aki
Comment on attachment 9004741 [details]
Bug 1486970 - Create revision and pushlog-id index routes for cron decision tasks. r=dustin r=aki

This starts to facilitate easy index searching for all decision task types, and it must land on all production trees before we switch Chain Of Trust to trusting the new design (next patch). It therefore needs uplift to all active branches.

There is some slight risk (it was backed out twice since initially landing). However, we currently have a nightly built with this code in it, so confidence that this variant is good is high.
Flags: needinfo?(bugspam.Callek)
Attachment #9004741 - Flags: approval-mozilla-release?
Attachment #9004741 - Flags: approval-mozilla-geckoview62?
Attachment #9004741 - Flags: approval-mozilla-esr60?
Attachment #9004741 - Flags: approval-mozilla-beta?
Comment on attachment 9004741 [details]
Bug 1486970 - Create revision and pushlog-id index routes for cron decision tasks. r=dustin r=aki

NPOTB, Beta63+, Release63+, ESR60.3+, GV62+
Attachment #9004741 - Flags: approval-mozilla-release?
Attachment #9004741 - Flags: approval-mozilla-release+
Attachment #9004741 - Flags: approval-mozilla-geckoview62?
Attachment #9004741 - Flags: approval-mozilla-geckoview62+
Attachment #9004741 - Flags: approval-mozilla-esr60?
Attachment #9004741 - Flags: approval-mozilla-esr60+
Attachment #9004741 - Flags: approval-mozilla-beta?
Attachment #9004741 - Flags: approval-mozilla-beta+
Attachment #9007033 - Attachment description: Bug 1486970 - Create revision and pushlog-id index routes for cron decision tasks. r=dustin! r=aki! → Bug 1486970 - Create revision and pushlog-id index routes for cron decision tasks.
Pushed by mozilla@hocat.ca:
https://hg.mozilla.org/integration/autoland/rev/d6229a4bce1a
[taskgraph] Make find_hg_revision_pushlog_id more re-usable; r=Callek
https://hg.mozilla.org/integration/autoland/rev/b59f4aa8d4a3
[taskgraph] Add retries to getting pushlog information; r=Callek
https://hg.mozilla.org/integration/autoland/rev/e75b3f5571fe
Create revision and pushlog-id index routes for cron decision tasks. r=dustin,aki
https://hg.mozilla.org/integration/autoland/rev/11828ed48817
[taskgraph] Remove dead parameter to `make_decision_task` in cron; r=Callek
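
For context on the "Add retries to getting pushlog information" changeset, a generic retry wrapper might look like the sketch below. It is not the code that landed: `with_retries`, its parameters, and the linear backoff are illustrative choices, and the function that actually fetches pushlog data is passed in by the caller.

```python
import time


def with_retries(fetch, attempts=5, delay=1.0, sleep=time.sleep):
    """Call fetch() until it succeeds or the attempts run out.

    Only OSError (which includes urllib's URLError) is retried; the last
    error is re-raised once every attempt has failed.
    """
    for attempt in range(1, attempts + 1):
        try:
            return fetch()
        except OSError:
            if attempt == attempts:
                raise
            sleep(delay * attempt)  # linear backoff between tries
```

Passing `sleep` as a parameter keeps the helper easy to exercise in tests without real waiting.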

I think this is all landed now.

Keywords: leave-open
Status: REOPENED → RESOLVED
Closed: Last year → 8 months ago
Resolution: --- → FIXED