Closed Bug 1300795 Opened 9 years ago Closed 9 years ago

Taskgraph entry for signing-nightly-fennec is missing the 'extra' key causing /runnable_jobs/ to HTTP 500

Categories

(Tree Management :: Treeherder, defect)

Type: defect
Priority: Not set
Severity: normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: jmaher, Assigned: wlach)

Attachments

(1 file, 1 obsolete file)

47 bytes, text/x-github-pull-request, camd: review+
This is eg: https://rpm.newrelic.com/accounts/677903/applications/4180461/filterable_errors#/show/06e70e9a-743f-11e6-a90b-b82a72d22a14_0_4465/stack_trace?top_facet=transactionUiName&primary_facet=error.class&barchart=barchart&_k=07w3d3

    KeyError: 'extra'

For the link in comment 0, the taskgraph URL is:
https://queue.taskcluster.net/v1/task/JQhjcIXeQhinwpCX8TtzjA/artifacts/public/full-task-graph.json

One entry within is missing 'extra':

    "signing-nightly-fennec": {
      "attributes": {
        "kind": "signing"
      },
      "dependencies": {
        "build-nightly-fennec": "build-nightly-fennec"
      },
      "kind_implementation": "taskgraph.task.signing:SigningTask",
      "label": "signing-nightly-fennec",
      "task": {
        "created": {
          "relative-datestamp": "0 seconds"
        },
        "deadline": {
          "relative-datestamp": "24 hours"
        },
        "metadata": {
          "description": "Testing the signing scriptworker",
          "name": "Signing Scriptworker Task",
          "owner": "amiyaguchi@mozilla.com",
          "source": "https://tools.taskcluster.net/task-creator/"
        },
        "payload": {
          "maxRunTime": 600,
          "unsignedArtifacts": [
            {
              "task-reference": "https://queue.taskcluster.net/v1/task/<build-nightly-fennec>/artifacts/public/build/target.apk"
            },
            {
              "task-reference": "https://queue.taskcluster.net/v1/task/<build-nightly-fennec>/artifacts/public/build/en-US/target.apk"
            }
          ]
        },
        "provisionerId": "scriptworker-prov-v1",
        "scopes": [
          "project:releng:signing:cert:dep-signing",
          "project:releng:signing:format:jar"
        ],
        "workerType": "signing-linux-v1"
      }
    }

Which I'm guessing is a regression from bug 1277595.

1) Presumably the extra key should exist for every task type, so needs adding for signing-nightly-fennec
2) Ideally the mozilla-central TC code should enforce that the 'extra' key exists, so this gets caught upstream
3) Treeherder shouldn't 500 if the key is missing, and instead just skip that task (but still allow retriggering of the others)

This bug is about #3. Jordan - can I leave you to file TC/... bugs about #1/#2? Thanks :-)
Blocks: 1277595
Flags: needinfo?(jlund)
Summary: runnablejobs endpoint gets a server error 500 → Taskgraph entry for signing-nightly-fennec is missing the 'extra' key causing /runnable_jobs/ to HTTP 500
Keys without task.extra.treeherder are valid and just not reported to treeherder, so treeherder should probably 404 for those.
Based on what Dustin just said, I think approach 3) should be used: we just skip this task and move on to the next task present in the full-task-graph file. Also, if there are other issues with TC (e.g. file not found), we should still show the BB jobs. I'm not sure whether I accounted for this in the API, but I think I accounted for a few cases.
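For illustration, the "skip and move on" behaviour could look roughly like the sketch below. It is not Treeherder's actual code: the function name `runnable_jobs_from_taskgraph`, the shape of the returned dicts, and the `symbol` value in the sample build entry are all assumptions made for the example. Only the `task.extra.treeherder` path and the task labels come from this bug.

```python
def runnable_jobs_from_taskgraph(task_graph):
    """Collect Treeherder metadata per task, skipping tasks that carry none.

    `task_graph` is assumed to be the parsed full-task-graph.json:
    a dict mapping task label -> task graph entry.
    """
    jobs = []
    for label, node in task_graph.items():
        # Walk task -> extra -> treeherder without raising on missing keys.
        treeherder = node.get("task", {}).get("extra", {}).get("treeherder")
        if treeherder is None:
            # A task without task.extra.treeherder is valid; it is simply
            # not reported to Treeherder, so skip it and keep going.
            continue
        jobs.append({"label": label, "symbol": treeherder.get("symbol")})
    return jobs


# Hypothetical sample: the build entry's 'symbol' is made up for the example;
# the signing entry mirrors the one in this bug (no 'extra' key at all).
graph = {
    "build-nightly-fennec": {
        "task": {"extra": {"treeherder": {"symbol": "N"}}},
    },
    "signing-nightly-fennec": {
        "task": {"payload": {"maxRunTime": 600}},
    },
}
jobs = runnable_jobs_from_taskgraph(graph)
```

With this shape, the signing task is left out of the result while the build task is still listed, so the other jobs stay retriggerable.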
(In reply to Dustin J. Mitchell [:dustin] from comment #2)
> Keys without task.extra.treeherder are valid and just not reported to
> treeherder, so treeherder should probably 404 for those.

Based on this reply, I'm going to clear the needinfo for now as #1 and #2 seem inapplicable. Happy to apply a fix though if a consensus is reached.
Flags: needinfo?(jlund)
Comment on attachment 8788777
[treeherder] KWierso:1300795 > mozilla:master

Ed how about this? I think this is what you want? If an exception happens while trying to pull information out of a job in the task graph, we just pass over that particular job.
Attachment #8788777 - Flags: review?(emorley)
I'm going to jump on this, because it's affecting people. I like the idea of KWierso's fix, so I'm going to base my solution on that.
Assignee: nobody → wlachance
Attached file PR
This is the same idea as KWierso's fix, but a little bit more restricted in scope so we don't obscure other errors. I have deployed this to stage and things are working, e.g. https://treeherder.allizom.org/api/project/try/runnable_jobs/?decisionTaskID=DbRp1i8pRtO7j_VPR7-Jqw (warning: may make Firefox slow due to volume, but that's nothing new I don't think)
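To make "more restricted in scope" concrete, here is a rough sketch of the difference between a blanket exception handler and one that only covers the missing-key case. This is not the code from the actual commit; `collect_jobs`, the `skipped` list, and the `groupSymbol` lookup are hypothetical names chosen for the example.

```python
def collect_jobs(task_graph):
    """Return (jobs, skipped_labels) for a parsed full-task-graph dict."""
    jobs, skipped = [], []
    for label, node in task_graph.items():
        try:
            # Only this lookup is guarded, and only against KeyError.
            treeherder = node["task"]["extra"]["treeherder"]
        except KeyError:
            # Missing 'extra' (or 'treeherder'): skip just this task.
            skipped.append(label)
            continue
        # Everything else runs outside the try block, so unrelated bugs
        # (wrong types, bad data) still surface instead of being hidden.
        jobs.append({"label": label, "group": treeherder.get("groupSymbol")})
    return jobs, skipped


# The signing entry from this bug has no 'extra' key, so it is skipped.
jobs, skipped = collect_jobs({
    "signing-nightly-fennec": {"task": {"payload": {"maxRunTime": 600}}},
})

# A malformed entry (None instead of a dict) is a different kind of error
# and is deliberately not swallowed by the KeyError handler.
try:
    collect_jobs({"broken": None})
    malformed_raised = False
except TypeError:
    malformed_raised = True
```

Catching only `KeyError` around the one lookup keeps the skip behaviour while letting any other failure show up in error reporting as before.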
Attachment #8788777 - Attachment is obsolete: true
Attachment #8788777 - Flags: review?(emorley)
Attachment #8788918 - Flags: review?(emorley)
Attachment #8788918 - Flags: review?(cdawson)
Attachment #8788918 - Flags: review?(cdawson) → review+
Comment on attachment 8788918
PR

I don't think we need :emorley's review now (though of course he's still welcome to provide feedback) :)
Attachment #8788918 - Flags: review?(emorley)
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Accidentally attached the pull requests / commits to the wrong bug. Here's the actual fix: https://github.com/mozilla/treeherder/commit/2f8ffca016735bb944f0e1e852fa6b252d800dac