Bug 1369671 (Closed) — Opened 8 years ago, Closed 8 years ago

HTTPError: 403 Client Error: Forbidden for url: http://taskcluster/queue/v1/task/AA_wTrLEQZicpyP-VEvPXQ

Categories: Taskcluster :: General
Type: defect
Priority: Not set
Severity: normal

Tracking

(Not tracked)

RESOLVED WORKSFORME

People

(Reporter: cbook, Unassigned)

Details

https://treeherder.mozilla.org/logviewer.html#?job_id=103999640&repo=mozilla-inbound was one of 2 instances where the decision task failed with:

HTTPError: 403 Client Error: Forbidden for url: http://taskcluster/queue/v1/task/AA_wTrLEQZicpyP-VEvPXQ

Maybe it's worth looking into it.

Traceback (most recent call last):
[task 2017-06-02T07:47:19.265717Z]   File "/home/worker/checkouts/gecko/taskcluster/mach_commands.py", line 164, in taskgraph_decision
[task 2017-06-02T07:47:19.266355Z]     return taskgraph.decision.taskgraph_decision(options)
[task 2017-06-02T07:47:19.266499Z]   File "/home/worker/checkouts/gecko/taskcluster/taskgraph/decision.py", line 140, in taskgraph_decision
[task 2017-06-02T07:47:19.267239Z]     create_tasks(tgg.morphed_task_graph, tgg.label_to_taskid, parameters)
[task 2017-06-02T07:47:19.267864Z]   File "/home/worker/checkouts/gecko/taskcluster/taskgraph/create.py", line 89, in create_tasks
[task 2017-06-02T07:47:19.267993Z]     f.result()
[task 2017-06-02T07:47:19.268661Z]   File "/home/worker/checkouts/gecko/python/futures/concurrent/futures/_base.py", line 398, in result
[task 2017-06-02T07:47:19.269263Z]     return self.__get_result()
[task 2017-06-02T07:47:19.269405Z]   File "/home/worker/checkouts/gecko/python/futures/concurrent/futures/thread.py", line 55, in run
[task 2017-06-02T07:47:19.270221Z]     result = self.fn(*self.args, **self.kwargs)
[task 2017-06-02T07:47:19.270319Z]   File "/home/worker/checkouts/gecko/taskcluster/taskgraph/create.py", line 108, in create_task
[task 2017-06-02T07:47:19.271163Z]     res.raise_for_status()
[task 2017-06-02T07:47:19.271305Z]   File "/home/worker/checkouts/gecko/python/requests/requests/models.py", line 840, in raise_for_status
[task 2017-06-02T07:47:19.271913Z]     raise HTTPError(http_error_msg, response=self)
[task 2017-06-02T07:47:19.272012Z] HTTPError: 403 Client Error: Forbidden for url: http://taskcluster/queue/v1/task/ZW061-lSR72x6Jw3f6ktgg
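For context, the failing call is the decision task creating each task in the graph through the taskcluster proxy, which attaches the decision task's credentials (and so its scopes) to the request. A rough sketch of that code path, simplified from what create.py does (the real code handles sessions, concurrency, and retries differently):

    import json
    import requests

    def create_task(session, task_id, task_def):
        # The taskcluster proxy at http://taskcluster/ signs the request with
        # the decision task's credentials, so the queue checks *its* scopes.
        res = session.put('http://taskcluster/queue/v1/task/' + task_id,
                          data=json.dumps(task_def))
        # A 403 here means those credentials did not cover a scope the task
        # definition requires (for example, one of its routes).
        res.raise_for_status()
        return res.json()

    # Hypothetical usage:
    # create_task(requests.Session(), 'AA_wTrLEQZicpyP-VEvPXQ', {'routes': []})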
Flags: needinfo?(jopsen)
The error is:

[task 2017-06-02T08:58:26.944108Z] to have one of the following sets of scopes:
[task 2017-06-02T08:58:26.944488Z] [
[task 2017-06-02T08:58:26.944529Z]   [
[task 2017-06-02T08:58:26.944936Z]     "queue:route:tc-treeherder.v2.mozilla-inbound.db0bd0d1b8fdd78de2132c4f005e6005ea7a39ce.90970",
[task 2017-06-02T08:58:26.945325Z]     "queue:route:tc-treeherder-stage.v2.mozilla-inbound.db0bd0d1b8fdd78de2132c4f005e6005ea7a39ce.90970"
[task 2017-06-02T08:58:26.945367Z]   ]
[task 2017-06-02T08:58:26.945399Z] ]
[task 2017-06-02T08:58:26.945426Z]
[task 2017-06-02T08:58:26.945455Z] You only have the scopes:
[task 2017-06-02T08:58:26.945489Z] [
...
[task 2017-06-02T08:58:26.957033Z]   "queue:route:tc-treeherder-stage.mozilla-inbound.*",
[task 2017-06-02T08:58:26.957077Z]   "queue:route:tc-treeherder-stage.try.*",
[task 2017-06-02T08:58:26.957118Z]   "queue:route:tc-treeherder-stage.v2.mozilla-inbound.*",
[task 2017-06-02T08:58:26.957149Z]   "queue:route:tc-treeherder-stage.v2.try.*",
[task 2017-06-02T08:58:26.957174Z]   "queue:route:tc-treeherder.mozilla-inbound.*",
[task 2017-06-02T08:58:26.957205Z]   "queue:route:tc-treeherder.try.*",
[task 2017-06-02T08:58:26.957245Z]   "queue:roqueue:scheduler-r.v2.mozilla-inbound.*",
[task 2017-06-02T08:58:26.957280Z]   "queue:route:tc-treeherder.v2.try.*",
[task 2017-06-02T08:58:26.957318Z]   "queue:scheduler-id:gecko-level-1",
[task 2017-06-02T08:58:26.957354Z]   "queue:scheduler-id:gecko-level-2",
[task 2017-06-02T08:58:26.957400Z]   "queue:scheduler-id:gecko-level-3",
[task 2017-06-02T08:58:26.957442Z]   "scheduler:create-task-graph",
[task 2017-06-02T08:58:26.957470Z]   "scheduler:extend-task-graph:*",
[task 2017-06-02T08:58:26.957493Z]   "secrets:get:garbage/*",
[task 2017-06-02T08:58:26.957516Z]   "secrets:get:project/releng/gecko/build/level-1/*",
[task 2017-06-02T08:58:26.957542Z]   "secrets:get:project/releng/gecko/build/level-2/*",
[task 2017-06-02T08:58:26.957578Z]   "secrets:get:project/releng/gecko/build/level-3/*",
[task 2017-06-02T08:58:26.957625Z]   "secrets:get:project/taskcluster/gecko/build/level-2/*",
[task 2017-06-02T08:58:26.957679Z]   "secrets:get:project/taskcluster/gecko/build/level-3/*",
[task 2017-06-02T08:58:26.957729Z]   "secrets:get:project/taskcluster/gecko/hgfingerprint",
[task 2017-06-02T08:58:26.957766Z]   "secrets:set:garbage/*"
[task 2017-06-02T08:58:26.957792Z] ]

So something's funny here: the route used in the task has "v2" before the tree name, but there's no corresponding scope in the list. That said, the role *does* have that route:

https://tools.taskcluster.net/auth/roles/#repo:hg.mozilla.org%252fintegration%252fmozilla-inbound:*

assume:moz-tree:level:3
index:insert-task:buildbot.branches.mozilla-inbound.*
index:insert-task:buildbot.revisions.*
index:insert-task:docker.images.v1.mozilla-inbound.*
index:insert-task:gecko.v2.mozilla-inbound.*
queue:route:coalesce.v1.builds.mozilla-inbound.*
queue:route:index.buildbot.branches.mozilla-inbound.*
queue:route:index.buildbot.revisions.*
queue:route:index.docker.images.v1.mozilla-inbound.*
queue:route:index.gecko.v2.mozilla-inbound.*
queue:route:tc-treeherder-stage.mozilla-inbound.*
queue:route:tc-treeherder-stage.v2.mozilla-inbound.*
queue:route:tc-treeherder.mozilla-inbound.*
queue:route:tc-treeherder.v2.mozilla-inbound.*

So something else is funny here:

"queue:roqueue:scheduler-r.v2.mozilla-inbound.*",
"queue:route:tc-treeherder.v2.mozilla-inbound.*"
++++++++++++++++

I don't know if that's an artifact of logging? Some pointer issue in the queue?
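For reference on where the required scope set comes from: every route in a task definition requires a matching queue:route:<route> scope on the caller, which is why the two treeherder routes on this task show up as required scopes. A minimal illustration (hypothetical helper name, not the queue's actual code):

    def required_route_scopes(task_def):
        # Each route on a task translates into a required queue:route:* scope.
        return ['queue:route:' + route for route in task_def.get('routes', [])]

    task_def = {'routes': [
        'tc-treeherder.v2.mozilla-inbound.db0bd0d1b8fdd78de2132c4f005e6005ea7a39ce.90970',
        'tc-treeherder-stage.v2.mozilla-inbound.db0bd0d1b8fdd78de2132c4f005e6005ea7a39ce.90970',
    ]}
    print(required_route_scopes(task_def))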
The role says it was last edited 3 months ago, so it's not a short-term copy/paste error in the role UI.
> "queue:roqueue:scheduler-r.v2.mozilla-inbound.*", @dustin, looks a lot like log-lines being mangled, I've seen that before. But not cross the \n boundaries.
Flags: needinfo?(jopsen)
My guess is that whatever expands the client scopes is doing so in a parallel fashion, with no mutex to serialise the writes to the array of scopes, such that two appends interfere with each other, leaving "queue:roqueue:scheduler-r.v2.mozilla-inbound.*". However, if that were the case, it is strange that the operations don't appear to corrupt memory and cause a crash.

Note, the scope requirement is for *both* scopes to exist:

queue:route:tc-treeherder.v2.mozilla-inbound.db0bd0d1b8fdd78de2132c4f005e6005ea7a39ce.90970
queue:route:tc-treeherder-stage.v2.mozilla-inbound.db0bd0d1b8fdd78de2132c4f005e6005ea7a39ce.90970

The wording of the error message is somewhat confusing here, since it assumes the reader is familiar with scope sets, which is somewhat of an internal concept. That is a separate matter, but I wanted to call it out in case it confused anyone reading this bug.

In general it is true that the scopes listed in comment 1 do not satisfy the scope set also listed in comment 1 (since "queue:route:tc-treeherder.v2.mozilla-inbound.db0bd0d1b8fdd78de2132c4f005e6005ea7a39ce.90970" is not satisfied by any given scope).

My suspicion is that "queue:roqueue:scheduler-r.v2.mozilla-inbound.*" should actually be the two scopes:

"queue:route:tc-treeherder.v2.mozilla-inbound.*"
"queue:scheduler-r.v2.mozilla-inbound.*"

Where is the code that generates the array of given scopes?
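For anyone not familiar with scope sets: the requirement is a list of alternative sets, the client must satisfy every scope in at least one set, and a granted scope ending in "*" satisfies any scope sharing that prefix. A rough sketch of that check (illustrative only, not the auth/queue service's actual implementation):

    def satisfies(granted, required):
        # A granted scope ending in '*' matches any required scope with that prefix.
        if granted.endswith('*'):
            return required.startswith(granted[:-1])
        return granted == required

    def satisfies_scope_sets(granted_scopes, scope_sets):
        # The caller must satisfy *every* scope in at least *one* of the sets.
        return any(
            all(any(satisfies(g, r) for g in granted_scopes) for r in scope_set)
            for scope_set in scope_sets
        )

    required = [[
        'queue:route:tc-treeherder.v2.mozilla-inbound.db0bd0d1b8fdd78de2132c4f005e6005ea7a39ce.90970',
        'queue:route:tc-treeherder-stage.v2.mozilla-inbound.db0bd0d1b8fdd78de2132c4f005e6005ea7a39ce.90970',
    ]]
    granted = [
        'queue:roqueue:scheduler-r.v2.mozilla-inbound.*',        # the mangled entry
        'queue:route:tc-treeherder-stage.v2.mozilla-inbound.*',
    ]
    print(satisfies_scope_sets(granted, required))  # False

With the scopes from comment 1, the first required route scope is unmatched by anything in the granted list, hence the 403.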
Flags: needinfo?(jopsen)
The array of given scopes comes from the auth service, in response to the `authenticateHawk` call. I agree that there should have been two scopes, but the second is "queue:scheduler-id:gecko-level-3", not "queue:scheduler-r.v2.mozilla-inbound.*". And that second scope *is* present in the "you only have" list.

Array operations in JS would be working on string objects (via pointer), so the strings themselves wouldn't be mangled. However, something that operates on strings might do this. When the error message is constructed, the array is sorted first, and the "roqueue" line is out of order while the others are properly ordered, so I tend to agree with Jonas -- this is likely occurring at log time. But if that's the case, why this error?

If this has only occurred once, I'm tempted to chalk it up to cosmic rays and/or bad RAM causing some corruption in the Node process that both caused the error and then caused the mangled output (and hopefully crashed the EC2 instance next).
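Purely as arithmetic on the strings (speculation about how the interleaving could have happened, not something the logs confirm), the mangled entry does split cleanly into a prefix of the missing route scope, a prefix of the scheduler-id scope, and the tail of the route scope:

    mangled = "queue:roqueue:scheduler-r.v2.mozilla-inbound.*"
    route_scope = "queue:route:tc-treeherder.v2.mozilla-inbound.*"  # missing from the sorted list
    sched_scope = "queue:scheduler-id:gecko-level-3"                # present in the list

    # One possible interleaving: 8 chars of the route scope, 16 chars of the
    # scheduler-id scope, then the remainder of the route scope.
    print(mangled == route_scope[:8] + sched_scope[:16] + route_scope[24:])  # True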
Status: NEW → RESOLVED
Closed: 8 years ago
Flags: needinfo?(jopsen)
Resolution: --- → WORKSFORME