secrets service does not correctly evaluate scopeset satisfaction when authorized scopes contains assume scopes

RESOLVED FIXED

Status

RESOLVED FIXED
3 years ago
3 years ago

People

(Reporter: dustin, Assigned: jonasfj)

Tracking

Details

Attachments

(5 attachments)

https://tools.taskcluster.net/task-inspector/#MJfHXooPSDOahg3ZJWDrtg/
{
  "image": "taskcluster/desktop-test:0.5.2",
  "command": [
    "sh",
    "-c",
    "curl http://taskcluster/secrets/v1/secrets/garbage/dustin/foo"
  ],
  "maxRunTime": 300,
  "features": {
    "taskclusterProxy": true
  }
}

gives

[taskcluster] === Task Starting ===
{
  "message": "Unauthorized: Bad mac"
}[taskcluster] === Task Finished ===

yet with the new, shiny secrets UI I can confirm that the secret does, indeed, exist.  Also, none of the other failure modes we've been able to reproduce with direct access to secrets.taskcluster.net (no such secret, unauthorized) result in "Bad mac".

According to papertrail, the requests never make it to the secrets service (I see no logging from the time the task executed)

Am I using the wrong URL somehow?
Blocks: 1231320
It works to run the proxy locally:

dustin@dustin-tc-devel ~/p/taskcluster-proxy [master] $ curl 172.17.0.5:80/secrets/v1/secrets/garbage/dustin/foo
{
  "secret": {
    "foo": "bar"
  },
  "expires": "2015-12-29T15:27:37.253Z"
}

This is using my TC permacreds, so I suspect that the proxy isn't getting the right credentials passed to it -- in fact, I'm guessing that this is due to the use of temporary credentials in the workers..
Depends on: 1235399
I'll see if this is fixed now with bug 1235399 being fixed.
The url was not quite right, it should be:
  * http://taskcluster/secrets/v1/secret/garbage/dustin/foo
rather than:
  * http://taskcluster/secrets/v1/secrets/garbage/dustin/foo

I tested locally with the latest official release of taskcluster-proxy[1], and it is now working fine:

Session 1:

$ taskcluster-proxy -p 60022 KVnDzQ8iS6OqoulibdRLbw

Session 2:

$ curl http://localhost:60022/secrets/v1/secret/garbage/pmoore/foo
{
  "secret": {
    "pete": "moore"
  },
  "expires": "2016-01-21T15:33:56.645Z"
}

1) https://github.com/taskcluster/taskcluster-proxy/releases/tag/v3.0.4


It may be that we need to wait a while for the instances to pick up version 3.0.4 (which is also currently tagged as "latest" on dockerhub).
secrets -> secret was a recent change to the service

I'm still not seeing this work:

https://tools.taskcluster.net/task-inspector/#PLkqZwqzT62mhvwfCZ58uA/0
Oh, that's because the temp creds are still those of the worker, rather than those of the task.  And the worker doesn't have any secrets scopes at all.  I fixed that temporarily, creating https://tools.taskcluster.net/auth/roles/#worker-type:aws-provisioner-v1%252fami-test and granting it secrets:get:*.

https://tools.taskcluster.net/task-inspector/#OuKSAEzbTPSmU_LQhCib1A/0

Still didn't work.  What am I missing here?
Flags: needinfo?(pmoore)
Maybe an old version of the taskcluster-proxy? It might be due to the age of the instance. You could spit out the http response headers to see if the version of the proxy is included. Ideally it should be 3.0.4 or 3.0.5...
Flags: needinfo?(pmoore)
I totally should have followed the link before making assumptions. I see it is 3.0.4.... Hmmmmm.
https://tools.taskcluster.net/task-inspector/#aSWbFAodROOEMQcCE1Lwpg/1 at least shows the error: "Bad mac". But I'm using Linux! ;)

And that's with the worker role granted secrets:get:garbage/*.  Even if that was wrong, I would expect something different from "Bad mac", which suggests it's doing the hashing wrong or not supplying a cert.

So, I think something remains wrong here..
Depends on: 1220738

Comment 9

3 years ago
Most likely this is caused by the worker using the temp creds provided by the provisioner and not injecting the certificate into the proxy when starting up.

If you want to test this out, you can either add perma creds to that worker type or use a worker type that has them already (such as gaia)

This should be fixed up when bug 1220738 is fixed where it will use the temp creds provided by the queue when claiming/reclaiming a task.
(In reply to Greg Arndt [:garndt] from comment #9)
> Most likely this is caused by the worker using the temp creds provided by
> the provisioner and not injecting the certificate into the proxy when
> starting up.

Ah, interesting - thanks Greg! That would indeed explain a bad mac, if temp creds were being used but the certificate wasn't getting added to the ext field inside the base64 decoded Authorization http header...

I couldn't find the code in docker-worker that sets task.runtime.taskcluster.{clientId,accessToken,certificate} to the values from claim.credentials.{clientId,accessToken,certificate} - can you tell me where that is done?

If that is already done (and I just didn't spot it) I think we can fix this bug with a patch like this:

diff --git a/lib/features/taskcluster_proxy.js b/lib/features/taskcluster_proxy.js
index 3cdc3bf..d46c187 100644
--- a/lib/features/taskcluster_proxy.js
+++ b/lib/features/taskcluster_proxy.js
@@ -34,6 +34,7 @@ export default class TaskclusterProxy {
     var cmd = [
         '--client-id=' + task.runtime.taskcluster.clientId,
         '--access-token=' + task.runtime.taskcluster.accessToken,
+        '--certificate=' + task.runtime.taskcluster.certificate,
         task.status.taskId,
         assumedScope
     ];

I searched in docker-worker (taskcluster org, master branch) for references to 'claim.credentials' but couldn't find any, so wasn't quite sure how the credentials in http://schemas.taskcluster.net/queue/v1/task-claim-response.json# were getting consumed.

Thanks!
Flags: needinfo?(garndt)
Note: we should also test that all is ok when used with permanent credentials, i.e. if certificate does not exist (null/undefined/empty string??) what command this generates for launching the proxy, and whether the proxy interprets it correctly (i.e. if it generates `taskcluster-proxy .... --certificate= ....` whether that actually works, and the proxy interprets that as meaning permanent credentials, and does not set a certificate at all (rather than an empty string certificate, which would be invalid json).

Maybe simplest is to adapt the patch above to check if task.runtime.taskcluster.certificate is a non-empty string (ideally valid json) that it is passed through as a parameter, otherwise not....
Ahhhh - the temp creds used by the provisioner, not by the queue - my bad - should have read it properly! :)
Created attachment 8710400 [details] [review]
Github Pull Request for docker-worker

This patch passes through the certificate as a command line parameter to taskcluster-proxy command, if it is set.
Flags: needinfo?(garndt)
Attachment #8710400 - Flags: review?(garndt)

Comment 14

3 years ago
Commits pushed to master at https://github.com/taskcluster/docker-worker

https://github.com/taskcluster/docker-worker/commit/eaf95bf091d8a4e6d6b1fd0e865cdc1765db7e06
Bug 1234929 - pass certificate to taskcluster-proxy if it is set

https://github.com/taskcluster/docker-worker/commit/9de6d1a1c36c3b93a00e0f888254a7e21f4bb083
Merge pull request #206 from taskcluster/bug1234929

Bug 1234929 - pass certificate to taskcluster-proxy if it is set

Updated

3 years ago
Attachment #8710400 - Flags: review?(garndt) → review+
That's enough for bug 1231320 then -- once this new docker-worker is deployed to ami-test, we can re-try the test in comment 8.  Once it's deployed to

queue:create-task:aws-provisioner-v1/android-api-11
queue:create-task:aws-provisioner-v1/b2g-desktop-*
queue:create-task:aws-provisioner-v1/b2gbuild*
queue:create-task:aws-provisioner-v1/b2gtest*
queue:create-task:aws-provisioner-v1/dbg-*
queue:create-task:aws-provisioner-v1/desktop-test*
queue:create-task:aws-provisioner-v1/dolphin
queue:create-task:aws-provisioner-v1/emulator-*
queue:create-task:aws-provisioner-v1/flame-kk*
queue:create-task:aws-provisioner-v1/gecko-decision
queue:create-task:aws-provisioner-v1/mulet-opt
queue:create-task:aws-provisioner-v1/opt-*
queue:create-task:aws-provisioner-v1/spidermonkey

we can try this for real for bug 1231320.
https://tools.taskcluster.net/task-inspector/#WA2SDmaCTMiw_-J7V5CI7A/ -- might expire before the other tasks finish it, but it can be re-triggered then.
Well, it ran, but failed.

Role `worker-type:aws-provisioner-v1/ami-test` has `secrets:get:garbage/*`.
The task has scope `secrets:get:garbage/*`.
Yet we get 401'd with
{
  "message": "ext.authorizedScopes oversteps your scopes"
}
taskcluster-proxy adds additionalScopes and task.scopes to get authorizedScopes.

Per https://github.com/taskcluster/docker-worker/blob/master/lib/features/taskcluster_proxy.js#L31
    var assumedScope = 'assume:worker-id:' + task.runtime.workerGroup +
                       '/' + task.runtime.workerId;
and
  https://tools.taskcluster.net/task-inspector/#OM9Tup2qSAuB2E3C6dBmDw
that should be ['assume:worker-id:us-west-2a/i-1ce029db', 'secrets:get:garbage:/*'].

I don't remember why we add the `assume:worker-id:*` and I suspect that's unnecessary, but setting that aside..

The proxy gets the worker's temporary credentials, which come from the aws provisioner and have scopes 
  "scopes": [
    "assume:worker-type:aws-provisioner-v1/ami-test",
    "assume:worker-id:*"
  ]
per https://tools.taskcluster.net/aws-provisioner/#ami-test/view

Per https://tools.taskcluster.net/auth/roles/#worker-type:aws-provisioner-v1%252fami-test, that expands to ['assume:worker-type:aws-provisioner-v1/ami-test', 'assume:worker-id:*', 'secrets:get:garbage/*'] (I added 'assume:worker-id:*' for the moment just to see if it helped; and in fact it expands to a lot more scopes than that because of some *-suffixed roles, but that doesn't matter here).

So the authenticateHawk endpoint is looking at authorizedScopes =
  ['assume:worker-id:us-west-2a/i-1ce029db', 'secrets:get:garbage:/*']
and determining that it is not satisfied by 
  ['assume:worker-type:aws-provisioner-v1/ami-test', 'assume:worker-id:*', 'secrets:get:garbage/*']

and here's the part where a smart person would delete this comment, but hey, we work in the open, right?  There's a missing `:`.
Nope, lies.  The stray `:` was just a typo in the bug comment.

So the authenticateHawk endpoint is looking at authorizedScopes =
  ['assume:worker-id:us-west-2a/i-1ce029db', 'secrets:get:garbage/*']
and determining that it is not satisfied by 
  ['assume:worker-type:aws-provisioner-v1/ami-test', 'assume:worker-id:*', 'secrets:get:garbage/*']

and that doesn't make any sense.

Does anyone see what I'm missing?
(In reply to Dustin J. Mitchell [:dustin] from comment #17)
> Well, it ran, but failed.
> 
> Role `worker-type:aws-provisioner-v1/ami-test` has `secrets:get:garbage/*`.
> The task has scope `secrets:get:garbage/*`.
> Yet we get 401'd with
> {
>   "message": "ext.authorizedScopes oversteps your scopes"
> }

This is a classic case of an error message not providing relevant content - all our error messages should provide the details of the problem they report - e.g. "ext.authorizedScopes oversteps your scopes; client scopes: 'a', 'b', 'c'; ext.authorizedScopes: 'a', 'c', 'd'.".

This is unfortunately a repeating pattern that we have in our error messages - I'll raise a bug to fix this error message - but we should all be extra careful when creating error messages that the pertinent context to the error is also reported in the error message.
See Also: → bug 1241808
Bug 1241808 raised to fix the error message.
Note, the proxy also records the request payloads it sends, so we'd be able to base64 decode the Authorization header to see what authorizedScopes were set and which scopes were defined in the temporary certificate - however at the moment these logs are not sent to e.g. papertrail.

One option would be to hook taskcluster-proxy up to papertrail to capture this, to get the details about which scope(s) were in ext.authorizedScopes vs what were listed in the scopes of the temporary certificate. This is an alternative to waiting for bug 1241808 to be resolved.
Created attachment 8710969 [details] [review]
Github Pull Request for taskcluster-proxy
Attachment #8710969 - Flags: review?(dustin)
Either the debug flag could be enabled in a worker for this:
https://github.com/taskcluster/docker-worker/blob/master/lib/features/taskcluster_proxy.js#L70

Or we could kick off a task that doesn't end immediately, ssh in and do "docker logs" on the proxy container to see what's happening.
Attachment #8710969 - Flags: review?(dustin) → review+
I'll try the `docker logs` trick shortly.

Pete, what do you think about making the scope headers optional, via some switch in the task?  They are likely to be large in many cases, and useful only for debugging.  Or maybe I'm over-engineering, since they are transferred via localhost..
(In reply to Pete Moore [:pmoore][:pete] from comment #20)
> This is a classic case of an error message not providing relevant content -
> all our error messages should provide the details of the problem they report
> - e.g. "ext.authorizedScopes oversteps your scopes; client scopes: 'a', 'b',
> 'c'; ext.authorizedScopes: 'a', 'c', 'd'.".

It's important to be a little bit careful with authorization-failure messages, as otherwise they can leak information.  For example, if we add more to the "Bad mac" error, an attacker might find it useful to incrementally reverse-engineer the hash.  I think this case would be OK, since the authorizedScopes are in plaintext in the request and the client scopes are just an un-authenticated API call away.  I just wanted to raise the general concern :)

Comment 27

3 years ago
Commits pushed to master at https://github.com/taskcluster/taskcluster-proxy

https://github.com/taskcluster/taskcluster-proxy/commit/8b43e347babc2d36858b7f43f637f61f4211e4c4
Bug 1234929 - add X-Taskcluster-Proxy-Temp-Scopes and X-Taskcluster-Authorized-Scopes headers to proxy responses

https://github.com/taskcluster/taskcluster-proxy/commit/8e32e09396c0fb4d40bc8f8962cdb43d9d3f3ffd
Merge pull request #12 from taskcluster/add-scopes-headers-to-proxy-response

Bug 1234929 - add extra headers for troubleshooting scopes issues
I've created 3.0.6 release of taskcluster-proxy and pushed to dockerhub (https://github.com/taskcluster/taskcluster-proxy#making-a-release) - so should be ready for testing now.

Please note I didn't docker push 'latest' tag, we can do that if it runs ok on ami-test worker type, I would say.
(In reply to Greg Arndt [:garndt] from comment #24)
> Either the debug flag could be enabled in a worker for this:
> https://github.com/taskcluster/docker-worker/blob/master/lib/features/
> taskcluster_proxy.js#L70
> 
> Or we could kick off a task that doesn't end immediately, ssh in and do
> "docker logs" on the proxy container to see what's happening.

awesome tips, thanks greg!
(In reply to Dustin J. Mitchell [:dustin] from comment #25)
> I'll try the `docker logs` trick shortly.
> 
> Pete, what do you think about making the scope headers optional, via some
> switch in the task?  They are likely to be large in many cases, and useful
> only for debugging.  Or maybe I'm over-engineering, since they are
> transferred via localhost..

Yeah, I was thinking the same. It is a tough call. In some ways it is nice to consistently provide data so when problems occur, you don't need to dig around to find out how to get the extra data you want - but at the same time, we don't want to overload the headers or impact performance for the 99% of the time we don't care.

Another option might be to only add them for non-200 (or maybe just 401 and 403) responses.

Ideally they'll end up in the auto error responses, and then maybe we can ditch them entirely from the proxy.

Another option is that you could pass a "debug" request header, and if you ever pass that, you are asking to get debug headers back. That would save having them for every call in the task, and just for the ones you want. My hesitation for adding something like that is though that often a barrier to solving a problem can be knowing that you just need to add a debug header, or set a task option, or add a DEBUG variable etc. Hmmmmm.
(In reply to Dustin J. Mitchell [:dustin] from comment #26)
> (In reply to Pete Moore [:pmoore][:pete] from comment #20)
> > This is a classic case of an error message not providing relevant content -
> > all our error messages should provide the details of the problem they report
> > - e.g. "ext.authorizedScopes oversteps your scopes; client scopes: 'a', 'b',
> > 'c'; ext.authorizedScopes: 'a', 'c', 'd'.".
> 
> It's important to be a little bit careful with authorization-failure
> messages, as otherwise they can leak information.  For example, if we add
> more to the "Bad mac" error, an attacker might find it useful to
> incrementally reverse-engineer the hash.  I think this case would be OK,
> since the authorizedScopes are in plaintext in the request and the client
> scopes are just an un-authenticated API call away.  I just wanted to raise
> the general concern :)

Yeah, agree with this security caveat, good call. For the setting of headers I decided to just add the temporary scopes and authorized scopes for this reason, rather than the full Authorization header, as I figured these shouldn't be a security risk - whereas other content in the Authorization header could indeed be.
Yeah, let's leave the scope headers in there unconditionally.  And comment 25 would make a good bug in the "auth" component :)
OK, with 3.0.6:

< HTTP/1.1 401 Unauthorized                                                                                             
< Access-Control-Allow-Headers: X-Requested-With,Content-Type,Authorization,Accept,Origin                               
< Access-Control-Allow-Methods: OPTIONS,GET,HEAD,POST,PUT,DELETE,TRACE,CONNECT                                          
< Access-Control-Allow-Origin: *                                                                                        
< Access-Control-Request-Method: *                                                                                      
< Connection: keep-alive                                                                                                
< Content-Length: 61                                                                                                    
< Content-Type: application/json; charset=utf-8                                                                         
< Date: Fri, 22 Jan 2016 22:13:16 GMT                                                                                   
< Etag: W/"3d-80f8c3af"                                                                                                 
< Server: Cowboy                                                                                                        
< Via: 1.1 vegur                                                                                                        
< X-Powered-By: Express                                                                                                 
< X-Taskcluster-Authorized-Scopes: [assume:worker-id:us-west-1b/i-9b616b29 secrets:get:garbage/*]                       
< X-Taskcluster-Endpoint: https://secrets.taskcluster.net/v1/secret/garbage/pmoore/foo                                  
< X-Taskcluster-Proxy-Temp-Scopes: [assume:worker-type:aws-provisioner-v1/ami-test assume:worker-id:*]                  
< X-Taskcluster-Proxy-Version: Taskcluster proxy 3.0.6                 

as predicted.

The temp scopes are generated from https://tools.taskcluster.net/auth/clients/#aws-provisioner

which also has secrets:get:garbage/*

Jonas, what peculiarity of scope/role expansion are we missing here?
Flags: needinfo?(jopsen)
Created attachment 8711327 [details] [review]
Github Pull Request taskcluster/taskcluster-proxy - enhancements

I'm afraid I couldn't help myself - I cleaned up the tests and discovered a couple of issues in the process:

1) not consistently adding headers
2) proxy crash when using a malformed certificate, rather than returning 500 status code

I fixed these, and the updated tests now pass.

In addition, for the sake of consistency, when the proxy runs with permanent credentials, it reports its client id in http response header `X-Taskcluster-Proxy-Perm-ClientId`, to complement the `X-Taskcluster-Proxy-Temp-Scopes` header which is returned when running with temporary credentials.
Attachment #8711327 - Flags: review?(dustin)
(In reply to Dustin J. Mitchell [:dustin] from comment #33)
> OK, with 3.0.6:
> 
> < HTTP/1.1 401 Unauthorized                                                 
> 
> < Access-Control-Allow-Headers:
> X-Requested-With,Content-Type,Authorization,Accept,Origin                   
> 
> < Access-Control-Allow-Methods:
> OPTIONS,GET,HEAD,POST,PUT,DELETE,TRACE,CONNECT                              
> 
> < Access-Control-Allow-Origin: *                                            
> 
> < Access-Control-Request-Method: *                                          
> 
> < Connection: keep-alive                                                    
> 
> < Content-Length: 61                                                        
> 
> < Content-Type: application/json; charset=utf-8                             
> 
> < Date: Fri, 22 Jan 2016 22:13:16 GMT                                       
> 
> < Etag: W/"3d-80f8c3af"                                                     
> 
> < Server: Cowboy                                                            
> 
> < Via: 1.1 vegur                                                            
> 
> < X-Powered-By: Express                                                     
> 
> < X-Taskcluster-Authorized-Scopes: [assume:worker-id:us-west-1b/i-9b616b29
> secrets:get:garbage/*]                       
> < X-Taskcluster-Endpoint:
> https://secrets.taskcluster.net/v1/secret/garbage/pmoore/foo                
> 
> < X-Taskcluster-Proxy-Temp-Scopes:
> [assume:worker-type:aws-provisioner-v1/ami-test assume:worker-id:*]         
> 
> < X-Taskcluster-Proxy-Version: Taskcluster proxy 3.0.6                 
> 
> as predicted.
> 
> The temp scopes are generated from
> https://tools.taskcluster.net/auth/clients/#aws-provisioner
> 
> which also has secrets:get:garbage/*
> 
> Jonas, what peculiarity of scope/role expansion are we missing here?

This looks like it might still be a taskcluster-proxy bug...

I wrote a script to generate temporary credentials to match those given in production, and then:

1) used them to run taskcluster-proxy, and issued the curl request - the result was the same as in production with a 401 with message "ext.authorizedScopes oversteps your scopes"
2) used them to directly query the secrets service using the taskcluster-client-go, and instead got a 404 with message "Secret not found"

In other words, it looks like the proxy is not sending the right request - I'll add some debug to capture the http request headers to see what is going on....
Flags: needinfo?(jopsen)
My mistake - in my go program I forgot to explicitly limit the authorized scopes to the task scopes. When I do this, I get the same behaviour as dustin saw.

Two minor differences appeared though, which is unrelated to this problem, so I'll create a bug for it. It doesn't affect the outcome, but is a little strange. The difference is with the method line in the http request:

1) In the native client:
    `GET /v1/secret/garbage%2Fpmoore%2Ffoo HTTP/1.1`
2) In the proxy:
   `HEAD /v1/secret/garbage/pmoore/foo HTTP/1.1`

The difference in the path is because the curl statement in the task does not url encode the secret key, which is probably should. However, it is strange that in the proxy a HEAD request is made, and not a GET. That said, the body does get returned in the proxy response, so maybe internally the body is fetched separately (I don't remember at this point).

NI'ing Jonas again, as I was wrong about it being a taskcluster-proxy bug, I think it is weirdness in the validation of whether a scopeset is satisfied on the auth server side.
Flags: needinfo?(jopsen)
For reference, to reproduce locally, set TASKCLUSTER_CLIENT_ID and TASKCLUSTER_ACCESS_TOKEN env variables to permanent credentials, and `go run` this file:

------------

package main

import (
	"log"
	"os"
	"time"

	"github.com/taskcluster/taskcluster-client-go/secrets"
	"github.com/taskcluster/taskcluster-client-go/tcclient"
)

func main() {
	permaCreds := tcclient.Credentials{
		ClientId:    os.Getenv("TASKCLUSTER_CLIENT_ID"),
		AccessToken: os.Getenv("TASKCLUSTER_ACCESS_TOKEN"),
	}
	tempCreds, err := permaCreds.CreateTemporaryCredentials(
		1*time.Hour,
		"assume:worker-type:aws-provisioner-v1/ami-test",
		"assume:worker-id:*",
	)
	if err != nil {
		log.Printf("%s", err)
	}
	tempCreds.AuthorizedScopes = []string{
		"secrets:get:garbage/*",
		"assume:worker-id:us-west-1b/i-9b616b29",
	}
	mySecrets := secrets.New(tempCreds)
	secret, cs, err := mySecrets.Get("garbage/pmoore/foo")
	if err != nil {
		log.Printf("%v", cs.HttpRequest.Header)
		log.Print(cs.HttpRequestBody)
		log.Fatalf("%s", err)
	}
	log.Printf("%v", secret.Secret)
}
Component: Docker-Worker → Authentication
Summary: secrets API doesn't work via taskcluster-proxy → auth service does not correctly evaluate scopeset satisfaction when authorized scopes contains assume scopes
Please note, I'm assuming this is a problem with the auth service, as I guess there is an endpoint called which evaluates the scope satisfaction. If this is not the case, and the secrets service hits an auth endpoint to expand scopes, and then uses the taskcluster-client library to lexically evaluate the satisfaction against raw strings, the problem could also be on the secrets service side that it is not expanding the scopes first...

Now I think about it, this could be a more likely candidate...
Component: Authentication → Secrets
QA Contact: dustin
Summary: auth service does not correctly evaluate scopeset satisfaction when authorized scopes contains assume scopes → secrets service does not correctly evaluate scopeset satisfaction when authorized scopes contains assume scopes
Attachment #8711327 - Flags: review?(dustin) → review+
Component: Secrets → Authentication
QA Contact: dustin
Jonas said he is going to take a look at this...
Assignee: dustin → jopsen
(Assignee)

Comment 40

3 years ago
Created attachment 8711900 [details] [review]
Github PR for tc-lib-api

This will also need a fix in taskcluster-auth to provide an expandScopes method
Flags: needinfo?(jopsen)
Attachment #8711900 - Flags: review?(dustin)
Attachment #8711900 - Flags: review?(dustin) → review+
Created attachment 8712034 [details] [review]
Github Pull Request for taskcluster-auth
Dustin,

Are we safe to roll these changes out to production?

pmoore@Petes-iMac:~/git/taskcluster-auth master $ git log --pretty=oneline heroku/master..mozilla/master
c33ee5d6cb6fd0fb64e49bd3b661a467031e52e3 Merge taskcluster/taskcluster-auth:bug-1234929 (PR #33)
f1f78f13ff3f58b78a3b28ea83d9b0f1121d3923 don't make a client that expires immediately
3c7ecad89b6ba0d3cddc70403e9f1b9c1afdc593 Merge djmitche/taskcluster-auth:bug1242714 (PR #32)
0baa798c2d3fdd4dfe9265e57fda14fb5814817c Fix bug 1234929
32633e8eedd5080293e2e7e5acf0315238983d8d Merge pull request #31 from djmitche/testing-fixes
acaf960e08866041888792dd5d66d9ca6cfa0f67 Bug 1242714: reject expired clients
9e76bf4ec101e55ffa5e088768965210db8101dc Merge https://github.com/taskcluster/taskcluster-auth/pull/9
8e389a6e4be2c675bafba48b0acc4aa74b3e7123 don't require azure or influx credentials to run tests
dbd3109be15db69674a2dabc6b95ebc68d365b4f use newer azure-entities
25532d58509d205de7b8a41301db661150bae06f use newer taskcluster-lib-app for better logging
3864a198ad95592e0b8463c3a8309c976f77f9ba refactor server to use taskcluster-lib-loader
f9af332026a3dec08593dbe580f72def6dae07ec remove debug output
eb0d0b93c8b210cf0580b77b8772344488b69172 Revert "use an internally-generated root access token for tests"
89b37ec6c6341f8cfbb2240ba9a86bf01ae4cd60 add example of auth file
d65f36135479f1ef10c4b20119962296466d12f0 use an internally-generated root access token for tests
pmoore@Petes-iMac:~/git/taskcluster-auth master $ 


Pete
Flags: needinfo?(dustin)
I think so -- I don't know if there's a good way to test auth deployments, since if they break the whole world falls apart.  I'm willing to ship it and see..
Flags: needinfo?(dustin)
The world fell apart.  In fact, I think it was https://github.com/taskcluster/taskcluster-auth/pull/33 that broke things.  I saw errors like

{
  "message": "Authorization Failed",
  "error": {
    "info": "None of the scope-sets was satisfied",
    "scopesets": [
      [
        "aws-provisioner:list-worker-types"
      ]
    ],
    "scopes": [
      "assume:mozilla-user:dmitchell@mozilla.com",
      "assume:mozilla-group:scm_level_1",
      "assume:mozilla-group:scm_level_3",
      "assume:mozilla-group:scm_level_2",
      "assume:mozilla-group:team_relops",
      "assume:mozilla-group:shipit",
      "assume:mozilla-group:team_taskcluster",
      "assume:mozilla-group:shipitdev",
      "assume:mozilla-group:team_moco"
    ]
  }
}

which is to say, none of my scopes were expanded.  That makes sense, since the patch removes the `result.scopes = resolve(result.scopes)`.  That won't be necessary eventually, but it's still necessary now!!

Jonas, in terms of deploying these two patches, maybe it makes sense to move the local validator code out of tc-lib-api and into tc-auth?

I'm redeploying with bd16d2aa771d686cf1df192ff734e40d8c1387ec reverted/
Flags: needinfo?(jopsen)
Redeployment succeeded.
(Assignee)

Comment 47

3 years ago
> The world fell apart.  In fact, I think it was
> https://github.com/taskcluster/taskcluster-auth/pull/33 that broke things.
Yes, it'll have to be rolled out with the updated tc-lib-api, or it won't work.

> Jonas, in terms of deploying these two patches, maybe it makes sense to move the
> local validator code out of tc-lib-api and into tc-auth?
I think it gets used in tc-lib-api, for various tests... and in tc-lib-testing for mockAuthServer.
And mockAuthServer is then again used in tc-client.

We have tests for security in tc-client, tc-lib-api, and tc-auth. Since all of them tests
from different perspectives that not necessarily a bad idea. On the other hand I wouldn't oppose
work to move it. If that is feasible, but if tc-auth isn't the only dependent it might be bad.
Flags: needinfo?(jopsen)
I think that PR rolled this bug together with bug 1240452, and the bits for the latter didn't work.

Do you want to put up a separate PR for the updated tc-lib-api and the fix for this bug?

Updated

3 years ago
Status: NEW → RESOLVED
Last Resolved: 3 years ago
Resolution: --- → FIXED
Great Success!

* About to connect() to taskcluster port 80 (#0)
*   Trying 172.17.0.1... connected
> GET /secrets/v1/secret/garbage/pmoore/foo HTTP/1.1

> User-Agent: curl/7.22.0 (x86_64-pc-linux-gnu) libcurl/7.22.0 OpenSSL/1.0.1 zlib/1.2.3.4 libidn/1.23 librtmp/2.3

> Host: taskcluster

> Accept: */*

> 

< HTTP/1.1 200 OK

< Access-Control-Allow-Headers: X-Requested-With,Content-Type,Authorization,Accept,Origin

< Access-Control-Allow-Methods: OPTIONS,GET,HEAD,POST,PUT,DELETE,TRACE,CONNECT

< Access-Control-Allow-Origin: *

< Access-Control-Request-Method: *

< Connection: keep-alive

< Content-Length: 83

< Content-Type: application/json; charset=utf-8

< Date: Tue, 02 Feb 2016 21:44:50 GMT

< Etag: W/"53-b2395246"

< Server: Cowboy

< Via: 1.1 vegur

< X-Powered-By: Express

< X-Taskcluster-Authorized-Scopes: [assume:worker-id:us-east-1d/i-c10d5f48 secrets:get:garbage/*]

< X-Taskcluster-Endpoint: https://secrets.taskcluster.net/v1/secret/garbage/pmoore/foo

< X-Taskcluster-Proxy-Temp-Scopes: [assume:worker-type:aws-provisioner-v1/ami-test assume:worker-id:*]

< X-Taskcluster-Proxy-Version: Taskcluster proxy 3.0.6

< 

{
  "secret": {
    "petey": "moore"
  },
  "expires": "2016-02-03T21:36:54.354Z"
* Connection #0 to host taskcluster left intact
* Closing connection #0
You need to log in before you can comment on or make changes to this bug.