Closed Bug 1566190 Opened 1 year ago Closed 3 months ago

Switch to new third-party login approach for Taskcluster

Categories

(Tree Management :: Treeherder: Frontend, defect, P1)

defect

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: armenzg, Assigned: sclements)

References

Details

Attachments

(1 file)

The current proposal is here:

"We will have a staging instance set up with this capability in mid-August.

I think the transition to the new approach will need to happen at the same time as the cutover to the new production Taskcluster deployment in mid-September (which will be a TCW)."

Armen: Are you volunteering to do this work?

Priority: -- → P2
Assignee: nobody → sclements
Status: NEW → ASSIGNED

https://docs.taskcluster.net/docs/manual/using/integration/frontend contains a tutorial that implements third-party login and allows users to execute a Taskcluster API call. This should work once bug 1561905 is resolved.

Awesome, thanks Hassan!

As a quick note for whoever implements this:

implement third-party sign-in, in such a way that Treeherder can hold user creds for multiple TC deployments at the same time

We will want to index the third-party logins by rootUrl, and allow a single treeherder instance to get and hold credentials for multiple Taskcluster deployments at the same time. It's OK if that's a followup (the immediate need is to get credentials for the firefox-ci deployment), but it would be good to design with that in mind from the start.

I'm planning to get started on this next week - reading the docs and asking questions. What's the hard deadline for getting this switched over?

Flags: needinfo?(dustin)

We will be shutting off the old deployment on November 9. We'll have a go/no-go decision in mid-October, at which point we'll want to know that this is well on its way to being completed (at a point where there are no risky unknowns).

Flags: needinfo?(dustin)
Priority: P2 → P1

Just checking in on status of this. We have a staging instance against which this could be tested, as well as the "community CI" instance. While I think TH can continue to use the legacy https://login.taskcluster.net after November 9, there's a maybe 5% chance that that just won't work and we'll be scrambling. I'd much rather scramble to finish a mostly-complete set of patches than start fresh on November 10 :)

Flags: needinfo?(sclements)

I'm prioritizing working on it this week (got pulled away by Perfherder and Outreachy the past few weeks) so I can give you an update next Monday if you like.

Flags: needinfo?(sclements)

I heard from hassan after I posted here that he's in touch with you, so as long as you to are in communication, I'm happy!

Depends on: 1587145
No longer depends on: 1587145

(In reply to Dustin J. Mitchell [:dustin] (he/him) from comment #5)

As a quick note for whoever implements this:

implement third-party sign-in, in such a way that Treeherder can hold user creds for multiple TC deployments at the same time

We will want to index the third-party logins by rootUrl, and allow a single treeherder instance to get and hold credentials for multiple Taskcluster deployments at the same time. It's OK if that's a followup (the immediate need is to get credentials for the firefox-ci deployment), but it would be good to design with that in mind from the start.

So if I'm understanding this correctly, per your comment above and here: https://bugzilla.mozilla.org/show_bug.cgi?id=1574651#c9 we initially only need to hold credentials for the taskcluster.net rootUrl (which you've added as tc_root_url in the repositories.json fixture) but we'll also need to support other TC deployments that will have their own rootUrl's (such as the Servo https://community-tc.services.mozilla.com rootUrl mentioned in that bug).

When a new rootUrl needs to be added, will that be a file-a-bug-in-Treeherder type of process?

Flags: needinfo?(dustin)
See Also: → 1574651

Correct. And https://taskcluster.net will be shut off on November 9.

Flags: needinfo?(dustin)

When a new rootUrl needs to be added, will that be a file-a-bug-in-Treeherder type of process?

Yes, we'll coordinate the change on the 9th, I think.

Ok, so the tc_root_url for firefox-ci projects will be changed to a new rootUrl and that will be figured out on the 9th?

When I'm testing out this authorization flow login approach, should I be using the Hassan's staging instance as the rootUrl (to hold credentials for taskcluster.net), so hassan.taskcluster-dev.net/login/oauth/authorize?

Flags: needinfo?(dustin)

Yes, and now that I think of it we should involve someone from your team in the TCW planning for that date. It should be a straightforward config change, though.

I think that's a good choice, and hassan will correct me if I'm wrong.

Flags: needinfo?(dustin)

Oh, you are invited :)

It'd be a good idea to invite Cam also.

I've looked through TH code and the implementation of the new authorization-flow and I have more questions:

What it appears we're currently doing in TH when a user hits the login button is (simplified):

  1. in frontend, request access_token via auth0 web client
  2. Once we have the token, which we store in browser as userSession, we include it in headers and call our auth/login API
  3. In the backend, we use header info to call the auth0 jwt validator, to validate the token and get user details
  4. That info as also used to query a user in our db, and is used for session management - user info is returned
  5. credentials/access_token is verified in frontend via the taskcluster-web-client's OIDCCredentialAgent method when TC actions are performed

So, what it looks like I need to change is:

  1. in frontend, request authorization code via <rootUrl>/login/oauth/authorize
  2. using that code, request token via <rootUrl>/login/oauth/token
  3. instead of in the backend, in the frontend will query <rootUrl>/login/oauth/credentials to verify token and get credentials; store those in the browser userSession, indexed by rootUrl
  4. then we would need to pass pertinent info re session expiry, username and email to the backend auth/login in order to return the user details as before (being careful not to pass access_token per the guidelines)
  5. create an instance of taskcluster.Auth, passing in credentials and rootUrl params that is called when TC actions are performed

The actual questions:

  • Does that sound right?
  • Should the firefox-ci credentials always be retrieved when a user hits the login button, and community-ci/servo only if a user tries to perform a TC action on that particular repo (for the first time that happens, if not previously stored)?
  • When the TC credentials expire, do we need to do step 3 or steps 1-3?
Flags: needinfo?(dustin)
Flags: needinfo?(cdawson)

Close, but the key difference is this: Treeherder should continue to use auth0 directly for login (1-4 of the first set of steps). The third-party bit does not provide any identifying information, just credentials, so it can't be used to support a full login process (and opsec would very much like it NOT to be used for that purpose).

So for the second set of steps, I think 1-3 and 5 are correct, but since all Taskcluster API calls are made from the frontend, I think there's no need to involve the backend at all.

To the second bullet, I think the credentials retrieval should happen only when a user tries to perform an action, if not previously stored.

Flags: needinfo?(dustin)

Aha, that makes sense. What threw me is the use of audience: login.taskcluster.net and taskcluster-credentials in the scope for requesting the access_token via auth0 web client. Those would need to be removed completely, since we're now using the authorization flow approach mentioned above.

Flags: needinfo?(cdawson)

So when the TC credentials have expired, do we need we need to retrieve a new authorization code, token and new credentials?

Flags: needinfo?(dustin)

What threw me

Is that still in the docs somewhere? You're right, that's no longer necessary.

Hassan can answer comment 21 -- I don't remember the details.

Flags: needinfo?(dustin) → needinfo?(helfi92)

(In reply to Dustin J. Mitchell [:dustin] (he/him) from comment #22)

Is that still in the docs somewhere? You're right, that's no longer necessary.

No, I didn't see it in your docs. I meant that seeing that in our code base was causing confusion about the big picture - how auth0 will still be used for login, but that the authorization/token for TC credentials will be decoupled from that and now handled through a separate process. Thanks for clearing that up!

Correct. Refreshing TC credentials would require doing the whole oauth2 dance again.

Flags: needinfo?(helfi92)

This was deployed to prod successfully on Nov 9th.

Status: ASSIGNED → RESOLVED
Closed: 3 months ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.