Open Bug 1834790 Opened 1 year ago Updated 1 year ago

treeherder cannot consume pushes from repositories in a SAML-enabled GitHub organization

Categories

(Tree Management :: Treeherder, defect)

Tracking

(Not tracked)

People

(Reporter: bhearsum, Unassigned)

Details

Attachments

(5 files)

We added this repo & its staging counterpart to treeherder in https://github.com/mozilla/treeherder/pull/7678/files. The staging version seems to work just fine in Treeherder - picking up pushes to main without issue: https://treeherder.mozilla.org/jobs?repo=staging-firefox-translations-training

The production version, however, does not. https://treeherder.mozilla.org/jobs?repo=firefox-translations-training has been empty since it was added.

The decision tasks for staging and production respectively both include treeherder symbols and routes, so I think everything is fine on the task side of things (although it's possible I'm overlooking something).

I guess the question is: how does Treeherder determine that there is a new push on a git repo? Do we advertise on Pulse?

Flags: needinfo?(bhearsum)

(In reply to Joel Maher ( :jmaher ) (UTC -8) from comment #1)

I guess the question is: how does Treeherder determine that there is a new push on a git repo? Do we advertise on Pulse?

I have no idea.

Flags: needinfo?(bhearsum)

The previous attachment was data collected from:
https://firefox-ci-tc.services.mozilla.com/pulse-messages?bindings%5B0%5D%5Bexchange%5D=exchange%2Ftaskcluster-github%2Fv1%2Fpush&bindings%5B0%5D%5Bpattern%5D=%23

This indicates that Pulse is seeing the messages, so we should be able to see those messages in Treeherder as well.
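To narrow the inspector above to a single repository, the query string can be built programmatically. This is a minimal sketch: the `bindings[0][exchange]` and `bindings[0][pattern]` parameter names are taken from the URL above, while `pulse_inspector_url` is a hypothetical helper, and the `primary.<organization>.<repository>` routing-key pattern is an assumption about how taskcluster-github keys its push messages that should be verified before relying on it.

```python
from urllib.parse import urlencode

# Base URL of the pulse message inspector linked above.
INSPECTOR = "https://firefox-ci-tc.services.mozilla.com/pulse-messages"
PUSH_EXCHANGE = "exchange/taskcluster-github/v1/push"


def pulse_inspector_url(exchange: str, pattern: str = "#") -> str:
    """Build an inspector URL with a single binding (hypothetical helper)."""
    query = urlencode({
        "bindings[0][exchange]": exchange,
        "bindings[0][pattern]": pattern,
    })
    return f"{INSPECTOR}?{query}"


# Same binding as the link above: every message on the push exchange.
all_pushes = pulse_inspector_url(PUSH_EXCHANGE)

# ASSUMPTION: routing keys look like "primary.<organization>.<repository>".
prod_only = pulse_inspector_url(
    PUSH_EXCHANGE, "primary.mozilla.firefox-translations-training"
)

print(all_pushes)
print(prod_only)
```

If the production repository's pushes show up under the narrowed binding, the problem is more likely on the consuming side than on the publishing side.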

There seem to be no differences on the Pulse side between staging and prod.
The same goes for the bindings.

/resultssets is listening to the following exchanges:

exchange/hgpushes/v1
exchange/taskcluster-github/v1/pull-request
exchange/taskcluster-github/v1/push

and /tasks listens to:

exchange/taskcluster-queue/v1/task-completed
exchange/taskcluster-queue/v1/task-exception
exchange/taskcluster-queue/v1/task-failed
exchange/taskcluster-queue/v1/task-pending
exchange/taskcluster-queue/v1/task-running

Attached file pull_request.open.json

We don't log Pulse messages at the time of publishing, but, as an example for that repo, here's part of a message as seen by the consumer (the Taskcluster GitHub service).

Attached file push.json

Here's also the push event as seen by Taskcluster's GitHub consumer.

Attached file push.json

Sorry, the previous file with the push event was empty; here's the correct one.

Update: those seem to be raw GitHub webhook payloads, not the Pulse messages we publish. To see the Pulse messages, one would have to set up a listener as jmaher suggests above and push something to see what kind of event is produced.

(In reply to Yarik Kurmyza [:yarik] (he/him) (UTC+1) from comment #5)

Created attachment 9337156 [details]
Screenshot 2023-06-02 at 10.02.51.png

There seem to be no differences on the Pulse side between staging and prod.
The same goes for the bindings.

Thanks for digging into this. Just to be clear, treeherder staging is not at play here. The "staging" in question is https://github.com/mozilla-releng/staging-firefox-translations-training which reports fine to the production treeherder, while https://github.com/mozilla/firefox-translations-training does not. (It looks like the same is true for both of them when reporting to staging treeherder as well, although that doesn't really matter to me.)

Aryx and I ended up digging into this a bit more a couple of weeks ago. My understanding is that the root of the issue is that this repository is in a SAML-enabled organization, and extra steps are needed before a token can make requests against such an organization. From https://docs.github.com/en/rest/overview/authenticating-to-the-rest-api?apiVersion=2022-11-28#about-authentication:

If you use a personal access token (classic) to access an organization that enforces SAML single sign-on (SSO) for authentication, you will need to authorize your token after creation. Fine-grained personal access tokens are authorized during token creation, before access to the organization is granted. For more information, see "Authorizing a personal access token for use with SAML single sign-on."

If you do not authorize your personal access token (classic) for SAML SSO before you try to use it to access an organization that enforces SAML SSO, you may receive a 404 Not Found or a 403 Forbidden error. If you receive a 403 Forbidden error, you can follow the URL in the X-GitHub-SSO header to authorize your token. The URL expires after one hour. If you requested data that could come from multiple organizations, the API will not return results from the organizations that require SAML SSO. The X-GitHub-SSO header will indicate the ID of the organizations that require SAML SSO authorization of your personal access token (classic). For example: X-GitHub-SSO: partial-results; organizations=21955855,20582480.

(Authorizing information is in https://docs.github.com/en/enterprise-cloud@latest/authentication/authenticating-with-saml-single-sign-on/authorizing-a-personal-access-token-for-use-with-saml-single-sign-on)
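
The header format quoted above can be checked mechanically when diagnosing a failing request. This is a minimal sketch, assuming the header value is a state keyword followed by `;`-separated `key=value` pairs, as in the quoted example. `parse_sso_header` and `diagnose` are hypothetical helpers, not part of Treeherder or of any GitHub client library, and the `required; url=...` form for 403 responses is an assumption inferred from the quoted text rather than shown in it.

```python
def parse_sso_header(value: str) -> dict:
    """Split an X-GitHub-SSO header value into its state and parameters."""
    state, _, rest = value.partition(";")
    params = {}
    for part in rest.split(";"):
        # Split on the first "=" only, so URLs containing "=" stay intact.
        key, sep, val = part.strip().partition("=")
        if sep:
            params[key] = val
    result = {"state": state.strip()}
    if "organizations" in params:
        result["organizations"] = [
            int(org_id) for org_id in params["organizations"].split(",")
        ]
    if "url" in params:
        result["url"] = params["url"]
    return result


def diagnose(status: int, headers: dict) -> str:
    """Map a response status and headers to a hint (hypothetical helper).

    Note: the lookup is case-sensitive; normalize header names first if needed.
    """
    sso = headers.get("X-GitHub-SSO")
    if status == 403 and sso:
        info = parse_sso_header(sso)
        return f"token not SAML-authorized; authorize at {info.get('url', '<no url>')}"
    if status == 404:
        return "not found, or token not authorized for a SAML SSO organization"
    if sso:
        return "partial results; some organizations require SAML SSO authorization"
    return "no SAML SSO problem indicated"


print(parse_sso_header("partial-results; organizations=21955855,20582480"))
```

Because an unauthorized token may produce a plain 404 with no header at all, a 404 from a repository known to exist is itself a hint that the token needs SAML SSO authorization.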

This repository happens to be the first one in Treeherder in such an org - but this will become a big deal when mozilla-mobile becomes SAML-enabled - as many repos in that org use Treeherder.

Summary: treeherder not showing pushes to https://github.com/mozilla/firefox-translations-training → treeherder cannot consume pushes from repositories in a SAML-enabled GitHub organization