Migrate code-coverage CI to community taskcluster deployment, hooks to firefox deployment
Categories
(Taskcluster :: Operations and Service Requests, task)
Tracking
(Not tracked)
People
(Reporter: bastien, Unassigned)
Attachments
(7 files, each 47 bytes, text/x-phabricator-request)
The project uses Taskcluster for its CI needs, and needs access on the Mozilla CI instance:
- trigger some hooks to process coverage
Comment 1•5 years ago
As discussed on IRC, the CI for the project will be in the community deployment, and the hooks will be in the Firefox deployment.
Comment 2•5 years ago (Reporter)
I merged the PR to run CI on the community instance.
We still need a way to update the hooks on Firefox CI.
Comment 3•5 years ago (Reporter)
These hooks need to be migrated to the Firefox-CI instance before the 9th so that the code-coverage generation runs continuously:
- project-relman/code-coverage-cron-testing
- project-relman/code-coverage-cron-production
- project-relman/code-coverage-repo-testing
- project-relman/code-coverage-repo-production
Each hook needs access to the secret matching its environment, on the same instance:
- project/relman/code-coverage/runtime-testing
- project/relman/code-coverage/runtime-production
The cron tasks are triggered by the following schedule: 0 0 0 * * *
The generated tasks run on a specific workerType, relman-svc-memory, using r5d.large EC2 instances.
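The requirements in this comment can be summarized as a small data model. This is only an illustrative sketch: the hook, secret, and schedule strings are copied from the comment, while `HookRequirement`, `requirement_for`, and `describe_schedule` are hypothetical helpers. It assumes that the `project-relman/...` entries are `hookGroupId/hookId` pairs, that reading a secret requires a `secrets:get:<name>` scope, and that the schedule is a six-field cron expression with seconds first (which would make it a daily run at midnight).

```python
from dataclasses import dataclass

# Strings below are copied from the comment; the model around them is illustrative.
HOOK_GROUP = "project-relman"
HOOK_IDS = [
    "code-coverage-cron-testing",
    "code-coverage-cron-production",
    "code-coverage-repo-testing",
    "code-coverage-repo-production",
]
SECRET_PREFIX = "project/relman/code-coverage/runtime-"
CRON_SCHEDULE = "0 0 0 * * *"

# Assumption: six-field cron syntax, with the seconds field first.
CRON_FIELDS = ("second", "minute", "hour", "day_of_month", "month", "day_of_week")


@dataclass(frozen=True)
class HookRequirement:
    hook: str        # "<hookGroupId>/<hookId>"
    secret: str      # secret the hook's tasks must be able to read
    scope: str       # assumed scope pattern for reading that secret
    scheduled: bool  # True for the cron hooks, False for the repo hooks


def requirement_for(hook_id: str) -> HookRequirement:
    """Derive the secret and scope a hook needs from its environment suffix."""
    environment = hook_id.rsplit("-", 1)[1]
    if environment not in ("testing", "production"):
        raise ValueError(f"unexpected environment in hook id: {hook_id}")
    secret = SECRET_PREFIX + environment
    return HookRequirement(
        hook=f"{HOOK_GROUP}/{hook_id}",
        secret=secret,
        scope=f"secrets:get:{secret}",
        scheduled="-cron-" in hook_id,
    )


def describe_schedule(expression: str) -> dict:
    """Label each field of a six-field cron expression."""
    values = expression.split()
    if len(values) != len(CRON_FIELDS):
        raise ValueError("expected a six-field cron expression")
    return dict(zip(CRON_FIELDS, values))


REQUIREMENTS = [requirement_for(hook_id) for hook_id in HOOK_IDS]
```

Listing the four requirements this way makes it easy to check, per hook, that the matching secret and scope exist on the target deployment before the first scheduled run.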
Comment 4•5 years ago
Comment 5•5 years ago
Comment 6•5 years ago
I copied the worker-type definition from the existing configuration, and it includes lots of instance types, among them r5d.large.
The patches above manage the hooks, but do not manage the secrets. I think those will remain "manually" managed. I'm happy to copy those over if someone gives me scopes on the firefox-ci deployment to write to them -- Callek, perhaps?
Comment 7•5 years ago
Comment 8•5 years ago
Comment 9•5 years ago
Comment 10•5 years ago
I think the next big step here is to land the patches for this repo, code-review, and bugzilla-dashboard-backend. If you need to make a tag to make a new image that includes the new rootUrls, we can then deploy the images to these hooks.
Comment 11•5 years ago (Reporter)
The code-coverage hooks look OK to me, but a scheduled cron job failed due to missing scopes.
Comment 12•5 years ago
Ah, oops! Copy-paste error. I believe https://phabricator.services.mozilla.com/D52464 will fix this.
Comment 13•5 years ago (Reporter)
The latest code-coverage run (11 hours ago) still had those issues.
I do not have the scopes needed to trigger the task to check it now...
Comment 14•5 years ago
The task was successfully triggered now, but failed with:
taskcluster.exceptions.TaskclusterRestFailure: Secret not found
Comment 15•5 years ago
(the secret being project/relman/code-coverage/runtime-production)
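A failure like this one can be checked ahead of the next scheduled run by probing each required secret. The sketch below is a hedged illustration, not the project's code: `read_secret` and `missing_secrets` are hypothetical helpers, and they assume a client object whose `get(name)` method returns a dict with a `"secret"` key and raises an exception carrying `status_code == 404` when the secret does not exist (the behavior expected from the Taskcluster Python client's secrets service, but treat that as an assumption).

```python
from typing import Any, Iterable, List, Optional


def read_secret(secrets_client: Any, name: str) -> Optional[dict]:
    """Return the secret payload, or None if the service reports it missing.

    Assumption: `secrets_client.get(name)` returns {"secret": ..., "expires": ...}
    and raises an exception with a `status_code` attribute on REST failures.
    """
    try:
        response = secrets_client.get(name)
    except Exception as exc:
        # Only treat "not found" as a missing secret; re-raise anything else
        # (permission errors, network failures) so it is not silently hidden.
        if getattr(exc, "status_code", None) == 404:
            return None
        raise
    return response["secret"]


def missing_secrets(secrets_client: Any, names: Iterable[str]) -> List[str]:
    """List the secrets from `names` that the deployment does not have."""
    return [name for name in names if read_secret(secrets_client, name) is None]
```

Called with a client for the firefox-ci deployment and the two runtime secret names from this bug, `missing_secrets` would return whichever of them still needs to be copied over.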
Comment 16•5 years ago
(In reply to Marco Castelluccio [:marco] from comment #14)
The task was successfully triggered now, but failed with:
taskcluster.exceptions.TaskclusterRestFailure: Secret not found
(In reply to Marco Castelluccio [:marco] from comment #15)
(the secret being project/relman/code-coverage/runtime-production)
So it turns out I can't access this secret in the legacy deployment of Taskcluster at the moment [the secrets service and clientIDs are disabled, so I can't log in to view it].
Do you have the original secret, and if so, can you send it to me via Firefox Send and message me the URL on Slack?
Comment 17•5 years ago
Tom has a copy of the old secrets in SOPS, if that helps.
Comment 18•5 years ago
(In reply to Justin Wood (:Callek) from comment #16)
So it turns out I can't access this secret in the legacy deployment of Taskcluster at the moment [the secrets service and clientIDs are disabled, so I can't log in to view it].
Do you have the original secret, and if so, can you send it to me via Firefox Send and message me the URL on Slack?
Unfortunately, I don't have a copy of the original secret. Maybe Bastien has one, or Tom can get it from SOPS.
(project/relman/code-coverage/runtime-production and project/relman/code-coverage/runtime-testing)
Comment 19•5 years ago
My suspicion is that the secret was already copied, but not before the task referenced in this comment ran...
Comment 20•5 years ago (Reporter)
I do not have a backup of our secrets.
We also need several Taskcluster auth clients on the firefox-ci instance to run parts of our stack on Heroku.
Fortunately, the old clients are still accessible on the taskcluster.net instance; a firefox-ci admin needs to copy them (I don't have the scopes on project/relman!):
- project/relman/code-coverage/backend-production
- project/relman/code-coverage/backend-testing
- project/relman/code-coverage/events-production
- project/relman/code-coverage/events-testing
Could you create those 4 clients and send me their access tokens through email (here is my GPG public key)?
Comment 21•5 years ago (Reporter)
Johan created these clients, and I was able to cleanly restart the backend and events Heroku instances.
I confirm that the secrets are available, and work as expected.
I'm not marking this as fixed until a hook is triggered, but it should be OK soon.
Comment 22•5 years ago
Would it be possible to retrigger the job which failed?
Comment 23•5 years ago
(In reply to Bastien Abadie [:bastien] from comment #21)
A bit more context to Bastien's comment (in case I need to get back to this bug in the future).
Discussed with Bastien over Slack. This project, while developed on GitHub and thus using the Community TC instance, runs jobs on Firefox CI. The clients requested in comment 20 are needed in Firefox CI.
I then created them:
- https://firefox-ci-tc.services.mozilla.com/auth/clients/project%2Frelman%2Fcode-coverage%2Fbackend-production
- https://firefox-ci-tc.services.mozilla.com/auth/clients/project%2Frelman%2Fcode-coverage%2Fbackend-testing
- https://firefox-ci-tc.services.mozilla.com/auth/clients/project%2Frelman%2Fcode-coverage%2Fevents-production
- https://firefox-ci-tc.services.mozilla.com/auth/clients/project%2Frelman%2Fcode-coverage%2Fevents-testing
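For reference, the links above follow a simple pattern: the client ID is percent-encoded (including its slashes) and appended to the deployment's `/auth/clients/` path. The following minimal sketch reproduces that pattern from the URLs in this comment; `client_page_url` is a hypothetical helper, and the path layout is inferred from these four links rather than from any documented contract.

```python
from urllib.parse import quote

# Root URL of the Firefox CI deployment, as it appears in the links above.
ROOT_URL = "https://firefox-ci-tc.services.mozilla.com"

CLIENT_IDS = [
    "project/relman/code-coverage/backend-production",
    "project/relman/code-coverage/backend-testing",
    "project/relman/code-coverage/events-production",
    "project/relman/code-coverage/events-testing",
]


def client_page_url(client_id: str, root_url: str = ROOT_URL) -> str:
    """Build the web UI link for a client, encoding "/" as %2F."""
    # safe="" forces the slashes inside the client ID to be percent-encoded.
    return f"{root_url}/auth/clients/{quote(client_id, safe='')}"


CLIENT_URLS = [client_page_url(client_id) for client_id in CLIENT_IDS]
```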
These clients required these 2 roles, which I created:
Comment 24•5 years ago (Reporter)
Comment 25•5 years ago
needinfo me if I can help with this, but it looks mostly firefox-ci now.
Comment 26•5 years ago (Reporter)
Comment 27•5 years ago
My understanding is that this is now done.
Comment 28•5 years ago (Reporter)
We still have some OOM issues on some workers, but most tasks run successfully. Thanks all for your help.