ci-admin: manage workerTypes
Categories
(Firefox Build System :: Task Configuration, task)
Tracking
(Not tracked)
People
(Reporter: dustin, Assigned: dustin)
References
Details
Attachments
(5 files, 1 obsolete file)
Assignee | ||
Comment 1•7 years ago
|
||
Comment 2•7 years ago
|
||
Assignee | ||
Comment 3•7 years ago
|
||
Comment 4•7 years ago
|
||
Assignee | ||
Comment 5•7 years ago
|
||
Assignee | ||
Updated•6 years ago
|
Assignee | ||
Updated•6 years ago
|
Assignee | ||
Comment 6•6 years ago
|
||
Here are the descriptions of the AMIs for each of the gecko-related workerTypes:
ami-test 692406183521/taskcluster-docker-worker-overlay2-1555602245
ami-test-pv 692406183521/taskcluster-docker-worker-overlay2-PV-1516498823
android-api-15 692406183521/taskcluster-docker-worker-overlay2-1555602245
balrog 692406183521/taskcluster-docker-worker-overlay2-1555602245
dbg-linux32 692406183521/taskcluster-docker-worker-overlay2-1555602245
dbg-linux64 692406183521/taskcluster-docker-worker-overlay2-1555602245
dbg-macosx64 692406183521/taskcluster-docker-worker-overlay2-1555602245
desktop-test 692406183521/taskcluster-docker-worker-overlay2-1555602245
desktop-test-large 692406183521/taskcluster-docker-worker-overlay2-1555602245
desktop-test-xlarge 692406183521/taskcluster-docker-worker-overlay2-1555602245
fp-gecko-3-b-linux 692406183521/taskcluster-docker-worker-overlay2-trusted-1532606009
gecko-1-b-android 692406183521/taskcluster-docker-worker-overlay2-1555602245
gecko-1-b-linux 692406183521/taskcluster-docker-worker-overlay2-1555602245
gecko-1-b-linux-gps 692406183521/taskcluster-docker-worker-overlay2-1525347283
gecko-1-b-linux-large 692406183521/taskcluster-docker-worker-overlay2-1555602245
gecko-1-b-linux-usw2 692406183521/taskcluster-docker-worker-overlay2-1532606008
gecko-1-b-linux-xlarge 692406183521/taskcluster-docker-worker-overlay2-1555602245
gecko-1-b-macosx64 692406183521/taskcluster-docker-worker-overlay2-1555602245
gecko-1-b-win2012 Gecko builder for Windows; TaskCluster worker type: gecko-1-b-win2012, OCC version cae6a929f27c, https://github.com/mozilla-releng/OpenCloudConfig.git/tree/cae6a929f27cd1c24b2a3a7302d25abc1e91ad4b}
gecko-1-b-win2012-beta Gecko builder for Windows; TaskCluster worker type: gecko-1-b-win2012-beta, OCC version 1efc404df03a, https://github.com/mozilla-releng/OpenCloudConfig.git/tree/1efc404df03a9e392b7ae67319bf7a3055bf25d8}
gecko-1-decision 692406183521/taskcluster-docker-worker-overlay2-1555602245
gecko-1-images 692406183521/taskcluster-docker-worker-overlay2-1555602245
gecko-1-linux-shared 692406183521/taskcluster-docker-worker-overlay2-1521032225
gecko-2-b-android 692406183521/taskcluster-docker-worker-overlay2-1555602245
gecko-2-b-linux 692406183521/taskcluster-docker-worker-overlay2-1555602245
gecko-2-b-linux-large 692406183521/taskcluster-docker-worker-overlay2-1555602245
gecko-2-b-linux-xlarge 692406183521/taskcluster-docker-worker-overlay2-1555602245
gecko-2-b-macosx64 692406183521/taskcluster-docker-worker-overlay2-1555602245
gecko-2-b-win2012 Gecko builder for Windows; TaskCluster worker type: gecko-2-b-win2012, OCC version cae6a929f27c, https://github.com/mozilla-releng/OpenCloudConfig.git/tree/cae6a929f27cd1c24b2a3a7302d25abc1e91ad4b}
gecko-2-decision 692406183521/taskcluster-docker-worker-overlay2-1555602245
gecko-2-images 692406183521/taskcluster-docker-worker-overlay2-1555602245
gecko-2-linux-shared 692406183521/taskcluster-docker-worker-overlay2-1521032225
gecko-3-b-android 692406183521/taskcluster-docker-worker-overlay2-trusted-1555602246
gecko-3-b-linux 692406183521/taskcluster-docker-worker-overlay2-trusted-1555602246
gecko-3-b-linux-large 692406183521/taskcluster-docker-worker-overlay2-trusted-1555602246
gecko-3-b-linux-xlarge 692406183521/taskcluster-docker-worker-overlay2-trusted-1555602246
gecko-3-b-macosx64 692406183521/taskcluster-docker-worker-overlay2-trusted-1555602246
gecko-3-b-win2012 Gecko builder for Windows; TaskCluster worker type: gecko-3-b-win2012, OCC version cae6a929f27c, https://github.com/mozilla-releng/OpenCloudConfig.git/tree/cae6a929f27cd1c24b2a3a7302d25abc1e91ad4b}
gecko-3-b-win2012-c4 Gecko builder for Windows; TaskCluster worker type: gecko-3-b-win2012-c4, OCC version cae6a929f27c, https://github.com/mozilla-releng/OpenCloudConfig.git/tree/cae6a929f27cd1c24b2a3a7302d25abc1e91ad4b}
gecko-3-b-win2012-c5 Gecko builder for Windows; TaskCluster worker type: gecko-3-b-win2012-c5, OCC version cae6a929f27c, https://github.com/mozilla-releng/OpenCloudConfig.git/tree/cae6a929f27cd1c24b2a3a7302d25abc1e91ad4b}
gecko-3-decision 692406183521/taskcluster-docker-worker-overlay2-trusted-1555602246
gecko-3-images 692406183521/taskcluster-docker-worker-overlay2-trusted-1555602246
gecko-3-linux-shared 692406183521/taskcluster-docker-worker-overlay2-1521032225
gecko-3-t-linux-xlarge 692406183521/taskcluster-docker-worker-overlay2-trusted-1555602246
gecko-misc 692406183521/taskcluster-docker-worker-overlay2-1555602245
gecko-t-linux-large 692406183521/taskcluster-docker-worker-overlay2-1555602245
gecko-t-linux-xlarge 692406183521/taskcluster-docker-worker-overlay2-1555602245
gecko-t-win10-64 Gecko tester for Windows 10 64 bit; TaskCluster worker type: gecko-t-win10-64, OCC version cae6a929f27c, https://github.com/mozilla-releng/OpenCloudConfig.git/tree/cae6a929f27cd1c24b2a3a7302d25abc1e91ad4b}
gecko-t-win10-64-alpha Gecko tester for Windows 10 64 bit; TaskCluster worker type: gecko-t-win10-64-alpha, OCC version cae6a929f27c, https://github.com/mozilla-releng/OpenCloudConfig.git/tree/cae6a929f27cd1c24b2a3a7302d25abc1e91ad4b}
gecko-t-win10-64-beta Gecko tester for Windows 10 64 bit; TaskCluster worker type: gecko-t-win10-64-beta, OCC version 1efc404df03a, https://github.com/mozilla-releng/OpenCloudConfig.git/tree/1efc404df03a9e392b7ae67319bf7a3055bf25d8}
gecko-t-win10-64-cu Gecko tester for Windows 10 64 bit; TaskCluster worker type: gecko-t-win10-64-cu, OCC version cae6a929f27c, https://github.com/mozilla-releng/OpenCloudConfig.git/tree/cae6a929f27cd1c24b2a3a7302d25abc1e91ad4b}
gecko-t-win10-64-gpu Gecko tester for Windows 10 64 bit; TaskCluster worker type: gecko-t-win10-64-gpu, OCC version cae6a929f27c, https://github.com/mozilla-releng/OpenCloudConfig.git/tree/cae6a929f27cd1c24b2a3a7302d25abc1e91ad4b}
gecko-t-win10-64-gpu-a Gecko tester for Windows 10 64 bit; TaskCluster worker type: gecko-t-win10-64-gpu-a, OCC version cae6a929f27c, https://github.com/mozilla-releng/OpenCloudConfig.git/tree/cae6a929f27cd1c24b2a3a7302d25abc1e91ad4b}
gecko-t-win10-64-gpu-b Gecko tester for Windows 10 64 bit; TaskCluster worker type: gecko-t-win10-64-gpu-b, OCC version 1efc404df03a, https://github.com/mozilla-releng/OpenCloudConfig.git/tree/1efc404df03a9e392b7ae67319bf7a3055bf25d8}
gecko-t-win7-32 Gecko test worker for Windows 7 32 bit; TaskCluster worker type: gecko-t-win7-32, OCC version 474f7675e40a, https://github.com/mozilla-releng/OpenCloudConfig.git/tree/474f7675e40a07399d9c56562930edb1af148194}
gecko-t-win7-32-beta Gecko test worker for Windows 7 32 bit; TaskCluster worker type: gecko-t-win7-32-beta, OCC version 421ceabe3446, https://github.com/mozilla-releng/OpenCloudConfig.git/tree/421ceabe3446014a4ebb06672ffee975b06cd3a1}
gecko-t-win7-32-cu Gecko test worker for Windows 7 32 bit; TaskCluster worker type: gecko-t-win7-32-cu, OCC version 474f7675e40a, https://github.com/mozilla-releng/OpenCloudConfig.git/tree/474f7675e40a07399d9c56562930edb1af148194}
gecko-t-win7-32-gpu Gecko test worker for Windows 7 32 bit; TaskCluster worker type: gecko-t-win7-32-gpu, OCC version 474f7675e40a, https://github.com/mozilla-releng/OpenCloudConfig.git/tree/474f7675e40a07399d9c56562930edb1af148194}
gecko-t-win7-32-gpu-b Gecko test worker for Windows 7 32 bit; TaskCluster worker type: gecko-t-win7-32-gpu-b, OCC version a7677c181815, https://github.com/mozilla-releng/OpenCloudConfig.git/tree/a7677c181815298e015a731dbecba20e9c903ff9}
github-worker 692406183521/taskcluster-docker-worker-overlay2-1555602245
hg-worker 692406183521/taskcluster-docker-worker-overlay2-1555602245
mobile-1-b-andrcmp 692406183521/taskcluster-docker-worker-overlay2-1555602245
mobile-1-b-fenix 692406183521/taskcluster-docker-worker-overlay2-1555602245
mobile-1-b-ref-browser 692406183521/taskcluster-docker-worker-overlay2-1555602245
mobile-1-decision 692406183521/taskcluster-docker-worker-overlay2-1555602245
mobile-1-images 692406183521/taskcluster-docker-worker-overlay2-1555602245
mobile-3-b-andrcmp 692406183521/taskcluster-docker-worker-overlay2-trusted-1555602246
mobile-3-b-fenix 692406183521/taskcluster-docker-worker-overlay2-trusted-1555602246
mobile-3-b-ref-browser 692406183521/taskcluster-docker-worker-overlay2-trusted-1555602246
mobile-3-decision 692406183521/taskcluster-docker-worker-overlay2-trusted-1555602246
mobile-3-images 692406183521/taskcluster-docker-worker-overlay2-trusted-1555602246
mozillaonline-1-b-linux 692406183521/taskcluster-docker-worker-overlay2-1555602245
mozillaonline-3-b-linux 692406183521/taskcluster-docker-worker-overlay2-1555602245
mulet-debug 692406183521/taskcluster-docker-worker-overlay2-1555602245
mulet-opt 692406183521/taskcluster-docker-worker-overlay2-1555602245
nss-win2012r2 firefox desktop builds on windows - taskcluster worker - version CdcEDDWnQemvUVYvrNE_RA
nss-win2012r2-new firefox desktop builds on windows - taskcluster worker - version Y8MTXWoeTOyC1y2r8H-L7Q
opt-linux32 692406183521/taskcluster-docker-worker-overlay2-1555602245
opt-linux64 692406183521/taskcluster-docker-worker-overlay2-1555602245
opt-macosx64 692406183521/taskcluster-docker-worker-overlay2-1555602245
rustbuild 692406183521/taskcluster-docker-worker-overlay2-1555602245
spidermonkey 692406183521/taskcluster-docker-worker-overlay2-1555602245
symbol-upload 692406183521/taskcluster-docker-worker-overlay2-1555602245
taskcluster-generic 692406183521/taskcluster-docker-worker-overlay2-1555602245
taskcluster-images 692406183521/taskcluster-docker-worker-overlay2-trusted-1555602246
version-control-tools 692406183521/taskcluster-docker-worker-overlay2-1532606008
win2012r2 firefox desktop builds on windows - taskcluster worker - version S7C-no6lTuiuECEONV4qOA
We won't be managing windows workertypes initially, so those are safe to ignore, and the rest look pretty straightforward.
Assignee | ||
Comment 7•6 years ago
|
||
Assignee | ||
Comment 8•6 years ago
|
||
Depends on D32081
Assignee | ||
Comment 9•6 years ago
|
||
Depends on D32082
Assignee | ||
Comment 10•6 years ago
|
||
Assignee | ||
Comment 11•6 years ago
|
||
I had a hard think about this and came up with a pretty minimal approach to start with, allowing room to grow.
There are a few things I would like to accomplish:
- get a human-written, human-reviewed change history for worker types so we're not wondering "who broke X" or "why Y is configured that way"
- simplify configuration of workerTypes so it's more human-readable, with fewer illegible sg-abcdef and ami-12345678 identifiers
- support the transition to worker-manager at the same time as we transition from https://taskcluster.net to the new deployment
This set of patches is a start toward the first point, covering all docker-worker workers for which there are scope grants in grants.yml. It does not address the second point at all. I think we should handle that in worker-manager, although I don't think anyone knows how yet. It does address the third point in that it treats aws-provisioner workerTypes as distinct from worker-manager workerTypes, and will configure only the former on https://taskcluster.net, and (once they are implemented) only the latter on any other deployment. Since we're also changing the provisionerId of all of those workers, that works out just fine.
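To illustrate the split between the two kinds of workerTypes, here is a minimal sketch. It assumes a hypothetical in-memory representation in which each worker type records which service manages it. The WorkerType class, the manager labels, and the select_worker_types helper are invented for illustration and are not ci-admin's actual API; only the https://taskcluster.net URL comes from the comment above.

```python
from dataclasses import dataclass

# The legacy deployment named in the discussion.
LEGACY_ROOT_URL = "https://taskcluster.net"


@dataclass(frozen=True)
class WorkerType:
    name: str
    manager: str  # assumed labels: "aws-provisioner" or "worker-manager"


def select_worker_types(worker_types, root_url):
    """Return only the worker types that should be configured on this deployment.

    aws-provisioner worker types apply to the legacy deployment only;
    worker-manager worker types apply to every other deployment.
    """
    is_legacy = root_url.rstrip("/") == LEGACY_ROOT_URL
    wanted = "aws-provisioner" if is_legacy else "worker-manager"
    return [wt for wt in worker_types if wt.manager == wanted]
```

With this shape, the same configuration can describe both kinds of worker type, and the tool never tries to create an aws-provisioner workerType on a deployment that has no AWS Provisioner.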
The worker-types.yml in the fourth patch is generated from a script and, when run through ci-admin diff, produces no differences. AWS Provisioner has some janky bits in that there are a bunch of unused properties and lots of rarely-used properties that default to empty containers. So I added some light editing to the to_api/from_api functions and to the generator to allow worker-types.yml to omit all but the relevant details. With this in place, at least a single worker-type in that file should fit on a single editor screen.
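The "light editing" idea can be sketched roughly as follows. This is not the code in the patches: the DEFAULTS table below is an assumption covering only a few properties, and the function bodies are a guess at the general technique (drop values that equal their default when reading from the API, restore them when writing back).

```python
import copy

# Assumed defaults for rarely-used properties; the real AWS Provisioner
# worker type definition has more fields than are listed here.
DEFAULTS = {
    "secrets": {},
    "scopes": [],
    "userData": {},
    "availabilityZones": [],
}


def from_api(definition):
    """Drop properties that still hold their default value, giving a compact form."""
    return {
        key: value
        for key, value in definition.items()
        if key not in DEFAULTS or value != DEFAULTS[key]
    }


def to_api(compact):
    """Restore the omitted defaults so the result is a full API definition."""
    full = copy.deepcopy(DEFAULTS)
    full.update(copy.deepcopy(compact))
    return full
```

As long as the API definition contains every key in DEFAULTS, to_api(from_api(d)) equals d, which is the property that lets a diff against production come back empty.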
Future plans:
- update the docker-worker deployment process to modify ci-configuration instead of calling API methods directly
- bring generic-worker workers into the fold, adding them to ci-configuration and modifying occ and the generic-worker deploy process to modify ci-configuration instead of calling API methods directly
- configure worker-manager workerTypes with this tool, too
- possibly add code to support whatever we do to address the second point above ("simplify configuration ..")
Before I get into code review, I'd like a, say, 50% review from a broad swath of you as to the overall direction here. You can probably get that just by looking at D32085 and reading this comment.
Comment 12•6 years ago
|
||
I cursorily looked over the non-D32085 patches and they make sense to me. D32085 itself is really neat and something I've wanted for a looong time. It is quite easy to read compared to aws-provisioner UI, even without the improvements you plan on making.
On point 2 -- figuring out where to draw the line for what magic worker-manager provides will be quite interesting. I don't have a great idea of how that process will work in the cleanest way. I think most of the identifiers like sg-abcdef and ami-12345678 will come from tooling like terraform/packer/etc so keeping the magic in ci-admin based on config files actually still makes the most sense to me? I am not holding this position strongly however.
Comment 14•6 years ago
|
||
(In reply to Dustin J. Mitchell [:dustin] pronoun: he from comment #11)
The worker-types.yml in the fourth patch is generated from a script and when run through ci-admin diff produces no differences.
Is the script it was generated from included in the patchset?
Assignee | ||
Comment 15•6 years ago
|
||
Is the script it was generated from included in the patchset?
No, the script is a terrible hack job. I only mentioned that to indicate that (a) I can "rebase" this patchset over any worker-type modifications that occur before it lands and (b) this has been tested to have zero impact on production. I'll attach the script here but it's not suitable for landing anywhere.
Assignee | ||
Comment 16•6 years ago
|
||
used-workertypes.txt is the list of workertypes for which scopes are granted, and existing-workertypes.json is the set of existing workertypes downloaded out-of-band from the production AWS provisioner. The images-*.json files are lists of AMIs pulled from AWS with the AMI description in them.
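For readers wondering how these three inputs fit together, here is a hedged sketch of one way to join them. It is not the attached script. It assumes that the used names are a list of strings, that existing worker types are a mapping from name to definition with the AMI at regions[].launchSpec.ImageId, and that each image record has ImageId and Description keys; all of those shapes are assumptions.

```python
def ami_descriptions(used_names, existing, images):
    """Map each used worker type to the sorted descriptions of the AMIs it references.

    used_names: iterable of worker type names (as in used-workertypes.txt)
    existing:   dict of name -> worker type definition (assumed shape)
    images:     list of dicts with "ImageId" and "Description" keys (assumed shape)
    """
    by_id = {img["ImageId"]: img.get("Description", "") for img in images}
    result = {}
    for name in used_names:
        definition = existing.get(name)
        if definition is None:
            continue  # scopes are granted, but no such worker type exists
        ami_ids = {
            region["launchSpec"]["ImageId"]
            for region in definition.get("regions", [])
        }
        # Fall back to the raw AMI ID when no description is known.
        result[name] = sorted({by_id.get(ami, ami) for ami in ami_ids})
    return result
```

A worker type whose regions all use builds of the same image collapses to a single description, which is the kind of one-line-per-workerType summary shown in comment 6.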
Comment 17•6 years ago
|
||
(In reply to Dustin J. Mitchell [:dustin] pronoun: he from comment #11)
I would like to leave the management of AMI IDs to docker-worker. Today it is pretty simple and fast to rollback a deployment in case of a bustage. If we delegate to ci-admin, this would mean a new patch to revert the changes, get it reviewed and then someone with powers run it.
Comment 18•6 years ago
|
||
(In reply to Wander Lairson Costa [:wcosta] from comment #17)
I would like to leave the management of AMI IDs to docker-worker. Today it is pretty simple and fast to rollback a deployment in case of a bustage. If we delegate to ci-admin, this would mean a new patch to revert the changes, get it reviewed and then someone with powers run it.
How does this work in generic-worker?
I don't want to design this system around docker-worker, but if it also benefits generic-worker deployments, then we should consider it.
Assignee | ||
Comment 19•6 years ago
|
||
Wander and I chatted a little. There are some circumstances that allay the rollback concerns: rolling back a ci-configuration change is a normal and quick process (and can even be done without landing the rollback, since ci-admin is run by hand). Also, once staging is running in the next month or so, we'll be able to test -- both at the ami-test level and at the level of a full gecko push -- in staging, so hopefully there will be fewer calls for rollbacks. And, ci-admin can configure staging, too, so we can be confident it's running the same configuration.
We talked about how a docker-worker build process will feed into this configuration. Editing yaml files in place is not a great option. Wander suggested having ci-admin look into the GitHub releases (https://github.com/taskcluster/docker-worker/releases/, but via the API) to translate a version string in worker-types.yml into a set of AMI IDs. That's a little unusual for ci-admin since it means it's reaching out to an external service for deployment data, but maybe that's OK. Tom, what do you think? An alternative might be to reflect that release data into ci-configuration as static data, using some script to update that static data.
Comment 20•6 years ago
|
||
This is looking good. A few comments:
- whether or not worker-manager ends up having configuration for subnets and the like, I don't think we want to configure those by region in each worker type. Similarly, given that we only have a handful of sets of AMIs, I don't think we want those listed in the worker-types themselves. That suggests that we want an additional set of configurations for both of those, which get referenced by the worker-types.
- I'd rather not refer directly to github release artifacts, largely because they are mutable, and so while I don't expect them to be deliberately changed, their history isn't easily auditable. So, I'd prefer to have a script that pulls the AMI info and vendors it in the ci-config repository. And we can investigate automating running that with a scriptworker at some point in the future.
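Both suggestions can be pictured with a small sketch: worker types refer to a named image set, and the image sets are vendored static data rather than fetched at run time. The IMAGE_SETS table, the image_set and regions keys, and the resolve_images helper are all hypothetical, chosen only to show the shape of the idea.

```python
# Hypothetical vendored data, as a script might write it into the
# ci-configuration repository; the AMI IDs are placeholders.
IMAGE_SETS = {
    "docker-worker": {
        "us-east-1": "ami-12345678",
        "us-west-2": "ami-23456789",
    },
}


def resolve_images(worker_type, image_sets):
    """Expand a worker type's image-set reference into per-region AMI IDs."""
    name = worker_type["image_set"]
    try:
        amis = image_sets[name]
    except KeyError:
        raise ValueError(f"unknown image set: {name!r}") from None
    missing = [region for region in worker_type["regions"] if region not in amis]
    if missing:
        raise ValueError(f"image set {name!r} has no AMI for regions: {missing}")
    return {region: amis[region] for region in worker_type["regions"]}
```

Because the mapping lives in the repository, a change of AMI shows up as an ordinary reviewed diff, which addresses the auditability concern about mutable release artifacts.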
Assignee | ||
Comment 21•6 years ago
|
||
To the first point, I played around with that for a bit, and it's not clear how best to handle that, or where to stop with the abstraction. Since this is to support configs for a deprecated service (aws-provisioner), and since we don't have any history for why worker-type X is subtly different from worker-type Y, I don't think it's worth over-engineering the DRYness of the representation. It's also a little premature, as for example generic-worker uses a distinct AMI and distinct userdata for each workerType, and that won't abstract the same way.
In general, I see this as an initial beachhead into managing workertypes (or as we will call them soon, worker pools). There are a bunch of next-steps to take, and as we do so I'd like to stick to the easy verification of "changes nothing". In a few weeks, as we implement worker-manager, I'd like to simplify the config definitions within that tool, hopefully resulting in simpler configs in ci-configuration as well. We're still not sure how to do that.
Vendoring the release data sounds like a good idea, too. Since docker-worker is also not long for this world, I don't think there's a use in automating it.
Updated•6 years ago
|
Assignee | ||
Updated•6 years ago
|
Assignee | ||
Comment 22•6 years ago
|
||