Open Bug 1766815 Opened 3 years ago Updated 3 years ago

Automatically grant read access of taskcluster worker AMIs between FirefoxCI and Stage-TC AWS accounts

Categories

(Release Engineering :: Firefox-CI Administration, defect, P3)

Tracking

(Not tracked)

People

(Reporter: pmoore, Unassigned)

References

(Regression)

Details

Currently, Release Engineering perform smoke testing on the Stage-TC environment when there is a new release of Taskcluster. See http://docs.mozilla-releng.net/en/latest/taskcluster/tc_staging.html#run-fxci-to-send-mozilla-central-tasks-to-the-staging-cluster for more details.

Recent FirefoxCI pushes are tested against Stage-TC, and therefore Stage-TC Worker Manager's AWS Provider AWS account (710952102342) needs read/launch permission to a subset of FirefoxCI Worker Manager's AWS account (692406183521) production images.

Either:

  1. After creating worker AMIs in AWS account 692406183521, both OCC and the docker-worker AMI generation process should grant AWS account 710952102342 access to them, or
  2. The fxci tool that currently copies Taskcluster worker pool definitions, etc.[1] between FirefoxCI and Stage-TC should take care of sharing the worker AMIs.

I would propose option 2). The fxci tool already synchronises state between the two clusters, so this would keep all of the synchronisation code in a single location. For example, if at runtime it were discovered that certain images hadn't been shared between the AWS accounts, the problem would be harder to solve if a separate process were responsible for sharing the images. The fxci tool could simply scan the worker pool definitions that it has copied across to staging, extract the AMI IDs, and then execute aws commands[2] to ensure the AWS staging account has appropriate access to launch instances using those AMIs.

I'm not sure which subset of machine images from SCM levels 1, 2 and 3 needs to be shared, but that should be determinable from the docs linked above or, if it isn't clear, from the Release Engineering team.
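
A minimal sketch of the scan proposed in option 2, not taken from the fxci codebase. It assumes each AWS worker pool definition is a dict whose "config" holds a "launchConfigs" list, with each entry carrying "region" and "launchConfig" -> "ImageId"; that layout and the function names (extract_amis, share_command, share_amis) are illustrative assumptions. Which pools to pass in (for example, level 1 only) is left to the caller, since that question is still open.

```python
import subprocess

# Stage-TC Worker Manager AWS account, as given in this bug.
STAGING_ACCOUNT_ID = "710952102342"


def extract_amis(worker_pools):
    """Return sorted, de-duplicated (region, ami_id) pairs.

    Assumed layout (hypothetical, check against real definitions):
    pool["config"]["launchConfigs"][i]["region"] and
    pool["config"]["launchConfigs"][i]["launchConfig"]["ImageId"].
    """
    pairs = set()
    for pool in worker_pools:
        for entry in pool.get("config", {}).get("launchConfigs", []):
            region = entry.get("region")
            ami_id = entry.get("launchConfig", {}).get("ImageId")
            if region and ami_id:
                pairs.add((region, ami_id))
    return sorted(pairs)


def share_command(region, ami_id, account_id=STAGING_ACCOUNT_ID):
    """Build the argv for the aws command quoted in footnote [2]."""
    return [
        "aws", "--region", region,
        "ec2", "modify-image-attribute",
        "--image-id", ami_id,
        "--launch-permission", f"Add=[{{UserId={account_id}}}]",
    ]


def share_amis(worker_pools, dry_run=True):
    """Build one share command per AMI; run them only if dry_run is False."""
    commands = [share_command(r, a) for r, a in extract_amis(worker_pools)]
    if not dry_run:
        for cmd in commands:
            # Requires credentials for the account that owns the AMIs.
            subprocess.run(cmd, check=True)
    return commands
```

With dry_run left at its default, share_amis only returns the commands it would run, so the list can be reviewed before anyone with access to the FirefoxCI AWS account executes it.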


[1] Based on this, it may be that https://hg.mozilla.org/build/braindump/file/8b5a9f4328615aef1635cfe16c86b430e8a40b4b/taskcluster/copy_secrets_to_staging.py, rather than fxci, is used for copying secrets to staging; I'm not sure.
[2] The command to share an AMI is:

aws --region "${REGION}" ec2 modify-image-attribute --image-id "${AMI_ID}" --launch-permission 'Add=[{UserId=710952102342}]'
Regressed by: 1766518
Severity: -- → S3
Priority: -- → P3

a) Sharing level 1 images only works and seems to avoid security concerns.
b) Releng is the primary user of the fxci tool, and doesn't (shouldn't?) have access to the AWS accounts. If we add an fxci subcommand it will likely be a different team that will use it.
