Closed Bug 1479601 Opened 7 years ago Closed 7 years ago

Create an IAM role for testing data platform services e.g. telemetry-airflow

Categories

(Data Platform and Tools Graveyard :: Operations, enhancement, P1)

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: amiyaguchi, Assigned: hwoo)

Details

(Whiteboard: [DataOps])

Attachments

(1 file)

It would be helpful to have a role for testing new jobs that are added to airflow. This role should have r/w access to `telemetry-batch-view`, read-only access to the rest of the buckets, and the related permissions needed to spin up EMR machines. This would make it easier to avoid accidentally overwriting production data while testing changes to the DAG.
Correction, the bucket should be `telemetry-test-bucket`.
Assignee: nobody → hwoo
Whiteboard: [DataOps]
I need a little more information because I don't know where everything lives.
- Do you need an IAM role (ARN) or a user (AWS keypair)? I'm assuming this is for the cloudservices-aws-dev account because the EMR nodes for prod wtmo are spawned there? Do you know which AWS account the telemetry-test-bucket lives in?
- Which other buckets need read-only access? And in which AWS accounts?
- I am not sure how production testing is done currently. Do we distinguish between production and dev testing only by using a different S3 bucket (with the same prod wtmo)?
Priority: -- → P1
Yes, this is for the cloudservices-aws-dev account, but I don't know which account telemetry-test-bucket lives in. The other buckets of interest are:
* telemetry-parquet
* net-mozaws-prod-us-2-pipeline-analysis
* net-mozaws-data-us-west-2-ops-ci-artifacts
Testing is distinguished by the DEPLOY_ENVIRONMENT variable that's passed to airflow, which sets the public/private buckets appropriately. I test by setting the credentials in the Dockerfile.dev and starting a DAG.
It turns out that your dev AWS keys already have all the required permissions in the dev account.
* telemetry-test-bucket - is in the dev account
* telemetry-parquet - is in the dev account, with full permissions. If this is a problem, we should discuss options.
* net-mozaws-prod-us-2-pipeline-analysis - does not exist
* net-mozaws-data-us-west-2-ops-ci-artifacts - bucket is public, with read-only permissions for everyone
* net-mozaws-prod-us-west-2-pipeline-analysis - in aws prod but not public
After discussing with Anthony, not having access to the pipeline bucket is okay. Cross-account S3 access is tricky since the dev root user already needs RW and we only want RO for this use case. Setting up STS assume-role is an option, but a pain.
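For reference, the STS assume-role option would roughly need two policy documents on a role in the prod account: a trust policy that lets the dev account assume the role, and a permissions policy that grants read-only access to the pipeline bucket. The following is only a sketch of those documents, not something that was deployed; the account ID is a hypothetical placeholder and the function names are made up for illustration.

```python
import json

# Hypothetical placeholder; the real dev account ID is not assumed here.
DEV_ACCOUNT_ID = "111111111111"
# Bucket name taken from the discussion above.
PIPELINE_BUCKET = "net-mozaws-prod-us-west-2-pipeline-analysis"


def trust_policy(trusted_account_id):
    """Trust policy allowing principals in another account to assume the role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"AWS": f"arn:aws:iam::{trusted_account_id}:root"},
                "Action": "sts:AssumeRole",
            }
        ],
    }


def read_only_s3_policy(bucket):
    """Permissions policy granting list and get (no write) on one bucket."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": [f"arn:aws:s3:::{bucket}"],
            },
            {
                "Effect": "Allow",
                "Action": ["s3:GetObject"],
                "Resource": [f"arn:aws:s3:::{bucket}/*"],
            },
        ],
    }


if __name__ == "__main__":
    print(json.dumps(trust_policy(DEV_ACCOUNT_ID), indent=2))
    print(json.dumps(read_only_s3_policy(PIPELINE_BUCKET), indent=2))
```

The dev side would additionally need permission to call sts:AssumeRole on that role, which is part of why this route is more setup than the alternatives.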
Regarding telemetry-parquet, we can modify the group that developers inherit to include a policy that explicitly denies write privileges on the S3 bucket 'telemetry-parquet'. The only problem is that some backfills are done using developer keys. So we can either leave things as they are, or create special backfill keys that your team has access to and restrict the existing dev accounts.
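As a rough illustration of the explicit-deny idea: in IAM policy evaluation an explicit Deny overrides any Allow, so attaching a statement like the one below to the developers' group would block writes even though the dev keys otherwise have full permissions. This is a sketch only; the Sid and the exact set of denied actions are assumptions, not the policy that was (or would be) applied.

```python
import json


def deny_write_policy(bucket):
    """Policy document that explicitly denies object writes and deletes on a bucket."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyWritesToBucket",  # hypothetical statement ID
                "Effect": "Deny",
                "Action": ["s3:PutObject", "s3:DeleteObject"],
                "Resource": [f"arn:aws:s3:::{bucket}/*"],
            }
        ],
    }


if __name__ == "__main__":
    print(json.dumps(deny_write_policy("telemetry-parquet"), indent=2))
```

Backfills run with developer keys would be blocked by this as well, which is the trade-off described above.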
Flags: needinfo?(amiyaguchi)
There is a configurable EMR_SERVICE_ROLE variable, currently set to EMR_DefaultRole in the dev Dockerfile. Is it possible to create an EMR service role for Airflow? It looks like the launched EMR clusters can assume a role with limited capacity, which avoids having to apply a group policy to the users.[1] The current ideas for access control seem to involve quite a bit of setup or a change in process.
[1] https://docs.aws.amazon.com/emr/latest/ManagementGuide/emr-iam-roles.html
Flags: needinfo?(amiyaguchi) → needinfo?(hwoo)
See Also: → 1475630
Did you mean: is it possible to create an IAM role for local airflow that is different from EMR_DefaultRole? I've created arn:aws:iam::927034868273:policy/EMR_devrole. This has similar permissions to EMR_DefaultRole, except that it has write access to the telemetry-test-bucket instead of the telemetry-aggregates bucket. This still doesn't fix the fact that dev keys are unrestricted. But if this works for you, I'm okay with it. thx
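The contents of EMR_devrole are not shown in this bug. Purely as a sketch of what the S3 portion of such a policy might look like, the snippet below builds read/write statements for telemetry-test-bucket and includes a naive helper that lists which buckets a policy allows writes to. The function names are hypothetical, and the helper does no wildcard or Deny handling, so it is not a substitute for real IAM policy evaluation.

```python
def read_write_s3_policy(bucket):
    """Policy document granting list, read, write and delete on one bucket."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": [f"arn:aws:s3:::{bucket}"],
            },
            {
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:PutObject", "s3:DeleteObject"],
                "Resource": [f"arn:aws:s3:::{bucket}/*"],
            },
        ],
    }


def writable_buckets(policy):
    """Naively collect bucket names from Allow statements that include s3:PutObject."""
    buckets = set()
    for statement in policy["Statement"]:
        if statement["Effect"] != "Allow":
            continue
        if "s3:PutObject" not in statement["Action"]:
            continue
        for resource in statement["Resource"]:
            # "arn:aws:s3:::bucket/key" -> "bucket"
            buckets.add(resource.split(":::", 1)[1].split("/", 1)[0])
    return buckets


if __name__ == "__main__":
    dev_policy = read_write_s3_policy("telemetry-test-bucket")
    print(sorted(writable_buckets(dev_policy)))
```

A check like this can serve as a quick sanity test that a dev policy does not grant writes to a production bucket such as telemetry-aggregates.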
Flags: needinfo?(hwoo)
Blocks: 1475630
See Also: 1475630
No longer blocks: 1475630
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Product: Data Platform and Tools → Data Platform and Tools Graveyard