Closed
Bug 1479601
Opened 7 years ago
Closed 7 years ago
Create an IAM role for testing data platform services e.g. telemetry-airflow
Categories
(Data Platform and Tools Graveyard :: Operations, enhancement, P1)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: amiyaguchi, Assigned: hwoo)
Details
(Whiteboard: [DataOps])
Attachments
(1 file)
It would be helpful to have a role for testing new jobs that are added to airflow. This role should have r/w access to `telemetry-batch-view`, read-only access to the rest of the buckets, and related permissions to spin up EMR machines.
This would make it easier to avoid accidentally overwriting production data while testing changes to the DAG.
Reporter | ||
Comment 1•7 years ago
|
||
Correction, the bucket should be `telemetry-test-bucket`.
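For reference, the requested access could be written down roughly as the IAM policy document below. This is only a sketch: the statement IDs, the exact action lists, and the `build_test_role_policy` helper are assumptions for illustration, and the EMR permissions mentioned in the description are left out.

```python
import json

# Bucket named in the correction above.
TEST_BUCKET = "telemetry-test-bucket"


def build_test_role_policy(rw_bucket: str = TEST_BUCKET) -> dict:
    """Return a policy document: read/write on one bucket, read-only elsewhere.

    The Sids and action lists are illustrative assumptions, not the policy
    that was actually deployed.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Object-level read/write, limited to the test bucket.
                "Sid": "ReadWriteTestBucket",
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:PutObject", "s3:DeleteObject"],
                "Resource": f"arn:aws:s3:::{rw_bucket}/*",
            },
            {
                # Read-only access to every other bucket in the account.
                "Sid": "ReadOnlyAllBuckets",
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:ListBucket"],
                "Resource": ["arn:aws:s3:::*"],
            },
        ],
    }


if __name__ == "__main__":
    print(json.dumps(build_test_role_policy(), indent=2))
```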
Updated•7 years ago
|
Assignee: nobody → hwoo
Whiteboard: [DataOps]
Assignee | ||
Comment 2•7 years ago
|
||
I need a little more information because I don't know where everything lives.
- Do you need an IAM role (ARN) or a user (AWS key pair)? I'm assuming this is for the cloudservices-aws-dev account, because the EMR nodes for prod wtmo are spawned there? Do you know which AWS account the telemetry-test-bucket lives in?
- Which other buckets need read-only access? And in which AWS accounts?
- I am not sure how production testing is done currently. Do we distinguish between production and dev testing only by using a different S3 bucket (with the same prod wtmo)?
Assignee | ||
Updated•7 years ago
|
Priority: -- → P1
Reporter | ||
Comment 3•7 years ago
|
||
Yes, this is for the cloudservices-aws-dev account, but I don't know which account telemetry-test-bucket lives in.
The other buckets of interest are
* telemetry-parquet
* net-mozaws-prod-us-2-pipeline-analysis
* net-mozaws-data-us-west-2-ops-ci-artifacts
And testing is distinguished by the DEPLOY_ENVIRONMENT variable that's passed to airflow. This will set the public/private buckets appropriately in airflow. I test by setting the credentials in the Dockerfile.dev and starting a DAG.
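As a rough illustration of that switch, bucket selection keyed on DEPLOY_ENVIRONMENT could look like the sketch below. The bucket names, the `dev` default, and the `buckets_for` helper are all hypothetical placeholders; the real mapping lives in the airflow configuration and may differ.

```python
import os

# Hypothetical bucket names, for illustration only.
BUCKETS = {
    "dev": {"private": "example-dev-private", "public": "example-dev-public"},
    "prod": {"private": "example-prod-private", "public": "example-prod-public"},
}


def buckets_for(environ=os.environ) -> dict:
    """Pick the public/private bucket pair from DEPLOY_ENVIRONMENT.

    Falling back to "dev" when the variable is unset is an assumption made
    here so that a missing variable can never select production buckets.
    """
    env = environ.get("DEPLOY_ENVIRONMENT", "dev")
    if env not in BUCKETS:
        raise ValueError(f"unknown DEPLOY_ENVIRONMENT: {env!r}")
    return BUCKETS[env]
```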
Assignee | ||
Comment 4•7 years ago
|
||
It turns out that your dev aws keys already have all the required permissions in the dev account.
* telemetry-test-bucket - is in the dev account
* telemetry-parquet - is in the dev account - with full permissions. If this is a problem, we should discuss options.
* net-mozaws-prod-us-2-pipeline-analysis - does not exist
* net-mozaws-data-us-west-2-ops-ci-artifacts - bucket is public, with readonly permissions to everyone
Assignee | ||
Comment 5•7 years ago
|
||
* net-mozaws-prod-us-west-2-pipeline-analysis - in aws prod but not public
Assignee | ||
Comment 6•7 years ago
|
||
After discussing with Anthony, not having access to the pipeline bucket is okay. The cross-account S3 access is tricky, since the dev root user already needs RW and we only want RO for this use case. Setting up STS AssumeRole is an option, but a pain.
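For the record, the STS option would need two documents on a role in the prod account: a trust policy naming the dev account, and a read-only permissions policy for the bucket. The sketch below builds both; the account ID is a placeholder, and the `build_cross_account_role` helper and statement IDs are assumptions, not something that was set up.

```python
def build_cross_account_role(trusted_account_id: str, bucket: str):
    """Return (trust_policy, permissions_policy) for a read-only cross-account role.

    trusted_account_id is the account allowed to assume the role; the value
    used in examples is a placeholder, not a real account ID.
    """
    trust_policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowDevAccountToAssume",
                "Effect": "Allow",
                "Principal": {"AWS": f"arn:aws:iam::{trusted_account_id}:root"},
                "Action": "sts:AssumeRole",
            }
        ],
    }
    permissions_policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "ReadOnlyBucket",
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:ListBucket"],
                # ListBucket applies to the bucket ARN, GetObject to object ARNs.
                "Resource": [
                    f"arn:aws:s3:::{bucket}",
                    f"arn:aws:s3:::{bucket}/*",
                ],
            }
        ],
    }
    return trust_policy, permissions_policy
```

A caller in the dev account would then request temporary credentials with `sts:AssumeRole` (for example boto3's `assume_role` with `RoleArn` and `RoleSessionName`) before reading from the bucket, which is the extra step that makes this approach more work.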
Assignee | ||
Comment 7•7 years ago
|
||
Regarding telemetry-parquet, we can modify the group that developers inherit from so that it has a policy which explicitly denies write privileges on the S3 bucket 'telemetry-parquet'. The only problem is that some backfills are done using developer keys.
So we can either leave things as is, or create some special backfill keys that your team has access to and restrict the existing dev accounts.
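A sketch of what such an explicit deny could look like, together with a deliberately simplified evaluator showing why it overrides the existing broad allow. The statement IDs, the action list, and the `is_allowed` helper are assumptions; real IAM evaluation has more rules (conditions, case-insensitive action names, resource policies) than this toy version.

```python
from fnmatch import fnmatchcase

# Hypothetical deny statement for the developers group.
DENY_PARQUET_WRITES = {
    "Sid": "DenyTelemetryParquetWrites",
    "Effect": "Deny",
    "Action": ["s3:PutObject", "s3:DeleteObject"],
    "Resource": "arn:aws:s3:::telemetry-parquet/*",
}

# Stand-in for the broad permissions the dev keys already have.
ALLOW_ALL_S3 = {
    "Sid": "AllowAllS3",
    "Effect": "Allow",
    "Action": ["s3:*"],
    "Resource": "*",
}


def _as_list(value):
    """Policy fields may be a single string or a list of strings."""
    return [value] if isinstance(value, str) else list(value)


def _matches(statement, action, resource):
    """True if the statement's Action and Resource patterns both match."""
    return any(fnmatchcase(action, p) for p in _as_list(statement["Action"])) and any(
        fnmatchcase(resource, p) for p in _as_list(statement["Resource"])
    )


def is_allowed(statements, action, resource):
    """Simplified evaluation: explicit deny wins, otherwise an allow is required."""
    matching = [s for s in statements if _matches(s, action, resource)]
    if any(s["Effect"] == "Deny" for s in matching):
        return False
    return any(s["Effect"] == "Allow" for s in matching)
```

With both statements attached, writes to telemetry-parquet are refused while reads there and writes to other buckets still go through, which is also why backfills run with developer keys would stop working.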
Flags: needinfo?(amiyaguchi)
Reporter | ||
Comment 8•7 years ago
|
||
There is a configurable EMR_SERVICE_ROLE variable, currently set to EMR_DefaultRole in the dev Dockerfile. Is it possible to create an EMR service role for Airflow? It looks like the launched EMR clusters can assume a role with limited capacity, which avoids having to apply a group policy to the users.[1]
The current ideas for access control seem to involve quite a bit of setup or a change in process.
[1] https://docs.aws.amazon.com/emr/latest/ManagementGuide/emr-iam-roles.html
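To make the idea concrete, here is a minimal sketch of how the variable could feed the cluster launch. The `emr_role_kwargs` helper is hypothetical; `ServiceRole` is the parameter name that boto3's EMR `run_job_flow` call accepts, and the fallback mirrors the current EMR_DefaultRole setting.

```python
import os


def emr_role_kwargs(environ=os.environ) -> dict:
    """Build the role-related keyword arguments for an EMR cluster launch.

    Reads EMR_SERVICE_ROLE and falls back to EMR_DefaultRole, matching the
    value currently set in the dev Dockerfile.
    """
    return {"ServiceRole": environ.get("EMR_SERVICE_ROLE", "EMR_DefaultRole")}


# Intended use (not executed here, since it needs AWS credentials):
#   boto3.client("emr").run_job_flow(Name=..., Instances=..., **emr_role_kwargs())
```

Switching to a more restricted role for testing would then only require changing the environment variable, not the users' group policy.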
Flags: needinfo?(amiyaguchi) → needinfo?(hwoo)
Assignee | ||
Comment 9•7 years ago
|
||
Did you mean: is it possible to create an IAM role for local airflow that is different from EMR_DefaultRole?
I've created arn:aws:iam::927034868273:policy/EMR_devrole. This has similar permissions to EMR_DefaultRole except it has write access to the telemetry-test-bucket instead of the telemetry-aggregates bucket.
This still doesn't fix the fact that dev keys are unrestricted. But if this works for you I'm okay with it.
thx
Flags: needinfo?(hwoo)
Updated•2 years ago
|
Product: Data Platform and Tools → Data Platform and Tools Graveyard