Set up an S3 bucket with read/write permissions for storing intermediate and result data



Cloud Services
Metrics: Pipeline
3 years ago
3 years ago


(Reporter: mreid, Unassigned)






3 years ago
We need a bucket in which to store report outputs, intermediate analysis data, results, and other derived data sets.

The bucket should be readable and writable by users of telemetry-dash.m.o (Spark in particular), as well as by the pipeline-dev-iam-access-IamRole-UVGYDHTV1VZD role in "new dev".

Comment 1

3 years ago
Would it be possible to expedite this? All of my v4 validation work requires consolidating v4 data by clientId, deduping, etc., which can take hours. Since telemetry self-serve analysis clusters are killed every 24 hours, I have to redo this *every day*, which means that on many days I can't get around to doing actual work.

Until Mark has built the tools to provide consolidated v4 data, I really need a place to dump my cleaned data sets, so that I only have to run these cleaning scripts once a week or every few days and can work on the cleaned data without reprocessing it every time. Can we aim to have this done this week?
Priority: -- → P1

Comment 2

3 years ago
The bucket net-mozaws-prod-us-west-2-pipeline-analysis has been created for this purpose. It should grant s3:GetBucketLocation, s3:ListBucket, s3:PutObject, s3:GetObject, and s3:DeleteObject permissions to the telemetry-spark-emr role in old dev and the pipeline-dev-iam-access-IamRole-UVGYDHTV1VZD role in new dev, in addition to Saptarshi's IAM.
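
For reference, a grant like the one described above could be expressed as a bucket policy along the following lines. This is only a sketch, not necessarily how the bucket was actually configured: the account IDs in the ARNs are placeholders, the statement IDs are made up, and the same access could equally be attached to the roles themselves. The one detail that is not an assumption is the split between bucket-level actions (s3:GetBucketLocation, s3:ListBucket), which apply to the bucket ARN, and object-level actions, which apply to the keys under it.

```python
import json

BUCKET = "net-mozaws-prod-us-west-2-pipeline-analysis"


def build_bucket_policy(principal_arns):
    """Return a bucket policy dict granting the five actions listed above."""
    principals = {"AWS": list(principal_arns)}
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Bucket-level actions apply to the bucket ARN itself.
                "Sid": "BucketLevelAccess",  # hypothetical statement ID
                "Effect": "Allow",
                "Principal": principals,
                "Action": ["s3:GetBucketLocation", "s3:ListBucket"],
                "Resource": "arn:aws:s3:::" + BUCKET,
            },
            {
                # Object-level actions apply to keys inside the bucket.
                "Sid": "ObjectLevelAccess",  # hypothetical statement ID
                "Effect": "Allow",
                "Principal": principals,
                "Action": ["s3:PutObject", "s3:GetObject", "s3:DeleteObject"],
                "Resource": "arn:aws:s3:::" + BUCKET + "/*",
            },
        ],
    }


# Placeholder account IDs: the real "old dev" and "new dev" account
# numbers are not given in this bug.
EXAMPLE_PRINCIPALS = [
    "arn:aws:iam::111111111111:role/telemetry-spark-emr",
    "arn:aws:iam::222222222222:role/pipeline-dev-iam-access-IamRole-UVGYDHTV1VZD",
]

if __name__ == "__main__":
    print(json.dumps(build_bucket_policy(EXAMPLE_PRINCIPALS), indent=2))
```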

Comment 3

3 years ago
For now, please prefix any temp/intermediate data with your username to avoid conflicts and help keep things organized. For example: s3://net-mozaws-prod-us-west-2-pipeline-analysis/mreid/awesome_data_set_3/...
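
One way to keep to that convention in analysis scripts is to build output locations through a small helper rather than typing them by hand. The sketch below is illustrative only: the function name user_prefixed_path is made up for this example and is not part of any existing telemetry tooling.

```python
BUCKET = "net-mozaws-prod-us-west-2-pipeline-analysis"


def user_prefixed_path(username, dataset, bucket=BUCKET):
    """Build s3://<bucket>/<username>/<dataset>/ for intermediate data."""
    # Strip stray slashes so callers can pass "mreid" or "mreid/".
    username = username.strip("/")
    dataset = dataset.strip("/")
    if not username or not dataset:
        raise ValueError("both username and dataset must be non-empty")
    return "s3://{}/{}/{}/".format(bucket, username, dataset)


if __name__ == "__main__":
    print(user_prefixed_path("mreid", "awesome_data_set_3"))
```

The resulting string can then be used wherever a script writes or reads its cleaned data, so every data set ends up under its owner's prefix.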

Comment 4

3 years ago
Brendan, please reopen if you have any issues using the bucket for intermediate storage.
Last Resolved: 3 years ago
Resolution: --- → FIXED