Component: General → Tiles: Ops
Product: Cloud Services → Content Services
Unfortunately, this has proven more difficult than expected. The Redshift cluster is in our prod IAM, while the Spark clusters are in our dev IAM. This means that firewall access is not as simple as adding a security group rule on both ends: traffic between them goes over the internet, and the Spark clusters do not have fixed EIPs. We have two options here: 1) we can create daily rollup files in S3 and give Spark access to those; 2) we can set up some sort of NAT so that Spark reaches Redshift from a consistent source IP. I think 1 will take less effort to set up, but I don't know what your requirements are for this.
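For illustration only, here is a minimal sketch of what a daily rollup for option 1 could look like if it were done with Redshift's UNLOAD statement. This is an assumption, not a description of what was actually deployed. The table name, the `date` column, the bucket, the key layout under `tiles/`, and the IAM role ARN are all hypothetical placeholders.

```python
from datetime import date


def build_unload_sql(table: str, day: date, bucket: str, iam_role: str) -> str:
    """Build a Redshift UNLOAD statement that exports one day of one table to S3.

    All arguments are hypothetical placeholders; the `date` column name is an
    assumption about the table schema.
    """
    day_str = day.isoformat()
    query = f"SELECT * FROM {table} WHERE date = '{day_str}'"
    # UNLOAD takes the query as a quoted string, so single quotes inside it
    # are escaped by doubling them.
    escaped_query = query.replace("'", "''")
    # Hypothetical key layout: one prefix per table per day.
    target = f"s3://{bucket}/tiles/{table}/{day_str}/"
    return (
        f"UNLOAD ('{escaped_query}') "
        f"TO '{target}' "
        f"IAM_ROLE '{iam_role}' "
        "GZIP ALLOWOVERWRITE;"
    )


if __name__ == "__main__":
    print(
        build_unload_sql(
            "example_table",
            date(2016, 1, 2),
            "example-bucket",
            "arn:aws:iam::123456789012:role/example-unload-role",
        )
    )
```

The statement would be run against Redshift by a scheduled job. Spark then only needs read access to the S3 prefix, so no network path to the cluster is required.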
Hi Daniel, sorry about the delay here and thanks for looking into this. Option 1 sounds fine to me. Are there any more details you'd need from me to follow through on it?
Not yet. I'll look into it and let you know here if I run into issues.
Hey relud, I just wanted to follow up on this and see what the status is. Thanks!
Sorry, this fell off my to-do list. Here it is:
https://github.com/mozilla-services/puppet-config/pull/2222
https://github.com/mozilla-services/svcops/pull/1209
It's up in stage, so you can see some stage outputs with:
> aws s3 ls --recursive s3://net-mozaws-stage-us-east-1-pipeline-analysis/tiles/
In prod it would be:
> aws s3 ls --recursive s3://net-mozaws-prod-us-west-2-pipeline-analysis/tiles/
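To check the exports from a script rather than by hand, the listing commands above can be wrapped roughly as follows. This is a sketch that assumes the `aws` CLI is installed and has credentials with read access to the bucket. The bucket names and the `tiles/` prefix come from the comment above; the function names are made up for illustration.

```python
import subprocess

# Bucket names as given in the comment above.
BUCKETS = {
    "stage": "net-mozaws-stage-us-east-1-pipeline-analysis",
    "prod": "net-mozaws-prod-us-west-2-pipeline-analysis",
}


def tiles_prefix(env: str) -> str:
    """Return the S3 prefix holding the tiles exports for 'stage' or 'prod'."""
    return f"s3://{BUCKETS[env]}/tiles/"


def list_exports(env: str) -> list[str]:
    """Run `aws s3 ls --recursive` on the tiles prefix and return its output lines."""
    result = subprocess.run(
        ["aws", "s3", "ls", "--recursive", tiles_prefix(env)],
        check=True,           # raise if the CLI exits non-zero (e.g. access denied)
        capture_output=True,
        text=True,
    )
    return result.stdout.splitlines()


if __name__ == "__main__":
    for line in list_exports("stage"):
        print(line)
```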
Summary: Setup security group + permissions for tiles redshift to be used with spark → Setup exports for tiles redshift data to be used with spark
Fixed the table exports: https://github.com/mozilla-services/puppet-config/pull/2224
This is exporting as expected in prod.
Status: NEW → RESOLVED
Last Resolved: 2 years ago
Resolution: --- → FIXED