Closed
Bug 1590825
Opened 6 years ago
Closed 5 years ago
Migrate TAAR-Lite, TAAR Similarity, Collaborative and Dynamo jobs to GCP
Categories
(Data Platform and Tools Graveyard :: Operations, task)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: vng, Assigned: vng)
Attachments
(1 file)
These are the last 3 jobs that need to be migrated to GCP from AWS+EMR.
Requirements:
Similarity
This is a PySpark job.
S3 reads:
- s3://telemetry-parquet/clients_daily/v6/
- s3://telemetry-parquet/telemetry-ml/addon_recommender/
S3 writes:
- s3://telemetry-parquet/taar/similarity/
Collaborative job
This is a Scala Spark job. An experimental PySpark branch exists but
has not been deployed because of a breaking Spark change in EMR.
Let's try the PySpark port again with GCP and its version of Spark.
S3 reads:
- s3://telemetry-parquet/clients_daily/v6/
- s3://telemetry-parquet/telemetry-ml/addon_recommender/
S3 writes:
- s3://telemetry-private-analysis-2/telemetry-ml/addon_recommender/
Dynamo job
S3 reads:
- s3://telemetry-parquet/clients_daily/v6/
Dynamo writes to the production instance:
- Region: us-west-2
- Table: taar_addon_data_20180206
- IAM Role: "arn:aws:iam::361527076523:role/taar-write-dynamodb-from-dev"
Note that the same table exists in both dev and prod, but we only care
about writing to prod.
The biggest risk here is the Collaborative job, as the PySpark rewrite
was never deployed because of some breaking Spark changes in AWS EMR.
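The I/O inventory above can be captured as data so that each S3 path gets an explicit GCS counterpart during the migration. This is only a minimal sketch: the S3 paths are copied from the requirements, but the GCS bucket names in `BUCKET_MAP` and the helper names `to_gcs` and `distinct_s3_paths` are hypothetical placeholders, not names from this bug.

```python
from urllib.parse import urlparse

# S3 reads and writes per job, copied from the requirements in this bug.
JOB_IO = {
    "similarity": {
        "reads": [
            "s3://telemetry-parquet/clients_daily/v6/",
            "s3://telemetry-parquet/telemetry-ml/addon_recommender/",
        ],
        "writes": ["s3://telemetry-parquet/taar/similarity/"],
    },
    "collaborative": {
        "reads": [
            "s3://telemetry-parquet/clients_daily/v6/",
            "s3://telemetry-parquet/telemetry-ml/addon_recommender/",
        ],
        "writes": [
            "s3://telemetry-private-analysis-2/telemetry-ml/addon_recommender/"
        ],
    },
    "dynamo": {
        "reads": ["s3://telemetry-parquet/clients_daily/v6/"],
        "writes": [],  # this job writes to DynamoDB, not S3
    },
}

# ASSUMPTION: placeholder GCS bucket names; the real targets are not given here.
BUCKET_MAP = {
    "telemetry-parquet": "example-gcs-parquet",
    "telemetry-private-analysis-2": "example-gcs-private",
}


def to_gcs(s3_uri, bucket_map=BUCKET_MAP):
    """Rewrite an s3:// URI to gs:// using the bucket mapping, keeping the key."""
    parsed = urlparse(s3_uri)
    if parsed.scheme != "s3":
        raise ValueError(f"not an S3 URI: {s3_uri}")
    return f"gs://{bucket_map[parsed.netloc]}{parsed.path}"


def distinct_s3_paths(job_io=JOB_IO):
    """Return the sorted set of S3 paths touched by any job."""
    paths = set()
    for io in job_io.values():
        paths.update(io["reads"])
        paths.update(io["writes"])
    return sorted(paths)


if __name__ == "__main__":
    for path in distinct_s3_paths():
        print(path, "->", to_gcs(path))
```

Listing the distinct paths first makes it easier to see that all three jobs share the clients_daily input, so that dataset only needs to be made available in GCS once.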
Updated•6 years ago
Assignee: nobody → vng
Updated•6 years ago
Blocks: data-migration-gcp-misc
Updated•6 years ago
No longer blocks: data-migration-gcp-misc
Updated•6 years ago
Summary: Migrate TAAR Similarity, Collaborative and Dynamo jobs to GCP → Migrate TAAR-Lite, TAAR Similarity, Collaborative and Dynamo jobs to GCP
Comment 1•6 years ago
I will handle running the Scala Collaborative job with Dataproc and GCS, without migrating to PySpark.
Comment 2•5 years ago
Updated•5 years ago
Status: NEW → RESOLVED
Closed: 5 years ago
Resolution: --- → FIXED
Updated•2 years ago
Product: Data Platform and Tools → Data Platform and Tools Graveyard