Closed Bug 1590825 Opened 6 years ago Closed 5 years ago

Migrate TAAR-Lite, TAAR Similarity, Collaborative and Dynamo jobs to GCP

Categories

(Data Platform and Tools Graveyard :: Operations, task)

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: vng, Assigned: vng)

Attachments

(1 file)

These are the last 3 jobs that need to be migrated to GCP from AWS+EMR.

Requirements:

Similarity

This is a PySpark job.
S3 reads:

  • s3://telemetry-parquet/clients_daily/v6/
  • s3://telemetry-parquet/telemetry-ml/addon_recommender/

S3 writes:

  • s3://telemetry-parquet/taar/similarity/

Collaborative job

This is a Scala Spark job. An experimental PySpark branch exists but
has not been deployed because of a breaking Spark change in EMR.
Let's try the PySpark port again with GCP and its version of Spark.

S3 reads:

  • s3://telemetry-parquet/clients_daily/v6/
  • s3://telemetry-parquet/telemetry-ml/addon_recommender/

S3 writes:

  • s3://telemetry-private-analysis-2/telemetry-ml/addon_recommender/

Dynamo job

S3 reads:

  • s3://telemetry-parquet/clients_daily/v6/

Dynamo writes to the production instance:

  • Region: us-west-2
  • Table: taar_addon_data_20180206
  • IAM Role: "arn:aws:iam::361527076523:role/taar-write-dynamodb-from-dev"

Note that the same table exists in both dev and prod, but we only care
about writing to prod.
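
As a rough illustration of the Dynamo requirement above, here is a minimal sketch of writing to the production table from outside AWS. The region, table name, and role ARN come from this bug. Everything else is an assumption: that the job uses boto3, that it obtains credentials by assuming the role through STS, the session name, and the function names. The item layout of the table is not described here, so `write_items` takes whatever dictionaries the caller supplies.

```python
# Values taken from the requirements in this bug.
REGION = "us-west-2"
TABLE_NAME = "taar_addon_data_20180206"
ROLE_ARN = "arn:aws:iam::361527076523:role/taar-write-dynamodb-from-dev"


def open_prod_table(session_name="taar-dynamo-job"):
    """Return a boto3 Table handle for the production table.

    Assumption: credentials come from assuming ROLE_ARN via STS.
    The session name is hypothetical.
    """
    import boto3  # third-party; imported lazily so the helpers below stay importable

    creds = boto3.client("sts").assume_role(
        RoleArn=ROLE_ARN, RoleSessionName=session_name
    )["Credentials"]
    dynamodb = boto3.resource(
        "dynamodb",
        region_name=REGION,
        aws_access_key_id=creds["AccessKeyId"],
        aws_secret_access_key=creds["SecretAccessKey"],
        aws_session_token=creds["SessionToken"],
    )
    return dynamodb.Table(TABLE_NAME)


def write_items(table, items):
    """Write each item through the table's batch writer; return the count written."""
    count = 0
    with table.batch_writer() as batch:
        for item in items:
            batch.put_item(Item=item)
            count += 1
    return count
```

A caller would do something like `write_items(open_prod_table(), rows)`, where `rows` is an iterable of dictionaries whose keys match the table's schema.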

The biggest risk here is the Collaborative job, as the PySpark rewrite
was never deployed because of breaking Spark changes in AWS EMR.
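
The storage locations listed in the requirements above can be collected into one inventory, which makes it easy to see which S3 buckets need a GCS counterpart. This is only a sketch: the S3 paths are copied from this bug, but the helper names and the idea of a bucket-to-bucket mapping are assumptions, and no GCS bucket names are given here, so the mapping has to be supplied by the caller.

```python
from urllib.parse import urlparse

# S3 locations copied from the job requirements in this bug.
JOB_IO = {
    "similarity": {
        "reads": [
            "s3://telemetry-parquet/clients_daily/v6/",
            "s3://telemetry-parquet/telemetry-ml/addon_recommender/",
        ],
        "writes": ["s3://telemetry-parquet/taar/similarity/"],
    },
    "collaborative": {
        "reads": [
            "s3://telemetry-parquet/clients_daily/v6/",
            "s3://telemetry-parquet/telemetry-ml/addon_recommender/",
        ],
        "writes": [
            "s3://telemetry-private-analysis-2/telemetry-ml/addon_recommender/"
        ],
    },
    "dynamo": {
        "reads": ["s3://telemetry-parquet/clients_daily/v6/"],
        "writes": [],  # this job writes to DynamoDB, not S3
    },
}


def s3_buckets(job_io):
    """Return the sorted list of distinct S3 buckets used by any job."""
    buckets = set()
    for io in job_io.values():
        for uri in io["reads"] + io["writes"]:
            parsed = urlparse(uri)
            if parsed.scheme == "s3":
                buckets.add(parsed.netloc)
    return sorted(buckets)


def to_gcs(uri, bucket_map):
    """Rewrite an s3:// URI to gs://, keeping the object path unchanged.

    bucket_map is a caller-supplied dict of S3 bucket name -> GCS bucket name.
    """
    parsed = urlparse(uri)
    if parsed.scheme != "s3":
        raise ValueError("not an s3:// URI: " + uri)
    return "gs://" + bucket_map[parsed.netloc] + parsed.path
```

For example, `s3_buckets(JOB_IO)` shows that only two S3 buckets are involved across the three jobs, so the mapping passed to `to_gcs` needs just two entries.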

Assignee: nobody → vng
Summary: Migrate TAAR Similarity, Collaborative and Dynamo jobs to GCP → Migrate TAAR-Lite, TAAR Similarity, Collaborative and Dynamo jobs to GCP
Blocks: 1594807

I will handle running the Scala Collaborative job with Dataproc and GCS, without migrating to PySpark.

Status: NEW → RESOLVED
Closed: 5 years ago
Resolution: --- → FIXED
Product: Data Platform and Tools → Data Platform and Tools Graveyard