Closed Bug 1664564 Opened 5 years ago Closed 5 years ago

Use Spark instead of GNU Parallel in bin/process script for prio-processor

Categories

(Data Platform and Tools :: General, task, P1)

Points: 2

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: amiyaguchi, Assigned: amiyaguchi)

Attachments

(1 file)

This replaces the old processing logic with Spark. The processing pipeline will be able to run either in a container or by shelling out to a Dataproc cluster.

Attached file GitHub Pull Request

This has shipped as the v3.0.0 tag. It still needs to be deployed to Airflow, with modifications to account for the new convention that makes the processing idempotent.
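
For readers unfamiliar with the term, here is a minimal sketch of what an idempotent processing step can look like. It is not taken from the prio-processor code: the function name `process_partition`, the `submission_date=` directory layout, the `.ndjson` extension, and the placeholder transformation are all assumptions made for illustration. The idea is only that rerunning the step for the same date replaces the earlier output instead of appending to it.

```python
import shutil
from pathlib import Path


def process_partition(input_dir: Path, output_root: Path, submission_date: str) -> Path:
    """Hypothetical idempotent step: output for a date is fully replaced on each run."""
    # Assumed layout: one output directory per submission date.
    output_dir = output_root / f"submission_date={submission_date}"

    # Remove anything left behind by a previous (possibly partial) run,
    # so that a rerun yields the same final state as a single run.
    if output_dir.exists():
        shutil.rmtree(output_dir)
    output_dir.mkdir(parents=True)

    for src in sorted(input_dir.glob("*.ndjson")):
        # Placeholder transformation; the real job would do its work in Spark.
        (output_dir / src.name).write_text(src.read_text().upper())

    return output_dir
```

With a step shaped like this, a scheduler such as Airflow can retry or backfill a given date without producing duplicate output.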

Status: NEW → RESOLVED
Closed: 5 years ago
Resolution: --- → FIXED
Blocks: 1666946
