Closed Bug 1481274 Opened 7 years ago Closed 7 years ago

Add tbv and mozetl compatible Databricks operator to Airflow

Categories

(Data Platform and Tools :: General, enhancement, P1)

Points:
2

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: amiyaguchi, Assigned: amiyaguchi)

References

Details

(Whiteboard: [DataPlatform])

Attachments

(1 file)

There is a Databricks operator in Airflow that was proven to work during the POC. [1] It should be added to telemetry-airflow as an EMRSparkOperator-compatible operator so that we can move jobs over transparently.

Some benefits (lifted from bug 1479132):
* Ephemeral instances bootstrap in minutes
* Compute costs are charged by the minute
* Auto-scaling can be enabled to avoid guessing cluster sizes

[1] https://gist.github.com/acmiyaguchi/205519fb1d7c8f5dcb61fb92b6236b76
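To illustrate what "EMRSparkOperator compatible" could mean in practice, here is a minimal sketch that maps EMR-style job arguments onto a Databricks runs-submit request body. It is not the operator that landed for this bug. The function name and the argument names (`job_name`, `instance_count`, `env`) are assumptions chosen for illustration, and the path, Spark version, and node type are placeholders. Only the payload field names (`run_name`, `new_cluster`, `num_workers`, `autoscale`, `spark_env_vars`, `spark_python_task`) follow the Databricks Jobs API.

```python
import json


def build_run_submit_payload(job_name, instance_count, python_file,
                             spark_version, node_type_id,
                             env=None, autoscale=False):
    """Translate EMR-style arguments into a Databricks runs-submit body.

    Hypothetical helper: the argument names are illustrative assumptions,
    not the interface of the operator added in telemetry-airflow.
    """
    cluster = {
        "spark_version": spark_version,
        "node_type_id": node_type_id,
        # Environment variables the job would have received on EMR.
        "spark_env_vars": dict(env or {}),
    }
    if autoscale:
        # Let the cluster grow up to the size that was fixed on EMR.
        cluster["autoscale"] = {"min_workers": 1, "max_workers": instance_count}
    else:
        cluster["num_workers"] = instance_count

    return {
        "run_name": job_name,
        "new_cluster": cluster,
        "spark_python_task": {"python_file": python_file},
    }


payload = build_run_submit_payload(
    job_name="example_job",
    instance_count=5,
    python_file="dbfs:/hypothetical/runner.py",  # placeholder path
    spark_version="<spark-version>",             # placeholder value
    node_type_id="<node-type>",                  # placeholder value
    env={"EXAMPLE_VAR": "value"},
    autoscale=True,
)
print(json.dumps(payload, indent=2, sort_keys=True))
```

A dictionary of this shape could then be handed to Airflow's DatabricksSubmitRunOperator through its `json` argument, so existing DAG definitions would only need to keep supplying the arguments they already pass.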
Blocks: 1479132
Whiteboard: [DataPlatform]
Blocks: 1464484
Assignee: nobody → amiyaguchi
Priority: -- → P1
See Also: → 1484331
See Also: → 1484334
Blocks: 1484337
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Component: Scheduling → General
