Closed
Bug 1481274
Opened 7 years ago
Closed 7 years ago
Add tbv and mozetl compatible Databricks operator to Airflow
Categories
(Data Platform and Tools :: General, enhancement, P1)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: amiyaguchi, Assigned: amiyaguchi)
References
Details
(Whiteboard: [DataPlatform])
Attachments
(1 file)
There is a Databricks operator in Airflow that was proven to work during the POC.[1] It should be added to telemetry-airflow as an EMRSparkOperator-compatible operator so we can move jobs transparently.
Some benefits (lifted from bug 1479132):
* Ephemeral instances bootstrap in minutes
* Compute costs are charged by the minute
* Auto-scaling can be enabled to avoid guessing cluster sizes
[1] https://gist.github.com/acmiyaguchi/205519fb1d7c8f5dcb61fb92b6236b76
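As a rough illustration of what "EMRSparkOperator compatible" could mean, the sketch below maps a few EMR-style job arguments onto a Databricks Runs Submit request body. The function name, its argument names (loosely modeled on EMRSparkOperator), and the default `spark_version` and `node_type_id` values are assumptions for illustration only, not the operator that was actually added to telemetry-airflow.

```python
from typing import Dict, Optional


def build_runs_submit_payload(
    job_name: str,
    python_file: str,
    instance_count: int,
    env: Optional[Dict[str, str]] = None,
    autoscale: bool = False,
    spark_version: str = "4.3.x-scala2.11",  # placeholder value (assumption)
    node_type_id: str = "c3.4xlarge",  # placeholder value (assumption)
) -> dict:
    """Hypothetical helper: translate EMR-style arguments into a
    Databricks Runs Submit request body."""
    if instance_count < 1:
        raise ValueError("instance_count must be at least 1")

    new_cluster = {
        "spark_version": spark_version,
        "node_type_id": node_type_id,
        # Job parameters are passed to the cluster as environment variables.
        "spark_env_vars": dict(env or {}),
    }
    if autoscale:
        # Let Databricks choose the worker count, up to the requested size.
        new_cluster["autoscale"] = {"min_workers": 1, "max_workers": instance_count}
    else:
        new_cluster["num_workers"] = instance_count

    return {
        "run_name": job_name,
        "new_cluster": new_cluster,
        "spark_python_task": {"python_file": python_file},
    }


# Example call with made-up values.
payload = build_runs_submit_payload(
    job_name="mozetl_example",
    python_file="s3://example-bucket/jobs/run.py",
    instance_count=5,
    env={"date": "20180801"},
)
```

A dictionary of this shape could then be handed to Airflow's `DatabricksSubmitRunOperator` through its `json` argument, so an existing DAG would only need to swap the operator rather than restructure the job definition.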
Updated•7 years ago
Assignee: nobody → amiyaguchi
Priority: -- → P1
Comment 1•7 years ago
Updated•7 years ago
|
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Updated•3 years ago
Component: Scheduling → General