Bug 1318762 (Closed) - Opened 8 years ago, Closed 8 years ago

Add the ability to run more complex python + spark jobs

Categories

(Cloud Services Graveyard :: Metrics: Pipeline, defect, P2)

Tracking

(Not tracked)

RESOLVED DUPLICATE of bug 1340595

People

(Reporter: mreid, Unassigned)

References

Details

We have the ability to run Jupyter Notebooks on both ATMO and Airflow, but they are not conducive to writing modular, well-tested, reusable code. They are great for simple tasks, but as a notebook gets more complex, the convenience factor becomes less compelling than stability and correctness.

It would be nice to have something for Python jobs similar to what we have for Scala jobs in telemetry-batch-view [1], where we can structure the code and tests nicely but still easily run a job on an ATMO-launched cluster or via Airflow. We should provide a straightforward path from a functional Jupyter Notebook to a more robust Python job. It could be done within the existing telemetry-batch-view repo, python_moztelemetry, one of the various other existing repos, or a new repo created for Python analyses.

The important parts, in my mind, are:
- a test harness that runs on push (and includes test coverage info)
- code can be structured as more than one Python source file
- code can easily be reused across analyses
- code can easily be run on ATMO and by Airflow

[1] https://github.com/mozilla/telemetry-batch-view
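To illustrate the kind of structure this is asking for, here is a minimal sketch (not from any existing repo; the module name, function names, and S3 paths are hypothetical) of a PySpark job split into a pure transformation and a thin entry point that either an ATMO cluster or an Airflow-triggered spark-submit could invoke:

    # crash_rate_job.py -- hypothetical example module
    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    def compute_crash_rates(pings_df):
        # Pure transformation: no I/O, so it can be unit-tested locally.
        return (
            pings_df
            .groupBy("channel")
            .agg(
                F.sum("crashes").alias("crashes"),
                F.sum("usage_hours").alias("usage_hours"),
            )
            .withColumn("crash_rate", F.col("crashes") / F.col("usage_hours"))
        )

    def main():
        # Thin entry point, e.g. invoked via spark-submit from ATMO or Airflow.
        spark = SparkSession.builder.appName("crash-rate-example").getOrCreate()
        pings = spark.read.parquet("s3://example-bucket/main_summary/")  # placeholder path
        compute_crash_rates(pings).write.mode("overwrite").parquet(
            "s3://example-bucket/crash_rates/"  # placeholder path
        )

    if __name__ == "__main__":
        main()

And a pytest-style unit test for the transformation, the sort of thing a test harness could run on every push (again, names are illustrative):

    # test_crash_rate_job.py -- hypothetical example test
    import pytest
    from pyspark.sql import SparkSession

    from crash_rate_job import compute_crash_rates

    @pytest.fixture(scope="module")
    def spark():
        # Local Spark session; no cluster required to run the test.
        return SparkSession.builder.master("local[1]").appName("tests").getOrCreate()

    def test_crash_rate(spark):
        df = spark.createDataFrame(
            [("release", 2, 10.0), ("release", 3, 10.0)],
            ["channel", "crashes", "usage_hours"],
        )
        result = compute_crash_rates(df).collect()
        assert result[0]["crash_rate"] == pytest.approx(0.25)

Keeping the transformation free of I/O is what makes it cheap to test on push and easy to reuse across analyses, while the entry point stays small enough to wire into either ATMO or Airflow.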
Priority: -- → P2
This is covered by bug 1340595
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → DUPLICATE
Product: Cloud Services → Cloud Services Graveyard