Closed
Bug 1336617
Opened 8 years ago
Closed 8 years ago
Investigate configuration-only solution to simple testpilot pipelines
Categories
(Cloud Services Graveyard :: Metrics: Pipeline, defect, P1)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: harter, Assigned: harter)
We'll be doing a lot more experimentation in 2017.
Currently, analyzing testpilot data requires the analyst to filter and transform their experiment data using a scheduled ATMO job. It would be nice if this could be done without custom code and clusters.
Updated•8 years ago
Assignee: nobody → rharter
Comment 2•8 years ago
We discussed this today. Documenting for posterity.
My goal is to be able to analyze experimental data from testpilot and testpilottest. Since many of the important testpilottest fields are experiment specific, we would need a new config for each experiment. It sounds like the deploy time for the solution described in Bug 1333206 would be prohibitive for this task.
I have an example implementation in this notebook [0]. The config structure makes it clear how we're mapping input fields to output columns.
[0] https://gist.github.com/harterrt/2a052f653c50df10920cfdb19c362438#file-cliqz-testpilot-pipeline-py-L79
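For readers without access to the gist, here is a minimal sketch of what a config-driven mapping from input fields to output columns could look like. It is not the code from the notebook: the config layout, the field names, and the helpers `get_path` and `apply_config` are all hypothetical, and it runs on plain Python dicts rather than on a cluster.

```python
# Hypothetical sketch: the config layout and field names are illustrative
# assumptions, not taken from the linked notebook.

def get_path(record, path, default=None):
    """Walk a nested dict along a '/'-separated path; return default if a key is missing."""
    node = record
    for key in path.split("/"):
        if not isinstance(node, dict) or key not in node:
            return default
        node = node[key]
    return node


# One config per experiment: which pings to keep, and which input field
# feeds each output column.
EXAMPLE_CONFIG = {
    "filter": {"test": "example-experiment"},
    "columns": {
        "client_id": "clientId",
        "event": "payload/event",
        "object": "payload/object",
    },
}


def apply_config(pings, config):
    """Keep pings matching every filter entry, then project them onto the configured columns."""
    rows = []
    for ping in pings:
        matches = all(
            get_path(ping, path) == expected
            for path, expected in config["filter"].items()
        )
        if not matches:
            continue
        rows.append(
            {column: get_path(ping, path) for column, path in config["columns"].items()}
        )
    return rows
```

Under these assumptions, supporting a new experiment would only mean writing a new config dict; the filtering and projection code stays the same.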
Flags: needinfo?(rharter)
Comment 3•8 years ago
I refactored the code from the cliqz_pipeline and started a small helper library here:
https://github.com/harterrt/betl
I'll let this grow as needed. Closing this bug.
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
Updated•7 years ago
Product: Cloud Services → Cloud Services Graveyard