Investigate configuration-only solution to simple testpilot pipelines

RESOLVED FIXED

Status

Product: Cloud Services
Component: Metrics: Pipeline
Priority: P1
Severity: normal
Status: RESOLVED FIXED
Reported: 10 months ago
Modified: 9 months ago

People

(Reporter: harter, Assigned: harter)

Tracking

Firefox Tracking Flags

(Not tracked)

Description

10 months ago
We'll be doing a lot more experimentation in 2017. 

Currently, analyzing testpilot data requires the analyst to filter and transform their experiment data using a scheduled ATMO job. It would be nice if this could be done without custom code and clusters.

Updated

10 months ago
Assignee: nobody → rharter

Comment 1

Is this addressed by Bug 1333206?
Flags: needinfo?(rharter)

Comment 2

10 months ago
We discussed this today. Documenting for posterity.

My goal is to be able to analyze experimental data from testpilot and testpilottest. Since many of the important testpilottest fields are experiment specific, we would need a new config for each experiment. It sounds like the deploy time for the solution described in Bug 1333206 would be prohibitive for this task.

I have an example implementation in this notebook[0]. The config structure makes it clear how we're mapping input fields to output columns.  

[0] https://gist.github.com/harterrt/2a052f653c50df10920cfdb19c362438#file-cliqz-testpilot-pipeline-py-L79
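
For illustration, here is a minimal sketch of what a config-driven mapping from input fields to output columns could look like. It is not the code from the notebook linked above: the field paths, the experiment name, and the helpers get_path and apply_config are all hypothetical, and it runs on plain Python dicts rather than on a cluster.

```python
from typing import Any, Dict, Iterable, List

Ping = Dict[str, Any]

# Hypothetical config. The experiment name and every field path are
# assumptions for illustration, not a real testpilot ping schema.
CONFIG = {
    # Keep only pings where each path holds the given value.
    "filter": {"payload/test": "example-experiment"},
    # Output column name -> '/'-separated path into the input ping.
    "columns": {
        "client_id": "clientId",
        "event": "payload/payload/event",
        "timestamp": "payload/timestamp",
    },
}


def get_path(ping: Ping, path: str, default: Any = None) -> Any:
    """Walk a '/'-separated path through nested dicts.

    Returns `default` as soon as a key along the path is missing.
    """
    node: Any = ping
    for key in path.split("/"):
        if not isinstance(node, dict) or key not in node:
            return default
        node = node[key]
    return node


def apply_config(pings: Iterable[Ping], config: Dict[str, Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Filter pings, then flatten each surviving ping into one output row."""
    rows = []
    for ping in pings:
        matches = all(
            get_path(ping, path) == expected
            for path, expected in config["filter"].items()
        )
        if matches:
            rows.append(
                {col: get_path(ping, path) for col, path in config["columns"].items()}
            )
    return rows


if __name__ == "__main__":
    sample = [
        {
            "clientId": "a",
            "payload": {
                "test": "example-experiment",
                "timestamp": 1,
                "payload": {"event": "click"},
            },
        },
        {"clientId": "b", "payload": {"test": "another-experiment"}},
    ]
    for row in apply_config(sample, CONFIG):
        print(row)
```

With this shape, supporting a new experiment would mean writing a new CONFIG rather than new transformation code, which is the property the bug is asking for.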
Flags: needinfo?(rharter)

Updated

9 months ago
See Also: → bug 1340595

Comment 3

9 months ago
I refactored the code from the cliqz_pipeline and started a small helper library here:
https://github.com/harterrt/betl

I'll let this grow as needed. Closing this bug.
Status: NEW → RESOLVED
Last Resolved: 9 months ago
Resolution: --- → FIXED