Investigate configuration-only solution to simple testpilot pipelines

RESOLVED FIXED

Status

P1
normal
2 years ago
a month ago

People

(Reporter: harter, Assigned: harter)

Tracking

Firefox Tracking Flags

(Not tracked)

Details

Description

2 years ago
We'll be doing a lot more experimentation in 2017.

Currently, analyzing testpilot data requires the analyst to filter and transform their experiment data using a scheduled ATMO job. It would be nice if this could be done through configuration alone, without custom code or a dedicated cluster.

Updated

2 years ago
Assignee: nobody → rharter
Is this addressed by Bug 1333206?
Flags: needinfo?(rharter)

Comment 2

2 years ago
We discussed this today. Documenting for posterity.

My goal is to be able to analyze experimental data from testpilot and testpilottest. Since many of the important testpilottest fields are experiment-specific, we would need a new config for each experiment. It sounds like the deploy time for the solution described in Bug 1333206 would be prohibitive for this task.

I have an example implementation in this notebook [0]. The config structure makes it clear how we're mapping input fields to output columns.

[0] https://gist.github.com/harterrt/2a052f653c50df10920cfdb19c362438#file-cliqz-testpilot-pipeline-py-L79
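
For readers who don't open the notebook, here is a rough sketch of what a config that maps input fields to output columns could look like. This is not the code from the gist: the config layout, the field names (clientId, payload.test, and so on), and the helpers get_path and transform are all hypothetical, and it runs over plain Python dicts rather than a cluster job.

```python
# Hypothetical config: one per experiment. "filter" selects the experiment's
# pings; "columns" maps each output column to a dotted path in the ping.
CONFIG = {
    "filter": {"payload.test": "example-experiment"},
    "columns": {
        "client_id": "clientId",
        "event": "payload.payload.event",
        "timestamp": "payload.timestamp",
    },
}


def get_path(record, path):
    """Follow a dotted path through nested dicts; return None if a key is missing."""
    node = record
    for key in path.split("."):
        if not isinstance(node, dict) or key not in node:
            return None
        node = node[key]
    return node


def transform(pings, config):
    """Keep pings matching every filter entry and flatten them into rows."""
    rows = []
    for ping in pings:
        matches = all(
            get_path(ping, path) == expected
            for path, expected in config["filter"].items()
        )
        if matches:
            rows.append(
                {col: get_path(ping, path) for col, path in config["columns"].items()}
            )
    return rows


if __name__ == "__main__":
    # Made-up pings, only to show the shape of the input.
    sample = [
        {"clientId": "a", "payload": {"test": "example-experiment",
                                      "timestamp": 1,
                                      "payload": {"event": "click"}}},
        {"clientId": "b", "payload": {"test": "other-experiment",
                                      "timestamp": 2,
                                      "payload": {"event": "open"}}},
    ]
    for row in transform(sample, CONFIG):
        print(row)
```

The point of this shape is that adding an experiment means writing a new config rather than new transformation code, which is why a per-experiment config only works if deploying one is cheap.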
Flags: needinfo?(rharter)

Updated

2 years ago
See Also: → bug 1340595

Comment 3

2 years ago
I refactored the code from the cliqz_pipeline and started a small helper library here:
https://github.com/harterrt/betl

I'll let this grow as needed. Closing this bug.
Status: NEW → RESOLVED
Last Resolved: 2 years ago
Resolution: --- → FIXED

Updated

a month ago
Product: Cloud Services → Cloud Services Graveyard