Closed Bug 1175583 Opened 5 years ago Closed 4 years ago
Executive dashboard with v2 + v4 data (implementation)
+++ This bug was initially created as a clone of Bug #1174912 +++
We have a rollup based on v4 data:
https://bugzilla.mozilla.org/show_bug.cgi?id=1155871
This bug is for the design and implementation of handling the v2->v4 transition, where we will need to combine data from both rollups.
There is a data set based on FHRv2 available here:
s3://net-mozaws-prod-us-west-2-pipeline-analysis/mreid/exec_signal20150812/nightly/f/
This was based on today's version of the incoming FHR nightly data from:
s3://mozillametricsfhrsamples/nightly/
Attached is a (sanitized) ipython notebook for the code.
Oops, accidentally swapped "example_docid" and "example_clientid" when sanitizing.
Attachment #8649470 - Attachment is obsolete: true
Trink needs the v2 data source. We need real access to it so that we can pull it like a normal stream.
Assignee: nobody → mreid
Priority: P2 → P1
Handy links...
* Design Document: https://docs.google.com/document/d/1VzQHfzfA-S_lO2wpXDFjDzSJntJCMwP03TzefIj7RrE/edit#heading=h.69zh7w46tg7s
* Review of v4 rollup, which surfaced some issues we need to address: https://etherpad-mozilla.org/exex-summary
* Submission date proposal: https://docs.google.com/document/d/1mLP4DY-FIQHof6Nxh2ioVQ-ZvvlnIZ_6yLqYp8idXG4/edit
* Schema for v4 stream: https://mana.mozilla.org/wiki/display/CLOUDSERVICES/Executive+Summary+Schema
* Earlier work where we compared v4 and v2 streams (aka "fruit rollup"): https://etherpad-mozilla.org/executive-rollup-v2-v4
Saptarshi has agreed to take on the development of the "Digest FHR" code, basing it on the existing executive report R code. The output of that job will contain all the data from the full deorphaned FHR data set.

The process for combining the FHR data set with the Unified Telemetry data set will be as follows: the FHR output will include one extra field, "expected to exist in v4", which we will use to filter out records that *should* appear in the v4 data. This field will be defined based on when the Unified Telemetry code landed in Firefox and whether Telemetry was enabled or not.

I made an initial pass at determining which version / channel / build combinations are expected to submit Unified Telemetry, and Georg has agreed to help improve the list. @Georg, can you post a link to that here when available?
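A minimal sketch of the combining step described above, assuming each record is a dict and that the extra field is exposed under the hypothetical key `expected_in_v4`. The record layout and the function name are illustrative only and are not taken from the actual export code.

```python
def combine_rollups(fhr_records, v4_records):
    """Combine v2 (FHR) and v4 (Unified Telemetry) records.

    v2 records flagged as expected to exist in v4 are dropped, so a client
    that reports through both systems is only counted from the v4 side.
    The "expected_in_v4" key is a hypothetical name for the extra field.
    """
    v2_only = [r for r in fhr_records if not r["expected_in_v4"]]
    return v2_only + list(v4_records)


# Illustrative records (made-up values).
fhr = [
    {"client": "a", "source": "v2", "expected_in_v4": False},
    {"client": "b", "source": "v2", "expected_in_v4": True},
]
v4 = [{"client": "b", "source": "v4"}]

combined = combine_rollups(fhr, v4)
```

Here client "b" is kept only from the v4 side, while client "a" (not expected in v4) is kept from the v2 side.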
I did a pass on this:
https://docs.google.com/spreadsheets/d/1RadVpeg0cUBBseiEh37FjmWMsCz7rfbnE2LgtcyI3JE/
This lists the versions and build IDs from which the required data points are available or fixed. The data points that we need are listed in the "Digest Unified Telemetry" section in the design doc:
https://docs.google.com/document/d/1VzQHfzfA-S_lO2wpXDFjDzSJntJCMwP03TzefIj7RrE/edit#heading=h.y75928usu2ym
I arrived at those versions & build IDs by going through the "phase 3" bug dependency list:
https://bugzilla.mozilla.org/showdependencytree.cgi?id=1120356&maxdepth=3&hide_resolved=0
... and checking for the last bugs that affected those data points:
https://public.etherpad-mozilla.org/p/YqZ7yWjsht
There might have been quality improvements after those bugs, but I assume they are negligible in this context.
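As a rough illustration of how a per-channel build ID list could drive the "expected to exist in v4" field, here is a sketch. The cutoff values below are placeholders, NOT the values from the spreadsheet, and the function and table names are hypothetical. It relies on Firefox build IDs being fixed-width timestamp strings (YYYYMMDDHHMMSS), so plain string comparison orders them chronologically.

```python
# Placeholder cutoffs: channel -> first build ID assumed to submit Unified
# Telemetry. These are made-up values for illustration only.
HYPOTHETICAL_CUTOFFS = {
    "nightly": "20150101000000",
    "aurora": "20150201000000",
}


def expected_in_v4(channel, build_id, telemetry_enabled):
    """Return True if a v2 record should also show up in the v4 data."""
    cutoff = HYPOTHETICAL_CUTOFFS.get(channel)
    if cutoff is None or not telemetry_enabled:
        # Unknown channel, or Telemetry disabled: the client only reports v2.
        return False
    # Fixed-width timestamp strings compare chronologically.
    return build_id >= cutoff
```

A real implementation would load the cutoffs from the agreed list and might also need to key on version, not just channel and build ID.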
Saptarshi and I have been working on the v2 export code here: https://github.com/mozilla/fhr-v2-v4-executive Currently we have a v2-based data set being produced, but some of the records have an unexpected number of columns. I am able to run the existing executive report code on this new data set, and sample outputs will be available shortly.  https://github.com/mozilla-services/data-pipeline/blob/master/heka/sandbox/filters/firefox_executive_report.lua
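One way to track down the records with an unexpected number of columns is to tally column counts per line. This is only a sketch: it assumes the export is delimited text with one record per line and a tab delimiter, neither of which is confirmed here, and the function names are hypothetical.

```python
from collections import Counter


def column_count_histogram(lines, delimiter="\t"):
    """Map each observed column count to the number of lines having it."""
    return Counter(len(line.rstrip("\n").split(delimiter)) for line in lines)


def malformed_rows(lines, expected, delimiter="\t"):
    """Return 0-based indexes of lines whose column count differs from `expected`."""
    return [
        i
        for i, line in enumerate(lines)
        if len(line.rstrip("\n").split(delimiter)) != expected
    ]


# Made-up sample: the second record is missing a column.
sample = ["a\tb\tc\n", "d\te\n", "f\tg\th\n"]
histogram = column_count_histogram(sample)
bad = malformed_rows(sample, expected=3)
```

The histogram shows how widespread the problem is, and the row indexes point at specific records to inspect.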
I will make sure the last few remaining issues are captured. Closing this as implemented.
Status: NEW → RESOLVED
Closed: 4 years ago
Resolution: --- → FIXED