Bug 1329844
Opened 7 years ago
Closed 7 years ago
Productionize Topline report
Categories
(Data Platform and Tools :: General, defect, P1)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: amiyaguchi, Assigned: amiyaguchi)
The Topline (executive) report now uses the main_summary dataset to generate its aggregates. Productionizing it involves scheduling an Airflow job for the dataset and uploading the job's results to the production location.
Updated by assignee•7 years ago
Priority: P1 → P2
Updated by assignee•7 years ago
Points: 1 → 3
Component: Metrics: Pipeline → Datasets: General
Priority: P2 → P1
Product: Cloud Services → Data Platform and Tools
Comment 1•7 years ago
While it's still fresh in my mind, here are the steps we discussed:
- Deploy the topline job to Airflow to generate weekly and monthly output in Parquet form. Backfill as far as makes sense.
- Write code to output a CSV view of "all time" based on Parquet data with a new naming standard ("topline-weekly.csv" and "topline-monthly.csv" seem reasonable).
- Update Diagnostic Dashboard[1] to display new CSV data.
- Backfill Parquet data from historic CSV files[2], dropping columns that are no longer being generated (such as five-of-seven, inactives).
- Compare new and old datasets for consistency via Diagnostic Dashboard.
- Update Firefox Dashboard[3] to use new naming standard.
- Run old and new code in parallel for some period of time in case there are problems.
- Stop old code.
- Celebrate!

[1] https://github.com/mozilla/diagnostic-data-viewer
[2] Using data from s3://telemetry-private-analysis-2/executive-report-<period>/data/executive_report.<period>.yyyymmdd.csv
[3] https://github.com/mozilla/firefox-dashboard
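As a rough illustration of the backfill and "all time" CSV steps above, here is a minimal sketch that concatenates several report CSVs into one view while dropping retired columns. It uses only the Python standard library and works on in-memory text rather than Parquet or S3. The function name merge_reports and the column names in DROPPED_COLUMNS are assumptions made for this example, not the names used by the actual job.

```python
import csv
import io

# Assumed header names for the retired columns; the real files may differ.
DROPPED_COLUMNS = {"five_of_seven", "inactives"}


def merge_reports(sources, dropped=DROPPED_COLUMNS):
    """Concatenate CSV texts into one "all time" CSV, omitting retired columns.

    The output header is taken from the first non-empty source, minus the
    dropped columns. Later sources are written against that same header.
    """
    out = io.StringIO()
    writer = None
    for text in sources:
        reader = csv.DictReader(io.StringIO(text))
        if not reader.fieldnames:
            continue  # skip empty inputs
        if writer is None:
            kept = [name for name in reader.fieldnames if name not in dropped]
            # extrasaction="ignore" silently skips the dropped columns.
            writer = csv.DictWriter(
                out, fieldnames=kept, extrasaction="ignore", lineterminator="\n"
            )
            writer.writeheader()
        for row in reader:
            writer.writerow(row)
    return out.getvalue()
```

Passing the historic files first and the newly generated ones after them yields a single CSV whose header matches the new schema, which could then be written out under a name such as "topline-weekly.csv".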
Comment 2 (assignee)•7 years ago
The backfilling process and dashboard output look pretty consistent. [1] There is a little bit of noise due to floating point arithmetic, but it adds up to a total of only 12 minutes across the whole dataset. I'm going to go ahead and copy over the historical backfill to the primary location.

[1] https://gist.github.com/acmiyaguchi/a8f18830a4d8ba3fbae0790ba4503658
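A consistency check of this kind could be sketched as below: compare one metric between the old and new datasets, tolerate floating point noise, and report the accumulated drift. This is not the code from the gist. The function name compare_metric, the {key: value} layout, and the default tolerance are all assumptions for illustration.

```python
import math


def compare_metric(old, new, rel_tol=1e-6):
    """Compare two {key: value} mappings of the same metric.

    Returns (mismatches, total_drift): the sorted keys that are missing on
    one side or differ beyond rel_tol, and the summed absolute difference
    over keys present in both mappings.
    """
    mismatches = []
    total_drift = 0.0
    for key in sorted(old.keys() | new.keys()):
        if key not in old or key not in new:
            mismatches.append(key)
            continue
        total_drift += abs(old[key] - new[key])
        # math.isclose absorbs small floating point differences.
        if not math.isclose(old[key], new[key], rel_tol=rel_tol):
            mismatches.append(key)
    return mismatches, total_drift
```

An empty mismatch list together with a small total drift would correspond to the "noise only" outcome described above.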
Comment 3•7 years ago
This is done! The Firefox Dashboard[1] is now using the topline data source. The previous data files, "v4-weekly.csv" and "v4-monthly.csv", are still available for comparison purposes, but will likely disappear in the next few months.

[1] https://metrics.services.mozilla.com/firefox-dashboard/
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Updated•2 years ago
Component: Datasets: General → General