Closed
Bug 1329844
Opened 8 years ago
Closed 8 years ago
Productionize Topline report
Categories
(Data Platform and Tools :: General, defect, P1)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: amiyaguchi, Assigned: amiyaguchi)
References
Details
The Topline (executive) report now uses the main_summary dataset to generate its aggregates.
Productionizing it involves scheduling an Airflow job for the dataset and uploading the job's results to the production location.
Assignee | ||
Updated•8 years ago
Priority: P1 → P2
Assignee | ||
Updated•8 years ago
Points: 1 → 3
Component: Metrics: Pipeline → Datasets: General
Priority: P2 → P1
Product: Cloud Services → Data Platform and Tools
Comment 1•8 years ago
While it's still fresh in my mind, here are the steps we discussed:
- Deploy the topline job to Airflow to generate weekly and monthly output in Parquet form. Backfill as far as makes sense.
- Write code to output a CSV view of "all time" based on the Parquet data with a new naming standard ("topline-weekly.csv" and "topline-monthly.csv" seem reasonable).
- Update Diagnostic Dashboard[1] to display the new CSV data.
- Backfill Parquet data from historic CSV files[2], dropping columns that are no longer being generated (such as five-of-seven, inactives).
- Compare new and old datasets for consistency via Diagnostic Dashboard.
- Update Firefox Dashboard[3] to use new naming standard.
- Run old and new code in parallel for some period of time in case there are problems.
- Stop old code.
- Celebrate!
[1] https://github.com/mozilla/diagnostic-data-viewer
[2] Using data from s3://telemetry-private-analysis-2/executive-report-<period>/data/executive_report.<period>.yyyymmdd.csv
[3] https://github.com/mozilla/firefox-dashboard
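For illustration only, the column-dropping and "all time" CSV steps above could look roughly like the sketch below. It is not the code used for this bug: the column names (`date`, `hours`, `five_of_seven`, `inactives`) and the helper names are assumptions, it works on CSV text rather than Parquet to stay dependency-free, and it assumes every period file shares the same schema once the retired columns are removed.

```python
import csv
import io

# Hypothetical column names; the real schema is not given in this bug.
RETIRED_COLUMNS = {"five_of_seven", "inactives"}


def drop_retired_columns(rows, retired=RETIRED_COLUMNS):
    """Return copies of the rows without the retired columns."""
    return [{k: v for k, v in row.items() if k not in retired} for row in rows]


def all_time_csv(period_files):
    """Merge per-period CSV texts into one "all time" CSV string.

    Rows are ordered by the (assumed) "date" column, which is expected
    to be in a sortable form such as yyyymmdd.
    """
    merged = []
    for text in period_files:
        reader = csv.DictReader(io.StringIO(text))
        merged.extend(drop_retired_columns(reader))
    merged.sort(key=lambda row: row["date"])
    if not merged:
        return ""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=list(merged[0]), lineterminator="\n")
    writer.writeheader()
    writer.writerows(merged)
    return out.getvalue()
```

Calling `all_time_csv` with the text of each weekly file would yield the body of a file such as "topline-weekly.csv"; the monthly view would be built the same way from the monthly files.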
Assignee | ||
Comment 2•8 years ago
The backfilled data and the dashboard output look consistent. [1] There is a small amount of noise due to floating-point arithmetic, which adds up to a total of 12 minutes across the whole dataset.
I'm going to go ahead and copy over the historical backfill to the primary location.
[1] https://gist.github.com/acmiyaguchi/a8f18830a4d8ba3fbae0790ba4503658
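A consistency check of this kind can be sketched as below. This is a hypothetical helper, not the code in the gist: it assumes each dataset has been reduced to a mapping from a period key (for example a yyyymmdd string) to one numeric metric, and the tolerances are arbitrary placeholders.

```python
import math


def compare_totals(old, new, rel_tol=1e-9, abs_tol=1e-6):
    """Compare two {key: value} mappings of the same metric.

    Returns (mismatched_keys, total_abs_diff), where mismatched_keys lists
    keys missing from either side or differing beyond the tolerance, and
    total_abs_diff sums the absolute differences over the shared keys.
    """
    mismatched = []
    total_abs_diff = 0.0
    for key in sorted(set(old) | set(new)):
        if key not in old or key not in new:
            mismatched.append(key)
            continue
        total_abs_diff += abs(old[key] - new[key])
        if not math.isclose(old[key], new[key], rel_tol=rel_tol, abs_tol=abs_tol):
            mismatched.append(key)
    return mismatched, total_abs_diff
```

Reporting the summed difference alongside the mismatched keys makes it easy to tell accumulated floating-point noise, which stays small in total, from a real discrepancy in one period.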
Comment 3•8 years ago
This is done! The Firefox Dashboard[1] now uses the topline data source. The previous data files, "v4-weekly.csv" and "v4-monthly.csv", are still available for comparison purposes, but will likely disappear in the next few months.
[1] https://metrics.services.mozilla.com/firefox-dashboard/
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
Updated•3 years ago
Component: Datasets: General → General