Bug 1298763 (Closed): opened 3 years ago, closed 3 years ago
Create a derived parquet dataset for the sync ping
The next step on the path to dashboards is to create a derived parquet dataset. Dexter tells me the steps are roughly:
- Build a new view for the Sync pings (i.e. as done for longitudinal and main summary). You can find an example for MainSummary at https://github.com/mozilla/telemetry-batch-view/tree/master/src/main/scala/com/mozilla/telemetry/views (the same repository also contains the code for the longitudinal view).
- This should live in the https://github.com/mozilla/telemetry-batch-view repo.
- I suggest using IntelliJ IDEA (https://www.jetbrains.com/idea/) to write your code, as you can easily import the SBT project from the repo and have it download all the dependencies, along with all its code-completion features.
I'll schedule a time to chat with Dexter soon, but I'm getting this on file so I don't lose the above steps a second time :)
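The core of the derivation step above is flattening nested sync ping JSON into flat rows suitable for a columnar/parquet dataset. The real view in telemetry-batch-view is Scala/Spark, but the shape of the transformation can be sketched in plain Python; note the field names below (`uid`, `syncs`, `engines`, `took`, `failureReason`) are simplified assumptions for illustration, not the actual sync ping schema:

```python
import json

def flatten_sync_ping(ping_json):
    """Flatten one (simplified, hypothetical) sync ping into one row
    per engine per sync, which is roughly the shape a derived parquet
    dataset wants."""
    ping = json.loads(ping_json)
    payload = ping.get("payload", {})
    rows = []
    for sync in payload.get("syncs", []):
        for engine in sync.get("engines", []):
            rows.append({
                "uid": payload.get("uid"),
                "when": sync.get("when"),
                "engine": engine.get("name"),
                "took_ms": engine.get("took"),
                "failed": "failureReason" in engine,
            })
    return rows

# A toy ping with one sync covering two engines, one of which failed.
example = json.dumps({
    "payload": {
        "uid": "abc123",
        "syncs": [{
            "when": 1473000000000,
            "engines": [
                {"name": "bookmarks", "took": 53},
                {"name": "history", "took": 12,
                 "failureReason": {"name": "httperror"}},
            ],
        }],
    }
})

rows = flatten_sync_ping(example)
```

In the Scala view this same flattening is expressed as a Spark schema plus a row-conversion function, and the result is written out with the parquet writer.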
Mark, do you have any suggestion or additional information about how to add Sync pings to re:dash?
There are two other smaller steps. After developing and testing the Sync view code per your description, we'll need to schedule it to run from Airflow (https://github.com/mozilla/telemetry-airflow); there are several examples in the "dags" directory. We also need to import the new dataset into the Hive metastore used by re:dash. This is a matter of adding a crontab entry on the re:dash server (at least until bug 1269781 lands) and should only take a few minutes.
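For the Airflow step, the DAGs in telemetry-airflow are short Python files. Here is a minimal sketch of what one could look like; it is a config fragment modeled loosely on those DAGs, and the DAG name, class name `SyncView`, jar path, owner, and schedule are all assumptions, not the actual DAG:

```python
# Hypothetical DAG sketch; the real telemetry-airflow DAGs launch the
# telemetry-batch-view jar on an EMR cluster, so a plain BashOperator
# stands in for that machinery here.
from datetime import datetime, timedelta
from airflow import DAG
from airflow.operators.bash_operator import BashOperator

default_args = {
    "owner": "sync-dev",          # assumed owner
    "depends_on_past": False,
    "start_date": datetime(2016, 9, 1),
    "retries": 2,
    "retry_delay": timedelta(minutes=30),
}

dag = DAG("sync_view", default_args=default_args,
          schedule_interval="@daily")

run_view = BashOperator(
    task_id="sync_view",
    bash_command=(
        "spark-submit --class com.mozilla.telemetry.views.SyncView "
        "telemetry-batch-view.jar --from {{ ds_nodash }} --to {{ ds_nodash }}"
    ),
    dag=dag,
)
```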
Whiteboard: [sync-metrics] → [sync-metrics][measure-integrity]
This code was merged in https://github.com/mozilla/telemetry-batch-view/pull/114. Bug 1307317 and bug 1307318 are two follow-ups that must be done by the pipeline team to finally get this data into re:dash - so all the work here is done.