To power the fennec-dashboard, we need to build CSV data exports from the "core" ping, following this format:
https://metrics.services.mozilla.com/fennec-dashboard/data/fennec_weekly_data.csv
https://metrics.services.mozilla.com/fennec-dashboard/data/fennec_monthly_data.csv

This currently contains these columns:
os_version,geo,channel,date,actives,abnormals,new_records,d1,d7,d30,hours,google,yahoo,bing,other

"abnormals" will be cut. Search counts will also not be available (at least initially), so depending on the plans we can drop those columns or fill them with 0s.
The exports can go into:
s3://net-mozaws-prod-metrics-data/fennec-dashboard

To keep the convention established by the Desktop v4 dashboard update, we should name them:
fennec-v4-weekly.csv
fennec-v4-monthly.csv
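As a rough illustration of the column layout discussed above, here is a minimal Python sketch that writes rows with "abnormals" removed and the search-count columns filled with 0s. Filling with 0s rather than dropping the columns is only one of the two options mentioned and is an assumption here; the function name `write_export` and the sample row are hypothetical and not taken from the actual job.

```python
import csv
import io

# Column order taken from the existing CSV files, with "abnormals" cut.
COLUMNS = [
    "os_version", "geo", "channel", "date", "actives", "new_records",
    "d1", "d7", "d30", "hours", "google", "yahoo", "bing", "other",
]
# Search counts are not available initially; this sketch fills them with 0.
SEARCH_COLUMNS = ("google", "yahoo", "bing", "other")


def write_export(rows, out):
    """Write an iterable of dict rows to `out` in the dashboard CSV layout."""
    writer = csv.DictWriter(
        out,
        fieldnames=COLUMNS,
        extrasaction="ignore",  # silently drops keys such as "abnormals"
        lineterminator="\n",
    )
    writer.writeheader()
    for row in rows:
        filled = dict(row)
        for col in SEARCH_COLUMNS:
            filled.setdefault(col, 0)
        writer.writerow(filled)


# Hypothetical sample row, for illustration only.
sample = {
    "os_version": "23", "geo": "US", "channel": "beta", "date": "2016-03-06",
    "actives": 10, "abnormals": 1, "new_records": 2,
    "d1": 1, "d7": 1, "d30": 0, "hours": 3.5,
}
buf = io.StringIO()
write_export([sample], buf)
```

If the decision goes the other way, removing the four search names from `COLUMNS` would drop them from the output instead.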
Priority: -- → P2
Hamilton, what do you think about storing the Spark script used to generate the CSV data in the dashboard repository? - https://mail.mozilla.org/pipermail/fhr-dev/2016-March/000884.html
Talking to mreid, we decided to let this live in the pipeline repository for now:
* repo: https://github.com/mozilla-services/data-pipeline/
* path: reports/fennec_dashboard

That way we can easily find it in case we make any bigger changes. In the medium to longer term we'd want to move away from this Spark job and power this from a longitudinal, client-oriented or other more appropriate derived stream.
We will also need to support 3 modes of operation here:
* weekly & monthly for incremental updates of the csv files
* backfill for the whole time period we are looking at

Ideally we'd want to power that from the same notebook just by looking at the submission arguments or the job name.

Roberto, do you have an idea on how we can do that properly?
Can we see the "Spark submission args" there?
Or maybe get the job name and look for a "-weekly"/"-monthly" suffix?
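To make the incremental modes concrete, here is a small Python sketch of how the reporting window for a weekly or monthly run could be derived from the run date. It assumes weeks start on Sunday and that an incremental run covers the most recent complete period; both are assumptions, and the function names are hypothetical. A backfill would then presumably iterate over every such window in the period of interest.

```python
from datetime import date, timedelta


def last_complete_week(today):
    """Return (start, end) of the most recent full Sunday-to-Saturday week."""
    # date.weekday() is Monday=0 ... Sunday=6, so this is the number of
    # days elapsed since the most recent Sunday (0 if today is Sunday).
    days_since_sunday = (today.weekday() + 1) % 7
    this_week_start = today - timedelta(days=days_since_sunday)
    start = this_week_start - timedelta(days=7)
    end = this_week_start - timedelta(days=1)
    return start, end


def last_complete_month(today):
    """Return (start, end) of the most recent full calendar month."""
    first_of_this_month = today.replace(day=1)
    end = first_of_this_month - timedelta(days=1)
    return end.replace(day=1), end


week_start, week_end = last_complete_week(date(2016, 3, 15))
month_start, month_end = last_complete_month(date(2016, 3, 15))
```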
(In reply to Georg Fritzsche [:gfritzsche] from comment #4)
> Roberto, do you have an idea on how we can do that properly?
> Can we see the "Spark submission args" there?
> Or maybe get the job name and look for a "-weekly"/"-monthly" suffix?

The job name suffix will work, but it's a hack. I filed bug 1258685.
This is being reviewed on Github: https://github.com/mozilla-services/data-pipeline/pull/195
Roberto, any suggestion about how to fetch the job name from a Spark notebook?
You could try to read the filename of the notebook (e.g. YOURJOB.ipynb) from the current working directory.
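A minimal Python sketch of that idea, combined with the "-weekly"/"-monthly" suffix check proposed earlier: it looks for a notebook file in the working directory and derives the mode from its name. The function names and the fallback to "backfill" when no suffix matches are assumptions for illustration, not the merged implementation.

```python
import glob
import os


def mode_from_name(job_name):
    """Map a job name to a mode based on its suffix."""
    if job_name.endswith("-weekly"):
        return "weekly"
    if job_name.endswith("-monthly"):
        return "monthly"
    # Assumption: anything without a recognised suffix is a backfill run.
    return "backfill"


def detect_mode(directory="."):
    """Infer the mode from the first *.ipynb file found in `directory`."""
    notebooks = sorted(glob.glob(os.path.join(directory, "*.ipynb")))
    if not notebooks:
        return "backfill"
    job_name = os.path.splitext(os.path.basename(notebooks[0]))[0]
    return mode_from_name(job_name)
```

This only behaves predictably if the working directory holds a single notebook; with several, the sketch simply takes the first one in sorted order.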
I checked that the active users computed by the script in comment 6, for the week starting on the 6th of March ("beta" population) roughly match the ones from this query: https://sql.telemetry.mozilla.org/queries/85/source#table . They do, so we should be producing sane data from the Spark job.
This was merged: https://github.com/mozilla-services/data-pipeline/commit/ddd255e8b2c5440ad94819fcea88678f894bcce3

We can't power the fennec-dashboard yet due to bug 1257589. We will look into scheduling this for Fennec 46 in bug 1260715.
Status: ASSIGNED → RESOLVED
Last Resolved: 2 years ago
Resolution: --- → FIXED