Closed Bug 1260715 Opened 8 years ago Closed 8 years ago

Review and schedule CSV summary export for the fennec-dashboard for Fennec 46

Categories

(Cloud Services Graveyard :: Metrics: Pipeline, defect, P1)

defect

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: gfritzsche, Assigned: gfritzsche)

References

(Blocks 2 open bugs)

Details

(Whiteboard: [measurement:client])

In bug 1251189, we built an IPython notebook that generates the weekly & monthly CSV data needed for the fennec-dashboard.

However, due to bug 1257589 we don't have useful data from Fennec 45 yet.
We will have to:
* wait until we are clear on what ping version we have in Fennec 46
* potentially update the CSV export to those changes
* wait for validation of the 46 "core" ping data
* if we are good, start scheduling the job for 46+

To find "new records" / "new clients" we use a check like "profile creation date == submission date".
This seems pretty fragile in the face of missing pings, temporary loss of network, ...

We could change that into something like "profile creation date is in current week/month range" (depending on the job type) to make it more stable.
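
A minimal sketch of the range-based check proposed above, for illustration only. The helper names (`week_range`, `is_new_client`) and the Monday-to-Sunday week boundaries are assumptions and are not taken from the notebook:

```python
from datetime import date, timedelta

def week_range(day):
    """Return (monday, sunday) of the week containing `day` (assumed week boundaries)."""
    monday = day - timedelta(days=day.weekday())
    return monday, monday + timedelta(days=6)

def is_new_client(profile_creation, period_start, period_end):
    """True if the profile was created inside the reporting period (inclusive)."""
    return period_start <= profile_creation <= period_end

# Hypothetical weekly job covering the week that contains 2016-05-25.
start, end = week_range(date(2016, 5, 25))
print(is_new_client(date(2016, 5, 24), start, end))  # created within the week
print(is_new_client(date(2016, 5, 22), start, end))  # created the week before
```

With this shape, a client whose first ping arrives a few days after profile creation is still counted as new, as long as the creation date falls inside the job's period.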
From a first backfill viewed in the diagnostics viewer [1], the retention data seems off.
I see d1 < d7 < d30, expected is d1 > d7 > d30.

It looks like the retention checks in the notebook [2] are the wrong way around:
> # Is the user still engaged after 1 day (d1)?
> if days_after_creation == 1:
>   safe_increment(acc, 'd1')
>
> # And after 7 days (d7)?
> if days_after_creation <= 7:
>   safe_increment(acc, 'd7')
...

1: https://metrics.services.mozilla.com/diagnostic-data-viewer/?dataset=fennec-v4-weekly#
2: https://github.com/mozilla-services/data-pipeline/blob/master/reports/fennec_dashboard/summarize_csv.ipynb
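
For illustration, one way the comparisons could be turned around so that d1 >= d7 >= d30 always holds. This is a sketch under assumptions, not the change that was later merged: `safe_increment` here is a hypothetical stand-in for the notebook helper of the same name, and `count_retention` is an invented name.

```python
def safe_increment(acc, key):
    """Hypothetical stand-in for the notebook's helper: bump a counter in a dict."""
    acc[key] = acc.get(key, 0) + 1

def count_retention(acc, days_after_creation):
    """Count a client for every retention bucket whose threshold it has reached."""
    # Each threshold is a lower bound, so a client counted for d30
    # is also counted for d7 and d1.
    for key, threshold in (('d1', 1), ('d7', 7), ('d30', 30)):
        if days_after_creation >= threshold:
            safe_increment(acc, key)

acc = {}
for days in (0, 1, 3, 8, 45):  # made-up values, one per client
    count_retention(acc, days)
print(acc)
```

Because every bucket uses a lower bound, the counts cannot increase from d1 to d7 to d30, which matches the expected ordering.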
(In reply to Georg Fritzsche [:gfritzsche] from comment #2)
> From a first backfill viewed in the diagnostics viewer [1], the retention
> data seems off.
> I see d1 < d7 < d30, expected is d1 > d7 > d30.

Good catch, my bad. I'll fix this right now and link to the PR.
The retention change was merged.

Other things:
* we need to update this to work from schema version 2
* we need to think about how to avoid having to bump the schema version manually all the time
* we should print out how many date chunks a backfill is working over (so we have an idea of job progress)
We also need to remove the count() calls that trigger redundant transformations etc.
Also we need to filter for os=="Android" until we know how/if iOS is supposed to be integrated.
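
A rough sketch of two of the items above: reporting how many date chunks a backfill covers, and filtering for Android. The helper names, the 7-day chunk size, and the dict-shaped pings are assumptions made for illustration and do not reflect the notebook's actual code:

```python
from datetime import date, timedelta

def weekly_chunks(start, end):
    """Yield (chunk_start, chunk_end) pairs of up to 7 days covering start..end inclusive."""
    current = start
    while current <= end:
        chunk_end = min(current + timedelta(days=6), end)
        yield current, chunk_end
        current = chunk_end + timedelta(days=1)

def is_android(ping):
    """Keep only Android pings; assumes the ping is a dict with an 'os' field."""
    return ping.get('os') == 'Android'

# Hypothetical backfill range.
chunks = list(weekly_chunks(date(2016, 5, 2), date(2016, 5, 29)))
print("Backfilling %d date chunks" % len(chunks))
for i, (chunk_start, chunk_end) in enumerate(chunks, 1):
    print("chunk %d/%d: %s to %s" % (i, len(chunks), chunk_start, chunk_end))
```

Printing the chunk count up front and an index per chunk gives a cheap progress indicator without triggering extra count() calls on the data itself.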
(In reply to Georg Fritzsche [:gfritzsche] from comment #5)

> * we need to think about how to avoid having to bump the schema version
> manually all the time

I had the same issue with the ETL script used to move core ping data into Re:dash. Mark Reid suggested using source_version='*' in get_pings. It works! A single get_pings can grab all the data for all the pings.
Priority: P3 → P2
I backfilled the weekly dataset from the 46 release date on:
https://metrics.services.mozilla.com/diagnostic-data-viewer/?dataset=fennec-v4-weekly#
Assignee: nobody → gfritzsche
We backfilled the weekly summary; looking at it in the diagnostic data viewer, it seems good:
https://metrics.services.mozilla.com/diagnostic-data-viewer/?dataset=fennec-v4-weekly#

We have to wait for a full month of data for the first monthly backfill and more weekly sanity checks.
I'll pick this up again in early June and wrap it up then.
Assignee: gfritzsche → nobody
Assignee: nobody → gfritzsche
Priority: P2 → P1
Points: 2 → 3
This was backfilled:
* weekly data: until May 29
* monthly data: for the whole of May

I scheduled a job for the weekly update (Mondays at 11 AM UTC).
PR with a refactoring for speed-up and some logging improvements:
https://github.com/mozilla-services/data-pipeline/pull/214
We had a first meeting in London with adavis and bbermes: Overall the numbers don't look completely off.
* The release channel ADI is within the Adjust data bounds and growing toward that install base.
* Sadly Adjust is not giving us a breakdown by app version, so we can't directly cross-check (although adavis had ideas on cross-checking raw Adjust report & "core" ping submission numbers).
* The retention numbers are lower than what we see in Adjust (but not by very large margins); we don't know why yet.
* Using "fixed retention" seems good, as it seems to be the "industry standard".

Actions from there:
* We need a follow-up meeting and need to make a decision on whether the dashboard data can go live
* hide FHR retention data from dashboard, because the metric changed (fixed vs. rolling retention)
* poke mreid for the PR review
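
To make the "fixed vs. rolling retention" distinction concrete, here is a small sketch using the definitions as they are commonly used in mobile analytics. The thread itself does not spell out the definitions, so both the function names and the exact semantics are assumptions:

```python
def fixed_retained(active_days, n):
    """Fixed (N-day) retention: the client was active exactly on day n after creation."""
    return n in active_days

def rolling_retained(active_days, n):
    """Rolling retention: the client was active on day n or on any later day."""
    return any(day >= n for day in active_days)

# Made-up client: active on the creation day, the day after, and day 9.
active_days = {0, 1, 9}
print(fixed_retained(active_days, 7))    # not active on day 7 itself
print(rolling_retained(active_days, 7))  # active on day 9, which is after day 7
```

Under these definitions rolling retention is never lower than fixed retention for the same day, so numbers computed with the two metrics are not directly comparable.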

The backfill for last week already happened per the scheduled job, so the scheduling seems to work now.
Notes from the London meeting (moco only due to ADI numbers):
https://docs.google.com/document/d/1cYGaQ3s2Vhk489oi-bLPtwtmLkNZqsRA_QhzWk6bvCY/edit
The monthly data for June is now also available & the weekly scheduled jobs have been running fine:
https://metrics.services.mozilla.com/fennec-dashboard/
Blocks: 1284932
The scheduling happened and should be working properly.
I'm breaking out the retention data investigation into bug 1284932.
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
Product: Cloud Services → Cloud Services Graveyard