Investigate latency of seeing first baseline ping from `first_run_date`
Categories
(Data Platform and Tools :: Glean: SDK, defect, P1)
Tracking
(Not tracked)
People
(Reporter: mdroettboom, Assigned: travis_)
Attachments
(1 file: 119.98 KB, image/png)
In :gkabbz's investigation, it was discovered that on Fenix and Fennec iOS (using Glean), only 73% of clients send their first baseline ping on the same day as first_run_date. After 3 days, we've heard from only 89% of clients. Even at 60 days, we've heard from only 99% of clients for the first time. (See the attachment for the full table.)
For comparison, this is worse than core ping latency on Fennec for Android, where 97% of clients appear on the first day (this probably represents a lower bound of what we could achieve on Android). The Glean numbers are significantly better than core ping latency on Fennec iOS, but there we can assume significant problems in legacy telemetry.
We should investigate what might be causing this effect -- is it client-side, or is there something in the pipeline causing this delay? Does separating out baseline pings by "reason" reveal any pattern?
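To make the numbers above concrete, here is a minimal sketch of how the cumulative "heard from by day N" percentages could be computed. It is not the query used in the investigation: the function name, the input shape (one `(first_run_date, first_ping_date)` pair per client), and the choice of denominator (all clients, including those never seen) are assumptions for illustration.

```python
from datetime import date


def cumulative_first_ping_share(clients, horizons=(0, 3, 7, 60)):
    """Percent of clients whose first baseline ping arrived within N days.

    `clients` is an iterable of (first_run_date, first_ping_date) pairs;
    first_ping_date is None for clients that have not sent a ping yet.
    This input shape is hypothetical, not a real table schema.
    """
    clients = list(clients)
    total = len(clients)  # assumed denominator: every client, seen or not
    if total == 0:
        return {h: 0.0 for h in horizons}

    # Lag in whole days between first run and first ping.
    # Negative lags (ping dated before first_run_date) count as "on time".
    lags = [
        (first_ping - first_run).days
        for first_run, first_ping in clients
        if first_ping is not None
    ]
    return {
        h: 100.0 * sum(1 for lag in lags if lag <= h) / total
        for h in horizons
    }


sample = [
    (date(2020, 11, 1), date(2020, 11, 1)),   # same day
    (date(2020, 11, 1), date(2020, 11, 3)),   # 2 days late
    (date(2020, 11, 1), date(2020, 11, 20)),  # 19 days late
    (date(2020, 11, 1), None),                # never seen
]
shares = cumulative_first_ping_share(sample)
```

Changing what counts toward `total` changes every percentage, so the denominator needs to be pinned down before comparing two analyses.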
Comment 1 (Reporter) • 5 years ago
A couple of random thoughts:
What is the denominator in the study -- is it the number of clients with a metrics ping, maybe? Why, for example, even after 60 days are we not at 100%, even though we don't look back more than 60 days?
Hopefully separating out by ping reason will reveal something. Particularly the "dirty startup" pings -- we would expect high latency with those. Are most of the late ones "dirty startup"? That would suggest users trying once briefly and not coming back until much later, maybe...
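A rough sketch of the "separate by reason" check, assuming we already have each client's first baseline ping as a `(lag_days, reason)` pair. The function name, the input shape, and the reason strings in the sample are illustrative assumptions, not taken from the actual data.

```python
from collections import Counter


def late_first_ping_reasons(first_pings, min_lag_days=1):
    """Share of each ping reason among late first baseline pings.

    `first_pings` is an iterable of (lag_days, reason) pairs, one per
    client (hypothetical shape). A ping is "late" when lag_days is at
    least `min_lag_days`.
    """
    counts = Counter(
        reason for lag, reason in first_pings if lag >= min_lag_days
    )
    total = sum(counts.values())
    if total == 0:
        return {}
    return {reason: 100.0 * n / total for reason, n in counts.items()}


sample = [
    (0, "inactive"),       # on time, ignored
    (5, "dirty_startup"),  # late
    (9, "dirty_startup"),  # late
    (2, "inactive"),       # late
    (0, "dirty_startup"),  # on time, ignored
]
reason_shares = late_first_ping_reasons(sample)
```

If "dirty startup" dominated the late group, that would be consistent with the hypothesis above of users trying the app briefly and returning much later.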
Comment 2 (Reporter) • 5 years ago
Comment 3 (Assignee) • 5 years ago
Comment 4 (Assignee) • 5 years ago
Next step is to determine why this differs from George's analysis.
Comment 5 (Assignee) • 5 years ago
From what I can tell from the wonderful analysis that George did, my approach ended up being basically the same. The one difference is that his analysis took the "first ping" date from the baseline_clients_last_seen derived dataset, while I pulled it directly from the raw ping table. I currently know nothing about how baseline_clients_last_seen is generated, so I can't yet explain why it would cause this difference.
Comment 6 (Assignee) • 5 years ago
Thanks to frank for pointing out that the numbers are pretty close for iOS when you look at the "Including the day prior to profile creation date" section of George's analysis colab notebook. At day 7, his analysis shows that 98.3969% of the population has sent a ping, and my analysis showed 97.048% (so actually slightly worse).
For Fenix, on the other hand, my analysis shows that 99.249% of the population has sent us a ping by day 7, while George's analysis shows only 92.0123%. Looking into why, I found that I had filtered out dates that included the migration from Fennec to Fenix (by using a post-migration date range), while George's analysis focused on dates that fell within the migration (2020-08-01 to 2020-10-30).
After looking at this, I'm still interested in why about 3% of iOS clients have yet to send us a ping at day 7, but I'm not nearly as worried about Fenix, since the most recent data shows that > 99% have sent us a ping by day 7.
Comment 7 (Assignee) • 5 years ago
Closing this as "FIXED" since it looks like the problem is fairly small.