Closed
Bug 1309169
Opened 8 years ago
Closed 8 years ago
Very low number of client_ids reporting engagement measurements in longitudinal
Categories
(Cloud Services Graveyard :: Metrics: Pipeline, defect, P1)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: Dominik, Assigned: Dexter)
References
Details
(Whiteboard: [measurement:client])
Attachments
(1 file)
Looking at the newly created engagement measures on Beta, I only see a very small number of distinct client_ids reporting data. Over the last few days, the maximum number of client_ids was below 400. The query is here: https://sql.telemetry.mozilla.org/queries/1387#2437
The engagement measures dashboard per e10s cohort is here: https://sql.telemetry.mozilla.org/dashboard/engagement-measures-per-e10s-cohort
On the other hand, I can see approximately 500k DAUs for each e10s cohort on Beta for the same days: https://sql.telemetry.mozilla.org/queries/165#table
There is currently a lot of interest in a first engagement analysis for Quantum and e10s, to test our hypothesis that performance drives engagement. These low counts for the engagement measures strongly affect the reliability of the results.
Comment 1•8 years ago
Moving to the pipeline component, as the longitudinal dataset is managed by the data pipeline.
Component: Metrics: Product Metrics → Metrics: Pipeline
Comment 2•8 years ago
Alessio, can you take a first look at what's going on here?
Flags: needinfo?(alessio.placitelli)
Updated•8 years ago (Assignee)
Assignee: nobody → alessio.placitelli
Blocks: 1276200
Points: --- → 2
Priority: -- → P1
Whiteboard: [measurement:client]
Comment 3•8 years ago (Assignee)
(In reply to Georg Fritzsche [:gfritzsche] from comment #2)
> Alessio, can you take a first look at what's going on here?
It looks like there's a bug either in the client or in the code landed by bug 1288180 to add the scalars to the longitudinal view.
I wrote two queries: [1] checks for the presence of the "max_concurrent_tab_count" engagement scalar in pings from the beta clients, and [2] does the same for the nightly population.
I'm checking for that particular scalar as it should almost always be there. The queries show that most of the time it isn't.
My next steps are:
1) Run https://gist.github.com/Dexterp37/9bea37f536d5b25651aecc8b22d2dfb6 again and explicitly check for the max_concurrent_tab_count scalar.
2) If we see that it is missing from the majority of pings, that suggests a bug in the client. Otherwise, we should continue the investigation on the code which adds scalars to the longitudinal.
[1] - https://sql.telemetry.mozilla.org/queries/1395
[2] - https://sql.telemetry.mozilla.org/queries/1397
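For step 1, a check on raw pings could look roughly like the sketch below. This is not the code from the gist. The nested `payload` / `processes` / `parent` / `scalars` layout and the full scalar name are assumptions about the ping format, and `has_scalar` and `missing_fraction` are hypothetical helper names.

```python
from collections.abc import Mapping
from typing import Any, Iterable

# Assumed location of the scalars section inside a ping, and assumed
# full name of the scalar; adjust both to the real ping schema.
SCALARS_PATH = ("payload", "processes", "parent", "scalars")
SCALAR_NAME = "browser.engagement.max_concurrent_tab_count"


def get_path(obj: Any, path: Iterable[str]) -> Any:
    """Walk nested mappings, returning None as soon as a level is missing."""
    for key in path:
        if not isinstance(obj, Mapping):
            return None
        obj = obj.get(key)
    return obj


def has_scalar(ping: Mapping, name: str = SCALAR_NAME) -> bool:
    """True if the ping carries a scalars section that contains `name`."""
    scalars = get_path(ping, SCALARS_PATH)
    return isinstance(scalars, Mapping) and name in scalars


def missing_fraction(pings: Iterable[Mapping]) -> float:
    """Fraction of pings that do not report the scalar (0.0 for no pings)."""
    pings = list(pings)
    if not pings:
        return 0.0
    missing = sum(1 for ping in pings if not has_scalar(ping))
    return missing / len(pings)
```

A small fraction from `missing_fraction` would point away from the client and toward the code that builds the longitudinal view.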
Status: NEW → ASSIGNED
Flags: needinfo?(alessio.placitelli)
Comment 4•8 years ago (Assignee)
I've updated the analysis linked in comment 3 to dig a bit into the pings that were not reporting the engagement measurements. It turns out that:
- Only 0.3% (1,728 of 555,501) of the pings from Nightly do not report engagement measurements.
- Only 0.2% (65,021 of 29,402,868) of the pings from Beta do not report engagement measurements.
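As a quick sanity check, the two percentages can be recomputed from the raw counts quoted above. `missing_percent` is a hypothetical helper name used only for this illustration.

```python
def missing_percent(missing: int, total: int) -> float:
    """Share of pings without engagement measurements, in percent."""
    return 100.0 * missing / total


# Counts taken from the two bullet points above.
nightly = missing_percent(1728, 555501)
beta = missing_percent(65021, 29402868)

print(f"Nightly: {nightly:.1f}%  Beta: {beta:.1f}%")
```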
Given the relatively low volume of problematic pings, the problem seems to be in the code that adds the scalars to the longitudinal dataset.
Specifically, the problem seems to be at [1]. The longitudinal view is known to cut off data when it meets malformed pings (e.g. it drops the whole histograms section for a client if it receives corrupted data). Before we landed the scalars, no client reported a scalars section, so most of the pings didn't have one.
Newer pings do have it, but since the older pings from the same clients do not, we're apparently skipping the scalars for these clients (the majority).
I'm working on a fix.
[1] - https://github.com/mozilla/telemetry-batch-view/blob/master/src/main/scala/com/mozilla/telemetry/views/Longitudinal.scala#L765
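The failure mode described in this comment can be illustrated with a toy model. This is a sketch in Python and not the Scala code at [1]. The flat `scalars` key and the function names are hypothetical, and the tolerant variant is only one possible approach, not necessarily what the eventual fix does.

```python
from typing import Any, Dict, List, Optional

Ping = Dict[str, Any]


def scalars_strict(history: List[Ping]) -> Optional[List[Dict[str, Any]]]:
    """All-or-nothing: drop the client's scalars if any ping lacks the section."""
    collected = []
    for ping in history:
        scalars = ping.get("scalars")
        if scalars is None:
            # A single ping without scalars discards the whole column.
            return None
        collected.append(scalars)
    return collected


def scalars_tolerant(history: List[Ping]) -> List[Optional[Dict[str, Any]]]:
    """Per-ping: keep what exists and record None where the section is absent."""
    return [ping.get("scalars") for ping in history]


# A client history mixing a recent ping with one sent before scalars existed.
history = [
    {"scalars": {"max_concurrent_tab_count": 4}},
    {},
]

print(scalars_strict(history))
print(scalars_tolerant(history))
```

With the strict variant, any client whose history still contains a pre-scalars ping ends up with no scalar data at all, which would match the very low client counts reported in the description.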
Comment 5•8 years ago (Assignee)
Comment 6•8 years ago (Assignee)
The PR was reviewed by :rvitillo and merged. We verified that significantly more scalar data is now available, so the fix seems to work correctly.
Status: ASSIGNED → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
Updated•6 years ago
Product: Cloud Services → Cloud Services Graveyard