Filter out weird buckets coming from Glean SDK <= v19.0.0
Categories: Data Platform and Tools :: General, defect, P2
Tracking: Not tracked
People: Reporter: Dexter, Assigned: klukas
Attachments: 1 file
In bug 1591938 we found that the Kotlin implementation of the Glean SDK generates buckets that are only 1ns apart on Android SDK 22. This might have to do with a different rounding implementation on that platform.
This problem will go away with Glean SDK version 19.0.0, which implements this part in Rust.
We decided to filter out this flaky bucket data from the views.
Comment 1 • 4 years ago
(In reply to Alessio Placitelli [:Dexter] from comment #0)
> We decided to filter out this flaky bucket data from the views.
Can you expand a bit more on this?
Comment 2 (Reporter) • 4 years ago
Hey Mike, can you add some more info about how to identify flaky buckets and how to filter them?
Comment 3 • 4 years ago
If we just want to remove the incorrectly-collected data, we would just remove all pings where client_info.telemetry_sdk_version < 19.0.0.
To actually correct the buckets, I could probably manually generate a mapping by looking at broken vs. correct bucketing and we could update the numbers. Even then, we'd be introducing a very small amount of error, due to the differences in the original buckets. Since this only affects GV data which is pretty new anyway, I don't know if it's worth the trouble, though.
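A rough sketch of what such a remapping could look like, assuming we already had a sorted list of correct bucket lower bounds. The function name and all numbers below are invented for illustration and are not real Glean bucket bounds. Each broken bucket is folded into the nearest correct bucket at or below it.

```python
from bisect import bisect_right
from collections import Counter


def rebucket(broken, correct_bounds):
    """Fold counts keyed by broken bucket starts into the correct buckets.

    broken: dict mapping a bucket lower bound (ns) to a sample count.
    correct_bounds: sorted list of correct bucket lower bounds (ns).
    """
    fixed = Counter()
    for start, count in broken.items():
        # Index of the last correct bound that is <= start.
        idx = bisect_right(correct_bounds, start) - 1
        if idx < 0:
            # Below the first correct bucket: clamp to it.
            idx = 0
        fixed[correct_bounds[idx]] += count
    return dict(fixed)


# Hypothetical data: 1090 and 1091 are the kind of 1ns-apart buckets
# described in this bug.
correct_bounds = [1000, 1090, 1189]
broken = {1000: 3, 1090: 2, 1091: 5, 1189: 1}
print(rebucket(broken, correct_bounds))
```

Samples that the broken bucketing placed on the wrong side of a correct boundary cannot be recovered this way, which is one source of the residual error.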
Comment 4 (Assignee) • 4 years ago
> Since this only affects GV data which is pretty new anyway, I don't know if it's worth the trouble, though.
What tables or products are affected by this? All android products?
> If we just want to remove the incorrectly-collected data, we would just remove all pings where client_info.telemetry_sdk_version < 19.0.0.
I assume we'll still be receiving some amount of pings with the old version for some time. Is that true?
Comment 5 (Assignee) • 4 years ago
:Dexter - It is still unclear to me what criteria we want to use to filter out affected data. The easiest path seems to be to filter in user-facing views based on client_info.telemetry_sdk_version < 19.0.0 and I am inclined to move forward with that, but I still need to know whether all glean tables are affected or just particular products.
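One thing to watch with a "< 19.0.0" filter is that the version is a string, so a plain string comparison orders "9.0.0" after "19.0.0". A minimal sketch of a numeric check on the major component (the helper names are hypothetical):

```python
def major_version(version):
    """Return the leading numeric component of a version string, or None."""
    try:
        return int(str(version).split(".")[0])
    except ValueError:
        return None


def histograms_trusted(version, first_good_major=19):
    """True only when the major version parses and is at least 19."""
    major = major_version(version)
    return major is not None and major >= first_good_major


for v in ["9.0.0", "18.2.1", "19.0.0", "21.3.0", "unknown"]:
    print(v, histograms_trusted(v))

# Lexicographic comparison gets this wrong: "9.0.0" sorts after "19.0.0".
print("9.0.0" > "19.0.0")
```

Unparseable versions are treated as untrusted here, which is a choice made for the sketch, not something decided in this bug.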
Comment 6 (Reporter) • 4 years ago
Hey Mike, any chance you could fill in the details for Jeff? (see comment 5)
Comment 7 • 4 years ago
Comment 5 seems right to me, assuming we are just doing that for the histograms (and not losing other data from those old pings).
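To make the "histograms only" point concrete, here is a small sketch of the intended effect on a single ping, modelled as a nested dict. The counter field and the sample values are made up; only the timing distributions are blanked and everything else in the old ping is kept.

```python
import copy


def _major(version):
    """Leading numeric component of a version string, or None."""
    try:
        return int(str(version).split(".")[0])
    except ValueError:
        return None


def hide_old_histograms(ping, first_good_major=19):
    """Return a copy of ping with timing_distribution nulled for old SDKs."""
    cleaned = copy.deepcopy(ping)
    major = _major(ping.get("client_info", {}).get("telemetry_sdk_build"))
    if major is None or major < first_good_major:
        metrics = cleaned.get("metrics")
        if isinstance(metrics, dict) and "timing_distribution" in metrics:
            metrics["timing_distribution"] = None
    return cleaned


# Hypothetical ping from an old SDK build.
old_ping = {
    "client_info": {"telemetry_sdk_build": "18.0.1"},
    "metrics": {
        "counter": {"some_counter": 4},
        "timing_distribution": {"some_timer": {"sum": 10}},
    },
}
print(hide_old_histograms(old_ping))
```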
Comment 8 (Assignee) • 4 years ago
(In reply to Michael Droettboom [:mdroettboom] from comment #7)
> Comment 5 seems right to me, assuming we are just doing that for the histograms (and not losing other data from those old pings).
Based on this discussion, it sounds like we need to alter all metrics views similar to the following:
SELECT
  * REPLACE (
    (
      SELECT AS STRUCT
        metrics.* REPLACE (
          IF(
            SAFE_CAST(SPLIT(client_info.telemetry_sdk_build, '.')[OFFSET(0)] AS INT64) >= 19,
            metrics.timing_distribution,
            NULL
          ) AS timing_distribution
        )
    ) AS metrics
  )
FROM
  `moz-fx-data-shared-prod.org_mozilla_fenix_stable.metrics_v1` AS m
WHERE
  DATE(submission_timestamp) = "2019-12-12"
LIMIT
  1000
A few assumptions in there that I'd like to have validated:
- Only metrics pings contain histograms
- All histograms appear under metrics.timing_distribution
- All metrics pings will contain a nested timing_distribution field
If any of the above is not correct, then it may not be feasible to do this generically as part of view generation, and we'll instead have to target individual tables for which we want to provide this filtering in their views. Although we can probably check if a metrics.timing_distribution field exists as part of the logic for creating views.
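A minimal sketch of that existence check, assuming the view generator has the table schema available as a list of BigQuery-style field dicts with "name", "type" and "fields" keys. The function names and the template are hypothetical and are not the actual view-generation code.

```python
def has_field(schema, path):
    """True if a dotted path such as 'metrics.timing_distribution' exists."""
    head, _, rest = path.partition(".")
    for field in schema:
        if field["name"] == head:
            if not rest:
                return True
            # Descend into the nested record's fields.
            return has_field(field.get("fields", []), rest)
    return False


REPLACE_TEMPLATE = """SELECT
  * REPLACE (
    (
      SELECT AS STRUCT
        metrics.* REPLACE (
          IF(
            SAFE_CAST(SPLIT(client_info.telemetry_sdk_build, '.')[OFFSET(0)] AS INT64) >= 19,
            metrics.timing_distribution,
            NULL
          ) AS timing_distribution
        )
    ) AS metrics
  )
FROM
  `{table}`"""


def view_query(table, schema):
    """Apply the histogram replacement only when the nested field exists."""
    if has_field(schema, "metrics.timing_distribution"):
        return REPLACE_TEMPLATE.format(table=table)
    return f"SELECT\n  *\nFROM\n  `{table}`"
```

Tables without the nested field fall back to a plain pass-through view, so the generated SQL never references a column that is missing.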
Comment 9 • 4 years ago
> A few assumptions in there that I'd like to have validated:
> - Only metrics pings contain histograms
In general, this is not true; however, in this specific case, it is.
> - All histograms appear under metrics.timing_distribution
Again, in general, not true; however, it looks like that's the only piece that needs hiding.
> - All metrics pings will contain a nested timing_distribution field
For Fenix and Fenix nightly, the metrics ping schema has a timing_distribution field. That is again not true in general.
Does that clear things up? Solving this specific case is all I believe is necessary here.
Comment 10 (Assignee) • 4 years ago
> Does that clear things up? Solving this specific case is all I believe is necessary here.
Based on what you just said, we only need to worry about org_mozilla_fenix*.metrics. I will plan to apply the replacement as in the query above and filter it only to metrics pings from fenix products.
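A small sketch of how that selection could be expressed, treating org_mozilla_fenix*.metrics as a glob over "dataset.table" names. The helper name and the dataset names in the loop are illustrative.

```python
from fnmatch import fnmatchcase

# Glob taken from the comment above; matched case-sensitively.
PATTERN = "org_mozilla_fenix*.metrics"


def needs_histogram_filter(dataset, table):
    """True for metrics tables that belong to a Fenix dataset."""
    return fnmatchcase(f"{dataset}.{table}", PATTERN)


for dataset, table in [
    ("org_mozilla_fenix", "metrics"),
    ("org_mozilla_fenix_nightly", "metrics"),
    ("org_mozilla_fenix", "baseline"),
    ("org_mozilla_reference_browser", "metrics"),
]:
    print(dataset, table, needs_histogram_filter(dataset, table))
```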
Comment 11 • 4 years ago
Comment 12 (Assignee) • 4 years ago
The view change is now deployed. Closing.