Small (but countable) number of pings from Release clients contain pre-release data.
Categories
(Data Science :: Investigation, task, P3)
Tracking
(Not tracked)
People
(Reporter: chutten, Unassigned)
Details
Brief Description of the request (required):
We're seeing 0.0003% of pings on release (about 3 in every million) containing pre-release data (bug 1514031 and bug 1521597). We don't know why. We don't want this to happen.
Possible avenues of investigation:
- Finding some common attributes shared by all "pre-release on release" reporting clients.
- Learning something about the pre-release data sent by these clients (maybe they're only sending data with 1 or 2 samples which might point to a possible timing issue? Maybe they're only sending one type of metric? ...)
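A rough sketch of how both avenues could be approached in one pass. Everything here is an assumption made for illustration: the ping layout (`channel`, `os`, `version`, `probes`), the probe names, and the `PRERELEASE_ONLY` set are hypothetical and do not reflect the real Telemetry schema or the notebook linked below.

```python
from collections import Counter

# Hypothetical: names of probes that should only be recorded on pre-release channels.
PRERELEASE_ONLY = {"PRERELEASE_PROBE_A", "PRERELEASE_PROBE_B"}


def prerelease_probes(ping):
    """Return {probe name: sample count} for pre-release-only probes in a ping."""
    return {
        name: samples
        for name, samples in ping["probes"].items()
        if name in PRERELEASE_ONLY
    }


def summarize(pings, attributes=("os", "version")):
    """Summarize release pings that carry pre-release-only probes."""
    release = [p for p in pings if p["channel"] == "release"]
    offenders = [p for p in release if prerelease_probes(p)]

    # Avenue 1: which attribute values are common among offending pings?
    attribute_counts = {
        attr: Counter(p.get(attr) for p in offenders) for attr in attributes
    }

    # Avenue 2: which probes show up, and with how many samples each?
    probe_counts = Counter()
    sample_counts = Counter()
    for p in offenders:
        for name, samples in prerelease_probes(p).items():
            probe_counts[name] += 1
            sample_counts[samples] += 1

    return {
        "proportion": len(offenders) / len(release) if release else 0.0,
        "attributes": attribute_counts,
        "probes": probe_counts,
        "sample_counts": sample_counts,
    }


# Toy data, invented purely to exercise the functions above.
toy_pings = [
    {"channel": "release", "os": "Windows", "version": "65.0",
     "probes": {"RELEASE_PROBE": 10}},
    {"channel": "release", "os": "Windows", "version": "65.0",
     "probes": {"RELEASE_PROBE": 3, "PRERELEASE_PROBE_A": 1}},
    {"channel": "release", "os": "Linux", "version": "65.0",
     "probes": {"PRERELEASE_PROBE_A": 2}},
    {"channel": "beta", "os": "Windows", "version": "66.0",
     "probes": {"PRERELEASE_PROBE_A": 50}},
    {"channel": "release", "os": "Windows", "version": "64.0",
     "probes": {"RELEASE_PROBE": 7}},
]

summary = summarize(toy_pings)
```

If `sample_counts` is dominated by very small values, that would be consistent with the timing-issue guess. If `probes` is dominated by one or two names, that would point at specific metrics instead.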
Business purpose for this request (required):
Establishing trust and confidence in the Data Platform.
Requested timelines for the request or how this fits into roadmaps or critical decisions (required):
No rush. No active projects depend on this.
Links to any assets (e.g Start of a PHD, BRD; any document that helps describe the project):
The (sketchy) analysis I performed to get the 0.0003% number: https://dbc-caf9527b-e073.cloud.databricks.com/#notebook/98925/command/98926
Name of Data Scientist (If Applicable):
N/A
Comment 1 (Reporter)•5 years ago
See also bug 1544039, where a similar proportion of pings were seen to contain data that wasn't permitted to be recorded on that process. Which processes are permitted to collect Telemetry is a little more complex than the prerelease/release check, but the two checks may be initialized at the same time. Weak evidence for a timing problem?
Comment 2•4 years ago
Work for the DS team is now tracked in Jira. You can search within the Data Science Jira project for the corresponding ticket.