Determine the subset of the Telemetry Environment which should be sent in Glean crash pings
Categories
(Toolkit :: Crash Reporting, task)
Tracking
()
People
(Reporter: afranchuk, Assigned: afranchuk)
References
(Blocks 1 open bug)
Details
Telemetry crash pings send the Environment
along with the ping, which is a fairly large set of information. Pieces of this information are commonly used to organize crash ping information. The Environment values should be audited thoroughly to determine those that should be retained in the Glean crash pings.
Some of these values may already be present in other Glean metrics, in which case they can be easily modified to also be sent in crash pings. These pre-existing metrics should be identified to make it clear which metrics need to be added and reviewed.
Comment 1•2 years ago
|
||
Why not just add the whole Environment
now and audit it later? This gives us query compatibility with the existing telemetry pings and avoids blocking on the work required to do the audit.
Assignee | ||
Comment 2•2 years ago
|
||
Adding the whole environment is a big task (and would require a more involved data review)! I haven't yet looked at the environment fields closely, but others were of the opinion that portions may not be needed at all (and at least some parts have known duplicates already in the crash ping or the Glean client_info
). And perhaps more importantly, some of the fields may not have appropriate Glean metric types to represent them yet. However that will become more obvious once I inspect things.
If it turns out that a large majority or all of the environment can be easily represented, then including the whole environment may be a shortcut we can take. But we are hoping to remove some cruft as part of the conversion to using Glean.
Comment 3•2 years ago
|
||
Does it make sense to get blocked on a data review of data we're already sending?
I was imagining that the environment could just go into the ping as JSON and get put into the same fields as the legacy telemetry ping via ETL.
I understand wanting to clean things up here, I just don't want the crash ping work to bogged down in sorting this out.
Assignee | ||
Comment 4•2 years ago
|
||
I certainly understand your position. As some precedent, we chose to do a data review of the new bits we've started sending already in Glean pings, only because the Telemetry stuff is old enough that there aren't associated data reviews for things.
People have seemed somewhat resistant to using JSON here (I had the same thought for the crash ping stack traces which doesn't have a nice way to be sent in structured Glean metrics yet), in part because we'd like some client-side verification. It'd be cool if Glean supported verifying a field against some JSON schema.
Assignee | ||
Comment 5•2 years ago
|
||
But I agree that I don't want the Environment to be a large roadblock in getting functional crash pings going. One option is to do things incrementally (which we've already started doing anyway).
Assignee | ||
Comment 6•2 years ago
|
||
Here's my analysis of the Telemetry Environment (and the crash ping contents as well). It's not insurmountable, there are just some decisions to be made with regard to where the Glean metrics end up.
Comment 7•2 years ago
|
||
One thing that may be missing from the analysis is where and when to record the information. It's generally best to put instrumentation closest to the position where the code it instruments lives (and where it's tested) so that it can evolve along. We don't want to end up instrumenting the Telemetry Environment itself more than the systems it hoped to reflect.
(That being said, it might make sense to put a lot of stuff inside the Environment for the time being. I'd prefer not, as eventually I'd like to see it retired... but it is a convenient place that is already interested in recording information and keeping it up-to-date.)
Assignee | ||
Comment 8•2 years ago
|
||
(In reply to Chris H-C :chutten from comment #7)
One thing that may be missing from the analysis is where and when to record the information. It's generally best to put instrumentation closest to the position where the code it instruments lives (and where it's tested) so that it can evolve along. We don't want to end up instrumenting the Telemetry Environment itself more than the systems it hoped to reflect.
(That being said, it might make sense to put a lot of stuff inside the Environment for the time being. I'd prefer not, as eventually I'd like to see it retired... but it is a convenient place that is already interested in recording information and keeping it up-to-date.)
I was definitely hoping to put most of the metric recording as close to the information as possible (i.e. not just dovetailing off of Telemetry). I didn't information in the analysis yet because I haven't gone deep enough in the code to determine those locations.
Assignee | ||
Comment 9•2 years ago
|
||
If the document is looking generally acceptable, I'd say we can probably close this bug and open a new one for adding the crash metrics and relevant Environment fields. Maybe that should be two separate bugs, since the Environment-related things will likely end up spread out in the codebase.
Assignee | ||
Updated•2 years ago
|
Description
•