New Metric Type: (Keyed) Enumerated counts (needs a better name)
Categories
(Data Platform and Tools :: Glean Metric Types, enhancement)
Tracking
(Not tracked)
People
(Reporter: chutten, Unassigned)
Details
(Whiteboard: [telemetry-parity])
Proposal for changing an existing or adding a new Glean metric type
Who is the individual/team requesting this change?
:chutten, Project FOG
Is this about changing an existing metric type or creating a new one?
Either.
Can you describe the data that needs to be recorded?
Here are the keyed ones:
- HTTP3_CONNECTTION_CLOSE_CODE
- URLCLASSIFIER_UPDATE_REMOTE_NETWORK_ERROR
- URLCLASSIFIER_UPDATE_REMOTE_STATUS2
- URLCLASSIFIER_UPDATE_TIMEOUT
- URLCLASSIFIER_COMPLETE_REMOTE_STATUS2
- URLCLASSIFIER_UPDATE_ERROR
- FX_MIGRATION_ERRORS
- FX_MIGRATION_USAGE
- FX_URLBAR_SELECTED_RESULT_INDEX_BY_TYPE_2
- POPUP_NOTIFICATION_STATS
- NETWORK_CACHE_SIZE_SHARE
- NETWORK_CACHE_ENTRY_COUNT_SHARE
- SANDBOX_FAILED_LAUNCH_KEYED
Can you provide a raw sample of the data that needs to be recorded (this is in the abstract, and not any particular implementation details about its representation in the payload or the database)
In general, when a return code from an API is an enum (as in C++'s enum), mapping that to the appropriate label of a categorical histogram (or keyed categorical histogram) is unpleasant. If we cannot make an ergonomic API for taking (subsets of) C++ enums as the label for a count (with or without a second-level key), then we'll need a new metric type that takes the integer representation of the enum as the label for the count.
What would be really neat is if glean_parser and the SDK could somehow conspire to generate an API that took the actual enum as the parameter. E.g. for SANDBOX_FAILED_LAUNCH_KEYED, how awesome would it be if we generated an API that used sandbox::ResultCode, like
glean::sandbox::failed_launch.Accumulate(sandbox::ResultCode aBucket, uint32_t aSample = 1);
And we'd take care of the details of rewriting it to a transformed string based on the enum label, adjusting to additions or removals from the enum... just magicked the whole thing away.
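To make the shape of that wish a bit more concrete, here is a minimal, self-contained sketch of what generated code along these lines might look like. Everything in it is an assumption for illustration: the `sketch` namespace, the three-member `ResultCode` stand-in (the real sandbox::ResultCode is different), the `LabelFor` helper, the `__other__` catch-all label, and the in-memory `EnumCounter` class. None of it is Glean SDK or glean_parser API.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

namespace sketch {

// Hypothetical stand-in for sandbox::ResultCode; the real enum differs.
enum class ResultCode : uint32_t { Ok = 0, GenericError = 1, AccessDenied = 5 };

// What a generator might emit: one label per known enumerator, plus a
// catch-all so values it does not know about still get counted somewhere.
inline std::string LabelFor(ResultCode aCode) {
  switch (aCode) {
    case ResultCode::Ok:
      return "ok";
    case ResultCode::GenericError:
      return "generic_error";
    case ResultCode::AccessDenied:
      return "access_denied";
  }
  return "__other__";  // assumed catch-all label, not an existing convention
}

// Minimal in-memory counter keyed by label. Illustration only.
class EnumCounter {
 public:
  // Mirrors the proposed signature: the caller passes the enum itself.
  void Accumulate(ResultCode aBucket, uint32_t aSample = 1) {
    mCounts[LabelFor(aBucket)] += aSample;
  }

  // Returns the count for a label, or 0 if nothing was recorded under it.
  uint32_t Get(const std::string& aLabel) const {
    auto it = mCounts.find(aLabel);
    return it == mCounts.end() ? 0 : it->second;
  }

 private:
  std::map<std::string, uint32_t> mCounts;
};

// Call sites never see the string labels.
inline void ExampleUsage() {
  EnumCounter failedLaunch;
  failedLaunch.Accumulate(ResultCode::AccessDenied);
  failedLaunch.Accumulate(ResultCode::AccessDenied, 2);
  assert(failedLaunch.Get("access_denied") == 3);
}

}  // namespace sketch
```

The point of the sketch is only that the enum-to-label mapping lives in generated code, so adding or removing an enumerator would mean regenerating `LabelFor` rather than touching call sites.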
What is the business question/use-case that requires the data to be recorded?
Various.
How would the data be consumed?
GLAM and redash
Why existing metric types are not enough?
Nothing exists that can deal with the keyed case, and the story for the non-keyed case isn't fantastic either.
What is the timeline by which the data needs to be collected?
Q1 2021
Comment 1•4 years ago
I would use this for GCReason histograms. It is an enumeration of the different reasons that we might GC, currently collected in the Telemetry measurement GC_REASON_2. I would like to have it keyed by the relevant state of the browser, something like (1) there is user input pending, (2) an animation is playing, or (3) neither of the above. GCReason is a pretty large enumeration of over 50 different reasons that we slowly add to, and we never shift the numbering around, even when a reason stops being used.
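As a rough illustration of the keyed case, the sketch below counts by (browser state, integer value of the reason). Because the numbering never shifts, the integer representation stays a stable bucket even as reasons are added or retired. All names here are assumptions made up for the example: the `gc_sketch` namespace, the three-member `GCReason` subset and its numeric values, the `BrowserState` enum, the key strings, and the `KeyedEnumCounter` class. It is not the real GCReason definition or any Glean API.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <utility>

namespace gc_sketch {

// Hypothetical subset with made-up values; the real enumeration has over 50
// members. The gaps stand in for retired reasons whose numbers are not reused.
enum class GCReason : uint32_t { Api = 0, AllocTrigger = 3, TooMuchMalloc = 7 };

// The three browser states suggested as keys.
enum class BrowserState { UserInputPending, AnimationPlaying, Neither };

inline const char* KeyFor(BrowserState aState) {
  switch (aState) {
    case BrowserState::UserInputPending:
      return "user_input_pending";
    case BrowserState::AnimationPlaying:
      return "animation_playing";
    case BrowserState::Neither:
      return "neither";
  }
  return "unknown";
}

// In-memory keyed counter: first-level key is the browser state, the bucket
// is the integer representation of the enum. Illustration only.
class KeyedEnumCounter {
 public:
  void Accumulate(BrowserState aKey, GCReason aReason, uint32_t aSample = 1) {
    mCounts[Slot{KeyFor(aKey), static_cast<uint32_t>(aReason)}] += aSample;
  }

  // Returns the count for (key, bucket), or 0 if nothing was recorded.
  uint32_t Get(const std::string& aKey, uint32_t aBucket) const {
    auto it = mCounts.find(Slot{aKey, aBucket});
    return it == mCounts.end() ? 0 : it->second;
  }

 private:
  using Slot = std::pair<std::string, uint32_t>;
  std::map<Slot, uint32_t> mCounts;
};

inline void ExampleUsage() {
  KeyedEnumCounter gcReason;
  gcReason.Accumulate(BrowserState::AnimationPlaying, GCReason::TooMuchMalloc);
  assert(gcReason.Get("animation_playing", 7) == 1);
}

}  // namespace gc_sketch
```

Using the raw integer as the bucket sidesteps keeping a label table in sync with the enum, at the cost of needing the enum definition at analysis time to read the results.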