Open Bug 1657473 Opened 5 years ago Updated 2 years ago

New Metric Type: (Keyed) Enumerated counts (needs a better name)

Categories

(Data Platform and Tools :: Glean Metric Types, enhancement)

Tracking

(Not tracked)

People

(Reporter: chutten, Unassigned)

Details

(Whiteboard: [telemetry-parity])

Proposal for changing an existing or adding a new Glean metric type

Who is the individual/team requesting this change?

:chutten, Project FOG

Is this about changing an existing metric type or creating a new one?

Either.

Can you describe the data that needs to be recorded?

Here are the keyed ones:

  • HTTP3_CONNECTTION_CLOSE_CODE
  • URLCLASSIFIER_UPDATE_REMOTE_NETWORK_ERROR
  • URLCLASSIFIER_UPDATE_REMOTE_STATUS2
  • URLCLASSIFIER_UPDATE_TIMEOUT
  • URLCLASSIFIER_COMPLETE_REMOTE_STATUS2
  • URLCLASSIFIER_UPDATE_ERROR
  • FX_MIGRATION_ERRORS
  • FX_MIGRATION_USAGE
  • FX_URLBAR_SELECTED_RESULT_INDEX_BY_TYPE_2
  • POPUP_NOTIFICATION_STATS
  • NETWORK_CACHE_SIZE_SHARE
  • NETWORK_CACHE_ENTRY_COUNT_SHARE
  • SANDBOX_FAILED_LAUNCH_KEYED

Can you provide a raw sample of the data that needs to be recorded (this is in the abstract, and not any particular implementation details about its representation in the payload or the database)?

In general, when a return code from an API is an enum (as in C++'s enum), mapping that to the appropriate label of a categorical histogram (or keyed categorical histogram) is unpleasant. If we cannot make an ergonomic API for taking (subsets of) C++ enums as the label for a count (with or without a second-level key), then we'll need a new metric type that needs to take the integer representation of the enum as the label for the count.

What would be really neat is if glean_parser and the SDK could somehow conspire to generate an API that took the actual enum as the parameter. For example, for SANDBOX_FAILED_LAUNCH_KEYED, how awesome would it be if we generated an API that used sandbox::ResultCode, like:

glean::sandbox::failed_launch.Accumulate(sandbox::ResultCode aBucket, uint32_t aSample = 1);

And we'd take care of the details of rewriting it to a transformed string based on the enum label, adjusting to additions or removals from the enum... just magicked the whole thing away.
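To make the idea concrete, here is a rough sketch of what such generated code might look like. Everything in it is an assumption for illustration: the `sketch` namespace, the three-value `ResultCode` stand-in (not the real sandbox::ResultCode), the `LabelFor` helper, the `KeyedEnumCounter` class, the extra key parameter, and the `__other__` catch-all label are not part of any existing glean_parser output or SDK API.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

namespace sketch {

// Stand-in for the real sandbox::ResultCode; these values are illustrative only.
enum class ResultCode : uint32_t { AllOk = 0, ErrorGeneric = 1, ErrorBadParams = 2 };

// What a generator might emit: one case per enumerator known at build time.
inline std::string LabelFor(ResultCode aCode) {
  switch (aCode) {
    case ResultCode::AllOk:          return "all_ok";
    case ResultCode::ErrorGeneric:   return "error_generic";
    case ResultCode::ErrorBadParams: return "error_bad_params";
  }
  // Enumerators added after generation land in a catch-all label (assumed name).
  return "__other__";
}

// Hypothetical keyed, enum-labelled counter: key -> label -> count.
class KeyedEnumCounter {
 public:
  void Accumulate(const std::string& aKey, ResultCode aBucket, uint32_t aSample = 1) {
    assert(!aKey.empty() && "this sketch assumes non-empty keys");
    mCounts[aKey][LabelFor(aBucket)] += aSample;
  }

  // Returns 0 for any key/label pair that has never been accumulated.
  uint32_t Get(const std::string& aKey, ResultCode aBucket) const {
    auto keyIt = mCounts.find(aKey);
    if (keyIt == mCounts.end()) return 0;
    auto labelIt = keyIt->second.find(LabelFor(aBucket));
    return labelIt == keyIt->second.end() ? 0 : labelIt->second;
  }

 private:
  std::map<std::string, std::map<std::string, uint32_t>> mCounts;
};

}  // namespace sketch
```

A call site would then pass the enum directly, e.g. `counter.Accumulate("gpu", sketch::ResultCode::ErrorGeneric);`, and never see the label string. Regenerating `LabelFor` whenever the enum changes is the part that would need real tooling support.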

What is the business question/use-case that requires the data to be recorded?

Various.

How would the data be consumed?

GLAM and redash

Why existing metric types are not enough?

Nothing exists that can deal with the keyed case, and the story for the non-keyed case isn't fantastic either.

What is the timeline by which the data needs to be collected?

Q1 2021

I would use this for GCReason histograms. GCReason is an enumeration of the different reasons that we might GC, currently collected in the Telemetry measurement GC_REASON_2. I would like to have it keyed by the relevant state of the browser, something like (1) there is user input pending, (2) an animation is playing, or (3) neither of the above. GCReason is a pretty large enumeration of over 50 different reasons. We slowly add to it, but we never shift the numbering around, even when a reason stops being used.
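As a sketch of the GC use case, the fallback from the proposal (using the integer representation of the enum as the label) fits an enum whose numbering never shifts. The names below are all hypothetical: `gc_sketch::Reason` is a tiny stand-in for GCReason with made-up values, and `BrowserState`, `KeyFor`, `IntLabel`, and `KeyedReasonCounts` are invented for illustration rather than taken from Gecko or Glean.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <type_traits>
#include <utility>

namespace gc_sketch {

// Illustrative stand-in for GCReason: values are never renumbered, so a gap
// (here, 2) appears when a reason is retired.
enum class Reason : uint32_t { Api = 0, AllocTrigger = 1, /* 2 retired */ PageHide = 3 };

// The three browser states suggested above as keys.
enum class BrowserState { InputPending, AnimationPlaying, Neither };

inline std::string KeyFor(BrowserState aState) {
  switch (aState) {
    case BrowserState::InputPending:     return "input_pending";
    case BrowserState::AnimationPlaying: return "animation_playing";
    case BrowserState::Neither:          return "neither";
  }
  return "unknown";
}

// Fallback labelling: the stable integer value of any enum, as a string.
template <typename E>
std::string IntLabel(E aValue) {
  static_assert(std::is_enum<E>::value, "IntLabel expects an enum type");
  return std::to_string(static_cast<std::underlying_type_t<E>>(aValue));
}

// Counts per (browser state key, integer reason label) pair.
class KeyedReasonCounts {
 public:
  void Accumulate(BrowserState aState, Reason aReason, uint32_t aSample = 1) {
    assert(aSample > 0 && "a zero sample would be a no-op");
    mCounts[MakeKey(aState, aReason)] += aSample;
  }

  // Returns 0 for any pair that has never been accumulated.
  uint32_t Get(BrowserState aState, Reason aReason) const {
    auto it = mCounts.find(MakeKey(aState, aReason));
    return it == mCounts.end() ? 0 : it->second;
  }

 private:
  using Key = std::pair<std::string, std::string>;
  static Key MakeKey(BrowserState aState, Reason aReason) {
    return Key(KeyFor(aState), IntLabel(aReason));
  }
  std::map<Key, uint32_t> mCounts;
};

}  // namespace gc_sketch
```

Because the label is just the integer value, new reasons need no regeneration step, at the cost of analysis tools having to map numbers back to names.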

See Also: → 1672273
See Also: → 1807016