FOG: Implement `quantity` metrics in Firefox-on-Glean
Categories
(Toolkit :: Telemetry, enhancement, P1)
Tracking
()
Tracking | Status | |
---|---|---|
firefox90 | --- | fixed |
People
(Reporter: nalexander, Assigned: chutten)
Details
(Whiteboard: [telemetry:fog:m6])
Attachments
(2 files)
Proposal for changing an existing or adding a new Glean metric type
Who is the individual/team requesting this change?
:nalexander, Install/Update team
Is this about changing an existing metric type or creating a new one?
It's about exposing an existing metric type, quantity
.
Can you describe the data that needs to be recorded?
I wanted to record the exit status code for a command line program (a background task).
Can you provide a raw sample of the data that needs to be recorded (this is in the abstract, and not any particular implementation details about its representation in the payload or the database)
It'd be an integer between 0 and 127.
What is the business question/use-case that requires the data to be recorded?
We use exit codes to capture various failure states; it's important for our operational metrics to know what the various failure states we see in the wild are.
How would the data be consumed?
I guess we'd use GLAM to see proportions of failures to successes, and then to break down the types of failures.
Why existing metric types are not enough?
Well, they are, if you squint. We could convert the integers to strings, for example. Or we can do what I did, which is to make things coarser, and have two booleans, one for success and one for failure. But surely there are numbers that we actually want to record that aren't distributions or timespans or whatever.
What is the timeline by which the data needs to be collected?
I've worked around this so there's no timeline. But it's weird to see quantity in various docs, and in use in GeckoView, and then to find that it's not supported in Firefox-on-Glean.
Comment 1•4 years ago
|
||
We started allowing quantity everywhere a while back and I don't see a reason to not allow it in Firefox Desktop.
It can therefore be turned into an implementation bug (the RLB already supports it, so it's really just missing the implementation)
Updated•4 years ago
|
Updated•4 years ago
|
Updated•4 years ago
|
Assignee | ||
Comment 2•4 years ago
|
||
(In reply to Jan-Erik Rediger [:janerik] from comment #1)
We started allowing quantity everywhere a while back and I don't see a reason to not allow it in Firefox Desktop.
It can therefore be turned into an implementation bug (the RLB already supports it, so it's really just missing the implementation)
In addition to this, I wonder if there is use for a error_code
metric type. Something that is a sparse distribution of numbers, that carries along meaning for some specific numbers... something like
avif:
decode_error:
type: error_code
description: The error code returned by `libavif`.
...
codes:
0: success
1: generic_error
127: invalid_dimensions
255: stream_failure
(This here? This is scope creep)
This'd be a specialization of labeled_counter, more or less, I guess? Anyhoo, just an idea. Implementing quantity
is the obvious way forward.
Reporter | ||
Comment 3•4 years ago
|
||
(In reply to Chris H-C :chutten from comment #2)
(In reply to Jan-Erik Rediger [:janerik] from comment #1)
We started allowing quantity everywhere a while back and I don't see a reason to not allow it in Firefox Desktop.
It can therefore be turned into an implementation bug (the RLB already supports it, so it's really just missing the implementation)In addition to this, I wonder if there is use for a
error_code
metric type. Something that is a sparse distribution of numbers, that carries along meaning for some specific numbers... something likeavif: decode_error: type: error_code description: The error code returned by `libavif`. ... codes: 0: success 1: generic_error 127: invalid_dimensions 255: stream_failure
(This here? This is scope creep)
This'd be a specialization of labeled_counter, more or less, I guess? Anyhoo, just an idea. Implementing
quantity
is the obvious way forward.
I quite like this proposal! Can I can get my hands on the defined codes:
from consumers in some way? If not, then I have a "multiple sources of truth" problem, which isn't novel but would be pleasant to avoid.
Assignee | ||
Comment 4•4 years ago
|
||
(In reply to Nick Alexander :nalexander [he/him] from comment #3)
I quite like this proposal! Can I can get my hands on the defined
codes:
from consumers in some way? If not, then I have a "multiple sources of truth" problem, which isn't novel but would be pleasant to avoid.
I'm not sure I understand "Can I [...] get my hands on the defined codes:
from consumers in some way?". If you mean "will glean_parser generate enums in JS, C++, and Rust for me to use" then, boy howdy would we ever! (Firefox Telemetry does this in C++ for histograms of kind: "categorical"
after all, and we wouldn't want to be left behind).
And if you're wondering if we have a bug about this, we do. No timeline on implementation, though having concrete use cases lined up to use it really helps us prioritize work.
Now, if you mean the reverse (that the code already uses an existing enum and you want Glean to figure out what it means), then we have a bit of a trial ahead. We could (for strongly-typed languages) generate APIs that take only the specific enum type. But then we'd need some way of knowing what each of those enum values mean.
For example, NS_OK
is 0 and NS_ERROR_FAILURE
is 2147500037. We could totally write an API that in Rust and C++ takes an nsresult and in JS takes an... int? nserror? Something clever, I'm sure. But at some point we need to know that 2147500037 maps to NS_ERROR_FAILURE
. Either on the client, so we can map from the value to the string and then store and transmit the string. Or on the pipeline, so we can store and transmit the value, but map it to the string in tooling.
I'm not sure how to instruct a Python codegen step to read Gecko C++ to generate a map from nsresult
value to a string. And that's for nsresult
where every enum value is specified. I'm not sure how to teach Python how to read C++ enum declaration syntax. Probably a surmountable problem (I haven't googled "How do I get the enum name from its value C++" yet). But probably not easy (I don't wanna think about generated enums. Need to run the preprocessor first, ye gads).
But having a single source of truth is crucial. Otherwise someone'll make a mistake, or we'll add a new value, and it all falls apart.
The advantage of codes
above is that anything that isn't in the codes
list gets mapped to __other__
so at least you know when you need to update your second source of truth : |
Comment 5•4 years ago
|
||
IMO we should copy that discussion to the Labeled Metrics with staticly-defined Labels bug. It's pretty good reasoning why we would want it.
This bug however is about exposing what we already have, and :chutten has that soon ready.
Assignee | ||
Comment 6•4 years ago
|
||
Assignee | ||
Comment 7•4 years ago
|
||
Depends on D112424
Comment 9•4 years ago
|
||
Backed out 2 changesets (bug 1704846) for backout conflict with Bug 1685402. CLOSED TREE
Backout:
https://hg.mozilla.org/integration/autoland/rev/1fc80942c629854899e60dc5d8aef0d55d32a6e8
Assignee | ||
Comment 10•4 years ago
|
||
Comment 11•4 years ago
|
||
Comment 12•4 years ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/fce291ab0507
https://hg.mozilla.org/mozilla-central/rev/7ddd2af76d34
Description
•