Open Bug 1728784 Opened 3 years ago Updated 2 years ago

Add a stack metric type to Glean

Categories

(Data Platform and Tools :: Glean Metric Types, enhancement)

Tracking

(Not tracked)

People

(Reporter: mak, Unassigned)

References

(Blocks 1 open bug)

Details

Proposal for changing an existing or adding a new Glean metric type

I'd like to propose a simple way to collect part of a stack with Glean, so that performance issues or unexpected errors can be reported to telemetry with a reference to what caused them.

Who is the individual/team requesting this change?

Mak, so far for Storage and Search, but it may be useful to other teams too.

Is this about changing an existing metric type or creating a new one?

Creating a new metric type.

Can you describe the data that needs to be recorded?

Part of the stack at the moment the Glean collection function is invoked. The probe definition could specify the depth of the stack to collect.
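As an illustration only, here is a minimal sketch of capturing a depth-limited stack in TypeScript. `captureStack` and `STACK_DEPTH` are hypothetical names, not part of Glean, and the sketch assumes the engine exposes the non-standard `Error.prototype.stack` string (V8 and SpiderMonkey both do, in different formats).

```typescript
// Hypothetical value; in the proposal this would come from the probe definition.
const STACK_DEPTH = 3;

// Hypothetical helper, not a Glean API: returns up to `depth` frames of the
// caller's stack, innermost first, excluding the frame of this helper itself.
function captureStack(depth: number): string[] {
  const raw = new Error().stack ?? "";
  return raw
    .split("\n")
    .map((line) => line.trim())
    // V8 frames start with "at "; SpiderMonkey frames look like "name@file:line:col".
    // This also discards the leading "Error" header line that V8 emits.
    .filter((line) => line.startsWith("at ") || line.includes("@"))
    // Skip the first frame (captureStack itself), then keep `depth` frames.
    .slice(1, 1 + depth);
}

function reportSlowOperation(): string[] {
  // The innermost captured frame is reportSlowOperation, followed by its callers.
  return captureStack(STACK_DEPTH);
}
```

Frame strings include file paths and line numbers, so a real metric would probably need to normalize them before sending; that step is left out here.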

Can you provide a raw sample of the data that needs to be recorded (this is in the abstract, and not any particular implementation details about its representation in the payload or the database)?

Example 1: module A passes a promise to module B, and module B races that promise against a timeout because it has to do some work before and after it (a database transaction is one example). If the promise is not resolved in time, module B can capture a stack on timeout and record that module A passed a promise that took too long.
Example 2: a call to a method fails with a very unexpected error, and we'd like to know who the caller is and how often that error happens.
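Example 1 could be sketched roughly as follows. `raceWithTimeout` and its `onTimeout` callback are hypothetical names standing in for module B and for whatever recording call a stack metric would offer. One design choice in this sketch is an assumption rather than part of the proposal: the stack is captured synchronously on entry, because by the time the timer fires the frames of the caller that handed over the promise are no longer on the stack.

```typescript
// Hypothetical "module B" helper: races `work` against a timer and reports the
// stack of whoever called it if the timer wins.
async function raceWithTimeout<T>(
  work: Promise<T>,
  ms: number,
  onTimeout: (callerStack: string) => void,
): Promise<T> {
  // Runs synchronously, before the first await, so the caller is still on the stack.
  const callerStack = new Error().stack ?? "";
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_resolve, reject) => {
    timer = setTimeout(() => {
      onTimeout(callerStack); // a stack metric would be recorded here
      const err = new Error(`timed out after ${ms} ms`);
      err.name = "TimeoutError";
      reject(err);
    }, ms);
  });
  try {
    return await Promise.race([work, timeout]);
  } finally {
    clearTimeout(timer); // do not leave the timer pending if `work` settled first
  }
}
```

A caller in module A would then invoke `raceWithTimeout(somePromise, 5000, recordStack)`, where `recordStack` is whatever function forwards the string to telemetry.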

What is the business question/use-case that requires the data to be recorded?

We'd like to understand how often an unexpected error or timeout happens and who is causing it, so that code can be optimized.

How would the data be consumed?

I would like to see aggregated counts per stack.
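To make "aggregated counts per stack" concrete, here is a small sketch of the aggregation, assuming each report arrives as an array of frame strings. `countByStack` is a hypothetical name, and in practice this step would presumably run in the data pipeline rather than in the client.

```typescript
// Hypothetical aggregation: counts how many reports share an identical stack.
function countByStack(reports: string[][]): Map<string, number> {
  const counts = new Map<string, number>();
  for (const frames of reports) {
    // Identical frame sequences collapse to the same key.
    const key = frames.join("\n");
    counts.set(key, (counts.get(key) ?? 0) + 1);
  }
  return counts;
}
```

Two reports only merge if their frames match exactly, so any normalization (for example, stripping line numbers) would need to happen before this step.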

Why are existing metric types not enough?

BHR collects stacks in its own very specialized way, and the only other alternative we have right now is keyed scalars, which require manipulating the stack into some identifying string. That doesn't encourage people to use them; a simpler stack collector could help us improve the quality of the product.

What is the timeline by which the data needs to be collected?

There's no timeline on this; I just think it would be useful in the long term. In the meantime we'll keep using keyed scalars...

See Also: → 1704854
See Also: → 1816744