Open Bug 1639329 Opened 5 years ago Updated 10 months ago

Timer-based Metrics: What if you don't know which to use until you stop the timer?

Categories

(Data Platform and Tools :: Glean: SDK, enhancement, P4)

enhancement

Tracking

(Not tracked)

People

(Reporter: chutten, Unassigned)

Details

(Whiteboard: [telemetry-parity])

Proposal for changing an existing or adding a new Glean metric type

Who is the individual/team requesting this change?

:chutten, on behalf of :froydnj's excellent point.

Is this about changing an existing metric type or creating a new one?

Changing timing metrics.

Can you describe the data that needs to be recorded?

Our users may not know when starting a timer-based metric (timespan, timing distribution) which metric they need. For example, when you get the first byte of an HTTP response you want to start a timer measuring how long it takes to process the response... but which timing distribution? networking.http.response_success? networking.http.response_redirect? It's not until you're a few bytes in (and perhaps a few classes deep) that you know the response code.
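A minimal sketch of the behaviour being asked for, under stated assumptions: `FakeTimingDistribution`, `accumulate_ns`, `DeferredTimer` and `handle_response` are hypothetical names invented for illustration and are not part of the Glean SDK. The idea is to remember the start time and pick the metric only when the timer stops.

```python
import time


class FakeTimingDistribution:
    """Hypothetical stand-in for a timing distribution (not the Glean API)."""

    def __init__(self, name):
        self.name = name
        self.samples_ns = []

    def accumulate_ns(self, duration_ns):
        # Record one already-measured duration, in nanoseconds.
        self.samples_ns.append(duration_ns)


class DeferredTimer:
    """Remembers a start time; the caller picks the metric at stop time."""

    def __init__(self, clock=time.monotonic_ns):
        self._clock = clock
        self._start_ns = clock()

    def stop_into(self, metric):
        # The metric is chosen only now, once the outcome is known.
        metric.accumulate_ns(self._clock() - self._start_ns)


response_success = FakeTimingDistribution("networking.http.response_success")
response_redirect = FakeTimingDistribution("networking.http.response_redirect")


def handle_response(status_code, clock=time.monotonic_ns):
    timer = DeferredTimer(clock)  # first byte arrives: start timing
    # ... parse the response; the status code only becomes known here ...
    is_redirect = 300 <= status_code < 400
    timer.stop_into(response_redirect if is_redirect else response_success)
```

With this shape the caller never has to name a metric when timing starts, which is the gap described above.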

Can you provide a raw sample of the data that needs to be recorded (this is in the abstract, and not any particular implementation details about its representation in the payload or the database)

N/A

What is the business question/use-case that requires the data to be recorded?

N/A

How would the data be consumed?

In the usual way.

Why existing metric types are not enough?

If we add "labeled" support to timespans and timing distributions that'd probably do the trick. Or we'd have to ask our users to start all possible timers and cancel the ones they didn't use.
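To make the cost of the second option concrete, here is a rough sketch of "start all possible timers and cancel the ones you didn't use". `FakeTimingDistribution` is a hypothetical stand-in with a start/stop/cancel shape, and `start_all` and `stop_one_cancel_rest` are helper names invented for this example; none of them are the Glean SDK itself.

```python
import itertools
import time


class FakeTimingDistribution:
    """Hypothetical stand-in with a start/stop/cancel shape (not the Glean API)."""

    _ids = itertools.count(1)

    def __init__(self, name):
        self.name = name
        self.running = {}  # timer id -> start time in ns
        self.samples_ns = []

    def start(self):
        timer_id = next(self._ids)
        self.running[timer_id] = time.monotonic_ns()
        return timer_id

    def stop_and_accumulate(self, timer_id):
        elapsed = time.monotonic_ns() - self.running.pop(timer_id)
        self.samples_ns.append(elapsed)

    def cancel(self, timer_id):
        # Discard the in-flight timer without recording anything.
        self.running.pop(timer_id, None)


def start_all(metrics):
    """Start one timer on every candidate metric."""
    return {metric: metric.start() for metric in metrics}


def stop_one_cancel_rest(timers, chosen):
    """Accumulate into the chosen metric and cancel every other timer."""
    for metric, timer_id in timers.items():
        if metric is chosen:
            metric.stop_and_accumulate(timer_id)
        else:
            metric.cancel(timer_id)


success = FakeTimingDistribution("networking.http.response_success")
redirect = FakeTimingDistribution("networking.http.response_redirect")

timers = start_all([success, redirect])
stop_one_cancel_rest(timers, chosen=redirect)
```

Every call site has to carry the full set of timer ids and remember to cancel each unused one, which is the burden this proposal would like to avoid.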

What is the timeline by which the data needs to be collected?

Q4 2020 or earlier (Project FOG Telemetry compat)


I have a slightly different use case. It's similar in that there are multiple code paths, but we don't want to call stop for all of them. We'd want to call cancel for some, but in practice I can't. I have an entry point to the application, IntentReceiverActivity.onCreate. It decides which code path to take based on the user's intent: e.g. it can open the home screen, open the browser, etc.

I want to record a timing distribution metric only under certain conditions: when we enter that entry point and end up on the browser screen, loading a URL. With the current API, I'd call start() in the entry function and call stop when loadUrl is called. However, due to encapsulation, it'd be challenging to call cancel for all of the conditions I don't want to record the metric under. To visualize:

IntentReceiverActivity.onCreate (start) -> browser screen -> loadUrl (stop)
                                                        | -> restore last tab (cancel)
                                                        | -> ...? (cancel)
                                      | -> home screen (cancel)
                                      | -> ...? (cancel)

Essentially, there are too many end states to call cancel in each of them, so IntentReceiverActivity would need a single place where we say, "This is the state we're transitioning to". However, that's not the code we'd have (even though it'd probably be better).

That being said, in my particular case, I realized I only ever want to record a single timer, so I could call cancel when the app is backgrounded and when the entry point occurs again. I don't know if this applies to every use case.
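The single-timer workaround could look roughly like the sketch below. It assumes a metric object exposing start, stop_and_accumulate and cancel; `FakeMetric` and `SingleFlightTimer` are hypothetical names used only for illustration. The wrapper keeps at most one in-flight timer and cancels a stale one whenever the entry point runs again.

```python
class FakeMetric:
    """Hypothetical stand-in that just logs calls (not the Glean API)."""

    def __init__(self):
        self.next_id = 0
        self.events = []

    def start(self):
        self.next_id += 1
        self.events.append(("start", self.next_id))
        return self.next_id

    def stop_and_accumulate(self, timer_id):
        self.events.append(("stop", timer_id))

    def cancel(self, timer_id):
        self.events.append(("cancel", timer_id))


class SingleFlightTimer:
    """Keeps at most one in-flight timer on the wrapped metric."""

    def __init__(self, metric):
        self._metric = metric
        self._timer_id = None

    def start(self):
        self.cancel()  # drop any stale timer left over from an earlier entry
        self._timer_id = self._metric.start()

    def stop(self):
        if self._timer_id is not None:
            self._metric.stop_and_accumulate(self._timer_id)
            self._timer_id = None

    def cancel(self):
        if self._timer_id is not None:
            self._metric.cancel(self._timer_id)
            self._timer_id = None


metric = FakeMetric()
launch = SingleFlightTimer(metric)
launch.start()   # entry point runs; user ends up somewhere that never stops it
launch.cancel()  # app is backgrounded
launch.start()   # entry point runs again
launch.start()   # re-entry without backgrounding: the stale timer is cancelled
launch.stop()    # loadUrl is reached
```

This only works because a single timer is ever wanted at a time; it does not help when several timers on the same metric must overlap.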

Another concrete case: performance.{responsiveness|pageload}.req_anim_frame_callback. The two metrics share a start time, but we only establish which one to accumulate to at the end time.

Component: Glean Metric Types → Glean: SDK
Priority: -- → P4