Open Bug 1931369 Opened 1 year ago Updated 1 year ago

Consider reporting timer values in the Glean API when stopped

Categories

(Data Platform and Tools :: Glean: SDK, enhancement, P4)

enhancement

Tracking

(Not tracked)

People

(Reporter: aabh, Unassigned, Mentored)

References

Details

(Whiteboard: [good next bug][lang=rust])

The Glean SDK has two metrics that allow a user to start and stop (or cancel) a timer: the Timespan metric, and TimingDistribution metric. We would like to be able to add (a) profiler marker(s) that track(s) when a timer starts and stops, but this is non-trivial, due to design decisions taken by both the Profiler and Glean APIs.

From the profiler side: Adding markers to delineate a duration can be done in one of two ways. Either the user emits a single "interval", which contains both the start and end time of the duration, or the user emits two interval markers, one at the start of the duration, and one at the end, containing only the start/end time respectively. The latter case seems like a natural fit for the FoG API: We emit a marker when a timer is started, and another when the timer ends. Unfortunately, due to the way in which we distinguish between markers in the profiler, and how we match up start/end markers (see Bug 1929070), we end up with inconsistent data in the profiler frontend.

The other case (emitting a single "interval" marker) is blocked by our ability to know the start time of a timer when it is finishing. Recall that an interval marker is recorded once, with both the start and end time. It must, therefore, be recorded when a timer either stops or is cancelled. Unfortunately, we cannot do this either, as the Glean APIs for timers (Timespan, TimingDistribution) do not provide us with access to information about the timer when we stop it.

In this bug, I propose that the following methods:

fn TimespanMetric::stop(&self) -> ();
fn TimespanMetric::cancel(&self) ->();

become, respectively:

fn TimespanMetric::stop(&self) -> Option<(u64, u64)>;
fn TimespanMetric::cancel(&self) -> Option<(u64, u64)>;

returning the time of the corresponding call to TimespanMetric::start, and time of the call to the method (as calculated by Glean) in nanoseconds. If the corresponding call to TimespanMetric::start has not happened, then the methods will return None.

Similarly, the methods:

fn TimingDistribution::stop_and_accumulate(&self, id: TimerId) -> ();
fn TimingDistribution::cancel(&self, id: TimerId) -> ();

become, respectively:

fn TimingDistribution::stop_and_accumulate(&self, id: TimerId) -> Option<(u64, u64)>;
fn TimingDistribution::cancel(&self, id: TimerId) -> Option<(u64, u64)>;

again, returning the equivalent values to TimespanMetric, but for a given TimerId.

Component: Telemetry → Glean: SDK
Product: Toolkit → Data Platform and Tools

This is a great idea, but let's hold off until the beginning of the year to be able to better deal with breaking changes.

Priority: -- → P4

(In reply to Travis Long [:travis_] from comment #1)

This is a great idea, but let's hold off until the beginning of the year to be able to better deal with breaking changes.

Could the same result be achieved with an internal only API change, so there's no breaking change?

(In reply to Florian Quèze [:florian] from comment #2)

(In reply to Travis Long [:travis_] from comment #1)

This is a great idea, but let's hold off until the beginning of the year to be able to better deal with breaking changes.

Could the same result be achieved with an internal only API change, so there's no breaking change?

I think so, as long as we can record the marker from within FOG within the start and stop functions here: https://searchfox.org/mozilla-central/source/toolkit/components/glean/api/src/private/timing_distribution.rs, then there would be no need to change the return of the API functions.

Mentor: tlong
Whiteboard: [good next bug][lang=rust]
You need to log in before you can comment on or make changes to this bug.