Closed Bug 852974 Opened 11 years ago Closed 7 years ago

Add hook to TelemetryPing so other code can easily provide simple measures

Categories

(Toolkit :: Telemetry, enhancement)

enhancement
Not set
normal

Tracking

()

RESOLVED WONTFIX

People

(Reporter: Irving, Unassigned)

References

Details

Attachments

(1 file)

It's difficult for other modules to provide data to telemetry in "simple measure" format, and we only document adding histograms. This leads to people using histograms, which are quite heavy weight, when they really should be using simple measures.

We may want to make this more general purpose so it can be used to add arbitrary new sections to the telemetry ping, not just simple measures.

Here's a rough description and API:


TelemetryPing.addSimpleMeasure(aTag, aMeasure)

When TelemetryPing builds a packet, it will add an attribute with the name in aTag and the value in aMeasure to that packet.

If aMeasure is a simple value, it is copied into the packet.

If aMeasure is an object, it is included in the packet by reference - this means the caller can continue to modify the contents of aMeasure, and any time TelemetryPing generates a new packet, it will contain up-to-date values.

If aMeasure is a function, TelemetryPing will call the function each time it builds a telemetry packet and include the result in the packet.


TelemetryPing.removeSimpleMeasure(aTag)

un-register the simple measure; for when a module is shutting down or unloading and no longer wants Telemetry to hold a reference. In any case TelemetryPing should release all registered simple measures to the GC when it shuts down, so removeSimpleMeasure is optional unless aMeasure is a callback that would fail after the module is shut down.


We could similarly support new top level sections with something like TelemetryPing.addSection(aTag, aMeasure) and ..removeSection(aTag).

Perhaps it would be too error prone, but we might be able to fold these into a single API by making the aTag parameter structured so that the calls look like

..addMeasure(["simpleMeasurements", "myTag"<optional , "mySubTag">], value);

to put something in simpleMeasurements and

..addMeasure("mySection", value) to add a new section - though ["mySection"] would behave the same
After discussion with :froydnj, we decided to restrict this interface to only simpleMeasurements. Changes to other parts of the telemetry packet, such as adding new sections, should require changes to TelemetryPing.js to force discussion and review.
Assignee: nobody → irving
Status: NEW → ASSIGNED
Attachment #728221 - Flags: review?(nfroyd)
Comment on attachment 728221 [details] [diff] [review]
Add an API to TelemetryPing.js so that clients can add simpleMeasurement entries

Review of attachment 728221 [details] [diff] [review]:
-----------------------------------------------------------------

Seems reasonable.  I have concerns, though.

::: toolkit/components/telemetry/TelemetryPing.js
@@ +81,5 @@
> +// KNOWN_SIMPLE_MEASUREMENTS lists the known identifiers for simple measurements
> +// gathered by builtin code in this module. Clients of TelemetryPing will trigger
> +// an exception if they attempt to add a new simpleMeasurement with a conflicting
> +// name
> +const KNOWN_SIMPLE_MEASUREMENTS = {

Since there's no tie between this variable and the code in getSimpleMeasurements, we should make one.  Something like a check at the end of getSimpleMeasurements that says:

for (let prop of result /* or whatever, I forget exactly */) {
  if (!(prop in KNOWN_SIMPLE_MEASUREMENTS) && !(prop in this._simpleMeasures)) {
    throw something.
  }
}

so people know to update this list.  I'm not sure we'd add everything to simpleMeasurements with xpcshell tests that we would need to (e.g. startup timeline events).  So we'd wind up with xpcshell tests that pass but busted telemetry in the field with a real browser.  Do we add everything we'd need to in xpcshell?

I think the initialization is slightly more elegantly done as:

[ 'start', 'main' ... ].reduce(function(prev, current, idx, array) { prev[current] = 1; return prev; }, {});

But your call on that one.
Attachment #728221 - Flags: review?(nfroyd) → feedback+
The KNOWN_SIMPLE_MEASUREMENTS check is certainly an ugly hack. I don't want to refactor the whole telemetry ecosystem in this patch, but we could certainly do better with some rework to TelemetryTimestamps.jsm and others.

It occurred to me that shutdown time leads to some interesting corner cases. If a client removes their measurement using TelemetryPing.removeSimpleMeasurement(), the final saved-session telemetry packet won't contain their measurement.

In order to get proper behaviour, we may need to impose the rule that measurements can't be removed by the client and that TelemetryPing will hold onto the reference until quit-application-granted, which is when we currently collect the last set of simpleMeasurements.

This means we'll need to clearly document the life cycle of these callbacks to give the clients a chance to use them correctly...
What's the use case for removing measurements?  If we want to avoid leakiness, I think it'd be better to document the API as "somebody will release the references to the callbacks you register".
(In reply to Nathan Froyd (:froydnj) from comment #4)
> What's the use case for removing measurements?  If we want to avoid
> leakiness, I think it'd be better to document the API as "somebody will
> release the references to the callbacks you register".

Given the issue described in comment 3, the only reason to remove a measurement is to make sure that even though it's in idle-daily packets, it doesn't show up in the saved-session version of the telemetry for that session. I don't think that's a useful thing to support, so I agree that we should document that TelemetryPing will hold the reference until quit-application-granted and then release it.

Even if we keep removeSimpleMeasurement(), I was going to update the patch to make sure all the references are released at the next GC after the last simpleMeasurements is built, to avoid leaks and/or late destructions.
Yoric pointed out that if we want to add simpleMeasurements from C++, any such measurements would have to be restricted to after JS starts up.  Moving the simpleMeasurements registration bits (and querying to get the current values, presumably) into C++ would help mitigate that.
We only ever gather values in JS, at least for the time being, because TelemetryPing.js does that job. We could move addSimpleMeasurement() into Telemetry.idl and implement it in C++ but I'd still prefer to have the gathering done in JS to make it easy to add structured subsections like I did for Addon Manager.

We're implementing quite a few XPCOM components in JS these days; is it really that much of a burden to require C++ clients to wait for JS to be initialized?
(In reply to Irving Reid (:irving) from comment #7)
> We only ever gather values in JS, at least for the time being, because
> TelemetryPing.js does that job. We could move addSimpleMeasurement() into
> Telemetry.idl and implement it in C++ but I'd still prefer to have the
> gathering done in JS to make it easy to add structured subsections like I
> did for Addon Manager.

Sure, that'd probably be the easiest.

> We're implementing quite a few XPCOM components in JS these days; is it
> really that much of a burden to require C++ clients to wait for JS to be
> initialized?

Maybe?  I don't even know if:

- there is such a well-defined point; or
- C++ clients are notified at that point; or
- XPCOM magic would initialize JS automagically at the point that it would be needed regardless.
It would be nice to also have an easy way to record time intervals in simple measurements; see the end of https://bugzilla.mozilla.org/show_bug.cgi?id=853388#c105 and my response in ...#c115
Depends on: 906524
Doesn't TelemetryTimestamps.jsm already satisfy this use case?
Ah, I guess that only handles measures of time relative to process start, and the goal is to enable arbitrary measures (relative to something else, or absolute, or not even necessarily of time). Why not just extend TelemetryTimestamps to support that?
What's the proposed interface for this?  Were I using it, I would expect it to be one or more of:
```
addEvent("pseudonamespace", json_of_extra_data, when=Date.now())

or 

addEvent([list of strings],  when=Date.now()) 

or 

addCountable(key (jsonable/hashable), count=1)

```

In Micropilot, events were:

```
jsonable: required keys, data=somejson, ts=when
```

This ended up being a little *too* abstract, and we adopted conventions around 'pseudonamespaces'.

All in all, I like the idea of improving Telemetry's channel!
Assignee: irving → nobody
Gregg, I don't think we'll have time to get to this soon. What use case did you have in mind for this?
Flags: needinfo?(glind)
Vladan,

I was mostly giving an example of the kind of 'simple thing' that was used often in test pilot.  

In general I favor making it easy for addons to add things to telemetry, but I will defer to Benjamin or others here, for specifics.

GL
Flags: needinfo?(glind)
We have scalars now, that should cover the most common needs.

https://gecko.readthedocs.io/en/latest/toolkit/components/telemetry/telemetry/collection/scalars.html
Status: ASSIGNED → RESOLVED
Closed: 7 years ago
Resolution: --- → WONTFIX
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: