Closed Bug 1302681 Opened 8 years ago Closed 7 years ago

Extend Event Telemetry core for recording from addons

Categories

(Toolkit :: Telemetry, defect, P1)

defect
Points:
3

Tracking

()

RESOLVED FIXED
mozilla57
Tracking Status
firefox56 --- fixed
firefox57 --- fixed

People

(Reporter: gfritzsche, Assigned: gfritzsche)

References

(Blocks 1 open bug)

Details

(Whiteboard: [measurement:client])

Attachments

(3 files, 7 obsolete files)

4.32 KB, patch
Dexter
: review+
Details | Diff | Splinter Review
2.51 KB, patch
gfritzsche
: review+
Details | Diff | Splinter Review
78.99 KB, patch
gfritzsche
: review+
Details | Diff | Splinter Review
We need to extend the Event Telemetry core to support recording from addons.
That means:
* Extend the API for registration of new events
* Extending the snapshotting API if needed
* Adding test coverage through:
  (1) adding a YAML registry like an addon for test builds
  (2) generating a JSON blob from it
  (3) registering the blob dynamically (from xpcshell test, no addon needed)
  (4) recording into the events, snapshotting them, checking the results
Blocks: 1302683
Per the discussions around bug 1313592 et al, addons should be able to override event recording limits if needed.
This covers short-term needs of studies & experiments, which will not have significant impact on our budget.
No longer depends on: 1302678
We can simplify this by starting out with only support for expiries of "never".
Expiry by date and version follow in bug 1367368 and bug 1367369.
Priority: P3 → P2
Priority: P2 → P1
Assignee: nobody → gfritzsche
Points: --- → 1
Points: 1 → 3
While this is still blocked on a data steward decision, i dont expect big changes resulting from that. So asking for feedback on this pretty final state to see if there are any bigger issues.
Attachment #8884281 - Flags: feedback?(alessio.placitelli)
Comment on attachment 8884281 [details] [diff] [review]
Extend Event Telemetry core for recording from addons

Review of attachment 8884281 [details] [diff] [review]:
-----------------------------------------------------------------

The implementation makes sense, I'm only curious about the "dynamic process" choice here, so I left a couple of comments below.

::: toolkit/components/telemetry/Processes.yaml
@@ +18,5 @@
>      separately to avoid skewing other content process Telemetry.
>  gpu:
>    gecko_enum: GeckoProcessType_GPU
>    description: This is the compositor or GPU process.
> +dynamic:

Is this a real process? Or it's a caveat to show addon events in a separate section? If the latter, we should clearly state so

::: toolkit/components/telemetry/TelemetryEvent.cpp
@@ +297,5 @@
>  typedef nsTArray<EventRecord> EventRecordArray;
>  nsClassHashtable<nsUint32HashKey, EventRecordArray> gEventRecords;
>  
> +// The details on dynamic events that are recorded from addons are registered here.
> +nsTArray<DynamicEventInfo> gDynamicEventInfo;

Should we account for this in the |SizeOfIncludingThis| function?

@@ +914,5 @@
> +    if (!eventName.init(cx, eventPropertyIds[i])) {
> +      return NS_ERROR_FAILURE;
> +    }
> +
> +    if (!IsValidIdentifierString(NS_ConvertUTF16toUTF8(eventName), 20, false)) {

Since this number is used in more than one place, should we make it a const?

@@ +948,5 @@
> +        return NS_ERROR_FAILURE;
> +      }
> +    }
> +
> +    // TODO: Check extra key formats.

Are you planning on doing his as part of this bug or as a follow-up?

@@ +961,5 @@
> +      expired = temp.toBoolean();
> +    }
> +
> +    for (uint32_t methodIdx = 0, methodsLength = methods.Length(); methodIdx < methodsLength; ++methodIdx) {
> +      if (!IsValidIdentifierString(methods[methodIdx], 20, false)) {

Potential constant usage

@@ +967,5 @@
> +        return NS_ERROR_INVALID_ARG;
> +      }
> +
> +      for (uint32_t objectIdx = 0, objectsLength = objects.Length(); objectIdx < objectsLength; ++objectIdx) {
> +        if (!IsValidIdentifierString(objects[objectIdx], 20, false)) {

Potential constant usage

::: toolkit/components/telemetry/docs/collection/events.rst
@@ +182,5 @@
>  
>  Version History
>  ===============
>  
>  - Firefox 52: Initial event support (`bug 1302663 <https://bugzilla.mozilla.org/show_bug.cgi?id=1302663>`_).

Should add an entry to the version History too?

::: toolkit/components/telemetry/nsITelemetry.idl
@@ +492,5 @@
>    [implicit_jscontext, optional_argc]
> +  jsval snapshotEvents(in uint32_t aDataset, [optional] in boolean aClear);
> +
> +  /**
> +   * {

Mh, I think this could use some more documentation. What does it do? What are this parameters for?
Attachment #8884281 - Flags: feedback?(alessio.placitelli) → feedback+
Talking this through with Rebecca, we will:
- integrate the typology of data categories [1] into the probe registries (that probably doesn't need to be reflected in the client API)
- start to use the new notion of release/pre-release ("default on in release" vs. opt-in/opt-out)
- for expiry, use expiry dates at the addon build side (so the client API is limited to expired:true/false to enable expiry without code removal)

1: https://wiki.mozilla.org/Firefox/Data_Collection#Data_Collection_Categories
Completed the patch & updated it to the discussion results.
Attachment #8890224 - Flags: review?(alessio.placitelli)
Attachment #8884281 - Attachment is obsolete: true
Fixes that about:telemetry only shows events if there are any present in processes.parent.

For some reason switching the processes in the events section also triggers HistogramSection.render(), which breaks when there are no histograms recorded in the selected process.
This patch fixes it to make things work.
Attachment #8890277 - Flags: review?(chutten)
Blocks: 1384511
Comment on attachment 8890224 [details] [diff] [review]
Part 1 - Extend Event Telemetry core for recording from addons

Review of attachment 8890224 [details] [diff] [review]:
-----------------------------------------------------------------

This looks good with the changes below addressed, which mostly relate to documentation updates.

::: toolkit/components/telemetry/TelemetryEvent.cpp
@@ +110,5 @@
> +const uint32_t kMaxObjectNameByteLength = 20;
> +// Maximum length of extra key names, in UTF8 byte sequence length.
> +const uint32_t kMaxExtraKeyNameByteLength = 15;
> +// The maximum number of valid extra keys for an event.
> +const uint32_t kMaxExtraKeyCount = 10;

Should we document these in events.rst?

@@ +937,5 @@
> +  return true;
> +}
> +
> +nsresult
> +TelemetryEvent::RegisterEvents(const nsACString& aCategory,

Let's make sure that dynamic events don't occur in the Events.yaml already, e.g. attempting to register telemetry.test.something which is already defined in Events.yaml

Since we don't want this to happen, maybe we can clearly state that in the events.rst file in a "note" block.

I wonder if we should be addressing, in the future, the collisions the other way around: let's say an addon starts recording events in telemetry.test.something and suddenly we land something in Events.yaml with the same name. This will be hard to catch, probably just noticeable by the drop in the submissions in the "dynamic" process. I don't think that's an immediate problem, but potentially worth thinking about this in the near future.

::: toolkit/components/telemetry/nsITelemetry.idl
@@ +525,5 @@
> +   * @param aEventData.<name>.expired Optional, whether this event entry is expired. This allows
> +   *                                  recording it without error, but it will be discarded. Defaults to false.
> +   */
> +  [implicit_jscontext]
> +  void registerEvents(in ACString aCategory, in jsval aEventData);

Should we also document that in events.rst -> Public API?

::: toolkit/components/telemetry/tests/unit/test_TelemetryEvents.js
@@ +396,5 @@
> +  Assert.equal(snapshot.dynamic.length, 1, "Should have one opt-out event in the snapshot.");
> +  Assert.deepEqual(snapshot.dynamic.map(e => e.slice(1)), expected);
> +});
> +
> +add_task(function* test_dynamicEventRegistrationValidation() {

Should we also test how dynamic events behave when when registering an event that is already advertised in the Events.yaml file?
Attachment #8890224 - Flags: review?(alessio.placitelli) → review+
(In reply to Alessio Placitelli [:Dexter] from comment #10)
> This looks good with the changes below addressed, which mostly relate to
> documentation updates.

Documentation is coming in patch #3 here.
Comment on attachment 8890277 [details] [diff] [review]
Part 2 - Fix about:telemetry for displaying dynamic events

Review of attachment 8890277 [details] [diff] [review]:
-----------------------------------------------------------------

:flyingrub, please add this test case to the list of things to test manually when making changes.

Also, does this look right to you?
Attachment #8890277 - Flags: review?(chutten)
Attachment #8890277 - Flags: review+
Attachment #8890277 - Flags: feedback?(flyinggrub)
Comment on attachment 8890277 [details] [diff] [review]
Part 2 - Fix about:telemetry for displaying dynamic events

looks good to me.
Attachment #8890277 - Flags: feedback?(flyinggrub) → feedback+
Reject events that are already registered and add test coverage for that scenario. Good catch!
Attachment #8890224 - Attachment is obsolete: true
Updated documentation.
Attachment #8890856 - Flags: review?(alessio.placitelli)
Attachment #8890856 - Flags: review?(alessio.placitelli) → review+
Comment on attachment 8890856 [details] [diff] [review]
Part 3 - Update documentation to cover dynamic events

Rebecca, this should cover what we talked about?
I'll not block landing on this due to time constraints, but happy to change things in follow-up bugs if needed.
Attachment #8890856 - Flags: feedback?(rweiss)
Keywords: checkin-needed
I believe we need to get the Addons team up to speed on this proposal and make sure we're on their radar.  Adding :kev for questions and feedback.
Flags: needinfo?(kev)
Pushed by ryanvm@gmail.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/7068c8d4448c
Part 1: Extend Event Telemetry core for recording from addons. r=Dexter
https://hg.mozilla.org/integration/mozilla-inbound/rev/8e79158a7a1c
Part 2: Fix about:telemetry for displaying dynamic events. r=chutten
https://hg.mozilla.org/integration/mozilla-inbound/rev/9c0bdff48a0a
Part 3: Update documentation to cover dynamic events. r=Dexter
Keywords: checkin-needed
Attachment #8890380 - Attachment is obsolete: true
try push:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=1ea68acbfd67f7f0a6f3e4d89eb2954ddf16c6d4

linux64-stylo doesn't exist on try, so i can't include that (see bug 1383616 et al).
Flags: needinfo?(gfritzsche)
Fixed some tests that were using snapshotBuiltinEvents() instead of snapshotEvents().
Attachment #8891296 - Attachment is obsolete: true
Global static nsTArrays are showing up as leaks, use heap-allocation instead and clean it up on shutdown. Its enough to look at the diff, the rest is reviewed already. try: https://treeherder.mozilla.org/#/jobs?repo=try&revision=179916e04552819b8d273731442565ce39337a30
Attachment #8893044 - Flags: feedback?(alessio.placitelli)
Attachment #8891363 - Attachment is obsolete: true
Attachment #8893044 - Flags: review+
Comment on attachment 8893044 [details] [diff] [review]
Part 1 - Extend Event Telemetry core for recording from addons

Review of attachment 8893044 [details] [diff] [review]:
-----------------------------------------------------------------

That's what I call persistence ;-) f+!
Attachment #8893044 - Flags: feedback?(alessio.placitelli) → feedback+
Updated patch to fix current about:telemetry state.
Attachment #8890277 - Attachment is obsolete: true
Attachment #8893281 - Flags: review+
try looks good now, the leak is gone.
Keywords: checkin-needed
Comment on attachment 8893044 [details] [diff] [review]
Part 1 - Extend Event Telemetry core for recording from addons

Approval Request Comment
[Feature/Bug causing the regression]: Event Telemetry for addons.
[User impact if declined]: Mozilla add-ons can't record new events into Telemetry.
[Is this code covered by automated tests?]: Yes.
[Has the fix been verified in Nightly?]: Currently underway.
[Needs manual test from QE? If yes, steps to reproduce]: No.
[List of other uplifts needed for the feature/fix]: None.
[Is the change risky?]: No.
[Why is the change risky/not risky?]: This is a limited change to our Event Telemetry module. It is low-risk as any impact is restricted to having addons active that record new events.
[String changes made/needed]: None.
Attachment #8893044 - Flags: approval-mozilla-beta?
Comment on attachment 8893281 [details] [diff] [review]
Part 2 - Fix about:telemetry for displaying dynamic events

Approval Request Comment
... see comment 29.
Attachment #8893281 - Flags: approval-mozilla-beta?
At least part 1 needs rebasing.
Flags: needinfo?(gfritzsche)
Keywords: checkin-needed
Attachment #8893044 - Attachment is obsolete: true
Attachment #8893044 - Flags: approval-mozilla-beta?
Flags: needinfo?(gfritzsche)
Keywords: checkin-needed
Comment on attachment 8893396 [details] [diff] [review]
Part 1 - Extend Event Telemetry core for recording from addons

Approval Request Comment
... see comment 29.
Attachment #8893396 - Flags: review+
Attachment #8893396 - Flags: approval-mozilla-beta?
Do we really need this in beta 56? Can it ride with 57? Was there ever a data review decision as mentioned in comment 3?
Flags: needinfo?(gfritzsche)
Per IRC discussion with rweiss, this is still waiting on Kev to sign off.
Keywords: checkin-needed
This does not technically require data review as it adds an API, not actual data collection.
The API was shopped around for feedback already.

I don't see why this needs signoff from the addon team (this is not an API exposed to any non-Mozilla addon users).
I'll clear this up with Rebecca.
Flags: needinfo?(gfritzsche)
(In reply to Liz Henry (:lizzard) (needinfo? me) from comment #34)
> Do we really need this in beta 56? Can it ride with 57? Was there ever a
> data review decision as mentioned in comment 3?

We are looking at potential search studies using this collection method in 56.
This only adds an API, not any data collection itself, so technically no data collection review is required.
Comment on attachment 8890856 [details] [diff] [review]
Part 3 - Update documentation to cover dynamic events

Talking through this with Rebecca, it seems unblocked.

There was concern about missing coordination with the addons team.
However, i met with the addons team a while back (as far as i remember with amckay and aswan) to run through the technical details of this plan, especially better integration into addon builds.
Attachment #8890856 - Flags: feedback?(rweiss)
Keywords: checkin-needed
Pushed by ryanvm@gmail.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/dc324afba772
Part 1: Extend Event Telemetry core for recording from addons. r=dexter
https://hg.mozilla.org/integration/mozilla-inbound/rev/c6f1f0ad81b3
Part 2: Fix about:telemetry for displaying dynamic events. r=chutten
https://hg.mozilla.org/integration/mozilla-inbound/rev/cb99e3171937
Part 3: Update documentation to cover dynamic events. r=dexter
Keywords: checkin-needed
Georg, would it be ok to try this out for beta 2?  I'd like to wait and see how it does on m-c for a day or two.
Flags: needinfo?(gfritzsche)
Yes, that is fine - we mostly want this on release 56.
Flags: needinfo?(gfritzsche)
https://hg.mozilla.org/projects/date/rev/dc324afba772c1a2abaf81f1744643d58c4a3c5b
Bug 1302681 - Part 1: Extend Event Telemetry core for recording from addons. r=dexter

https://hg.mozilla.org/projects/date/rev/c6f1f0ad81b3ee6731f76b5d68c7f8e8a38b78c4
Bug 1302681 - Part 2: Fix about:telemetry for displaying dynamic events. r=chutten

https://hg.mozilla.org/projects/date/rev/cb99e3171937ce564838e89cbaf94314e8457d6e
Bug 1302681 - Part 3: Update documentation to cover dynamic events. r=dexter
Comment on attachment 8893044 [details] [diff] [review]
Part 1 - Extend Event Telemetry core for recording from addons

Large patch here, includes tests. Since we want to have a ways to get accurate data from mozilla add-ons, let's now try this out in beta 56.
Attachment #8893281 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
Attachment #8893396 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
Flags: needinfo?(kev)
(In reply to Georg Fritzsche [:gfritzsche] from comment #29)
> [Is this code covered by automated tests?]: Yes.
> [Has the fix been verified in Nightly?]: Currently underway.
> [Needs manual test from QE? If yes, steps to reproduce]: No.

Setting qe-verify- based on Georg's assessment on manual testing needs and the fact that this fix has automated coverage.
Flags: qe-verify-
You need to log in before you can comment on or make changes to this bug.