Closed Bug 983360 Opened 10 years ago Closed 10 years ago

Telemetry experiments: add monitoring data to Telemetry payload

Categories

(Firefox Health Report Graveyard :: Client: Desktop, defect)

defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED
Firefox 31

People

(Reporter: gfritzsche, Assigned: gfritzsche)

References

Details

Attachments

(1 file, 2 obsolete files)

Forking the telemetry part out from bug 974024, comment 2:

Telemetry payload:
* Log experiments that we considered: if a condition didn't match (or the JS function threw an exception) log the reason that the experiment was not accepted
* Log experiment install/success/failure.
* Log user-disable of an experiment.
* Log normal experiment termination (via script from the experiment itself, or via expiration of the experiment).
Depends on: 974009
We need to know which which experiments the events occurred for.

Example report:
> 1500 people installed experiment X yesterday, and 150000 people didn't because condition Foo didn't match
So i think we will need two actual log events: activation success and termination.

"EXPERIMENT_ACTIVATION" can be one of: ACTIVATED=0, INSTALL_FAILURE=1, REJECTED=2.
It will use an additional field for details (rejection reason, failure details).

"EXPERIMENT_TERMINATION" can be one of: USERDISABLED=0, FROM_EXPERIMENT=1, EXPIRED=2, RECHECK=3.
I'm wondering for this, if it actually makes a difference on the server-/analysis-side whether we submit:

> log("EXPERIMENT_ACTIVATION", [ACTIVATED, id]);
> log("EXPERIMENT_ACTIVATION", [REJECTED, id, "startTime"]);
> ...

... or say:

> log("EXPERIMENT_ACTIVATION_ACTIVATED", [id]);
> log("EXPERIMENT_ACTIVATION_REJECTED", [id, "startTime"]);
> ...
Flags: needinfo?(benjamin)
It doesn't make a difference at the moment, no. I'll pick the first form. (Although I'd expect "reason" and not "startTime").
Flags: needinfo?(benjamin)
Rejection examples:
log("EXPERIMENT_ACTIVATION", [REJECTED, id, "startTime"])
log("EXPERIMENT_ACTIVATION", [REJECTED, id, "jsfilter-false"])
log("EXPERIMENT_ACTIVATION", [REJECTED, id, "jsfilter-threw", expection.message])
Attached patch Telemetry, fixup, no tests (obsolete) — Splinter Review
I need to finish up tests tomorrow, but this looks reasonable so far.

It required some fixup for the rejection reason propagation and the tests.
Attachment #8394371 - Flags: review?(benjamin)
Attachment #8394371 - Attachment is obsolete: true
Attachment #8394371 - Flags: review?(benjamin)
Attachment #8394835 - Flags: review?(benjamin)
Open question: how to actually tell that an experiment disabled itself?
The only thing i can use at the moment is expecting it to do:

> Experiments.instance().disableExperiment(id, false)
Blocks: 986530
Yes, that's what they will do. Although I wonder if we shouldn't make .disableExperiment not take the ID, since there's only one experiment at a time...
That sounds better, i'll file a follow-up.
Blocks: 986539
Comment on attachment 8394835 [details] [diff] [review]
Experiments telemetry, test coverage, small fixes

>+const TELEMETRY_LOG = Object.freeze({
>+  // log(key, [kind, experimentId, details])
>+  ACTIVATION: Object.freeze({
>+    key: "EXPERIMENT_ACTIVATION",
>+    kind: Object.freeze({
>+      ACTIVATED: 0,       // successfully activated
>+      INSTALL_FAILURE: 1, // failed to install the extension
>+      REJECTED: 2,        // experiment was rejected because of it's conditions.
>+    }),
>+  }),
>+
>+  // log(key, [kind, experimentId, details])
>+  TERMINATION: Object.freeze({
>+    key: "EXPERIMENT_TERMINATION",
>+    kind: Object.freeze({
>+      USERDISABLED: 0,    // the user disabled this experiment
>+      FROM_EXPERIMENT: 1, // the experiment disabled itself
>+      EXPIRED: 2,         // experiment expired e.g. by exceeding the end-date
>+      RECHECK: 3,         // disabled after re-evaluating conditions

Will the RECHECK case have the condition which failed as REJECTED does?

>+    }),
>+  }),
>+});

Let's not freeze any of this: it's complicated and unnecessary. But also using this below as TELEMETRY_LOG.TERMINATION.kind.USERDISABLED seems verbose. Maybe shorten this structure to:

TLOG.TERMINATION.USER_DISABLED ?

>     let active = this._getActiveExperiment();
>     let promise = Promise.resolve();
>     if (active) {
>-      promise = active.stop();
>+      promise = active.stop(-1);

Need constants for this and elsewhere. It's really unclear at the call site what "-1" means here.

>+        let terminationKind = TELEMETRY_LOG.TERMINATION.kind.USERDISABLED;
>+        if (!userDisabled) {
>+          // assuming this comes from the experiment, but how would we tell?
>+          terminationKind = TELEMETRY_LOG.TERMINATION.kind.FROM_EXPERIMENT;

Maybe instead of FROM_EXPERIMENT this should just be "FROM_API"?

>   /*
>    * Stop running the experiment if it is active.
>-   * @param userDisabled (optional) Whether this is disabled by user action, defaults to false.
>+   * @param terminationKind The termination kind, e.g. USERDISABLED or EXPIRED.
>    * @return Promise<> Resolved when the operation is complete.
>    */
>-  stop: function (userDisabled=false) {
>-    gLogger.trace("ExperimentEntry::stop() - id=" + this.id + ", userDisabled=" + userDisabled);
>+  stop: function (terminationKind) {
>+    gLogger.trace("ExperimentEntry::stop() - id=" + this.id + ", terminationKind=" + terminationKind);
>     if (!this._enabled) {
>       gLogger.warning("ExperimentEntry::stop() - experiment not enabled: " + id);
>       return Promise.reject();
>     }
> 
>     this._enabled = false;
>     let deferred = Promise.defer();
>     let updateDates = () => {
>@@ -1328,63 +1379,76 @@ Experiments.ExperimentEntry.prototype = 
>       let listener = {};
>       let handler = addon => {
>         if (addon.id !== this._addonId) {
>           return;
>         }
> 
>         updateDates();
> 
>+        let desc = TELEMETRY_LOG.TERMINATION;
>+        let values = [desc.kind[key] for (key of Object.keys(desc.kind))];
>+        if (values.indexOf(terminationKind) != -1) {

This is very strange, we're basically going string -> int -> string and this is very hard to read. Why don't we just make these string constants to begin with, so:

TELEMETRY_LOG.TERMINATION.EXPIRED = "EXPIRED" and then you can just use the string directly?

Also I see what the -1 is doing above. You should just leave it out by passing stop() and here terminationKind === undefined.


>diff --git a/browser/experiments/test/xpcshell/test_telemetry.js b/browser/experiments/test/xpcshell/test_telemetry.js

>+const TELEMETRY_LOG = Object.freeze({

I don't think you need this in the test: you could just hardcode the values we expect.
Attachment #8394835 - Flags: review?(benjamin) → review-
Addressed the review comments. Would like something shorter than TELEMETRY_LOG, but TLOG seems a little cryptic and brevity seems reasonable now.
Attachment #8394835 - Attachment is obsolete: true
Attachment #8394899 - Flags: review?(benjamin)
Comment on attachment 8394899 [details] [diff] [review]
Experiments telemetry, incorporating review comments

> 
>     let active = this._getActiveExperiment();
>     let promise = Promise.resolve();
>     if (active) {
>-      promise = active.stop();
>+      promise = active.stop(-1);

Forgot a -1?


>+          if (!applicable && reason && reason[0] != "was-active") {
>+            // Report this from here to avoid over-reporting.
>+            let desc = TELEMETRY_LOG.ACTIVATION;
>+            let data = [TELEMETRY_LOG.ACTIVATION.REJECTED, id];
>+            reason.forEach(a => data.push(a));

Not .concat? Push is probably faster, but it's a little hard to read. (And below).

r=me with nits
Attachment #8394899 - Flags: review?(benjamin) → review+
https://hg.mozilla.org/mozilla-central/rev/0eb7e1fa8a0f
Status: ASSIGNED → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Target Milestone: --- → Firefox 31
Depends on: 1021002
Product: Firefox Health Report → Firefox Health Report Graveyard
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: