Duplicate telemetry submissions being received by the server

RESOLVED FIXED in Firefox 39

Status

()

defect
RESOLVED FIXED
4 years ago
4 years ago

People

(Reporter: mreid, Assigned: gfritzsche)

Tracking

39 Branch
mozilla40
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(firefox38 unaffected, firefox39 fixed, firefox40 fixed)

Details

Attachments

(7 attachments, 3 obsolete attachments)

589 bytes, text/plain
Details
3.36 KB, text/plain
Details
2.06 KB, patch
vladan
: review+
Details | Diff | Splinter Review
1.57 KB, patch
Yoric
: review+
Details | Diff | Splinter Review
9.68 KB, patch
gfritzsche
: review+
Details | Diff | Splinter Review
7.84 KB, patch
Yoric
: review+
Details | Diff | Splinter Review
10.70 KB, patch
vladan
: review+
Yoric
: feedback+
Details | Diff | Splinter Review
Reporter

Description

4 years ago
We have seen duplicate Document IDs coming in to the new data pipeline. This includes both "main" pings and "saved-session" pings.

The duplicate saved-session pings I have checked so far (about 50 cases) have all been exact duplicates, though some document ids have been sent more than twice (one was seen 7 times).

Some of the duplicate main pings are identical, but some contained slightly different payloads. The duplicate main pings that I checked were all seen exactly twice.

I have several examples, if you'd like access to them please send me email.

One possible server-side explanation I had was that the "old" telemetry backend sent back a 500 response (while the new backend accepted the submission), causing the client to retry and double up the submission in the new pipeline.  

This could account for the identical duplicates, but would *not* explain the case where the payloads are slightly different.

I plan to check the corresponding data in the old backend and see if the duplicates appear there as well.
Reporter

Comment 1

4 years ago
I've confirmed that in my test set, the exact same duplicates appear in the old backend as I found in the new backend.

An example docId of a "slightly different" main ping:
04753c65-e17f-454b-8e78-5344cc57457f

The difference is that one ping has "shutdown_MS" in the "addonDetails" section for several addons, while the other doesn't have the shutdown times.
Reporter

Comment 2

4 years ago
Here is an example filter you can use with the telemetry MapReduce code to reproduce my analysis.
Reporter

Comment 3

4 years ago
A list of the document IDs that were seen more than once using the example filter.

Comment 4

4 years ago
Hey Georg and Benjamin-- the case of slightly different payloads being submitted under one docIs is the kind of unexpected client behavior that will make the Metrics Team nervous as we move towards go/no-go. Just to reiterate: after the experience of v2, our stance is that v4 is guilty until proven innocent, and that apparently little bugs may be the tips of bigger icebergs (icebugs?). Please keep me posted as you look into this-- we're looking for whatever reassurance you can offer :-) Thanks!

Updated

4 years ago
Blocks: 1069869

Comment 5

4 years ago
From raw data for one of the dups: diff between the two docs is:

@@ -558,8 +558,9 @@
           "scan_MS": 0,
           "scan_items": 1
         },
         "<redacted>": {
+          "shutdown_MS": 2,
           "startup_MS": 22,
...
@@ -590,8 +591,9 @@
           "location": "app-profile",
           "scan_MS": 0
         },
         "<redacted>": {
+          "shutdown_MS": 1,
           "startup_MS": 42,
...

So we're resubmitting that one with extra addon shutdown data.

For the other, there are changes within UITelemetry:

       "UITelemetry": {
         "UITour": {
           "seenPageIDs": []
         },
         "contextmenu": {},
         "toolbars": {
           "durations": {
             "customization": []
           },
           "countableEvents": {
             "__DEFAULT__": {
               "search": {
                 "urlbar": 1
               },
               "click-bookmarks-bar": {
                 "item": {
                   "left": 1
                 }
               },
               "click-menubar": {
+                "other": {
+                  "left": 1
+                },
                 "menuitem": {
-                  "left": 5
+                  "left": 9
                 },
                 "menu": {
-                  "left": 11
+                  "left": 18
                 }
               },
               "click-builtin-item": {
+                "urlbar-container": {
+                  "right": 1,
+                  "left": 2
+                },
                 "urlbar-reload-button": {
-                  "left": 1
+                  "left": 4
                 },
                 "tabview-button": {
-                  "left": 8
+                  "left": 14
                 },
                 "tabbrowser-tabs": {
-                  "right": 10,
-                  "left": 45
+                  "right": 16,
+                  "left": 60
                 }
               }
             }
           },
Assignee

Updated

4 years ago
Version: 37 Branch → 39 Branch
Assignee

Comment 6

4 years ago
mreid checked the server timestamps:

For 04753c65-e17f-454b-8e78-5344cc57457f:
* server timestamp: 2015-03-01 17:18:59.025598464 -0400 AST
* server timestamp: 2015-03-01 17:31:18.050208768 -0400 AST
* client timestamp (.creationDate) is the same for both: "2015-03-01T21:18:55.780Z"

For 849ec591-2f63-4277-b922-80835e36a390:
* server timestamp: 2015-02-28 21:11:20.727574272 -0400 AST
* server timestamp: 2015-03-01 12:14:28.044041728 -0400 AST
* client timestamp (.creationDate) is the same for both: "2015-03-01T01:10:36.008Z"

The server timestamps are not close to each other, so most likely we are running into problems here:
https://hg.mozilla.org/mozilla-central/annotate/008b3f65a7e0/toolkit/components/telemetry/TelemetryPing.jsm#l405
... for some reason we submit the ping, but get a rejection, hence still doing TelemetryFile.savePing().

The two data differences above are in parts of the payload that are not properly cloned:
* the addon details are just returned directly by reference: 
  https://hg.mozilla.org/mozilla-central/annotate/008b3f65a7e0/toolkit/mozapps/extensions/AddonManager.jsm#l2605
* UITelemetry only returns a shallow copy:
  https://hg.mozilla.org/mozilla-central/annotate/008b3f65a7e0/toolkit/components/telemetry/UITelemetry.jsm#l223

Given that, we should probably just do a structured clone of the payload object in TelemetrySession.getSessionPayload() to avoid future modifications.
I'm still looking into the actual reason that leads to a rejection and triggers this issue.
Assignee

Updated

4 years ago
Assignee: nobody → gfritzsche
Assignee

Updated

4 years ago
Status: NEW → ASSIGNED
Assignee

Comment 7

4 years ago
This doesn't fix duplicate submissions yet, but takes care of payloads possibly changing after being submitted to TelemetryPing.
The duplicate submission exposed that.
Assignee

Comment 8

4 years ago
This improves the Telemetry ping submission code:
* catches more error paths
* makes Telemetry timeout on ping submission after an interval
* (hopefully) improves readability
Attachment #8579369 - Flags: review?(vdjeric)
Assignee

Comment 9

4 years ago
Comment on attachment 8579354 [details] [diff] [review]
Part 1 - Do a structured clone of the payloads

Trivial patch, i promise.
Attachment #8579354 - Flags: review?(vdjeric)
Assignee

Comment 10

4 years ago
This should fix some (most?) of the double submissions here.

The idea is that we may lose network connectivity (through shutdown, network issues, ...) after the server received the submission, but before the client completely receives the proper responses.
This was not exposed before because it was very unlikely to have an idle-daily submission still pending on shutdown.

We need to take care of waiting for pending ping submissions etc. on Telemetry shutdown, which is what this patch adresses.
Attachment #8579373 - Flags: review?(dteller)
Attachment #8579373 - Flags: feedback?(vdjeric)
Reporter

Comment 11

4 years ago
(In reply to Georg Fritzsche [:gfritzsche] [away Feb 27 - March 15] from comment #7)
> This doesn't fix duplicate submissions yet, but takes care of payloads
> possibly changing after being submitted to TelemetryPing.
This was my primary concern. If we can ensure that any subsequent submissions with the same DocumentID are exactly identical, then we can correctly handle duplicates on the server side (see bug 1142153 for our proposal).
Comment on attachment 8579373 [details] [diff] [review]
Part 3 - Make Telemetry shutdown block on pending ping activity

Review of attachment 8579373 [details] [diff] [review]:
-----------------------------------------------------------------

::: toolkit/components/telemetry/TelemetryPing.jsm
@@ +390,5 @@
>    },
>  
>    /**
> +   * Track any pending ping send and save tasks through the promise passed here.
> +   * This is needed to block shutdown on any outstanding ping activity.

Why doesn't each ping block shutdown individually?

Take a look at the `connections` barrier in Sqlite.jsm, I suspect that this would be somewhat clearer and less dependent on maintaining yet another global array.

@@ +813,5 @@
>          this._initialized = false;
>          this._initStarted = false;
>        };
> +      return this._shutdownBarrier.wait()
> +                 .then(() => PromiseUtils.allResolvedOrRejected(this._pendingPingTasks.values()))

I believe that this would be clearer with a `Task`.

::: toolkit/modules/PromiseUtils.jsm
@@ +23,5 @@
> +   * Like Promise.all(), but waits until all Promises have been resolved or
> +   * rejected.
> +   *
> +   * @param promises An array of promises to wait on.
> +   * @return A promise that is resolved once all the passed promises are resolved or rejected.

Nit: Please match the doc format of `defer`.

@@ +27,5 @@
> +   * @return A promise that is resolved once all the passed promises are resolved or rejected.
> +   */
> +  allResolvedOrRejected: function(promises) {
> +    // Use a dummy catch callback to turn any rejections into resolves.
> +    let ps = [for (p of promises) Promise.resolve(p).catch(() => {})];

Hey, my first review with the standardized array comprehension syntax :)
Attachment #8579373 - Flags: review?(dteller) → feedback+
Assignee

Comment 13

4 years ago
Thanks, this addresses your comments.

Turns out that Experiments.jsm didn't actually use allResolvedOrRejected() anymore and the changes to the blocker approach in TelemetryPing dropped it, so that change is gone.
Attachment #8579373 - Attachment is obsolete: true
Attachment #8579373 - Flags: feedback?(vdjeric)
Attachment #8580058 - Flags: review?(dteller)
(In reply to Georg Fritzsche [:gfritzsche] from comment #6)
> For 04753c65-e17f-454b-8e78-5344cc57457f:
> * server timestamp: 2015-03-01 17:18:59.025598464 -0400 AST
> * server timestamp: 2015-03-01 17:31:18.050208768 -0400 AST
> * client timestamp (.creationDate) is the same for both:
> "2015-03-01T21:18:55.780Z"
> 
> For 849ec591-2f63-4277-b922-80835e36a390:
> * server timestamp: 2015-02-28 21:11:20.727574272 -0400 AST
> * server timestamp: 2015-03-01 12:14:28.044041728 -0400 AST
> * client timestamp (.creationDate) is the same for both:
> "2015-03-01T01:10:36.008Z"
> 
> The server timestamps are not close to each other, so most likely we are
> running into problems here:
> ... < snip > ...
> I'm still looking into the actual reason that leads to a rejection and
> triggers this issue.

Is it possible we're generating a ping and sending it and also saving it at shutdown? This would explain why the duplicate doesn't get sent until much later.
*at Firefox shutdown
Attachment #8579354 - Flags: review?(vdjeric) → review+
Comment on attachment 8579369 [details] [diff] [review]
Part 2 - Improve XHR usage in TelemetryPing

Review of attachment 8579369 [details] [diff] [review]:
-----------------------------------------------------------------

::: toolkit/components/telemetry/TelemetryPing.jsm
@@ +40,5 @@
>  const TELEMETRY_TEST_DELAY = 100;
>  // The number of days to keep pings serialised on the disk in case of failures.
>  const DEFAULT_RETENTION_DAYS = 14;
> +// Timeout after which we consider a ping submission failed.
> +const PING_SUBMIT_TIMEOUT = 2 * 60 * 1000;

PING_SUBMIT_TIMEOUT_MS

@@ +578,5 @@
>      request.overrideMimeType("text/plain");
>      request.setRequestHeader("Content-Type", "application/json; charset=UTF-8");
>  
>      let startTime = new Date();
> +    let deferred = PromiseUtils.defer();

what's the deference between Promise.defer() and PromiseUtils.defer()?

@@ +606,5 @@
> +    request.ontimeout = errorhandler;
> +    request.onabort = errorhandler;
> +
> +    request.onload = (event) => {
> +      let status = request.status;

the spec says .status might not have give the right results when there are network errors :/ https://xhr.spec.whatwg.org/#introduction

@@ +607,5 @@
> +    request.onabort = errorhandler;
> +
> +    request.onload = (event) => {
> +      let status = request.status;
> +      let statusClass = status - (status % 100);

you could just parseInt(status/100)

@@ +614,5 @@
> +      if (statusClass === 200) {
> +        // We can treat all 2XX as success.
> +        this._log.info("doPing - successfully loaded, status: " + status);
> +        success = true;
> +      } else if (statusClass === 400) {

did you test that onload gets called for these different error codes? does onError get called for all of them as well?
Attachment #8579369 - Flags: review?(vdjeric) → review+
Assignee

Comment 18

4 years ago
(In reply to Vladan Djeric (:vladan) -- please needinfo! from comment #15)
> (In reply to Georg Fritzsche [:gfritzsche] from comment #6)
> > For 04753c65-e17f-454b-8e78-5344cc57457f:
> > * server timestamp: 2015-03-01 17:18:59.025598464 -0400 AST
> > * server timestamp: 2015-03-01 17:31:18.050208768 -0400 AST
> > * client timestamp (.creationDate) is the same for both:
> > "2015-03-01T21:18:55.780Z"
> > 
> > For 849ec591-2f63-4277-b922-80835e36a390:
> > * server timestamp: 2015-02-28 21:11:20.727574272 -0400 AST
> > * server timestamp: 2015-03-01 12:14:28.044041728 -0400 AST
> > * client timestamp (.creationDate) is the same for both:
> > "2015-03-01T01:10:36.008Z"
> > 
> > The server timestamps are not close to each other, so most likely we are
> > running into problems here:
> > ... < snip > ...
> > I'm still looking into the actual reason that leads to a rejection and
> > triggers this issue.
> 
> Is it possible we're generating a ping and sending it and also saving it at
> shutdown? This would explain why the duplicate doesn't get sent until much
> later.

So, my hypothesis (and the only suspect code path i could find so far) is that:
* we send it
* the server receives
* we dont get confirmation of receiving due to e.g. network error
* due to that error, we save the ping here: https://hg.mozilla.org/mozilla-central/annotate/008b3f65a7e0/toolkit/components/telemetry/TelemetryPing.jsm#l406
* after the next session startup, we submit this ping

That we apparently only send the ping at least "N minutes" later seems to confirm this.
Assignee

Comment 19

4 years ago
(In reply to Vladan Djeric (:vladan) -- please needinfo! from comment #17)
> @@ +578,5 @@
> > +    let deferred = PromiseUtils.defer();
> 
> what's the deference between Promise.defer() and PromiseUtils.defer()?

Functionally they are equivalent, but Promise.defer() is deprecated over the new standardized DOM Promises.

> @@ +606,5 @@
> > +    request.ontimeout = errorhandler;
> > +    request.onabort = errorhandler;
> > +
> > +    request.onload = (event) => {
> > +      let status = request.status;
> 
> the spec says .status might not have give the right results when there are
> network errors :/ https://xhr.spec.whatwg.org/#introduction

Which section are you referring to specifically?
If we hit onload, the fetch succeeded (i.e. status set properly?): https://xhr.spec.whatwg.org/#event-xhr-load

> @@ +607,5 @@
> > +    request.onabort = errorhandler;
> > +
> > +    request.onload = (event) => {
> > +      let status = request.status;
> > +      let statusClass = status - (status % 100);
> 
> you could just parseInt(status/100)

I didn't want to go for a string just for this, but i guess it doesn't matter much either way.

> @@ +614,5 @@
> > +      if (statusClass === 200) {
> > +        // We can treat all 2XX as success.
> > +        this._log.info("doPing - successfully loaded, status: " + status);
> > +        success = true;
> > +      } else if (statusClass === 400) {
> 
> did you test that onload gets called for these different error codes? does
> onError get called for all of them as well?

"error" and "load" are mutually exclusive:
https://xhr.spec.whatwg.org/#suggested-names-for-events-using-the-progressevent-interface

"load" gets called for successful fetches with the status codes set appropriately.
Assignee

Comment 20

4 years ago
Addressed comment.
Attachment #8579369 - Attachment is obsolete: true
Attachment #8580679 - Flags: review+
Comment on attachment 8580059 [details] [diff] [review]
Part 4 - Bonus: Remove unused function from Experiments.jsm

Review of attachment 8580059 [details] [diff] [review]:
-----------------------------------------------------------------

Rubberstamp.
Attachment #8580059 - Flags: review?(dteller) → review+
Comment on attachment 8580058 [details] [diff] [review]
Part 3 - Make Telemetry shutdown block on pending ping activity

Review of attachment 8580058 [details] [diff] [review]:
-----------------------------------------------------------------

::: toolkit/components/telemetry/TelemetryPing.jsm
@@ +396,5 @@
> +   * Track any pending ping send and save tasks through the promise passed here.
> +   * This is needed to block shutdown on any outstanding ping activity.
> +   *
> +   * @return Promise A promise that is resolved or rejected with the original pings
> +   *                 status once it is untracked.

Doc error: this doesn't seem to return any value.

@@ +449,4 @@
>      let pingsIterator = Iterator(this.popPayloads());
>      let p = [data for (data in pingsIterator)].map(data => this.doPing(data, true));
> +
> +    let promise = Promise.all(p);

If a call to `doPing` rejects, `Promise.all(p)` will reject immediately, without waiting for the other calls to `doPing` to resolve. Is this the intended behavior? If `doPing` cannot reject, please update the documentation of `doPing`.
Attachment #8580058 - Flags: review?(dteller) → feedback+
Attachment #8581562 - Flags: review?(dteller) → review+
Assignee

Comment 29

4 years ago
Alessio isolated the issue here that didnt show up on my try pushes, working on it.
Flags: needinfo?(gfritzsche)
Assignee

Comment 30

4 years ago
The problem was pending ping requests:
Their timeouts are greater then the AsyncShutdown timeout.
If we happen to trigger a ping send shortly before shutdown we wait for it to do an orderly shutdown.
If it times out, we potentially run into an AsyncShutdown timeout before the timeout triggers.

While this is much less likely to hit outside test environments, it might happen.

This patch
* adds a test-case for that scenario
* aborts pending ping XHRs on shutdown, which triggers persisting them properly
Attachment #8586101 - Flags: review?(vdjeric)
Comment on attachment 8586101 [details] [diff] [review]
Part 5 - Abort pending ping requests on shutdown

Review of attachment 8586101 [details] [diff] [review]:
-----------------------------------------------------------------

::: toolkit/components/telemetry/tests/unit/test_TelemetryPingShutdown.js
@@ +49,5 @@
> +
> +  run_next_test();
> +}
> +
> +add_task(function* test_shutdown() {

maybe a less generic name like test_send_timeout?

@@ +72,5 @@
> +  // If we wait much longer then the timeout we set for AsyncShutdown
> +  // above, then we can be sure that the shutdown did not hang.
> +  setTimeout(() => {
> +    Assert.ok(true, "Didn't time out on shutdown.");
> +    do_test_finished();

check that the ping got saved
Attachment #8586101 - Flags: review?(vdjeric) → review+
Comment on attachment 8586101 [details] [diff] [review]
Part 5 - Abort pending ping requests on shutdown

Yoric, as the resident AsyncShutdown expert, can you take a quick look at TelemetryPing.jsm in this patch :)
Attachment #8586101 - Flags: feedback?(dteller)
Comment on attachment 8586101 [details] [diff] [review]
Part 5 - Abort pending ping requests on shutdown

Review of attachment 8586101 [details] [diff] [review]:
-----------------------------------------------------------------

::: toolkit/components/telemetry/TelemetryPing.jsm
@@ +606,5 @@
>                (error) => {
>                  this._log.error("doPing - request success: " + success + ", error" + error);
>                  onCompletion();
>                });
>      };

Nit: This callback soup is starting to get messy. Task.jsm would help clean it up.

@@ +827,5 @@
>  
> +    // Abort any pending ping XHRs.
> +    for (let [url, request] of this._pendingPingRequests) {
> +      this._log.trace("_cleanupOnShutdown - aborting ping request for " + url);
> +      request.abort();

Nit: If this can fail, add try/catch. Otherwise, document it.

::: toolkit/components/telemetry/tests/unit/test_TelemetryPingShutdown.js
@@ +1,3 @@
> +/* Any copyright is dedicated to the Public Domain.
> + * http://creativecommons.org/publicdomain/zero/1.0/ */
> +

Nit: "use strict"

@@ +5,5 @@
> +// to AsyncShutdown timeouts.
> +
> +const { utils: Cu, interfaces: Ci, classes: Cc } = Components;
> +
> +Cu.import("resource://gre/modules/Services.jsm");

Nit: `, this` for all imports, if we want these tests to be B2G-compatible. Not strictly necessary, but it doesn't cost much.

@@ +14,5 @@
> +
> +XPCOMUtils.defineLazyGetter(this, "gDatareportingService",
> +  () => Cc["@mozilla.org/datareporting/service;1"]
> +          .getService(Ci.nsISupports)
> +          .wrappedJSObject);

Is a lazy getter really useful in a test?

@@ +30,5 @@
> +  response.setHeader("Content-Type", "text/plain");
> +}
> +
> +function run_test() {
> +  do_test_pending();

Since you make use of this `do_test_pending` only at the end, with your `setTimeout`, could you move them together?

@@ +51,5 @@
> +}
> +
> +add_task(function* test_shutdown() {
> +  const TIMEOUT = 100;
> +  Services.prefs.setBoolPref("toolkit.asyncshutdown.testing", true);

Since this preference is not documented anywhere, please document its use.

@@ +64,5 @@
> +  TelemetryPing.send("test-ping-type", {});
> +
> +  // Trigger the AsyncShutdown phase TelemetryPing hangs off.
> +  Services.obs.notifyObservers(null, "profile-before-change", null);
> +  Services.obs.notifyObservers(null, "profile-before-change2", null);

Well, since you're using toolkit.asyncshutdown.testing, you should be able to use directly `AsyncShutdown.profileBeforeChange._trigger()` and `AsyncShutdown.sendTelemetry._trigger()`.

@@ +67,5 @@
> +  Services.obs.notifyObservers(null, "profile-before-change", null);
> +  Services.obs.notifyObservers(null, "profile-before-change2", null);
> +
> +  // AsyncShutdown currently doesn't have a way to properly wait for a
> +  // shutdown phase, so we use a workaround here.

Please file a bug :)
Attachment #8586101 - Flags: feedback?(dteller) → feedback+
Assignee

Comment 34

4 years ago
(In reply to David Rajchenbach-Teller [:Yoric] (use "needinfo") from comment #33)
> @@ +67,5 @@
> > +  Services.obs.notifyObservers(null, "profile-before-change", null);
> > +  Services.obs.notifyObservers(null, "profile-before-change2", null);
> > +
> > +  // AsyncShutdown currently doesn't have a way to properly wait for a
> > +  // shutdown phase, so we use a workaround here.
> 
> Please file a bug :)

Nevermind that part, using phase._trigger() waiting works just fine.
Assignee

Comment 35

4 years ago
Other comments addressed and full stack is green on try:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=256df1f0f2f6
This and everything else from that push is backed out in https://hg.mozilla.org/integration/fx-team/rev/6901a267f856 for Windows PGO XPCShell failures:

https://treeherder.mozilla.org/logviewer.html#?job_id=2520751&repo=fx-team
Flags: needinfo?(gfritzsche)
Assignee

Comment 38

4 years ago
Hmpf, #3 of "things that never showed up on try".

PGO try push with fix: https://treeherder.mozilla.org/#/jobs?repo=try&revision=9cb8ce2e0436
Flags: needinfo?(gfritzsche)
Assignee

Comment 39

4 years ago
This should actually be handled on bug 1140558.
Assignee

Comment 42

4 years ago
Comment on attachment 8579354 [details] [diff] [review]
Part 1 - Do a structured clone of the payloads

Approval Request Comment

[Feature/regressing bug #]:
FHR/Telemetry unification. Note that this is requesting approval for several bugs & patches.
[User impact if declined]:
Telemetry bugs & negative impact on assuring proper behavior on the trains.
[Describe test coverage new/current, TreeHerder]:
We have good automation coverage of the Telemetry features.
Data verification on the server side showed reasonable behavior since the landings and there were no other issues reported so far.
[Risks and why]:
Through the above steps, we can call this low risk. The changes are fairly contained to Telemetry and we checked that Telemetry behaves fine.
[String/UUID change made/needed]:
None.


The try push of the stack on Aurora is looking good so far:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=31596f200e20

As discussed with lmandel on Vidyo today, i am flagging for approval on one patch here. The approval request extends to the following patches:
 0 - Bug 1139460 - Part 1 - Avoid Telemetry payload data changes after collection. r=vladan
 1 - Bug 1139460 - Part 2 - Overhaul telemetry ping submission code. r=vladan
 2 Bug 1139460 - Part 3 - Make TelemetryPing shutdown block on pending submissions. r=yoric
 3 - Bug 1139460 - Bonus: Remove unused function from experiments code. rs=yoric
 4 - Bug 1139460 - Part 5 - Fix race leading to TelemetryPing shutdown timeout by aborting all pending ping requests on shutdown. r=vladan,f=yoric
 5 - Bug 1140558 - Part 1 - Switch TelemetryEnvironment to a model which keeps track of the current state constantly and makes the current environment available synchronously. r=vladan/Dexter
 6 - Bug 1140558 - Part 2 - Make the testing deepEqual implementation shared properly in ObjectUtils.jsm. r=yoric
 7 - Bug 1140558 - Part 3 - Pass the old environment data to event listeners on environment changes. r=vladan
 8 - Bug 1140558 - Part 4 - Fix TelemetryEnvironment returning NaN for GFX RAM on error. r=gfritzsche
 9 - Bug 1143714 - Throttle Telemetry environment changes. r=vladan
10 - Bug 1143796 - Increase TelemetryScheduler ticking interval when user is not active. r=gfritzsche
11 - Bug 1143796 - Add test for TelemetryScheduler tick interval changing when user is idle. r=gfritzsche
12 - Bug 1139751 - Try to collect data for Telemetry pings when the user is idle. r=vladan
13 - Bug 1139751 - Fix aborted-session saves that can race to after the shutdown aborted-session removal. r=vladan
14 - Bug 1154228 - Remove unnecessary semicolon in toLocalTimeISOString in TelemetrySession.jsm. r=gfritzsche
15 - Bug 1154717 - Fix toLocalTimeISOString(). r=yoric
16 - Bug 1154856 - (Aurora version) Fix TelemetrySession test issues. r=vladan
17 - Bug 1154902 - Don't gather profile data in Telemetry on Android devices. r=gfritzsche
18 - Bug 1151994 - test_telemetryPing.js in asyncSetup if datareporting.healthreport.service.enabled is undefined or false. r=vdjeric
19 - Bug 1137404 - Remove DWriteVersion reporting from Telemetry. r=gfritzsche
20 - Bug 1154518 - Make sure extended data gathering (Telemetry) is disabled when FHR is disabled. r=Gijs
Attachment #8579354 - Flags: approval-mozilla-aurora?
Comment on attachment 8579354 [details] [diff] [review]
Part 1 - Do a structured clone of the payloads

I am approving this set of patches for uplift. These changes are needed as they address issues with Telemetry introduced in 39. In addition, we will test the data integrity of the Telemetry package on Aurora 39 in order to put this feature in a better position to ship in 40. Note that unified Telemetry is planned to be disabled before 39 merges to Beta.
Attachment #8579354 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Attachment #8580059 - Flags: approval-mozilla-aurora+
Attachment #8580679 - Flags: approval-mozilla-aurora+
Attachment #8581562 - Flags: approval-mozilla-aurora+
Attachment #8586101 - Flags: approval-mozilla-aurora+
Georg - Can you please get a bug on file if you haven't already to disable unified Telemetry before 39 merges to Beta?
Flags: needinfo?(gfritzsche)
Assignee

Comment 46

4 years ago
(In reply to Lawrence Mandel [:lmandel] (use needinfo) from comment #44)
> Georg - Can you please get a bug on file if you haven't already to disable
> unified Telemetry before 39 merges to Beta?

Sorry for clearing this - i had filed this as bug 1158317.
Flags: needinfo?(gfritzsche)
Reporter

Updated

4 years ago
See Also: → 1167456
You need to log in before you can comment on or make changes to this bug.