Bug 1586810 Comment 1 Edit History

I'm struggling to find any pattern in the pings that have duplicate document ids.  This happens across all Android SDK versions, 105 different device models, all versions of Fenix and a-c that have run over the past day, and 32 different locales.  There also seems to be no relationship to ping duration or to how old the profile is.

The GCP pipeline doesn't retain the rejected pings, so we can't inspect what they contained.  However, from the submission URL, we can get the document id that was rejected and cross-reference it with the accepted pings in the database to inspect those pings' contents.  This relies on the assumption that the contents of the accepted and rejected pings are the same, which is a huge, unconfirmed assumption.
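As a rough sketch of the first step, the document id can be read off the end of the submission path.  The path layout (`/submit/<namespace>/<doctype>/<version>/<document id>`), the example host and the helper name are assumptions for illustration, not taken from the pipeline code.

```python
import uuid
from typing import Optional
from urllib.parse import urlparse


def document_id_from_url(url: str) -> Optional[str]:
    """Return the last path segment if it parses as a UUID, else None.

    Assumes (hypothetically) that the document id is the final segment
    of the submission URL path.
    """
    path = urlparse(url).path.rstrip("/")
    candidate = path.rsplit("/", 1)[-1]
    try:
        # Normalizes to lowercase, hyphenated form.
        return str(uuid.UUID(candidate))
    except ValueError:
        return None
```

Anything that does not parse as a UUID is returned as `None` rather than guessed at, so malformed URLs drop out of the cross-reference instead of producing false matches.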

With this technique, no duplicate document id ever came from different client ids (the client id is retained in the error database), so we can at least rule out UUID collisions across different devices.
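The client id check amounts to something like the following.  This is a minimal sketch over in-memory rows; the row shape (`document_id` and `client_id` keys) and the function names are assumed for illustration, and the real check would run as a query against the accepted-ping and error tables.

```python
from collections import defaultdict


def client_ids_per_document(accepted, rejected):
    """Map each rejected document id to every client id seen for it."""
    rejected_ids = {row["document_id"] for row in rejected}
    seen = defaultdict(set)
    for row in accepted:
        if row["document_id"] in rejected_ids:
            seen[row["document_id"]].add(row["client_id"])
    for row in rejected:
        seen[row["document_id"]].add(row["client_id"])
    return dict(seen)


def cross_client_collisions(accepted, rejected):
    """Document ids that were sent by more than one client id."""
    per_doc = client_ids_per_document(accepted, rejected)
    return {doc: ids for doc, ids in per_doc.items() if len(ids) > 1}
```

An empty result from `cross_client_collisions` corresponds to the observation above: every duplicate came from the same client that sent the accepted ping.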

There are some interesting patterns I've found by looking at the difference in `submission_timestamp` between the accepted ping and the corresponding rejected pings.  (See plot below).

(1) In the last day, there were 3 pings that were rejected *before* the accepted copy was submitted, but only by a difference of at most 36 seconds.  I would assume this is due to the `submission_timestamp` being generated early on in the pipeline and the duplicate document id rejection happening later, with a race condition in between.

(2) There is a pretty normal-looking decay curve on the left of the graph, where the delays are short.  Many of the duplicates are sent in the first five minutes.

(3) There is a peculiar spike between `5h:24m` and `5h:27m` that seems much too large to be random.  What is it about ~`5h:30m` that is meaningful here?
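For reference, the delays behind these three observations can be computed roughly as below.  This is only a sketch: the input shapes (a dict of accepted `submission_timestamp` values keyed by document id, and a list of rejected `(document id, timestamp)` pairs) and the function names are assumptions, not the actual query.

```python
from datetime import timedelta


def rejection_delays(accepted, rejected):
    """Delay of each rejected ping relative to its accepted copy.

    accepted: dict mapping document id -> accepted submission_timestamp
    rejected: iterable of (document id, rejected timestamp) pairs
    Rejected pings with no accepted counterpart are skipped.
    """
    delays = []
    for doc_id, rejected_at in rejected:
        accepted_at = accepted.get(doc_id)
        if accepted_at is not None:
            delays.append(rejected_at - accepted_at)
    return delays


def summarize(delays):
    """Count delays that are negative, within five minutes, or later."""
    zero = timedelta(0)
    five_minutes = timedelta(minutes=5)
    return {
        "rejected_first": sum(1 for d in delays if d < zero),
        "within_five_minutes": sum(1 for d in delays if zero <= d <= five_minutes),
        "later": sum(1 for d in delays if d > five_minutes),
    }
```

A negative delay is case (1), the `within_five_minutes` bucket is the left end of the decay curve in (2), and a histogram of the `later` delays is where a spike like (3) would show up.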

EDIT: All of the pings in (3) came from a single client, in a single incident where it sent a bunch of pings within one minute, so it's probably fair to say that it's an anomaly related to something that happened on that particular device rather than anything systemic.
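The single-client check is essentially a group-by over the spike window.  A minimal sketch, assuming the delays have already been paired with client ids (the input shape and function name are hypothetical):

```python
from collections import Counter
from datetime import timedelta


def clients_in_window(delays, low, high):
    """Count duplicates per client id whose delay falls in [low, high].

    delays: iterable of (client id, timedelta) pairs
    """
    return Counter(client for client, delay in delays if low <= delay <= high)


# Window matching the spike described in (3).
SPIKE_LOW = timedelta(hours=5, minutes=24)
SPIKE_HIGH = timedelta(hours=5, minutes=27)
```

If `clients_in_window(delays, SPIKE_LOW, SPIKE_HIGH)` has a single key, the spike is down to one device.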
