Investigate duplicates in the reported data

Status: NEW (Unassigned)
Type: defect
Priority: P1
Severity: normal
Opened: 2 months ago
Last modified: 20 days ago
Reporter: Dexter

Description • Dexter (Reporter) • 2 months ago

In bug 1525603 Chris found that we have a problem with duplicates:

(In reply to Chris H-C :chutten from bug 1525603 comment #2)

> Taking a closer look at the dupes, only about two-thirds of them are fully dupes (i.e., having the same docid). Over a third have different document ids.

We seem to have two different problems:

(1) full dupes with the same document id, hinting that we might be sending dupes spread across a long time period, or that the deduper on the pipeline is not catching them for other reasons;
(2) "half dupes", i.e. dupes with a different document id, hinting that we have a problem in the SDK with re-using sequence numbers when we shouldn't.

Comment 1 • Jeff Klukas [:klukas] (UTC-4) • 2 months ago

For (1), we expect imperfect deduplication on the AWS pipeline due to maintaining seen docIds separately on the various hindsight servers; duplicate pings that hit different servers won't be filtered out. We are observing substantially better deduplication performance on the GCP pipeline, which stores seen docIds centrally on a Redis cluster and maintains 24 hours of history.
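To make that strategy concrete, here's a minimal Kotlin sketch of the centralized approach: remember each seen docId for 24 hours and drop anything already in the window. The names (DocIdDeduper, checkAndRecord) are illustrative, not actual pipeline code.

```kotlin
import java.time.Instant

// Hypothetical sketch of centralized docId dedup with a 24-hour window,
// analogous to the Redis-backed approach described above. Not actual pipeline code.
class DocIdDeduper(private val ttlSeconds: Long = 24L * 60 * 60) {
    private val seen = HashMap<String, Instant>()

    /** Returns true if the ping is new, false if its docId was seen within the window. */
    fun checkAndRecord(docId: String, now: Instant = Instant.now()): Boolean {
        // Expire entries older than the window (Redis would do this via key TTLs).
        seen.entries.removeIf { it.value.isBefore(now.minusSeconds(ttlSeconds)) }
        return seen.putIfAbsent(docId, now) == null
    }
}

fun main() {
    val deduper = DocIdDeduper()
    println(deduper.checkAndRecord("doc-1")) // true: first sighting
    println(deduper.checkAndRecord("doc-1")) // false: full dupe, same docId
    println(deduper.checkAndRecord("doc-2")) // true: a "half dupe" passes docId dedup
}
```

Note that if each ingestion server keeps its own seen map, as in the AWS/hindsight setup, a duplicate that lands on a different server passes this check; centralizing the store, as with Redis on GCP, closes that gap.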

Comment 2 • Dexter (Reporter) • 2 months ago

(In reply to Jeff Klukas [:klukas] (UTC-4) from comment #1)

> For (1), we expect imperfect deduplication on the AWS pipeline due to maintaining seen docIds separately on the various hindsight servers; duplicate pings that hit different servers won't be filtered out.

Yup, we're aware of this. However, we're seeing a 9% duplicate rate, which seems a bit too high. We're interested in understanding why these pings are not filtered out on the pipeline, in addition to fixing the root cause in the SDK. Knowing this would also point us in the right direction on the SDK side :-D

> We are observing substantially better deduplication performance on the GCP pipeline, which stores seen docIds centrally on a Redis cluster and maintains 24 hours of history.

Nice!

> we're seeing a 9% duplicate rate, which seems a bit too high

Indeed. In a validation exercise we undertook earlier this week, we saw an overall dupe rate below 1% on AWS. If something in the SDK is causing more retried Glean payloads than the general case, I could certainly see that pushing the dupe rate higher, but I agree we can't rule out an issue in the pipeline as a contributor to the dupes.
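For clarity on the numbers above, here's a minimal sketch of a dupe-rate computation, assuming "dupe rate" means the fraction of received pings whose docId was already seen; that definition is an assumption, not taken from the pipeline code.

```kotlin
// Minimal sketch, assuming "dupe rate" = duplicated docIds / total pings received.
// The definition is an assumption, not taken from the pipeline code.
fun dupeRate(docIds: List<String>): Double {
    if (docIds.isEmpty()) return 0.0
    val duplicates = docIds.size - docIds.toSet().size
    return duplicates.toDouble() / docIds.size
}
```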

For reference, here's a query that picks up a specific instance of the "half dupes" (2):

https://sql.telemetry.mozilla.org/queries/62480/

Interesting findings from the above query:

For these pings with the same seq but different doc_id, the time periods as marked in ping_info are all non-overlapping and at least 5 minutes apart, so racing on updating the seq in SharedPreferences seems unlikely. My best theory is that updating the seq number in SharedPreferences just keeps failing for a very long time for these clients...?
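To illustrate that theory, here's a hypothetical Kotlin sketch of the failure mode; SeqStore and assemblePing are invented stand-ins for the Glean internals, with SeqStore playing the role of SharedPreferences (whose Editor.commit() returns false when a write fails).

```kotlin
import java.util.UUID

// Hypothetical sketch of the suspected SDK failure mode; this is not the actual
// Glean code. SeqStore stands in for SharedPreferences.
interface SeqStore {
    fun get(key: String, default: Int): Int
    fun put(key: String, value: Int): Boolean // false when the write fails
}

data class PingInfo(val docId: String, val seq: Int)

fun assemblePing(store: SeqStore, pingName: String): PingInfo {
    val key = "${pingName}_seq"
    val seq = store.get(key, 0)
    val persisted = store.put(key, seq + 1)
    if (!persisted) {
        // If this write keeps failing and the failure is ignored, every later ping
        // gets a fresh random docId but the same seq: exactly a "half dupe".
    }
    return PingInfo(docId = UUID.randomUUID().toString(), seq = seq)
}
```

Since the pings in the query are at least 5 minutes apart, a persistently failing write like this fits the data better than a race between threads.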

Comment 6 • 2 months ago

Time distribution of duped pings, split by "full" (docid is duped too) vs. "half" (different docid, duped {client_id, seq} tuple): https://sql.telemetry.mozilla.org/queries/62490/source#160463

Most (over 90%) of the dupes are received within a day of each other; in fact, most are received within 10 minutes of each other.
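For reference, a hypothetical sketch of this classification (field names are illustrative, not the actual ping schema): pings sharing a {client_id, seq} tuple are grouped, and each dupe is labelled "full" when the docId also matches, "half" otherwise, together with how long after the original it arrived.

```kotlin
import java.time.Duration
import java.time.Instant

// Hypothetical sketch of the full-vs-half classification; field names are
// illustrative, not the actual ping schema.
data class Ping(val clientId: String, val seq: Int, val docId: String, val received: Instant)

enum class DupeKind { FULL, HALF }

// Group pings by {client_id, seq}; within each group, every ping after the first
// is a dupe: FULL when it repeats the docId, HALF otherwise. Also report the
// arrival delay of each dupe relative to the original.
fun classifyDupes(pings: List<Ping>): List<Pair<DupeKind, Duration>> =
    pings.groupBy { it.clientId to it.seq }
        .values
        .filter { it.size > 1 }
        .flatMap { group ->
            val first = group.minByOrNull { it.received }!!
            group.filter { it !== first }.map { dupe ->
                val kind = if (dupe.docId == first.docId) DupeKind.FULL else DupeKind.HALF
                kind to Duration.between(first.received, dupe.received)
            }
        }
```

Bucketing the resulting durations is what yields the distribution above (over 90% within a day, most within 10 minutes).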

Updated • Dexter (Reporter) • last month
Priority: -- → P1

Updated • Dexter (Reporter) • 20 days ago
See Also: → bug 1554729