Open Bug 1554729 Opened 5 years ago Updated 2 years ago

Investigate how broad the glean "baseline" ping duplicate problem _truly_ is

Categories

(Data Platform and Tools :: General, defect, P3)

Points: 3

Tracking

(Not tracked)

People

(Reporter: chutten, Unassigned)


Glean "baseline" pings from &browser and Fenix have a duplicates problem. We should never be sending pings with identical {client_id, seq}, and we certainly should not be sending pings that also have identical doc_id.
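
To make the two kinds of duplicate concrete, here is a minimal sketch of how they could be counted over a batch of pings. It assumes each ping has already been flattened into a dict with `doc_id`, `client_id` and `seq` keys; that layout, the function name `count_duplicate_pings` and the sample data are all hypothetical, not the pipeline's actual schema or tooling.

```python
from collections import Counter


def count_duplicate_pings(pings):
    """Count extra copies of pings by {client_id, seq} and by doc_id.

    `pings` is an iterable of dicts with "doc_id", "client_id" and "seq"
    keys (an assumed, flattened layout used only for this sketch).
    """
    seq_counts = Counter()
    doc_counts = Counter()
    total = 0
    for ping in pings:
        total += 1
        seq_counts[(ping["client_id"], ping["seq"])] += 1
        doc_counts[ping["doc_id"]] += 1

    # Every copy beyond the first one counts as a duplicate.
    return {
        "total_pings": total,
        "duplicate_client_seq": sum(n - 1 for n in seq_counts.values() if n > 1),
        "duplicate_doc_id": sum(n - 1 for n in doc_counts.values() if n > 1),
    }


# Made-up sample data, for illustration only.
sample = [
    {"doc_id": "a", "client_id": "c1", "seq": 1},
    {"doc_id": "a", "client_id": "c1", "seq": 1},  # same doc_id and same {client_id, seq}
    {"doc_id": "b", "client_id": "c1", "seq": 1},  # new doc_id, repeated {client_id, seq}
    {"doc_id": "c", "client_id": "c2", "seq": 1},
]
result = count_duplicate_pings(sample)
```

Counting the two keys separately keeps pings that were resent verbatim (same doc_id) apart from pings that were regenerated with a new doc_id but a repeated {client_id, seq}.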

We see both in production. Worse, we see them making their way into datasets, where they can interfere with analyses.

So far we have focused on the duplicate pings that make it into datasets and haven't looked into how many duplicates are actually caught. This bug is about looking into the duplicates that do get caught and seeing whether there is anything useful to learn from them.

See Also: → 1547234
Points: --- → 3
Priority: -- → P3

Do we know if this is still a problem?

Component: Pipeline Ingestion → General