Closed Bug 948930 Opened 8 years ago Closed 5 years ago

Filter duplicate Telemetry submissions

Categories

(Webtools Graveyard :: Telemetry Server, defect)

x86_64
Linux
defect
Not set
normal

Tracking

(Not tracked)

RESOLVED DUPLICATE of bug 1342111

People

(Reporter: mreid, Unassigned)

Details

Currently, duplicate submissions are accepted at the HTTP server level. Under normal circumstances, a saved-session submission should only ever be sent once, and an idle-daily submission should be sent at most once per day.

If we see the same document being submitted more than once per day, we can discard these documents after the first one.

As a simple fix, we can keep a list of the N most-recently-seen document ids, and if a submission appears in this list, we discard it.  The list should be truncated once per day.
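
A minimal sketch of the idea above, not the Telemetry server's actual code. It assumes that "truncated once per day" means the list is cleared when the date changes, and that when the list is full the oldest id is dropped. The names `RecentIdFilter`, `should_accept`, and `max_ids` are hypothetical, chosen only for illustration.

```python
from collections import OrderedDict
from datetime import date


class RecentIdFilter:
    """Remember up to max_ids recently seen document ids (hypothetical helper).

    The whole list is forgotten when the calendar day changes, which is one
    reading of "truncated once per day".
    """

    def __init__(self, max_ids=100000):
        self.max_ids = max_ids
        self._seen = OrderedDict()  # doc_id -> True, least recently seen first
        self._day = None

    def should_accept(self, doc_id, today=None):
        """Return True for the first sighting of doc_id today, False for repeats."""
        today = today or date.today()
        if today != self._day:
            # New day: start over with an empty list.
            self._seen.clear()
            self._day = today
        if doc_id in self._seen:
            # Duplicate: refresh its position so it is not evicted soon, then reject.
            self._seen.move_to_end(doc_id)
            return False
        self._seen[doc_id] = True
        if len(self._seen) > self.max_ids:
            # Over capacity: drop the least recently seen id.
            self._seen.popitem(last=False)
        return True
```

With a bounded list, a duplicate that arrives after its id has been evicted is accepted again, so N has to be large enough to cover the expected number of distinct submissions per day.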
Nathan, Irving - do you see any problems with this approach?
We already need to track IDs to replace idle-daily data with updates (if the same session gets idle-daily on multiple days); I'm not sure whether the new infrastructure also removes the idle-daily when we get the final saved-session for that session ID.

For our problem clients, we're getting the same session sent over and over again over a period of months, so a solution that stores that session once per day is only a partial solution to the problem. The problem might be small enough that a partial solution is good enough, but it's still a sign of a client side bug to be getting the same session sent over and over again, so it would be nice to know when & how it's happening.

If I'm not mistaken, we changed the host name for telemetry some time ago but left forwarding behind - do we have a plan to track how many clients (on what versions of all applications in the Mozilla suite) are using the old address, and eventually drop the forwarding? If all our multiple submissions are coming from old versions that will mostly make the problem go away, aside from all the failing connection attempts to the old address...
The host name was changed for new Firefox builds (in bug 915796), and submissions to the old host name were forwarded to the new endpoint with some fancy Zeus tricks (in bug 916243).

:cturra, do we have any visibility on how much telemetry traffic is still hitting data.mozilla.com?
Flags: needinfo?(cturra)
I don't see any obvious way of having Zeus graph that data, but maybe this is something we can check on the telemetry servers?
Flags: needinfo?(cturra)
Status: NEW → RESOLVED
Closed: 5 years ago
Resolution: --- → DUPLICATE
Duplicate of bug: 1342111
Product: Webtools → Webtools Graveyard