Open Bug 1324801 Opened 7 years ago Updated 2 years ago

Telemetry should throttle pings to a sane limit per day

Categories

(Toolkit :: Telemetry, defect, P3)

Tracking Status
firefox53 --- affected

People

(Reporter: gfritzsche, Unassigned)

Details

(Whiteboard: [measurement:client][fce-active-legacy])

For custom ping types, one of the implementation concerns is that they could end up sending way more pings per user than expected (e.g. crash ping types with heavy crash spikes, sync in bug 1287473, ...).

We should have a general safety limit in place in Telemetry that throttles pings to a high, but still "reasonable", number of pings per day per client.
For anything over the limit, we would track this per ping type in Telemetry.
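To make the proposal concrete, here is a minimal sketch of such a safety limit, written in Python for illustration only. It is not Firefox Telemetry code: the class `DailyPingThrottle`, its method `should_send`, and the default limit of 1000 are all hypothetical, and no actual limit has been decided in this bug.

```python
from collections import defaultdict
from datetime import date


class DailyPingThrottle:
    """Hypothetical per-client limiter: at most `daily_limit` pings per calendar day."""

    def __init__(self, daily_limit=1000):
        # 1000 is a placeholder value, not a limit agreed on in this bug.
        self.daily_limit = daily_limit
        self._day = None
        self._sent_today = 0
        # Over-limit pings, counted per ping type. Kept cumulatively
        # until whoever reports them resets the mapping.
        self.discarded = defaultdict(int)

    def should_send(self, ping_type, today=None):
        """Return True if the ping fits in today's budget, else record it as discarded."""
        today = today or date.today()
        if today != self._day:
            # First ping of a new day: start a fresh budget.
            self._day = today
            self._sent_today = 0
        if self._sent_today >= self.daily_limit:
            self.discarded[ping_type] += 1
            return False
        self._sent_today += 1
        return True
```

A caller would check `should_send("crash")` before submitting and could later report `discarded` so that over-limit pings stay visible per ping type.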
Whiteboard: [measurement:client] → [measurement:client][fce-active]
Blocks: 1310674
Unless the limit is very high, I'm a bit skeptical that this is a good idea. What kind of limit are you considering? I would think we want at least 100 crash pings per day before giving up.
Flags: needinfo?(gfritzsche)
(In reply to Benjamin Smedberg [:bsmedberg] from comment #1)
> Unless the limit is very high, I'm a bit skeptical that this is a good idea.
> What kind of limit are you considering? I would think we want at least 100
> crash pings per day before giving up.

I think we should apply a "high" limit as mentioned.
I'm not really worried about clients occasionally submitting 100 pings, but about extraordinary circumstances leading to, say, >1000 pings.

We need to define what we can afford in terms of infrastructure and cost, and settle on a maximum from there.
Flags: needinfo?(gfritzsche)
The wonderful thing about having a limit is that it'll force us to be intelligent about what we send.

Maybe instead of or in addition to #pings-per-doctype/day it could be MB-per-doctype/day, since that's what users and our servers are charged by.

For now, the easiest first step could be to create two keyed scalars, one for how many pings and one for how many bytes we send, so that we know whether any clients are in danger of hitting a limit. Could uplift that pretty high if we want upper-branch details as well.
When bug 1352496 lands (which should be very soon), we should revisit this.
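The measurement-first step could look roughly like the sketch below, again in Python for illustration only. The two dictionaries stand in for keyed scalars and are not the Telemetry scalar API. The names `PingVolumeCounters`, `record`, and `near_limit`, the 0.8 warning fraction, and the choice to measure the uncompressed JSON size are all assumptions.

```python
import json
from collections import defaultdict


class PingVolumeCounters:
    """Hypothetical stand-in for two keyed scalars, keyed by doctype."""

    def __init__(self):
        self.ping_count = defaultdict(int)  # pings sent per doctype
        self.bytes_sent = defaultdict(int)  # payload bytes sent per doctype

    def record(self, doc_type, payload):
        """Count one ping and its serialized size; return that size in bytes."""
        # Assumption: size is the UTF-8 length of the uncompressed JSON body.
        size = len(json.dumps(payload).encode("utf-8"))
        self.ping_count[doc_type] += 1
        self.bytes_sent[doc_type] += size
        return size

    def near_limit(self, doc_type, max_pings, max_bytes, fraction=0.8):
        """True if either counter has reached `fraction` of its candidate limit."""
        return (self.ping_count[doc_type] >= fraction * max_pings
                or self.bytes_sent[doc_type] >= fraction * max_bytes)
```

Collecting both counters before enforcing anything would show how many clients come close to a candidate limit, whether that limit is expressed in pings or in bytes.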
See Also: → 1352496
Whiteboard: [measurement:client][fce-active] → [measurement:client][fce-active-legacy]
Severity: normal → S3