Closed Bug 1399153 Opened 7 years ago Closed 7 years ago

[Telemetry Health] Investigate client chattiness (pings/client/day) as a measure of Telemetry Client Health

Categories

(Toolkit :: Telemetry, defect, P1)

RESOLVED FIXED
Tracking Status
firefox57 --- unaffected

People

(Reporter: chutten, Assigned: chutten)

References

(Blocks 1 open bug)

Details

Let's expand the Telemetry Health dashboard to look at additional measures of Telemetry Client Health. Two metrics we discussed in SF as deserving a closer look were "Client Chattiness" (how many pings should we expect clients to send per day?) and "Missing Subsessions" (are there holes in our data?).
Blocks: 1400351
Splitting "Missing Subsessions" out to its own bug.

So the first swing at "Client Chattiness" is up on the dashboard[1] (its own query is here[2]).

First notes from the shape of the data:
 * On average we see fewer than 4 pings per client per day on every channel.
 * Beta's usually the most chatty. Aurora's usually the least chatty. But they're rarely more than 0.8 pings/client apart.
 * There was a chattiness event on Nightly from Aug 3 to Aug 8, with some increased activity in late July leading up to it.

So, questions about where to go from here:
1) Does this mean we should throttle/investigate/block clients sending us more than 4 pings per day? 40? 400?
 * No. Getting a 95%ile might be more illuminating if we're trying to provide thresholds or guarantees. Are we planning on doing that?
2) What if we see another "chattiness event" like the one we saw on Nightly? Should we inform relman? File bugs?
 * File a bug, yes. See if there's anything we can glean through subgroup analysis. Need MacroBase?
3) What metrics/alerts/etc do we actually want out of this?
 * The notes[3] aren't 100% clear on what we actually hoped to get out of this. Maybe the existing plot is good enough?
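
For anyone wanting to poke at the numbers offline, here is a rough sketch of the two measures discussed above: the mean pings/client/day per channel, and the 95%ile floated in answer 1. It is not the dashboard query[2]. The record fields ("channel", "day", "client_id") and the function names are assumptions made up for illustration.

```python
from collections import Counter, defaultdict
from statistics import mean, quantiles


def pings_per_client_day(pings):
    """Count pings for each (channel, day, client_id) triple.

    `pings` is an iterable of dicts with the hypothetical keys
    "channel", "day" and "client_id" (one dict per received ping).
    """
    return Counter((p["channel"], p["day"], p["client_id"]) for p in pings)


def chattiness_summary(pings):
    """Return {channel: {"mean": ..., "p95": ...}} of pings/client/day."""
    by_channel = defaultdict(list)
    for (channel, _day, _client), n in pings_per_client_day(pings).items():
        by_channel[channel].append(n)

    summary = {}
    for channel, values in by_channel.items():
        if len(values) > 1:
            # 99 cut points; index 94 is the 95th percentile.
            p95 = quantiles(values, n=100, method="inclusive")[94]
        else:
            # statistics.quantiles needs at least two data points.
            p95 = values[0]
        summary[channel] = {"mean": mean(values), "p95": p95}
    return summary
```

Clients that sent nothing on a given day don't appear in the input at all, so this only describes clients we heard from that day. That matches "pings per client per day" only if the denominator is meant to be active clients, which is itself an assumption.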


[1]: https://sql.telemetry.mozilla.org/dashboard/telemetry-health
[2]: https://sql.telemetry.mozilla.org/queries/36239#97020
[3]: https://docs.google.com/document/d/17aHdr6ThLkokDyz1WYDIRI-Ds7VJ5b8CIapaNhxqsLw/edit#heading=h.kxxbnsi2npgo
Summary: [Telemetry Health] Investigate client chattiness (pings/client/day) and missing subsessions as measures of Telemetry Client Health → [Telemetry Health] Investigate client chattiness (pings/client/day) as a measure of Telemetry Client Health
Let's get a distribution view of how many pings each client sends. (hat-tip, :mreid) Is it outlier-heavy?
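
A minimal sketch of what such a distribution view could compute, assuming the input is just a list with one client_id per received ping (a made-up shape, not the real ping schema). The first function gives the histogram. The second is one possible way to put a number on "outlier-heavy": the share of all pings sent by the chattiest fraction of clients. Both function names are hypothetical.

```python
from collections import Counter


def pings_per_client_distribution(client_ids):
    """Map 'number of pings sent' -> 'number of clients that sent that many'."""
    per_client = Counter(client_ids)
    return Counter(per_client.values())


def top_share(client_ids, fraction=0.01):
    """Share of all pings sent by the chattiest `fraction` of clients.

    Always considers at least one client, so tiny samples still work.
    """
    per_client = sorted(Counter(client_ids).values(), reverse=True)
    if not per_client:
        return 0.0
    k = max(1, int(len(per_client) * fraction))
    return sum(per_client[:k]) / sum(per_client)
```

If the top 1% of clients account for a large share of pings, the mean is being pulled up by outliers and a median or percentile plot would be the more honest summary.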
Blocks: 1177737
Work has completed on the documented write-up: https://docs.google.com/document/d/1o3r2wdi8ndFDgSj7HAWyVL8A1BzpddI51eN1Lc9RptU/edit?ts=59df66a1#

tl;dr - This work is not expected to result in new actionable information. Instead, it should result in a better understanding of current telemetry use, and provide a regression baseline to help us identify anomalous events should they occur.

Future work is being done in bug 1407608 to figure out if my intuition of "Wow, that seems a little chatty, eh?" is correct and there's something weird happening... or if my intuition is false.
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED