Closed Bug 1621743 Opened 5 years ago Closed 3 years ago

Add uptake telemetry to Grafana dashboards

Categories

(Data Platform and Tools :: General, enhancement, P3)

enhancement
Points:
3

Tracking

(Not tracked)

RESOLVED INVALID

People

(Reporter: jlockhart, Unassigned)

Details

We currently have a dashboard in Grafana for monitoring delivery enrollment here:

https://grafana.telemetry.mozilla.org/d/XspgvdxZz/experiment-enrollment?orgId=1&var-experiment_id=pref-activity-stream-password-import-cfr-release-74-74-bug-1617735&from=1583712000000&to=

Separately we track the "uptake" telemetry of each recipe as each client receives and evaluates it, but that telemetry is not incorporated into this dashboard.

I would like to see a graph on this dashboard that shows the total number of clients that receive a given recipe, broken down by uptake state, i.e.:

Total number of clients that received recipe over time
Total number of clients that successfully enroll in recipe over time
Total number of clients that decline to enroll in recipe over time
Total number of clients that report an uptake error over time

I believe this will help evaluate how effective the targeting is for a given recipe, with the goal of reducing the number of deliveries that have to be restarted because of unexpected or malformed targeting.

I think the current state of play with uptake telemetry in Normandy is that we report recipe status for every recipe on every run. I'm not sure how to turn that into a number of unique clients.

I think we separately record telemetry events regarding enrollment in recipes, which is of a different "cardinality" (an experiment will be seen many times and only enrolled once). Enrollment is therefore relatively easy to get a number of clients for. "Decline to enroll" is not covered apart from "didn't match filter" in recipe status, as far as I know.
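To illustrate the cardinality difference, here is a minimal Python sketch. It assumes each per-run uptake report could be joined to a client identifier, which this bug does not confirm; the record shape, the client and recipe IDs, and the status strings are all hypothetical and not Normandy's actual schema. It only shows why counting raw reports overstates the number of clients, and how collecting IDs into sets would give distinct-client counts if such an identifier were available.

```python
from collections import defaultdict

# Hypothetical per-run uptake reports: (client_id, recipe_id, status).
# Field names and status strings are illustrative, not Normandy's real schema.
REPORTS = [
    ("client-a", "recipe-1", "success"),
    ("client-a", "recipe-1", "success"),  # same client reporting on a later run
    ("client-b", "recipe-1", "didnt_match_filter"),
    ("client-b", "recipe-1", "didnt_match_filter"),
    ("client-c", "recipe-1", "filter_broken"),
]


def count_reports(reports):
    """Count raw reports per (recipe, status); repeated runs inflate this."""
    counts = defaultdict(int)
    for _client, recipe, status in reports:
        counts[(recipe, status)] += 1
    return dict(counts)


def count_unique_clients(reports):
    """Count distinct clients per (recipe, status) by collecting IDs in sets."""
    clients = defaultdict(set)
    for client, recipe, status in reports:
        clients[(recipe, status)].add(client)
    return {key: len(ids) for key, ids in clients.items()}
```

With the sample data, "success" has two reports for recipe-1 but only one distinct client, which is the gap described above.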

"Uptake error" can include "filter broken" errors (which are not 100% understood; conjectures are that these indicate difficulties reading Telemetry data off disk or failures to talk to classify-client) as well as errors in executing the action.

Thanks ethan! Then maybe it doesn't make sense to try to frame the uptake telemetry in terms of unique clients, but I think even just graphing each uptake response to the recipe separately would be helpful.
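As a rough sketch of what "graphing each uptake response separately" could mean, the Python below buckets reports by UTC day and status, then pulls out one time series per status. The input shape (Unix timestamp in seconds plus a status string) and the status names are assumptions made for illustration; nothing here reflects how the Grafana dashboard or its data source is actually wired up.

```python
from collections import Counter
from datetime import datetime, timezone


def bucket_by_day(reports):
    """Count reports per (UTC day, status). No per-client deduplication."""
    counts = Counter()
    for ts, status in reports:
        day = datetime.fromtimestamp(ts, tz=timezone.utc).date().isoformat()
        counts[(day, status)] += 1
    return counts


def series_for(counts, status):
    """Return sorted (day, count) pairs for one status: one line on a graph."""
    return sorted((day, n) for (day, s), n in counts.items() if s == status)


# Hypothetical sample: (unix_seconds, status). Status names are illustrative.
SAMPLE = [
    (1583712000, "success"),
    (1583715600, "success"),
    (1583712060, "didnt_match_filter"),
    (1583798400, "success"),
]

counts = bucket_by_day(SAMPLE)
success_series = series_for(counts, "success")
```

Each status then gets its own series, so the graph shows report volume per response over time rather than a client count.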

Points: --- → 3
Priority: -- → P3

The new operational monitoring from aschultz has superseded the need for this, so I'll close.

Status: NEW → RESOLVED
Closed: 3 years ago
Resolution: --- → INVALID
Component: Datasets: Experiments → General