Closed Bug 972889 Opened 7 years ago Closed 7 years ago
Automatic e-mail notifications of Telemetry validation error rate spikes & drops
We should set up automatic alerts based on the validation statistics to be added in bug 972887.
As discussed yesterday on IRC, I played with the Fourier transform to see if we can use it to identify a drop in submissions. It can detect the drop, but only after several days, which is not good enough. It turns out that an ensemble predictor based on Welch's t-test of the last 10 daily means works pretty well on the training data I had: https://gist.github.com/vitillo/9023560. Red dots signal a submission drop in the plots. The predictor takes a series of samples and, if enough data is available, makes a prediction, returning True if the submissions are dropping and False otherwise. If not enough data is available, it just returns False; that's why you don't see red dots at the beginning of the two plots even though, in retrospect, the submissions were already dropping there.
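For readers without access to the gist, here is a minimal sketch of how such a predictor could look. It is not the code from the gist. The way the ensemble is built (splitting the 10-day window at several points and taking a vote), the names `welch_t` and `is_dropping`, and the constants `T_CRITICAL` and `MIN_VOTES` are all assumptions made for illustration. It also compares the raw t statistic to a fixed cutoff instead of computing a p-value, which is a simplification.

```python
import math
import statistics

WINDOW = 10        # last 10 daily means, as described in the comment above
T_CRITICAL = 2.5   # hypothetical cutoff for the t statistic (not from the gist)
MIN_VOTES = 3      # hypothetical number of splits that must agree


def welch_t(before, after):
    """Welch's t statistic for mean(before) - mean(after), unequal variances."""
    var_b = statistics.variance(before)
    var_a = statistics.variance(after)
    std_err = math.sqrt(var_b / len(before) + var_a / len(after))
    diff = statistics.mean(before) - statistics.mean(after)
    if std_err == 0:
        # Both sides are constant: any positive difference is a clear drop.
        return math.inf if diff > 0 else 0.0
    return diff / std_err


def is_dropping(daily_means):
    """Return True if the recent samples look like a drop, False otherwise.

    Returns False when fewer than WINDOW samples are available.
    """
    if len(daily_means) < WINDOW:
        return False
    window = daily_means[-WINDOW:]
    votes = 0
    # Assumed ensemble: one test per split point, each side keeps >= 3 samples.
    for split in range(3, WINDOW - 2):
        if welch_t(window[:split], window[split:]) > T_CRITICAL:
            votes += 1
    return votes >= MIN_VOTES


if __name__ == "__main__":
    steady = [100, 101, 99, 100, 102, 98, 100, 101, 99, 100]
    dropped = [100, 101, 99, 100, 102, 60, 58, 61, 59, 60]
    print(is_dropping(steady), is_dropping(dropped))
```

The test is one-sided on purpose: only a decrease from the earlier part of the window to the later part counts as a vote, so a spike in submissions would not trigger an alert in this sketch.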
Note: Alerting on the *submission* rate is bug 962811. This bug is about the *validation error* rate. Anomaly detection will likely be similar between both bugs, but comment 1 is using data about the submission rate.
See Also: → 962811
Sorry about that. If you point me to some data I can have a look at how the algorithm behaves with the validation error rate.
Validation error rate monitoring has been deployed. Code: https://github.com/mozilla/telemetry-server/blob/master/monitoring/process_incoming/error_rates.py
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED