Add hindsight monitor for validation error rates

RESOLVED FIXED

Status

Cloud Services
Metrics: Pipeline
P1
normal
RESOLVED FIXED
8 months ago
6 months ago

People

(Reporter: Dexter, Assigned: trink)

Tracking

(Blocks: 1 bug)

Firefox Tracking Flags

(Not tracked)

Details

(Reporter)

Description

8 months ago
We want:
- email alerts to telemetry-client-dev@mozilla.com
- graphing over time, per-channel, per-type
- regression detection, per-channel, per-type
- failure samples ready for triaging

This should alert us if there's any sudden spike in the schema validation errors for any ping, e.g. the main ping fails to suddenly validate for a portion of the clients.

Having a sample of the ping which fails to validate would be very helpful to help diagnosing issues.

Bonus points for making this easy to use for testing new ping schemas.
(Reporter)

Updated

8 months ago
Blocks: 1344235
(Reporter)

Updated

8 months ago
Depends on: 1344249

Updated

8 months ago
Assignee: nobody → mtrinkala
Points: --- → 1
Priority: -- → P1
(Assignee)

Comment 1

7 months ago
Proposed alert configuration (this would track all types of error not just schema failures):

      ingestion_errors = {
        max_error_ratio = 0.005,
      },
Blocks: 1348863
Summary: Add hindsight monitor for schema validation error rates → Add hindsight monitor for validation error rates
(Assignee)

Comment 2

6 months ago
https://mozilla-services.github.io/lua_sandbox_extensions/moz_telemetry/sandboxes/heka/analysis/moz_telemetry_doctype_monitor.html
Status: NEW → RESOLVED
Last Resolved: 6 months ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.