Closed
Bug 1252050
Opened 9 years ago
Closed 3 years ago
Fire emails on ping size budget monitoring alerts
Categories
(Data Platform and Tools :: Monitoring & Alerting, defect, P3)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: gfritzsche, Unassigned)
Details
(Whiteboard: [measurement:client:tracking])
(Mark Reid [:mreid] from bug 1249343, comment 2)
> We'd need to make several changes, but it does seem worthwhile to monitor
> the size of incoming pings.
>
> The code that computes the aggregates for the budget dashboard also fires
> alerts if we exceed our estimates. Those alerts could be extended to do some
> detection of increases (or decreases) in size or volume of submissions.
Reporter | ||
Comment 1•9 years ago
The histogram regression alerts have relatively stable regression detection; we might be able to lift something from there?
Comment 2•9 years ago
Georg, can we simply add a regular histogram on the client for the length of the uncompressed payload? Then we'd get alerting for free.
Points: --- → 2
Flags: needinfo?(gfritzsche)
Priority: -- → P3
Reporter | ||
Comment 3•9 years ago
(In reply to Mark Reid [:mreid] from comment #2)
> Georg, can we simply add a regular histogram on the client for the length of
> the uncompressed payload? Then we'd get alerting for free.
Yes, but that wouldn't cover the "core" ping, which has a really constrained set of data points (bug 1249343 will request adding it to the budget monitor).
It would also be nice not to have to trust the clients, and instead monitor the actual incoming data, including metadata and headers.
We already have some "opt-in" measurements for "too big" pings on Firefox Desktop (TELEMETRY_PING_SIZE_EXCEEDED_SEND, TELEMETRY_DISCARDED_SEND_PINGS_SIZE_MB); we could add more fine-grained ones and request making them opt-out.
Flags: needinfo?(gfritzsche)
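For illustration, the two measurements contrasted in the comment above could look roughly like the sketch below: the client-side length of the uncompressed payload, and the server-side size of an incoming request including headers. The function names and the sample payload are hypothetical and are not taken from the Telemetry client or the pipeline code.

```python
import gzip
import json


def uncompressed_payload_length(payload: dict) -> int:
    """Size in bytes of the JSON-serialized payload before compression."""
    return len(json.dumps(payload, separators=(",", ":")).encode("utf-8"))


def incoming_request_size(headers: dict, body: bytes) -> int:
    """Approximate size of a submission as received: header lines plus body.

    Assumption: each header is counted as "Name: value\r\n". The body is
    counted as received, so it may still be compressed.
    """
    header_bytes = sum(
        len(f"{name}: {value}\r\n".encode("utf-8"))
        for name, value in headers.items()
    )
    return header_bytes + len(body)


if __name__ == "__main__":
    # Hypothetical payload, only loosely shaped like a small ping.
    payload = {"type": "core", "seq": 1, "locale": "en-US"}
    raw = json.dumps(payload, separators=(",", ":")).encode("utf-8")
    body = gzip.compress(raw)
    headers = {"Content-Encoding": "gzip", "Content-Type": "application/json"}

    print("client-side uncompressed length:", uncompressed_payload_length(payload))
    print("server-side request size:", incoming_request_size(headers, body))
```

The client-side number depends on what the client reports, while the server-side number is measured on the data that actually arrives, which is the distinction the comment draws.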
Comment 4•8 years ago
:trink and :gfritzsche - does the recently-deployed doctype monitoring solve this use case?
Flags: needinfo?(mtrinkala)
Flags: needinfo?(gfritzsche)
Comment 5•8 years ago
This is what is currently firing.
####
Subject: Hindsight [analysis.moz_telemetry_doctype_monitor_crash#release] - size
MIME-Version: 1.0
Date: Mon, 24 Apr 2017 03:34:07 +0000
From: <hindsight@pipeline-cep.prod.mozaws.net>
To: AlertRecipients <noreply@example.com>
Content-Type: text/plain; charset="iso-8859-1"
X-Mailer: LuaSocket 3.0-rc1
Message-ID: <0101015b9e060a8d-dffaa9ca-66c1-4eca-9e35-571606054795-000000@us-west-2.amazonses.com>
X-SES-Outgoing: 2017.04.24-54.240.27.113
Feedback-ID: 1.us-west-2.9obwqSuHxAmNPKpejVDo3cEAmnSHOVLO3+B/64gdyXQ=:AmazonSES
Hostname: pipeline-cep.prod.mozaws.net
Pid: 54712
The average message size has changed by 101.995% (current avg: 16812B)
graph: https://pipeline-cep.prod.mozaws.net/dashboard_output/graphs/analysis.moz_telemetry_doctype_monitor_crash.size.html
###
Acceptable? Sadly we won't know what the new average size of the crash ping will actually be until well after the 53 roll-out as the average size continues to increase.
Flags: needinfo?(mtrinkala)
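As a rough sketch of the kind of check behind an alert like the one above, the following compares the average message size in a current window against a baseline window and produces a similarly worded message when the relative change crosses a threshold. This is not the Hindsight plugin's actual logic; the function names, the windowing, and the 25% default threshold are assumptions made for illustration.

```python
from statistics import fmean
from typing import Optional, Sequence


def percent_change(baseline_avg: float, current_avg: float) -> float:
    """Relative change of current_avg versus baseline_avg, in percent."""
    if baseline_avg <= 0:
        raise ValueError("baseline average must be positive")
    return (current_avg - baseline_avg) / baseline_avg * 100.0


def size_alert(
    baseline_sizes: Sequence[float],
    current_sizes: Sequence[float],
    threshold_pct: float = 25.0,  # hypothetical threshold, not from the bug
) -> Optional[str]:
    """Return an alert message if the average size moved by at least threshold_pct."""
    baseline_avg = fmean(baseline_sizes)
    current_avg = fmean(current_sizes)
    change = percent_change(baseline_avg, current_avg)
    # Use the absolute value so that decreases are reported as well.
    if abs(change) >= threshold_pct:
        return (
            f"The average message size has changed by {change:.3f}% "
            f"(current avg: {current_avg:.0f}B)"
        )
    return None


if __name__ == "__main__":
    print(size_alert([100, 100], [200, 200]))
    print(size_alert([100], [110]))
```

A fixed baseline window keeps firing while the average is still rising, as during a release roll-out; a rolling baseline would settle sooner at the cost of hiding slow drifts.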
Reporter | ||
Comment 6•8 years ago
This is great already.
Would it be hard to get a per-docType monitor going?
AFAICT, currently we can't tell which docType is changing based on this monitor alone.
Flags: needinfo?(gfritzsche)
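A per-docType monitor along the lines requested above could be sketched as follows: sizes are grouped by docType, averaged per group, and only the docTypes whose average moved past a threshold are reported. The record shape (docType, size in bytes), the function names, and the threshold are assumptions for illustration, not the pipeline's configuration or API.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

Record = Tuple[str, int]  # (docType, message size in bytes); hypothetical shape


def average_size_by_doctype(records: Iterable[Record]) -> Dict[str, float]:
    """Average message size for each docType seen in records."""
    totals = defaultdict(lambda: [0, 0])  # docType -> [total bytes, count]
    for doc_type, size in records:
        entry = totals[doc_type]
        entry[0] += size
        entry[1] += 1
    return {doc_type: total / count for doc_type, (total, count) in totals.items()}


def changed_doctypes(
    baseline: Iterable[Record],
    current: Iterable[Record],
    threshold_pct: float = 25.0,  # hypothetical threshold
) -> Dict[str, float]:
    """Map each docType whose average size changed by >= threshold_pct to that change."""
    baseline_avg = average_size_by_doctype(baseline)
    current_avg = average_size_by_doctype(current)
    changes = {}
    for doc_type, cur in current_avg.items():
        base = baseline_avg.get(doc_type)
        if not base:
            # No usable baseline for this docType yet; skip rather than guess.
            continue
        change = (cur - base) / base * 100.0
        if abs(change) >= threshold_pct:
            changes[doc_type] = change
    return changes


if __name__ == "__main__":
    baseline = [("crash", 100), ("main", 1000)]
    current = [("crash", 200), ("main", 1010)]
    print(changed_doctypes(baseline, current))
```

Keying the result by docType is what lets an alert name the ping type that changed, rather than reporting one aggregate figure.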
Updated•7 years ago
Component: Metrics: Pipeline → Monitoring & Alerting
Product: Cloud Services → Data Platform and Tools
Comment 7•7 years ago
You can monitor this for any configured docType. Trink, where are the monitored doctypes configured?
Flags: needinfo?(mtrinkala)
Comment 8•7 years ago
Flags: needinfo?(mtrinkala)
Comment 9•3 years ago
We have a working process to manage size now. Also, legacy telemetry is migrating to Glean in the foreseeable future.
Status: NEW → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED