Closed Bug 1252050 Opened 9 years ago Closed 3 years ago

Fire emails on ping size budget monitoring alerts

Tracking

(Not tracked)

Status:

RESOLVED FIXED

People

(Reporter: gfritzsche, Unassigned)

Details

(Whiteboard: [measurement:client:tracking])

Georg Fritzsche [:gfritzsche]

Reporter

Description

9 years ago

(Mark Reid [:mreid] from bug 1249343, comment 2)
> We'd need to make several changes, but it does seem worthwhile to monitor
> the size of incoming pings.
>
> The code that computes the aggregates for the budget dashboard also fires
> alerts if we exceed our estimates. Those alerts could be extended to do some
> detection of increases (or decreases) in size or volume of submissions.

Georg Fritzsche [:gfritzsche]

Reporter

Comment 1

9 years ago

The histogram regression alerts have relatively stable regression detection. Might we be able to lift something from there?

Mark Reid [:mreid]

Comment 2

9 years ago

Georg, can we simply add a regular histogram on the client for the length of the uncompressed payload? Then we'd get alerting for free.

Points: --- → 2

Flags: needinfo?(gfritzsche)

Priority: -- → P3

Georg Fritzsche [:gfritzsche]

Reporter

Comment 3

9 years ago

(In reply to Mark Reid [:mreid] from comment #2)
> Georg, can we simply add a regular histogram on the client for the length of
> the uncompressed payload? Then we'd get alerting for free.

Yes, but that wouldn't cover the "core" ping (which has a really constrained set of data points; bug 1249343 will request adding that to the budget monitor).
It would also be nice not to have to trust the clients and instead monitor the actual incoming data, including metadata / headers.

We do have some "opt-in" measurements already for "too big" pings on Fx Desktop (TELEMETRY_PING_SIZE_EXCEEDED_SEND, TELEMETRY_DISCARDED_SEND_PINGS_SIZE_MB); we could add more fine-grained ones and request making them opt-out.

Flags: needinfo?(gfritzsche)

Mark Reid [:mreid]

Comment 4

8 years ago

:trink and :gfritzsche - does the recently-deployed doctype monitoring solve this use case?

Flags: needinfo?(mtrinkala)

Flags: needinfo?(gfritzsche)

Mike Trinkala [:trink]

Comment 5

8 years ago

This is what is currently firing.

####
Subject: Hindsight [analysis.moz_telemetry_doctype_monitor_crash#release] - size
MIME-Version: 1.0
Date: Mon, 24 Apr 2017 03:34:07 +0000
From: <hindsight@pipeline-cep.prod.mozaws.net>
To: AlertRecipients <noreply@example.com>
Content-Type: text/plain; charset="iso-8859-1"
X-Mailer: LuaSocket 3.0-rc1
Message-ID: <0101015b9e060a8d-dffaa9ca-66c1-4eca-9e35-571606054795-000000@us-west-2.amazonses.com>
X-SES-Outgoing: 2017.04.24-54.240.27.113
Feedback-ID: 1.us-west-2.9obwqSuHxAmNPKpejVDo3cEAmnSHOVLO3+B/64gdyXQ=:AmazonSES

Hostname: pipeline-cep.prod.mozaws.net
Pid: 54712

The average message size has changed by 101.995% (current avg: 16812B)

graph: https://pipeline-cep.prod.mozaws.net/dashboard_output/graphs/analysis.moz_telemetry_doctype_monitor_crash.size.html
###

Acceptable? Sadly we won't know what the new average size of the crash ping will actually be until well after the 53 roll-out as the average size continues to increase.

Flags: needinfo?(mtrinkala)
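
For readers unfamiliar with this kind of alert, the check behind a message like the one in comment 5 can be sketched as a percent-change comparison between a baseline average and the current average. This is only an illustrative Python sketch, not the Hindsight plugin's actual logic; the function names, the `threshold_pct` parameter, and its default value are assumptions made for the example.

```python
def percent_change(baseline_avg: float, current_avg: float) -> float:
    """Return the change of current_avg relative to baseline_avg, in percent."""
    if baseline_avg <= 0:
        raise ValueError("baseline_avg must be positive")
    return (current_avg - baseline_avg) / baseline_avg * 100.0


def size_alert(baseline_avg: float, current_avg: float, threshold_pct: float = 25.0):
    """Return an alert message if the average size moved by at least
    threshold_pct percent in either direction, otherwise None.

    threshold_pct is a hypothetical tuning knob, not a value from the bug.
    """
    change = percent_change(baseline_avg, current_avg)
    if abs(change) < threshold_pct:
        return None
    return (
        f"The average message size has changed by {change:.3f}% "
        f"(current avg: {current_avg:.0f}B)"
    )
```

Comparing against `abs(change)` means the sketch flags decreases as well as increases, in line with the "increases (or decreases)" wording in the description.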

Georg Fritzsche [:gfritzsche]

Reporter

Comment 6

8 years ago

This is great already. Would it be hard to get a per-docType monitor going? AFAICT, currently we can't tell which docType is changing based on this monitor alone.

Flags: needinfo?(gfritzsche)
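
To illustrate what a per-docType view means in practice, the sketch below keeps a separate running average of submission size for each docType, so a change can be attributed to a specific ping type. It is a minimal Python sketch under assumed names (`DocTypeSizeMonitor`, `observe`, `averages`); it does not describe how the deployed monitor is implemented.

```python
from collections import defaultdict


class DocTypeSizeMonitor:
    """Hypothetical helper that tracks average submission size per docType."""

    def __init__(self) -> None:
        self._total_bytes = defaultdict(int)  # docType -> sum of sizes seen
        self._count = defaultdict(int)        # docType -> number of submissions

    def observe(self, doc_type: str, size_bytes: int) -> None:
        """Record one incoming submission of the given docType."""
        self._total_bytes[doc_type] += size_bytes
        self._count[doc_type] += 1

    def averages(self) -> dict:
        """Return the average size in bytes for every docType seen so far."""
        return {d: self._total_bytes[d] / self._count[d] for d in self._count}
```

For example, after observing "crash" submissions of 100 and 300 bytes and one "main" submission of 50 bytes, `averages()` reports the two docTypes separately rather than as one combined figure.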

Thomas Huelbert

Updated

7 years ago

Component: Metrics: Pipeline → Monitoring & Alerting

Product: Cloud Services → Data Platform and Tools

Mark Reid [:mreid]

Comment 7

7 years ago

You can monitor this for any configured docType. Trink, where are the monitored doctypes configured?

Flags: needinfo?(mtrinkala)

Mike Trinkala [:trink]

Comment 8

7 years ago

puppet-config https://github.com/mozilla-services/puppet-config/tree/0242c49788057111fed02c439c5029baa27d6616/pipeline/modules/pipeline/templates/hindsight/analysis

Flags: needinfo?(mtrinkala)

:shell escalante

Comment 9

3 years ago

We have a working process to manage size now. Also, legacy telemetry is migrating to Glean in the foreseeable future.

Status: NEW → RESOLVED

Closed: 3 years ago

Resolution: --- → FIXED

Categories

(Data Platform and Tools :: Monitoring & Alerting, defect, P3)
