Closed Bug 1249343 Opened 8 years ago Closed 3 years ago

Add budget monitoring for "core" pings

Categories

(Firefox for Android Graveyard :: General, defect, P5)

All
Android
defect

Tracking

(Not tracked)

RESOLVED INCOMPLETE

People

(Reporter: mcomella, Unassigned)

The "core" ping is intended to be sent every time `onStart` is called so we should focus on keeping the core ping tiny in size. In order to prevent regressions to this, we could specify a maximum ping size (e.g. 2 kB?) and verify that our pings don't exceed that size.

Ideally, this would be an analysis done on all (or a sample of) incoming data that verifies that the pings don't exceed a certain size. We'd get an email (or an automatic bug filed) if a ping exceeds the specified amount and we can look into it.
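
A minimal sketch of such a check over a sample of pings, assuming each ping is available as a parsed JSON object. The 2048-byte limit is only the tentative "2 kB?" figure from above, and the names `ping_size_bytes` and `find_oversized` are hypothetical, not part of any existing pipeline:

```python
import json

# Assumption: the tentative "2 kB?" budget floated above, not an agreed limit.
MAX_PING_BYTES = 2048


def ping_size_bytes(ping):
    """Return the size of a ping serialized as compact UTF-8 JSON."""
    return len(json.dumps(ping, separators=(",", ":")).encode("utf-8"))


def find_oversized(pings, limit=MAX_PING_BYTES):
    """Return (index, size) for every ping in the sample larger than limit."""
    oversized = []
    for index, ping in enumerate(pings):
        size = ping_size_bytes(ping)
        if size > limit:
            oversized.append((index, size))
    return oversized
```

A non-empty result from `find_oversized` would be the trigger for the email or automatically filed bug.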

If that's not feasible, we could write a test in Java that estimates the size of fields and roughly verifies that we don't exceed the specified amount when we add new fields. However, it'd be difficult to check for accidental regressions in existing fields (e.g. the list of active experiments grows very long).

Finkle, what do you think?
Flags: needinfo?(mark.finkle)
I think server-side monitoring would be needed (that also covers "real" clients, not just our specific test setups).
We already have a budget-monitoring dashboard for "main" and "saved-session" pings [0]; we could probably add data for "core" pings there?

mreid, is that feasible? Do we have any automated monitoring plans?

0: https://metrics.services.mozilla.com/telemetry-budget-dashboard/
Flags: needinfo?(mreid)
Yes, that's definitely feasible.

We'd need to make several changes, but it does seem worthwhile to monitor the size of incoming pings.

The code that computes the aggregates for the budget dashboard also fires alerts if we exceed our estimates. Those alerts could be extended to do some detection of increases (or decreases) in size or volume of submissions.
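
As a rough illustration of what detecting increases or decreases could look like, here is a sketch that compares the mean ping size of the current period against a baseline period. This is not the existing alerting code; the function name and the 20% threshold are assumptions chosen for the example:

```python
from statistics import mean


def size_change_alert(baseline_sizes, current_sizes, threshold=0.2):
    """Return an alert message if the mean size moved by more than threshold.

    Sizes are in bytes. The threshold is a relative change (0.2 means 20%),
    an arbitrary value picked for this sketch. Returns None when the change
    stays within the threshold.
    """
    base = mean(baseline_sizes)
    current = mean(current_sizes)
    change = (current - base) / base
    if abs(change) > threshold:
        direction = "increase" if change > 0 else "decrease"
        return f"{direction} of {abs(change):.0%} in mean ping size"
    return None
```

The same comparison could be applied to submission counts instead of sizes to watch volume.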

Another possibility is to implement a real-time(ish) filter to monitor this on the fly.
Flags: needinfo?(mreid)
I filed bug 1252050 on the monitoring alerts.

Do we have any idea for a size budget for "core" pings already?
Currently we are at around 600-800 bytes on beta, but we don't have session length info etc. in yet, so I assume we'll have to come back to this later.
Blocks: 1251614
No longer blocks: ut-android
I'm morphing this to be about the budget monitoring.
Let's request that once we can specify an upper bound on the "core" ping sizes.
Summary: Consider writing a test or analysis for maximum core ping size → Add budget monitoring for "core" pings
Server-side budget monitoring seems like the right approach.
Flags: needinfo?(mark.finkle)
The repos for this are:
https://github.com/mozilla/telemetry-budget-dashboard
https://github.com/mozilla-services/data-pipeline/tree/master/reports/budget

mreid notes, though, that the "payload size" messages currently only cover main|saved-session|other:
https://github.com/mozilla-services/data-pipeline/blob/master/heka/sandbox/filters/dollars.lua#L132-L134
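
To illustrate the kind of change this implies, here is a sketch of bucketing payload sizes by document type with "core" added as its own bucket. It is written in Python for illustration only and does not reproduce the Lua filter linked above; `TRACKED_DOC_TYPES`, `bucket_for` and `total_payload_sizes` are hypothetical names:

```python
from collections import defaultdict

# "main" and "saved-session" are the buckets mentioned above;
# "core" is the proposed addition. Everything else falls into "other".
TRACKED_DOC_TYPES = ("main", "saved-session", "core")


def bucket_for(doc_type):
    """Map a document type to its payload-size bucket."""
    return doc_type if doc_type in TRACKED_DOC_TYPES else "other"


def total_payload_sizes(messages):
    """Sum payload sizes per bucket from (doc_type, size_in_bytes) pairs."""
    totals = defaultdict(int)
    for doc_type, size in messages:
        totals[bucket_for(doc_type)] += size
    return dict(totals)
```

Without the extra bucket, "core" pings would be counted under "other" and could not be budgeted separately.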
Priority: -- → P3
Whiteboard: [measurement:client]
Re-triaging per https://bugzilla.mozilla.org/show_bug.cgi?id=1473195

Needinfo :susheel if you think this bug should be re-triaged.
Priority: P3 → P5
We have completed the launch of our new Firefox on Android. Development of the new versions uses GitHub for issue tracking. If the bug report still reproduces in a current version of [Firefox on Android nightly](https://play.google.com/store/apps/details?id=org.mozilla.fenix), an issue can be reported at the [Fenix GitHub project](https://github.com/mozilla-mobile/fenix/). If you want to discuss your report, please use [Mozilla's chat](https://wiki.mozilla.org/Matrix#Connect_to_Matrix) server https://chat.mozilla.org and join the [#fenix](https://chat.mozilla.org/#/room/#fenix:mozilla.org) channel.
Status: NEW → RESOLVED
Closed: 3 years ago
Resolution: --- → INCOMPLETE
Product: Firefox for Android → Firefox for Android Graveyard