Closed Bug 1762949 Opened 2 years ago Closed 2 years ago

autosubmit should use a different annotation than Throttleable

Categories

(Toolkit :: Crash Reporting, defect)

RESOLVED FIXED
101 Branch
Tracking Status
firefox101 --- fixed

People

(Reporter: willkg, Assigned: gsvelto)

Attachments

(1 file)

We've had multiple spikes in crash reporting where one or two people manage to send tens of thousands of crash reports in under an hour because of some crash loop and because they have autosubmit enabled.

When a crash report is submitted with autosubmit enabled, the Throttleable annotation is set to 0. The crash ingestion collector doesn't throttle this crash report and instead always accepts it.
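The behaviour described above can be sketched roughly as follows. This is a minimal illustration, not Socorro's actual collector code: the function name `should_accept`, the `sample_rate` parameter, and the injectable `rng` are all assumptions made for the example.

```python
import random


def should_accept(annotations, sample_rate=0.1, rng=random.random):
    """Decide whether a hypothetical collector keeps a crash report.

    `annotations` is a dict of crash annotation names to string values.
    `sample_rate` and `rng` are illustrative assumptions, not real
    Socorro settings.
    """
    # A report marked Throttleable=0 bypasses sampling and is always kept.
    if annotations.get("Throttleable") == "0":
        return True
    # Every other report is kept only with probability `sample_rate`.
    return rng() < sample_rate
```

Under this model, a crash loop on a client with autosubmit enabled produces reports that all take the first branch, so tens of thousands of them are accepted no matter how low the sampling rate is set.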

The problem here is twofold:

  1. Bugs in Firefox and other products can overwhelm crash ingestion
  2. Throttleable=0 is no longer a clear indicator that the user has manually submitted their crash report

I think we should add a new AutosubmitEnabled annotation. Crash reports submitted because autosubmit is enabled should be marked as AutosubmitEnabled=1 and have no Throttleable flag at all.

Another alternative is that Socorro starts ignoring the Throttleable annotation. Maybe we don't need to support users manually submitting crash reports?

Rather than changing Throttleable, wouldn't it be better for me to implement bug 1702509? I know, I should have done it a while ago, but it's never too late. With that work done, auto-submitted crash reports would be clearly flagged, which should enable better filtering and throttling (for example, by never discarding crashes that have been explicitly sent from about:crashes).

Flags: needinfo?(willkg)

Possibly. My issue here is that Throttleable=0 is being applied to crash reports automatically, whereas it was only supposed to apply to crash reports that Socorro can't throttle because the user submitted them and is planning to follow up on them. So unless the fix in bug #1702509 also covers fixing submission so that Throttleable=0 only happens when a user really needs the crash report to not be throttled, I think we need to fix this issue separately.

If you look at the 44,000 crash reports that were autosubmitted because of that rdd thing, they all have the Throttleable=0 annotation:

https://crash-stats.mozilla.org/report/index/8fbb7722-5355-470c-9a04-309c40220404#tab-annotations

I don't think that's what we intended with the Throttleable annotation. [1]

However, I'm game for changing things up. All I need is a clear, seldom-used signal that Socorro should not throttle a crash report. That information sort of overlaps with the AboutCrashes value of SubmittedFrom, but if we conflate that value with "this crash report can't get throttled", I think that creates problems down the road when/if we change the values or other things pop up.

[1] Throttleable isn't documented in CrashAnnotations.yaml, but the description in Socorro is "Whether the crash report was throttleable when submitted."

Flags: needinfo?(willkg)

Good point. I should definitely "roll back" Throttleable to its original meaning, which could happen in this bug, and leave bug 1702509 for the SubmittedFrom annotation. Once both things are in place we should be able to handle things in a sane way. BTW, my suggestion to not throttle crashes submitted from about:crashes was just an example of how we could use SubmittedFrom without abusing Throttleable, not a concrete proposal.

Assignee: nobody → gsvelto
Status: NEW → ASSIGNED

Just to be sure I'm doing things as intended: bug 1702509 adds the annotation about where a crash is coming from (and removes SubmittedFromInfobar, which becomes redundant), and this bug adds changes and tests ensuring that the following always applies:

  • Reports submitted from a crashed tab always have Throttleable set to 1
  • Reports submitted from the infobar always have Throttleable set to 1
  • Reports that were automatically submitted always have Throttleable set to 1
  • Reports that were manually sent from about:crashes always have Throttleable set to 0
Flags: needinfo?(willkg)
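The rule in the list above reduces to "only reports manually sent from about:crashes are exempt from throttling". A minimal sketch of that rule follows. It is not the actual Firefox patch: the function names and the source labels ("about:crashes", "infobar", "autosubmit") are hypothetical, chosen only for illustration.

```python
# Hypothetical label for the one submission path that is exempt from
# throttling; the real client code may identify it differently.
MANUAL_SOURCE = "about:crashes"


def throttleable_value(source):
    """Return the Throttleable annotation value for a submission source."""
    # Only a manual submission from about:crashes is marked non-throttleable.
    return "0" if source == MANUAL_SOURCE else "1"


def annotate(annotations, source):
    """Return a copy of `annotations` with Throttleable set explicitly."""
    updated = dict(annotations)
    updated["Throttleable"] = throttleable_value(source)
    return updated
```

Setting the annotation explicitly in every case, rather than omitting it for throttleable reports, means the receiving side never has to guess what a missing value was meant to convey.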

That looks good to me! Thank you!

Flags: needinfo?(willkg)

Comment on attachment 9271303 [details]
Bug 1762949 - Explicitly flag all crash reports as throttleable when they've not been submitted manually r=KrisWright

  1. What questions will you answer with this data?

This data is used by the Socorro service to decide whether a crash can be selectively discarded. In short, this answers the question "was this crash report directly sent by a user or not?"

  1. Why does Mozilla need to answer these questions? Are there benefits for users? Do we need this information to address product or business requirements?

To be able to throttle the number of crash reports we process, conserving resources.

  1. What alternative methods did you consider to answer these questions? Why were they not sufficient?

None. This annotation has been part of the crash reporting flow for a long time, predating our current data collection practices. We have always relied on it.

  1. Can current instrumentation answer these questions?

Yes, in the sense that the annotation was already present; it was just not documented.

  1. List all proposed measurements and indicate the category of data collection for each measurement, using the Firefox data collection categories found on the Mozilla wiki.

| Measurement Description | Data Collection Category | Tracking Bug # |
| --- | --- | --- |
| Throttleable annotation | Category 1: Technical data | 1762949 |

  1. Please provide a link to the documentation for this data collection which describes the ultimate data set in a public, complete, and accurate way.

The annotation description is provided in CrashAnnotations.yaml

  1. How long will this data be collected? Choose one of the following:

This annotation is permanently needed

  1. What populations will you measure?

All Firefox users reporting crashes

  1. If this data collection is default on, what is the opt-out mechanism for users?

As with everything in crash reporting, this is strictly opt-in; being opted out is the default in this case.

  1. Please provide a general description of how you will analyze this data.

This annotation is used by Socorro to decide whether to consider a crash for processing or not.

  1. Where do you intend to share the results of your analysis?

The results are directly visible on crash-stats.mozilla.org in the form of the reports that have not been throttled.

  1. Is there a third-party tool (i.e. not Glean or Telemetry) that you are proposing to use for this data collection? If so:

No, this uses the normal crash reporting pipeline and is not included in telemetry pings.

Attachment #9271303 - Flags: data-review?(willkg)

Comment on attachment 9271303 [details]
Bug 1762949 - Explicitly flag all crash reports as throttleable when they've not been submitted manually r=KrisWright

  1. Is there or will there be documentation that describes the schema for the ultimate data set available publicly, complete and accurate?

Yes--CrashAnnotations.yaml.

  1. Is there a control mechanism that allows the user to turn the data collection on and off?

Crash report data is default opt-out.

  1. If the request is for permanent data collection, is there someone who will monitor the data over time?

This is not necessary for crash report data.

  1. Using the category system of data types on the Mozilla wiki, what collection type of data do the requested measurements fall under?

Category 1.

  1. Is the data collection request for default-on or default-off?

It's default-off.

  1. Does the instrumentation include the addition of any new identifiers?

No.

  1. Is the data collection covered by the existing Firefox privacy notice?

Yes.

  1. Does there need to be a check-in in the future to determine whether to renew the data?

No.

  1. Does the data collection use a third-party collection tool?

No.

Attachment #9271303 - Flags: data-review?(willkg) → data-review+

The severity field is not set for this bug.
:gsvelto, could you have a look please?

For more information, please visit auto_nag documentation.

Flags: needinfo?(gsvelto)
Severity: -- → S3
Flags: needinfo?(gsvelto)
Pushed by gsvelto@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/0f2484eac83c
Explicitly flag all crash reports as throttleable when they've not been submitted manually r=KrisWright
Status: ASSIGNED → RESOLVED
Closed: 2 years ago
Resolution: --- → FIXED
Target Milestone: --- → 101 Branch

Because this affects whether or not Socorro processes crash reports, it will probably affect crash report counts. It's probably worth sending a heads-up email to the crash-reporting-wg and stability mailing lists. If you can't get to it, I can throw something together.

Flags: needinfo?(gsvelto)

Very good point, I'll do it right away.

Flags: needinfo?(gsvelto)