Closed Bug 836753 Opened 12 years ago Closed 11 years ago

Add "weight" field to crash reports allowing to compensate for throttling

Categories

(Socorro :: Backend, task)

x86_64
Linux
task
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: bjacob, Assigned: lars)

Details

Because of the 10% throttling of the Firefox release channel, when we try to manually extract statistical information from crash reports (say from CSV files) we need to avoid a pitfall: by default, non-release channel reports are given 10x more weight than release channel reports. And when making cross-OS stats, mobile reports are also given 10x more weight than desktop reports. Currently I can compensate for it manually by looking at the throttling factors here: https://github.com/mozilla/socorro/blob/master/scripts/config/collectorconfig.py.dist#L138 However, it would be much more convenient and less error-prone if each crash report carried a field indicating the throttling factor that was used for it. Or maybe the inverse of that factor, which could be thought of as a 'weight'. (Laura suggested I assign to Lars; triage as you wish).
I've made a prototype implementation of this. Rather than converting the throttle rate to a new value, I decided to be simple-minded about it and just echo the 'throttle_rate' of the 'throttle_condition' that matched for that raw crash. A new field has been added to the raw_crash called 'throttle_rate'. The 'throttle_rate' is a number between 0 (always reject) and 100 (always accept). For example, production Firefox uses a 'throttle_rate' of 10, meaning 10% of crashes are accepted. To reiterate, this code is a prototype, and while there is a pull request for it to be included in Socorro, there has been no discussion of its merits by the dev team. https://github.com/mozilla/socorro/pull/1262
Target Milestone: --- → 50
I need some feedback about this change before we commit it to the codebase. In testing my patch, I realized that it might not be as simple as fetching the throttle rate from the matching rule. Remember there are rules that have nothing to do with product and versioning. Say we have two crashes, both for some theoretical released version of Firefox 68. The first gets a "throttle_rate" of 10 because, by default, that's how we throttle releases. The second, however, gets a "throttle_rate" of 100 because the matching rule was not the release rule, but instead was the process-all-crashes-with-comments rule. Does that constitute a fatal problem for this scheme?
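The two-crash scenario above can be sketched as a first-match rule lookup. This is only an illustration, not the Socorro throttler: the rule list, the predicate shape, the helper name `matching_throttle_rate`, and the raw-crash keys (`Comments`, `ProductName`, `ReleaseChannel`) are all assumptions made for the example.

```python
from typing import Callable, List, Optional, Tuple

# Hypothetical rule shape: (predicate on the raw crash, throttle rate in percent).
Rule = Tuple[Callable[[dict], bool], int]

# Illustrative rules only; the real rules live in the collector configuration.
RULES: List[Rule] = [
    # Process every crash that carries a user comment.
    (lambda crash: bool(crash.get("Comments")), 100),
    # Accept 10% of release-channel Firefox crashes.
    (
        lambda crash: crash.get("ProductName") == "Firefox"
        and crash.get("ReleaseChannel") == "release",
        10,
    ),
    # Accept everything else.
    (lambda crash: True, 100),
]


def matching_throttle_rate(crash: dict, rules: List[Rule] = RULES) -> Optional[int]:
    """Return the rate of the first rule whose predicate matches, or None."""
    for predicate, rate in rules:
        if predicate(crash):
            return rate
    return None


# Two crashes from the same release build end up with different rates.
plain = {"ProductName": "Firefox", "ReleaseChannel": "release"}
commented = dict(plain, Comments="it crashed when I opened a tab")
```

Under these assumed rules, `plain` matches the release rule and `commented` matches the comment rule, so the echoed 'throttle_rate' differs even though product and version are identical.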
What is the proportion of crashes with a comment on them? If the proportion of crashes with a comment on them is much smaller than 10%, then even the factor of 10 in the throttle_rate won't allow crashes with comments to skew statistics obtained in this way, so this is not a big deal. If the proportion of crashes with a comment on them is high (which would be surprising to me!) then this may still not be a fatal problem: commented crashes _are_ particularly interesting, so there is still value in statistics that are biased towards them. But it's definitely good to know at least how much we'll be biased, i.e. what the proportion of commented crashes is.
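To put a rough number on that bias, here is a back-of-the-envelope sketch. It assumes only two rates are in play (10 for ordinary release crashes, 100 for commented ones) and that every processed report is counted as 1; the function name and parameters are hypothetical.

```python
def commented_share_after_throttling(p, base_rate=10, comment_rate=100):
    """Share of processed reports that carry a comment.

    p is the fraction of submitted crashes that have a comment (0..1).
    Rates are percentages, as in the 'throttle_rate' field.
    """
    kept_commented = p * comment_rate / 100.0
    kept_plain = (1.0 - p) * base_rate / 100.0
    kept_total = kept_commented + kept_plain
    if kept_total == 0:
        return 0.0  # nothing was accepted at all
    return kept_commented / kept_total


# If 1% of submitted crashes have comments, commented crashes make up
# roughly 9% of the processed set; at 10% they make up about half.
share_at_1_percent = commented_share_after_throttling(0.01)
share_at_10_percent = commented_share_after_throttling(0.10)
```

The smaller the submitted fraction, the less commented crashes can dominate the processed set in absolute terms, even though they are over-represented by about the throttling factor.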
(In reply to K Lars Lohn [:lars] [:klohn] from comment #2)
> Does that constitute a fatal problem for this scheme?

Actually, I'd say that's exactly why we prefer to *have* this field that tells us what throttling rate was applied to this crash. With it, we can do better calculations of the total crash rates, because we can add up 100/throttle_rate over all reports to form a total crashes number, instead of counting every crash as 1 and then multiplying Firefox release numbers by 10 "because most of those use 10% throttling" as we do now.
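The summation described above might look like the following minimal sketch. It assumes reports are available as dicts with a 'throttle_rate' key; treating a missing field as 100 (weight 1) and skipping non-positive rates are choices made for this example, not something the bug specifies.

```python
def estimated_total_crashes(reports):
    """Estimate submitted crashes by weighting each report by 100/throttle_rate."""
    total = 0.0
    for report in reports:
        # Assumption: a report without the field was not throttled.
        rate = report.get("throttle_rate", 100)
        if rate <= 0:
            # A rate of 0 means "always reject", so no weight can be derived.
            continue
        total += 100.0 / rate
    return total


# Two reports accepted at 10% stand for about 10 crashes each;
# one report accepted at 100% stands for itself.
sample = [
    {"throttle_rate": 10},
    {"throttle_rate": 10},
    {"throttle_rate": 100},
]
estimate = estimated_total_crashes(sample)
```

Because the weight comes from the rule that actually matched, a commented release crash (rate 100) is counted once rather than being multiplied by 10 along with the rest of the release channel.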
Commit pushed to master at https://github.com/mozilla/socorro

https://github.com/mozilla/socorro/commit/75d4b0837144d17b23e4dfdae9772b2dd4f97431
Merge pull request #1262 from twobraids/report_throttle
fixes Bug 836753: added 'throttle_rate' to raw crash
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED