Closed Bug 1916751 Opened 1 year ago Closed 1 year ago

Avoid decimating background process crash reports on release desktop

Categories

(Socorro :: Antenna, enhancement, P2)

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: afranchuk, Assigned: bdanforth)

Details

Attachments

(1 file)

Like the GPU process (bug 1547804), other background processes (rdd, plugin, utility, socket) have very low submission rates (and, generally, low crash rates), so their crash reports should likely always be accepted rather than potentially being decimated by the 10% rule for release Firefox.

There's an is_gpu rule already:

https://github.com/mozilla-services/antenna/blob/68b1d2f16dac6c140a38a7363b106fa722ef83f1/antenna/throttler.py#L458-L465

We could adjust that to be is_background and cover other background process types.
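A minimal sketch of what such a check could look like, assuming the process type arrives as a plain string and that release Firefox is otherwise sampled at 10%. The names `BACKGROUND_PROCESS_TYPES`, `is_background` and `should_accept` are hypothetical and are not the actual rule objects in `antenna/throttler.py`:

```python
import random

# Assumption: these are the process type strings named in this bug.
BACKGROUND_PROCESS_TYPES = frozenset({"gpu", "rdd", "plugin", "utility", "socket"})

# Assumption: the release Firefox rule accepts 10% of reports.
RELEASE_ACCEPT_RATE = 0.10


def is_background(process_type):
    """Return True if the crash came from a background process type."""
    return (process_type or "").lower() in BACKGROUND_PROCESS_TYPES


def should_accept(product, release_channel, process_type, rng=random.random):
    """Hypothetical throttle decision: background rule first, then the 10% rule."""
    if is_background(process_type):
        # Background process reports are rare, so always accept them.
        return True
    if product == "Firefox" and release_channel == "release":
        # Everything else on release Firefox is sampled.
        return rng() < RELEASE_ACCEPT_RATE
    return True
```

The ordering matters in this sketch: the background check has to run before the 10% rule, otherwise those reports would still be sampled.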

Here are crash report counts by process type for the past week on the Firefox 129.0.2 release channel:

$ supersearchfacet --_histogram.date=process_type --relative-range=1w --product=Firefox --release_channel=release --denote-weekends --format=markdown --version=129.0.2
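| histogram_date | content | gpu | parent | plugin | rdd | socket | utility | total |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 2024-08-28 | 7586 | 26 | 5367 | 9 | 14 | 0 | 47 | 13049 |
| 2024-08-29 | 7374 | 29 | 5332 | 16 | 0 | 0 | 36 | 12787 |
| 2024-08-30 | 6915 | 28 | 5422 | 13 | 3 | 0 | 28 | 12409 |
| 2024-08-31 ** | 5617 | 32 | 3812 | 12 | 0 | 0 | 29 | 9502 |
| 2024-09-01 ** | 5672 | 110 | 3763 | 14 | 6 | 0 | 0 | 9565 |
| 2024-09-02 | 7894 | 46 | 5495 | 9 | 0 | 0 | 0 | 13444 |
| 2024-09-03 | 7807 | 22 | 5413 | 25 | 6 | 1 | 81 | 13355 |
| 2024-09-04 | 4880 | 28 | 3417 | 11 | 0 | 0 | 0 | 8336 |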
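To put the "very low submission rates" claim in numbers, here is a rough sketch that sums the counts above per process type. The figures are copied from the histogram; the variable names are only illustrative, and the counts are the reports the collector accepted, so those subject to the 10% rule are already throttled:

```python
# Column order matches the histogram: one tuple of counts per day.
COLUMNS = ("content", "gpu", "parent", "plugin", "rdd", "socket", "utility")

ROWS = {
    "2024-08-28": (7586, 26, 5367, 9, 14, 0, 47),
    "2024-08-29": (7374, 29, 5332, 16, 0, 0, 36),
    "2024-08-30": (6915, 28, 5422, 13, 3, 0, 28),
    "2024-08-31": (5617, 32, 3812, 12, 0, 0, 29),
    "2024-09-01": (5672, 110, 3763, 14, 6, 0, 0),
    "2024-09-02": (7894, 46, 5495, 9, 0, 0, 0),
    "2024-09-03": (7807, 22, 5413, 25, 6, 1, 81),
    "2024-09-04": (4880, 28, 3417, 11, 0, 0, 0),
}

# Sum each column across all days.
totals = {
    name: sum(row[i] for row in ROWS.values())
    for i, name in enumerate(COLUMNS)
}
grand_total = sum(totals.values())

# Background process types that are still subject to the 10% rule.
background = sum(totals[name] for name in ("plugin", "rdd", "socket", "utility"))
share = background / grand_total

print(totals)
print(f"background (non-gpu): {background} of {grand_total} ({share:.2%})")
```

Under these numbers, the plugin, rdd, socket and utility processes together account for well under 1% of the accepted reports in this period.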
Priority: -- → P2
Assignee: nobody → bdanforth
Status: NEW → ASSIGNED

Similar to this comment for removing throttling of GPU process crashes, I wanted to provide an estimate of the expected increase in crash report volume if we no longer throttle crash reports for the (rdd, plugin, utility, socket) processes.

Since we throttle to accept only 10% of release Firefox crashes, we will assume that what we currently accept for these processes represents 10% of the total volume if we removed throttling.

For current release Firefox (130.0.0), there were 2715 crash reports from rdd, plugin and utility processes in the past week[1]. If we assume this week is representative of a typical week, then we can expect around 27k crash reports per week (2715 / 0.10 ≈ 27,150) from these processes once this is deployed in production.
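The estimate can be reproduced with a short sketch. It assumes, as stated above, that the accepted reports are exactly 10% of what would arrive without throttling; the function name is hypothetical:

```python
ACCEPT_RATE = 0.10          # release Firefox throttle rate assumed in this comment
ACCEPTED_LAST_WEEK = 2715   # rdd + plugin + utility reports accepted in the past week


def estimate_unthrottled(accepted, accept_rate):
    """Scale the accepted count up to the volume expected with no throttling."""
    if not 0 < accept_rate <= 1:
        raise ValueError("accept_rate must be in (0, 1]")
    return accepted / accept_rate


estimated = estimate_unthrottled(ACCEPTED_LAST_WEEK, ACCEPT_RATE)
print(f"expected weekly reports without throttling: ~{estimated:,.0f}")
```

This is only a linear extrapolation; it does not account for week-to-week variation or for sampling noise in the accepted count.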

[1]: Interestingly, I did not see any socket process crashes during this period, nor in the past 6 months. Is this expected?

(In reply to Bianca Danforth [:bdanforth] from comment #2)

> [1]: Interestingly, I did not see any socket process crashes during this period, nor in the past 6 months. Is this expected?

I do see some socket crashes in the past week; however, I'm not filtering to 130.0.0, just the release channel. Did you change your filtering when looking at the past 6 months? Nonetheless, they are very infrequent: those I see are in the single digits per day, if any. They are predominantly caused by third-party DLLs injecting into the process and crashing :)

Ah, okay, thanks! I was worried that there were none, but I should have tried different filters (or, in this case, removed some) to widen the search. Glad you are seeing some.

We're hoping to get this change deployed to prod early next week. Once it's deployed, we'll add a comment to this ticket referencing the deploy bug.

This was deployed in bug 1919125 tag v2024.09.16.

In the 50 minutes since the deploy, there have been almost 100 crash reports for these background processes, compared to 4 in the 50 minutes before the deploy. That's on the order of the expected 10x increase, so it looks to be working!

Status: ASSIGNED → RESOLVED
Closed: 1 year ago
Resolution: --- → FIXED