Closed Bug 1635194 Opened 5 years ago Closed 5 years ago

[research] look into dip in process_crash mean times

Categories

(Socorro :: General, task, P2)

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: willkg, Assigned: willkg)

Details

Attachments

(1 file)

The Socorro processor process_crash mean time has been pretty steady at slightly over 6s for a long time. Then around April 21st, it dipped to slightly under 4s with clear day/night cycles. Then on April 30th, the mean started going back up to about 6s.

This bug covers looking into the dip. What happened?

Brian did a graph of crashes processed over the last 30 days. Looks like there's an increase in crashes processed between April 20th and April 30th, then it goes down again.

Maybe the processor was processing more crashes during this period that didn't have minidumps?

In bug #1635164, Gabriele opined that shutdownkill crash reports are steadily increasing. Maybe this is related to that?

Maybe this is Fission-related?

Maybe this is Fenix-related? When Fenix crashes in Java-land, the crash report it sends doesn't have a minidump.

It'd be nice if we had tools to look at crashes by (some filter) by day. I know I was working on something like that with crashstats-tools, but I don't think I finished it.

Grabbing this to look into.

Assignee: nobody → willkg
Status: NEW → ASSIGNED

Looking at the last month of crash report counts, we have this:

date Fenix Fennec FennecAndroid Firefox FirefoxReality Focus GeckoViewExample ReferenceBrowser SeaMonkey Thunderbird
2020-04-04 8259 21 35813 78229 26 1918 0 0 273 19995
2020-04-05 8059 10 36817 77587 36 1806 0 0 243 18496
2020-04-06 8555 8 36224 109945 11 1989 0 0 302 50670
2020-04-07 9231 10 35393 113812 20 2044 0 13 233 48483
2020-04-08 9609 13 35702 111221 33 2109 0 0 251 47321
2020-04-09 10948 8 34964 102698 30 2008 0 1 275 44669
2020-04-10 12331 11 36093 93720 35 2033 0 0 262 36192
2020-04-11 12101 14 35626 77794 25 1928 0 3 234 19380
2020-04-12 11844 12 36468 73830 23 1917 0 3 233 16561
2020-04-13 12100 9 37862 101718 21 2112 0 1 277 34590
2020-04-14 12177 4 36015 112043 40 2513 0 3 316 50639
2020-04-15 14155 6 35844 109062 33 2437 0 2 354 47875
2020-04-16 14069 2 35196 105707 32 2395 0 2 349 47049
2020-04-17 16022 10 34821 103113 29 2385 0 0 256 42944
2020-04-18 17326 14 35850 77277 35 2471 0 2 238 19300
2020-04-19 16419 12 35623 79855 41 2358 0 5 232 18047
2020-04-20 20887 7 35707 113964 58 2312 0 4 308 48999
2020-04-21 36660 12 35293 105699 151 2439 0 4 310 46486
2020-04-22 49339 4 34775 107905 63 2306 0 0 327 47646
2020-04-23 54797 11 34822 103090 144 2316 0 0 328 45790
2020-04-24 67300 13 33646 101181 175 2349 0 3 327 43571
2020-04-25 93088 8 34755 76605 160 2167 1 1 263 18464
2020-04-26 104930 8 34980 74319 148 2353 1 2 294 17843
2020-04-27 114474 17 33101 107444 120 2190 0 5 301 49195
2020-04-28 121653 12 34139 106058 165 2350 0 4 308 48321
2020-04-29 118677 10 33884 107850 136 2238 0 2 314 46651
2020-04-30 111292 6 33307 107102 180 2316 0 5 340 46910
2020-05-01 91442 8 34350 84696 239 2338 0 1 295 26367
2020-05-02 54798 7 33244 73302 155 2107 0 2 261 19463
2020-05-03 41636 6 34102 71861 128 2223 0 2 252 18386
2020-05-04 36904 6 33263 103017 170 2247 0 1 324 47256
2020-05-05 18551 8 15880 52720 77 1143 0 13 149 23683
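
To make the shift easier to see, here is a minimal sketch that computes Fenix's share of each day's crash reports. It assumes the whitespace-delimited layout of the table above; the three rows are copied from it, and the `product_share` helper is a hypothetical name, not part of Socorro or crashstats-tools.

```python
# Header plus three rows copied verbatim from the table above.
TABLE = """\
date Fenix Fennec FennecAndroid Firefox FirefoxReality Focus GeckoViewExample ReferenceBrowser SeaMonkey Thunderbird
2020-04-19 16419 12 35623 79855 41 2358 0 5 232 18047
2020-04-28 121653 12 34139 106058 165 2350 0 4 308 48321
2020-05-04 36904 6 33263 103017 170 2247 0 1 324 47256
"""


def product_share(table_text, product):
    """Return {date: fraction of that day's crash reports from `product`}.

    Hypothetical helper: assumes a header row followed by one row per day,
    with whitespace-separated integer counts after the date column.
    """
    lines = table_text.strip().splitlines()
    header = lines[0].split()
    col = header.index(product)
    shares = {}
    for line in lines[1:]:
        fields = line.split()
        total = sum(int(value) for value in fields[1:])
        shares[fields[0]] = int(fields[col]) / total
    return shares


fenix_share = product_share(TABLE, "Fenix")
for day, share in fenix_share.items():
    print(f"{day} Fenix share: {share:.1%}")
```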

During the date range we're curious about, Fenix sees a dramatic increase in processed crash reports: from about 21,000 on April 20th to a peak of about 122,000 on April 28th. Fenix crash reports won't have a minidump if they're crashes in Java-land, and that means minidump-stackwalk isn't run. That rule accounts for the bulk of processing time, so a spike of crash reports that skip it will drop the mean significantly.
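
The effect on the mean can be sketched as a weighted average of two populations. All numbers below are hypothetical placeholders chosen for illustration, not measurements from Socorro: the per-report times and the shares of reports without minidumps are assumptions.

```python
def mixed_mean(share_without_minidump, secs_without, secs_with):
    """Mean processing time when a fraction of reports skip minidump-stackwalk.

    share_without_minidump: fraction of reports (0..1) with no minidump.
    secs_without / secs_with: mean time per report for each group.
    """
    return (
        share_without_minidump * secs_without
        + (1 - share_without_minidump) * secs_with
    )


# Hypothetical per-report times, for illustration only.
SECS_WITH_STACKWALK = 6.5
SECS_WITHOUT_STACKWALK = 0.5

# Hypothetical shares of reports that have no minidump.
for share in (0.1, 0.4):
    mean_secs = mixed_mean(share, SECS_WITHOUT_STACKWALK, SECS_WITH_STACKWALK)
    print(f"share without minidump {share:.0%}: mean {mean_secs:.2f}s")
```

The overall mean falls as the share of fast, minidump-less reports grows, even though neither group's own processing time changes.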

Ergo, I think Socorro is fine here.

The one thing I think I want to do is split the process_crash time somehow so we're looking at process_crash time for processing that runs minidump-stackwalk separately from process_crash time for processing that doesn't.
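
One way that split could look, as a rough sketch only: record each process_crash duration under a key that says whether minidump-stackwalk ran, then report a mean per key. The `SplitTimings` class and its key names are hypothetical and use only the standard library; this is not how Socorro's metrics are actually emitted.

```python
from collections import defaultdict
from statistics import mean


class SplitTimings:
    """Hypothetical in-memory recorder that keeps two timing populations."""

    def __init__(self):
        self._durations = defaultdict(list)

    def record(self, seconds, ran_stackwalk):
        # Key the duration by whether minidump-stackwalk ran for this report.
        key = "stackwalk" if ran_stackwalk else "no_stackwalk"
        self._durations[key].append(seconds)

    def means(self):
        # One mean per population instead of a single blended mean.
        return {key: mean(values) for key, values in self._durations.items()}


timings = SplitTimings()
timings.record(6.2, ran_stackwalk=True)   # made-up durations
timings.record(6.8, ran_stackwalk=True)
timings.record(0.4, ran_stackwalk=False)
split_means = timings.means()
print(split_means)
```

With the two populations tracked separately, a change in the mix of incoming crash reports would no longer look like a change in processor performance.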

I'm going to defer thinking about that until something like this happens again.

Marking this as FIXED.

Status: ASSIGNED → RESOLVED
Closed: 5 years ago
Resolution: --- → FIXED