Closed Bug 1250716 Opened 8 years ago Closed 8 years ago

Compare crash submission rates for e10s vs. non-e10s

Categories

(Toolkit :: Telemetry, defect, P4)

RESOLVED FIXED
Tracking Status
e10s + ---

People

(Reporter: billm, Unassigned)

When comparing how often a particular signature occurs in e10s and non-e10s, it would be nice to know if content process crashes are submitted more or less often than main process crashes. The UI is different and it wouldn't be surprising if the rates are different. Can we do a telemetry analysis of this using the data from the 45 beta no-addon experiment? poiru, is this something you could do?
Flags: needinfo?(birunthan)
(In reply to Bill McCloskey (:billm) from comment #0)
> When comparing how often a particular signature occurs in e10s and non-e10s,
> it would be nice to know if content process crashes are submitted more or
> less often than main process crashes. The UI is different and it wouldn't be
> surprising if the rates are different. Can we do a telemetry analysis of
> this using the data from the 45 beta no-addon experiment? poiru, is this
> something you could do?

How do we track the submission rate?
Flags: needinfo?(birunthan)
Hm, I thought we had Telemetry probes for this, but looking at Histograms.json, I don't see any good candidates...

ted, do you know if we have any way of knowing how many times the main-process crash reporter dialog is shown, but a crash report is _not_ sent? Glancing at the client code, I don't see us recording that anywhere, but I wanted to be sure...
Flags: needinfo?(ted)
We know from monitoring that content process crashes are submitted at a rate of about 10%. Main-process crashes are submitted at a rate of 50-60%.

PROCESS_CRASH_SUBMIT_ATTEMPT records attempts for "main", "content", and "plugin" so you can compare that with the crash rate to generate a submission rate.
Yeah, I don't know the telemetry bits well, bsmedberg or gfritzsche are better people to ask.
Flags: needinfo?(ted)
tracking-e10s: --- → +
Priority: -- → P4
(In reply to Benjamin Smedberg  [:bsmedberg] from comment #3)
> PROCESS_CRASH_SUBMIT_ATTEMPT records attempts for "main", "content", and
> "plugin" so you can compare that with the crash rate to generate a
> submission rate.

I added this to the stability analysis: https://gist.github.com/poiru/e7b9b89282179b17125f

                                      non-e10s       e10s
usage hours                               5419       4774
chrome crashes                           73078      33801
content crashes                          11797      76223
plugin crashes                           42173      56263
main crash rate                          13.49       7.08
main+content crash rate                  15.66      23.04
plugin crash rate                         7.78      11.78
submit attempts per main crash            0.19       0.17
submit attempts per content crash         0.00       0.12

The numbers are <1 so it seems like we don't record PROCESS_CRASH_SUBMIT_ATTEMPT for all crashes (or we're somehow overcounting crashes).
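
For reference, the rate rows in the table are consistent with dividing each crash count by the "usage hours" figure in the same column. The sketch below reproduces that arithmetic; it is not taken from the gist. The function names are made up for illustration, and the submit-attempt count in the usage example is a placeholder rather than real data, since the table does not list raw attempt counts.

```python
def crash_rate(crashes, usage_hours):
    """Crashes per unit of usage, using the table's 'usage hours' figure as-is."""
    return crashes / usage_hours


def submit_attempts_per_crash(attempts, crashes):
    """Ratio of recorded submit attempts to crashes; None if there were no crashes."""
    if crashes == 0:
        return None
    return attempts / crashes


# Crash counts and usage hours copied from the table above.
non_e10s = {"usage_hours": 5419, "chrome": 73078, "content": 11797, "plugin": 42173}
e10s = {"usage_hours": 4774, "chrome": 33801, "content": 76223, "plugin": 56263}

main_rate_non_e10s = crash_rate(non_e10s["chrome"], non_e10s["usage_hours"])
main_rate_e10s = crash_rate(e10s["chrome"], e10s["usage_hours"])
main_content_rate_non_e10s = crash_rate(
    non_e10s["chrome"] + non_e10s["content"], non_e10s["usage_hours"]
)

# Placeholder attempt count (hypothetical, not from the experiment data).
example_ratio = submit_attempts_per_crash(0, non_e10s["content"])
```

A ratio below 1 from `submit_attempts_per_crash` means fewer submit attempts were recorded than crashes.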
    "payload/keyedHistograms/PROCESS_CRASH_SUBMIT_ATTEMPT/main-crash",
    "payload/keyedHistograms/PROCESS_CRASH_SUBMIT_ATTEMPT/content-crash"])

These are incorrect. It should be "main" and "content". I'm surprised/concerned that you're getting results for the other two keys.
(In reply to Benjamin Smedberg  [:bsmedberg] from comment #6)
> These are incorrect. It should be "main" and "content". I'm
> surprised/concerned that you're getting results for the other two keys.

I tried "main" and "content" first and was surprised to see that they didn't work. It turns out the keys are represented as PROCESS_TYPE-CRASH_TYPE: https://dxr.mozilla.org/mozilla-central/rev/4ea7408b3eef059aa248f4b00328f8fdb4475112/toolkit/components/crashes/CrashManager.jsm#1081
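
A minimal sketch of how the histogram paths follow from that key format, assuming the crash type component is the literal string "crash" as in the paths quoted above. The helper name is hypothetical and is not part of the gist or of CrashManager.jsm.

```python
HISTOGRAM = "PROCESS_CRASH_SUBMIT_ATTEMPT"


def submit_attempt_path(process_type, crash_type="crash"):
    """Build the keyed-histogram path for a PROCESS_TYPE-CRASH_TYPE key."""
    key = f"{process_type}-{crash_type}"
    return f"payload/keyedHistograms/{HISTOGRAM}/{key}"


# The keys are "main-crash" and "content-crash", not plain "main" and "content".
paths = [submit_attempt_path(p) for p in ("main", "content")]
```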
You're right, the user story from the original bug was wrong.
Those submission rates for main crashes seem low (on both sides). I wonder why that is. I'd expect something close to 40-50% instead of 20%.
We know there's a difference, and we're comfortable living with it.
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED