Narrow down the zero_byte_load probe to tailor results for YSOD
Categories
(Core :: Networking: JAR, task, P2)
Tracking
()
Tracking | Status | |
---|---|---|
firefox87 | --- | fixed |
People
(Reporter: zbraniecki, Assigned: zbraniecki)
References
(Blocks 1 open bug)
Details
(Whiteboard: [necko-triaged])
Attachments
(1 file)
The probe that landed in bug 1693146 returned 1.5 million results on Nightly in a single day. Let's filter out everything we don't care about for now.
Comment 1•5 years ago
I want to carefully narrow down the events we're collecting so that we zero in on the ones most likely to be causing YSOD.
That is a bit of a guessing game until we have correlations, but based on the results I posted in bug 1693711 I believe I can reliably cut ~1.3M of the 1.5M events we got today without losing the data we're hunting for.
I'm going to document what I'm filtering out, both in the code and here, so that we stay aware that we are filtering data and may want to lift a filter later to analyze it for other errors or in correlation with YSOD:
- Remove the "other" category
  Volume: 60% of events
  Why: "other" is dominated by SVG and JSON unrelated to YSOD. It is worth noting that the most common status there is NS_BINDING_ABORTED and not file not found. May be worth investigating separately.
- Remove "FTL" when matched with "NS_ERROR_FILE_NOT_FOUND"
  Volume: 17%
  Why: Fluent L10nRegistry intentionally attempts to load files from the toolkit/browser omni.ja to learn whether the file is present. File not found is an expected outcome of such a test, and we cache it heavily so that we don't fire it multiple times.
  It is not causing YSODs, and even if some of those calls are errors, Fluent will recover, report to the console and display as much as it can without breaking the UI (think CSS style). As a result, it's not worth investigating file not found for it. If other errors show up, I'm keeping them in the probe.
- Remove "JS" when not coming from "omni.ja!"
  Volume: 10%
  Why: JS coming from extensions may be worth investigating by extension authors, but it is not related to our main sources of YSODs.
- Remove "DTD" when starting with "omni.ja!/res/dtd"
  Volume: 7%
  Why: I believe "res/dtd" paths are not interesting for our use case, as they don't cause the most common DTD-related YSODs: neither NO_ELEMENTS nor MISSING_ENTITY (which is located in the first localization DTD callsite, not svg11.dtd style).
  The volume of such missing DTD files is suspicious, and some paths like omni.ja!/chrome/toolkit/content/global/DTD/xhtml1-strict.dtd indicate that we may be constructing wrong paths in some generated files. May be worth investigating separately.
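The four rules above can be sketched as a single predicate. This is only an illustration in Python, not the code from the patch: the `ZeroByteLoadEvent` type, its field names and the `should_report` helper are hypothetical, and it assumes that the recorded path has already been trimmed so that it starts at the archive name (as in the `omni.ja!/...` examples above).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ZeroByteLoadEvent:
    """Hypothetical stand-in for one zero_byte_load telemetry event."""
    file_type: str  # category label, e.g. "other", "FTL", "JS", "DTD"
    status: str     # load status, e.g. "NS_ERROR_FILE_NOT_FOUND"
    path: str       # path assumed to start at the archive name


def should_report(event: ZeroByteLoadEvent) -> bool:
    """Return False for events matched by one of the four filters."""
    # 1. Drop the whole "other" category (mostly SVG and JSON).
    if event.file_type == "other":
        return False
    # 2. Drop FTL only for file-not-found; other FTL errors are kept.
    if event.file_type == "FTL" and event.status == "NS_ERROR_FILE_NOT_FOUND":
        return False
    # 3. Drop JS that does not come from omni.ja (e.g. extensions).
    if event.file_type == "JS" and "omni.ja!" not in event.path:
        return False
    # 4. Drop DTD loads under omni.ja!/res/dtd.
    if event.file_type == "DTD" and event.path.startswith("omni.ja!/res/dtd"):
        return False
    return True
```

Keeping each rule as its own early return mirrors the list above, so a single filter can be lifted later without touching the others.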
Those four filters will remove ~95% of the events, leaving fewer than 70k events per day, which should be much more manageable. If we don't see a strong correlation in that group, we can carefully open up the probe again to let more events through from the filtered groups.