Closed
Bug 1337688
Opened 8 years ago
Closed 8 years ago
Remove NIGHTLY_BUILD wrapping if the increased size from adding unloaded modules and process/thread data to minidumps is acceptable
Categories
(Toolkit :: Crash Reporting, defect)
Tracking
RESOLVED
WONTFIX
mozilla55
People
(Reporter: ting, Unassigned)
This is a follow-up to bug 1334027. Once bug 1334027 has landed, we will know how much adding unloaded modules and process/thread data increases the size of minidumps. If the increase is acceptable, we can let the change propagate to the other channels.
Reporter
Comment 1•8 years ago
Marco, I am not sure whom to ask, but do you know how to check the increased size of minidumps from bug 1334027 that we received recently?
Flags: needinfo?(mcastelluccio)
Comment 2•8 years ago
:adrian or :lonnen can probably help (or redirect to somebody that can help).
Flags: needinfo?(mcastelluccio)
Flags: needinfo?(chris.lonnen)
Flags: needinfo?(adrian)
Comment 3•8 years ago
Adrian told me Will can probably help.
Flags: needinfo?(willkg)
Flags: needinfo?(chris.lonnen)
Flags: needinfo?(adrian)
Comment 4•8 years ago
Reading through this bug and bug #1334027, it sounds like you want to know the change in median/95%/max crash report sizes between before bug #1334027 landed and today for crashes from Firefox nightly. Is that correct?
Reporter
Comment 5•8 years ago
(In reply to Will Kahn-Greene [:willkg] ET needinfo? me from comment #4)
> Reading through this bug and bug #1334027, it sounds like you want to know
> the change in median/95%/max crash report sizes between before bug #1334027
> landed and today for crashes from Firefox nightly. Is that correct?

Correct.
Comment 6•8 years ago
We have metrics for overall median/95%/max crash report sizes, where a crash report is the entire breakpad crash report, not just the minidump. Given that this requires data about nightly only, I think I'm going to have to do it by hand. I'll think about how to do that. Maybe build a list of crash ids using a socorro super search and then capture the sizes of the upload_file_minidump files for those crashes. When do you need this data by?
Flags: needinfo?(willkg) → needinfo?(janus926)
Comment 7•8 years ago
Marco said this on IRC:

<marco> it would be more precise by build ID, as in the AFTER period there might be crash reports both from Nightly builds that contain the change and Nightly builds that don't contain the change
<marco> the first build ID with the change that might have increased the size of the minidump is 20170209030214
<marco> so you can compare all crash reports with build IDs < 20170209030214 vs all crash reports with build ID >= 20170209030214

I'll search by build id.
Comment 8•8 years ago
Also, note that the change only affected minidumps from Windows clients, so you should restrict your query to Windows only.
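The before/after comparison described in comments 7 and 8 could be sketched roughly as follows. This is only an illustration, not the script that was actually used: the record layout and the field names `build_id` and `platform` are assumptions made for the example.

```python
# Cutoff build ID quoted from IRC in comment 7.
FIRST_BUILD_WITH_CHANGE = "20170209030214"


def split_by_build_id(reports, cutoff=FIRST_BUILD_WITH_CHANGE):
    """Split Windows crash reports into (before, after) lists around cutoff.

    `reports` is an iterable of dicts with hypothetical keys
    "build_id" and "platform".
    """
    before, after = [], []
    for report in reports:
        # The change only affected minidumps from Windows clients.
        if report["platform"] != "Windows":
            continue
        # Build IDs are fixed-width YYYYMMDDHHMMSS strings, so plain
        # string comparison orders them chronologically.
        if report["build_id"] < cutoff:
            before.append(report)
        else:
            after.append(report)
    return before, after


# Made-up sample records, for illustration only.
sample = [
    {"build_id": "20170208030203", "platform": "Windows"},
    {"build_id": "20170209030214", "platform": "Windows"},
    {"build_id": "20170210030227", "platform": "Linux"},
    {"build_id": "20170211030206", "platform": "Windows"},
]
before, after = split_by_build_id(sample)
```

Splitting on build ID rather than on crash date keeps reports from older Nightly builds, which are still submitted after the change landed, out of the "after" set.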
Reporter
Comment 9•8 years ago
Low priority, so do it when you have time.
Flags: needinfo?(janus926)
Comment 10•8 years ago
I used the following SuperSearch query:

https://gist.github.com/willkg/25e28570fd8c95537dbd7f9e2855c7c8#file-analysis_1337688-py-L129
https://gist.github.com/willkg/25e28570fd8c95537dbd7f9e2855c7c8#file-analysis_1337688-py-L149

Here's the script:

https://gist.github.com/willkg/25e28570fd8c95537dbd7f9e2855c7c8

It does a supersearch query per day for the before build id and the after build id. Then for each crashid, I pulled down the dump. Then I looked at the "before" set and the "after" set and here's the summary:

./before
Number of files: 920
Average size: 383092
Median size: 348876
95% size: 747782
Max size: 3149706

./after
Number of files: 1001
Average size: 764912
Median size: 662433
95% size: 1516906
Max size: 16634443

Please let me know if there are changes in how I did it that you want to see and/or if I messed up the SuperSearch query. Hope this helps!
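For readers who want to reproduce this kind of summary, here is a minimal sketch. It is not the linked script: the function name `summarize` is made up for the example, and the 95% figure uses the nearest-rank definition, which may differ from what the original script did. The second half only divides the figures reported in comment 10.

```python
import statistics


def summarize(sizes):
    """Return count, average, median, 95th percentile and max of file sizes."""
    ordered = sorted(sizes)
    n = len(ordered)
    # Nearest-rank percentile: rank = ceil(0.95 * n), done in integer math.
    rank = (95 * n + 99) // 100
    return {
        "count": n,
        "average": statistics.mean(ordered),
        "median": statistics.median(ordered),
        "p95": ordered[rank - 1],
        "max": ordered[-1],
    }


# Figures copied from comment 10.
reported_before = {"average": 383092, "median": 348876, "p95": 747782, "max": 3149706}
reported_after = {"average": 764912, "median": 662433, "p95": 1516906, "max": 16634443}

# Growth factor of each statistic after the change landed.
ratios = {key: reported_after[key] / reported_before[key] for key in reported_before}
for key, value in ratios.items():
    print(f"{key}: {value:.2f}x")
```

With the reported numbers, the average, median and 95% sizes each come out at roughly twice the "before" value, while the maximum grows by more than five times.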
Comment 11•8 years ago
Do we have criteria for 'acceptable'?

From the perspective of the crash reporter, increasing the size of the minidump (1) increases the risk of a network disconnect during transmission and (2) results in additional load on the collectors.

(2) We scale well in our current infrastructure, and Antenna (the collector rewrite) appears to be constrained on network throughput, so I'm not worried about load on the collectors.

(1) I don't have ideas for quantifying the risk of a disconnect that can be validated in under a week. This may be less important in non-release channels because we can retry or prompt with the doorhanger.
Comment 12•8 years ago
The HTTP POST payload from a breakpad crash report from Windows is uncompressed. Maybe compressing the payloads from Windows can alleviate the concerns?
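As a rough way to gauge how much compression might help, one could gzip a payload and compare sizes. The sketch below is an assumption-laden illustration only: it uses Python's standard `gzip` module rather than the actual crash reporter client code, the sample bytes are synthetic, and nothing in this bug measures how well real minidumps compress or whether the collector accepts compressed uploads.

```python
import gzip


def compress_payload(payload: bytes) -> bytes:
    """Gzip-compress a payload (hypothetical helper for this sketch)."""
    return gzip.compress(payload)


def compression_ratio(payload: bytes) -> float:
    """Return compressed size divided by original size (lower is better)."""
    if not payload:
        raise ValueError("payload must not be empty")
    return len(compress_payload(payload)) / len(payload)


# Synthetic stand-in for a dump: a zeroed region plus a repeating pattern.
sample_payload = b"\x00" * 4096 + bytes(range(256)) * 16
ratio = compression_ratio(sample_payload)
print(f"compressed to {ratio:.1%} of the original size")
```

Running the same measurement over the downloaded "before" and "after" dumps would show whether compression offsets the size increase in practice.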
Reporter
Comment 13•8 years ago
I think it had better stay in Nightly if the size increases ~2x. Ted, what do you think?
Flags: needinfo?(ted)
Comment 14•8 years ago
I agree, it seems like a bit too much to ship in release.
Flags: needinfo?(ted)
Comment 15•8 years ago
Pushed by tchou@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/52512f137a66
Update a TODO comment according to the experimental data. r=me
Comment 16•8 years ago
bugherder
https://hg.mozilla.org/mozilla-central/rev/52512f137a66
Status: NEW → RESOLVED
Closed: 8 years ago
status-firefox55:
--- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla55
Reporter
Updated•8 years ago
Resolution: FIXED → WONTFIX
Reporter
Comment 17•8 years ago
Will, thanks for collecting the numbers. :)