1.04 KB, patch
Benjamin Smedberg: review+
|Details | Diff | Splinter Review|
1.07 KB, patch
Benjamin Smedberg: review+
|Details | Diff | Splinter Review|
This bug was filed from the Socorro interface and is report bp-e5e141e8-fc0d-4852-a2ef-1e17f2150122. ============================================================= We've been seeing OOM|large crashes with mostly ~2M allocations in topcrash ranges of 36 and higher, see for example https://crash-stats.mozilla.com/signature/?build_id=20150120155007&product=Firefox&release_channel=beta&version=36.0&hang_type=crash&signature=OOM+|+large+|+mozalloc_abort%28char+const*+const%29+|+mozalloc_handle_oom%28unsigned+int%29+|+moz_xmalloc+|+xul.dll%400xd3956+|+xul.dll%400x11c648c+|+xul.dll%400x11327b1+|+xul.dll%400x112e90d+|+xul.dll%400x112e730+|+xul.dll%400x1541f5+|+xul.dll%400x158795+|+xul.dll%400xdf784+...&_columns=date&_columns=product&_columns=version&_columns=build_id&_columns=platform&_columns=reason&_columns=address&page=1 I feel like that's an OOM|large that makes us unable to actually map xul.dll and not get a debug ID for it, see bp-e5e141e8-fc0d-4852-a2ef-1e17f2150122 as one example.
xul.dll has grown by 8 megs between 34 and 36, mostly due to the DLL merges in bug 609976 and bug 922912. We likely need to update ReserveBreakpadVM's kReserveSize. (I bet OOM|small is equally affected, but since Socorro doesn't put stack frames in those signatures, we don't notice)
Bug 1028972 comment 15: "This ought to buy us a year and change. Here's hoping we'll revisit this for unified xul before then." Well, at least that came true :-)
[Tracking Requested - why for this release]: This causes a lot of useless crash signatures on 36+
Created attachment 8553365 [details] [diff] [review] Adjust Breakpad reservation for xul.dll inflation Increase by 10MB to cover recent growth plus a little future breathing room. (For anyone following along, this patch is not intended to fix the crashes, but it should turn the "xul.dll@0x1234" signatures into something more actionable)
Some of the signatures I checked had lots of youtube URLs. That makes the importance of trying to figure these crashes out all the more important for MSE. Adding Anthony and Sheila to the bug as we will need to discuss this next week.
Important crash, tracking.
David, can you please request uplift to aurora and beta for this ASAP? As you know, we have those crashes on beta that we need data for and then need to react to that, so the sooner we can get this there, the better.
Comment on attachment 8553365 [details] [diff] [review] Adjust Breakpad reservation for xul.dll inflation [Triage Comment] I am going to bypass the procedure to make sure we have it in beta 4. David, hope you don't mind!
https://hg.mozilla.org/integration/mozilla-inbound/rev/8aa2406244bc https://hg.mozilla.org/releases/mozilla-aurora/rev/94ecdce16831 https://hg.mozilla.org/releases/mozilla-beta/rev/59aa16cfd49f
Given that it landed on the branches, clearing the ni?
Agreed re the uplifts, thanks for jumping in!
From what I see, it's substantially lower volume, but we still have reports like this in 36.0b4 even with this patch: https://crash-stats.mozilla.com/report/list?signature=OOM%20%7C%20large%20%7C%20mozalloc_abort%28char%20const%2A%20const%29%20%7C%20mozalloc_handle_oom%28unsigned%20int%29%20%7C%20moz_xmalloc%20%7C%20xul.dll%400x278d96%20%7C%20xul.dll%400x7bfa98%20%7C%20xul.dll%400x10f419a%20%7C%20xul.dll%400x10f0227%20%7C%20xul.dll%400x10f004a%20%7C%20xul.dll%400x1e8527%20%7C%20xul.dll%400x10867e%20%7C%20xul.dll%400x286684%2E%2E%2E and https://crash-stats.mozilla.com/report/list?signature=xul.dll%400x31f195%20%7C%20xul.dll%400x54561%20%7C%20xul.dll%400x56dd3%20%7C%20xul.dll%400x3f66c3%20%7C%20xul.dll%400x3f697a%20%7C%20xul.dll%400x603795%20%7C%20xul.dll%400x1a47e43
(In reply to Robert Kaiser (:email@example.com) from comment #13) > From what I see, it's substantially lower volume, but we still have reports > like this in 36.0b4 even with this patch: Yes, I was just going to mention that as well, http://bit.ly/1tyKWex shows almost all youtube.com URLs which concerns me a bit since this patch made into that build ID.
David: Is there anything in particular we could tweak to get the remaining signatures to show more information? The concern from the MSE side would be that many URLs show youtube, and it would great to find out what the root cause of those remaining crashes is. Thanks.
The previous patch really should have worked, but I guess we can try increasing the value even higher. In the meantime, WinDbg can make sense of the dump files if you already have a PDB cached for that xul version. If you have a list of the top N bogus signatures that you care about, I can go and open bugs with the correct stacks.
Created attachment 8556794 [details] [diff] [review] If at first you don't succeed, use more memory
It looks as if it may be too late to get this in B5 since I already see build candidates - ni on Sylvestre to see the possibilities. Sylvestre we see crashes in the latest B4 explosive report that we would like to get a better stack for, so getting this patch into Beta 5 may be important for the MSE decision.
Not sure we should postpone beta 5. Even if we start a build2 right now, we won't have the QE sign off from SV and QE before next Monday. Beta 6 is going to build next Monday, should be released next Tuesday.
Comment on attachment 8556794 [details] [diff] [review] If at first you don't succeed, use more memory Approval Request Comment [Feature/regressing bug #]: xul.dll growth, mostly bug 609976 and bug 922912 [User impact if declined]: Unactionable crash reports [Describe test coverage new/current, TreeHerder]: m-i is green [Risks and why]: We will use more address space [String/UUID change made/needed]: None
Comment on attachment 8556794 [details] [diff] [review] If at first you don't succeed, use more memory I don't know if it is possible but maybe we should automatically detect if this needs to be increased.
I went ahead and pulled up the correct stacks for a few of these on 36b4: they led to bug 1128409, bug 1114976, bug 1081911.
Ryan could you help with the uplift of comment 23? (I'm surprised it didn't happen automagically. Were the scripts thrown off by the previous patch? Was there some marking I should have put on the bug?)
(In reply to David Major [:dmajor] (UTC+13) from comment #26) > Ryan could you help with the uplift of comment 23? (I'm surprised it didn't > happen automagically. Were the scripts thrown off by the previous patch? Was > there some marking I should have put on the bug?) Yeah, the status flags should have been set back to affected when there was more follow-up work to land.