Bug 866526 - Firefox 23 spikes in empty crashes
Status: RESOLVED FIXED
Whiteboard: [MemShrink][qa-]
Keywords: crash, regression, topcrash
Product: Core
Classification: Components
Component: General
Version: 23 Branch
Platform: All All
Importance: -- critical
Target Milestone: mozilla23
Assigned To: Nobody; OK to take it and work on it
Mentors:
Depends on: 859377
Blocks:
Reported: 2013-04-28 10:26 PDT by Scoobidiver (away)
Modified: 2013-07-11 15:57 PDT
CC List: 14 users
See Also:
Crash Signature:
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---
unaffected
+
fixed


Attachments
Graph of crashes by available VM and pagefile (74.96 KB, image/svg+xml)
2013-05-01 13:07 PDT, Benjamin Smedberg [:bsmedberg]

Description Scoobidiver (away) 2013-04-28 10:26:12 PDT
There are normally about 80 empty crashes per build. When there were about 300-500, it was because other top crashers had a similar volume of crashes. But in 23.0a1/20130427, the #1 non-empty top crasher has only about 50 crashes while there are about 500 empty crashes.
The regression range for the spike is:
http://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=a6104e0e5a2c&tochange=0e45f1b9521f

I see a few crashes on abort in App Notes (bug 767343, bug 859247, bug 793126, bug 844819), but no more than usual, so the spike is not caused by an abort.

I suspect a regression from bug 844323.

More reports at:
https://crash-stats.mozilla.com/report/list?product=Firefox&version=Firefox%3A23.0a1&range_value=4&range_unit=weeks&signature=EMPTY%3A%20no%20crashing%20thread%20identified%3B%20corrupt%20dump
Comment 1 Justin Lebar (not reading bugmail) 2013-04-28 10:30:31 PDT
> I suspect a regression from bug 844323.

Are we talking about Firefox or B2G here?  None of the code from bug 844323 should be running in Firefox.
Comment 2 Scoobidiver (away) 2013-04-28 10:38:59 PDT
(In reply to Justin Lebar [:jlebar] from comment #1)
> Are we talking about Firefox or B2G here? 
It's only about Firefox, but I thought those changes for B2G might have some impact on Firefox.
Comment 3 Justin Lebar (not reading bugmail) 2013-04-28 11:19:49 PDT
Okay.  It's possible that bug 844323 had some effect, but the vast majority of the code there is guarded on dom.ipc.processPriorityManager.enabled, which is true on b2g only, so I would be very surprised if the code was doing something so strange as to cause empty crash reports.
Comment 4 Scoobidiver (away) 2013-04-29 01:27:41 PDT
A comment says: "Today's build keeps crashing at random intervals even when seemingly doing nothing." (bp-d107fb92-ba0a-4cc8-a46b-d9b7e2130428)
Another user posted the same thing in the wrong bug: bug 744722 comment 7.
Comment 5 Robert Kaiser 2013-04-30 08:14:06 PDT
We are indeed at >8000 crashes per 100 ADI over the last two days, while we had ~1000 before 2013-04-27 (the 27th, with ~2000, may be the build where the causing change landed).
Comment 6 Ted Mielczarek [:ted.mielczarek] 2013-04-30 08:20:33 PDT
I hit two empty dump crashes in the past 24 hours:
https://crash-stats.mozilla.com/report/index/bp-87a8f61b-8ccf-4e16-8475-72a152130429
https://crash-stats.mozilla.com/report/index/bp-d0ed0b6f-e1fb-4e74-8fc2-642e32130430

I'm running with a debugger attached to my browser now to see if I can catch it and get a useful minidump out.
Comment 7 Benjamin Smedberg [:bsmedberg] 2013-04-30 09:02:12 PDT
http://people.mozilla.com/~bsmedberg/bsmedberg-graphing-playground/emptydump-nightly-frequency.html now has data from the weekend.

Can somebody check whether there was something which landed for 20130326 and got backed out 20130329 and relanded for 20130427 which might be a culprit? Also cc'ing memshrink guys in case they have seen matching data from other sources.
Comment 8 Scoobidiver (away) 2013-04-30 09:48:38 PDT
(In reply to Benjamin Smedberg [:bsmedberg] from comment #7)
> Can somebody check whether there was something which landed for 20130326 and
> got backed out 20130329 and relanded for 20130427 which might be a culprit?
First regression range:
http://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=3acbf951b3b1&tochange=456cb08f8509
Working range:
http://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=962f5293f87f&tochange=8693d1d4c86d (no backout from the first regression range)
Second regression range:
http://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=a6104e0e5a2c&tochange=0e45f1b9521f
Comment 9 Scoobidiver (away) 2013-05-01 00:52:33 PDT
Here is a list of crashers with the same regression range: bug 767343, bug 814954, and bug 866108.
Comment 10 Benjamin Smedberg [:bsmedberg] 2013-05-01 08:33:33 PDT
Currently on the suspect list: bug 859377. The new piece of that bug which landed this morning may help with tomorrow's nightly.
Comment 11 Benjamin Smedberg [:bsmedberg] 2013-05-01 13:07:01 PDT
Created attachment 744254
Graph of crashes by available VM and pagefile

Here's a graph of the crashes against MEMORYSTATUS information. This shows that users in these crashes are overwhelmingly running out of VM space and are not thrashing in low-memory conditions.

Note that because I don't have minidumps, I can't classify the smallest available VM block, nor can I measure private bytes. Ted is seeing this and gave me a VMMap showing that there was significant fragmentation *and* very large amounts of private-bytes allocations which were totally unaccounted-for in about:memory.

I asked him to try running with VMMap profiling mode to see if anything more interesting showed up.
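
A minimal sketch of the kind of classification described above: bucketing crash reports by how much virtual address space and page file were still available at crash time. It assumes each report is a dict whose `AvailableVirtualMemory` and `AvailablePageFile` values are byte counts stored as strings; those key names, the 256 MB thresholds, and the helper names `classify` and `summarize` are assumptions for illustration, not taken from the actual graphing code behind the attachment.

```python
from collections import Counter

MB = 1024 * 1024

# Hypothetical thresholds, chosen only for illustration.
LOW_VM_BYTES = 256 * MB        # little free address space left in the process
LOW_PAGEFILE_BYTES = 256 * MB  # little commit space left system-wide


def classify(report):
    """Bucket one crash report by its MEMORYSTATUS-style annotations."""
    vm = report.get("AvailableVirtualMemory")
    pagefile = report.get("AvailablePageFile")
    if vm is None or pagefile is None:
        return "unknown"
    low_vm = int(vm) < LOW_VM_BYTES
    low_pagefile = int(pagefile) < LOW_PAGEFILE_BYTES
    if low_vm and low_pagefile:
        return "low-vm-and-pagefile"
    if low_vm:
        return "low-vm"
    if low_pagefile:
        return "low-pagefile"
    return "neither"


def summarize(reports):
    """Count how many reports fall into each bucket."""
    return Counter(classify(r) for r in reports)
```

If most reports land in the `low-vm` bucket rather than `low-pagefile`, that would match the reading above: the process is exhausting its own address space rather than the machine thrashing in low-memory conditions.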
Comment 12 Scoobidiver (away) 2013-05-02 05:00:33 PDT
In 23.0a1/20130501085824, which contains http://hg.mozilla.org/mozilla-central/rev/cc82e1599dd0, the spike is gone and crashes are back to their previous volume, maybe even lower (before bug 837835).
Comment 13 Anthony Hughes (:ashughes) [GFX][QA][Mentor] 2013-07-11 14:10:15 PDT
Scoobidiver, is it safe to mark this verified fixed now? How are we looking on Beta?
Comment 14 Scoobidiver (away) 2013-07-11 14:22:38 PDT
(In reply to Anthony Hughes, Mozilla QA (:ashughes) from comment #13)
> Scoobidiver, is it safe to mark this verified fixed now? How are we looking
> on Beta?
We are unable to verify in Beta. Empty crashes accounted for 28% of top-100 crashes on May 8, before bug 859377, and now account for 26% in 23.0b3.
