Last Comment Bug 752098 - Crashes on MemBench with incremental GC enabled
: Crashes on MemBench with incremental GC enabled
Status: RESOLVED FIXED
: crash, regression
Product: Core
Classification: Components
Component: DOM (show other bugs)
: 15 Branch
: All All
: -- critical with 3 votes (vote)
: mozilla15
Assigned To: Bill McCloskey (:billm)
:
Mentors:
http://gregor-wagner.com/tmp/mem
: 752191 752327 752392 (view as bug list)
Depends on:
Blocks: 702531 654903 735099
  Show dependency treegraph
 
Reported: 2012-05-04 16:10 PDT by Gregor Wagner [:gwagner]
Modified: 2012-05-11 14:09 PDT (History)
19 users (show)
See Also:
Crash Signature:
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---


Attachments
patch (848 bytes, patch)
2012-05-07 16:45 PDT, Bill McCloskey (:billm)
luke: review+
Details | Diff | Review

Description Gregor Wagner [:gwagner] 2012-05-04 16:10:34 PDT
This bug was filed from the Socorro interface and is 
report bp-51004ce3-ccbe-4451-90f0-237052120504 .
============================================================= 

Just start http://gregor-wagner.com/tmp/mem and wait a few sec.
Comment 2 Gregor Wagner [:gwagner] 2012-05-04 16:30:21 PDT
Important detail: IGC is enabled.
Seems stable without IGC.
Comment 3 Boris Zbarsky [:bz] 2012-05-04 17:09:14 PDT
I just tried this with a debug Mac build from changeset 40a901a6e733.  No crash....

With igc, is it possible for an object to be finalized more than once?  I'd assume not, right?
Comment 4 Bill McCloskey (:billm) 2012-05-04 17:14:56 PDT
(In reply to Boris Zbarsky (:bz) from comment #3)
> With igc, is it possible for an object to be finalized more than once?  I'd
> assume not, right?

They're only called once.

The second crash seems to be bug 752081.
Comment 6 Andrew McCreight [:mccr8] 2012-05-05 06:10:06 PDT
*** Bug 752191 has been marked as a duplicate of this bug. ***
Comment 7 Andrew McCreight [:mccr8] 2012-05-05 06:19:08 PDT
Danial Horton narrowed down the range a bit, but there are around 200 changesets in the range he found, so it is still fairly wide.
Comment 8 Andrew McCreight [:mccr8] 2012-05-05 06:27:46 PDT
I got a similar crash just now with incremental enabled.  Just on Google Reader.

https://crash-stats.mozilla.com/report/index/bp-77c96c0e-9f96-4bc9-bef8-198572120505
Comment 9 Danial Horton 2012-05-05 06:33:03 PDT
if we both started seeing it roughly around the same changesets, then it could be the same thing, especially if his tests was on inbound where the CPG and other stuff was already landed.

for me, 20120503142649 is the last build i could open my usual session on with iGC enabled.

my DSL is flaky at nights atm, which makes diving into the inbound builds a problem.(In reply to Andrew McCreight [:mccr8] from comment #8)

> I got a similar crash just now with incremental enabled.  Just on Google
> Reader.
> 
> https://crash-stats.mozilla.com/report/index/bp-77c96c0e-9f96-4bc9-bef8-
> 198572120505

Yeah, thats pretty much exactly what i had been seeing
I was just using membench in the bug i opened to exacerbate the situation and bring a crash on quicker.

In some cases, all i have to do is merely interact with the ui, even as slight as minimising the window and it might crash while iGC is on.
Comment 10 Andrew McCreight [:mccr8] 2012-05-05 06:49:14 PDT
Yeah, this is pretty terrible.  I'm going to disable incremental GC for now.  There are one or two other reports of crashy IGC in bug 654903.
Comment 11 Danial Horton 2012-05-05 06:58:37 PDT
alright, i just dropped my sync speed a bit, i'll get a few inbound zips downloading and set up a profile with igc enabled to run through membench till i find where it stops crashing
Comment 12 Andrew McCreight [:mccr8] 2012-05-05 07:03:17 PDT
Well, there's already one patch in that range that is known to have caused a regression (bug 730208), and there's a patch to fix it, but it just hasn't landed yet.  Probably a good place to start would be to look at builds before that landed.
Comment 13 Danial Horton 2012-05-05 07:32:08 PDT
Regression window(m-i)

Works:
http://hg.mozilla.org/integration/mozilla-inbound/rev/c3813fbb1c9a
Mozilla/5.0 (Windows NT 6.1; WOW64; rv:15.0) Gecko/15.0 Firefox/15.0a1
ID:20120502040217

Fails:
http://hg.mozilla.org/integration/mozilla-inbound/rev/de5745bce8bc
Mozilla/5.0 (Windows NT 6.1; WOW64; rv:15.0) Gecko/15.0 Firefox/15.0a1
ID:20120503040220
Comment 14 Danial Horton 2012-05-05 07:41:40 PDT
installed memchaser and nightly tools to be sure of my findings... which had the effect of making the regression appear quicker in 20120503040220
Comment 15 Danial Horton 2012-05-05 08:08:01 PDT
This doesn't appear to be osx only, so platform might do with changing to All All?

now all we need is someone to test iGC on a droid and see if it happens there for a trifecta....
Comment 16 Bill McCloskey (:billm) 2012-05-05 08:37:21 PDT
I'll work on this soon. It sounds like it's affecting a lot of people.
Comment 17 Danial Horton 2012-05-06 07:37:41 PDT
You sure Andrew?, that crash seems to predate this particular regression.
Comment 18 Andrew McCreight [:mccr8] 2012-05-06 07:38:51 PDT
Bug 654903 likely has many causes.  This bug is one of them, for people who have incremental GC turned on.
Comment 19 Danial Horton 2012-05-06 07:45:23 PDT
initial report in bug 654903 is not likely to be caused by incremental GC at all, the last 4 or so comments in it are related to this issue, but that doesn't mean the overall bug is.

I remember getting that particular crash earlier last year though and thought it had since been resolved (something specific to JIT and certain extensions?)
Comment 20 Gregor Wagner [:gwagner] 2012-05-06 09:39:32 PDT
*** Bug 752327 has been marked as a duplicate of this bug. ***
Comment 21 Bill McCloskey (:billm) 2012-05-07 16:45:42 PDT
Created attachment 621785 [details] [diff] [review]
patch

Thanks for the bisection, Daniel. It helped a lot.

The main problem is that, while the gcIsFull flag is set during BeginMarkPhase, we're using it during the sweep phase. So it's possible for a compartment to be created in between and not have its script filenames marked, but it will get them swept.

One possible solution is to change the gcIsFull flag so that it's set during sweeping. But then we might set mark bits but not sweep, which is sort of bad, although only theoretically. Generally I think it's better to set flags like this in the first slice and then leave them alone. Otherwise it gets really confusing what should be marked.

The solution I've chosen is to ensure that all new script filenames are marked. This includes ones created for new compartments.

The right solution for this is bug 751618.
Comment 22 Luke Wagner [:luke] 2012-05-07 16:55:39 PDT
Comment on attachment 621785 [details] [diff] [review]
patch

Thanks to you both!
Comment 23 Bill McCloskey (:billm) 2012-05-07 17:00:47 PDT
https://hg.mozilla.org/mozilla-central/rev/cf9354f3a976
Comment 24 Danial Horton 2012-05-07 17:15:39 PDT
Ok so Lukes patch is a temporary fix till bug 751618 is looked into and implemented? (coz if thats the full fix, it just did my head in... a single line... xD)
Comment 25 Bill McCloskey (:billm) 2012-05-07 17:20:05 PDT
(In reply to Danial Horton from comment #24)
> Ok so Lukes patch is a temporary fix till bug 751618 is looked into and
> implemented? (coz if thats the full fix, it just did my head in... a single
> line... xD)

This is a real fix for the problem. Bug 751618 would just solve it in a cleaner way. Sometimes scary bugs have simple fixes :-).
Comment 26 Danial Horton 2012-05-07 17:30:55 PDT
be nice if 20120507 could get a respin just for this, understandably we don't always get nice things :-(

i'll check out the hourly if i remember. thanks for the quick fix guys
Comment 27 David Mandelin [:dmandelin] 2012-05-11 14:09:50 PDT
*** Bug 752392 has been marked as a duplicate of this bug. ***

Note You need to log in before you can comment on or make changes to this bug.