Last Comment Bug 785626 - crash in mozilla::FrameLayerBuilder::BuildContainerLayerFor
: crash in mozilla::FrameLayerBuilder::BuildContainerLayerFor
Status: RESOLVED FIXED
: crash, regression, topcrash, verifyme
Product: Core
Classification: Components
Component: Layout (show other bugs)
: 17 Branch
: All All
: -- critical (vote)
: mozilla18
Assigned To: Chris Lord [:cwiiis]
:
: Jet Villegas (:jet)
Mentors:
: 785816 786138 (view as bug list)
Depends on:
Blocks: 785333
  Show dependency treegraph
 
Reported: 2012-08-25 08:52 PDT by Scoobidiver (away)
Modified: 2013-12-27 14:24 PST (History)
16 users (show)
See Also:
Crash Signature:
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---
+
verified


Attachments

Description Scoobidiver (away) 2012-08-25 08:52:32 PDT
It first appeared in 17.0a1/20120825. The regression range is:
http://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=1c0ac073dc65&tochange=f077de66e52d

Signature 	mozilla::`anonymous namespace''::ContainerState::Finish(unsigned int*) More Reports Search
UUID	113744a5-686b-45a4-9e45-128692120825
Date Processed	2012-08-25 14:10:34
Uptime	970
Last Crash	6.5 days before submission
Install Age	16.2 minutes since version was first installed.
Install Time	2012-08-25 14:28:10
Product	Firefox
Version	17.0a1
Build ID	20120825030541
Release Channel	nightly
OS	Windows NT
OS Version	6.1.7601 Service Pack 1
Build Architecture	x86
Build Architecture Info	AuthenticAMD family 18 model 1 stepping 0
Crash Reason	EXCEPTION_ACCESS_VIOLATION_READ
Crash Address	0x14
User Comments	Looking at some medical forum...and CRASH
App Notes 	
AdapterVendorID: 0x1002, AdapterDeviceID: 0x9640, AdapterSubsysID: 2acf103c, AdapterDriverVersion: 8.982.0.0
D2D? D2D+ DWrite? DWrite+ D3D10 Layers? D3D10 Layers+ 
EMCheckCompatibility	True
Adapter Vendor ID	0x1002
Adapter Device ID	0x9640
Total Virtual Memory	4294836224
Available Virtual Memory	3812274176
System Memory Use Percentage	25
Available Page File	13676478464
Available Physical Memory	5957169152

Frame 	Module 	Signature 	Source
0 	xul.dll 	mozilla::`anonymous namespace'::ContainerState::Finish 	layout/base/FrameLayerBuilder.cpp:2101
1 	xul.dll 	mozilla::FrameLayerBuilder::BuildContainerLayerFor 	layout/base/FrameLayerBuilder.cpp:2456
2 	xul.dll 	PL_DHashTableOperate 	obj-firefox/xpcom/build/pldhash.cpp:586
3 	xul.dll 	mozilla::`anonymous namespace'::ContainerState::ProcessDisplayItems 	layout/base/FrameLayerBuilder.cpp:1806
4 	xul.dll 	mozilla::`anonymous namespace'::ContainerState::ProcessDisplayItems 	layout/base/FrameLayerBuilder.cpp:1751
5 	xul.dll 	mozilla::`anonymous namespace'::ContainerState::ProcessDisplayItems 	layout/base/FrameLayerBuilder.cpp:1751
6 	xul.dll 	mozilla::FrameLayerBuilder::BuildContainerLayerFor 	layout/base/FrameLayerBuilder.cpp:2451
7 	mozglue.dll 	je_malloc 	memory/mozjemalloc/jemalloc.c:6291
8 	xul.dll 	nsDisplayList::PaintRoot 	layout/base/nsDisplayList.cpp:943
9 	xul.dll 	nsLayoutUtils::PaintFrame 	layout/base/nsLayoutUtils.cpp:1868
10 	xul.dll 	nsViewManager::FlushDirtyRegionToWidget 	view/src/nsViewManager.cpp:465
11 	xul.dll 	nsViewManager::ProcessPendingUpdates 	view/src/nsViewManager.cpp:1217
12 	xul.dll 	nsRefreshDriver::Notify 	layout/base/nsRefreshDriver.cpp:421
13 	xul.dll 	nsTimerImpl::Fire 	xpcom/threads/nsTimerImpl.cpp:476
14 	xul.dll 	nsThread::ProcessNextEvent 	xpcom/threads/nsThread.cpp:624
15 	xul.dll 	mozilla::ipc::MessagePump::Run 	ipc/glue/MessagePump.cpp:82
16 	xul.dll 	MessageLoop::RunHandler 	ipc/chromium/src/base/message_loop.cc:201
17 	xul.dll 	MessageLoop::Run 	ipc/chromium/src/base/message_loop.cc:175
18 	xul.dll 	nsBaseAppShell::Run 	widget/xpwidgets/nsBaseAppShell.cpp:163
19 	xul.dll 	nsAppShell::Run 	widget/windows/nsAppShell.cpp:232
20 	xul.dll 	nsAppStartup::Run 	toolkit/components/startup/nsAppStartup.cpp:273
21 	xul.dll 	XREMain::XRE_mainRun 	toolkit/xre/nsAppRunner.cpp:3800
22 	xul.dll 	XREMain::XRE_main 	toolkit/xre/nsAppRunner.cpp:3877
23 	xul.dll 	XRE_main 	toolkit/xre/nsAppRunner.cpp:3953
...

More reports at:
https://crash-stats.mozilla.com/report/list?signature=mozilla%3A%3A%60anonymous+namespace%27%27%3A%3AContainerState%3A%3AFinish%28unsigned+int*%29
https://crash-stats.mozilla.com/report/list?signature=mozilla%3A%3AFrameLayerBuilder%3A%3ABuildContainerLayerFor
Comment 1 David Baron :dbaron: ⌚️UTC-10 2012-08-25 10:28:18 PDT
Bug 785333 is the only thing that pops out within that range.
Comment 2 :Gavin Sharp [email: gavin@gavinsharp.com] 2012-08-25 18:05:42 PDT
I can pretty easily reproduce this locally, and backing out bug 785333 does fix it. Unfortunately I'm having a hard time coming up with a testcase - it involves using a custom provider for the social feature in Firefox.
Comment 3 ex.astris 2012-08-25 21:49:45 PDT
I am getting this crash consistently in 17.0a1 when I go 'offline' on facebook chat and then scroll up/down my timeline. Going back 'online' with chat resolves the problem.

One of the several crash reports:

https://crash-stats.mozilla.com/report/index/bp-5ce86f2c-287c-492c-bce4-b9e572120826
Comment 4 Scoobidiver (away) 2012-08-26 05:05:39 PDT
There are about 800 crashes in less than one day.
Comment 6 Chris Lord [:cwiiis] 2012-08-27 01:31:46 PDT
Bank holiday, will see if this is a quick fix.
Comment 7 Chris Lord [:cwiiis] 2012-08-27 03:54:40 PDT
I can't reproduce this in Linux - judging by the reports, it seems this may be Windows-specific?

I guess I've broken some kind of invariant to do with mNewChildLayers... Looking but will be away for the day so won't have a fix until the evening/night (GMT), assuming it's relatively simple.
Comment 8 Chris Lord [:cwiiis] 2012-08-27 03:57:17 PDT
For anyone that can easily reproduce this crash, it'd be super-useful if they could post if there are any assertions leading up to the crash, and what they are.
Comment 9 Chris Lord [:cwiiis] 2012-08-27 04:22:44 PDT
    // An invariant of this loop is that the layers in mNewChildLayers
    // with index < i are the first i child layers of mContainerLayer.

My current guess is that this would happen if a layer ended up being represented by an item in a different part of the display-list than before and caused layers to swap positions, thus breaking this invariant.

I don't think this is an unreasonable thing to happen though, so perhaps we should pick a better way of pruning old child layers... Couldn't we just ref and remove all child layers of a container layer in BuildContainerLayerFor, then unref them after ContainerState::Finish? I think this would let us simplify this Finish method quite a bit anyway.

Will try this tonight after I reproduce this in Windows.
Comment 10 ex.astris 2012-08-27 04:26:32 PDT
> I can't reproduce this in Linux - judging by the reports, it seems this may
> be Windows-specific?

Happening on OS X for me.

I can easily reproduce the crash but not being well versed in this area, you'd need to tell me specifically what information you'd want me to provide.
Comment 11 Girish Sharma [:Optimizer] 2012-08-27 04:34:14 PDT
I can reproduce it every time on Facebook, while scrolling to top. I think it has something to do with index less than 0. If anything specific is needed, I can help.
(I will need to build Firefox to get assertions, please help me to do that)
Comment 12 ex.astris 2012-08-27 04:37:18 PDT
Same reproducibility here, but only if the chat side-pane is closed. If it's active, the problem does not re-occur.
Comment 13 Girish Sharma [:Optimizer] 2012-08-27 04:39:01 PDT
maybe the chat panel being "fixed" position somehow prevents the assertion.
Comment 14 Scoobidiver (away) 2012-08-27 04:42:00 PDT
*** Bug 785816 has been marked as a duplicate of this bug. ***
Comment 15 Scoobidiver (away) 2012-08-27 11:36:18 PDT
Bug 785816 has STR but I can't reproduce.
Comment 16 Asa Dotzler [:asa] 2012-08-27 14:39:25 PDT
Can we back out the offending change and re-land when this is solved. This is blocking some pretty important testing and crashing a lot of our nightly users. Going multiple days with a major crash at Facebook is probably costing us a non-trivial number of nightly testers.
Comment 17 (mostly gone) XtC4UaLL [:xtc4uall] 2012-08-27 15:37:54 PDT
Using the STR in Bug 785816 Comment 0 + Build of http://ftp.mozilla.org/pub/mozilla.org/firefox/nightly/2012-08-27-mozilla-central-debug/ (i.e. http://hg.mozilla.org/mozilla-central/rev/8af6a22827ec) I get the following Assertion when crashing:

###!!! ASSERTION: Layer already in list???: '!mNewChildLayers.Contains(ownLayer)', file e:/builds/moz2_slave/m-cen-w32-dbg/build/layout/base/FrameLayerBuilder.cpp, line 1859
Comment 18 Chris Lord [:cwiiis] 2012-08-27 15:38:49 PDT
(In reply to Asa Dotzler [:asa] from comment #16)
> Can we back out the offending change and re-land when this is solved. This
> is blocking some pretty important testing and crashing a lot of our nightly
> users. Going multiple days with a major crash at Facebook is probably
> costing us a non-trivial number of nightly testers.

I didn't get the time today to work on a fix and it isn't trivial, so I think this would be a good idea. I'll likely have a fix tomorrow, but it'd be better to back this out now and re-land than sacrifice users.
Comment 19 :Gavin Sharp [email: gavin@gavinsharp.com] 2012-08-27 16:45:49 PDT
Fixed by the backout of bug 785333:
http://hg.mozilla.org/mozilla-central/rev/ee5484eb9f4a

Still need to back this out on Aurora also, once that merge is done.
Comment 20 :Gavin Sharp [email: gavin@gavinsharp.com] 2012-08-27 18:44:24 PDT
Backed out on aurora post-merge too:
https://hg.mozilla.org/releases/mozilla-aurora/rev/b766865d2857
Comment 21 Scoobidiver (away) 2012-08-28 00:59:26 PDT
*** Bug 786138 has been marked as a duplicate of this bug. ***
Comment 22 Chris Lord [:cwiiis] 2012-08-28 02:46:30 PDT
Is anyone that could easily reproduce this crash in a position that they could test out patches? I'll have a patch for this later today, but I wasn't easily able to reproduce the crash, so it'd be good to have someone that was help out.
Comment 23 Chris Lord [:cwiiis] 2012-08-28 02:51:05 PDT
Sorry for flags change, Bugzilla being unhelpful...
Comment 24 Chris Lord [:cwiiis] 2012-08-28 06:40:07 PDT
(In reply to XtC4UaLL [:xtc4uall] from comment #17)
> Using the STR in Bug 785816 Comment 0 + Build of
> http://ftp.mozilla.org/pub/mozilla.org/firefox/nightly/2012-08-27-mozilla-
> central-debug/ (i.e. http://hg.mozilla.org/mozilla-central/rev/8af6a22827ec)
> I get the following Assertion when crashing:
> 
> ###!!! ASSERTION: Layer already in list???:
> '!mNewChildLayers.Contains(ownLayer)', file
> e:/builds/moz2_slave/m-cen-w32-dbg/build/layout/base/FrameLayerBuilder.cpp,
> line 1859

argh, I missed this and this is not good... I've got patches that change the layer iteration so that the crash doesn't get hit, but this shouldn't happen and is much more likely what is causing the problem :(

I'm able to reproduce this on the page listed in bug 785816, so looking at this now.
Comment 25 Chris Lord [:cwiiis] 2012-08-28 07:04:04 PDT
I think this assertion would only be hit if a frame appeared twice in the same list with the same item (including merged frames) - I don't think this should happen, as frame + item should always be unique... Still investigating.
Comment 26 Chris Lord [:cwiiis] 2012-08-28 07:36:38 PDT
ok, I think this is happening because a merged item is being merged into a different frame between draws, causing a layer to be picked up twice... Maybe storing data against merged frames is just not a valid thing to do...
Comment 27 Chris Lord [:cwiiis] 2012-08-28 09:56:32 PDT
Is a valid thing to do, but requires an extra flag so as not to pick up the same layer more than once - fixes incoming to bug #785333, hopefully.
Comment 28 Virgil Dicu [:virgil] [QA] 2012-11-05 06:40:28 PST
Mozilla/5.0 (Windows NT 6.1; WOW64; rv:17.0) Gecko/17.0 Firefox/17.0 beta 4

http://bit.ly/SNVobv

Only 6 crashes for 17beta for the last two weeks. 
Couldn't reproduce any issues with facebook mentioned in comments 3 and 11.
Setting this to verified.

Note You need to log in before you can comment on or make changes to this bug.