Closed Bug 785626 Opened 9 years ago Closed 9 years ago
crash in mozilla::Frame
Layer Builder::Build Container Layer For
It first appeared in 17.0a1/20120825. The regression range is: http://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=1c0ac073dc65&tochange=f077de66e52d Signature mozilla::`anonymous namespace''::ContainerState::Finish(unsigned int*) More Reports Search UUID 113744a5-686b-45a4-9e45-128692120825 Date Processed 2012-08-25 14:10:34 Uptime 970 Last Crash 6.5 days before submission Install Age 16.2 minutes since version was first installed. Install Time 2012-08-25 14:28:10 Product Firefox Version 17.0a1 Build ID 20120825030541 Release Channel nightly OS Windows NT OS Version 6.1.7601 Service Pack 1 Build Architecture x86 Build Architecture Info AuthenticAMD family 18 model 1 stepping 0 Crash Reason EXCEPTION_ACCESS_VIOLATION_READ Crash Address 0x14 User Comments Looking at some medical forum...and CRASH App Notes AdapterVendorID: 0x1002, AdapterDeviceID: 0x9640, AdapterSubsysID: 2acf103c, AdapterDriverVersion: 8.982.0.0 D2D? D2D+ DWrite? DWrite+ D3D10 Layers? D3D10 Layers+ EMCheckCompatibility True Adapter Vendor ID 0x1002 Adapter Device ID 0x9640 Total Virtual Memory 4294836224 Available Virtual Memory 3812274176 System Memory Use Percentage 25 Available Page File 13676478464 Available Physical Memory 5957169152 Frame Module Signature Source 0 xul.dll mozilla::`anonymous namespace'::ContainerState::Finish layout/base/FrameLayerBuilder.cpp:2101 1 xul.dll mozilla::FrameLayerBuilder::BuildContainerLayerFor layout/base/FrameLayerBuilder.cpp:2456 2 xul.dll PL_DHashTableOperate obj-firefox/xpcom/build/pldhash.cpp:586 3 xul.dll mozilla::`anonymous namespace'::ContainerState::ProcessDisplayItems layout/base/FrameLayerBuilder.cpp:1806 4 xul.dll mozilla::`anonymous namespace'::ContainerState::ProcessDisplayItems layout/base/FrameLayerBuilder.cpp:1751 5 xul.dll mozilla::`anonymous namespace'::ContainerState::ProcessDisplayItems layout/base/FrameLayerBuilder.cpp:1751 6 xul.dll mozilla::FrameLayerBuilder::BuildContainerLayerFor layout/base/FrameLayerBuilder.cpp:2451 7 mozglue.dll je_malloc memory/mozjemalloc/jemalloc.c:6291 8 xul.dll nsDisplayList::PaintRoot layout/base/nsDisplayList.cpp:943 9 xul.dll nsLayoutUtils::PaintFrame layout/base/nsLayoutUtils.cpp:1868 10 xul.dll nsViewManager::FlushDirtyRegionToWidget view/src/nsViewManager.cpp:465 11 xul.dll nsViewManager::ProcessPendingUpdates view/src/nsViewManager.cpp:1217 12 xul.dll nsRefreshDriver::Notify layout/base/nsRefreshDriver.cpp:421 13 xul.dll nsTimerImpl::Fire xpcom/threads/nsTimerImpl.cpp:476 14 xul.dll nsThread::ProcessNextEvent xpcom/threads/nsThread.cpp:624 15 xul.dll mozilla::ipc::MessagePump::Run ipc/glue/MessagePump.cpp:82 16 xul.dll MessageLoop::RunHandler ipc/chromium/src/base/message_loop.cc:201 17 xul.dll MessageLoop::Run ipc/chromium/src/base/message_loop.cc:175 18 xul.dll nsBaseAppShell::Run widget/xpwidgets/nsBaseAppShell.cpp:163 19 xul.dll nsAppShell::Run widget/windows/nsAppShell.cpp:232 20 xul.dll nsAppStartup::Run toolkit/components/startup/nsAppStartup.cpp:273 21 xul.dll XREMain::XRE_mainRun toolkit/xre/nsAppRunner.cpp:3800 22 xul.dll XREMain::XRE_main toolkit/xre/nsAppRunner.cpp:3877 23 xul.dll XRE_main toolkit/xre/nsAppRunner.cpp:3953 ... More reports at: https://crash-stats.mozilla.com/report/list?signature=mozilla%3A%3A%60anonymous+namespace%27%27%3A%3AContainerState%3A%3AFinish%28unsigned+int*%29 https://crash-stats.mozilla.com/report/list?signature=mozilla%3A%3AFrameLayerBuilder%3A%3ABuildContainerLayerFor
Bug 785333 is the only thing that pops out within that range.
I can pretty easily reproduce this locally, and backing out bug 785333 does fix it. Unfortunately I'm having a hard time coming up with a testcase - it involves using a custom provider for the social feature in Firefox.
I am getting this crash consistently in 17.0a1 when I go 'offline' on facebook chat and then scroll up/down my timeline. Going back 'online' with chat resolves the problem. One of the several crash reports: https://crash-stats.mozilla.com/report/index/bp-5ce86f2c-287c-492c-bce4-b9e572120826
There are about 800 crashes in less than one day.
Looking at the comment, only certain sites are affected by this bug: https://crash-stats.mozilla.com/report/list?product=Firefox&version=Firefox%3A17.0a1&platform=windows&build_id=20120825030541&query_search=signature&query_type=contains&reason_type=contains&date=08%2F26%2F2012%2017%3A20%3A54&range_value=2&range_unit=days&hang_type=any&process_type=browser&do_query=1&signature=mozilla%3A%3A%60anonymous%20namespace%27%27%3A%3AContainerState%3A%3AFinish%28unsigned%20int*%29 mainly Facebook.
Assignee: nobody → chrislord.net
Bank holiday, will see if this is a quick fix.
Status: NEW → ASSIGNED
I can't reproduce this in Linux - judging by the reports, it seems this may be Windows-specific? I guess I've broken some kind of invariant to do with mNewChildLayers... Looking but will be away for the day so won't have a fix until the evening/night (GMT), assuming it's relatively simple.
For anyone that can easily reproduce this crash, it'd be super-useful if they could post if there are any assertions leading up to the crash, and what they are.
// An invariant of this loop is that the layers in mNewChildLayers // with index < i are the first i child layers of mContainerLayer. My current guess is that this would happen if a layer ended up being represented by an item in a different part of the display-list than before and caused layers to swap positions, thus breaking this invariant. I don't think this is an unreasonable thing to happen though, so perhaps we should pick a better way of pruning old child layers... Couldn't we just ref and remove all child layers of a container layer in BuildContainerLayerFor, then unref them after ContainerState::Finish? I think this would let us simplify this Finish method quite a bit anyway. Will try this tonight after I reproduce this in Windows.
> I can't reproduce this in Linux - judging by the reports, it seems this may > be Windows-specific? Happening on OS X for me. I can easily reproduce the crash but not being well versed in this area, you'd need to tell me specifically what information you'd want me to provide.
I can reproduce it every time on Facebook, while scrolling to top. I think it has something to do with index less than 0. If anything specific is needed, I can help. (I will need to build Firefox to get assertions, please help me to do that)
Same reproducibility here, but only if the chat side-pane is closed. If it's active, the problem does not re-occur.
maybe the chat panel being "fixed" position somehow prevents the assertion.
Bug 785816 has STR but I can't reproduce.
Can we back out the offending change and re-land when this is solved. This is blocking some pretty important testing and crashing a lot of our nightly users. Going multiple days with a major crash at Facebook is probably costing us a non-trivial number of nightly testers.
Using the STR in Bug 785816 Comment 0 + Build of http://ftp.mozilla.org/pub/mozilla.org/firefox/nightly/2012-08-27-mozilla-central-debug/ (i.e. http://hg.mozilla.org/mozilla-central/rev/8af6a22827ec) I get the following Assertion when crashing: ###!!! ASSERTION: Layer already in list???: '!mNewChildLayers.Contains(ownLayer)', file e:/builds/moz2_slave/m-cen-w32-dbg/build/layout/base/FrameLayerBuilder.cpp, line 1859
(In reply to Asa Dotzler [:asa] from comment #16) > Can we back out the offending change and re-land when this is solved. This > is blocking some pretty important testing and crashing a lot of our nightly > users. Going multiple days with a major crash at Facebook is probably > costing us a non-trivial number of nightly testers. I didn't get the time today to work on a fix and it isn't trivial, so I think this would be a good idea. I'll likely have a fix tomorrow, but it'd be better to back this out now and re-land than sacrifice users.
Fixed by the backout of bug 785333: http://hg.mozilla.org/mozilla-central/rev/ee5484eb9f4a Still need to back this out on Aurora also, once that merge is done.
Backed out on aurora post-merge too: https://hg.mozilla.org/releases/mozilla-aurora/rev/b766865d2857
Is anyone that could easily reproduce this crash in a position that they could test out patches? I'll have a patch for this later today, but I wasn't easily able to reproduce the crash, so it'd be good to have someone that was help out.
(In reply to XtC4UaLL [:xtc4uall] from comment #17) > Using the STR in Bug 785816 Comment 0 + Build of > http://ftp.mozilla.org/pub/mozilla.org/firefox/nightly/2012-08-27-mozilla- > central-debug/ (i.e. http://hg.mozilla.org/mozilla-central/rev/8af6a22827ec) > I get the following Assertion when crashing: > > ###!!! ASSERTION: Layer already in list???: > '!mNewChildLayers.Contains(ownLayer)', file > e:/builds/moz2_slave/m-cen-w32-dbg/build/layout/base/FrameLayerBuilder.cpp, > line 1859 argh, I missed this and this is not good... I've got patches that change the layer iteration so that the crash doesn't get hit, but this shouldn't happen and is much more likely what is causing the problem :( I'm able to reproduce this on the page listed in bug 785816, so looking at this now.
I think this assertion would only be hit if a frame appeared twice in the same list with the same item (including merged frames) - I don't think this should happen, as frame + item should always be unique... Still investigating.
ok, I think this is happening because a merged item is being merged into a different frame between draws, causing a layer to be picked up twice... Maybe storing data against merged frames is just not a valid thing to do...
Is a valid thing to do, but requires an extra flag so as not to pick up the same layer more than once - fixes incoming to bug #785333, hopefully.
Mozilla/5.0 (Windows NT 6.1; WOW64; rv:17.0) Gecko/17.0 Firefox/17.0 beta 4 http://bit.ly/SNVobv Only 6 crashes for 17beta for the last two weeks. Couldn't reproduce any issues with facebook mentioned in comments 3 and 11. Setting this to verified.
You need to log in before you can comment on or make changes to this bug.