Closed Bug 594774 Opened 14 years ago Closed 14 years ago

Crash [@ nsDisplayBackground::Paint(nsDisplayListBuilder*, nsIRenderingContext*) ]

Categories

(Core :: Layout, defect)

x86
Windows XP
defect
Not set
critical

Tracking

()

RESOLVED FIXED
mozilla2.0b7
Tracking Status
blocking2.0 --- beta7+

People

(Reporter: scoobidiver, Assigned: roc)

References

(Depends on 1 open bug, )

Details

(Keywords: crash, regression, topcrash)

Crash Data

Attachments

(2 files, 4 obsolete files)

Build : Mozilla/5.0 (Windows NT 6.1; rv:2.0b6pre) Gecko/20100909 Firefox/4.0b6pre

Here is a bug that is very similar to bug 593511.
It is #6 top crasher for the last week.
It has a low daily crash rate until 20100904 build (1 crash/day). Since 20100905 build (D3D9 ON by default), the daily rate is about 40 crashes/day
2 comments say that they were on google finance when the crash occurred.

Signature	nsDisplayBackground::Paint(nsDisplayListBuilder*, nsIRenderingContext*)
UUID	eaec51a0-089f-477c-a6b0-6fe122100908
Time 	2010-09-08 10:58:29.651630
Uptime	72818
Last Crash	1574131 seconds (2.6 weeks) before submission
Install Age	72818 seconds (20.2 hours) since version was first installed.
Product	Firefox
Version	4.0b6pre
Build ID	20100907041859
Branch	2.0
OS	Windows NT
OS Version	6.1.7600
CPU	x86
CPU Info	GenuineIntel family 6 model 15 stepping 11
Crash Reason	EXCEPTION_ACCESS_VIOLATION
Crash Address	0xfffffffff0de801b
User Comments	
App Notes 	AdapterVendorID: 10de, AdapterDeviceID: 0165
Processor Notes 	
EMCheckCompatibility	False
Crashing Thread
Frame 	Module 	Signature [Expand] 	Source
0 	xul.dll 	nsDisplayBackground::Paint 	layout/base/nsDisplayList.cpp:844
1 	xul.dll 	mozilla::FrameLayerBuilder::DrawThebesLayer 	layout/base/FrameLayerBuilder.cpp:1660
2 	xul.dll 	mozilla::layers::ThebesLayerD3D9::DrawRegion 	gfx/layers/d3d9/ThebesLayerD3D9.cpp:358
3 	xul.dll 	mozilla::layers::ThebesLayerD3D9::RenderLayer 	gfx/layers/d3d9/ThebesLayerD3D9.cpp:213
4 	xul.dll 	mozilla::layers::ContainerLayerD3D9::RenderLayer 	gfx/layers/d3d9/ContainerLayerD3D9.cpp:220
5 	xul.dll 	mozilla::layers::ContainerLayerD3D9::RenderLayer 	gfx/layers/d3d9/ContainerLayerD3D9.cpp:220
6 	xul.dll 	mozilla::layers::LayerManagerD3D9::Render 	gfx/layers/d3d9/LayerManagerD3D9.cpp:218
7 	xul.dll 	mozilla::layers::LayerManagerD3D9::EndTransaction 	gfx/layers/d3d9/LayerManagerD3D9.cpp:131
8 	xul.dll 	nsDisplayList::PaintForFrame 	layout/base/nsDisplayList.cpp:418
9 	xul.dll 	nsLayoutUtils::PaintFrame 	layout/base/nsLayoutUtils.cpp:1412
10 	xul.dll 	PresShell::Paint 	layout/base/nsPresShell.cpp:5934
11 	xul.dll 	nsViewManager::RenderViews 	view/src/nsViewManager.cpp:459
12 	xul.dll 	nsViewManager::Refresh 	view/src/nsViewManager.cpp:425
13 	xul.dll 	nsViewManager::DispatchEvent 	view/src/nsViewManager.cpp:940
14 	xul.dll 	AttachedHandleEvent 	view/src/nsView.cpp:193
15 	xul.dll 	nsWindow::DispatchEvent 	widget/src/windows/nsWindow.cpp:3534
16 	xul.dll 	nsWindow::DispatchWindowEvent 	widget/src/windows/nsWindow.cpp:3565
17 		@0x22d0d7 	

More reports at :
http://crash-stats.mozilla.com/report/list?range_value=2&range_unit=weeks&date=2010-09-09%2005%3A00%3A00&signature=nsDisplayBackground%3A%3APaint%28nsDisplayListBuilder*%2C%20nsIRenderingContext*%29&version=Firefox%3A4.0b6pre
It's reproducabale on given URL.
bp-909503d8-9fae-4cdf-972b-c0b642100909
Setting layers.accelerate-all;false and layers.accelerate-none;true helps.
Blocks: d3d9-layers
blocking2.0: --- → ?
OS: Windows 7 → Windows XP
Keywords: testcase-wanted
I got it too (bp-9c2f5f0b-36f8-42e0-9150-7eabe2100910), at the google finance page. I had set layers.accelerate-all;false (because of the black screen problem), but I didn't change layers.accelerate-none.
blocking2.0: ? → beta6+
Keywords: topcrash
I can reproduce this. Will investigate.
Assignee: nobody → bas.schouten
Status: NEW → ASSIGNED
Before the crash there's a long series of assertions:

###!!! ASSERTION: Want to fire mutation events, but it's not safe: '(aNode->IsNodeOfType(nsINode::eCONTENT) && static_cast<nsIContent*>(aNode)-> IsInNativeAnonymousSubtree()) || sScriptBlockerCount == sRemovableScriptBlockerCount', file c:/Users/Bas/Development/mozilla/firefox-debug-clean/content/base/src/../../../../mozilla-central-clean/content/base/src/nsContentUtils.cpp, line 3664

And then right before the crash:

JavaScript warning: http://www.google.com/finance/s/uNjjKDd8l3M/js/finance-opt.j
s?hl=en&gl=us, line 125: A long running script was terminated
Assignee: bas.schouten → roc
I examined the stack for the assertion. A windowless Flash object calls NPN_Evaluate during painting, running some page script "flashToXML" that apparently updates the DOM during painting. Something's obviously changed to make this crash more often than it used to, but fundamentally DOM modifications during painting have always been forbidden/deadly.

Bug 556487 will fix this for out-of-process plugins --- once we add Windows/Mac support.
blocking2.0: beta7+ → beta6+
Depends on: 552512, 556487
But I'll see if I can come up with some kind of workaround so at least we don't crash.
Attached patch Workaround (obsolete) — Splinter Review
This adds a global generation counter that increments every time nsCSSFrameConstructor sees a DOM update.

In FrameLayerBuilder, if we detect a DOM modification during painting, we bail out of our DrawThebesLayer callbacks ASAP.

This will not and cannot protect us against all crashes due to DOM changes during painting, but it definitely should help.
Attachment #475369 - Flags: review?(dbaron)
Attached patch test (obsolete) — Splinter Review
Testcase. Crashes trunk, does not crash with the patch.
Attachment #475370 - Flags: review?(dbaron)
Comment on attachment 475369 [details] [diff] [review]
Workaround

>+PRBool
>+FrameLayerBuilder::CheckDOMModified()
>+{
>+  if (mInitialDOMGeneration == nsCSSFrameConstructor::GetDOMGeneration())
>+    return PR_FALSE;
>+  if (mDetectedDOMModification)
>+    return PR_TRUE;
>+  mDetectedDOMModification = PR_TRUE;

Two problems here:
 * you fall off the end without returning
 * I think you meant to check if mDetectedDOMModification before the generation check (in case of counter wraparound); otherwise I don't see the point

So I think you meant:

if (mDetectedDOMModification)
  return PR_TRUE;
if (mInitialDOMGeneration == nsCSSFrameConstructor::GetDOMGeneration())
  return PR_FALSE;
NS_WARNING("Detected DOM modification during paint, bailing out!");
mDetectedDOMModification = PR_TRUE
return PR_TRUE;

or perhaps even:

if (mInitialDOMGeneration != nsCSSFrameConstructor::GetDOMGeneration())
{
  if (!mDetectedDOMModification) {
    NS_WARNING("Detected DOM modification during paint, bailing out!");
  }
  mDetectedDOMModification = PR_TRUE;
}
return mDetectedDOMModification;
r=dbaron with something like that
Attachment #475369 - Flags: review?(dbaron) → review+
(In reply to comment #10)
> >+PRBool
> >+FrameLayerBuilder::CheckDOMModified()
> >+{
> >+  if (mInitialDOMGeneration == nsCSSFrameConstructor::GetDOMGeneration())
> >+    return PR_FALSE;
> >+  if (mDetectedDOMModification)
> >+    return PR_TRUE;
> >+  mDetectedDOMModification = PR_TRUE;
> 
> Two problems here:
>  * you fall off the end without returning
>  * I think you meant to check if mDetectedDOMModification before the generation
> check (in case of counter wraparound); otherwise I don't see the point

To not completely spam the console with NS_WARNINGS, and also so that we can add invalidation or some other mitigation here if we decide to.
Attached patch workaround v2 (obsolete) — Splinter Review
Attachment #475369 - Attachment is obsolete: true
Attachment #475383 - Flags: review?(dbaron)
Whiteboard: [needs review]
Comment on attachment 475383 [details] [diff] [review]
workaround v2

ok, I suppose.  Though it seems like the member could be debug-only in that case, but I guess it doesn't matter.
Attachment #475383 - Flags: review?(dbaron) → review+
Keywords: checkin-needed
Whiteboard: [needs review] → [needs landing]
http://hg.mozilla.org/mozilla-central/rev/a08e5d5a3256
http://hg.mozilla.org/mozilla-central/rev/25b165f66af3
Status: ASSIGNED → RESOLVED
Closed: 14 years ago
Flags: in-testsuite+
Keywords: checkin-needed
Resolution: --- → FIXED
Whiteboard: [needs landing]
Target Milestone: --- → mozilla2.0b7
The assertions are caused by the test I added which mutates the DOM during painting. Those assertions are expected and should just be marked as such.

The reftest failures are because this patch broke SVG images (yay tests!). For SVG images we actually can do reflows etc during painting. This patch has to handle that. I'll have to think a bit to figure out the best way to do that.
Probably need to move to beta7 to not get out of the radar.
blocking2.0: beta6+ → beta7+
Attached patch workaround v3Splinter Review
This makes the DOM generation counter per-RootPresContext. Fixes the SVG-image test failures.
Attachment #475383 - Attachment is obsolete: true
Attachment #475759 - Flags: review?(dbaron)
Attached patch test v2 (obsolete) — Splinter Review
Add "asserts(4)" to the crashtest.list; that's how many assertions I get on Mac. If we get other assertion counts on other platforms, we should just loosen the asserts() clause to cover them.
Attachment #475370 - Attachment is obsolete: true
Whiteboard: [needs review]
Comment on attachment 475759 [details] [diff] [review]
workaround v3

r=dbaron
Attachment #475759 - Flags: review?(dbaron) → review+
And I'd recommend pushing to try tonight to check the assertion counts.
Keywords: checkin-needed
Whiteboard: [needs review] → [needs landing]
The Windows crashtest run yielded:

REFTEST TEST-UNEXPECTED-FAIL | file:///e:/builds/moz2_slave/tryserver-win32-debug-unittest-crashtest/build/reftest/tests/modules/plugin/test/crashtests/522512-1.html | timed out waiting for reftest-wait to be removed (after onload fired)
REFTEST TEST-UNEXPECTED-PASS | file:///e:/builds/moz2_slave/tryserver-win32-debug-unittest-crashtest/build/reftest/tests/modules/plugin/test/crashtests/522512-1.html | assertion count 0 is less than expected 4 assertions

There were also a bunch of other failures in that try push; I haven't investigated whether they were due to this patch or others, or due to pushing on top of a bad mozilla-central revision, or just all known intermittent (although a bunch of them do look known).
I belive the other are in fact known oranges.
I need to look into why the Windows test failed --- I have no idea. But I think we could land this with the test skipped on Windows.
Attachment #475761 - Attachment is obsolete: true
Attachment #475873 - Flags: review?(dbaron)
Status: REOPENED → RESOLVED
Closed: 14 years ago14 years ago
Keywords: checkin-needed
Resolution: --- → FIXED
Whiteboard: [needs landing]
Depends on: 597676
Scoobidiver or anyone else, can you verify that the test URL no longer causes a crash on Windows?
I disabled HW acceleration : no crash on the test URL.
Moreover, no crash in b7pre/20100918 build :
http://crash-stats.mozilla.com/query/query?build_id=20100918041706&do_query=1
It is fixed.
Depends on: 607578
Blocks: 591409
(In reply to comment #6)
> I examined the stack for the assertion. A windowless Flash object calls
> NPN_Evaluate during painting, running some page script "flashToXML" that
> apparently updates the DOM during painting. Something's obviously changed to
> make this crash more often than it used to, but fundamentally DOM modifications
> during painting have always been forbidden/deadly.

Is there anything the user needs to do to get NPN_Evaluate to be called during the paint event? Do you need to interact with the URL (http://www.google.com/finance?q=NYSE:RIG) at all? I ask because I am hoping this will help me see where the Flash Player is calling out to JavaScript while processing the paint event so I can plug this hole. Thanks.
You don't have to do anything special.

Note that on Firefox 4 trunk, on Windows, with out-of-process plugins enabled (the default), this will no longer assert because windowless plugin painting is asynchronous now. If you disable OOPP you should still be able to see this.
Crash Signature: [@ nsDisplayBackground::Paint(nsDisplayListBuilder*, nsIRenderingContext*) ]
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: