Crash [@ nsXBLBinding::GenerateAnonymousContent] with DOMAttrModified and oncommand removing element and xbl:inherits="xbl:text"

RESOLVED FIXED in mozilla1.9.2

Status

()

Core
XBL
P2
critical
RESOLVED FIXED
8 years ago
6 years ago

People

(Reporter: Martijn Wargers (dead), Assigned: Mats Palmgren (vacation - back in August))

Tracking

(4 keywords)

Trunk
mozilla1.9.2
crash, testcase, verified1.9.0.18, verified1.9.1
Points:
---
Bug Flags:
blocking1.9.2 +
in-testsuite ?

Firefox Tracking Flags

(status1.9.2 beta4-fixed, status1.9.1 .8-fixed)

Details

(Whiteboard: [sg:critical?], crash signature)

Attachments

(6 attachments, 3 obsolete attachments)

(Reporter)

Description

8 years ago
Created attachment 386497 [details]
testcase

See testcase, which crashes current trunk build and Firefox 3 after 100ms.

http://crash-stats.mozilla.com/report/index/b25ff7b7-dfdb-4b4e-b896-312272090702?p=1
0  	xul.dll  	xul.dll@0x421a98  	
1 	xul.dll 	nsBaseHashtable<nsISupportsHashKey,nsCSSFrameConstructor::RestyleData,nsCSSFrameConstructor::RestyleData>::s_EnumStub 	obj-firefox/dist/include/nsBaseHashtable.h:346
2 	xul.dll 	PL_DHashTableEnumerate 	obj-firefox/xpcom/build/pldhash.c:754
3 	xul.dll 	nsXBLBinding::GenerateAnonymousContent 	content/xbl/src/nsXBLBinding.cpp:794
4 	xul.dll 	nsXBLService::LoadBindings 	content/xbl/src/nsXBLService.cpp:664
5 	xul.dll 	nsElementSH::PostCreate 	dom/base/nsDOMClassInfo.cpp:7620
6 	xul.dll 	XPCWrappedNative::GetNewOrUsed 	js/src/xpconnect/src/xpcwrappednative.cpp:599

Firefox 3 crash report:
http://crash-stats.mozilla.com/report/index/aae1cfba-71ac-4698-b21c-6100c2090702?p=1
0  	xul.dll  	nsCOMPtr<nsIRDFNode>::operator=  	 nsCOMPtr.h:713
1 	xul.dll 	RealizeDefaultContent 	mozilla/content/xbl/src/nsXBLBinding.cpp:548
2 	xul.dll 	nsBaseHashtable<nsURIHashKey,CacheScriptEntry,CacheScriptEntry>::s_EnumStub 	nsBaseHashtable.h:346
3 	xul.dll 	PL_DHashTableEnumerate 	pldhash.c:724
4 	xul.dll 	nsXBLBinding::GenerateAnonymousContent 	mozilla/content/xbl/src/nsXBLBinding.cpp:775
5 	xul.dll 	nsXBLService::LoadBindings 	mozilla/content/xbl/src/nsXBLService.cpp:635
6 	xul.dll 	nsElementSH::PostCreate 	mozilla/dom/src/base/nsDOMClassInfo.cpp:7236
7 	xul.dll 	XPCWrappedNative::GetNewOrUsed 	mozilla/js/src/xpconnect/src/xpcwrappednative.cpp:546
8 	xul.dll 	XPCConvert::NativeInterface2JSObject 	mozilla/js/src/xpconnect/src/xpcconvert.cpp:1107
9 	xul.dll 	XPCConvert::NativeData2JS 	mozilla/js/src/xpconnect/src/xpcconvert.cpp:482
10 	xul.dll 	XPCWrappedNative::CallMethod 	mozilla/js/src/xpconnect/src/xpcwrappednative.cpp:2481
11 	xul.dll 	XPC_WN_CallMethod 	mozilla/js/src/xpconnect/src/xpcwrappednativejsops.cpp:1473
12 	js3250.dll 	js_Invoke 	mozilla/js/src/jsinterp.c:1310
13 	js3250.dll 	js_Interpret 	mozilla/js/src/jsinterp.c:4883
14 	js3250.dll 	js_Invoke 	mozilla/js/src/jsinterp.c:1326
15 	js3250.dll 	JS_CallFunctionValue 	mozilla/js/src/jsapi.c:5058
16 	xul.dll 	nsJSContext::CallEventHandler 	mozilla/dom/src/base/nsJSEnvironment.cpp:1962
17 	xul.dll 	nsGlobalWindow::RunTimeout 	mozilla/dom/src/base/nsGlobalWindow.cpp:7900
18 	xul.dll 	nsGlobalWindow::TimerCallback 	mozilla/dom/src/base/nsGlobalWindow.cpp:8231
19 	xul.dll 	nsTimerImpl::Fire 	mozilla/xpcom/threads/nsTimerImpl.cpp:400
20 	xul.dll 	nsTimerEvent::Run 	mozilla/xpcom/threads/nsTimerImpl.cpp:490
21 	xul.dll 	nsThread::ProcessNextEvent 	mozilla/xpcom/threads/nsThread.cpp:510
22 	xul.dll 	nsBaseAppShell::Run 	mozilla/widget/src/xpwidgets/nsBaseAppShell.cpp:170
23 	nspr4.dll 	PR_GetEnv 	
24 	firefox.exe 	wmain 	mozilla/toolkit/xre/nsWindowsWMain.cpp:87
25 	firefox.exe 	firefox.exe@0x217f 	
26 	kernel32.dll 	kernel32.dll@0x17076
(Reporter)

Comment 1

8 years ago
It doesn't seem to crash online, so you need to download the testcase to your computer and test there, to see the crash.
Created attachment 386506 [details]
stack

Deleting current nsXBLInsertionPoint while iterating mInsertionPointTable
Created attachment 386507 [details]
Hashtable assertion stack

We could use a nsRefPtr to avoid the above, but we would still get an
assertion on a recursive hash table op...
Created attachment 386509 [details] [diff] [review]
Like so?

Wrapping the enumerate op in a script blocker fixes it.

Comment 5

8 years ago
That makes sense.

Comment 6

8 years ago
Can we deal with script running when that scriptblocker goes out of scope?
(Reporter)

Comment 7

8 years ago
Ping?

Updated

8 years ago
Flags: blocking1.9.2?
Mats, can you drive this one in for 1.9.2?
Assignee: nobody → matspal
Flags: blocking1.9.2? → blocking1.9.2+
Priority: -- → P2
Target Milestone: --- → mozilla1.9.2
I'm reviewing the code to answer comment 6, and if there are additional
crash paths...
Mats, any updates here?
Mats, any updates? We need to get this in soon...
I got interrupted by a couple of other bugs, sorry.  I'll have a look at
this one now...
The crash stopped occurring on trunk 2009-10-07-03 -- 2009-10-08-03
http://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=4d5760a4f96f&tochange=abe269bb23ef
No crash on 1.9.2 tip either, but it does crash 1.9.1 tip.
I suspect the underlying bug is still there though, also on trunk.
I get this when loading the testcase on trunk:
JavaScript error: event.target.parentNode is null

I'm going to track down the exact change that fixed the bug/broke the test...
It's changeset eef814b58a9c that made it not crash anymore:
"Bug 502567. Get rid of the silly ShouldBuildChildFrames check."

It doesn't seem like a real fix though.  Boris?
It's changeset eef814b58a9c that made it not crash anymore:
"Bug 502567. Get rid of the silly ShouldBuildChildFrames check."

It doesn't seem like a real fix though.  Boris?

Comment 16

8 years ago
Yeah, I really don't know why that would have affected this bug, other than both being involved with xbl:text.
Created attachment 412447 [details]
Testcase #2

In a pre-Bug 502567 tree, this testcase doesn't crash but triggers:

###!!! ASSERTION: op == PL_DHASH_LOOKUP || RECURSION_LEVEL(table) == 0: 'op == PL_DHASH_LOOKUP || RECURSION_LEVEL(table) == 0', file pldhash.c, line 612
###!!! ASSERTION: Insertion parent should have NODE_IS_INSERTION_PARENT flag!: '!aParent || aParent->HasFlag(NODE_IS_INSERTION_PARENT)', file nsBindingManager.cpp, line 599

(so probably worth making a regression test from)
When bug 502567 removed the ShouldBuildChildFrames() from the if-condition
it made LoadBindings() occur earlier (in AddFrameConstructionItemsInternal).
Before bug 502567 it was delayed for the element with xbl:inherits="xbl:text"
until nsElementSH::PostCreate(), see the attached "Hashtable assertion stack".

It looks to me as it might still be possible to trigger the crash using
document.addBinding but I haven't been able to so far.
Created attachment 412484 [details] [diff] [review]
Patch rev. 2

Protect all "mInsertionPointTable->Enumerate" with nsAutoScriptBlocker.
Remove a redundant null document check.
Add a strong ref on "content".

Re: comment 6 -  Yes, I believe we can handle script running when they
go out of scope in these blocks.
Attachment #386509 - Attachment is obsolete: true
Attachment #412484 - Flags: review?(bzbarsky)
Created attachment 412485 [details] [diff] [review]
crashtest.diff
Created attachment 412488 [details] [diff] [review]
for 1.9.0 (including crashtests)

Minor difference due to an existing nsAutoScriptBlocker.  Moved it a bit
earlier so the resulting code is the same as in the trunk patch.
Attachment #412488 - Flags: review?(bzbarsky)
Boris, now that you pointed it out in bug 529087, we do in fact
access mBoundElement after the nsAutoScriptBlocker goes out of scope.
http://mxr.mozilla.org/mozilla-central/source/content/xbl/src/nsXBLBinding.cpp#814

So, I should probably just wrap that part too, i.e. all of
nsXBLBinding::GenerateAnonymousContent() except the first few lines
that calls mNextBinding->GenerateAnonymousContent();

Or, wrap the loop at the end inside a "if (mBoundElement)" depending
on what the answer is to your question in bug 529087 comment 3.
Attachment #412484 - Attachment is obsolete: true
Attachment #412484 - Flags: review?(bzbarsky)
Attachment #412488 - Attachment is obsolete: true
Attachment #412488 - Flags: review?(bzbarsky)
Mats, how close to a fix do you think we are here? We're really close to the code freeze and we'd love to see this blocker fixed before the freeze!
Created attachment 412771 [details] [diff] [review]
Patch rev. 3 (diff -w)

Blocking script locally in GenerateAnonymousContent() isn't enough,
because we can't trust the weak ref 'mBoundElement' so we would crash
in the AllowScripts() call in nsXBLBinding::InstallEventHandlers()
instead (as in bug 529087).
Fortunately there is only one consumer, nsXBLService::LoadBindings:
http://mxr.mozilla.org/mozilla-central/source/content/xbl/src/nsXBLService.cpp#623

Wrapping the whole block, line 623 -- 639, should be enough to fix that.
(A variation, if we can't block script through all of that, would be to
hold a kungFuDeathGrip on 'aContent' here (same as 'mBoundElement')
and block script in a narrower scope)

Also, in nsXBLBinding::ChangeDocument we have the same problem with
using of 'mBoundElement' (and maybe also 'aOldDocument') after the
script blocker goes out of scope, so we need a larger scope there.
Attachment #412771 - Flags: review?(bzbarsky)
Whiteboard: [needs review bzbarsky]

Comment 25

8 years ago
Comment on attachment 412771 [details] [diff] [review]
Patch rev. 3 (diff -w)

Looks reasonable.
Attachment #412771 - Flags: review?(bzbarsky) → review+
Comment on attachment 412771 [details] [diff] [review]
Patch rev. 3 (diff -w)

Security bug, so need sr too.
Attachment #412771 - Flags: superreview?(Olli.Pettay)
Whiteboard: [needs review bzbarsky] → [needs superreview olli]

Updated

8 years ago
Attachment #412771 - Flags: superreview?(Olli.Pettay) → superreview+
http://hg.mozilla.org/mozilla-central/rev/d0badc9bc496

Will push the crashtests when older branches have shipped the fix.
Status: NEW → RESOLVED
Last Resolved: 8 years ago
Flags: in-testsuite?
OS: Windows XP → All
Hardware: x86 → All
Resolution: --- → FIXED
Whiteboard: [needs superreview olli]
(Reporter)

Updated

8 years ago
Status: RESOLVED → VERIFIED
(Reporter)

Updated

8 years ago
Status: VERIFIED → RESOLVED
Last Resolved: 8 years ago8 years ago
http://hg.mozilla.org/releases/mozilla-1.9.2/rev/28c179d1b30e
status1.9.2: --- → final-fixed
Attachment #412771 - Flags: approval1.9.1.7?
Attachment #412771 - Flags: approval1.9.0.17?
Comment on attachment 412771 [details] [diff] [review]
Patch rev. 3 (diff -w)

Approved for 1.9.1.7 and 1.9.0.17, a=dveditz for release-drivers
Attachment #412771 - Flags: approval1.9.1.7?
Attachment #412771 - Flags: approval1.9.1.7+
Attachment #412771 - Flags: approval1.9.0.17?
Attachment #412771 - Flags: approval1.9.0.17+
Whiteboard: [sg:critical?]
http://hg.mozilla.org/releases/mozilla-1.9.1/rev/e2e5b13d483c

CVS trunk:
mozilla/content/xbl/src/nsXBLBinding.cpp 	1.269
mozilla/content/xbl/src/nsXBLService.cpp 	1.261
status1.9.1: --- → .7-fixed
Keywords: fixed1.9.0.17
Did this cause orange on the Firefox3.5-Unittest tree?
I don't think so, the U/M/E unit test runs for my changeset were all green
except for one mochitest failure in test_videocontrols_audio_direction.html
which I think is a known intermittent orange.  The run after my change
has a few orange M with failed local storage tests - I don't think
these are related to XBL at all.  There's also some orange E with a lot
of tests timing out, hard to say what that could be...  Is there a way we
can force more unit test runs so we can get a few more cycles to see
if it's consistent?
The 3.5 unit test Tinderboxes are green now.
Yeah, sorry, I should have followed up. Talking with the build team, apparently it was building the tag (which was at the tip) not actually mozilla-1.9.1. Checking in something to mozilla-1.9.1 made it go green.
Verified for 1.9.1 using the nightly build on XP SP3 with Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.1.8pre) Gecko/20100115 Shiretoko/3.5.8pre (.NET CLR 3.5.30729) and local testcase. Same testcase crashes 1.9.1.7 reliably when run.
Keywords: verified1.9.1
Verified for 1.9.0.18 using Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.0.16pre) Gecko/2010020215 GranParadiso/3.0.18pre (.NET CLR 3.5.30729) (my debug). Testcases crash 1.9.0.17 reliably when run locally and don't affect the 1.9.0.18 build.
Keywords: fixed1.9.0.18 → verified1.9.0.18
Group: core-security
Crash Signature: [@ nsXBLBinding::GenerateAnonymousContent]
You need to log in before you can comment on or make changes to this bug.