Last Comment Bug 501934 - Crash [@ nsXBLBinding::GenerateAnonymousContent] with DOMAttrModified and oncommand removing element and xbl:inherits="xbl:text"
: Crash [@ nsXBLBinding::GenerateAnonymousContent] with DOMAttrModified and onc...
Status: RESOLVED FIXED
[sg:critical?]
: crash, testcase, verified1.9.0.18, verified1.9.1
Product: Core
Classification: Components
Component: XBL (show other bugs)
: Trunk
: All All
: P2 critical (vote)
: mozilla1.9.2
Assigned To: Mats Palmgren (vacation)
:
Mentors:
Depends on:
Blocks:
  Show dependency treegraph
 
Reported: 2009-07-02 05:23 PDT by Martijn Wargers [:mwargers] (not working for Mozilla)
Modified: 2011-06-13 10:01 PDT (History)
8 users (show)
jst: blocking1.9.2+
mats: in‑testsuite?
See Also:
Crash Signature:
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---
beta4-fixed
.8-fixed


Attachments
testcase (926 bytes, application/vnd.mozilla.xul+xml)
2009-07-02 05:23 PDT, Martijn Wargers [:mwargers] (not working for Mozilla)
no flags Details
stack (13.94 KB, text/html)
2009-07-02 07:19 PDT, Mats Palmgren (vacation)
no flags Details
Hashtable assertion stack (14.53 KB, text/plain)
2009-07-02 07:24 PDT, Mats Palmgren (vacation)
no flags Details
Like so? (886 bytes, patch)
2009-07-02 07:41 PDT, Mats Palmgren (vacation)
no flags Details | Diff | Splinter Review
Testcase #2 (1000 bytes, application/vnd.mozilla.xul+xml)
2009-11-14 19:46 PST, Mats Palmgren (vacation)
no flags Details
Patch rev. 2 (2.19 KB, patch)
2009-11-15 12:13 PST, Mats Palmgren (vacation)
no flags Details | Diff | Splinter Review
crashtest.diff (2.69 KB, patch)
2009-11-15 12:14 PST, Mats Palmgren (vacation)
no flags Details | Diff | Splinter Review
for 1.9.0 (including crashtests) (7.04 KB, patch)
2009-11-15 12:18 PST, Mats Palmgren (vacation)
no flags Details | Diff | Splinter Review
Patch rev. 3 (diff -w) (4.26 KB, patch)
2009-11-16 20:39 PST, Mats Palmgren (vacation)
bzbarsky: review+
bugs: superreview+
dveditz: approval1.9.1.8+
dveditz: approval1.9.0.18+
Details | Diff | Splinter Review

Description Martijn Wargers [:mwargers] (not working for Mozilla) 2009-07-02 05:23:43 PDT
Created attachment 386497 [details]
testcase

See testcase, which crashes current trunk build and Firefox 3 after 100ms.

http://crash-stats.mozilla.com/report/index/b25ff7b7-dfdb-4b4e-b896-312272090702?p=1
0  	xul.dll  	xul.dll@0x421a98  	
1 	xul.dll 	nsBaseHashtable<nsISupportsHashKey,nsCSSFrameConstructor::RestyleData,nsCSSFrameConstructor::RestyleData>::s_EnumStub 	obj-firefox/dist/include/nsBaseHashtable.h:346
2 	xul.dll 	PL_DHashTableEnumerate 	obj-firefox/xpcom/build/pldhash.c:754
3 	xul.dll 	nsXBLBinding::GenerateAnonymousContent 	content/xbl/src/nsXBLBinding.cpp:794
4 	xul.dll 	nsXBLService::LoadBindings 	content/xbl/src/nsXBLService.cpp:664
5 	xul.dll 	nsElementSH::PostCreate 	dom/base/nsDOMClassInfo.cpp:7620
6 	xul.dll 	XPCWrappedNative::GetNewOrUsed 	js/src/xpconnect/src/xpcwrappednative.cpp:599

Firefox 3 crash report:
http://crash-stats.mozilla.com/report/index/aae1cfba-71ac-4698-b21c-6100c2090702?p=1
0  	xul.dll  	nsCOMPtr<nsIRDFNode>::operator=  	 nsCOMPtr.h:713
1 	xul.dll 	RealizeDefaultContent 	mozilla/content/xbl/src/nsXBLBinding.cpp:548
2 	xul.dll 	nsBaseHashtable<nsURIHashKey,CacheScriptEntry,CacheScriptEntry>::s_EnumStub 	nsBaseHashtable.h:346
3 	xul.dll 	PL_DHashTableEnumerate 	pldhash.c:724
4 	xul.dll 	nsXBLBinding::GenerateAnonymousContent 	mozilla/content/xbl/src/nsXBLBinding.cpp:775
5 	xul.dll 	nsXBLService::LoadBindings 	mozilla/content/xbl/src/nsXBLService.cpp:635
6 	xul.dll 	nsElementSH::PostCreate 	mozilla/dom/src/base/nsDOMClassInfo.cpp:7236
7 	xul.dll 	XPCWrappedNative::GetNewOrUsed 	mozilla/js/src/xpconnect/src/xpcwrappednative.cpp:546
8 	xul.dll 	XPCConvert::NativeInterface2JSObject 	mozilla/js/src/xpconnect/src/xpcconvert.cpp:1107
9 	xul.dll 	XPCConvert::NativeData2JS 	mozilla/js/src/xpconnect/src/xpcconvert.cpp:482
10 	xul.dll 	XPCWrappedNative::CallMethod 	mozilla/js/src/xpconnect/src/xpcwrappednative.cpp:2481
11 	xul.dll 	XPC_WN_CallMethod 	mozilla/js/src/xpconnect/src/xpcwrappednativejsops.cpp:1473
12 	js3250.dll 	js_Invoke 	mozilla/js/src/jsinterp.c:1310
13 	js3250.dll 	js_Interpret 	mozilla/js/src/jsinterp.c:4883
14 	js3250.dll 	js_Invoke 	mozilla/js/src/jsinterp.c:1326
15 	js3250.dll 	JS_CallFunctionValue 	mozilla/js/src/jsapi.c:5058
16 	xul.dll 	nsJSContext::CallEventHandler 	mozilla/dom/src/base/nsJSEnvironment.cpp:1962
17 	xul.dll 	nsGlobalWindow::RunTimeout 	mozilla/dom/src/base/nsGlobalWindow.cpp:7900
18 	xul.dll 	nsGlobalWindow::TimerCallback 	mozilla/dom/src/base/nsGlobalWindow.cpp:8231
19 	xul.dll 	nsTimerImpl::Fire 	mozilla/xpcom/threads/nsTimerImpl.cpp:400
20 	xul.dll 	nsTimerEvent::Run 	mozilla/xpcom/threads/nsTimerImpl.cpp:490
21 	xul.dll 	nsThread::ProcessNextEvent 	mozilla/xpcom/threads/nsThread.cpp:510
22 	xul.dll 	nsBaseAppShell::Run 	mozilla/widget/src/xpwidgets/nsBaseAppShell.cpp:170
23 	nspr4.dll 	PR_GetEnv 	
24 	firefox.exe 	wmain 	mozilla/toolkit/xre/nsWindowsWMain.cpp:87
25 	firefox.exe 	firefox.exe@0x217f 	
26 	kernel32.dll 	kernel32.dll@0x17076
Comment 1 Martijn Wargers [:mwargers] (not working for Mozilla) 2009-07-02 05:25:44 PDT
It doesn't seem to crash online, so you need to download the testcase to your computer and test there, to see the crash.
Comment 2 Mats Palmgren (vacation) 2009-07-02 07:19:59 PDT
Created attachment 386506 [details]
stack

Deleting current nsXBLInsertionPoint while iterating mInsertionPointTable
Comment 3 Mats Palmgren (vacation) 2009-07-02 07:24:46 PDT
Created attachment 386507 [details]
Hashtable assertion stack

We could use a nsRefPtr to avoid the above, but we would still get an
assertion on a recursive hash table op...
Comment 4 Mats Palmgren (vacation) 2009-07-02 07:41:02 PDT
Created attachment 386509 [details] [diff] [review]
Like so?

Wrapping the enumerate op in a script blocker fixes it.
Comment 5 Olli Pettay [:smaug] 2009-07-02 07:44:18 PDT
That makes sense.
Comment 6 Boris Zbarsky [:bz] 2009-07-02 15:28:21 PDT
Can we deal with script running when that scriptblocker goes out of scope?
Comment 7 Martijn Wargers [:mwargers] (not working for Mozilla) 2009-10-14 08:37:38 PDT
Ping?
Comment 8 Johnny Stenback (:jst, jst@mozilla.com) 2009-10-16 17:08:02 PDT
Mats, can you drive this one in for 1.9.2?
Comment 9 Mats Palmgren (vacation) 2009-10-20 12:33:29 PDT
I'm reviewing the code to answer comment 6, and if there are additional
crash paths...
Comment 10 Johnny Stenback (:jst, jst@mozilla.com) 2009-10-27 17:32:14 PDT
Mats, any updates here?
Comment 11 Johnny Stenback (:jst, jst@mozilla.com) 2009-11-03 09:04:08 PST
Mats, any updates? We need to get this in soon...
Comment 12 Mats Palmgren (vacation) 2009-11-03 11:14:38 PST
I got interrupted by a couple of other bugs, sorry.  I'll have a look at
this one now...
Comment 13 Mats Palmgren (vacation) 2009-11-10 15:55:57 PST
The crash stopped occurring on trunk 2009-10-07-03 -- 2009-10-08-03
http://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=4d5760a4f96f&tochange=abe269bb23ef
No crash on 1.9.2 tip either, but it does crash 1.9.1 tip.
I suspect the underlying bug is still there though, also on trunk.
I get this when loading the testcase on trunk:
JavaScript error: event.target.parentNode is null

I'm going to track down the exact change that fixed the bug/broke the test...
Comment 14 Mats Palmgren (vacation) 2009-11-11 20:46:21 PST
It's changeset eef814b58a9c that made it not crash anymore:
"Bug 502567. Get rid of the silly ShouldBuildChildFrames check."

It doesn't seem like a real fix though.  Boris?
Comment 15 Mats Palmgren (vacation) 2009-11-11 20:47:09 PST
It's changeset eef814b58a9c that made it not crash anymore:
"Bug 502567. Get rid of the silly ShouldBuildChildFrames check."

It doesn't seem like a real fix though.  Boris?
Comment 16 Boris Zbarsky [:bz] 2009-11-11 20:54:17 PST
Yeah, I really don't know why that would have affected this bug, other than both being involved with xbl:text.
Comment 17 Mats Palmgren (vacation) 2009-11-14 19:46:31 PST
Created attachment 412447 [details]
Testcase #2

In a pre-Bug 502567 tree, this testcase doesn't crash but triggers:

###!!! ASSERTION: op == PL_DHASH_LOOKUP || RECURSION_LEVEL(table) == 0: 'op == PL_DHASH_LOOKUP || RECURSION_LEVEL(table) == 0', file pldhash.c, line 612
###!!! ASSERTION: Insertion parent should have NODE_IS_INSERTION_PARENT flag!: '!aParent || aParent->HasFlag(NODE_IS_INSERTION_PARENT)', file nsBindingManager.cpp, line 599

(so probably worth making a regression test from)
Comment 18 Mats Palmgren (vacation) 2009-11-15 12:07:01 PST
When bug 502567 removed the ShouldBuildChildFrames() from the if-condition
it made LoadBindings() occur earlier (in AddFrameConstructionItemsInternal).
Before bug 502567 it was delayed for the element with xbl:inherits="xbl:text"
until nsElementSH::PostCreate(), see the attached "Hashtable assertion stack".

It looks to me as it might still be possible to trigger the crash using
document.addBinding but I haven't been able to so far.
Comment 19 Mats Palmgren (vacation) 2009-11-15 12:13:38 PST
Created attachment 412484 [details] [diff] [review]
Patch rev. 2

Protect all "mInsertionPointTable->Enumerate" with nsAutoScriptBlocker.
Remove a redundant null document check.
Add a strong ref on "content".

Re: comment 6 -  Yes, I believe we can handle script running when they
go out of scope in these blocks.
Comment 20 Mats Palmgren (vacation) 2009-11-15 12:14:12 PST
Created attachment 412485 [details] [diff] [review]
crashtest.diff
Comment 21 Mats Palmgren (vacation) 2009-11-15 12:18:16 PST
Created attachment 412488 [details] [diff] [review]
for 1.9.0 (including crashtests)

Minor difference due to an existing nsAutoScriptBlocker.  Moved it a bit
earlier so the resulting code is the same as in the trunk patch.
Comment 22 Mats Palmgren (vacation) 2009-11-16 13:43:15 PST
Boris, now that you pointed it out in bug 529087, we do in fact
access mBoundElement after the nsAutoScriptBlocker goes out of scope.
http://mxr.mozilla.org/mozilla-central/source/content/xbl/src/nsXBLBinding.cpp#814

So, I should probably just wrap that part too, i.e. all of
nsXBLBinding::GenerateAnonymousContent() except the first few lines
that calls mNextBinding->GenerateAnonymousContent();

Or, wrap the loop at the end inside a "if (mBoundElement)" depending
on what the answer is to your question in bug 529087 comment 3.
Comment 23 Johnny Stenback (:jst, jst@mozilla.com) 2009-11-16 18:37:54 PST
Mats, how close to a fix do you think we are here? We're really close to the code freeze and we'd love to see this blocker fixed before the freeze!
Comment 24 Mats Palmgren (vacation) 2009-11-16 20:39:15 PST
Created attachment 412771 [details] [diff] [review]
Patch rev. 3 (diff -w)

Blocking script locally in GenerateAnonymousContent() isn't enough,
because we can't trust the weak ref 'mBoundElement' so we would crash
in the AllowScripts() call in nsXBLBinding::InstallEventHandlers()
instead (as in bug 529087).
Fortunately there is only one consumer, nsXBLService::LoadBindings:
http://mxr.mozilla.org/mozilla-central/source/content/xbl/src/nsXBLService.cpp#623

Wrapping the whole block, line 623 -- 639, should be enough to fix that.
(A variation, if we can't block script through all of that, would be to
hold a kungFuDeathGrip on 'aContent' here (same as 'mBoundElement')
and block script in a narrower scope)

Also, in nsXBLBinding::ChangeDocument we have the same problem with
using of 'mBoundElement' (and maybe also 'aOldDocument') after the
script blocker goes out of scope, so we need a larger scope there.
Comment 25 Boris Zbarsky [:bz] 2009-11-17 20:10:33 PST
Comment on attachment 412771 [details] [diff] [review]
Patch rev. 3 (diff -w)

Looks reasonable.
Comment 26 Mats Palmgren (vacation) 2009-11-17 22:37:43 PST
Comment on attachment 412771 [details] [diff] [review]
Patch rev. 3 (diff -w)

Security bug, so need sr too.
Comment 27 Mats Palmgren (vacation) 2009-11-18 07:21:10 PST
http://hg.mozilla.org/mozilla-central/rev/d0badc9bc496

Will push the crashtests when older branches have shipped the fix.
Comment 28 Mats Palmgren (vacation) 2009-11-18 20:20:38 PST
http://hg.mozilla.org/releases/mozilla-1.9.2/rev/28c179d1b30e
Comment 29 Daniel Veditz [:dveditz] 2009-12-02 15:46:11 PST
Comment on attachment 412771 [details] [diff] [review]
Patch rev. 3 (diff -w)

Approved for 1.9.1.7 and 1.9.0.17, a=dveditz for release-drivers
Comment 30 Mats Palmgren (vacation) 2009-12-04 09:01:09 PST
http://hg.mozilla.org/releases/mozilla-1.9.1/rev/e2e5b13d483c

CVS trunk:
mozilla/content/xbl/src/nsXBLBinding.cpp 	1.269
mozilla/content/xbl/src/nsXBLService.cpp 	1.261
Comment 31 Samuel Sidler (old account; do not CC) 2009-12-07 22:04:53 PST
Did this cause orange on the Firefox3.5-Unittest tree?
Comment 32 Mats Palmgren (vacation) 2009-12-08 00:10:21 PST
I don't think so, the U/M/E unit test runs for my changeset were all green
except for one mochitest failure in test_videocontrols_audio_direction.html
which I think is a known intermittent orange.  The run after my change
has a few orange M with failed local storage tests - I don't think
these are related to XBL at all.  There's also some orange E with a lot
of tests timing out, hard to say what that could be...  Is there a way we
can force more unit test runs so we can get a few more cycles to see
if it's consistent?
Comment 33 Mats Palmgren (vacation) 2009-12-09 13:17:16 PST
The 3.5 unit test Tinderboxes are green now.
Comment 34 Samuel Sidler (old account; do not CC) 2009-12-10 12:22:36 PST
Yeah, sorry, I should have followed up. Talking with the build team, apparently it was building the tag (which was at the tip) not actually mozilla-1.9.1. Checking in something to mozilla-1.9.1 made it go green.
Comment 35 Al Billings [:abillings] 2010-01-15 17:10:07 PST
Verified for 1.9.1 using the nightly build on XP SP3 with Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.1.8pre) Gecko/20100115 Shiretoko/3.5.8pre (.NET CLR 3.5.30729) and local testcase. Same testcase crashes 1.9.1.7 reliably when run.
Comment 36 Al Billings [:abillings] 2010-02-02 16:59:26 PST
Verified for 1.9.0.18 using Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.0.16pre) Gecko/2010020215 GranParadiso/3.0.18pre (.NET CLR 3.5.30729) (my debug). Testcases crash 1.9.0.17 reliably when run locally and don't affect the 1.9.0.18 build.

Note You need to log in before you can comment on or make changes to this bug.