Closed Bug 501934 Opened 16 years ago Closed 16 years ago

Crash [@ nsXBLBinding::GenerateAnonymousContent] with DOMAttrModified and oncommand removing element and xbl:inherits="xbl:text"

Categories

(Core :: XBL, defect, P2)

defect

Tracking

()

RESOLVED FIXED
mozilla1.9.2
Tracking Status
status1.9.2 --- beta4-fixed
status1.9.1 --- .8-fixed

People

(Reporter: martijn.martijn, Assigned: MatsPalmgren_bugz)

Details

(4 keywords, Whiteboard: [sg:critical?])

Crash Data

Attachments

(6 files, 3 obsolete files)

Attached file testcase
See testcase, which crashes current trunk build and Firefox 3 after 100ms. http://crash-stats.mozilla.com/report/index/b25ff7b7-dfdb-4b4e-b896-312272090702?p=1 0 xul.dll xul.dll@0x421a98 1 xul.dll nsBaseHashtable<nsISupportsHashKey,nsCSSFrameConstructor::RestyleData,nsCSSFrameConstructor::RestyleData>::s_EnumStub obj-firefox/dist/include/nsBaseHashtable.h:346 2 xul.dll PL_DHashTableEnumerate obj-firefox/xpcom/build/pldhash.c:754 3 xul.dll nsXBLBinding::GenerateAnonymousContent content/xbl/src/nsXBLBinding.cpp:794 4 xul.dll nsXBLService::LoadBindings content/xbl/src/nsXBLService.cpp:664 5 xul.dll nsElementSH::PostCreate dom/base/nsDOMClassInfo.cpp:7620 6 xul.dll XPCWrappedNative::GetNewOrUsed js/src/xpconnect/src/xpcwrappednative.cpp:599 Firefox 3 crash report: http://crash-stats.mozilla.com/report/index/aae1cfba-71ac-4698-b21c-6100c2090702?p=1 0 xul.dll nsCOMPtr<nsIRDFNode>::operator= nsCOMPtr.h:713 1 xul.dll RealizeDefaultContent mozilla/content/xbl/src/nsXBLBinding.cpp:548 2 xul.dll nsBaseHashtable<nsURIHashKey,CacheScriptEntry,CacheScriptEntry>::s_EnumStub nsBaseHashtable.h:346 3 xul.dll PL_DHashTableEnumerate pldhash.c:724 4 xul.dll nsXBLBinding::GenerateAnonymousContent mozilla/content/xbl/src/nsXBLBinding.cpp:775 5 xul.dll nsXBLService::LoadBindings mozilla/content/xbl/src/nsXBLService.cpp:635 6 xul.dll nsElementSH::PostCreate mozilla/dom/src/base/nsDOMClassInfo.cpp:7236 7 xul.dll XPCWrappedNative::GetNewOrUsed mozilla/js/src/xpconnect/src/xpcwrappednative.cpp:546 8 xul.dll XPCConvert::NativeInterface2JSObject mozilla/js/src/xpconnect/src/xpcconvert.cpp:1107 9 xul.dll XPCConvert::NativeData2JS mozilla/js/src/xpconnect/src/xpcconvert.cpp:482 10 xul.dll XPCWrappedNative::CallMethod mozilla/js/src/xpconnect/src/xpcwrappednative.cpp:2481 11 xul.dll XPC_WN_CallMethod mozilla/js/src/xpconnect/src/xpcwrappednativejsops.cpp:1473 12 js3250.dll js_Invoke mozilla/js/src/jsinterp.c:1310 13 js3250.dll js_Interpret mozilla/js/src/jsinterp.c:4883 14 js3250.dll js_Invoke mozilla/js/src/jsinterp.c:1326 15 js3250.dll JS_CallFunctionValue mozilla/js/src/jsapi.c:5058 16 xul.dll nsJSContext::CallEventHandler mozilla/dom/src/base/nsJSEnvironment.cpp:1962 17 xul.dll nsGlobalWindow::RunTimeout mozilla/dom/src/base/nsGlobalWindow.cpp:7900 18 xul.dll nsGlobalWindow::TimerCallback mozilla/dom/src/base/nsGlobalWindow.cpp:8231 19 xul.dll nsTimerImpl::Fire mozilla/xpcom/threads/nsTimerImpl.cpp:400 20 xul.dll nsTimerEvent::Run mozilla/xpcom/threads/nsTimerImpl.cpp:490 21 xul.dll nsThread::ProcessNextEvent mozilla/xpcom/threads/nsThread.cpp:510 22 xul.dll nsBaseAppShell::Run mozilla/widget/src/xpwidgets/nsBaseAppShell.cpp:170 23 nspr4.dll PR_GetEnv 24 firefox.exe wmain mozilla/toolkit/xre/nsWindowsWMain.cpp:87 25 firefox.exe firefox.exe@0x217f 26 kernel32.dll kernel32.dll@0x17076
It doesn't seem to crash online, so you need to download the testcase to your computer and test there, to see the crash.
Attached file stack
Deleting current nsXBLInsertionPoint while iterating mInsertionPointTable
We could use a nsRefPtr to avoid the above, but we would still get an assertion on a recursive hash table op...
Attached patch Like so? (obsolete) — Splinter Review
Wrapping the enumerate op in a script blocker fixes it.
That makes sense.
Can we deal with script running when that scriptblocker goes out of scope?
Ping?
Flags: blocking1.9.2?
Mats, can you drive this one in for 1.9.2?
Assignee: nobody → matspal
Flags: blocking1.9.2? → blocking1.9.2+
Priority: -- → P2
Target Milestone: --- → mozilla1.9.2
I'm reviewing the code to answer comment 6, and if there are additional crash paths...
Mats, any updates here?
Mats, any updates? We need to get this in soon...
I got interrupted by a couple of other bugs, sorry. I'll have a look at this one now...
The crash stopped occurring on trunk 2009-10-07-03 -- 2009-10-08-03 http://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=4d5760a4f96f&tochange=abe269bb23ef No crash on 1.9.2 tip either, but it does crash 1.9.1 tip. I suspect the underlying bug is still there though, also on trunk. I get this when loading the testcase on trunk: JavaScript error: event.target.parentNode is null I'm going to track down the exact change that fixed the bug/broke the test...
It's changeset eef814b58a9c that made it not crash anymore: "Bug 502567. Get rid of the silly ShouldBuildChildFrames check." It doesn't seem like a real fix though. Boris?
It's changeset eef814b58a9c that made it not crash anymore: "Bug 502567. Get rid of the silly ShouldBuildChildFrames check." It doesn't seem like a real fix though. Boris?
Yeah, I really don't know why that would have affected this bug, other than both being involved with xbl:text.
Attached file Testcase #2
In a pre-Bug 502567 tree, this testcase doesn't crash but triggers: ###!!! ASSERTION: op == PL_DHASH_LOOKUP || RECURSION_LEVEL(table) == 0: 'op == PL_DHASH_LOOKUP || RECURSION_LEVEL(table) == 0', file pldhash.c, line 612 ###!!! ASSERTION: Insertion parent should have NODE_IS_INSERTION_PARENT flag!: '!aParent || aParent->HasFlag(NODE_IS_INSERTION_PARENT)', file nsBindingManager.cpp, line 599 (so probably worth making a regression test from)
When bug 502567 removed the ShouldBuildChildFrames() from the if-condition it made LoadBindings() occur earlier (in AddFrameConstructionItemsInternal). Before bug 502567 it was delayed for the element with xbl:inherits="xbl:text" until nsElementSH::PostCreate(), see the attached "Hashtable assertion stack". It looks to me as it might still be possible to trigger the crash using document.addBinding but I haven't been able to so far.
Attached patch Patch rev. 2 (obsolete) — Splinter Review
Protect all "mInsertionPointTable->Enumerate" with nsAutoScriptBlocker. Remove a redundant null document check. Add a strong ref on "content". Re: comment 6 - Yes, I believe we can handle script running when they go out of scope in these blocks.
Attachment #386509 - Attachment is obsolete: true
Attachment #412484 - Flags: review?(bzbarsky)
Attached patch crashtest.diffSplinter Review
Attached patch for 1.9.0 (including crashtests) (obsolete) — Splinter Review
Minor difference due to an existing nsAutoScriptBlocker. Moved it a bit earlier so the resulting code is the same as in the trunk patch.
Attachment #412488 - Flags: review?(bzbarsky)
Boris, now that you pointed it out in bug 529087, we do in fact access mBoundElement after the nsAutoScriptBlocker goes out of scope. http://mxr.mozilla.org/mozilla-central/source/content/xbl/src/nsXBLBinding.cpp#814 So, I should probably just wrap that part too, i.e. all of nsXBLBinding::GenerateAnonymousContent() except the first few lines that calls mNextBinding->GenerateAnonymousContent(); Or, wrap the loop at the end inside a "if (mBoundElement)" depending on what the answer is to your question in bug 529087 comment 3.
Attachment #412484 - Attachment is obsolete: true
Attachment #412484 - Flags: review?(bzbarsky)
Attachment #412488 - Attachment is obsolete: true
Attachment #412488 - Flags: review?(bzbarsky)
Mats, how close to a fix do you think we are here? We're really close to the code freeze and we'd love to see this blocker fixed before the freeze!
Blocking script locally in GenerateAnonymousContent() isn't enough, because we can't trust the weak ref 'mBoundElement' so we would crash in the AllowScripts() call in nsXBLBinding::InstallEventHandlers() instead (as in bug 529087). Fortunately there is only one consumer, nsXBLService::LoadBindings: http://mxr.mozilla.org/mozilla-central/source/content/xbl/src/nsXBLService.cpp#623 Wrapping the whole block, line 623 -- 639, should be enough to fix that. (A variation, if we can't block script through all of that, would be to hold a kungFuDeathGrip on 'aContent' here (same as 'mBoundElement') and block script in a narrower scope) Also, in nsXBLBinding::ChangeDocument we have the same problem with using of 'mBoundElement' (and maybe also 'aOldDocument') after the script blocker goes out of scope, so we need a larger scope there.
Attachment #412771 - Flags: review?(bzbarsky)
Whiteboard: [needs review bzbarsky]
Comment on attachment 412771 [details] [diff] [review] Patch rev. 3 (diff -w) Looks reasonable.
Attachment #412771 - Flags: review?(bzbarsky) → review+
Comment on attachment 412771 [details] [diff] [review] Patch rev. 3 (diff -w) Security bug, so need sr too.
Attachment #412771 - Flags: superreview?(Olli.Pettay)
Whiteboard: [needs review bzbarsky] → [needs superreview olli]
Attachment #412771 - Flags: superreview?(Olli.Pettay) → superreview+
http://hg.mozilla.org/mozilla-central/rev/d0badc9bc496 Will push the crashtests when older branches have shipped the fix.
Status: NEW → RESOLVED
Closed: 16 years ago
Flags: in-testsuite?
OS: Windows XP → All
Hardware: x86 → All
Resolution: --- → FIXED
Whiteboard: [needs superreview olli]
Status: RESOLVED → VERIFIED
Status: VERIFIED → RESOLVED
Closed: 16 years ago16 years ago
Attachment #412771 - Flags: approval1.9.1.7?
Attachment #412771 - Flags: approval1.9.0.17?
Comment on attachment 412771 [details] [diff] [review] Patch rev. 3 (diff -w) Approved for 1.9.1.7 and 1.9.0.17, a=dveditz for release-drivers
Attachment #412771 - Flags: approval1.9.1.7?
Attachment #412771 - Flags: approval1.9.1.7+
Attachment #412771 - Flags: approval1.9.0.17?
Attachment #412771 - Flags: approval1.9.0.17+
Whiteboard: [sg:critical?]
http://hg.mozilla.org/releases/mozilla-1.9.1/rev/e2e5b13d483c CVS trunk: mozilla/content/xbl/src/nsXBLBinding.cpp 1.269 mozilla/content/xbl/src/nsXBLService.cpp 1.261
Did this cause orange on the Firefox3.5-Unittest tree?
I don't think so, the U/M/E unit test runs for my changeset were all green except for one mochitest failure in test_videocontrols_audio_direction.html which I think is a known intermittent orange. The run after my change has a few orange M with failed local storage tests - I don't think these are related to XBL at all. There's also some orange E with a lot of tests timing out, hard to say what that could be... Is there a way we can force more unit test runs so we can get a few more cycles to see if it's consistent?
The 3.5 unit test Tinderboxes are green now.
Yeah, sorry, I should have followed up. Talking with the build team, apparently it was building the tag (which was at the tip) not actually mozilla-1.9.1. Checking in something to mozilla-1.9.1 made it go green.
Verified for 1.9.1 using the nightly build on XP SP3 with Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.1.8pre) Gecko/20100115 Shiretoko/3.5.8pre (.NET CLR 3.5.30729) and local testcase. Same testcase crashes 1.9.1.7 reliably when run.
Keywords: verified1.9.1
Verified for 1.9.0.18 using Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.0.16pre) Gecko/2010020215 GranParadiso/3.0.18pre (.NET CLR 3.5.30729) (my debug). Testcases crash 1.9.0.17 reliably when run locally and don't affect the 1.9.0.18 build.
Group: core-security
Crash Signature: [@ nsXBLBinding::GenerateAnonymousContent]
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: