Last Comment Bug 329982 - Crash [@ nsXULElement::RemoveChildAt] involving DOMNodeRemoved mutation event
: Crash [@ nsXULElement::RemoveChildAt] involving DOMNodeRemoved mutation event
Status: RESOLVED FIXED
[sg:investigate] out of array bounds
: crash, fixed1.8.1, testcase, verified1.8.0.4
Product: Core
Classification: Components
Component: XUL (show other bugs)
: Trunk
: PowerPC Mac OS X
: -- critical (vote)
: ---
Assigned To: Jonas Sicking (:sicking) PTO Until July 5th
:
Mentors:
Depends on: 325730 338391
Blocks: 325861
  Show dependency treegraph
 
Reported: 2006-03-09 16:58 PST by Jesse Ruderman
Modified: 2011-06-13 10:01 PDT (History)
6 users (show)
dveditz: blocking1.8.1+
darin.moz: blocking1.8.0.4+
jruderman: in‑testsuite+
See Also:
Crash Signature:
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---


Attachments
testcase (813 bytes, application/xhtml+xml)
2006-03-09 16:59 PST, Jesse Ruderman
no flags Details
stack trace (mac debug) (8.11 KB, text/plain)
2006-03-09 17:00 PST, Jesse Ruderman
no flags Details
slightly simpler testcase (790 bytes, application/xhtml+xml)
2006-03-09 23:28 PST, Jesse Ruderman
no flags Details
Patch to fix (6.60 KB, patch)
2006-03-25 05:21 PST, Jonas Sicking (:sicking) PTO Until July 5th
bzbarsky: review+
bzbarsky: superreview+
jst: approval‑branch‑1.8.1+
dveditz: approval1.8.0.4+
Details | Diff | Review
same with -w (6.29 KB, patch)
2006-03-25 05:21 PST, Jonas Sicking (:sicking) PTO Until July 5th
no flags Details | Diff | Review
Patch that landed on 1.8.1 branch (8.39 KB, patch)
2006-04-20 00:25 PDT, Jonas Sicking (:sicking) PTO Until July 5th
no flags Details | Diff | Review

Description Jesse Ruderman 2006-03-09 16:58:24 PST
To reproduce the crash, load the testcase.

Before the crash, I see some assertion failures:

###!!! ASSERTION: invalid removeIndex: 'removeIndex >= 0 && !(oldParent == container && removeIndex == insPos)', file /Users/admin/trunk/mozilla/content/base/src/nsGenericElement.cpp, line 2974

###!!! ASSERTION: out-of-bounds access in nsAttrAndChildArray: 'aPos < ChildCount()', file /Users/admin/trunk/mozilla/content/base/src/nsAttrAndChildArray.h, line 81


Debug crash stacktrace:

Thread 0 Crashed:
0   nsCOMPtr<nsIContent>::nsCOMPtr[in-charge](nsIContent*) + 72 (nsCOMPtr.h:627)
1   nsXULElement::RemoveChildAt(unsigned, int) + 176 (nsXULElement.cpp:1025)
2   nsGenericElement::doReplaceOrInsertBefore(int, nsIDOMNode*, nsIDOMNode*, nsIContent*, nsIDocument*, nsIDOMNode**) + 4796 (nsGenericElement.cpp:2978)
3   nsGenericElement::InsertBefore(nsIDOMNode*, nsIDOMNode*, nsIDOMNode**) + 84 (nsGenericElement.cpp:2454)
4   nsHTMLHtmlElement::InsertBefore(nsIDOMNode*, nsIDOMNode*, nsIDOMNode**) + 60 (nsHTMLHtmlElement.cpp:57)
5   nsGenericElement::AppendChild(nsIDOMNode*, nsIDOMNode**) + 76 (nsGenericElement.h:520)
6   nsHTMLHtmlElement::AppendChild(nsIDOMNode*, nsIDOMNode**) + 52 (nsHTMLHtmlElement.cpp:57)
7   _XPTC_InvokeByIndex + 216 (xptcstubsdef.inc:256)
(see attachment for the rest)


Security-sensitive because
(1) There is a violation of C-style array bounds, and it's not checked at runtime except by an assertion.  Something is reading from or writing to a bogus memory location.
(2) The testcase reveals some of the ideas behind bug 325861.
Comment 1 Jesse Ruderman 2006-03-09 16:59:02 PST
Created attachment 214620 [details]
testcase
Comment 2 Jesse Ruderman 2006-03-09 17:00:02 PST
Created attachment 214621 [details]
stack trace (mac debug)
Comment 3 Jesse Ruderman 2006-03-09 23:28:07 PST
Created attachment 214649 [details]
slightly simpler testcase
Comment 4 Mike Shaver (:shaver -- probably not reading bugmail closely) 2006-03-12 09:35:25 PST
Neil might have cycles for this.
Comment 5 Neil Deakin 2006-03-12 11:53:36 PST
Looks like nsXULElement::RemoveChildAt should have a guard against mutation event handlers changing the DOM, like nsGenericElement::doRemoveAt has. In fact, both functions should really just be consolidated.
Comment 6 Jonas Sicking (:sicking) PTO Until July 5th 2006-03-25 05:21:21 PST
Created attachment 216204 [details] [diff] [review]
Patch to fix

Fire mutations after dealing with selection. Also call nsGenericElement::RemoveChildAt rather then duplicating the code in nsXULElement.

I also added some fixes to nsGenericElement::doRemoveChildAt to make it behave a little better if a sibling of a going-away child is removed during the mutationevent.
Comment 7 Jonas Sicking (:sicking) PTO Until July 5th 2006-03-25 05:21:39 PST
Created attachment 216205 [details] [diff] [review]
same with -w
Comment 8 Boris Zbarsky [:bz] (Out June 25-July 6) 2006-03-25 08:27:25 PST
Comment on attachment 216204 [details] [diff] [review]
Patch to fix

>Index: content/base/src/nsGenericElement.cpp
> nsGenericElement::RemoveChildAt(PRUint32 aIndex, PRBool aNotify)
> {
>-  nsCOMPtr<nsIContent> oldKid = GetChildAt(aIndex);
>+  nsCOMPtr<nsIContent> oldKid = mAttrsAndChildren.GetSafeChildAt(aIndex);

I think it's worth adding an assert that this gives the same result as GetChildAt (just to catch cases when someone makes GetChildAt do something weird).

>@@ -2398,44 +2398,46 @@ nsGenericElement::doRemoveChildAt(PRUint
>   NS_PRECONDITION(aKid && aKid->GetParent() == aParent &&
>                   aKid == container->GetChildAt(aIndex), "Bogus aKid");

Add an assert that |container->IndexOf(aKid) == aIndex|, since we're relying on that later, to this precondition?

>+  if (aParent && nsGenericElement::HasMutationListeners(aParent,
>+        NS_EVENT_BITS_MUTATION_NODEREMOVED)) {

Do you happen to know why we're checking on aParent, not on aKid here?  We're dispatching to aKid, so I'd think we'd check for listeners on aKid too...

>+  // Someone may have removed the kid or any of its siblings while that event
>+  // was processing.
>+  if (guard.Mutated(0)) {
>+    aIndex = container->IndexOf(aKid);
>+    if (aIndex < 0) {
>+      return NS_OK;

Hmmm...  I don't buy this.  Specifically, say we have the following situation:

1)  Node A has a DOMNodeRemoved listener that moves it somewhere else in
    the tree.
2)  We call B.appendChild(A).

What will happen is that appendChild will remove A from its current parent.  Except that the mutation listener will just move it to a different parent, so we'll hit this aIndex < 0 case and return NS_OK.  Then we'll put A in the child list of B.  So now A is in two different child lists and confused about what its parent is.

So I think we either we need to deal with this case in appendChild or we need to throw here if aIndex < 0 and aKid has a parent node (possibly the document!).
Comment 9 Jonas Sicking (:sicking) PTO Until July 5th 2006-03-25 12:54:30 PST
> >@@ -2398,44 +2398,46 @@ nsGenericElement::doRemoveChildAt(PRUint
> >   NS_PRECONDITION(aKid && aKid->GetParent() == aParent &&
> >                   aKid == container->GetChildAt(aIndex), "Bogus aKid");
> 
> Add an assert that |container->IndexOf(aKid) == aIndex|, since we're relying
> on that later, to this precondition?

Isn't that pretty much the same thing as checking |aKid == container->GetChildAt(aIndex)|? Or are you worried about bugs in IndexOf/GetChildAt?

> >+  if (aParent && nsGenericElement::HasMutationListeners(aParent,
> >+        NS_EVENT_BITS_MUTATION_NODEREMOVED)) {
> 
> Do you happen to know why we're checking on aParent, not on aKid here?  We're
> dispatching to aKid, so I'd think we'd check for listeners on aKid too...

Good point, i'll fix that in bug 90983 to avoid possibility of perf regression (longer parent chain to check).

> 1)  Node A has a DOMNodeRemoved listener that moves it somewhere else in
>     the tree.
> 2)  We call B.appendChild(A).

We already handle this in appendChild. Look at the GetParent || IsInDoc checks here http://bonsai.mozilla.org/cvsblame.cgi?file=mozilla/content/base/src/nsGenericElement.cpp&rev=3.458&mark=2962,3053#2962
Otherwise this would have been a problem before this patch too.
Comment 10 Boris Zbarsky [:bz] (Out June 25-July 6) 2006-03-25 16:43:28 PST
Comment on attachment 216204 [details] [diff] [review]
Patch to fix

> Or are you worried about bugs in IndexOf/GetChildAt?

Sort of, and also I feel it's better to assert the invariants we're actually depending on instead of other things that really should be equivalent...

Good point on appendChild.  I looked for that code, but didn't find it.  :(  Maybe add a comment here saying that that caller handles this situation correctly?

r+sr=bzbarsky with the assert nit picked.
Comment 11 Jonas Sicking (:sicking) PTO Until July 5th 2006-03-26 02:17:43 PST
I'll add he assert. I'd rather not add the comment since this function is really only about removing nodes.
Comment 12 Jonas Sicking (:sicking) PTO Until July 5th 2006-04-20 00:25:35 PDT
Created attachment 219118 [details] [diff] [review]
Patch that landed on 1.8.1 branch
Comment 13 Daniel Veditz [:dveditz] 2006-04-21 13:20:02 PDT
Comment on attachment 216204 [details] [diff] [review]
Patch to fix

approved for 1.8.0 branch, a=dveditz for drivers
Comment 14 Tracy Walker [:tracy] 2006-05-11 09:23:35 PDT
verified with Mozilla/5.0 (Macintosh; U; PPC Mac OS X Mach-O; en-US; rv:1.8.0.4) Gecko/20060508 Firefox/1.5.0.4
Comment 15 Jesse Ruderman 2006-05-17 22:16:18 PDT
This has been fixed, right?  I'm trying to figure out why this isn't marked as FIXED.
Comment 16 Jonas Sicking (:sicking) PTO Until July 5th 2006-05-18 00:52:24 PDT
I probably kept it open for branches.
Comment 17 Jesse Ruderman 2007-12-19 11:13:35 PST
Crashtest checked in.

Note You need to log in before you can comment on or make changes to this bug.