Last Comment Bug 543645 - Hang [@ nsRange::CutContents] with DOMNodeInserted event, deleteContents
: Hang [@ nsRange::CutContents] with DOMNodeInserted event, deleteContents
Status: RESOLVED FIXED
[sg:dos]
: hang, testcase
Product: Core
Classification: Components
Component: DOM: Core & HTML (show other bugs)
: Trunk
: x86 All
: -- critical (vote)
: mozilla16
Assigned To: :Aryeh Gregor (away until August 15)
:
Mentors:
Depends on: 763869
Blocks: 336383
  Show dependency treegraph
 
Reported: 2010-02-01 19:07 PST by Jesse Ruderman
Modified: 2013-04-04 13:53 PDT (History)
8 users (show)
ayg: in‑testsuite+
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---
-
wanted


Attachments
testcase (hangs Firefox) (441 bytes, application/xhtml+xml)
2010-02-01 19:07 PST, Jesse Ruderman
no flags Details
Patch v1 (16.37 KB, patch)
2012-06-11 05:16 PDT, :Aryeh Gregor (away until August 15)
bzbarsky: review-
Details | Diff | Review
Patch v2 (16.33 KB, patch)
2012-06-12 03:13 PDT, :Aryeh Gregor (away until August 15)
bzbarsky: review+
Details | Diff | Review

Description Jesse Ruderman 2010-02-01 19:07:34 PST
Created attachment 424709 [details]
testcase (hangs Firefox)
Comment 1 Boris Zbarsky [:bz] (Out June 25-July 6) 2010-02-04 14:49:21 PST
So the SplitDataNode call that CutContents makes triggers the mutation listener there.  That removes the content our iterator is positioned on from the DOM, and nsContentSubtreeIterator::Prev happens to not set itself done even when it advanves to null (in particular if prevNode ends up null).  It also doesn't set mCurNode in that case, because GetTopAncestorInRange bails out if given null and that's what sets mCurNode.

Should Prev() just be setting the iterator as done in this situation?
Comment 2 Johnny Stenback (:jst, jst@mozilla.com) 2010-02-18 15:48:33 PST
Not a blocker, but wanted. And bz, your suggestion sounds reasonable to me.
Comment 3 Boris Zbarsky [:bz] (Out June 25-July 6) 2012-06-09 00:04:34 PDT
This still looks broken.  Aryeh, want to take a look?
Comment 4 :Aryeh Gregor (away until August 15) 2012-06-10 01:46:16 PDT
I have a better proposed solution: let's drop support for sync mutation listeners already.  0:)

But in the meantime, I'll see what I can do.
Comment 5 :Aryeh Gregor (away until August 15) 2012-06-10 05:20:47 PDT
So honestly, I'm not sure exactly what nsContentSubtreeIterator is supposed to do.  What order is it meant to return nodes in?  The code is complicated by a lot of caching, so I'm not confident that I see what it's doing.  It doesn't look like the simplest way to do anything . . . e.g., Prev() calls GetDeepFirstChild on mCurNode, but then immediately calls PrevNode on the result, which seems like it will just boil down to GetPrevSibling in this case, and the first thing that does is go back up until it finds the first ancestor that has a previous sibling.  Is this really the case?  If so, why is the code so complicated?

If I'm understanding correctly, it looks like it iterates over all nodes in the range whose parent is not also in the range.  So it's like a regular post-order mContentIterator, except once it's returned a node it doesn't descend into its children.  Is that right?
Comment 6 :Aryeh Gregor (away until August 15) 2012-06-10 06:14:45 PDT
Hmm, never mind.  I think I understand what's going on here well enough.  Hopefully I'll have a patch tomorrow.
Comment 7 :Aryeh Gregor (away until August 15) 2012-06-11 05:16:45 PDT
Created attachment 631861 [details] [diff] [review]
Patch v1

It would be no fun to just fix the bug, so I cleaned up a lot of the code and fixed the bug at the same time.  :)  Notes:

A crashtest is the right way to test this, right?

I changed GetTopAncestorInRange to return an nsINode*, with the guarantee that it will always return a non-null node if the input is not null.  I wanted to MOZ_ASSERT that the input is not null, but the problematic caller (in nsContentSubtreeIterator::Prev) looked nontrivial to fix properly.

I MOZ_ASSERT that the nsIDOMRange passed to nsContentSubtreeIterator::Init is not null, and that its common ancestor/start node/end node aren't null.  I don't know how you feel about this sort of MOZ_ASSERT usage, but it seems unlikely to bite anyone who doesn't deserve it.

There were a bunch of places that used bare nsINode* for local variables.  I figured this was *probably* safe, since an actual reference was being kept to other nodes in the same tree, but I changed them to nsCOMPtr<nsINode> anyway.  Tell me if that's bad.

In Init(), when calling nsRange::CompareNodeToRange, I wrap it in MOZ_ALWAYS_TRUE(NS_SUCCEEDED()).  The only ways that function can fail are if it's passed null, if the range's it's passed is not positioned, or if the node and range it's passed are not in the same tree.  None of these should be possible in Init(), because we just got the nodes from the range.  In GetTopAncestorInRange, I instead use NS_ASSERTION plus returning null, because it's not obvious that the node can't be in a different tree for some reason at that point.

I don't actually understand the logic in GetTopAncestorInRange.  Why the tmp variable?  What purpose does it serve?  The parent variable should only ever be null if the range selects the whole tree and we make it all the way up to the root -- in that case, why do we return null or a grandchild (depending on aNode) instead of returning the tree root?  But if I try to change it to return aNode instead of tmp in that case, a whole bunch of tests fail.

Anyway, this fixes the hang because GetTopAncestorInRange will now return null instead of bailing out if it's passed null, so mCurNode will get set to null and mIsDone will be set to false.

https://tbpl.mozilla.org/?tree=Try&rev=6b9458a98cd1
Comment 8 :Aryeh Gregor (away until August 15) 2012-06-11 05:18:08 PDT
(In retrospect, maybe a more trivial fix would have been good for the sake of backporting.  If you want, I can write a separate patch to do that.)
Comment 9 Boris Zbarsky [:bz] (Out June 25-July 6) 2012-06-11 20:26:26 PDT
> A crashtest is the right way to test this, right?

Seems fine, yes.

> Why the tmp variable?

Presumably some sort of holdover from when this used nsIDOMNode?

At this point it seems to be equal to the child of aNode we last examined, except the first time through the loop, when it's null.  That makes no sense whatsoever.  It's worth looking up the blame from this code to see what happened here.  I really hope that we never hit the !parent case, as things stand....
Comment 10 Boris Zbarsky [:bz] (Out June 25-July 6) 2012-06-11 20:26:53 PDT
Comment on attachment 631861 [details] [diff] [review]
Patch v1

It'd be pretty awesome if cleanup and substantive changes happened in separate diffs... ;)

>+++ b/content/base/src/nsContentIterator.cpp
>+  nsCOMPtr<nsINode> startParent = mRange->GetStartParent();
>+  nsCOMPtr<nsINode> endParent = mRange->GetEndParent();

Why do those need to be nsCOMPtrs?

>+    nsCOMPtr<nsINode> child = startParent->GetFirstChild();

Pretty sure this one does NOT need to be an nsCOMPtr.

>+  nsCOMPtr<nsINode> firstCandidate, lastCandidate;

Nor these (though they do need to be inited to null.

>+  nsCOMPtr<nsINode> node;

Nor this one.

>+    nsCOMPtr<nsINode> child = startParent->GetChildAt(offset);

Or this one.

>+    // then firstCandidate is next node after cN

s/cN/node/

>+  MOZ_ALWAYS_TRUE(NS_SUCCEEDED(
>+    nsRange::CompareNodeToRange(firstCandidate, mRange, &nodeBefore, &nodeAfter)));

I don't think you can assert that (e.g. you have no idea whether the range is positioned, afaict).

> nsContentSubtreeIterator::Next()
>-  nsINode *nextNode = GetNextSibling(mCurNode, nsnull);
>+  nsCOMPtr<nsINode> nextNode = GetNextSibling(mCurNode, nsnull);

Why?


> nsContentSubtreeIterator::Prev()
>+  nsCOMPtr<nsINode> prevNode = GetDeepFirstChild(mCurNode, nsnull);

And why does that need to be an nsCOMPtr?

Generally using nsCOMPtr where not needed adds to cc pressure and makes things slower...

We should figure out what the tmp bit did before someone broke this code.
Comment 11 :Aryeh Gregor (away until August 15) 2012-06-12 03:11:37 PDT
(In reply to Boris Zbarsky (:bz) from comment #9)
> Presumably some sort of holdover from when this used nsIDOMNode?
> 
> At this point it seems to be equal to the child of aNode we last examined,
> except the first time through the loop, when it's null.  That makes no sense
> whatsoever.  It's worth looking up the blame from this code to see what
> happened here.  I really hope that we never hit the !parent case, as things
> stand....

We definitely do.  I tried taking out that if statement, and a whole bunch of tests started failing.  :/  I didn't look into why.

(In reply to Boris Zbarsky (:bz) from comment #10)
> It'd be pretty awesome if cleanup and substantive changes happened in
> separate diffs... ;)

Hmm, noted.  I tend to think of things like getting rid of nsresult as cleanup, but it's also a substantive change in this case, yeah.  Sorry.

> >+++ b/content/base/src/nsContentIterator.cpp
> >+  nsCOMPtr<nsINode> startParent = mRange->GetStartParent();
> >+  nsCOMPtr<nsINode> endParent = mRange->GetEndParent();
> 
> Why do those need to be nsCOMPtrs?

I don't know.  My understanding of ref-counting is still somewhat hazy, so I tend to err on the side of avoiding leaks if I'm not totally sure what's going on.  In this case, I guess they don't need to be nsCOMPtrs, because we already have an nsRefPtr to mRange, which holds nsCOMPtrs to its start/end, right?  So it would only be a problem if we assigned something different to startParent/endParent, or if we changed mRange's start/end, neither of which is the case.

More generally, as long as we hold a strong reference to one node in a tree, the entire tree will stay put, because everything holds a strong reference to its parent/children, right?

> >+  MOZ_ALWAYS_TRUE(NS_SUCCEEDED(
> >+    nsRange::CompareNodeToRange(firstCandidate, mRange, &nodeBefore, &nodeAfter)));
> 
> I don't think you can assert that (e.g. you have no idea whether the range
> is positioned, afaict).

If mRange isn't positioned, then the start and end parents would also be null, so we'd have hit a MOZ_ASSERT already, and in production builds we'd have crashed too due to null pointer dereference.  The only place mIsPositioned is set (after being initialized to false) is nsRange::DoSetRange, which sets it to !!mStartParent.  So if we got to this point, the range has to be positioned.

> We should figure out what the tmp bit did before someone broke this code.

I filed bug 763869 on the matter.
Comment 12 :Aryeh Gregor (away until August 15) 2012-06-12 03:13:40 PDT
Created attachment 632189 [details] [diff] [review]
Patch v2

New try: https://tbpl.mozilla.org/?tree=Try&rev=69ae64328564
Comment 13 Boris Zbarsky [:bz] (Out June 25-July 6) 2012-06-12 06:25:37 PDT
> More generally, as long as we hold a strong reference to one node in a tree, the entire
> tree will stay put, because everything holds a strong reference to its parent/children,
> right?

Yes.  As long as you don't run script.

> So if we got to this point, the range has to be positioned.

Yes, but CompareNodeToRange errors out in various other conditions too.  I assume you carefully checked that none of those can happen here, even when anonymous content is involved?
Comment 14 :Aryeh Gregor (away until August 15) 2012-06-12 07:13:53 PDT
(In reply to Boris Zbarsky (:bz) from comment #13)
> Yes, but CompareNodeToRange errors out in various other conditions too.  I
> assume you carefully checked that none of those can happen here, even when
> anonymous content is involved?

Yes.  I detailed them in comment 7:

> In Init(), when calling nsRange::CompareNodeToRange, I wrap it in
> MOZ_ALWAYS_TRUE(NS_SUCCEEDED()).  The only ways that function can fail are
> if it's passed null, if the range's it's passed is not positioned, or if the
> node and range it's passed are not in the same tree.  None of these should
> be possible in Init(), because we just got the nodes from the range.  In
> GetTopAncestorInRange, I instead use NS_ASSERTION plus returning null,
> because it's not obvious that the node can't be in a different tree for some
> reason at that point.
Comment 15 Boris Zbarsky [:bz] (Out June 25-July 6) 2012-06-12 12:32:40 PDT
Aryeh, any chance of an interdiff here?
Comment 16 :Aryeh Gregor (away until August 15) 2012-06-13 01:48:15 PDT
Bugzilla's interdiff feature does a good job in this case:

https://bugzilla.mozilla.org/attachment.cgi?oldid=631861&action=interdiff&newid=632189&headers=1

I just checked against diff -u <(wget -qO- ...) <(wget -qO- ...), and it's accurate.
Comment 17 Boris Zbarsky [:bz] (Out June 25-July 6) 2012-06-13 08:07:35 PDT
> Bugzilla's interdiff feature does a good job in this case:

Thanks for checking that.  It's impossible to check on my end because it has a tendency to silently not show hunks.... :(
Comment 18 Boris Zbarsky [:bz] (Out June 25-July 6) 2012-06-13 08:09:48 PDT
Comment on attachment 632189 [details] [diff] [review]
Patch v2

You might still need a return nsnull after the MOZ_NOTREACHED to avoid warnings or errors with all our compilers.  Or not, depending on how they treat MOZ_NOTREACHED.  Just watch out for this when pushing if you don't add the return.

r=me
Comment 19 :Aryeh Gregor (away until August 15) 2012-06-13 08:39:27 PDT
Try passed, so it should be okay without the return, at least as far as errors go.
Comment 20 :Aryeh Gregor (away until August 15) 2012-06-14 06:14:20 PDT
https://hg.mozilla.org/integration/mozilla-inbound/rev/8d5cd601ac88
Comment 21 Ed Morley [:emorley] 2012-06-15 06:00:12 PDT
https://hg.mozilla.org/mozilla-central/rev/8d5cd601ac88

Note You need to log in before you can comment on or make changes to this bug.