The default bug view has changed. See this FAQ.

Hang [@ nsRange::CutContents] with DOMNodeInserted event, deleteContents

RESOLVED FIXED in mozilla16

Status

()

Core
DOM: Core & HTML
--
critical
RESOLVED FIXED
7 years ago
4 years ago

People

(Reporter: Jesse Ruderman, Assigned: ayg)

Tracking

(Blocks: 1 bug, {hang, testcase})

Trunk
mozilla16
x86
All
hang, testcase
Points:
---
Dependency tree / graph
Bug Flags:
in-testsuite +

Firefox Tracking Flags

(blocking2.0 -, status2.0 wanted)

Details

(Whiteboard: [sg:dos])

Attachments

(2 attachments, 1 obsolete attachment)

(Reporter)

Description

7 years ago
Created attachment 424709 [details]
testcase (hangs Firefox)
So the SplitDataNode call that CutContents makes triggers the mutation listener there.  That removes the content our iterator is positioned on from the DOM, and nsContentSubtreeIterator::Prev happens to not set itself done even when it advanves to null (in particular if prevNode ends up null).  It also doesn't set mCurNode in that case, because GetTopAncestorInRange bails out if given null and that's what sets mCurNode.

Should Prev() just be setting the iterator as done in this situation?
blocking2.0: --- → ?
Not a blocker, but wanted. And bz, your suggestion sounds reasonable to me.
blocking2.0: ? → -
status2.0: --- → wanted
Whiteboard: [sg:dos]
OS: Mac OS X → All
This still looks broken.  Aryeh, want to take a look?
I have a better proposed solution: let's drop support for sync mutation listeners already.  0:)

But in the meantime, I'll see what I can do.
So honestly, I'm not sure exactly what nsContentSubtreeIterator is supposed to do.  What order is it meant to return nodes in?  The code is complicated by a lot of caching, so I'm not confident that I see what it's doing.  It doesn't look like the simplest way to do anything . . . e.g., Prev() calls GetDeepFirstChild on mCurNode, but then immediately calls PrevNode on the result, which seems like it will just boil down to GetPrevSibling in this case, and the first thing that does is go back up until it finds the first ancestor that has a previous sibling.  Is this really the case?  If so, why is the code so complicated?

If I'm understanding correctly, it looks like it iterates over all nodes in the range whose parent is not also in the range.  So it's like a regular post-order mContentIterator, except once it's returned a node it doesn't descend into its children.  Is that right?
Hmm, never mind.  I think I understand what's going on here well enough.  Hopefully I'll have a patch tomorrow.
Created attachment 631861 [details] [diff] [review]
Patch v1

It would be no fun to just fix the bug, so I cleaned up a lot of the code and fixed the bug at the same time.  :)  Notes:

A crashtest is the right way to test this, right?

I changed GetTopAncestorInRange to return an nsINode*, with the guarantee that it will always return a non-null node if the input is not null.  I wanted to MOZ_ASSERT that the input is not null, but the problematic caller (in nsContentSubtreeIterator::Prev) looked nontrivial to fix properly.

I MOZ_ASSERT that the nsIDOMRange passed to nsContentSubtreeIterator::Init is not null, and that its common ancestor/start node/end node aren't null.  I don't know how you feel about this sort of MOZ_ASSERT usage, but it seems unlikely to bite anyone who doesn't deserve it.

There were a bunch of places that used bare nsINode* for local variables.  I figured this was *probably* safe, since an actual reference was being kept to other nodes in the same tree, but I changed them to nsCOMPtr<nsINode> anyway.  Tell me if that's bad.

In Init(), when calling nsRange::CompareNodeToRange, I wrap it in MOZ_ALWAYS_TRUE(NS_SUCCEEDED()).  The only ways that function can fail are if it's passed null, if the range's it's passed is not positioned, or if the node and range it's passed are not in the same tree.  None of these should be possible in Init(), because we just got the nodes from the range.  In GetTopAncestorInRange, I instead use NS_ASSERTION plus returning null, because it's not obvious that the node can't be in a different tree for some reason at that point.

I don't actually understand the logic in GetTopAncestorInRange.  Why the tmp variable?  What purpose does it serve?  The parent variable should only ever be null if the range selects the whole tree and we make it all the way up to the root -- in that case, why do we return null or a grandchild (depending on aNode) instead of returning the tree root?  But if I try to change it to return aNode instead of tmp in that case, a whole bunch of tests fail.

Anyway, this fixes the hang because GetTopAncestorInRange will now return null instead of bailing out if it's passed null, so mCurNode will get set to null and mIsDone will be set to false.

https://tbpl.mozilla.org/?tree=Try&rev=6b9458a98cd1
Assignee: nobody → ayg
Status: NEW → ASSIGNED
Attachment #631861 - Flags: review?(bzbarsky)
(In retrospect, maybe a more trivial fix would have been good for the sake of backporting.  If you want, I can write a separate patch to do that.)
> A crashtest is the right way to test this, right?

Seems fine, yes.

> Why the tmp variable?

Presumably some sort of holdover from when this used nsIDOMNode?

At this point it seems to be equal to the child of aNode we last examined, except the first time through the loop, when it's null.  That makes no sense whatsoever.  It's worth looking up the blame from this code to see what happened here.  I really hope that we never hit the !parent case, as things stand....
Comment on attachment 631861 [details] [diff] [review]
Patch v1

It'd be pretty awesome if cleanup and substantive changes happened in separate diffs... ;)

>+++ b/content/base/src/nsContentIterator.cpp
>+  nsCOMPtr<nsINode> startParent = mRange->GetStartParent();
>+  nsCOMPtr<nsINode> endParent = mRange->GetEndParent();

Why do those need to be nsCOMPtrs?

>+    nsCOMPtr<nsINode> child = startParent->GetFirstChild();

Pretty sure this one does NOT need to be an nsCOMPtr.

>+  nsCOMPtr<nsINode> firstCandidate, lastCandidate;

Nor these (though they do need to be inited to null.

>+  nsCOMPtr<nsINode> node;

Nor this one.

>+    nsCOMPtr<nsINode> child = startParent->GetChildAt(offset);

Or this one.

>+    // then firstCandidate is next node after cN

s/cN/node/

>+  MOZ_ALWAYS_TRUE(NS_SUCCEEDED(
>+    nsRange::CompareNodeToRange(firstCandidate, mRange, &nodeBefore, &nodeAfter)));

I don't think you can assert that (e.g. you have no idea whether the range is positioned, afaict).

> nsContentSubtreeIterator::Next()
>-  nsINode *nextNode = GetNextSibling(mCurNode, nsnull);
>+  nsCOMPtr<nsINode> nextNode = GetNextSibling(mCurNode, nsnull);

Why?


> nsContentSubtreeIterator::Prev()
>+  nsCOMPtr<nsINode> prevNode = GetDeepFirstChild(mCurNode, nsnull);

And why does that need to be an nsCOMPtr?

Generally using nsCOMPtr where not needed adds to cc pressure and makes things slower...

We should figure out what the tmp bit did before someone broke this code.
Attachment #631861 - Flags: review?(bzbarsky) → review-
(In reply to Boris Zbarsky (:bz) from comment #9)
> Presumably some sort of holdover from when this used nsIDOMNode?
> 
> At this point it seems to be equal to the child of aNode we last examined,
> except the first time through the loop, when it's null.  That makes no sense
> whatsoever.  It's worth looking up the blame from this code to see what
> happened here.  I really hope that we never hit the !parent case, as things
> stand....

We definitely do.  I tried taking out that if statement, and a whole bunch of tests started failing.  :/  I didn't look into why.

(In reply to Boris Zbarsky (:bz) from comment #10)
> It'd be pretty awesome if cleanup and substantive changes happened in
> separate diffs... ;)

Hmm, noted.  I tend to think of things like getting rid of nsresult as cleanup, but it's also a substantive change in this case, yeah.  Sorry.

> >+++ b/content/base/src/nsContentIterator.cpp
> >+  nsCOMPtr<nsINode> startParent = mRange->GetStartParent();
> >+  nsCOMPtr<nsINode> endParent = mRange->GetEndParent();
> 
> Why do those need to be nsCOMPtrs?

I don't know.  My understanding of ref-counting is still somewhat hazy, so I tend to err on the side of avoiding leaks if I'm not totally sure what's going on.  In this case, I guess they don't need to be nsCOMPtrs, because we already have an nsRefPtr to mRange, which holds nsCOMPtrs to its start/end, right?  So it would only be a problem if we assigned something different to startParent/endParent, or if we changed mRange's start/end, neither of which is the case.

More generally, as long as we hold a strong reference to one node in a tree, the entire tree will stay put, because everything holds a strong reference to its parent/children, right?

> >+  MOZ_ALWAYS_TRUE(NS_SUCCEEDED(
> >+    nsRange::CompareNodeToRange(firstCandidate, mRange, &nodeBefore, &nodeAfter)));
> 
> I don't think you can assert that (e.g. you have no idea whether the range
> is positioned, afaict).

If mRange isn't positioned, then the start and end parents would also be null, so we'd have hit a MOZ_ASSERT already, and in production builds we'd have crashed too due to null pointer dereference.  The only place mIsPositioned is set (after being initialized to false) is nsRange::DoSetRange, which sets it to !!mStartParent.  So if we got to this point, the range has to be positioned.

> We should figure out what the tmp bit did before someone broke this code.

I filed bug 763869 on the matter.
Depends on: 763869
Created attachment 632189 [details] [diff] [review]
Patch v2

New try: https://tbpl.mozilla.org/?tree=Try&rev=69ae64328564
Attachment #631861 - Attachment is obsolete: true
Attachment #632189 - Flags: review?(bzbarsky)
> More generally, as long as we hold a strong reference to one node in a tree, the entire
> tree will stay put, because everything holds a strong reference to its parent/children,
> right?

Yes.  As long as you don't run script.

> So if we got to this point, the range has to be positioned.

Yes, but CompareNodeToRange errors out in various other conditions too.  I assume you carefully checked that none of those can happen here, even when anonymous content is involved?
(In reply to Boris Zbarsky (:bz) from comment #13)
> Yes, but CompareNodeToRange errors out in various other conditions too.  I
> assume you carefully checked that none of those can happen here, even when
> anonymous content is involved?

Yes.  I detailed them in comment 7:

> In Init(), when calling nsRange::CompareNodeToRange, I wrap it in
> MOZ_ALWAYS_TRUE(NS_SUCCEEDED()).  The only ways that function can fail are
> if it's passed null, if the range's it's passed is not positioned, or if the
> node and range it's passed are not in the same tree.  None of these should
> be possible in Init(), because we just got the nodes from the range.  In
> GetTopAncestorInRange, I instead use NS_ASSERTION plus returning null,
> because it's not obvious that the node can't be in a different tree for some
> reason at that point.
Aryeh, any chance of an interdiff here?
Bugzilla's interdiff feature does a good job in this case:

https://bugzilla.mozilla.org/attachment.cgi?oldid=631861&action=interdiff&newid=632189&headers=1

I just checked against diff -u <(wget -qO- ...) <(wget -qO- ...), and it's accurate.
> Bugzilla's interdiff feature does a good job in this case:

Thanks for checking that.  It's impossible to check on my end because it has a tendency to silently not show hunks.... :(
Comment on attachment 632189 [details] [diff] [review]
Patch v2

You might still need a return nsnull after the MOZ_NOTREACHED to avoid warnings or errors with all our compilers.  Or not, depending on how they treat MOZ_NOTREACHED.  Just watch out for this when pushing if you don't add the return.

r=me
Attachment #632189 - Flags: review?(bzbarsky) → review+
Try passed, so it should be okay without the return, at least as far as errors go.
https://hg.mozilla.org/integration/mozilla-inbound/rev/8d5cd601ac88
Flags: in-testsuite+
Target Milestone: --- → mozilla16
https://hg.mozilla.org/mozilla-central/rev/8d5cd601ac88
Status: ASSIGNED → RESOLVED
Last Resolved: 5 years ago
Resolution: --- → FIXED
Component: DOM: Traversal-Range → DOM: Core & HTML
Product: Core → Core
You need to log in before you can comment on or make changes to this bug.