Last Comment Bug 763869 - Investigate why nsContentSubtreeIterator::GetTopAncestorInRange handles the !parent case oddly
: Investigate why nsContentSubtreeIterator::GetTopAncestorInRange handles the !...
Status: RESOLVED FIXED
:
Product: Core
Classification: Components
Component: DOM: Core & HTML (show other bugs)
: Trunk
: All All
: -- normal (vote)
: mozilla16
Assigned To: Aryeh Gregor (:ayg) (away until October 25)
:
:
Mentors:
Depends on: 765205
Blocks: 543645
  Show dependency treegraph
 
Reported: 2012-06-12 03:11 PDT by Aryeh Gregor (:ayg) (away until October 25)
Modified: 2013-04-04 13:53 PDT (History)
2 users (show)
ayg: in‑testsuite-
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---


Attachments
Patch part 1, v1 -- Remove useless NodeHasChildren helper function (2.62 KB, patch)
2012-06-13 05:11 PDT, Aryeh Gregor (:ayg) (away until October 25)
bzbarsky: review+
Details | Diff | Splinter Review
Patch part 2, v1 -- Clean up nsContentIterator.cpp (40.52 KB, patch)
2012-06-13 05:18 PDT, Aryeh Gregor (:ayg) (away until October 25)
bzbarsky: review+
Details | Diff | Splinter Review
diff -w patch part 2, v1 (35.85 KB, patch)
2012-06-13 05:19 PDT, Aryeh Gregor (:ayg) (away until October 25)
no flags Details | Diff | Splinter Review
Patch part 3, v1 -- Make aIndexes param optional (7.94 KB, patch)
2012-06-13 05:20 PDT, Aryeh Gregor (:ayg) (away until October 25)
bzbarsky: review+
Details | Diff | Splinter Review
Patch part 4, v1 -- Use nsIContent* in nsContentIterator where possible (12.82 KB, patch)
2012-06-13 07:56 PDT, Aryeh Gregor (:ayg) (away until October 25)
bzbarsky: review-
Details | Diff | Splinter Review
Patch part 4, v2 -- Use nsIContent* in nsContentIterator where possible (13.40 KB, patch)
2012-06-14 00:20 PDT, Aryeh Gregor (:ayg) (away until October 25)
bzbarsky: review+
Details | Diff | Splinter Review
Interdiff for patch part 4, v1 -> v2 (1.94 KB, text/plain)
2012-06-14 00:20 PDT, Aryeh Gregor (:ayg) (away until October 25)
no flags Details
Interdiff for patch part 4, v1 -> v2 (1.94 KB, patch)
2012-06-14 00:21 PDT, Aryeh Gregor (:ayg) (away until October 25)
no flags Details | Diff | Splinter Review

Description Aryeh Gregor (:ayg) (away until October 25) 2012-06-12 03:11:11 PDT
The dodgy code right now (after landing bug 543645, which doesn't change the idea) is

  nsCOMPtr<nsINode> parent, tmp;
  while (aNode) {
    parent = aNode->GetNodeParent();
    if (!parent) {
      return tmp;
    }
    MOZ_ALWAYS_TRUE(NS_SUCCEEDED(
      nsRange::CompareNodeToRange(parent, mRange, &nodeBefore, &nodeAfter)));

    if (nodeBefore || nodeAfter) {
      return aNode;
    }
    tmp = aNode;
    aNode = parent;
  }

The question is why we return tmp in the !parent case, rather than returning aNode.  I tried returning aNode there instead, but a bunch of tests failed.  As far as I can tell, the only way this can happen is if the range has start (document, 0) and end (document, document.childNodes.length).  In that case, we'll wind up with aNode equalling the document, with null parent.  In most cases, tmp will then be document.documentElement, because probably whatever node we looked at to start with was a child of the document.  I'm guessing that things break when returning aNode because they expect content to be returned, not a document.

A possible fix would be to make the function return only content, if it was passed a content node to start with.  But I don't want to do that unless I can figure out why callers expect that -- it probably makes more sense to fix the callers.
Comment 1 Boris Zbarsky [:bz] (still a bit busy) 2012-06-12 12:35:12 PDT
Oh, hmm.  That could totally be it: returning the root element when we run into the document....

Might be worth documenting that and all. Or maybe just using GetParent() instead of GetNodeParent(), which will guarantee we can't walk off into the document.

That's assuming that aNode can't be a document.  If it can, and if null should be returned in that case, then GetParent() won't work.
Comment 2 Aryeh Gregor (:ayg) (away until October 25) 2012-06-13 03:09:02 PDT
Reviewing the code, I'm pretty sure aNode can't be a document.  There are three call sites.  The first two are in Init, and they'll always pass in some child or sibling of the range's start/end node, never the start/end node itself.  The third is in Prev, and it passes in GetDeepLastChild(PrevNode(GetDeepFirstChild(mCurNode))).  The only way this could result in a document is if mCurNode was a document (which I think is itself impossible) which moreover would have to have no children (not a very likely case).

I suspect it might well be possible to change some variables' types from nsINode* to nsIContent* and have the compiler prove this for us with very little reasoning on our part, in fact.
Comment 3 Aryeh Gregor (:ayg) (away until October 25) 2012-06-13 05:11:38 PDT
Created attachment 632647 [details] [diff] [review]
Patch part 1, v1 -- Remove useless NodeHasChildren helper function

So the idea looks promising, but my current patch still has a bug.  I wrote some cleanup patches first, though (of course!), so I'll get your review on those while I fix up the real patch, which I'll possibly do tomorrow.
Comment 4 Aryeh Gregor (:ayg) (away until October 25) 2012-06-13 05:18:06 PDT
Created attachment 632649 [details] [diff] [review]
Patch part 2, v1 -- Clean up nsContentIterator.cpp

No functional changes!  The only possible functional change here is changing static_cast<nsIContent*> to ->AsContent(), and removing the (now-redundant) NS_ASSERTION that this actually works.  But in every one of these cases, it should be obvious that the node is actually content.

I did some outdenting in PrevNode/NextNode, so I'll also attach a diff -w.
Comment 5 Aryeh Gregor (:ayg) (away until October 25) 2012-06-13 05:19:38 PDT
Created attachment 632650 [details] [diff] [review]
diff -w patch part 2, v1
Comment 6 Aryeh Gregor (:ayg) (away until October 25) 2012-06-13 05:20:48 PDT
Created attachment 632651 [details] [diff] [review]
Patch part 3, v1 -- Make aIndexes param optional

Try push for patches so far: https://tbpl.mozilla.org/?tree=Try&rev=0f1e02a52207
Comment 7 Aryeh Gregor (:ayg) (away until October 25) 2012-06-13 07:56:33 PDT
Created attachment 632695 [details] [diff] [review]
Patch part 4, v1 -- Use nsIContent* in nsContentIterator where possible

So this should clarify matters.  It's just cleanup, no functional changes -- I think!  Let me know if you disagree.  :)

Basically, GetTopAncestorInRange never returns the root of the tree that aNode is in.  If aNode is the root, it returns null, and otherwise it will always stop before it hits the root.  Why this is, I'm not sure, but this patch at least makes it clear.  Even if I made sure it would only return content, tests would fail because at least some expected it not to return a DocumentFragment root.  (Maybe an argument for DocumentFragment not being content!)

If you want, I could investigate why callers expect this and try to change it, but I think it makes more sense to considered the bug fixed by this patch.

https://tbpl.mozilla.org/?tree=Try&rev=b00f5971a259
Comment 8 Boris Zbarsky [:bz] (still a bit busy) 2012-06-13 09:25:42 PDT
Comment on attachment 632647 [details] [diff] [review]
Patch part 1, v1 -- Remove useless NodeHasChildren helper function

r=me
Comment 9 Boris Zbarsky [:bz] (still a bit busy) 2012-06-13 09:34:28 PDT
Comment on attachment 632649 [details] [diff] [review]
Patch part 2, v1 -- Clean up nsContentIterator.cpp

r=me
Comment 10 Boris Zbarsky [:bz] (still a bit busy) 2012-06-13 09:35:18 PDT
Comment on attachment 632651 [details] [diff] [review]
Patch part 3, v1 -- Make aIndexes param optional

r=me
Comment 11 Boris Zbarsky [:bz] (still a bit busy) 2012-06-13 09:43:58 PDT
Comment on attachment 632695 [details] [diff] [review]
Patch part 4, v1 -- Use nsIContent* in nsContentIterator where possible

Doesn't this change what aIndexes looks like?  In particular, for a given argument it will look different for the nsINode and nsIContent overloads....  That seems wrong.
Comment 12 Aryeh Gregor (:ayg) (away until October 25) 2012-06-13 11:09:46 PDT
Hmm, maybe that's why I'm getting a SIGSEGV on try!  You seem to be right, good catch.  I'll submit a new version tomorrow.
Comment 13 Aryeh Gregor (:ayg) (away until October 25) 2012-06-14 00:20:08 PDT
Created attachment 633056 [details] [diff] [review]
Patch part 4, v2 -- Use nsIContent* in nsContentIterator where possible

Well, no, the SIGSEGV was because of the lack of a null check.  But you were right too!

New try: https://tbpl.mozilla.org/?tree=Try&rev=31b7a37defd2
Comment 14 Aryeh Gregor (:ayg) (away until October 25) 2012-06-14 00:20:29 PDT
Created attachment 633058 [details]
Interdiff for patch part 4, v1 -> v2
Comment 15 Aryeh Gregor (:ayg) (away until October 25) 2012-06-14 00:21:22 PDT
Created attachment 633059 [details] [diff] [review]
Interdiff for patch part 4, v1 -> v2

Something went wrong with the interdiff upload . . .
Comment 16 Boris Zbarsky [:bz] (still a bit busy) 2012-06-14 12:37:19 PDT
Comment on attachment 633056 [details] [diff] [review]
Patch part 4, v2 -- Use nsIContent* in nsContentIterator where possible

> nsContentSubtreeIterator::GetTopAncestorInRange(nsINode* aNode)
>+  // aNode has a parent, so it must be content.

Can you just test IsContent() instead of checking for a parent if that's what you really want to check here?

>+  nsCOMPtr<nsIContent> content = aNode->AsContent();

nsIContent*, please.

>+  nsCOMPtr<nsIContent> parent;

nsIContent*.  Also, can't this be declared inside the loop?

>+    // content always has a parent.  If its parent is the root, however --
>+    // i.e., either it's not content, or it is content but its own parent is
>+    // null -- then we're finished, since we don't go up to the root.

See, this is the part I don't get.  Why do we have that behavior?  I guess you're not changing it, but the comment doesn't explain the _why_, just the _what_, and the why is what needs explaining here....

r=me with the nits fixed.
Comment 17 Aryeh Gregor (:ayg) (away until October 25) 2012-06-15 05:08:59 PDT
(In reply to Boris Zbarsky (:bz) from comment #16)
> > nsContentSubtreeIterator::GetTopAncestorInRange(nsINode* aNode)
> >+  // aNode has a parent, so it must be content.
> 
> Can you just test IsContent() instead of checking for a parent if that's
> what you really want to check here?

It's not -- if it has no parent, it's the root of the tree, so we're not supposed to return it whether or not it's content.  That's the point of this patch version.

> >+  nsCOMPtr<nsIContent> content = aNode->AsContent();
> 
> nsIContent*, please.
> 
> >+  nsCOMPtr<nsIContent> parent;
> 
> nsIContent*.

Blech, fixed.

> Also, can't this be declared inside the loop?

Yes.  I was just changing the existing code, which didn't declare it inside the loop, so I didn't think to move it.

> >+    // content always has a parent.  If its parent is the root, however --
> >+    // i.e., either it's not content, or it is content but its own parent is
> >+    // null -- then we're finished, since we don't go up to the root.
> 
> See, this is the part I don't get.  Why do we have that behavior?  I guess
> you're not changing it, but the comment doesn't explain the _why_, just the
> _what_, and the why is what needs explaining here....

Because someone originally wrote it that way for some unknown reason and now callers depend on it.  :)

More seriously: the only case where it makes a difference, AFAICT, is when you have an nsContentSubtreeIterator that's initialized from a range that contains the whole subtree.  In that case, if GetTopAncestorInRange was willing to return the root, you'd iterate over one thing.  Callers apparently expect to iterate over the root's children instead.  I saw failures with paste and execCommand("insertHTML"), at least.

Perhaps the story is something like Range.deleteContents or an equivalent routine trying to remove every node in the range, so they iterate over subtrees -- but they're given the root, and removing that fails, so it does nothing.


So looking at it some more, I think I have an answer.  In nsRange::CompareNodeToRange():

  nsINode* parent = aNode->GetNodeParent();
  if (!parent) {
    // can't make a parent/offset pair to represent start or 
    // end of the root node, because it has no parent.
    // so instead represent it by (node,0) and (node,numChildren)
    parent = aNode;
    nodeStart = 0;
    nodeEnd = aNode->GetChildCount();
  }
  else {
    nodeStart = parent->IndexOf(aNode);
    nodeEnd = nodeStart + 1;
  }

This means that as far as CompareNodeToRange() is concerned, if you have a range whose start is (node, 0) and end is (node, node.length), the node is normally not contained in the range -- you have (parent, nodeStart) < (node, 0) and (node, node.length) < (parent, nodeEnd).  This matches the definition of "contains" in the editing spec, and what's used by deleteContents, etc.  But it uses a different definition for the root node -- a range with start (root, 0) and end (root, root.length) *will* contain the root.  So ContentSubtreeIterator works around this by special-casing the root node.  If CompareNodeToRange had no special case for the root node, and just said that no range can ever contain it, GetTopAncestorInRange could lose the special case too.


I've filed bug 765205 on this, and will mention the bug in a code comment when I check in the patches for this bug.
Comment 19 Boris Zbarsky [:bz] (still a bit busy) 2012-06-15 13:31:34 PDT
> I've filed bug 765205 on this, and will mention the bug in a code comment

Perfect, thanks!

Note You need to log in before you can comment on or make changes to this bug.