Last Comment Bug 767169 - "Assertion failure: NS_SUCCEEDED(nsRange::CompareNodeToRange(lastCandidate, mRange, &nodeBefore, &nodeAfter))"
: "Assertion failure: NS_SUCCEEDED(nsRange::CompareNodeToRange(lastCandidate, m...
Status: RESOLVED FIXED
: assertion, testcase
Product: Core
Classification: Components
Component: Editor (show other bugs)
: Trunk
: All All
: -- critical (vote)
: mozilla16
Assigned To: :Aryeh Gregor (working until September 2)
:
Mentors:
Depends on: 772387 770580 784905
Blocks: 336383 1055722
  Show dependency treegraph
 
Reported: 2012-06-21 14:48 PDT by Jesse Ruderman
Modified: 2014-08-19 11:37 PDT (History)
4 users (show)
ayg: in‑testsuite+
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---


Attachments
testcase (asserts fatally when loaded) (474 bytes, text/html)
2012-06-21 14:48 PDT, Jesse Ruderman
no flags Details
stack trace (13.45 KB, text/plain)
2012-06-21 14:48 PDT, Jesse Ruderman
no flags Details
Patch part 1, v1 -- Partial cleanup of Selection::selectFrames (5.93 KB, patch)
2012-06-27 09:21 PDT, :Aryeh Gregor (working until September 2)
ehsan: review+
Details | Diff | Splinter Review
diff -w for patch part 1, v1 (4.93 KB, patch)
2012-06-27 09:21 PDT, :Aryeh Gregor (working until September 2)
no flags Details | Diff | Splinter Review
Patch part 2, v1 -- Handle lastCandidate being null in nsContentSubtreeIterator::Init (2.86 KB, patch)
2012-06-28 02:59 PDT, :Aryeh Gregor (working until September 2)
bzbarsky: review+
Details | Diff | Splinter Review
Patch part 1, v2 -- Partial cleanup of Selection::selectFrames (6.42 KB, patch)
2012-06-29 03:02 PDT, :Aryeh Gregor (working until September 2)
ehsan: review+
Details | Diff | Splinter Review
Patch part 3, v1 -- Use script runner for nsHTMLEditor::ResetRootElementAndEventTarget (3.59 KB, patch)
2012-07-02 07:54 PDT, :Aryeh Gregor (working until September 2)
ehsan: review+
bzbarsky: review+
Details | Diff | Splinter Review

Description Jesse Ruderman 2012-06-21 14:48:18 PDT
Created attachment 635483 [details]
testcase (asserts fatally when loaded)

Assertion failure: (((__builtin_expect(!!(!(( nsRange::CompareNodeToRange(lastCandidate, mRange, &nodeBefore, &nodeAfter)) & 0x80000000)), 1)))), at content/base/src/nsContentIterator.cpp:1295

nsRange::CompareNodeToRange returned NS_ERROR_UNEXPECTED, tripping the assertion:

MOZ_ALWAYS_TRUE(NS_SUCCEEDED(
  nsRange::CompareNodeToRange(lastCandidate, mRange, &nodeBefore, &nodeAfter)));
Comment 1 Jesse Ruderman 2012-06-21 14:48:56 PDT
Created attachment 635485 [details]
stack trace
Comment 2 :Aryeh Gregor (working until September 2) 2012-06-27 08:14:36 PDT
(Contrary to the comment in the test-case, this isn't actually a quirks vs. standards issue.  The same bug occurs if you have <!doctype html> but remove it before doing anything else in boom() -- it's just that there can't be a doctype node.)
Comment 3 :Aryeh Gregor (working until September 2) 2012-06-27 09:18:34 PDT
One problem here is that we're passing a range to nsContentSubtreeIterator::Init whose end offset is greater than the length of its end node -- this is being called from nsNodeUtils::ContentRemoved, and the ranges haven't been adjusted yet.

I suspect this is very broken.  If a node has been removed and the end offset of the selection hasn't yet been adjusted, a bogus node will be selected here.  Selection::selectFrames will then do whatever it does to it.

Ehsan, do you have any insight into what the right behavior here is?


This might not be the only way to trigger the bug, though.  The issue is that lastCandidate is null in nsContentSubtreeIterator::Init.  This is because node is the document, so GetPrevSibling(node) is null, and that's what we set lastCandidate to.  So I'll write a patch to explicitly handle the !lastCandidate case by "MakeEmpty(); return NS_OK;", as is done for firstCandidate.
Comment 4 :Aryeh Gregor (working until September 2) 2012-06-27 09:21:22 PDT
Created attachment 637155 [details] [diff] [review]
Patch part 1, v1 -- Partial cleanup of Selection::selectFrames

Possibly unrelated to the bug.  I say "partial" because I'm quite suspicious of the error-handling here in the case where various nodes aren't content -- I preserved existing behavior, but maybe that isn't what we actually want.  If non-content nodes weren't an error condition, we could make the function return void.  (Only one caller pays attention to the return code anyway.)
Comment 5 :Aryeh Gregor (working until September 2) 2012-06-27 09:21:50 PDT
Created attachment 637156 [details] [diff] [review]
diff -w for patch part 1, v1

The patch isn't very long, but it does do some outdenting.
Comment 6 :Ehsan Akhgari 2012-06-27 09:54:47 PDT
(In reply to :Aryeh Gregor from comment #3)
> One problem here is that we're passing a range to
> nsContentSubtreeIterator::Init whose end offset is greater than the length
> of its end node -- this is being called from nsNodeUtils::ContentRemoved,
> and the ranges haven't been adjusted yet.
> 
> I suspect this is very broken.  If a node has been removed and the end
> offset of the selection hasn't yet been adjusted, a bogus node will be
> selected here.  Selection::selectFrames will then do whatever it does to it.
> 
> Ehsan, do you have any insight into what the right behavior here is?

Could you perhaps use a script blocker to defer this to until after the range has been updated?
Comment 7 :Aryeh Gregor (working until September 2) 2012-06-28 02:59:39 PDT
Created attachment 637430 [details] [diff] [review]
Patch part 2, v1 -- Handle lastCandidate being null in nsContentSubtreeIterator::Init

This doesn't answer the question of "why the heck are we being passed a bogus range?", but even if the range is legit, I don't see any obvious reason why lastCandidate couldn't be null here, so we need the check anyway.  If we decide that nsContentSubtreeIterator shouldn't be legitimately passed invalid ranges, I can remove the newly-added comment and add an assertion to that effect to the start of Init().  The check I'm adding is paralleled by an existing check for firstCandidate:

  if (!firstCandidate) {
    // then firstCandidate is next node after node
    firstCandidate = GetNextSibling(node);

    if (!firstCandidate) {
      MakeEmpty();
      return NS_OK;
    }
  }

(In reply to Ehsan Akhgari [:ehsan] from comment #6)
> Could you perhaps use a script blocker to defer this to until after the
> range has been updated?

How would that help?  That stops JS scripts from running, right?  But the issue here is native code -- there are no mutation listeners or anything in the stack.  I don't know what the native code does, so I don't know if we want to defer it (or how to).  But maybe I don't understand what you're saying, or don't understand what script blockers do.

Try: https://tbpl.mozilla.org/?tree=Try&rev=7ecf5632e7d7
Comment 8 :Aryeh Gregor (working until September 2) 2012-06-28 05:02:27 PDT
Comment on attachment 637430 [details] [diff] [review]
Patch part 2, v1 -- Handle lastCandidate being null in nsContentSubtreeIterator::Init

There are test failures, so I'm deferring the review request until I figure them out.
Comment 9 :Aryeh Gregor (working until September 2) 2012-06-28 06:22:56 PDT
Comment on attachment 637430 [details] [diff] [review]
Patch part 2, v1 -- Handle lastCandidate being null in nsContentSubtreeIterator::Init

Okay, so the problem was actually with part 1.  I'm trying out a fix for that now:


diff --git a/layout/generic/nsSelection.cpp b/layout/generic/nsSelection.cpp
--- a/layout/generic/nsSelection.cpp
+++ b/layout/generic/nsSelection.cpp
@@ -80,17 +80,16 @@ static NS_DEFINE_CID(kFrameTraversalCID,
 #include "nsDOMError.h"
 #include "mozilla/dom/Element.h"

 using namespace mozilla;

 //#define DEBUG_TABLE 1

 static NS_DEFINE_IID(kCContentIteratorCID, NS_CONTENTITERATOR_CID);
-static NS_DEFINE_IID(kCSubtreeIteratorCID, NS_SUBTREEITERATOR_CID);

 //PROTOTYPES
 class nsFrameSelection;
 class nsAutoScrollTimer;

 static bool IsValidSelectionPoint(nsFrameSelection *aFrameSel, nsINode *aNode);

 static nsIAtom *GetTag(nsINode *aNode);
@@ -4025,17 +4024,17 @@ Selection::selectFrames(nsPresContext* a
       } else {
         endOffset = content->Length();
       }
       textFrame->SetSelectedRange(startOffset, endOffset, aFlags, mType);
     }
   }

   iter->First();
-  nsCOMPtr<nsIContentIterator> inneriter = NS_NewContentSubtreeIterator();
+  nsCOMPtr<nsIContentIterator> inneriter = NS_NewContentIterator();
   for (iter->First(); !iter->IsDone(); iter->Next()) {
     content = do_QueryInterface(iter->GetCurrentNode());
     SelectAllFramesForContent(inneriter, content, aFlags);
   }

   // We must now do the last one if it is not the same as the first
   if (aRange->GetEndParent() != aRange->GetStartParent()) {
     nsresult res;

But the second part should still be reviewable while I get the first part fixed up.
Comment 10 Boris Zbarsky [:bz] 2012-06-28 09:36:12 PDT
Comment on attachment 637430 [details] [diff] [review]
Patch part 2, v1 -- Handle lastCandidate being null in nsContentSubtreeIterator::Init

r=me
Comment 11 :Ehsan Akhgari 2012-06-28 12:35:12 PDT
I thought we set a script blocker when ranges are being updated...  Boris, don't we do that?
Comment 12 Boris Zbarsky [:bz] 2012-06-28 12:39:18 PDT
I don't know, offhand.
Comment 13 :Ehsan Akhgari 2012-06-28 12:42:16 PDT
Can you give this a shot, Aryeh?  (FWIW, I'm talking about nsContentUtils::AddScriptRunner and friends)
Comment 14 :Aryeh Gregor (working until September 2) 2012-06-29 03:02:39 PDT
Created attachment 637816 [details] [diff] [review]
Patch part 1, v2 -- Partial cleanup of Selection::selectFrames

In my cleanup, I accidentally made a regular iterator into a subtree iterator, which understandably broke stuff.  Interdiff is in comment 9; green try run at <https://tbpl.mozilla.org/?tree=Try&rev=f4e252126e3c>.

(In reply to Ehsan Akhgari [:ehsan] from comment #11)
> I thought we set a script blocker when ranges are being updated...  Boris,
> don't we do that?

(In reply to Ehsan Akhgari [:ehsan] from comment #13)
> Can you give this a shot, Aryeh?  (FWIW, I'm talking about
> nsContentUtils::AddScriptRunner and friends)

I'd be happy to look at this, but can you explain more clearly what I should do?  There's no user script running here on the stack above removeChild(); the assert is caused by nsHTMLEditor::ResetRootElementAndEventTarget resetting the selection.  Do you want me to wrap one of the function calls in nsContentUtils::AddScriptRunner so it doesn't run until the range is adjusted?  If so, which one?  I don't know which parts of this are safe to defer, because I don't know what else can happen before scripts can safely run.
Comment 15 :Ehsan Akhgari 2012-06-29 09:48:32 PDT
OK, here's how script blockers work.  When some piece of code wants to do something during which running scripts is not safe, it calls AddScriptBlocker (usually through allocating an nsAutoScriptBlocker RAII object on the stack), and when it's done it calls RemoveScriptBlocker.

Now, when you have native code that should not run during the period where a script blocker is active, you use AddScriptRunner to tell the script blocking mechanism about your code.  AddScriptRunner will run your code immediately if there is currently no active script blockers.  If there is, it will hold on to the runnable object and will run it once the last RemoveScriptBlocker is called (script blockers can be nested, and they're implemented using a counter.)

The reason I suggested this is that if we block scripts before the range is updated (which I suspect we do), then this will give you the ability to run your code after the ranges are updated.  It's sort of an abuse of "script" blockers, but it wouldn't be the first in Gecko.  :-)
Comment 16 :Aryeh Gregor (working until September 2) 2012-07-01 02:05:05 PDT
That brings me back to:

(In reply to :Aryeh Gregor from comment #14)
> Do you want me to wrap one of the function calls
> in nsContentUtils::AddScriptRunner so it doesn't run until the range is
> adjusted?  If so, which one?  I don't know which parts of this are safe to
> defer, because I don't know what else can happen before scripts can safely
> run.

I don't want to find out that I deferred the execution of something in a way that causes other stuff to break because I didn't understand what I was doing.
Comment 19 :Ehsan Akhgari 2012-07-01 10:55:49 PDT
I think you should try to call ResetRootElementAndEventTarget itself through a script runner.  I don't think it will have bad effects, but we should of course test it on try.
Comment 20 :Aryeh Gregor (working until September 2) 2012-07-02 07:54:47 PDT
Created attachment 638356 [details] [diff] [review]
Patch part 3, v1 -- Use script runner for nsHTMLEditor::ResetRootElementAndEventTarget

Requesting review from Ehsan for editor/, Boris for content/.  The added MOZ_ASSERT in nsContentIterator.cpp causes the test from part 2 to crash without the editor/ change -- it should help catch future cases where we accidentally pass bogus ranges.

Try: https://tbpl.mozilla.org/?tree=Try&rev=cc21722b8de2
Comment 21 :Aryeh Gregor (working until September 2) 2012-07-05 01:51:57 PDT
Oops -- I pushed the last patch here without remembering that I asked for review from bz too:

https://hg.mozilla.org/integration/mozilla-inbound/rev/f49e52ee86f3

I backed it out:

https://hg.mozilla.org/integration/mozilla-inbound/rev/25bdbb0faa4a
Comment 22 Boris Zbarsky [:bz] 2012-07-06 11:02:39 PDT
Comment on attachment 638356 [details] [diff] [review]
Patch part 3, v1 -- Use script runner for nsHTMLEditor::ResetRootElementAndEventTarget

r=me
Comment 23 :Aryeh Gregor (working until September 2) 2012-07-08 05:54:21 PDT
https://hg.mozilla.org/integration/mozilla-inbound/rev/c8715136091b
Comment 24 Ryan VanderMeulen [:RyanVM] 2012-07-08 10:50:34 PDT
https://hg.mozilla.org/mozilla-central/rev/c8715136091b

Note You need to log in before you can comment on or make changes to this bug.