Closed Bug 1216001 Opened 9 years ago Closed 8 years ago

Freeze for a long time when selecting text in long page with lots of user-select:none bits

Categories

(Core :: DOM: Selection, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla47
Tracking Status
firefox44 --- affected
firefox47 --- fixed

People

(Reporter: xidorn, Assigned: MatsPalmgren_bugz)

References

(Depends on 1 open bug)

Details

(Keywords: hang, perf, Whiteboard: [content perf])

Attachments

(5 files, 6 obsolete files)

1001 bytes, patch
Details | Diff | Splinter Review
6.55 KB, patch
bzbarsky
: review+
Details | Diff | Splinter Review
5.73 KB, patch
bzbarsky
: review+
Details | Diff | Splinter Review
6.07 KB, patch
bzbarsky
: review+
Details | Diff | Splinter Review
fix
961 bytes, patch
xidorn
: review+
Details | Diff | Splinter Review
Steps to reproduce:
1. open a large page like https://dxr.mozilla.org/mozilla-central/source/layout/base/nsCSSFrameConstructor.cpp
2. press Ctrl-A (or Cmd-A for Mac) to select the whole page

Expected result:
The whole page is selected instantaneously

Actual result:
The whole page freezes for several seconds


Here is a profile:
http://people.mozilla.org/~bgirard/cleopatra/#report=c4a4ba8d892197fa3e83309f8ebefaf898f341a9&filter=[{%22type%22%3A%22RangeSampleFilter%22,%22start%22%3A191070,%22end%22%3A212920}]&selection=0,758,783,26,31,32,34,64,158,36,37,38,819

It seems most of the time are spent in FragmentOrElement::IndexOf, which is somehow indirectly called by nsDisplayText::Paint.
I guess that call is from IsSelected() which is called by nsTextFrame::PaintText() several times.
Is the indexof cache not working correctly here for some reason?
Is this a DOM bug then?
Flags: needinfo?(bzbarsky)
Keywords: hang, perf
Well, calling IndexOf is known-bad and people shouldn't do it.  I mean, we're trying to _remove_ that API.  So ideally the text layout code would stop doing that.

Past that, though, calls to IndexOf that come in order (or reverse order, or _some_ locality of reference) on a given parent should be pretty fast due to the indexof cache.  If you make calls on many different parents interleaved, or totally randomly on kids of a single parent that has a lot of kids, you will have a bad time because the indexof cache will not be able to help you out.

In this case, I'm seeing a single parent (an HTMLTableCellElement) get IndexOf calls whose sequence of return values goes something like: 1, 11522, 1, 10308, 1, 16190, 1, 3398, 17430, etc.  This access pattern does not exhibit good locality of reference, mmkay?  So the proximate answer to my question in comment 2 is: "Yes, it's not working because the access pattern is dumb."

The reason for this access pattern is that nsTextFrame::IsFrameSelected does:

7056      return nsRange::IsNodeSelected(GetContent(), GetContentOffset(),
7057                                     GetContentEnd());

OK, so then we land in nsRange::IsNodeSelected.  This walks up the tree until it finds a node that's a common ancestor for a range in the selection.  That's OK so far.  Then it walks up its ancestor chain, getting a RangeHashTable off each one.  For each RangeHashTable, it looks at all the ranges in that hashtable and compares the given node/offset pair to the start/end of the range.

That's all great so far, except in this case the hashtable contains 12462 ranges (yes, really).  Each one has the table cell in question as end parent, with wildly different end offsets.  The start parent for each is a different textnode, with all start offsets being 0.  What the offsets of the nearest ancestor of those ranges that is a child of the table cell are, I don't know.

Anyway.  So we have thousands of textnodes, each of which we compare against thousands of ranges.  And each comparison takes O(thousands) time because we do it with crappy locality of reference and hence totally negate the indexof cache.  So now we've got an O(N^3) algorithm and N is on the order of 10000 or more (12000 ranges, 25000 kids of the table cell, I'm guessing about 12000 textnodes involved, etc). And 1e12 is a large number of operations even for a modern CPU if you want it to be done in 16ms.

I will claim that the existence of 12000 ranges here is not a DOM bug.  Why we're doing that, I have no clue.

The the  approach of comparing everything to everything that the layout code is taking, and doing it in random order due to the hashtable enumeration seem to basically be fallout from bug 619273: before that IsSelected was just a bit check on the frame.

The IsNodeSelected call and its accompanying data structure seem to come from
Component: Layout: Text → Selection
Flags: needinfo?(bzbarsky) → needinfo?(mats)
> The IsNodeSelected call and its accompanying data structure seem to come from

Er, ignore that part, forgot to delete it.

> Why we're doing that, I have no clue.

Well, I sort of do.  We land in mozilla::dom::Selection::SetAnchorFocusToRange which adds a range to the selection.  The selection code then tries to exclude the non-selectable nodes, via nsRange::ExcludeNonSelectableNodes, which breaks up the single selection range into lots of small ranges.  This seems to come from bug 739396.  Why are there non-selectable nodes?  Because all the <span>s containing line numbers in the <td id="line-numbers"> are styled "user-select: none".  If I turn off that style via devtools, Cmd-A is pretty much instantaneous.

So what's super-dumb about all this is that all this time is being taken to select the whitespace between those unselectable spans, basically...

Anyway, given the provenance of these ranges and their totally non-overlapping and in-order nature it seems like the least we could do is store them in some order-preserving data structure that would at least give us some locality of reference.  Or maybe we could not create ranges for (collapsed!) whitespace between unselectable things.  Or maybe both?

Anyway, the interaction of the fixes for bug 739396 and bug 619273 is very much not good here.
Blocks: 619273, 739396
Summary: Freeze for a long time when selecting text in long page → Freeze for a long time when selecting text in long page with lots of user-select:none bits
So, for the given STR, I typically see four *different* SELECTION_NORMAL Selections with
the same set of (cloned) ranges.  The root cause of *that* problem seems to be this:

#0  mozilla::dom::Selection::Selection () at layout/generic/nsSelection.cpp:3306
#1  NS_NewDomSelection () at layout/generic/nsSelection.cpp:253
#2  nsHTMLCopyEncoder::SetSelection () at dom/base/nsDocumentEncoder.cpp:1505
#3  SelectionCopyHelper () at dom/base/nsCopySupport.cpp:178
#4  nsCopySupport::HTMLCopy () at dom/base/nsCopySupport.cpp:286
#5  nsAutoCopyListener::NotifySelectionChanged () at layout/generic/nsSelection.cpp:6301
#6  mozilla::dom::Selection::NotifySelectionListeners () at layout/generic/nsSelection.cpp:5955
#7  nsFrameSelection::NotifySelectionListeners () at layout/generic/nsSelection.cpp:2252
#8  nsFrameSelection::EndBatchChanges () at layout/generic/nsSelection.cpp:2241
#9  mozilla::dom::Selection::EndBatchChangesInternal () at layout/generic/nsSelection.cpp:5981
#10 mozilla::dom::SelectionBatcher::~SelectionBatcher () at mozilla/dom/Selection.h:351
#11 mozilla::dom::Selection::SelectAllChildren () at layout/generic/nsSelection.cpp:5540
...

http://hg.mozilla.org/mozilla-central/annotate/5f9ba76eb3b1/dom/base/nsDocumentEncoder.cpp#l1500

Furthermore, if I press and hold CTRL+A so that it repeats, I see hundreds of these Selection
copies in IsNodeSelected!  After releasing and waiting 10 seconds or so it drops down to
the original four (I'm guessing we depend on CC to get those extra removed).  But I always
have those original four for some reason.

This is a related but somewhat orthogonal problem to the one you're describing.
It might be worth fixing though, perhaps by replacing nsDocumentEncoder::mSelection
with a nsTArray of nsRange clones or something?  I'm guessing these are only needed
during the encoding and can then be removed?  Involving a Selection (which incurs
registering the ranges in the DOM tree) seems unnecessary here.
Anyway, this problem should go in a separate bug I think.

FWIW, if I apply this patch I only see one SELECTION_NORMAL Selection in IsNodeSelected
so I'm pretty sure this is where the copies came from.
Assignee: nobody → mats
Flags: needinfo?(mats)
So I made three improvements to IsNodeSelected.  First, we can access
the ranges in their sorted Selection order.  Second, since we know
they are sorted we can do a binary search.  Third, we can calculate
a hash for each range (just the four start/end/node/offset members
should be enough?) and prune duplicates (this sort of wallpapers
over the problem described in comment 6).

The above gives a two orders of magnitude improvement in execution
time for IsNodeSelected in an Opt build on Linux64.  The STR is
still a tad slow though, it takes ~1 second from keypress to seeing
selected text.  I'm not sure where all that time goes.

I usually measure perf using Instruments on OSX but an up-to-date
m-c tree fails to compile there for some reason...  something called
libavcodec under media/ fails to build it seems.
BTW, is there a way to measure the hit rate for the IndexOf cache?  Where does that live?
Flags: needinfo?(bzbarsky)
(On Linux I use http://www.rotateright.com/zoom-profiler/ for profiling. It gives a nice UI on top of low level profiling drivers. Or Gecko profiler https://developer.mozilla.org/en-US/docs/Performance/Profiling_with_the_Built-in_Profiler)
> BTW, is there a way to measure the hit rate for the IndexOf cache?  Where does that live?

It lives in nsAttrAndChildArray.cpp.  It's just a 128-slot cache that maps nsAttrAndChildArray* to int32_t.  The idea is that when you do someParent->IndexOf(someChild) we look up the someParent's nsAttrAndChildArray* in the cache.  If this hits (as in, one of those 128 slots corresponds to the given pointer), then instead of doing a linear scan through the child list for someChild we do a scan starting at the cached index and then moving outward.  Then when we find someChild we store the index we found it at in the cache.  See nsAttrAndChildArray::IndexOfChild.  We only do this if the parent has more than 10 kids.

So you can "miss" in the sense of the parent having nothing in the cache at all, or you can "miss" in the sense that the cached index is 5 and the child is at index 12000, and on the next call the child is at index 6, etc: the cache just won't help much with such non-local references.  It was really designed to make repeated nextSibling traversal fast back when we didn't have direct nextSibling links.
Flags: needinfo?(bzbarsky)
Attached patch fix (obsolete) — Splinter Review
This part implements the first two optimizations mentioned above -
access the ranges in their sorted Selection order, using binary search.

With this patch, we go from 99.7% IndeoxOf time to ~3% (or less).
(measured with Instruments on an OSX Opt build)

I'm not sure there is much more we can do to improve the lookup here.
The next step to optimize this is probably to cache the select state
on the frames or something.
Attachment #8715592 - Flags: review?(bzbarsky)
This patch wallpapers the second problem mentioned in comment 6.
I investigated this on OSX as well and it turns out it doesn't
occur there.  I only see one Selection object with the huge
number of ranges for this STR, so I suspect this problem only
occurs in Linux builds (or "gtk" builds rather).

The performance improvement wasn't that great on Linux either
because when all nodes are selected you'll get a hit on the
first Selection and return before testing the clones.
I guess the worst case scenario is when most of the page is
selected and you keep repainting the unselected nodes.

So, it doesn't seem worth taking this part.
Depends on: 1245883
(filed bug 1245883 on the Selection cloning problem on Linux)
Comment on attachment 8715592 [details] [diff] [review]
fix

So I'm trying to understand what this code is trying to do in general.  Is it looking for a range that entirely contains everything from (aNode, aStartOffset) to (aNode, aEndOffset)?

If so, I would prefer that you used nsTArray::BinaryIndexOf with a custom comparator, instead of reimplementing binary search.

But I'm pretty worried about the (temporary) memory blowup from each IsNodeSelected call.  Seems like we would have a _lot_ of such calls, no?  And each one could be allocating a huge amount of memory with this new setup...  Not sure whether there's a good way to address that.
Attachment #8715592 - Flags: review?(bzbarsky)
Comment on attachment 8715592 [details] [diff] [review]
fix

(In reply to Boris Zbarsky [:bz] from comment #14)
> So I'm trying to understand what this code is trying to do in general.  Is
> it looking for a range that entirely contains everything from (aNode,
> aStartOffset) to (aNode, aEndOffset)?

No, it's looking for a range that at least partly overlaps
(aNode, aStartOffset) to (aNode, aEndOffset).
I'll document this better in the header file.

> If so, I would prefer that you used nsTArray::BinaryIndexOf with a custom
> comparator, instead of reimplementing binary search.

We could use nsTArray::BinaryIndexOf, but I suspect that would roughly
double the number of calls to nsContentUtils::ComparePoints, because
BinaryIndexOf takes my Comparator and wraps it in a ItemComparatorEq:
http://hg.mozilla.org/mozilla-central/annotate/4295f9951e93/xpcom/glue/nsTArray.h#l1164
which first calls Equals and then typically also calls LessThan:
http://hg.mozilla.org/mozilla-central/annotate/4295f9951e93/xpcom/glue/nsTArray.h#l749

Calling mozilla::BinarySearchIf directly though should work better.
I'll try that.

> But I'm pretty worried about the (temporary) memory blowup from each
> IsNodeSelected call.  Seems like we would have a _lot_ of such calls, no? 

FYI, it only affects pages that actually have some part selected.
(or have a text control, but those only affect small number of frames)
A Selection usually have exactly one range.  And only nodes that are in
the subtree of the common ancestor of that nsRange ever makes a call:
http://hg.mozilla.org/mozilla-central/annotate/4295f9951e93/layout/generic/nsFrame.cpp#l9068
http://hg.mozilla.org/mozilla-central/annotate/4295f9951e93/layout/generic/nsFrame.cpp#l6240

So, I think it's really only this edge case that is problematic.

> And each one could be allocating a huge amount of memory with this new
> setup...  Not sure whether there's a good way to address that.

For this edge case, yes.  (I'm not sure I'd call a nsTArray or
a Hashtable with a few thousand pointers "huge" exactly, but YMMV :-) )
It's very transient though.

I looked closer at the content of this page and the line numbers are
a bunch of sibling <span>s (with -moz-user-select:none) with ignorable
whitespace text nodes between.  I think it would be OK to treat such
whitespace nodes as unselectable when they are next to an unselectable
node.  That would make all these siblings appear as one big unselectable
chunk, so if the original range starts before and ends after this chunk
it will become just two ranges instead of thousands.  It should fix
the problem on these DXR pages at least, and hopefully make this edge
case somewhat more rare in general.
Attachment #8715592 - Attachment is obsolete: true
Now using mozilla::BinarySearchIf.  Otherwise the same strategy.
Documented IsNodeSelected().
Attachment #8716791 - Flags: review?(bzbarsky)
I also noticed that IsSelected() was called at least twice during painting.
This patch caches the result to avoid that.
Attachment #8716796 - Flags: review?(bzbarsky)
Comment on attachment 8716791 [details] [diff] [review]
part 1 - Optimize nsRange::IsNodeSelected.

r=me, but want to fix http://mxr.mozilla.org/mozilla-central/source/mfbt/BinarySearch.h#50 to mention mTarget instead of the nonexistent mValue?
Attachment #8716791 - Flags: review?(bzbarsky) → review+
Comment on attachment 8716792 [details] [diff] [review]
part 2 - Optimize nsRange::ExcludeNonSelectableNodes by counting ignorable whitespace text nodes next to an unselectable node as unselectable too.

Please at least document why firstNonSelectableContent merely being non-null means we're next to it.  Because that's entirely non-obvious from this code.

r=me assuming that is in fact true.
Attachment #8716792 - Flags: review?(bzbarsky) → review+
Comment on attachment 8716796 [details] [diff] [review]
part 3 - Cache the result of IsSelected() for the duration of painting.

>+    auto& item = const_cast<nsCharClipDisplayItem&>(aItem);

I would prefer this used mutable instead of const_cast.
Attachment #8716796 - Flags: review?(bzbarsky) → review-
The node we're at isn't necessarily next to firstNonSelectableContent, but rather
firstNonSelectableContent being non-null means we're next to a sequence of
non-selectable nodes (where firstNonSelectableContent is the first).

I added a couple of comments, does that help?
Attachment #8717151 - Flags: review?(bzbarsky)
s/Collapsible/Ignorable/
Attachment #8717151 - Attachment is obsolete: true
Attachment #8717151 - Flags: review?(bzbarsky)
Attachment #8717153 - Flags: review?(bzbarsky)
'mutable' version
Attachment #8716796 - Attachment is obsolete: true
Attachment #8717154 - Flags: review?(bzbarsky)
(In reply to Boris Zbarsky [:bz] from comment #19)
Sure, I'll fix that typo when I land.
Comment on attachment 8717153 [details] [diff] [review]
part 2 - Optimize nsRange::ExcludeNonSelectableNodes by counting ignorable whitespace text nodes next to an unselectable node as unselectable too.

This helps a bit, yes.  But now I'm trying to understand the actual condition used in ExcludeIfNextToNonSelectable.  NS_CREATE_FRAME_IF_NON_WHITESPACE makes sense, but why are we also checking for aContent->TextIsOnlyWhitespace()?  Certainly simply being whitespace does not mean that a node is _ignorable_ whitespace...

Worth adding tests using "white-space: pre" and the like that would have caught this.
Attachment #8717153 - Flags: review?(bzbarsky) → review-
Comment on attachment 8717154 [details] [diff] [review]
part 3 - Cache the result of IsSelected() for the duration of painting.

r=me
Attachment #8717154 - Flags: review?(bzbarsky) → review+
Oops, fixed that and added tests.
Attachment #8716792 - Attachment is obsolete: true
Attachment #8717153 - Attachment is obsolete: true
Attachment #8717239 - Flags: review?(bzbarsky)
Comment on attachment 8717239 [details] [diff] [review]
part 2 - Optimize nsRange::ExcludeNonSelectableNodes by counting ignorable whitespace text nodes next to an unselectable node as unselectable too.

>+  checkText("aaaa\n\nbbbb\n", e);

That's a bit odd.  I would have expected three \n between "aaaa" and "bbbb", and nothing after the last 'b'.  r=me with a decent explanation of what's going on here.
Attachment #8717239 - Flags: review?(bzbarsky) → review+
(In reply to Boris Zbarsky [:bz] from comment #29)
> That's a bit odd.  I would have expected three \n between "aaaa" and "bbbb",
> and nothing after the last 'b'.  r=me with a decent explanation of what's
> going on here.

Yeah, the serialization result doesn't seem right, but that seems like an orthogonal problem.
I filed bug 1247799 about it (and made a note in the test).
Flags: in-testsuite+
Looks like minor reftest failures due to anti-aliasing.  I'm pretty sure it's unrelated,
so I'll add some fuzzy-if annotations and re-land.
Comment on attachment 8717154 [details] [diff] [review]
part 3 - Cache the result of IsSelected() for the duration of painting.

Review of attachment 8717154 [details] [diff] [review]:
-----------------------------------------------------------------

::: layout/generic/nsTextFrame.cpp
@@ +4840,5 @@
>    if (((GetStateBits() & TEXT_NO_RENDERED_GLYPHS) ||
>         (NS_GET_A(StyleColor()->mColor) == 0 && !StyleText()->HasTextShadow())) &&
> +      aBuilder->IsForPainting() && !IsSVGText()) {
> +    isSelected.emplace(IsSelected());
> +    if (!isSelected) {

Hmmm, this is equal to "isSelected.isNothing()", and since there is already an "isSelected.emplace()" immediately above, it is actually an always-false condition.

But this whole optimization could be wrong consider animation, visited style, and -webkit-text-fill-color (because of the naive NS_GET_A(StyleColor()->mColor) check). I wonder whether we should fix it or just remove this optimization.
Thoughts?
Flags: needinfo?(mats)
Good catch though, thanks.

My thoughts right now are that I f*cking hate "operator bool" on Maybe<>.  :-)
Seriously, it's error prone and makes the code harder to understand.
We should just remove it and require explicit .isNothing()/.isSome() everywhere.

Regarding the optimization, it was added by roc with the motivation in:
https://bugzilla.mozilla.org/show_bug.cgi?id=1099977#c5
The TEXT_NO_RENDERED_GLYPHS part seems legit, the "mColor == 0 ..." part
seems a bit fishy though, we might want to remove that bit.
Flags: needinfo?(mats)
Attached patch fixSplinter Review
Let's fix the typo for now and handle any change to the optimization
as a separate bug.
Attachment #8736107 - Flags: review?(quanxunzhen)
Looking at his patch though, it seems the "mColor == 0" part is older:
https://bugzilla.mozilla.org/attachment.cgi?id=8523618&action=diff#a/layout/generic/nsTextFrame.cpp_sec4
That "mColor == 0" part is from bug 1054161 and GetVisitedDependentColor was discussed in
bug 1054161 comment 2 / 3 so it's probably OK.
Comment on attachment 8736107 [details] [diff] [review]
fix

Review of attachment 8736107 [details] [diff] [review]:
-----------------------------------------------------------------

Although neither you nor me is a peer... but I guess it is probably fine for such simple patch :)
Attachment #8736107 - Flags: review?(quanxunzhen) → review+
(In reply to Mats Palmgren (:mats) from comment #41)
> That "mColor == 0" part is from bug 1054161 and GetVisitedDependentColor was
> discussed in
> bug 1054161 comment 2 / 3 so it's probably OK.

Oh, okay, although I don't think the reason behind doing so still makes any sense, given bug 884270.
If I understand bug 1054161 comment 3 correctly, it means :visited styles can't affect this,
but feel free to file a bug if you see some issue.
(In reply to Mats Palmgren (:mats) from comment #44)
> If I understand bug 1054161 comment 3 correctly, it means :visited styles
> can't affect this,

Yes. I mean, we do that explicitly because we want to avoid timing attack to user's history via visited style. But given new timing attack has been found, removing that part of code wouldn't make anything worse :) But that should be in another bug even if we do want to do something there. I didn't mean to ask you adding anything to your patch. Sorry if that comment confused you.
Depends on: 1327902
See Also: → 1396667
You need to log in before you can comment on or make changes to this bug.