Closed Bug 190147 Opened 22 years ago Closed 2 years ago

Setting the value of a textarea is very slow

Categories

(Core :: Spelling checker, defect)

defect
Not set
normal

Tracking

()

RESOLVED WORKSFORME

People

(Reporter: sfraser_bugs, Unassigned)

References

(Depends on 2 open bugs, Blocks 4 open bugs, )

Details

(Keywords: perf, testcase)

Attachments

(3 files, 1 obsolete file)

Running JS that repeatedly sets the value of a textarea causes the browser to
hang for seconds or minutes. See the URL for a testcase.
I have a sampler trace, which I'll attach shortly. Here's a summary (numbers are
sample counts). Here's a reduced summary:

1075 nsHTMLTextAreaElement::SetValue(nsAString const&)
  1075 nsHTMLTextAreaElement::SetValueInternal(nsAString const&,
nsITextControlFrame*)
    1075 nsTextControlFrame::SetProperty(nsIPresContext*, nsIAtom*, nsAString
const&)
      1075 nsTextControlFrame::SetValue(nsAString const&)
        978 nsPlaintextEditor::InsertText(nsAString const&)
          779 nsTextEditRules::WillDoAction(nsISelection*, nsRulesInfo*, int*, int*)
            779 nsTextEditRules::WillInsertText(int, nsISelection*, int*, int*,
nsAString const*, nsAString*, int)
              339 nsPlaintextEditor::CreateBRImpl(nsCOMPtr<nsIDOMNode>*, int*,
nsCOMPtr<nsIDOMNode>*, short)
              249 nsPlaintextEditor::DeleteSelection(short)
              182 nsEditor::InsertTextImpl(nsAString const&,
nsCOMPtr<nsIDOMNode>*, int*, nsIDOMDocument*)

          195 nsEditor::EndPlaceHolderTransaction()
            195 nsEditor::EndUpdateViewBatch()
              130 PresShell::EndReflowBatching(int)
              65 nsViewManager::EndUpdateViewBatch(unsigned)

        56 nsTextInputSelectionImpl::SelectAll()
          56 nsSelection::SelectAll()
            56 nsSelection::TakeFocus(nsIContent*, unsigned, unsigned, int, int)
            
        37 nsTextControlFrame::GetValue(nsAString&, int)
          37 nsPlaintextEditor::OutputToString(nsAString const&, unsigned,
nsAString&)
            35 nsDocumentEncoder::EncodeToString(nsAString&)

53 nsHTMLTextAreaElement::GetValue(nsAString&)
  53 nsHTMLTextAreaElement::GetValueInternal(nsAString&, int)
    53 nsTextControlFrame::GetValue(nsAString&, int)
      53 nsPlaintextEditor::OutputToString(nsAString const&, unsigned, nsAString&)
        52 nsDocumentEncoder::EncodeToString(nsAString&)
so.. most of that time is in the editor core, no?  Are we calling it too much,
or does it just need to be faster?
Attached file Testcase
This testcase shows that we have O(n^2) stuff going on when appending to
textareas. Timings that I got:

Iters.	Time (secs)
10	0.095
20	0.248
50	1.072
100	3.578
200	12.632
400	48.266
Modifying the testcase to accumulate text with newlines into a string, and to
set the value of the textarea once, gives similar non-linear performance. There
are 2  parts of the sample that look suspicious:

1. nsBlockFrame::AddFrames gets called after content has been inserted. It uses
   nsLineBox::RFindLineContaining to find the nsLineBox and index of the
preceding
   sibling
[bad commit)

Modifying the testcase to accumulate text with newlines into a string, and to
set the value of the textarea once, gives similar non-linear performance. There
are 2  parts of the sample that look suspicious:

1. nsBlockFrame::AddFrames gets called after content has been inserted. It uses
   nsLineBox::RFindLineContaining to find the nsLineBox and index of the
   preceding sibling frame of the frames to be inserted. Before we reflow, all
   of the newly inserted frames are in a single nsLineBox, so 
   RFindLineContaining ends up spending a lot of time in nsLineBox::IndexOf. As
   the comment says, this can be O(N^2).

2. There are 2 codepaths that call nsGenericHTMLContainerElement::IndexOf,
   which spends a lot of time in nsVoidArray::IndexOf. They are nsEditor::
   GetNodeLocation, and PresShell::GetPrimaryFrameFor (called from 
   nsCSSFrameConstructor::FindPreviousSibling).

*** Bug 201363 has been marked as a duplicate of this bug. ***
I don't see this mentioned, but the same problem happens when pasting large
amounts of text into a textarea via CTRL-V or rmb->paste. This same operation
with the 'other' browser is very performant.
*** Bug 204206 has been marked as a duplicate of this bug. ***
I've created two test cases that may shed some light on this bug. It appears
that the slow-down is related not to the amount of data but the number of lines
in the textarea. I've created two sample pages, each of 250k bytes. One loads in
about 2 seconds while the other takes over one minute to load. The slow loading
one has a large number of very short lines while the fast loading has a smaller
number of long lines. Both IE and Konqueror load both pages quickly. 

 http://rodan.ncc.com/moz/fast.html
 http://rodan.ncc.com/moz/slow.html

Hopefully we can get this tracked down and fixed soon as it's making Mozilla
almost unusable for some pages on our intranet!
Keywords: testcase
OS: MacOS X → All
Hardware: Macintosh → All
I wonder if the fix for bug 52005 helped here?
No, not really.  Comment 6 sums up the situation here.  See also bug 237735
comment 4 and bug 237735 comment 6 for more detailed discussion of the same,
including discussion of possible fixes.

We should probably split this issue up into two bugs -- one on making blocks
handle <br> more gracefully and one on editor not being quite so silly about
what it's doing.... (this bug and bug 237735 would probbaly work fine as those
two bugs).
Depends on: 237735
Is the test case from <a
href="http://bugzilla.mozilla.org/show_bug.cgi?id=190147#c10">comment 10</a>
still needed? I've had it up for about 4 months and haven't seen any sign that
anyone has accessed it. I wasn't sure if that was just because no one is working
on a fix for this problem or because the test case turned out not to be useful.
If it's not needed I'll remove it...
The testcase is absolutely useful.  It'd be even better attached to the bug, if
it can be run from Bugzilla...

Note that the fix for bug 240291 probably helped a lot with the first issue
described in comment 6.  So this is probably worth retesting in a current build.
*** Bug 309369 has been marked as a duplicate of this bug. ***
with linux seamonkey trunk 2005092505 and 1.3 (roughly same time as comment 4)

     trunk    1.3
10   0.038   0.042
20   0.068   0.134
50   0.346   0.586
100  1.199   1.954
200  4.766   6.823
400 21.623  25.489

both are gtk1.  trunk gtk2 is a bit faster, but the scaling is the same.
Yeah, indeed.  Both of the things Simon saw in comment 6 are now better, but
editor still creates a number of nodes here that is proportinal to the string
length (lots of text nodes and lots of <br> nodes).  The testcase sets the value
N times, to strings of average length of O(N).  Since the editor does O(length
of string) word on each set, we get overall O(N^2) scaling.

Conclusion: we really need to fix bug 240933 if we're going to have a hope of
making Simon's testcase not scale as O(N^2).  That will also help with the fact
that R. Steven's testcase scales as O(N) in the number of carriage returns in
the original value.
Depends on: 240933
*** Bug 319988 has been marked as a duplicate of this bug. ***
Blocks: 399651
See also bug 83841, similar bug report for <input>.
Depends on: 511812
Assignee: layout.form-controls → nobody
QA Contact: tpreston → layout.form-controls
Is this still a problem.
On trunk FF running the test with parameter 400 takes about 0.26 second here.

bug 240933 is still open, but textarea.value+= should be reasonable fast
nowadays.
Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.2.4) Gecko/20100503 Firefox/3.6.4

Running test from comment #4, 400 iterations, Safe Mode:

Run:
1st - 36.661 s
2nd - 54.78 s
3rd - 75.642 s
Carlos, it really doesn't matter what the behavior of 3.6.x is.  What matters is the trunk behavior, in terms of answering the question asked in comment 21.
Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv:1.9.3a5pre) 
Gecko/20100504 Minefield/3.7a5pre
 
1000 Iterations ==>  Time 1.654 seconds
 200 Iterations ==>  Time 0.096 seconds


Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv:1.9.2.5pre) 
Gecko/20100508 Namoroka/3.6.5pre 

 200 Iterations ==>  Time 6.281 seconds

IE 7

 200 Iterations ==>  Time 0.385 seconds
 900 Iterations ==>  Time 7.725 seconds

Conclusion:- Definitely better... 

But when editing large text in TEXTAREA, 
it is slower than same text in Notepad.exe
> But when editing large text in TEXTAREA, it is slower than same text in
> Notepad.exe

Sure.  That's not this bug.
With bug 240933 fixed:

100 iter: 0.014
200 iter: 0.03
400 iter: 0.079
800 iter: 0.238
1000 iter: 0.354
2000 iter: 1.207

Shall we call this fixed?
So it's still quadratic but with a much much smaller constant, right?  I just checked Safari, and that's also quadratic, but the constant is about 15x less than us.  Ehsan, do you want to file a followup bug to profile things in the new world and cc me?  I'm happy to do it when I get back.
(In reply to comment #27)
> So it's still quadratic but with a much much smaller constant, right?  I just
> checked Safari, and that's also quadratic, but the constant is about 15x less
> than us.  Ehsan, do you want to file a followup bug to profile things in the
> new world and cc me?  I'm happy to do it when I get back.

Why file a follow-up bug, and not just use this one?
Fair.  I'll do some profiling.
Depends on: 596496
Depends on: 596497
Depends on: 596501
OK, so 60+% of the time was spent getting the value for constraint validation.  We've now filed bug 596496, bug 596497, bug 596501 for things that look like obvious issues there.

Reprofiling with that constraint validation check commented out, I see about 9% of our time spent in vm_fault in the kernel (this is on Mac).  Most of the rest of the time is under nsTextEditorState::SetValue.  If I focus on that, so that all following percentages are percentages of time under SetValue:

 9% nsTextControlFrame::SelectAllOrCollapseToEnd (removing all ranges from the
    selection, adding a range to the selection, creating new range objects,
    etc).
 4% ContentStatesChanged notification.  I wonder whether this would be faster
    if we had a content state cache so we knew that nothing has actually
    changed here.
 5% StringBeginsWith called from SetValue directly.
 4% nsAString::Equals called from SetValue directly (I assume this is to
    compute the boolean passed to the ValueSetter constructor, since doing that
    in fact involves walking the whole string every time).
 3% FindChar called from SetValue directly.  Maybe the StringBeginsWith
    optimization should come earlier and restrict the scan for \r to the string
    tail?
 2% nsCxPusher::PushNull
 2% assigning to strings
68% under nsPlaintextEditor::InsertText.

The time under editor, still as a fraction of SetValue time, is something like:

 5% SpellCheckAfterEditorChange
 8% nsTextEditRules::CollapseSelectionToTrailingBRIfNeeded

(both called from nsTextEditRules::AfterEdit),

 5% ContentStatesChanged(!).  We call nsHTMLTextAreaElement::OnValueChanged
    from both nsTextInputListener::EditAction _and_ from
    nsTextEditorState::SetValue directly.  That seems unfortunate.
 6% nsTypedSelection::Collapse called from InsertTextImpl.  selectFrames
    has these XPCOM creations of content iterators...
 3-4% in various transaction manager gunk
30% under nsGenericDOMDataNode::SetTextInternal.

This last is almost entirely calls to CharacterDataChanged and CharacterDataWillChange.... And a lot of the code for those is self code.  18% of total time is self time in those two functions.  The remainder seems to be nsRange::CharacterDataChanged (for the selection range), the non-virtul thunk to said nsRange function, the actual presshell notification, etc.

Looking into where the self time is in the nodeutils notification functions.
Looks like Shark just blames it all on IMPL_MUTATION_NOTIFICATION, which makes sense but is sad.

I'll try inlining it and seeing whether that tells me anything interesting.
Looks like all of that "self" time is on this instruction:

  mov rcx, r14

(this is a 64-bit build, obviously, and that's Intel asm, not AT&T).  Shark blames that line on the actual obs->CharacterDataChanged call (and in particular, the write to rcx is part of this assembly:

  mov  rax, qword ptr [rdi]
  mov  rcx, r14
  mov  rdx, r13
  mov  rsi, r12
  call qword ptr [rax+32]

I'm pretty sure that rdi has the observer pointer in it when we get to this code.  So this code is putting the vtable pointer into rax, then putting the three arguments (doc, aContent, aInfo) into rcx/rdx/rsi, then calling CharacterDataChanged.

Now what confuses me is why Shark is blaming that mov into rcx.  My best guess is that we actually end up with an L2 miss on [rdi] (looking up the vtable pointer of the observer), but the processor doesn't catch on until the following mov instruction.

).  The only place r14 is written to is that we save "rsi" into it on entry into nsNodeUtils::CharacterDataChanged.
So... I was profiling with 2000 iterations.  If I add a printf to dump out the length of the observer array in CharacterDataChanged, then I see things like:

ObserverCount: 3998
ObserverCount: 5
ObserverCount: 1
ObserverCount: 4000
ObserverCount: 5
ObserverCount: 5
ObserverCount: 5

So I stuck a breakpoint in ~nsRange, and it looks like we get a bunch of nsRange destructors of the event loop after we finish the testcase run.  The stack to those destructors is like so:

#0  nsRange::~nsRange (this=0x1035a6f00) at /Users/bzbarsky/mozilla/vanilla/mozilla/content/base/src/nsRange.cpp:226
#1  0x0000000100400c00 in nsRange::Release (this=0x1035a6f00) at /Users/bzbarsky/mozilla/vanilla/mozilla/content/base/src/nsRange.cpp:239
#2  0x0000000100c8b2f1 in ~mozInlineSpellStatus [inlined] () at /Users/bzbarsky/mozilla/vanilla/mozilla/extensions/spellcheck/src/mozInlineSpellChecker.cpp:499
#3  0x0000000100c8b2f1 in ~mozInlineSpellStatus [inlined] () at /Users/bzbarsky/mozilla/vanilla/mozilla/extensions/spellcheck/src/mozInlineSpellChecker.h:65
#4  0x0000000100c8b2f1 in mozInlineSpellResume::~mozInlineSpellResume (this=0x1208317b0) at /Users/bzbarsky/mozilla/vanilla/mozilla/extensions/spellcheck/src/mozInlineSpellChecker.cpp:499
#5  0x0000000100d5dfd9 in nsRunnable::Release (this=0x1208317b0) at nsThreadUtils.cpp:55

So every value set we try to start spellcheck, and every single spellcheck start attempt holds on to a separate range, all of which are observing the textarea.  I have no idea why we do all that, but can we just do all the spellcheck gunk async?  There's no reason to do it under the value setter, imo.
Oh, and Andreas, I cced you because the range destructor removes it from the observer list, and we really would much prefer eager finalization for things like that, because otherwise observer lists can get very long; this is in reference to our conversation earlier today.
Attached file textarea_speed.html (obsolete) —
Actually setting TEXTAREA is no longer slow than Safari.
It is the reading TEXTAREA that is slow.
See attachment I have modified attachment 112338 [details]
to provide multiple options.

Try it on Safari and Minefield with 1000 iteration.
You will see same speed when 
"Read/write Test instead of writeonly test" is UNCHECKED

but for Read/write we are very slow
That's not a useful test of setting performance.  It sets the value to the same thing over and over when the "read/write" checkbox is not checked; we detect that and bail out from the value setter early without doing any actual work.
Attached file textarea_speed.html
(In reply to comment #36)
> It sets the value to the same thing over and over

FIXED that...
Attachment #475766 - Attachment is obsolete: true
Yep.  And most of comment 35 no longer applies, right?
(In reply to comment #38)
> Yep.  And most of comment 35 no longer applies, right?

Correct, attachment 476164 [details] proved setting textarea value is still slow
(In reply to comment #30)
> Reprofiling with that constraint validation check commented out, I see about 9%
> of our time spent in vm_fault in the kernel (this is on Mac).  Most of the rest
> of the time is under nsTextEditorState::SetValue.  If I focus on that, so that
> all following percentages are percentages of time under SetValue:
> 
>  9% nsTextControlFrame::SelectAllOrCollapseToEnd (removing all ranges from the
>     selection, adding a range to the selection, creating new range objects,
>     etc).
>  4% ContentStatesChanged notification.  I wonder whether this would be faster
>     if we had a content state cache so we knew that nothing has actually
>     changed here.
>  5% StringBeginsWith called from SetValue directly.

I wonder if bug 518122 is now hurting us more than it's helping us...  This, together with FindChar, and some observer stuff, among others, are coming from that bug.

>  5% SpellCheckAfterEditorChange
>  8% nsTextEditRules::CollapseSelectionToTrailingBRIfNeeded
> 
> (both called from nsTextEditRules::AfterEdit),
> 
>  5% ContentStatesChanged(!).  We call nsHTMLTextAreaElement::OnValueChanged
>     from both nsTextInputListener::EditAction _and_ from
>     nsTextEditorState::SetValue directly.  That seems unfortunate.

Does that mean that we pay the cost of looping over document observers twice?

>  6% nsTypedSelection::Collapse called from InsertTextImpl.  selectFrames
>     has these XPCOM creations of content iterators...

I couldn't parse this one...

> This last is almost entirely calls to CharacterDataChanged and
> CharacterDataWillChange.... And a lot of the code for those is self code.  18%
> of total time is self time in those two functions.  The remainder seems to be
> nsRange::CharacterDataChanged (for the selection range), the non-virtul thunk
> to said nsRange function, the actual presshell notification, etc.

What are the observers that need to get notified here?  I assume one of them is the spell checker.  Are there others?
(In reply to comment #32)
> Now what confuses me is why Shark is blaming that mov into rcx.  My best guess
> is that we actually end up with an L2 miss on [rdi] (looking up the vtable
> pointer of the observer), but the processor doesn't catch on until the
> following mov instruction.

Which maps to "kill the observers, and we win!", right?  :-)
(In reply to comment #34)
> Oh, and Andreas, I cced you because the range destructor removes it from the
> observer list, and we really would much prefer eager finalization for things
> like that, because otherwise observer lists can get very long; this is in
> reference to our conversation earlier today.

I'm not really sure what this comment means.  I think what happens here is <http://mxr.mozilla.org/mozilla-central/source/content/base/public/nsINode.h#766>, right?  Which should remove the observer reagerly...
> I wonder if bug 518122 is now hurting us more than it's helping us...

Hard to say.  Worth testing....

> Does that mean that we pay the cost of looping over document observers twice?

The document ones, yes.  But there aren't many of those...  The primary cost here is good old "check whether this change affects our styles" stuff.  As I said, I have some thoughts on speeding it up.

> What are the observers that need to get notified here?

Ranges.  Thousands of ranges, all created by the spellchecker (2 ranges per value set operation, I think) and kept alive until the next time we hit the event loop.  See comment 33.

> Which maps to "kill the observers, and we win!", right?  :-)

Yep.  ;)

> I'm not really sure what this comment means.

That comment was about the fact that this situation is one that could be hurt by destructors being called more lazily (something that Andreas was considering in the context of moving from refcounting to GC for XPCOM objects).

> I think what happens here is

The whole point is that the RemoveMutationObserver call happens from ~nsRange.  So as long as something is keeping ranges alive (either the spellchecker in this testcase, or a GC system that hasn't run finalizers yet in a possible future XPCOM-with-GC) the ranges will be in the observer list.
(In reply to comment #33)
> So every value set we try to start spellcheck, and every single spellcheck
> start attempt holds on to a separate range, all of which are observing the
> textarea.  I have no idea why we do all that, but can we just do all the
> spellcheck gunk async?  There's no reason to do it under the value setter, imo.

We're doing the spell checking async, but I think the problem here is that every ScheduleSpellCheck call posts a single event <http://mxr.mozilla.org/mozilla-central/source/extensions/spellcheck/src/mozInlineSpellChecker.cpp#1175>, so if we do 1000 value sets on the textarea without an event loop, we're going to have to deal with 1000 spell check events.

My hunch is that the spell checker event should be an nsRevocableEventPtr which we revoke each time we want to post a new one, but I'm not sure if spell checker can handle that or not.
Depends on: 598823
(In reply to comment #43)
> > I wonder if bug 518122 is now hurting us more than it's helping us...
> 
> Hard to say.  Worth testing....

Filed bug 598823 for that.

> > Does that mean that we pay the cost of looping over document observers twice?
> 
> The document ones, yes.  But there aren't many of those...  The primary cost
> here is good old "check whether this change affects our styles" stuff.  As I
> said, I have some thoughts on speeding it up.

Is there a bug on that?

> > What are the observers that need to get notified here?
> 
> Ranges.  Thousands of ranges, all created by the spellchecker (2 ranges per
> value set operation, I think) and kept alive until the next time we hit the
> event loop.  See comment 33.

So, comment 44 should help there, right?

> > I'm not really sure what this comment means.
> 
> That comment was about the fact that this situation is one that could be hurt
> by destructors being called more lazily (something that Andreas was considering
> in the context of moving from refcounting to GC for XPCOM objects).

I see.

> > I think what happens here is
> 
> The whole point is that the RemoveMutationObserver call happens from ~nsRange. 
> So as long as something is keeping ranges alive (either the spellchecker in
> this testcase, or a GC system that hasn't run finalizers yet in a possible
> future XPCOM-with-GC) the ranges will be in the observer list.

But we don't have any other way to mark a range as "no longer needed", do we?
> but I think the problem here is that every ScheduleSpellCheck call posts a
> single event

Yes, exactly.  And each event has its own range, etc.

Revoking the event won't help, since the event queue holds a strong ref to the event, right?  Unless the event revoke will drop the ranges?

A more interesting question: once we have one event up, what happens if we just don't post any more?
(In reply to comment #46)
> > but I think the problem here is that every ScheduleSpellCheck call posts a
> > single event
> 
> Yes, exactly.  And each event has its own range, etc.
> 
> Revoking the event won't help, since the event queue holds a strong ref to the
> event, right?  Unless the event revoke will drop the ranges?

What's holding on to these ranges is mozInlineSpellStatus, which is owned by mozInlineSpellResume.  If we use a revocable event, mozInlineSpellResume can release everything held by mozInlineSpellStatus (by calling a fictional mozInlineSpellStatus::CleanUp method which should be very easy to implement), which should mean that we release the ranges in time.  Or am I missing something?

> A more interesting question: once we have one event up, what happens if we just
> don't post any more?

Then we'll be spell checking using the wrong range, which would mean that if you do:

textarea.value = "My name is ";
textarea.value += "ehsan";

Then "ehsan" wouldn't be marked as a mis-spelling (well, it's not, but that's another issue!).  Of course, I haven't tested this, but I suspect that's what happens, given what I remember from the last time that I had to read this code, which was about a week ago I think.  There may be better ways to fix this, but I honestly don't know the spellchecker code enough to judge.
> Or am I missing something?

So the point is that revoking the event would call some method on mozInlineSpellResume?  That should work, yes.

> Then we'll be spell checking using the wrong range

Can we simply modify the existing range to include the new stuff?  Ranges don't seem to have a way to "union" them, but we could create one... for the particular case when both endpoint parents are the same textnode and the same across both ranges (which is where we are here), it's dead easy to union.  So we could just add some special-case code in spellcheck to do that, else post a separate event.
> Is there a bug on that?

Filed bug 598833.

> So, comment 44 should help there, right?

Yes, if we drop the ranges...

> But we don't have any other way to mark a range as "no longer needed", do we?

We actually do.  We can call Detach() on it.
Why we are triggering spellcheck when we set textarea.value programmatically.
I feel we should only trigger when there is user input.
Additionally we could give a method Node.doSpellCheck() and/or HTMLRange.doSpellCheck() for programmers to call spellcheck programmatically.

Also spellcheck should run only when JavaScript is not running
(In reply to comment #50)
> Why we are triggering spellcheck when we set textarea.value programmatically[?]

A good question. Anyone think that we *should* spell-check for value sets from JS?

/be
We spell-check default values (and the UI folks support that, though I've argued against it).  Also, a bunch of places do stuff like catching key events for a textarea, canceling them, then modifying the value themselves to "account for the key".  We'd want spell-checking to work there.

Again, spellchecking per se is not the problem here.  The dumb implementation is.
(In reply to comment #48)
> > Or am I missing something?
> 
> So the point is that revoking the event would call some method on
> mozInlineSpellResume?  That should work, yes.

Yes, that's the idea.

> > Then we'll be spell checking using the wrong range
> 
> Can we simply modify the existing range to include the new stuff?  Ranges don't
> seem to have a way to "union" them, but we could create one... for the
> particular case when both endpoint parents are the same textnode and the same
> across both ranges (which is where we are here), it's dead easy to union.  So
> we could just add some special-case code in spellcheck to do that, else post a
> separate event.

What would the benefit of this be over just revoking the old event and posting a new one?
Depends on: 599074
Filed bug 599074 on trying to use revocable spell check events.
> What would the benefit of this be over just revoking the old event and posting
> a new one?

Ideally, not having to create 2000 range objects (as well as 2000 event objects).  The creation costs show up nicely in the profile too.
Depends on: 602151
Blocks: 667412
Blocks: 561578
Depends on: 682052
Blocks: 1285982
Depends on: 1300293

Firefox: 8450 ms to add 100 textareas, 367 characters each
Chrome: 140 ms seconds

<body>
<p id=p>...</p>
<script>
  const props = {
    value: document.body.textContent.trim(),
    spellcheck: false,
  };
  const makeText = () => Object.assign(document.createElement('textarea'), props);
  const t0 = performance.now();
  document.body.append(...Array(100).fill(0).map(makeText));
  setTimeout(() => {
    p.textContent = (performance.now() - t0 | 0) + ' ms';
  });
</script>
</body>

Typo above: "ms seconds" should be "ms"

On my machine both Nightly and Chrome take ~100ms.

I've tried in Windows 10.

On my machine both Nightly and Chrome take ~100ms.

Linux, I guess, as it's also fast for me in FF.

With both iterations and text lines 1000, Firefox takes about 150 milliseconds while Chrome takes about 40 milliseconds. Quite a difference but I guess it's negligible in common cases.

Kagami, do you think you could capture a performance profile for this?

Flags: needinfo?(krosylight)

Tried a quick profiling with both values 10,000: https://perfht.ml/2yRwDv6

Flags: needinfo?(krosylight)

FWIW using the test case in comment #c56 recorded a profile at https://perfht.ml/2zutPnI which shows that most of those ~9 seconds is spent in nsNativeThemeWin::ThemeSupportsWidget because I have WindowBlinds (even though Firefox.exe is excluded from styling). Evidently, mine is quite an edge case and you can disregard it, but since Chrome and other apps (even styled ones) are unaffected, maybe you could simply cache the theme info at least for the duration of the layout/rendering op...

which shows that most of those ~9 seconds is spent in nsNativeThemeWin::ThemeSupportsWidget

The profile says it took only 1.76% (88ms) of the whole running time, no?

The profile says it took only 1.76% (88ms) of the whole running time, no?

No, AFAICT: if you switch to the flame graph you will see a more obvious rendition of its usage.

I'm looking at that graph and it never exceeds 2% of the running time.

I'm looking at that graph and it never exceeds 2% of the running time.

Sounds like you're looking at a single node in the flame graph or at the pure numbers in call stack mode, which seem to be omitting the overhead of loading and calling an extra dll. If you look at the whole graph you will see the gray areas (dll loader) are occupying most of each pink (native Firefox) block's length. Naturally, none of this is happening in an OS without WindowBlinds.

Ah, now I see, sorry!

I tried the same repro from #c56 on my machine and definitely can see the difference: https://perfht.ml/3aJQucw

Based on the profiler result, a workaround for test case in https://bugzilla.mozilla.org/show_bug.cgi?id=190147#c56 is to set widget.disable-native-theme-for-content to true in about:config. This will prevent Firefox from querying WindowBlinds dll and the test case will take just ~100ms to complete on my machine, same as Chrome.

The code seems that it should be cached, though: https://searchfox.org/mozilla-central/rev/158bac3df3a1890da55bdb6ffdaf9a7ffc0bfb0a/widget/windows/nsUXThemeData.cpp#55

I guess somehow OpenThemeData returns null every time in your case and thus saves no cache.

to set widget.disable-native-theme-for-content to true in about:config.

Doing so will call nsNativeBasicTheme::ThemeSupportsWidget instead of nsNativeThemeWin::ThemeSupportsWidget (by EnsureTheme() which reads that option). Thanks for the info.

Tested comment 56 today's nightly on linux, and now Nightly is constantly faster than Chrome. 60 vs 100ms or so
If I change that the count in test from 100 to 10000, Nightly is ~3500, chrome ~8500.
This is on Linux.
We should aim for similar on Windows :)

Component: Layout: Form Controls → Widget: Win32

I guess the Windows theme thing is a separate issue as it's still slower on Windows without it. :woxxom, do you mind opening a new issue?

Flags: needinfo?(woxxom)
Flags: needinfo?(woxxom)

ok, and the number I get are different to what Kagami sees because we've been running different test cases. I've been using the one from comment 56 and Kagami textarea_speed.html

No longer depends on: 1633718
See Also: → 1633718

Changing component since this has been narrowed down to the spellchecker.

Component: Widget: Win32 → Spelling checker

Hi Kagami, is there still something to investigate under windows here?

IIUC, the original scope "very slow" of this bug has been addressed a long time ago and the special case from bug 1633718 is fixed, too. To me it seems, that if there is still something to do we should have a new bug for it.

Flags: needinfo?(krosylight)

With both iterations and text lines 1000, Firefox takes about 150 milliseconds while Chrome takes about 40 milliseconds. Quite a difference but I guess it's negligible in common cases.

It's better now, Firefox takes only 2x times more than Chrome does. Still significant, but I have no proof that this can affect actual use cases. I'm closing this, but should I file another one for that perf difference?

Status: NEW → RESOLVED
Closed: 2 years ago
Flags: needinfo?(krosylight)
Resolution: --- → WORKSFORME

I assume we have enough performance monitoring in place, the difference seems not to bad.

Testing textarea_speed.html:
Locally I see Firefox being faster than Chrome when using 10000 iterations and 100 lines. 19ms vs 43ms.
And 1000 iterations with 1000 lines they have the same performance. (16-18ms).
That is on Linux.

On Windows 11, 10000 iteration, 100 lines, Firefox 14ms, Edge 24ms
1000/1000, Firefox 8ms, Edge 8ms

Kagami, I'm still puzzled how you get so bad numbers on Windows.
Could you perhaps create a new performance profile?

Flags: needinfo?(krosylight)

testare_speed is okay, I mean comment #56.

Flags: needinfo?(krosylight)
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: