Closed Bug 496360 Opened 11 years ago Closed 5 years ago

[TSF] Severe performance problems pasting into contenteditable editor or designMode editor when TSF is enabled

Categories

(Core :: Editor, defect, P2)

x86
Windows XP
defect

Tracking

()

VERIFIED FIXED
mozilla34
Tracking Status
status1.9.2 --- wontfix

People

(Reporter: chado_moz, Assigned: masayuki)

References

()

Details

(Keywords: inputmethod, perf, regression)

Attachments

(8 files, 18 obsolete files)

701 bytes, patch
roc
: review+
Details | Diff | Splinter Review
1.60 KB, text/html
Details
8.25 KB, patch
Details | Diff | Splinter Review
67.17 KB, patch
Details | Diff | Splinter Review
8.76 KB, patch
Details | Diff | Splinter Review
7.86 KB, patch
smaug
: review+
Details | Diff | Splinter Review
29.51 KB, patch
Details | Diff | Splinter Review
43.87 KB, patch
smaug
: review+
Details | Diff | Splinter Review
User-Agent:       Opera/9.64 (Windows NT 5.1; U; ja) Presto/2.1.1
Build Identifier: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.2a1pre) Gecko/20090604 Minefield/3.6a1pre  buildID: 20090604042555

pasting large text into the textarea, it takes longer time until 
the pasted text get visible.


Reproducible: Always

Steps to Reproduce:
1. enable JavaScript.
2. open the testcase.
3. click any of links above making sample text lines.
   500 lines should be better at first.
4. make sure that the textarea has focus.
5. ctrl + A to select text all.
6. ctrl + X to hold all text in clipboad.
   also, this makes textarea clear.
7. ctrl + V to paste holding text, and
   check the time for pasted text appears in the textarea.

Actual Results:  
takes 7 sec for 500 lines, and 60 sec for 1,000 lines 
on my WinXP machine.


Expected Results:  
pasted text appears in a moment.
non-reproducible builds take just a few seconds for 10,000 lines.


repro/non-repro builds I tried:
fine: Firefox/3.0.10      buildID: 2009042316
fine: Firefox/3.5b4       buildID: 20090423204732
fine: Shiretoko/3.5pre    buildID: 20090604045218 
fine: SeaMonkey/2.0b1pre  buildID: 20090531001407
fine: Minefield/3.2a1pre  buildID: 20090301035004 
     (snip)
fine: Minefield/3.6a1pre  buildID: 20090504044532 
slow: Minefield/3.6a1pre  buildID: 20090505041741 
     (snip)
slow: Minefield/3.6a1pre  buildID: 20090604042555 

on Linux (Ubuntu 9.04 on VM Guest), looks working fine.
no Mac around of me.
Keywords: regression
Attached file testcase (obsolete) —
Keywords: perf
Version: unspecified → Trunk
Duplicate of this bug: 496555
It happens only if there are a lot of lines, if you paste the same amount of characters but in 1 line it works fine (or at least better)
Confirming the regression range given in comment #0 using the attached testcase.

http://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=af694bc8771e&tochange=2a82ba52ed30

Arpad, is it possible that the patch for bug 474369 caused this?
Status: UNCONFIRMED → NEW
Component: General → Editor
Ever confirmed: true
Product: Firefox → Core
QA Contact: general → editor
Looking through my patch again, I can't find anything that could have caused a slowdown.
As this seems to only happen on windows, and the testcase has lang="ja" in it, I believe Masayukis IME patch could have something to do with this.
Regression range is as follows by my investigation.(Pen4 1.2GHz,1GB RAM) 
I measure handling of put_sample_lines time() of above testcase.
The investigation was carried out in a new profile.

Fine:
1561 msec/5000lines
http://hg.mozilla.org/mozilla-central/rev/68cfe7fb9f31
Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.2a1pre) Gecko/20090413 Minefield/3.6a1pre ID:20090413045916

Worse:
4884 msec/5000lines
http://hg.mozilla.org/mozilla-central/rev/68d9acc70491
Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.2a1pre) Gecko/20090414 Minefield/3.6a1pre ID:20090414100052

Pushlog:
http://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=68cfe7fb9f31&tochange=68d9acc70491
(In reply to comment #5)
> I believe Masayukis IME patch could have something to do with this.

My patch of bug 460059 added an automated test for IME. It should not be the cause.
Please forget comment #6
I am sorry that I disturb.
(In reply to comment #7)
> (In reply to comment #5)
> > I believe Masayukis IME patch could have something to do with this.
> 
> My patch of bug 460059 added an automated test for IME. It should not be the
> cause.

Ah, I forgot a thing.

>      3.1 --- a/widget/src/windows/nsWindow.cpp	Tue May 05 01:01:47 2009 -0400
>      3.2 +++ b/widget/src/windows/nsWindow.cpp	Tue May 05 15:15:23 2009 +0900
>      3.3 @@ -6917,7 +6917,10 @@ NS_IMETHODIMP
>      3.4  NS_IMETHODIMP
>      3.5  nsWindow::OnIMEFocusChange(PRBool aFocus)
>      3.6  {
>      3.7 -  return nsTextStore::OnFocusChange(aFocus, this, mIMEEnabled);
>      3.8 +  nsresult rv = nsTextStore::OnFocusChange(aFocus, this, mIMEEnabled);
>      3.9 +  if (rv == NS_ERROR_NOT_AVAILABLE)
>     3.10 +    rv = NS_OK; // TSF is not enabled, maybe.
>     3.11 +  return rv;
>     3.12  }

I fixed a logistic mistake of bug 88831. The nsIWidget::OnIMEFocusChange should return only NS_OK or NS_ERROR_NOT_IMPLEMENTED, I think. By my patch, it always returns NS_OK on Windows now. If this bug can be reproduced on 20090504044532 with TSF enabled mode ("intl.enable_tsf_support" is true), this is a regression by my patch.

However, if so, we need to find better implementation of nsTextStateManager. Because the behavior is needed by TSF mode.
Looks like that nsTextEditRules::WillInsertText edits many times per one paste command if that has two or more lines. If every inserting BR node and every inserting text of a line notify the text data change, nsTextStateManager generates "flat text content" very many times. It can be the cause of this regression, probably.
Duplicate of this bug: 496823
I don't think the regression range in comment 4 is correct. I'm finding the regression to be in the range mentioned in comment 6.
It is totally another bug.(In reply to comment #11)
> *** Bug 496823 has been marked as a duplicate of this bug. ***

The regression range is  different.
It is totally another bug.
hg bisect points to 11714b1d9aa4 (bug 487449) being the cause. This is on Windows XP, by the way.
Blocks: 487449
Flags: blocking1.9.2?
So... comment 6 and later have a different regression range from comment 4, and also different steps to reproduce, no?  Comment 4 (and comment 0) are talking about the time needed to paste, while comment 6 and on are talking about the time it takes to create the sample text, no?

I can't reproduce the bug as originally filed.  Can anyone else?
Using the testcase it takes 6 seconds to paste 500 lines and 45 seconds to paste 1000 lines. So I see what the reporter sees.
OK.  So I've reopened bug 496823, which was NOT a duplicate of this bug and covers comment 6 and comment 14.

Back to _this_ bug, Brian, thank you for testing!

How are you timing the paste?  Just wall-clock, given the times?  Are you on Windows as well?

What I see on Mac is that 500/1000/2000 lines all paste in <1sec.  5000 lines takes maybe close to a second to paste...  So the bug is certainly not reproducible here.

If this is Windows-only, that sounds like a regression from bug 460059 as described in comment 9.  Can someone with a Windows nightly from 2009-05-04 or before try setting the pref as described in that comment and seeing how things behave?
Blocks: 460059, 88831
No longer blocks: 487449
Brian, I'd also appreciate if you could tell me what times look like for you with 100 lines, and maybe with 750?  Just trying to get a feel for how slow this IME algorithm, or whatever is causing the problem, actually is.  It looks significantly slower than O(N^2) to me...
Summary: to paste large text into the textarea takes long time → Severe performance problems pasting into textareas
I'm on Windows XP.  I'm estimating the times from the clock.  

100 lines is instantaneous.
2000 was about 6 minutes.
I copied then pasted the 100 lines 6 six times to get 700 lines and cut and pasted that which took 16 seconds
I can also confirm this...

I tried copy Word document (about 4MB) to http://translate.google.com/translate_t?hl=en#auto|en|
and Fx hangs...
when I tried with smaller document (about 1MB) it takes ages to paste it (more then 5min)...
but you know I have C2D E6550 2,33GHz@3,22GHz with 2GB RAM on XP SP3...

paste in Opera and Chromium was in instant...
I have an idea for this. I'll post the patch ASAP.
Assignee: nobody → masayuki
Status: NEW → ASSIGNED
Attached file testcase2 (obsolete) —
Here is a little more flexible testcase, semi-auto-measurement 
timer included.  

Time result with this testcase on MY WinXP machine is following.

Lines   Time  T/100Lines
  100    0.1  0.1
  200    0.3  0.2
  300    1.0  0.3
  400    2.2  0.55
  500    4.2  0.84
  600    7.2  1.20
  700   11.3  1.61
  800   16.8  2.10
  900   23.9  2.66
1,000   32.8  3.28
1,200   56.5  4.71
1,500  109.3  7.29
1,600  134.7  8.42

rv:1.9.2a2pre) Gecko/20090808 Minefield/3.6a2pre ID:20090808044747
Attachment #381547 - Attachment is obsolete: true
Flags: blocking1.9.2? → blocking1.9.2+
Priority: -- → P2
fuck, wrong copy pasta ;p

I want to add that this can be problem that Fx want to paste all text with style like font name, size etc
because I see often on forum that Fx use WYSIWYG editor, and Opera or Chromium not using this

maybe this will be some hint
Don't worry, if I cannot get enough time for fixing this bug, I'll backout the patch only from nsWindow.
I planed that I disables the nsIMEStateManager stores the mutation events and combine them, but it's too difficult and complex. Now, I suggest that we should disable nsTextStateManager's observing.

I think I need to change the nsEditor and related code. It's risky for the branch and I don't have enough time for it.
Attachment #404474 - Flags: review?(roc)
(In reply to comment #26)
> I planed that I disables the nsIMEStateManager stores the mutation events and
> combine them, but it's too difficult and complex. Now, I suggest that we should
> disable nsTextStateManager's observing.

I planned that I change nsIMEStateManger to store the mutation events an to combine them, ...

Sorry for the spam.
http://hg.mozilla.org/mozilla-central/rev/61376f9af6aa
http://hg.mozilla.org/releases/mozilla-1.9.2/rev/b4a5561babc8

# Don't close this bug until I fix the actual cause.
Summary: Severe performance problems pasting into textareas → Severe performance problems pasting into textareas when TSF is enabled
Summary: Severe performance problems pasting into textareas when TSF is enabled → [TSF] Severe performance problems pasting into textareas when TSF is enabled
Whiteboard: Fixed on 1.9.2 branch because TSF isn't enabled on it
Whiteboard: Fixed on 1.9.2 branch because TSF isn't enabled on it → WONTFIX on 1.9.2 branch because TSF isn't enabled on it
This shouldn't block 1.9.2, since TSF is not enabled on 1.9.2.
Flags: blocking1.9.2+
no, see comment 29. TSF isn't going to be enabled on 1.9.3 in default settings.
What about 2.0?
This is still existing while TSF enabled.
(intl.enable_tsf_support; true)

Mozilla/5.0 (Windows NT 5.1; rv:2.0b6pre) Gecko/20100912 Firefox/4.0b6pre ID:20100912041924
Masayuki, is it possible by any chance that part of this performance problem is coming from the fact that we inject gazillions of BR nodes when pasting text inside textareas?  Just asking since I plan to land bug 240933 today, which would mean that the BR issue will be history...
That might fix this bug if text nodes are not separated by each line.
I guess this bug can be reproduced on HTML editor too...
(In reply to comment #35)
> That might fix this bug if text nodes are not separated by each line.

So can someone test things with the latest nightly?
(In reply to comment #36)
> I guess this bug can be reproduced on HTML editor too...

Yes, but there's little which can be done there...
(In reply to comment #37)
> So can someone test things with the latest nightly?

As long as I use test case, paste is faster than previous build.  (when I paste 5,000 lines, it spend 3 sec only!)

But, when I select large text lines to copy it, CPU spikes 100%.  This root cause is xul!nsTextStore::GetText.  Since nsTextStore::GetText() always is called from MSCTF, it occurs another performance problem.
(In reply to comment #38)
> (In reply to comment #36)
> > I guess this bug can be reproduced on HTML editor too...
> 
> Yes, but there's little which can be done there...

We don't support box paste, so, I guess that we can cache a computed range of deleting selection at start of the action, and nsTextStateManager should use it and append the length of inserted text during the action. Finally, we should notify a widget the text change only once.

(In reply to comment #39)
> But, when I select large text lines to copy it, CPU spikes 100%.  This root
> cause is xul!nsTextStore::GetText.  Since nsTextStore::GetText() always is
> called from MSCTF, it occurs another performance problem.

Absolutely, it's another bug. I guess that it's a pure performance problem of nsContentEventHandler because nsTextStore::GetText() uses query event only once.
# If the cause is that nsTextStore::GetText() is called many times a coping, we should cache the result and we can fix it.
Depends on: 596507
(In reply to comment #40)

> Absolutely, it's another bug. I guess that it's a pure performance problem of
> nsContentEventHandler because nsTextStore::GetText() uses query event only
> once.
> # If the cause is that nsTextStore::GetText() is called many times a coping, we
> should cache the result and we can fix it.

I filed as Bug 596507.
Well, I checked times through serveral conditions on:
Mozilla/5.0 (Windows NT 5.1; rv:2.0b7pre) Gecko/20100916 Firefox/4.0b7pre ID:20100916041016
 Built from http://hg.mozilla.org/mozilla-central/rev/f38ef1080bfe

Pasting text:
0.1 sec for  1,000 lines (0.1 sec while TSF support disabled)
0.6 sec for  2,000 lines (0.1 sec while TSF support disabled)
3.0 sec for  5,000 lines (0.1 sec while TSF support disabled)
10  sec for 10,000 lines (0.3 sec while TSF support disabled)

I'm not sure if those speed improvements are to be satisfied.
Summary: [TSF] Severe performance problems pasting into textareas when TSF is enabled → [TSF] Severe performance problems pasting into contenteditable editor or designMode editor when TSF is enabled
Whiteboard: WONTFIX on 1.9.2 branch because TSF isn't enabled on it
No longer blocks: 544769
This testcase measures the time to paste and extra work with TSF.

On my environment (i7 3930K, Win8.1, ATOK 2014), it needs about 35 sec for pasting 1000 lines.

Be careful if you test with larger number or slower environment.
Attachment #393379 - Attachment is obsolete: true
Whiteboard: [leave open]
On HTML editor, removing multiply causes a lot of mutation notification because each line is a text node and each line break is a br element.

So, we should notify IME of all changes (text change, selection change and layout change) once per an edit action.

part.1 - 3 may not affect to performance because it depends on what IME does at receiving the notification.

This patch is the most complicated. This makes multiple TextChangeData mergeable. This behavior cannot be tested by automated tests, this patch includes a text mehtod |TestMergingTextChangeData()| which runs at first time editor to get focus only on debug build.

If there is a bug of StoreTextChangeData(), the tests can detect it and make orange a test which sets focus to an editor at first time.
Attachment #8459100 - Flags: review?(bugs)
This allows IMEContentObserver know when focused editor starts to handling an edit action and ends handling it.

This patch might break some addons which implement nsIEditorObserver.
https://mxr.mozilla.org/addons/search?string=nsIEditorObserver

However, I have no better idea.
Attachment #8459102 - Flags: superreview?(bugs)
Attachment #8459102 - Flags: review?(ehsan)
Attachment #8459102 - Flags: review?(bugs)
With this patch, IMEContentObserver never notifies IME of text change, selection change and layout change two or more times per an editor action.
Attachment #8459103 - Flags: review?(bugs)
If these patches are granted, we should optimize each mutation observer methods only while it handles an edit action. I guess that it's the most expensive path of this performance problem.
FYI: These changes must improve the performance of e10s mode because notifying IME needs to cross process boundary. These patches reduce count of it per edit action.
Comment on attachment 8459102 [details] [diff] [review]
part.2 IMEContentObserver shouldn't dispatch TextChangeEvent while editor handles an edit action

Review of attachment 8459102 [details] [diff] [review]:
-----------------------------------------------------------------

::: editor/libeditor/base/nsEditor.h
@@ +179,5 @@
>    already_AddRefed<nsIPresShell> GetPresShell();
>    already_AddRefed<nsIWidget> GetWidget();
> +  enum NotificationForEditorObservers
> +  {
> +    NOTIFY_EDITOR_OBSERVERS_OF_END,

Nit: please use eCamelCase for enum values.
Attachment #8459102 - Flags: review?(ehsan) → review+
Comment on attachment 8459102 [details] [diff] [review]
part.2 IMEContentObserver shouldn't dispatch TextChangeEvent while editor handles an edit action

>   /** 
>    * Called after the editor completes a user action.
>    */
>   void EditAction();
>+  /**
>+   * Called when editor starts to handle a user action.
>+   */
>+  void BeginEditAction();
Please add a comment which explains whether this is called during or right before EditAction()
BeforeEditAction() might be a better name.




>+  /**
>+   * Called after BeginEditAction() is called but won't EditorAction() be
>+   * called.
"but EditorAction() won't be called."
Attachment #8459102 - Flags: superreview?(bugs) → superreview+
Comment on attachment 8459103 [details] [diff] [review]
part.3 IMEContentObserver shouldn't dispatch SelectionChangeEvent nor PositionChangeEvent while editor handles an edit action


> IMEContentObserver::IMEContentObserver()
>   : mESM(nullptr)
>   , mIsEditorInTransaction(false)
>+  , mIsSelectionChangeEventPending(false)
>+  , mSelectionChangeCausedByOnlyComposition(false)

...CausedOnlyByComposition

>   int32_t count = 0;
>   nsresult rv = aSelection->GetRangeCount(&count);
>   NS_ENSURE_SUCCESS(rv, rv);
>   if (count > 0 && mWidget) {
>-    nsContentUtils::AddScriptRunner(
>-      new SelectionChangeEvent(this, causedByComposition));
>+    if (mIsSelectionChangeEventPending) {
>+      mSelectionChangeCausedByOnlyComposition =
>+        mSelectionChangeCausedByOnlyComposition && causedByComposition;
>+    } else {
>+      mSelectionChangeCausedByOnlyComposition = causedByComposition;
>+    }

>+  mIsPositionChangeEventPending = true;
>+  FlushMergeableNotifications();
> }
> 
> NS_IMETHODIMP
> IMEContentObserver::Reflow(DOMHighResTimeStamp aStart,
>                            DOMHighResTimeStamp aEnd)
> {
>-  if (mWidget) {
>-    nsContentUtils::AddScriptRunner(new PositionChangeEvent(this));
>-  }
>+  mIsPositionChangeEventPending = true;
>+  FlushMergeableNotifications();
>   return NS_OK;
> }
> 
> NS_IMETHODIMP
> IMEContentObserver::ReflowInterruptible(DOMHighResTimeStamp aStart,
>                                         DOMHighResTimeStamp aEnd)
> {
>-  if (mWidget) {
>-    nsContentUtils::AddScriptRunner(new PositionChangeEvent(this));
>-  }
>+  mIsPositionChangeEventPending = true;
>+  FlushMergeableNotifications();
There really should be some helper method to set
mIsPositionChangeEventPending to true and call 
FlushMergeableNotifications.

Maybe FlushMergeableNotifications could take a param, and if true,
mIsPositionChangeEventPending is set to true.
Attachment #8459103 - Flags: review?(bugs) → review+
Attachment #8459102 - Flags: review?(bugs) → review+
(In reply to Masayuki Nakano (:masayuki) (Mozilla Japan) from comment #47)
> I guess that it's the most
> expensive path of this performance problem.
Don't guess, profile :)

Gecko profiler is pretty good when used with Nightly, though I don't know whether it works with e10s.
On Linux I still prefer Zoom.
Comment on attachment 8459100 [details] [diff] [review]
part.1 Make TextChangeEvent of IMEContentObserver mergeable

>   TextChangeEvent(IMEContentObserver* aDispatcher,
>-                  uint32_t aStart, uint32_t aOldEnd, uint32_t aNewEnd,
>-                  bool aCausedByComposition)
>+                  const IMEContentObserver::TextChangeData& aData)
>     : mDispatcher(aDispatcher)
>-    , mStart(aStart)
>-    , mOldEnd(aOldEnd)
>-    , mNewEnd(aNewEnd)
>-    , mCausedByComposition(aCausedByComposition)
>+    , mData(aData)
>   {
>     MOZ_ASSERT(mDispatcher);
>+    MOZ_ASSERT(mData.mStored);
So we rely on the default copy ctor. I guess that is fine since we have this assert.


> IMEContentObserver::ContentRemoved(nsIDocument* aDocument,
>                                    nsIContent* aContainer,
>                                    nsIContent* aChild,
>                                    int32_t aIndexInContainer,
>                                    nsIContent* aPreviousSibling)
> {
>   bool causedByComposition = IsEditorHandlingEventForComposition();
>-  if (causedByComposition &&
>+  if (!mTextChangeData.mStored && causedByComposition &&
>       !mUpdatePreference.WantChangesCausedByComposition()) {
>     return;
>   }
So what if we have mTextChangeData.mStored ongoing and we remove the text node for which
CharacterDataChanged was called?


I think I need some more explanation about this patch. This is complicated
(but I can see this kind of approach could help with performance quite significantly).
Attachment #8459100 - Flags: review?(bugs)
(In reply to Olli Pettay [:smaug] from comment #52)
> (In reply to Masayuki Nakano (:masayuki) (Mozilla Japan) from comment #47)
> > I guess that it's the most
> > expensive path of this performance problem.
> Don't guess, profile :)

Yeah, I'm trying to do it.

(In reply to Olli Pettay [:smaug] from comment #53)
> > IMEContentObserver::ContentRemoved(nsIDocument* aDocument,
> >                                    nsIContent* aContainer,
> >                                    nsIContent* aChild,
> >                                    int32_t aIndexInContainer,
> >                                    nsIContent* aPreviousSibling)
> > {
> >   bool causedByComposition = IsEditorHandlingEventForComposition();
> >-  if (causedByComposition &&
> >+  if (!mTextChangeData.mStored && causedByComposition &&
> >       !mUpdatePreference.WantChangesCausedByComposition()) {
> >     return;
> >   }
> So what if we have mTextChangeData.mStored ongoing and we remove the text
> node for which
> CharacterDataChanged was called?

If there is stored text change data (i.e., mTextChangeData.mStroed is true) and removing text or composition string from JS or something, IME should receive text data change notification which includes changes not caused by composition. Therefore, this check is necessary.

> I think I need some more explanation about this patch.

Hmm, isn't enough the comments in the patch? If not so, what are your questions?
Flags: needinfo?(bugs)
Did you assume like MaybeNotifyIMEOf*Change() methods in this patch?
Attachment #8459103 - Attachment is obsolete: true
Attachment #8460791 - Flags: review?(bugs)
Attachment #8460793 - Flags: review?(bugs) → review+
Flags: needinfo?(bugs)
Trying to still understand the details of part 1
Attachment #8459100 - Flags: review?(bugs)
Comment on attachment 8459100 [details] [diff] [review]
part.1 Make TextChangeEvent of IMEContentObserver mergeable



>+  struct TextChangeData
>+  {
>+    uint32_t mStart;
>+    uint32_t mOldEnd;
>+    uint32_t mNewEnd;
>+    bool mCausedByOnlyComposition;
>+    bool mStored;

So it is not too clear what offsets we're storing here.
What is mOldEnd and what is mNewEnd, and why there isn't
mOldStart and mNewStart?
If that was explained, understanding IMEContentObserver::StoreTextChangeData
would become easier.

Bug 544779 didn't really add any comments what the offsets in TextChangeEvent are.


So could you add some comments.
Attachment #8459100 - Flags: review?(bugs) → review-
(In reply to Olli Pettay [:smaug] from comment #59)
> Comment on attachment 8459100 [details] [diff] [review]
> part.1 Make TextChangeEvent of IMEContentObserver mergeable
> 
> 
> 
> >+  struct TextChangeData
> >+  {
> >+    uint32_t mStart;
> >+    uint32_t mOldEnd;
> >+    uint32_t mNewEnd;
> >+    bool mCausedByOnlyComposition;
> >+    bool mStored;
> 
> So it is not too clear what offsets we're storing here.
> What is mOldEnd and what is mNewEnd, and why there isn't
> mOldStart and mNewStart?
> If that was explained, understanding IMEContentObserver::StoreTextChangeData
> would become easier.
> 
> Bug 544779 didn't really add any comments what the offsets in
> TextChangeEvent are.
> 
> 
> So could you add some comments.

The text change data was designed for TSF support. We notify TSF of text change with ITextStoreACPSink::OnTextChange() which takes TS_TEXTCHANGE structure.
http://msdn.microsoft.com/en-us/library/windows/desktop/ms538412%28v=vs.85%29.aspx
http://msdn.microsoft.com/en-us/library/windows/desktop/ms629221%28v=vs.85%29.aspx

Therefore, the struct has only the members.

Anyway, oldStart and newStart sound strange. A text input should cause deleting text first if there is a selected range. Then, insert text at same insertion point. So, start offset should not need both new and old.

So,

// mStart is start offset of modified or removed text in original content.
uint32_t mStart;
// mOldEnd is the end offset of modified or removed text in original content.
uint32_t mOldEnd;
// mNewEnd is the end offset of inserted text or same as mStart if just removed.
uint32_t mNewEnd;

Like these comments are enough?
Flags: needinfo?(bugs)
is the start offset of the modified... etc

But ok, those help, especially "same as mStart if just removed"
Flags: needinfo?(bugs)
If two ranges cannot merge easy, e.g., two really different ranges are modified by JS or something while an edit action, the patch computes the most smaller modified offset as mStart, the most largest modified offset in old contents as mOldEnd and the largest modified offset in new contents as mNewEnd.
Comment on attachment 8460790 [details] [diff] [review]
part.2 IMEContentObserver shouldn't dispatch TextChangeEvent while editor handles an edit action (r=ehsan+smaug, sr=smaug)

I wonder, when IMEContentObserver::Unlink() is called, should it call mEditor->RemoveEditorObserver()?

I don't know what Traverse() and Unlink() should do, though.
Attachment #8460790 - Flags: feedback?(bugs)
Adding comments into TextChangeData definition and StoreTextChangeData() implementation.
Attachment #8459100 - Attachment is obsolete: true
Attachment #8461398 - Flags: review?(bugs)
This patch reduces recompute TextChangeData::mStart in NotifyContentAdded().

At adding multiple lines to an HTML editor, notifications of mutation occur text node -> br element -> next line's text node -> br element...

So, caching the end offset of the last inserted text can skip next start offset computation. With this patch, the performance of pasting is improved to same as IMM mode.

At pasting 10000 lines on my machine with opt build:
                        1st time   2nd time   3rd time
TSF mode without patch: 55 sec,    64 sec,    65 sec
TSF mode with patch:    1.2 sec,   8.5 sec,   8.6 sec
IMM mode:               1.1 sec,   8.6 sec,   8.5 sec

I don't know why only the first time paste is faster than the others but it's not scope of this bug.
Attachment #8461403 - Flags: review?(bugs)
Similar to part.4.

At removing multiple lines in HTML editor, nodes are removed by this order: the first text node in the selected range is removed -> its br element -> next line's text node...

I.e., ContentRemoved() computes same value for mStart multiple times. This patch caches the value while an edit action.
Attachment #8461406 - Flags: review?(bugs)
(In reply to Masayuki Nakano (:masayuki) (Mozilla Japan) from comment #63)
> Comment on attachment 8460790 [details] [diff] [review]
> I wonder, when IMEContentObserver::Unlink() is called, should it call
> mEditor->RemoveEditorObserver()?
Ah, yes, sounds right.


> 
> I don't know what Traverse() and Unlink() should do, though.
The same as what they do now, but you just add that
if (tmp->mEditor) {
  tmp->mEditor->RemoveEditorObserver(tmp);
  NS_IMPL_CYCLE_COLLECTION_UNLINK(mEditor);
}
(You need to use TRAVERSE/UNLINK macros, and not NS_IMPL_CYCLE_COLLECTION)
Comment on attachment 8460790 [details] [diff] [review]
part.2 IMEContentObserver shouldn't dispatch TextChangeEvent while editor handles an edit action (r=ehsan+smaug, sr=smaug)

So I could take another look at this with those TRAVERSE/UNLINK
Attachment #8460790 - Flags: feedback?(bugs) → feedback+
Comment on attachment 8461403 [details] [diff] [review]
part.4 Don't recompute text length of before the insertion point at adding multiple lines into an HTML editor

This all is getting so complicated that we really need to have good comments.
So, please add comments which explain what
FlatTextCache is for, and what the member variables are.

>+  struct FlatTextCache
>+  {
>+    nsCOMPtr<nsINode> mContainerNode;
>+    int32_t mNodeOffset;
>+    uint32_t mFlatTextLength;
>+
>+    FlatTextCache()
>+      : mNodeOffset(0)
>+      , mFlatTextLength(0)
>+    {
>+    }
>+
>+    void Clear()
>+    {
>+      mContainerNode = nullptr;
>+      mNodeOffset = 0;
>+      mFlatTextLength = 0;
>+    }
>+
>+    void Cache(nsINode* aContainer, int32_t aNodeOffset,
>+               uint32_t aFlatTextLength)
>+    {
>+      mContainerNode = aContainer;
>+      mNodeOffset = aNodeOffset;
>+      mFlatTextLength = aFlatTextLength;
Could we add some assertions here that aNodeOffset doesn't point out of the
possible values under mContainerNode and that
mFlatTextLength isn't longer than possible
Attachment #8461403 - Flags: review?(bugs) → review-
Attachment #8461406 - Flags: review?(bugs) → review+
Btw, once all the parts have got r+, it would be still great to see
a full patch containing all the changes.
Comment on attachment 8461398 [details] [diff] [review]
part.1 Make TextChangeEvent of IMEContentObserver mergeable

Sorry about being a nitpicker about comments and variables names here, but
this code isn't exactly easy to read.


>+  uint32_t newStartInOldContent = newData.mStart;
>+  uint32_t newOldEndInOldContent = newData.mOldEnd;
>+  uint32_t oldNewEndInNewContent = oldData.mNewEnd;

The naming is hard to understand here.
"OldContent" and "NewContent"? What do they refer to?
Why oldData.mNewEnd is about NewContent and 
newData.mStart about OldContent? I would expect the opposite.


CausedOnlyByComposition


>+#ifdef DEBUG
>+// Let's test the code of merging multiple text change data in debug build
>+// and crash if one of them fails because this feature is very complex but
>+// cannot test it with mochitest.
cannot be tested with mochitest

>+  struct TextChangeData
>+  {
>+    // mStart is the start offset of modified or removed text in original
>+    // content.
>+    uint32_t mStart;
>+    // mOldEnd is the end offset of modified or removed text in original
>+    // content.  If the value is same as mStart, no text hasn't been removed
>+    // yet.
>+    uint32_t mOldEnd;
I wonder if this all was a bit more clear if mOldEnd was called
mRemovalOffset

>+    // mNewEnd is the end offset of inserted text or same as mStart if just
>+    // removed.
>+    uint32_t mNewEnd;
and mNewEnd was mAdditionOffset.
Attachment #8461398 - Flags: review?(bugs) → review-
Adding comment for explaining the reason why we need to compute mStart and mOldEnd in old content and mNewEnd in new content because they are should be compared in same content.

And renaming mStart -> mStartOffset, mOldEnd -> mRemovedEndOffset, mNewEnd -> mAddedEndOffset.
Attachment #8461398 - Attachment is obsolete: true
Attachment #8462385 - Flags: review?(bugs)
Adding explanation of FlatTextCache and mAddingTextCache.
Attachment #8461403 - Attachment is obsolete: true
Attachment #8462388 - Flags: review?(bugs)
Comment on attachment 8462386 [details] [diff] [review]
part.2 IMEContentObserver shouldn't dispatch TextChangeEvent while editor handles an edit action (r=ehsan, sr=smaug)

Please check the Unlink() and Destroy().
Attachment #8462386 - Flags: review?(bugs)
Adding comments for explaining mRemovingTextCache.
Attachment #8461406 - Attachment is obsolete: true
Attachment #8462390 - Flags: review?(bugs)
Whiteboard: [leave open]
Comment on attachment 8462386 [details] [diff] [review]
part.2 IMEContentObserver shouldn't dispatch TextChangeEvent while editor handles an edit action (r=ehsan, sr=smaug)

>+NS_IMPL_CYCLE_COLLECTION_UNLINK_BEGIN(IMEContentObserver)
>+  tmp->StopObservingEditor();
>+  tmp->NotifyIMEOfBlur();
>+  tmp->StopListenningSelection();
StopListeningSelection()


And do we really need so many Stop* methods.
Couldn't we have just one, something like 
UnregisterObservers()


>+void
>+IMEContentObserver::NotifyIMEOfBlur()
> {
>   // If CreateTextStateManager failed, mRootContent will be null,
>   // and we should not call NotifyIME(IMENotification(NOTIFY_IME_OF_BLUR))
>-  if (mRootContent) {
>+  if (mRootContent && mWidget) {
>     if (IMEStateManager::IsTestingIME() && mEditableNode) {
>       nsIDocument* doc = mEditableNode->OwnerDoc();
>       (new AsyncEventDispatcher(doc, NS_LITERAL_STRING("MozIMEFocusOut"),
>                                 false, false))->RunDOMEventWhenSafe();
>     }
>-    mWidget->NotifyIME(IMENotification(NOTIFY_IME_OF_BLUR));
>+    // The test event handle might destroy the widget.
>+    if (mWidget) {
>+      mWidget->NotifyIME(IMENotification(NOTIFY_IME_OF_BLUR));
>+    }
>   }
Hmm, firing an event during unlink? Could we dispatch it asynchronously if we're in unlink.
Attachment #8462386 - Flags: review?(bugs) → review-
Comment on attachment 8462385 [details] [diff] [review]
part.1 Make TextChangeEvent of IMEContentObserver mergeable

>+  const TextChangeData& newData = aTextChangeData;
>+  const TextChangeData& oldData = mTextChangeData;
>+
>+  // For comparing new and old mStartOffset/mRemovedEndOffset values, they
>+  // should be adjusted to be in same content. The newData.mStartOffset and
>+  // newData.mRemovedEndOffset should be computed as in old content because
>+  // mStartOffset and mRemovedEndOffset represent the modified text range in
>+  // the old content but even if some text before the values of the newData
>+  // has already been modified, the values don't include the changes.
>+  uint32_t newStartInOldContent = newData.mStartOffset;
>+  uint32_t newOldEndInOldContent = newData.mRemovedEndOffset;
Please add a newline after this to improve readability.

Oh, the offsets are something in the old data...

>+  // For comparing new and old mAddedEndOffset values, they should be adjusted
>+  // to be in same content.  The oldData.mAddedEndOffset should be computed as
>+  // in the new content because mAddedEndOffset indicates the end offset of
>+  // inserted text in the new content but oldData.mAddedEndOffset doesn't
>+  // include any changes of the text before newData.mAddedEndOffset.
>+  uint32_t oldNewEndInNewContent = oldData.mAddedEndOffset;
and here about new...
Based on this I think the member variable names are still wrong, but I don't
have better suggestions.
This is so hard to understand :/

Naming of newStartInOldContent and newOldEndInOldContent is still bizarre.

>+
>+  if (newData.mStartOffset >= oldData.mAddedEndOffset) {
>+    // Case 1:
>+    // If new start is after old end of added text, it means that text after
>+    // the modified range is modified.  Like:
>+    // new range of old change:             +----------+
>+    // old range of new change:                            +----------+
So this makes sense

>+    newStartInOldContent -= oldData.Difference();
...but why we we want to change the start of the old?
(I can see this being right if I read then again that newStartInOldContent is about the new change, not old change, like
 "OldContent" hints)
Similar also elsewhere. This method is just too hard to read this way.



>+    uint32_t mStartOffset;
>+    // mRemovalEndOffset is the end offset of modified or removed text in
>+    // original content.  If the value is same as mStart, no text hasn't been
>+    // removed yet.
mStartOffset

>+    uint32_t mRemovedEndOffset;
>+    // mAddedEndOffset is the end offset of inserted text or same as mStart if
>+    // just removed.  The vlaue is offset in the new content.
mStartOffset
Attachment #8462385 - Flags: review?(bugs) → review-
Comment on attachment 8462390 [details] [diff] [review]
part.5 Don't recompute text length of before the start range at removing multiple lines into an HTML editor


>   };
>   // mAddingTextCache caches text length from the start of content to
>   // the end of the last added content only while an edit action is being
>   // handled by the editor and no other mutation (e.g., removing node)
>   // occur.
>   FlatTextCache mAddingTextCache;
>+  // mRemovingTextCache caches text length from the start of content to
>+  // the start of the last removed content only while an edit action is being
>+  // handled by the editor and no other mutation (e.g., adding node) occur.
>+  FlatTextCache mRemovingTextCache;
Wait, so mRemovingTextCache doesn't actually cache any removed content?
And mAddingTextCache caches from start to the end of the added content, but
mRemovingTextCache from start to the start of the removed content.
Rather odd inconsistency. Naming is almost the same, but the meaning is different.
Attachment #8462390 - Flags: review?(bugs) → review-
Comment on attachment 8462388 [details] [diff] [review]
part.4 Don't recompute text length of before the insertion point at adding multiple lines into an HTML editor

(either something in part 4 or 5 needs to be changed. probably part 5 and this is then fine.)
Attachment #8462388 - Flags: review?(bugs) → review+
(In reply to Olli Pettay [:smaug] from comment #79)
> Comment on attachment 8462386 [details] [diff] [review]
> part.2 IMEContentObserver shouldn't dispatch TextChangeEvent while editor
> handles an edit action (r=ehsan, sr=smaug)
> >+void
> >+IMEContentObserver::NotifyIMEOfBlur()
> > {
> >   // If CreateTextStateManager failed, mRootContent will be null,
> >   // and we should not call NotifyIME(IMENotification(NOTIFY_IME_OF_BLUR))
> >-  if (mRootContent) {
> >+  if (mRootContent && mWidget) {
> >     if (IMEStateManager::IsTestingIME() && mEditableNode) {
> >       nsIDocument* doc = mEditableNode->OwnerDoc();
> >       (new AsyncEventDispatcher(doc, NS_LITERAL_STRING("MozIMEFocusOut"),
> >                                 false, false))->RunDOMEventWhenSafe();
> >     }
> >-    mWidget->NotifyIME(IMENotification(NOTIFY_IME_OF_BLUR));
> >+    // The test event handle might destroy the widget.
> >+    if (mWidget) {
> >+      mWidget->NotifyIME(IMENotification(NOTIFY_IME_OF_BLUR));
> >+    }
> >   }
> Hmm, firing an event during unlink? Could we dispatch it asynchronously if
> we're in unlink.

Did you meant that it should be performed with runnable? or set a timer?

(In reply to Olli Pettay [:smaug] from comment #80)
> Comment on attachment 8462385 [details] [diff] [review]
> part.1 Make TextChangeEvent of IMEContentObserver mergeable
> Based on this I think the member variable names are still wrong, but I don't
> have better suggestions.
> This is so hard to understand :/

Hmm, mStart, mOldEnd and mNewEnd sound not so strange to me... If we name them as explaining the detail, we need very long names, probably...

Indeed, this is very complicated. Actually, I needed for a week for writing this patch.

> Naming of newStartInOldContent and newOldEndInOldContent is still bizarre.

Hmm, why?

> >+  if (newData.mStartOffset >= oldData.mAddedEndOffset) {
> >+    // Case 1:
> >+    // If new start is after old end of added text, it means that text after
> >+    // the modified range is modified.  Like:
> >+    // new range of old change:             +----------+
> >+    // old range of new change:                            +----------+
> So this makes sense
> 
> >+    newStartInOldContent -= oldData.Difference();
> ...but why we we want to change the start of the old?

because the new offset was moved right by the previous change.

E.g., there was text "0123789".

If first change inserted "456" to 4, then the result is "0123456789" and the old data becomes:

mStartOffset: 4, mRemovedEndOffset: 4, mAddedEndOffset: 7 (4 + 3)

Then, next change inserts "." to 9, then the result is "012345678.9" and the new data becomes:

mStartOffset: 9, mRemovedEndOffset: 9, mAddedEndOffset: 10 (9 + 1)

Finally, merged change data should be:

mStartOffset: 4, mRemovedEndOffset: 6, mAddedEndOffset: 10

because "78" of "0123789" is replaced with "45678."

For computing this, we need to convert the new data's mStartOffset and mRemovedEndOffset because they are moved 3 characters ("456") at first change. Therefore, they are:

mStartOffset: 6 (9 - 3), mRemovedEndOffset: 6 (9 - 3)

mStartOffset should be smaller one, therefore, it's 4.

mRemovedOffset should be larger one, therefore, it's 6.

These values represent the range of "78" in the old content.

And also, both mAddedOffset should be in final content at compared them.

I.e., mAddedEndOffset:7 (7 + 0)

because no text is added before mAddedEndOffset at the last change.

Therefore, the result of the mAddedEndOffset is 10 which is larger one of them.

The result is:

"0123 789"
     ^
     mStartOffset

"012378 9"
       ^
       mRemovedEndOffset

"012345678. 9"
           ^  
           mAddedEndOffset


Does this example help you?

(In reply to Olli Pettay [:smaug] from comment #81)
> >   // mAddingTextCache caches text length from the start of content to
> >   // the end of the last added content only while an edit action is being
> >   // handled by the editor and no other mutation (e.g., removing node)
> >   // occur.
> >   FlatTextCache mAddingTextCache;
> >+  // mRemovingTextCache caches text length from the start of content to
> >+  // the start of the last removed content only while an edit action is being
> >+  // handled by the editor and no other mutation (e.g., adding node) occur.
> >+  FlatTextCache mRemovingTextCache;
> Wait, so mRemovingTextCache doesn't actually cache any removed content?

Yes. mStartOfRemovingTextRangeCache is better? And mEndOfAddedTextCache is better?
Flags: needinfo?(bugs)
(In reply to Masayuki Nakano (:masayuki) (Mozilla Japan) from comment #83)
> > Hmm, firing an event during unlink? Could we dispatch it asynchronously if
> > we're in unlink.
> 
> Did you meant that it should be performed with runnable? or set a timer?
I was thinking runnable during unlink

> > Naming of newStartInOldContent and newOldEndInOldContent is still bizarre.
> 
> Hmm, why?
Perhaps because "Content" hints like something related to nsIContent in Gecko, and
you seem to assign "new" values to the *OldContent.
And "NewOldEndInOld..."
You changed the mOldEnd member variable, and others, to *Offset, so why use
*newStart* and *OldEnd* still here.



> Does this example help you?
It is still mainly about the naming of those variables.

> Yes. mStartOfRemovingTextRangeCache is better? And mEndOfAddedTextCache is
> better?
At least those are pretty clear yes.
Flags: needinfo?(bugs)
(In reply to Olli Pettay [:smaug] from comment #84)
> (In reply to Masayuki Nakano (:masayuki) (Mozilla Japan) from comment #83)
> > > Hmm, firing an event during unlink? Could we dispatch it asynchronously if
> > > we're in unlink.
> > 
> > Did you meant that it should be performed with runnable? or set a timer?
> I was thinking runnable during unlink

I'm confused about this.

Currently, "MozIMEFocusOut" is dispatch with AsyncEventDispatcher::RunDOMEventWhenSafe(). So, the event should be dispatched as runnable if during unlink is unsafe.

Do you think unlink is not marked as unsafe? Or NotifyIME() should be called when safe? If the latter, I met this assert on Mac:
https://tbpl.mozilla.org/php/getParsedLog.php?id=44776993&tree=Try#error2

This might indicate that native key event *might* be fired before we set secure input mode after type attribute of <input> is changed.
Oops, see the previous comment.
Flags: needinfo?(bugs)
I also add comments for each case before setting or adjusting the offsets.
Attachment #8462385 - Attachment is obsolete: true
Attachment #8463842 - Flags: review?(bugs)
Renaming the new members to cache the offset and content at adding or removing contents.
Attachment #8462390 - Attachment is obsolete: true
Attachment #8463846 - Flags: review?(bugs)
According to other custom Unlink() implementation, I guess you though that I should make auto script brocker before calling UnregisterObservers() from Unlink().
Attachment #8462386 - Attachment is obsolete: true
Attachment #8463884 - Flags: review?(bugs)
Flags: needinfo?(bugs)
Attachment #8463846 - Flags: review?(bugs) → review+
Comment on attachment 8463884 [details] [diff] [review]
part.2 IMEContentObserver shouldn't dispatch TextChangeEvent while editor handles an edit action (r=ehsan, sr=smaug)

>+NS_IMPL_CYCLE_COLLECTION_UNLINK_BEGIN(IMEContentObserver)
>+  nsAutoScriptBlocker scriptBlocker;
I don't see this being useful. We just run script runners when this script blocker goes away.
I'd prefer doing the complicated stuff asynchronously using a runnable + dispatch to main thread
So perhaps add a bool param to UnregisterObservers and if set, don't use
RunDOMEventWhenSafe but PostDOMEvent.
(I'd prefer to not run any event listeners during Unlink.)
That fixed, r+
Attachment #8463884 - Flags: review?(bugs) → review+
Comment on attachment 8463842 [details] [diff] [review]
part.1 Make TextChangeEvent of IMEContentObserver mergeable

>+  // For comparing new and old |mStartOffset|/|mRemovedEndOffset| values, they
>+  // should be adjusted to be in same text. The |newData.mStartOffset| and
>+  // |newData.mRemovedEndOffset| should be computed as in old text because
>+  // |mStartOffset| and |mRemovedEndOffset| represent the modified text range
>+  // in the old text but even if some text before the values of the newData
>+  // has already been modified, the values don't include the changes.
>+  uint32_t newStartOffsetInOldText = newData.mStartOffset;
>+  uint32_t newRemovedEndOffsetInOldText = newData.mRemovedEndOffset;
Much better



>+  if (newData.mStartOffset >= oldData.mAddedEndOffset) {
>+    // Case 1:
>+    // If new start is after old end offset of added text, it means that text
>+    // after the modified range is modified.  Like:
>+    // new range of old change:             +----------+
>+    // old range of new change:                            +----------+
This could use still some minor tweaking to make it clear what "new range" and "old range" mean.
Perhaps use added/removed range?


>+    // So, the offsets are moved by the old change and we need to cancel the
>+    // move of the old change.
>+    newStartOffsetInOldText -= oldData.Difference();
I still don't understand this although you explained it in comment 83.
You say:
"Finally, merged change data should be:
 mStartOffset: 4, mRemovedEndOffset: 6, mAddedEndOffset: 10"
but why do we need to change start offset here if we know
the old start is before the new one?

>+    newRemovedEndOffsetInOldText -= oldData.Difference();
This one I think I can understand.



>+  } else if (newData.mStartOffset >= oldData.mStartOffset) {
>+    // If new start is in the modified range, it means that new data changes
>+    // a part or all of the range.
>+    newStartOffsetInOldText = oldData.mStartOffset;
So why doesn't the 'Case 1' do this?

>+  } else if (newData.mRemovedEndOffset >= oldData.mStartOffset) {
>+    // If new end of added text is greater than old start (and new start is
>+    // less than old start), it means that a part of modified range is modified
>+    // again and some new text before the modified range is also modified.
>+    if (newData.mRemovedEndOffset >= oldData.mAddedEndOffset) {
I don't know what the comment is there between these 'ifs'.
It talks about "new end of added text", but ifs are about new end of removed text.


>+  // mStartOffset and mRemovedEndOffset should be compared in the old content.
>+  mTextChangeData.mStartOffset =
>+    std::min(newStartOffsetInOldText, oldData.mStartOffset);
>+  mTextChangeData.mRemovedEndOffset =
>+    std::max(newRemovedEndOffsetInOldText, oldData.mRemovedEndOffset);
>+  // mAddedEndOffset should be compared in the new content.
>+  mTextChangeData.mAddedEndOffset =
>+    std::max(newData.mAddedEndOffset, oldAddedEndOffsetInNewText);
oh, we do tons of work to change offset, and then we still compare the min/max here.
Could we not just set m*Offset values is the various if-else cases.
I think that would make the code easier to read. Now one needs to know that essential part of
offset calculations happens after all the various "Case x".



Almost there...
Attachment #8463842 - Flags: review?(bugs) → review-
(In reply to Olli Pettay [:smaug] from comment #91)
> Comment on attachment 8463884 [details] [diff] [review]
> part.2 IMEContentObserver shouldn't dispatch TextChangeEvent while editor
> handles an edit action (r=ehsan, sr=smaug)
> 
> >+NS_IMPL_CYCLE_COLLECTION_UNLINK_BEGIN(IMEContentObserver)
> >+  nsAutoScriptBlocker scriptBlocker;
> I don't see this being useful. We just run script runners when this script
> blocker goes away.
> I'd prefer doing the complicated stuff asynchronously using a runnable +
> dispatch to main thread
> So perhaps add a bool param to UnregisterObservers and if set, don't use
> RunDOMEventWhenSafe but PostDOMEvent.
> (I'd prefer to not run any event listeners during Unlink.)
> That fixed, r+

Okay, perhaps, I understand it.

(In reply to Olli Pettay [:smaug] from comment #92)
> Comment on attachment 8463842 [details] [diff] [review]
> part.1 Make TextChangeEvent of IMEContentObserver mergeable
> >+  if (newData.mStartOffset >= oldData.mAddedEndOffset) {
> >+    // Case 1:
> >+    // If new start is after old end offset of added text, it means that text
> >+    // after the modified range is modified.  Like:
> >+    // new range of old change:             +----------+
> >+    // old range of new change:                            +----------+
> This could use still some minor tweaking to make it clear what "new range"
> and "old range" mean.
> Perhaps use added/removed range?

Ah, yes.

> >+    // So, the offsets are moved by the old change and we need to cancel the
> >+    // move of the old change.
> >+    newStartOffsetInOldText -= oldData.Difference();
> I still don't understand this although you explained it in comment 83.
> You say:
> "Finally, merged change data should be:
>  mStartOffset: 4, mRemovedEndOffset: 6, mAddedEndOffset: 10"
> but why do we need to change start offset here if we know
> the old start is before the new one?

Ah, new mStartOffset never smaller than old mStartOffset in this case. So, adjusting it is redundant. Although, it might make easier to understand by consistency with the other cases.

> >+  } else if (newData.mStartOffset >= oldData.mStartOffset) {
> >+    // If new start is in the modified range, it means that new data changes
> >+    // a part or all of the range.
> >+    newStartOffsetInOldText = oldData.mStartOffset;
> So why doesn't the 'Case 1' do this?

Because the *new* mStartOffset isn't same as old mStartOffset in the "Case 1".

> >+  } else if (newData.mRemovedEndOffset >= oldData.mStartOffset) {
> >+    // If new end of added text is greater than old start (and new start is
> >+    // less than old start), it means that a part of modified range is modified
> >+    // again and some new text before the modified range is also modified.
> >+    if (newData.mRemovedEndOffset >= oldData.mAddedEndOffset) {
> I don't know what the comment is there between these 'ifs'.
> It talks about "new end of added text", but ifs are about new end of removed
> text.

Oh, that's just a mistake of the comment.

> >+  // mStartOffset and mRemovedEndOffset should be compared in the old content.
> >+  mTextChangeData.mStartOffset =
> >+    std::min(newStartOffsetInOldText, oldData.mStartOffset);
> >+  mTextChangeData.mRemovedEndOffset =
> >+    std::max(newRemovedEndOffsetInOldText, oldData.mRemovedEndOffset);
> >+  // mAddedEndOffset should be compared in the new content.
> >+  mTextChangeData.mAddedEndOffset =
> >+    std::max(newData.mAddedEndOffset, oldAddedEndOffsetInNewText);
> oh, we do tons of work to change offset, and then we still compare the
> min/max here.

Yes, this is the simplest work which we need to do here. The previous work is computing and adjusting the source of this comparing.

> Could we not just set m*Offset values is the various if-else cases.
> I think that would make the code easier to read. Now one needs to know that
> essential part of
> offset calculations happens after all the various "Case x".

It's possible but I guess that it will show a lot of redundancy because there will be a lot of similar lines... but I'll do it and compare with this.
Rewriting the method with setting mTextChangeData directly.
Attachment #8463842 - Attachment is obsolete: true
Attachment #8464662 - Flags: review?(bugs)
Comment on attachment 8464662 [details] [diff] [review]
part.1 Make TextChangeEvent of IMEContentObserver mergeable


>+      // So, the new removed end offset is moved by the old change and we need
>+      // to cancel the move of the old change.
I don't understand this comment.


>+      uint32_t newRemovedEndOffsetInOldText =
>+        newData.mRemovedEndOffset - oldData.Difference();
Why we want to do this?

>+      mTextChangeData.mRemovedEndOffset =
>+        std::max(newRemovedEndOffsetInOldText, oldData.mRemovedEndOffset);
>+      // The old end of added text is replaced by new change. So, it should be
>+      // same as the new start.
>+      uint32_t oldAddedEndOffsetInNewText = newData.mStartOffset;
Do you actually need this temporary variable.

>+      mTextChangeData.mAddedEndOffset =
>+        std::max(newData.mAddedEndOffset, oldAddedEndOffsetInNewText);
I don't understand how newData.mStartOffset (which is the same as oldAddedEndOffsetInNewText) could
ever be larger than newData.mAddedEndOffset


>+      return false;
>+    }
>+
>+    // Case 3:
>+    // If new end of removed text is less than old end of added text, it means
>+    // that only a part of the modified range is modified again  Like:
s/again  Like:/again. Like:/

>+    // added range of old change:             +------------+
>+    // removed range of new change:               +-----+
>+    // So, the new end of removed text should be same as old end of removed
>+    // text for preventing end of removed text to be modified.
>+    uint32_t newRemovedEndOffsetInOldText = oldData.mRemovedEndOffset;
>+    mTextChangeData.mRemovedEndOffset =
>+      std::max(newRemovedEndOffsetInOldText, oldData.mRemovedEndOffset);
newRemovedEndOffsetInOldText is the same as  oldData.mRemovedEndOffset, 
so this std::max ends up doing
std::max(oldData.mRemovedEndOffset,  oldData.mRemovedEndOffset);


>+    uint32_t mRemovedEndOffset;
>+    // mAddedEndOffset is the end offset of inserted text or same as
>+    // mStartOffset if just removed.  The vlaue is offset in the new content.
value
Attachment #8464662 - Flags: review?(bugs) → review-
(In reply to Olli Pettay [:smaug] from comment #96)
> Comment on attachment 8464662 [details] [diff] [review]
> part.1 Make TextChangeEvent of IMEContentObserver mergeable
> 
> 
> >+      // So, the new removed end offset is moved by the old change and we need
> >+      // to cancel the move of the old change.
> I don't understand this comment.

Hmm, the reason is explained above... Before comparing the offsets of newData and oldData, they must be in same text. Comparing offset in new text and offset in old text does not have any meaning...

> >+      uint32_t newRemovedEndOffsetInOldText =
> >+        newData.mRemovedEndOffset - oldData.Difference();
> Why we want to do this?

So, it does convert the newData's mRemovedEndOffset to be in old text...

> >+      mTextChangeData.mRemovedEndOffset =
> >+        std::max(newRemovedEndOffsetInOldText, oldData.mRemovedEndOffset);
> >+      // The old end of added text is replaced by new change. So, it should be
> >+      // same as the new start.
> >+      uint32_t oldAddedEndOffsetInNewText = newData.mStartOffset;
> Do you actually need this temporary variable.

Not necessary. However, I believe that the variable name is self document. For the other developers, comparing with new added end offset and new start offset is not clear the meaning.

> >+      mTextChangeData.mAddedEndOffset =
> >+        std::max(newData.mAddedEndOffset, oldAddedEndOffsetInNewText);
> I don't understand how newData.mStartOffset (which is the same as
> oldAddedEndOffsetInNewText) could
> ever be larger than newData.mAddedEndOffset

I guess this is case 2. Then, you're right.

> >+    // added range of old change:             +------------+
> >+    // removed range of new change:               +-----+
> >+    // So, the new end of removed text should be same as old end of removed
> >+    // text for preventing end of removed text to be modified.
> >+    uint32_t newRemovedEndOffsetInOldText = oldData.mRemovedEndOffset;
> >+    mTextChangeData.mRemovedEndOffset =
> >+      std::max(newRemovedEndOffsetInOldText, oldData.mRemovedEndOffset);
> newRemovedEndOffsetInOldText is the same as  oldData.mRemovedEndOffset, 
> so this std::max ends up doing
> std::max(oldData.mRemovedEndOffset,  oldData.mRemovedEndOffset);

Indeed.
Updated for the last review.
Attachment #8464662 - Attachment is obsolete: true
Attachment #8464806 - Flags: review?(bugs)
Comment on attachment 8464806 [details] [diff] [review]
part.1 Make TextChangeEvent of IMEContentObserver mergeable

Still super complicated and error prone.
I wonder if there was some different approach to handle merging.
(like some kind of virtual array of offsets and each offset is put to the
array and then when something is cut from or added to there,
update all the offset after the change etc. Not sure if that would work.)
Attachment #8464806 - Flags: review?(bugs) → review+
(In reply to Olli Pettay [:smaug] from comment #99)
> Comment on attachment 8464806 [details] [diff] [review]
> part.1 Make TextChangeEvent of IMEContentObserver mergeable
> 
> Still super complicated and error prone.

Yeah, and perhaps, it's hard to find bugs of the code because wrong offsets *might* cause odd behavior of IME. So, even with wrong offsets, IME may work without problems but not ideally.

> I wonder if there was some different approach to handle merging.
> (like some kind of virtual array of offsets and each offset is put to the
> array and then when something is cut from or added to there,
> update all the offset after the change etc. Not sure if that would work.)

I have no idea and I don't understand your idea well. The reason of complicated is that the offsets must be compared in same context. Adjusting offsets to old or new text is that point...

https://hg.mozilla.org/integration/mozilla-inbound/rev/e27aee35d0bf
https://hg.mozilla.org/integration/mozilla-inbound/rev/2cfc400dd0ad
https://hg.mozilla.org/integration/mozilla-inbound/rev/c076e8799695
https://hg.mozilla.org/integration/mozilla-inbound/rev/091169ecbd5b
https://hg.mozilla.org/integration/mozilla-inbound/rev/20f6847ad2d9

Anyway thank you for your review of such complicated patch!
No longer depends on: 596507
Flags: qe-verify+
Depends on: 1072137
I was unable to reproduce this issue on Firefox 3.6.2 using Windows XP x32 and Windows XP x64.

I followed the STR from Comment 0 and the message with time result did not show up. 

In all cases (10,100,1000,5000, 10000 lines) the text was pasted almost instantly.  Am I missing something?
Flags: needinfo?(masayuki)
I forgot to mention that I have tried to reproduce this issue with TSF enabled and also with TSF disabled and in both cases the result was the same.
I'm not sure. Starting 4.0, this is reproduced only on HTML editor, e.g., <div contenteditable>foo</div>. But this must be reproduced on 3.6...
Flags: needinfo?(masayuki)
Ah, pref name for enabling TSF mode was changed, if you use intl.tsf.enable, you must fail to enable TSF mode. If so, please check intl.enable_tsf_support.
I tried again to reproduce this issue setting intl.enable_tsf_support -> true, but I could not reproduce it on Firefox 3.6 (http://hg.mozilla.org/releases/mozilla-1.9.2/rev/448d0d2d310c), Firefox 3.6.2 (http://hg.mozilla.org/releases/mozilla-1.9.2/rev/cd857b3b0e33) and Firefox 4.0 (20110318052756) using Windows XP x32.

I also verified this bug on Firefox 34RC (20141124205320) using Windows XP x32 and everything looks fine, the text is displayed almost instantly.

Can anyone reproduce this bug on an old version? If the answer is "yes" please verify this issue on Firefox 34RC (20141124205320)(https://ftp.mozilla.org/pub/mozilla.org/firefox/nightly/34.0-candidates/build1/win32/en-US/).
(In reply to Vasilica Mihasca, QA [:vasilica_mihasca] from comment #106)
> I tried again to reproduce this issue setting intl.enable_tsf_support ->
> true, but I could not reproduce it on Firefox 3.6
> (http://hg.mozilla.org/releases/mozilla-1.9.2/rev/448d0d2d310c), Firefox
> 3.6.2 (http://hg.mozilla.org/releases/mozilla-1.9.2/rev/cd857b3b0e33) and
> Firefox 4.0 (20110318052756) using Windows XP x32.

Um, that's odd. Did you reboot your Firefox after changing the pref?

> I also verified this bug on Firefox 34RC (20141124205320) using Windows XP
> x32 and everything looks fine, the text is displayed almost instantly.

On Win XP, you can enable TSF mode only with "intl.tsf.force_enable" if it's in the list of about:config because XP's TSF is too old and unstable. Therefore, we don't have any plans to support it on WinXP.
I was able to reproduce this bug with HTML editor on Firefox 34.0a1 (2014-07-29) using Windows 7 x64 by setting the following prefs on true: intl.tsf.enable and intl.tsf.force_enable.
	
Performance on Firefox 34.0a1:      100 lines - 0.021 sec for paste, 0.019 sec for extra work
				    500 lines - 0.107 sec for paste, 0.025 sec for extra work
       	  			    1000 lines - 0.284 sec for paste, 0.038 sec for extra work
			            2000 lines - 1.054 sec for paste, 0.058 sec for extra work
				    5000 lines - 6.269 sec for paste, 0.178 sec for extra work
				    10000 lines - 25.889 sec for paste, 0.291 sec for extra work

Verified fixed on Firefox 34 RC build 2 (20141125180439) using Windows 7 x64 by setting the following prefs on true: intl.tsf.enable and intl.tsf.force_enable.

Performance on Firefox 34RC build 2:100 lines - 0.008 sec for paste, 0.005 sec for extra work
				    500 lines - 0.031 sec for paste, 0.022 sec for extra work
				    1000 lines - 0.069 sec for paste, 0.039 sec for extra work
				    2000 lines - 0.079 sec for paste, 0.044 sec for extra work	
				    5000 lines - 0.168 sec for paste, 0.088 sec for extra work
				    10000 lines - 0.438 sec for paste, 0.133 sec for extra work
Status: RESOLVED → VERIFIED
You need to log in before you can comment on or make changes to this bug.