[TSF] Microsoft IME puts twice the first character of new composition when you start new composition when there is old composition

RESOLVED FIXED in mozilla34

Status

()

Core
Widget: Win32
RESOLVED FIXED
4 years ago
4 years ago

People

(Reporter: m_kato, Assigned: masayuki)

Tracking

({inputmethod, regression})

Trunk
mozilla34
x86
Windows 8.1
inputmethod, regression
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(Not tracked)

Details

Attachments

(3 attachments)

(Reporter)

Description

4 years ago
https://twitter.com/PRiMENON/status/502656164652281856

- Step
1. Turn on Microsoft IME
2. input ああ
3. convert via [space] key
4. input あ

- Result
唖ああ

- Expected Result
唖あ
(Reporter)

Updated

4 years ago
Blocks: 1049488
Only with Office 2010 IME?
Oops, I misunderstood the STR, I can reproduce this bug with MS-IME of Win 8.1.
Created attachment 8477251 [details]
log (Typeing い of "ああ -> Space -> い")
Attachment #8477251 - Attachment mime type: text/x-log → text/plain
> 0[e0f140]: TSF: 0x9e87b80 nsTextStore::SetText(dwFlags=not-specified, acpStart=2, acpEnd=2, pchText=0x82c2b74 "い", cch=1, pChange=0x44dd0c), mComposition.IsComposing()=true

MS-IME tries to insert typed character next to the composition string without calling OnEndComposition().

> 0[e0f140]: TSF: 0x9e87b80 nsTextStore::OnUpdateComposition(pComposition=0x11e6cd18, pRangeNew=0x0), mComposition.mView=0x11e6cd18

Next, OnCompositionUpdate() is called with incomplete range (nullptr).

> 0[e0f140]: TSF: 0x9e87b80   nsTextStore::RestartCompositionIfNecessary(aRangeNew=0x0), mComposition.mView=0x11e6cd18
> 0[e0f140]: TSF: 0x9e87b80   nsTextStore::RestartCompositionIfNecessary(), range=0-3, mComposition={ mStart=0, mString.Length()=3 }

Then, we try to retrieve new range from stored ITfCompositionView instance. It returns all of them are in composition.

Therefore, we assume that no part of the composition string is not committed.

However,

> 0[e0f140]: TSF: 0x9e87b80   nsTextStore::RecordCompositionUpdateAction(), mComposition={ mView=0x11e6cd18, mString="ああい" }
> 0[e0f140]: TSF: 0x9e87b80   nsTextStore::CurrentSelection(): acpStart=2, acpEnd=2 (length=0), reverted=false
> 0[e0f140]: TSF: 0x9e87b80   nsTextStore::GetDisplayAttribute(): GetDisplayAttribute range=0-2 (hr=S_OK)
> 0[e0f140]: TSF: 0x9e87b80   nsTextStore::GetDisplayAttribute() FAILED due to ITfProperty::GetValue() returns non-VT_I4 value
> 0[e0f140]: TSF: 0x9e87b80   nsTextStore::GetDisplayAttribute(): GetDisplayAttribute range=2-3 (hr=S_OK)

If getting the composition range detail, we fails to get the range (clause) property due to already committed.

> 0[e0f140]: TSF: 0x9e87b80 nsTextStore::OnUpdateComposition(pComposition=0x11e6cd18, pRangeNew=0x5da6514), mComposition.mView=0x11e6cd18
> 0[e0f140]: TSF: 0x9e87b80   nsTextStore::RestartCompositionIfNecessary(aRangeNew=0x5da6514), mComposition.mView=0x11e6cd18
> 0[e0f140]: TSF: 0x9e87b80   nsTextStore::RestartCompositionIfNecessary(), range=0-3, mComposition={ mStart=0, mString.Length()=3 }
> 0[e0f140]: TSF: 0x9e87b80   nsTextStore::RestartCompositionIfNecessary() succeeded

After that, OnUpdateCOmposition() is called includes the fixed part in the range.

> 0[e0f140]: TSF: 0x9e87b80 nsTextStore::OnUpdateComposition(pComposition=0x11e6cd18, pRangeNew=0x5da5ef4), mComposition.mView=0x11e6cd18
> 0[e0f140]: TSF: 0x9e87b80   nsTextStore::RestartCompositionIfNecessary(aRangeNew=0x5da5ef4), mComposition.mView=0x11e6cd18
> 0[e0f140]: TSF: 0x9e87b80   nsTextStore::RestartCompositionIfNecessary(), range=2-3, mComposition={ mStart=0, mString.Length()=3 }

Finally, OnUpdateComposition() is called again with proper range.

However, we believe that composing string is including the typed character :-(
Okay, I think that we can hide this bug easy.

Although, we should support committed clauses in nsEditor in the future.
Assignee: nobody → masayuki
Status: NEW → ASSIGNED
Keywords: inputmethod
Summary: Microsoft IME puts character twice when tsf is turned on → [TSF] Microsoft IME puts character twice when tsf is turned on

Comment 6

4 years ago
(In reply to Masayuki Nakano (:masayuki) (Mozilla Japan) from comment #4)
> Next, OnCompositionUpdate() is called with incomplete range (nullptr).

Ah, I didn't used ITfContextOwnerCompositionSink::OnCompositionUpdate() when I implemented the text store for Chromium (for immersive mode).  Perhaps you might be the first victim (outside MSFT?) if its behavior is so unreliable and tricky.

I guess both ITfContextOwnerCompositionSink::OnStartComposition and ITfContextOwnerCompositionSink::OnEndComposition are sane and reliable, because it must be easy for TSF runtime to call them correctly.  This is just a pluming task as follows:
  - ITfContextComposition::StartComposition
    -> ITfContextOwnerCompositionSink::OnStartComposition
  - ITfComposition::EndComposition
    -> ITfContextOwnerCompositionSink::OnEndComposition

As for ITfContextComposition::OnCompositionUpdate, however, it could be error prone unless its implementer, that is Microsoft, and we application developers are sharing the same meaning of "Update".  Honestly I guess this method was introduced just for convenience.  We can even reimplemented OnCompositionUpdate on top of ITfTextEditSink::OnEndEdit and other existing APIs.  Chromium is now going this way actually.

IIRC, ITfTextEditSink::OnEndEdit is called back during ITextStoreACPSink::OnLockGranted(mLock) is being called, that is, before ITextStoreACPSink::OnLockGranted(mLock) returns to the call site.  Whenever ITfTextEditSink::OnEndEdit is called back, and if there remain one ore more on-going compositions, you can a) find text range marked with GUID_PROP_COMPOSING and non-zero value in its VARIANT data, and/or b) get ITfCompositionView objects via ITfContextComposition which can be obtained by QueryInterface from ITfContext.  This article might be helpful.
http://blogs.msdn.com/b/tsfaware/archive/2007/10/05/what-to-do-when-you-push-a-key.aspx

(In reply to Masayuki Nakano (:masayuki) (Mozilla Japan) from comment #5)
> Although, we should support committed clauses in nsEditor in the future.

Yeah, you can find its range with ITfContext::TrackProperties + GUID_PROP_ATTRIBUTE instead of assuming that the entire composing range has the display attribute.  Here is the code in Chromium.
http://lxr.mozilla.org/chromium/source/src/win8/metro_driver/ime/text_store.cc#737

Hope this helps.
Yukawa-san:

Thank you for the information. Indeed, we struggle with OnUpdateComoposition()'s odd behavior. Its behavior actually depends on each TIP. So, forever, we need to keep struggling with it.

Currently, we're using OnUpdateComposition() for catching the timing of updating its composition string. When it's called, even if the updated range is still null, it adds pending action for compositionupdate. This is important if OnEndComposition() will be called during the lock at least our current design.

The problem of this bug is, when we detect that we cannot handle new composition range with current composition, e.g., composition start position is changed, we restarts composition with the new range. However, it's buggy. In this case, MS-IME updates composition string and then commit a part of the composition string in a lock. This is what we don't assumed. So, at least, this is really our bug.


BTW, do you know if there are TIPs which use multiple composition? We still don't support it but I'm not sure if it's worthwhile for us. I guess that current D3E composition event isn't enough for representing multiple composition, though.
Oh, this must be my regression. In the initial implementation, even if aPreserveSelection is true, OnStartCompositionInternal uses selection set event before dispatching compositionstart event.
http://hg.mozilla.org/mozilla-central/annotate/ccf89b42c2e2/widget/src/windows/nsTextStore.cpp#l1172

However, I misunderstood as if aPreserveSelection is true, we shouldn't set selection before dispatching compositionstart. aPreserveSelection means that if it's true, selection should be restored after dispatching compositionstart.
Created attachment 8478151 [details] [diff] [review]
part.1 nsTextStore should set selection before dispatching compositionstart if current selection is different from the range

Okay, this hides this bug. Because of the seriousness of this bug for MS-IME users, we should land this ASAP. However, this causes odd undo transaction. E.g., "ああ" → "いい" case, the first transaction is "ああい" and the next transaction is "いい" withe replacing the first committed string's "い".

I'll fix it in this bug.
Attachment #8478151 - Flags: review?(VYV03354)
Keywords: regression
Whiteboard: [leave open]
Blocks: 1050703
Comment on attachment 8478151 [details] [diff] [review]
part.1 nsTextStore should set selection before dispatching compositionstart if current selection is different from the range

Please file a follow-up bug tot track the root cause.
Attachment #8478151 - Flags: review?(VYV03354) → review+
(In reply to Masatoshi Kimura [:emk] from comment #10)
> Comment on attachment 8478151 [details] [diff] [review]
> part.1 nsTextStore should set selection before dispatching compositionstart
> if current selection is different from the range
> 
> Please file a follow-up bug tot track the root cause.

Okay, I'll do that tomorrow. I think that there are two bugs:

* Support multiple composition
* Support "fixed" clause of composition

The latter should be easy to fix.

Comment 14

4 years ago
(In reply to Masayuki Nakano (:masayuki) (Mozilla Japan) from comment #7)
> Currently, we're using OnUpdateComposition() for catching the timing of
> updating its composition string. When it's called, even if the updated range
> is still null, it adds pending action for compositionupdate. This is
> important if OnEndComposition() will be called during the lock at least our
> current design.

Maybe implementing ITfTextEditSink to subscribe ITfTextEditSink::OnEndEdit could be more reliable and a good long term solution.  Right now, the current implementation is relying on the following callbacks, right?.

- ITfContextOwnerCompositionSink::StartComposition
- ITfContextOwnerCompositionSink::OnUpdateComposition
- ITfContextOwnerCompositionSink::OnEndComposition

Then I'd suggest to switch to the following set of callbacks.

- ITfContextOwnerCompositionSink::StartComposition
- ITfTextEditSink::OnEndEdit
- ITfContextOwnerCompositionSink::OnEndComposition

In short, ITfTextEditSink::OnEndEdit is a more fine-grained version of ITfContextOwnerCompositionSink::OnUpdateComposition.  I know ITfTextEditSink::OnEndEdit is a bit too complicated and hard to understand, but I think we can observe what the TIP is doing more closely by using ITfTextEditSink::OnEndEdit.

(In reply to Masayuki Nakano (:masayuki) (Mozilla Japan) from comment #7)
> BTW, do you know if there are TIPs which use multiple composition? We still
> don't support it but I'm not sure if it's worthwhile for us. I guess that
> current D3E composition event isn't enough for representing multiple
> composition, though.

AFAIK, no IME supports multiple compositions on Windows.  At least I've never seen such an IME.  Perhaps, one possible scenario could be when multiple TIPs are enabled and then some of them start composing text at the same time.  Anyway, from the viewpoint of the usability, it is OK to simply reject multiple composition, as Firefox has been doing.  TIPs should never go wrong even when the application doesn't accept composition start request.
(In reply to Yohei Yukawa from comment #14)
> Maybe implementing ITfTextEditSink to subscribe ITfTextEditSink::OnEndEdit
> could be more reliable and a good long term solution.  Right now, the
> current implementation is relying on the following callbacks, right?.
> 
> - ITfContextOwnerCompositionSink::StartComposition
> - ITfContextOwnerCompositionSink::OnUpdateComposition
> - ITfContextOwnerCompositionSink::OnEndComposition

Yes, each of them is a trigger to create pending composition (start|update|end) action.

> Then I'd suggest to switch to the following set of callbacks.
> 
> - ITfContextOwnerCompositionSink::StartComposition
> - ITfTextEditSink::OnEndEdit
> - ITfContextOwnerCompositionSink::OnEndComposition

I'm not sure that in this case, we can know OnEndEdit() is exactly same as compositionudpate. I.e., is it called *after* OnEndComposition() too? Anyway, I should implement it locally and check the behavior, though. (bug 544778)

> In short, ITfTextEditSink::OnEndEdit is a more fine-grained version of
> ITfContextOwnerCompositionSink::OnUpdateComposition.  I know
> ITfTextEditSink::OnEndEdit is a bit too complicated and hard to understand,
> but I think we can observe what the TIP is doing more closely by using
> ITfTextEditSink::OnEndEdit.

So, my concern is, if OnEndEdit() may be called in some cases which do not cause OnUpdateComposition() calls, we need to check whether we should append pending compositionupdate action or not.

> (In reply to Masayuki Nakano (:masayuki) (Mozilla Japan) from comment #7)
> > BTW, do you know if there are TIPs which use multiple composition? We still
> > don't support it but I'm not sure if it's worthwhile for us. I guess that
> > current D3E composition event isn't enough for representing multiple
> > composition, though.
> 
> AFAIK, no IME supports multiple compositions on Windows.  At least I've
> never seen such an IME.  Perhaps, one possible scenario could be when
> multiple TIPs are enabled and then some of them start composing text at the
> same time.  Anyway, from the viewpoint of the usability, it is OK to simply
> reject multiple composition, as Firefox has been doing.  TIPs should never
> go wrong even when the application doesn't accept composition start request.

Thanks. I wonder why Microsoft thought that supporting multiple composition is necessary for TSF...
Created attachment 8478866 [details] [diff] [review]
part.2 nsTextStore should not commit a part of composition string which is still being composed at restarting composition

This patch fixes the remaining issue of undo transaction.

At restarting composition, if new range is completely before or after old composition range, we should just call RecordCompositionEndAction() and RecordCompositionStartAction().

Otherwise, this is what the case of renaming issue, we should commit only the part of committing string (i.e., except the new range of old composition string). In this, we should create pending compositionupdate action manually because current range stored by TSF isn't same as committing range. Therefore, we cannot use any methods which we've already implemented. Then, call RecordCompositionEndAction() for dispatching compositionend.

Next, restore the old composition and old selection (i.e., before committing the old composition) for same content as TIP knows.

Finally, RecordCompositionStartAction() for restarting composition with new range.
Attachment #8478866 - Flags: review?(VYV03354)
Whiteboard: [leave open]
FYI:

[TSF] nsEditor and nsTextStore should support multiple composition at same time
https://bugzilla.mozilla.org/show_bug.cgi?id=1058437

[TSF] Should support fixed clause of compostion
https://bugzilla.mozilla.org/show_bug.cgi?id=1058439
Comment on attachment 8478866 [details] [diff] [review]
part.2 nsTextStore should not commit a part of composition string which is still being composed at restarting composition

Review of attachment 8478866 [details] [diff] [review]:
-----------------------------------------------------------------

::: widget/windows/nsTextStore.cpp
@@ +1829,5 @@
> +  // If the new composition is really different range from current
> +  // current composition, we just commit the composition and restart new
> +  // composition with the new range but current selection range should be
> +  // preserved.
> +  if (newStart > mComposition.EndOffset() || newEnd < mComposition.mStart) {

Isn't the end offset exclusive? For example, if newStart == 3, newEnd == 5, mComposition.mStart == 1 and mComposition.EndOffset() == 3, is the composition considered to be overlapped?
(In reply to Masatoshi Kimura [:emk] from comment #18)
> Comment on attachment 8478866 [details] [diff] [review]
> part.2 nsTextStore should not commit a part of composition string which is
> still being composed at restarting composition
> 
> Review of attachment 8478866 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> ::: widget/windows/nsTextStore.cpp
> @@ +1829,5 @@
> > +  // If the new composition is really different range from current
> > +  // current composition, we just commit the composition and restart new
> > +  // composition with the new range but current selection range should be
> > +  // preserved.
> > +  if (newStart > mComposition.EndOffset() || newEnd < mComposition.mStart) {
> 
> Isn't the end offset exclusive? For example, if newStart == 3, newEnd == 5,
> mComposition.mStart == 1 and mComposition.EndOffset() == 3, is the
> composition considered to be overlapped?

I think so.

string: 0123456
old:       ^^^
new:     ^^^

In this case, I think only 4-5 should be committed. Although, I don't know actual case of this. Therefore, I'm not 100% sure this is natural behavior. But my idea is committing only 4-5 first. I.e., after commit the new composition, undo transactions should be "replace 1-3" and "replace 4-5 with original selected text". If you don't think that it's not natural behavior, I think that we should commit only a part only when the range is not expanding to original composition range.
Flags: needinfo?(VYV03354)
(In reply to Masayuki Nakano (:masayuki) (Mozilla Japan) from comment #19)
> I think so.
> 
> string: 0123456
> old:       ^^^
> new:     ^^^

Isn't it
 string: 0123456
 old:       ^^ 
 new:     ^^ 
?

If newEnd is really inclusive, newLength will be 2 while the actual length of the composition string is 3.
I couldn't find such a description from the document of ITfRangeACP::GetExtent.
http://msdn.microsoft.com/en-us/library/windows/desktop/ms628910%28v=vs.85%29.aspx
Flags: needinfo?(VYV03354)
Moreover, if newEnd is inclusive, your range calculation is incorrect. keepComposingLength will be 0 while the range will have one character overlap. So commitString.Cut() will fail to cut the overlapped character.
(In reply to Masatoshi Kimura [:emk] from comment #20)
> (In reply to Masayuki Nakano (:masayuki) (Mozilla Japan) from comment #19)
> > I think so.
> > 
> > string: 0123456
> > old:       ^^^
> > new:     ^^^
> 
> Isn't it
>  string: 0123456
>  old:       ^^ 
>  new:     ^^ 
> ?
> 
> If newEnd is really inclusive, newLength will be 2 while the actual length
> of the composition string is 3.

Ah, yes. You're right. So, they should be >= and <=.
Comment on attachment 8478866 [details] [diff] [review]
part.2 nsTextStore should not commit a part of composition string which is still being composed at restarting composition

Review of attachment 8478866 [details] [diff] [review]:
-----------------------------------------------------------------

::: widget/windows/nsTextStore.cpp
@@ +1826,5 @@
> +    return hr;
> +  }
> +
> +  // If the new composition is really different range from current
> +  // current composition, we just commit the composition and restart new

"If the new range has no overlap with the current range,"

@@ +1836,5 @@
> +    return S_OK;
> +  }
> +
> +  // If part of current composition is committed, we should commit only the
> +  // part for avoiding to create odd undo transaction.  I.e., the other part

"If the new range has an overlap with the current one, we should commit only the overlapped range to avoid creating an odd undo transaction."

@@ +1847,5 @@
> +  // Commit only the part of composition.
> +  LONG keepComposingStartOffset = std::max(mComposition.mStart, newStart);
> +  LONG keepComposingEndOffset = std::min(mComposition.EndOffset(), newEnd);
> +  MOZ_ASSERT(keepComposingStartOffset <= keepComposingEndOffset,
> +    "Why keepComposingEndOffset is smaller than keepComposingStartOffset?");

The composition range should have at least one overlap here, so the condition should be |keepComposingStartOffset < keepComposingEndOffset|. Please also change the message accordingly.

@@ +1849,5 @@
> +  LONG keepComposingEndOffset = std::min(mComposition.EndOffset(), newEnd);
> +  MOZ_ASSERT(keepComposingStartOffset <= keepComposingEndOffset,
> +    "Why keepComposingEndOffset is smaller than keepComposingStartOffset?");
> +  LONG keepComposingLength = keepComposingEndOffset - keepComposingStartOffset;
> +  // Remove the not committing part from the commit string.

"Remove the overlapped part from the commit string."

@@ +1853,5 @@
> +  // Remove the not committing part from the commit string.
> +  nsAutoString commitString(mComposition.mString);
> +  commitString.Cut(keepComposingStartOffset - mComposition.mStart,
> +                   keepComposingLength);
> +  // Update composition string.

nit: "the" composition string

@@ +1864,5 @@
> +  }
> +  lockedContent.ReplaceTextWith(mComposition.mStart,
> +                                mComposition.mString.Length(),
> +                                commitString);
> +  // Record compositionupdate action for commiting the part of composing string.

"Record a compositionupdate action to commit the part of the composing string."

@@ +1884,5 @@
> +                                commitString.Length(),
> +                                oldComposition.mString);
> +  currentSelection = oldSelection;
> +
> +  // Record compositionstart action with new range

nit: with "the" new range
Attachment #8478866 - Flags: review?(VYV03354) → review+
> "If the new range has an overlap with the current one, we should commit only the overlapped range to avoid creating an odd undo transaction."

Ah, we should "not" commit the overlapped range.
(In reply to Masatoshi Kimura [:emk] from comment #23)
> @@ +1847,5 @@
> > +  // Commit only the part of composition.
> > +  LONG keepComposingStartOffset = std::max(mComposition.mStart, newStart);
> > +  LONG keepComposingEndOffset = std::min(mComposition.EndOffset(), newEnd);
> > +  MOZ_ASSERT(keepComposingStartOffset <= keepComposingEndOffset,
> > +    "Why keepComposingEndOffset is smaller than keepComposingStartOffset?");
> 
> The composition range should have at least one overlap here, so the
> condition should be |keepComposingStartOffset < keepComposingEndOffset|.
> Please also change the message accordingly.

I think that new range which is passed from ITfContextOwnerCompositionSink::OnUpdateComposition() *could* be empty range. So, I think that "=" is possible. How about you?

I guess that if IME attempts to switch composition string style from on-the-spot to over-the-spot like Chinese IME or moving composition string to different position during over-the-spot, it might be possible.
Flags: needinfo?(VYV03354)
(In reply to Masatoshi Kimura [:emk] from comment #24)
> > "If the new range has an overlap with the current one, we should commit only the overlapped range to avoid creating an odd undo transaction."
> 
> Ah, we should "not" commit the overlapped range.

we should not commit whole current range to avoid...?
(In reply to Masayuki Nakano (:masayuki) (Mozilla Japan) from comment #25)
> (In reply to Masatoshi Kimura [:emk] from comment #23)
> > @@ +1847,5 @@
> > > +  // Commit only the part of composition.
> > > +  LONG keepComposingStartOffset = std::max(mComposition.mStart, newStart);
> > > +  LONG keepComposingEndOffset = std::min(mComposition.EndOffset(), newEnd);
> > > +  MOZ_ASSERT(keepComposingStartOffset <= keepComposingEndOffset,
> > > +    "Why keepComposingEndOffset is smaller than keepComposingStartOffset?");
> > 
> > The composition range should have at least one overlap here, so the
> > condition should be |keepComposingStartOffset < keepComposingEndOffset|.
> > Please also change the message accordingly.
> 
> I think that new range which is passed from
> ITfContextOwnerCompositionSink::OnUpdateComposition() *could* be empty
> range. So, I think that "=" is possible. How about you?
> 
> I guess that if IME attempts to switch composition string style from
> on-the-spot to over-the-spot like Chinese IME or moving composition string
> to different position during over-the-spot, it might be possible.

Oh, I missed that the range can be empty before we take a intersect. You are right.
Flags: needinfo?(VYV03354)
https://hg.mozilla.org/mozilla-central/rev/3dc7616b2544
Status: ASSIGNED → RESOLVED
Last Resolved: 4 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla34
Summary: [TSF] Microsoft IME puts character twice when tsf is turned on → [TSF] Microsoft IME puts character twice of the first character of new composition when you start new composition with already converted string
Summary: [TSF] Microsoft IME puts character twice of the first character of new composition when you start new composition with already converted string → [TSF] Microsoft IME puts twice the first character of new composition when you start new composition when there is old composition
You need to log in before you can comment on or make changes to this bug.