Closed Bug 1052343 Opened 7 years ago Closed 7 years ago

[TSF] crash in msctf.dll@0x2efce

Categories

(Core :: Widget: Win32, defect)

x86
Windows NT
defect
Not set
critical

Tracking

()

VERIFIED FIXED
mozilla35
Tracking Status
firefox34 --- affected

People

(Reporter: masayuki, Assigned: masayuki)

References

()

Details

(Keywords: crash, regression)

Crash Data

Attachments

(8 files)

205.07 KB, text/plain
Details
7.71 KB, patch
Details | Diff | Splinter Review
13.85 KB, patch
emk
: review+
masayuki
: checkin+
Details | Diff | Splinter Review
9.26 KB, patch
emk
: review+
Details | Diff | Splinter Review
48.20 KB, patch
emk
: review+
Details | Diff | Splinter Review
3.49 KB, patch
emk
: review+
Details | Diff | Splinter Review
14.45 KB, patch
emk
: review+
Details | Diff | Splinter Review
5.56 KB, patch
emk
: review+
Details | Diff | Splinter Review
This bug was filed from the Socorro interface and is 
report bp-1257ffcb-7746-4b57-9102-692022140812.
=============================================================

1. Load the data URL.
2. Click blank area of the <textarea>
3. Type "Enter".

Then, Nightly crashes.

This must be a regression for a week. I've never reproduced this bug at testing bug 1026997 (according to the bug 1026997 comment 31 - bug 1026997 comment 34, I could reproduce this crash on 8/5 - 8/6).
I tested again with mozregression but got same result...
Oh, it's strange. 8/3's m-c build has this bug if TSF mode is enabled. I'm not sure why mozregression indicates the regression range as https://hg.mozilla.org/integration/mozilla-inbound/pushloghtml?changeset=ad532252ff9e ...
Regression window with force tsf on.
i.e.,
user_pref("intl.tsf.enable", true);
user_pref("intl.tsf.force_enable", true);


Regression window(m-c)
Good:
http://hg.mozilla.org/mozilla-central/rev/34dddf6f7ec1
Mozilla/5.0 (Windows NT 6.1; WOW64; rv:29.0) Gecko/20100101 Firefox/29.0 ID:20140114030236
Bad:
http://hg.mozilla.org/mozilla-central/rev/be9f41887791
Mozilla/5.0 (Windows NT 6.1; WOW64; rv:29.0) Gecko/20100101 Firefox/29.0 ID:20140114064535
Pushlog:
http://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=34dddf6f7ec1&tochange=be9f41887791


Regression window(m-i)
Good:
http://hg.mozilla.org/integration/mozilla-inbound/rev/044a089451c8
Mozilla/5.0 (Windows NT 6.1; WOW64; rv:29.0) Gecko/20100101 Firefox/29.0 ID:20140113175851
Bad:
http://hg.mozilla.org/integration/mozilla-inbound/rev/56339de6e744
Mozilla/5.0 (Windows NT 6.1; WOW64; rv:29.0) Gecko/20100101 Firefox/29.0 ID:20140113180151
Pushlog:
http://hg.mozilla.org/integration/mozilla-inbound/pushloghtml?fromchange=044a089451c8&tochange=56339de6e744

Regressed by: Bug 951966
Regression window with force tsf on.
i.e.,
user_pref("intl.tsf.enable", true);
user_pref("intl.tsf.force_enable", true);
and also
user_pref("intl.enable_tsf_support", true);


Regression window(m-c)
Good:
http://hg.mozilla.org/mozilla-central/rev/0a29ac059eb5
Mozilla/5.0 (Windows NT 6.1; WOW64; rv:2.0b8pre) Gecko/20101124 Firefox/4.0b8pre ID:20101124173846
Bad:
http://hg.mozilla.org/mozilla-central/rev/32cd5d95a95f
Mozilla/5.0 (Windows NT 6.1; WOW64; rv:2.0b8pre) Gecko/20101125 Firefox/4.0b8pre ID:20101125030318
Pushlog:
http://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=0a29ac059eb5&tochange=32cd5d95a95f
In local build
Last Good:  b890963a137e
First Bad:  34447b39697a
Regressed by:
  34447b39697a	Neil Deakin — Bug 613748 - Keyboard inputs do not work when tab-modal alerts are open in another tab. r=smaug, a=drivers for API change
Blocks: 613748
Thank you very much, Alice-san!

I'll check both bugs.
Odd... After I reboot my computer, I cannot reproduce this...
bp-6357a091-bba6-47e2-bf67-6d75c2140815
Crash Signature: [@ msctf.dll@0x2efce] → [@ msctf.dll@0x2efce] [@ imetip.dll@0x56788]
various crash sig on Windows8.1 + MSIME2012 with
user_pref("intl.tsf.enable", true);
user_pref("intl.tsf.force_enable", true);
user_pref("intl.enable_tsf_support", true);


Nightly34.0a1
bp-0f544cd5-35f7-46f2-bffa-f9de12140816
[@ ntdll.dll@0x25645 ]

bp-c167045a-0f2a-4d1d-9cf8-6cb612140816
[@ tiptsf.dll@0xe880 ]

bp-c02fa57a-946c-4bb2-a812-214ad2140816
[@ msctf.dll@0x2efce ]

bp-6dc33ce6-16c0-4f96-8711-d766b2140816
[@ msctf.dll@0x2efce ]

bp-efacb665-4f7e-4e86-8b08-d958d2140816
[@ msctf.dll@0x2efce ]

bp-7861479a-9a18-4c29-90d1-7ebb82140816
[@ msctf.dll@0x2efce ]

bp-6d025b4a-a3b3-4984-926a-943352140816
[@ msctf.dll@0x1c36c ]

Aurora33.0a2
bp-c5d47bca-b292-4ecd-a639-609ba2140816
[@ msctf.dll@0x2efce ]

Beta32.0b7
bp-719f31b8-0789-443e-a790-ef90e2140816
[@ msctf.dll@0x1c36c ]

Firefox31.0
bp-2dbf65d4-e322-4318-a216-2adad2140816
[@ msctf.dll@0x2efce ]

Firefox24esr
bp-4de6fc46-160f-4cc1-a76a-4dae32140816
[@ msctf.dll@0x1c36c ]

Firefox17esr
bp-20ef0f64-1037-4502-8011-3585b2140816
[@ @0x0 | imetip.dll@0xddbf ]

Firefox10esr
bp-6e286079-5252-4a83-86c0-454a32140816
[@ imetip.dll@0xddbf ]

Firefox4
bp-3c70d850-e50f-414b-971b-e483a2140816
[@ combase.dll@0x113ba2 ]
I've never reproduced this bug since I filed this even with my PC running about for a week.

Alice-san, do you still reproduce this with the STR? Could the fix of bug 1053048 hides this bug.
Flags: needinfo?(alice0775)
I can still reproduce the crash.

Windows7 + MSIME2010 
bp-7270add2-9040-46cf-a84b-c8694214082
https://hg.mozilla.org/mozilla-central/rev/dac8b4a0bd7c
Mozilla/5.0 (Windows NT 6.1; WOW64; rv:34.0) Gecko/20100101 Firefox/34.0 ID:20140821030201

Windows8.1 + MSIME2012
bp-85099473-b8ce-4b4d-87f4-6d2e82140821
https://hg.mozilla.org/mozilla-central/rev/dac8b4a0bd7c
Mozilla/5.0 (Windows NT 6.3; WOW64; rv:34.0) Gecko/20100101 Firefox/34.0 ID:20140821030201
Flags: needinfo?(alice0775)
Sorry, paste err.

Windows7 + MSIME2010 
bp-7270add2-9040-46cf-a84b-c86942140821
Attached file log.txt
set NSPR_LOG_MODULES=nsTextStoreWidgets:4
log when crashed bp-46aae54b-cf01-47ff-af8e-796522140821
Thanks, I'll keep testing this bug.
Okay, I succeeded to reproduce this bug again!

> 0[280f140]: TSF: 0x9b522f0   nsTextStore::OnSelectionChangeInternal(), mSink=0x95cf8e4, mSinkMask=TS_AS_TEXT_CHANGE | TS_AS_SEL_CHANGE | TS_AS_LAYOUT_CHANGE | TS_AS_ATTR_CHANGE | TS_AS_STATUS_CHANGE, mIsRecordingActionsWithoutLock=false, mComposition.IsComposing()=false
> 0[280f140]: TSF: 0x9b522f0   nsTextStore::OnLayoutChangeInternal(), calling mSink->OnLayoutChange()...
> 0[280f140]: TSF: 0x9b522f0 nsTextStore::RequestLock(dwLockFlags=TS_LF_READ, phrSession=0x95d618), mLock=TS_LF_READ
> 0[280f140]: TSF: 0x9b522f0   nsTextStore::RequestLock() didn't allow to lock, *phrSession=TS_E_SYNCHRONOUS
> 0[280f140]: TSF: 0x9b522f0   nsTextStore::OnLayoutChangeInternal(), calling mSink->OnLayoutChange()...
> 0[280f140]: TSF: 0x9b522f0 nsTextStore::RequestLock(dwLockFlags=TS_LF_READ, phrSession=0x95d7f8), mLock=TS_LF_READ
> 0[280f140]: TSF: 0x9b522f0   nsTextStore::RequestLock() didn't allow to lock, *phrSession=TS_E_SYNCHRONOUS
> 0[280f140]: TSF: 0x9b522f0   nsTextStore::CurrentSelection(): acpStart=5, acpEnd=5 (length=0), reverted=false
> 0[280f140]: TSF: 0x9b522f0   nsTextStore::GetSelection() succeeded

I think that this is very good hint.

If this is a bug of TSF, we shouldn't call OnLayoutChange() during a document lock because requesting lock won't work.

Otherwise, we actually break heap at retrieving selection.
Attached patch Ugly patchSplinter Review
Okay, I see the cause.

> 0[280f140]: TSF: 0x9b522f0   Locking (TS_LF_READ) >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> 0[280f140]: TSF: 0x9b522f0 nsTextStore::GetStatus(pdcs=0x95e1e8)
> 0[280f140]: TSF: 0x9b522f0 nsTextStore::GetActiveView(pvcView=0x95e1cc)
> 0[280f140]: TSF: 0x9b522f0   nsTextStore::GetActiveView() succeeded: *pvcView=1
> 0[280f140]: TSF: 0x9b522f0 nsTextStore::GetWnd(vcView=1, phwnd=0x95e204), mWidget=0xa731400
> 0[280f140]: TSF: 0x9b522f0   nsTextStore::GetWnd() succeeded: *phwnd=0x1102d4
> 0[280f140]: TSF: 0x9b522f0 nsTextStore::GetStatus(pdcs=0x95e134)
> 0[280f140]: TSF: 0x9b522f0 nsTextStore::GetSelection(ulIndex=4294967295, ulCount=1, pSelection=0x95e064, pcFetched=0x95e114)
> 0[280f140]: TSF:   nsTextStore::OnFocusChange(aGotFocus=false, aFocusedWidget=0xa731400, aIMEState={ mEnabled=ENABLED }), sTsfThreadMgr=0x95a4140, sTsfTextStore=0x9b522f0
> 0[280f140]: TSF: 0x9b522f0 nsTextStore::GetStatus(pdcs=0x95d4e0)
> 0[280f140]: TSF: 0x9b522f0 nsTextStore::GetStatus(pdcs=0x95cdbc)
> 0[280f140]: TSF: 0x9b522f0 nsTextStore::GetActiveView(pvcView=0x95cd94)
> 0[280f140]: TSF: 0x9b522f0   nsTextStore::GetActiveView() succeeded: *pvcView=1
> 0[280f140]: TSF: 0x9b522f0 nsTextStore::GetWnd(vcView=1, phwnd=0x95cdb8), mWidget=0xa731400
> 0[280f140]: TSF: 0x9b522f0   nsTextStore::GetWnd() succeeded: *phwnd=0x1102d4
> 0[280f140]: TSF: 0x9b522f0 nsTextStore::GetStatus(pdcs=0x95cc78)
> 0[280f140]: TSF: 0x9b522f0 nsTextStore::RequestLock(dwLockFlags=TS_LF_READWRITE, phrSession=0x95cc24), mLock=TS_LF_READ
> 0[280f140]: TSF: 0x9b522f0   nsTextStore::RequestLock() stores the request in the queue, *phrSession=TS_S_ASYNC
> 0[280f140]: TSF: 0x9b522f0 nsTextStore::Destroy(), mComposition.IsComposing()=false
> 0[280f140]: TSF: 0x9b522f0 nsTextStore::UnadviseSink(punk=0x95cf5cc), mSink=0x95cf5cc
> 0[280f140]: TSF: 0x9b522f0   nsTextStore::Destroy() succeeded
> 0[280f140]: TSF:   nsTextStore::OnFocusChange(aGotFocus=true, aFocusedWidget=0xa731400, aIMEState={ mEnabled=ENABLED }), sTsfThreadMgr=0x95a4140, sTsfTextStore=0x9b522f0
> 0[280f140]: TSF: 0x9b522f0 nsTextStore::Create(aWidget=0xa731400)

By reframing the <textarea> element, nsTextStore is also recreated during document lock. This is the cause of this bug. If I put off destroying and recreating textstore, I succeeded to avoid this bug.

However, this patch is ugly. I think that we should make nsTextStore non-singleton.
Status: NEW → ASSIGNED
Summary: [TSF]? crash in msctf.dll@0x2efce → [TSF] crash in msctf.dll@0x2efce
The scenario what should we do in this bug is that we should make nsTextStore multi-instanceable and during the document lock, it should be grabbed during it locks the document. Then, nsTextStore can keep necessary TSF internal instances during a lock.

This patch just renames sTsfTextStore which is a pointer to the only instance of nsTextStore to sEnabledTextStore for storing "current" nsTextStore instance managing focused editor.
Attachment #8482133 - Flags: review?(VYV03354)
Current code assumes there is always existing nsTextStore intance if it's in TSF mode.

However, it will be false. Therefore, we should fix it.

1. mNativeIMEContext of IMEContext should be pointer of ITfThreadMgr because an IME context is it. nsTextStore just represents text editor. So, using ITfThreadMgr for this is better for its meaning.

2. Some static methods of nsTextStore which accesses sEnabledTextStore should check if it's nullptr.

NOTE: I don't change MetroWidget because it doesn't set mNativeIMEContext for now but it should do it. However, currently, we cannot build MetroWidget due to bug 1059626. I'll file a bug for doing it later.
Attachment #8482138 - Flags: review?(VYV03354)
Currently, we observes active TIP change with nsTextStore. However, nsTextStore will be destroyed when focused element is not an editable editor.

Therefore, we need to create new observer class whose life time is same as our process's.

This patch creates TSFStaticSink class. It's methods are just copied from the methods of nsTextStore which work for same purpose.

Moving static methods which are helper of logging from nsTextStore section to top of nsTextStore.cpp causes this diff file big. They are moved just for making usable in TSFStaticSink. So, you can ignore the change around the methods.
Attachment #8482148 - Flags: review?(VYV03354)
I found this dead code. This obstructs to make nsTextStore non-singleton. Therefore, we need to do this in this bug.
Attachment #8482153 - Flags: review?(VYV03354)
This makes nsTextStore instance created when editable editor gets focus and destroyed when editable editor loses focus.

By this, we need to change the timing of initializing InputScope. It should be initialized after nsTextStore instance is created.

And also, we need to run the blur case in OnFocusChange() first for ensuring to destroy old nsTextStore instance. Therefore, looks like nsTextStore::OnFocusChange() is rewritten. But it just reordered the code.

The if block of OnFocusChange() is separated to CreateAndSetFocus().
Attachment #8482155 - Flags: review?(VYV03354)
This is the actual fix of this bug. We should use kungFuDeathGrip at locking the document and making a call of Destroy() delayed until the document is unlocked.
Attachment #8482156 - Flags: review?(VYV03354)
What do you think about bug 329937?
Curious, on which branches do we need this? Setting tracking flags would be good.
(In reply to Masatoshi Kimura [:emk] from comment #24)
> What do you think about bug 329937?

I agree with what kungFuDeathGrip(this) is ugly.

However, at this bug, TSF doesn't grab ITextStoreACP at calling ITextStoreACP::RequestLock() actually. Therefore, any refcounted members can be released even during the document lock.

So, anyway, we need to guarantee all TSF objects stored by nsTextStore to be released during the document lock. However, I think that it doesn't make sense to grab each members temporary because it's very error prone. E.g., at adding a new member, it could be missed from the grabbing list.

Therefore, my conclusion is that we should make nsTextStore non-singleton and locking instance should never be destroyed until unlock.
(In reply to Olli Pettay [:smaug] from comment #25)
> Curious, on which branches do we need this? Setting tracking flags would be
> good.

I'm not sure what did you mean. Do you like the ugly patch better?

I think that we cannot fix this bug "correctly" with the approach. Although, the patches have not conformed to new architecture completely yet. If there is new focused editor, we should refresh the members of nsTextStore for the new editor. If we don't do so, we cannot distinguish which request from TSF is for new focused editor or old one. Use different instance for new editor makes this issue simpler.

We should research what old editor's nsTextStore instance should do if there is composition at losing focus later. However, managing both editors by an nsTextStore instance may make our implementation very complicated. Therefore, I believe that we should create nsTextStore instance at each focus event.
Ah, smaug said about tracking flag... We shouldn't block any release by TSF specific bug because it's enabled only on Nightly. So, we don't need to uplift to Aurora and other branches.
Comment on attachment 8482133 [details] [diff] [review]
part.1 Renamse nsTextStore::sTsfTextStore to nsTextStore::sEnabledTextStore

Review of attachment 8482133 [details] [diff] [review]:
-----------------------------------------------------------------

::: widget/windows/nsTextStore.h
@@ +789,3 @@
>    // although Create is called when an editor is focused and Destroy called
>    // when the focused editor is blurred.
> +  static nsTextStore* sEnabledTextStore;

Please use mozilla::StaticRefPtr.
Attachment #8482133 - Flags: review?(VYV03354) → review+
Attachment #8482138 - Flags: review?(VYV03354) → review+
Attachment #8482148 - Flags: review?(VYV03354) → review+
Attachment #8482153 - Flags: review?(VYV03354) → review+
Comment on attachment 8482155 [details] [diff] [review]
part.5 Recreate nsTextStore instance at each focus change

Review of attachment 8482155 [details] [diff] [review]:
-----------------------------------------------------------------

::: widget/windows/nsTextStore.cpp
@@ +3982,5 @@
> +  // TSF might do something which causes that we need to access static methods
> +  // of nsTextStore.  At that time, sEnabledTextStore may be necessary.
> +  // So, we should set sEnabledTextStore directly.
> +  NS_ADDREF(sEnabledTextStore = new nsTextStore());
> +  if (NS_WARN_IF(!sEnabledTextStore->Create(aFocusedWidget))) {

This method name should be changed to Initialize() or something.
Attachment #8482155 - Flags: review?(VYV03354) → review+
Comment on attachment 8482156 [details] [diff] [review]
part.6 Don't destroy nsTextStore instance during it's locking the document

Review of attachment 8482156 [details] [diff] [review]:
-----------------------------------------------------------------

::: widget/windows/nsTextStore.cpp
@@ +1263,5 @@
> +     "mComposition.IsComposing()=%s",
> +     this, GetLockFlagNameStr(mLock).get(),
> +     GetBoolName(mComposition.IsComposing())));
> +
> +  if (mLock) {

Is it possible for nsTextStore::Destroy to be called during the lock even after we have added kungFuDeathGrip?

@@ +1438,5 @@
>        ("TSF: 0x%p   Locking (%s) >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>"
>         ">>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>",
>         this, GetLockFlagNameStr(mLock).get()));
> +    // Don't release this instance during this lock.
> +    nsRefPtr<nsTextStore> kungFuDeathGrip(this);

Please add a comment explaining that the caller is not under our control.
(In reply to Masayuki Nakano (:masayuki) (Mozilla Japan) from comment #19)
> NOTE: I don't change MetroWidget because it doesn't set mNativeIMEContext
> for now but it should do it. However, currently, we cannot build MetroWidget
> due to bug 1059626. I'll file a bug for doing it later.

Please file a follow-up bug about MetroWidget.
Thank you for your review!

(In reply to Masatoshi Kimura [:emk] from comment #30)
> > +  if (NS_WARN_IF(!sEnabledTextStore->Create(aFocusedWidget))) {
> 
> This method name should be changed to Initialize() or something.

I think that it should be Init() which is usual name of XPCOM. And Initialize() is used at start up.

(In reply to Masatoshi Kimura [:emk] from comment #31)
> > +     "mComposition.IsComposing()=%s",
> > +     this, GetLockFlagNameStr(mLock).get(),
> > +     GetBoolName(mComposition.IsComposing())));
> > +
> > +  if (mLock) {
> 
> Is it possible for nsTextStore::Destroy to be called during the lock even
> after we have added kungFuDeathGrip?

It's possible because Destroy() is called by us. (OnFocusChange())
If it's possible to call it in destructor, we can make this simpler but unfortunately, we cannot do it because until we release the TSF objects from Destroy(), nsTextStore won't be released forever.
Not actually fixed, or merged, mcMerge just got confused by the commit message and thought only one of the six csets was backed out, when actually all six were and none of them are landed in mozilla-central.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Target Milestone: mozilla34 → ---
Oops, I failed to backout only part.1 but I has r+ and not actually changed the logic. So, let's keep it.
While this signature isn't showing up as a topcrash it sounds like the underlying issue is causing the crashes in bug 1053885 so we should verify this in crash-stats a few days after it lands.
Flags: qe-verify+
Attachment #8482156 - Flags: review?(VYV03354) → review+
Masayuki, I think this will land in firefox 35 since the merge already happened; could you nominate it for uplift to aurora (so that it will also go into 34)?  :)
Flags: needinfo?(masayuki)
(In reply to Liz Henry :lizzard from comment #40)
> Masayuki, I think this will land in firefox 35 since the merge already
> happened; could you nominate it for uplift to aurora (so that it will also
> go into 34)?  :)

I don't think that it's necessary because this bug occurs only when TSF is enabled. But it's enabled only in Nightly (not so in Aurora, Beta and also Release).
Flags: needinfo?(masayuki)
Ah, but the landing of part.1 might cause bustage with GCC on Windows. If so, we should back it out from Aurora.

Even if so, it will be fixed by bug 1062053.

# I cannot confirm it.
No crashes in Socorro containing this signature in more than 28 days.
Status: RESOLVED → VERIFIED
You need to log in before you can comment on or make changes to this bug.