Crash in `Microsoft::WRL::Module<T>::Create''::`2''::`dynamic atexit destructor for ''moduleSingleton'''' (win10 creators update)

RESOLVED FIXED in Firefox 55

Status

()

Core
Widget: Win32
P1
critical
RESOLVED FIXED
6 months ago
a month ago

People

(Reporter: marcia, Assigned: masayuki)

Tracking

({crash})

55 Branch
mozilla56
Unspecified
Windows 10
crash
Points:
---
Bug Flags:
qe-verify -

Firefox Tracking Flags

(platform-rel ?, firefox-esr52 wontfix, firefox53 wontfix, firefox54 wontfix, firefox55 fixed, firefox56 fixed)

Details

(Whiteboard: tpi:+ [platform-rel-Microsoft][platform-rel-Windows][tbird crash], crash signature)

Attachments

(1 attachment)

This bug was filed from the Socorro interface and is report bp-a4c73517-54b1-42f4-8370-0a34d0170501.
=============================================================

This bug has been at the top of crash stats, but looks like no one filed it - not sure of the component...all Windows 10 crashes. Currently #7 top Windows browser crash.

In other versions, but fairly high in 55 relative to release: http://bit.ly/2pBbP46. 85 crashes on nightly and 113 on Release in the last 7 days.

Comments:

Seemingly random crash.

Comment 1

5 months ago
This looks like a Creators Update crash in TSF. Masayuki, any idea?
Flags: needinfo?(masayuki)
Hmm, I have no idea, how about you, Kato-san?
Flags: needinfo?(masayuki) → needinfo?(m_kato)
Comment hidden (spam)

Comment 4

5 months ago
Ah, this is the Creators Update. Sorry for the spam.

Comment 5

5 months ago
Hmm, textinputframework (for touch?) has a bug triggered by focus/blur, but I can't determine the root cause from these stacks.

Updated

5 months ago
Priority: -- → P2
Whiteboard: tpi:+

Comment 6

5 months ago
:dees, do you have a contact for Microsoft's Windows core team?   This seems to be Microsoft's bug and this is TOP #20 bug on Nightly.
Flags: needinfo?(dchinniah)
(In reply to Makoto Kato [:m_kato] from comment #6)
> :dees, do you have a contact for Microsoft's Windows core team?   This seems
> to be Microsoft's bug and this is TOP #20 bug on Nightly.

We have a Microsoft DL and I believe that :marcia reached out to them overnight. They have yet to respond...
Flags: needinfo?(dchinniah) → needinfo?(mozillamarcia.knous)
platform-rel: --- → ?
Whiteboard: tpi:+ → tpi:+ [platform-rel-Microsoft][platform-rel-Windows]
Yes, no response yet. Will update the bug when there is a response.
Flags: needinfo?(mozillamarcia.knous)

Comment 9

5 months ago
The platform version breakdown seems relevant - this appears to be something new in creators update. 

https://crash-stats.mozilla.com/search/?signature=%3D%60Microsoft%3A%3AWRL%3A%3AModule%3CT%3E%3A%3ACreate%27%27%3A%3A%602%27%27%3A%3A%60dynamic%20atexit%20destructor%20for%20%27%27moduleSingleton%27%27%27%27&product=Firefox&date=%3E%3D2017-05-11T23%3A17%3A00.000Z&date=%3C2017-05-18T23%3A17%3A00.000Z&_sort=-date&_facets=signature&_facets=platform_version&_columns=date&_columns=signature&_columns=product&_columns=version&_columns=build_id&_columns=platform#facet-platform_version

Updated

5 months ago
Priority: P2 → P1
We did hear back from them once, but I pinged them again today to get a status update.
Nightly has about 100 crashes in the last seven days and Firefox 53.0.3 has 231.
We have fixed a lot of bugs of our TSF module in Nightly. However, the crash still occurs with the latest Nightly build.

So I have no idea how to avoid this crash from outside. The crash occurs while we handle a message that was received with neither mozilla::widget::WinUtils::PeekMessageW nor CThreadInputMgr::PeekMessageW in the stack. I don't understand why.
I see some direct calls to the PeekMessage API:
https://searchfox.org/mozilla-central/source/ipc/chromium/src/base/message_pump_win.cc
https://searchfox.org/mozilla-central/source/media/webrtc/trunk/webrtc/base/win32socketserver_unittest.cc
https://searchfox.org/mozilla-central/source/dom/gamepad/windows/WindowsGamepad.cpp
https://searchfox.org/mozilla-central/source/ipc/glue/WindowsMessageLoop.cpp
https://searchfox.org/mozilla-central/source/xpcom/threads/HangMonitor.cpp

And calls to the GetMessage API:
https://searchfox.org/mozilla-central/source/widget/windows/nsWindow.cpp
https://searchfox.org/mozilla-central/source/mozglue/misc/StackWalk.cpp
https://searchfox.org/mozilla-central/source/gfx/skia/skia/src/views/win/skia_win.cpp
https://searchfox.org/mozilla-central/source/media/webrtc/trunk/webrtc/base/win32socketserver.cc

They do the wrong thing, because a TSF-aware application should use ITfMessagePump to get or peek messages from the queue. So I guess that some of them cause unexpected behavior in the TSF framework or in a TIP (an IME for TSF). As far as possible, we should rewrite them to use WinUtils::(Get|Peek)Message(). However, some of them are in skia, webrtc and chromium, and I don't know whether we can change those.
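To illustrate the idea of routing every message fetch through one TSF-aware entry point, here is a minimal, platform-independent sketch. It is not the Gecko code: `Message`, `PeekResult`, `TsfPumpStub` and `PeekMessageTsfAware` are hypothetical names, and the stub only models the role that ITfMessagePump plays, not its real interface.

```cpp
#include <cassert>
#include <deque>
#include <string>

// Hypothetical stand-in for a Win32 MSG; only an id is modeled here.
struct Message {
  unsigned id;
};

// A retrieved message plus the path that delivered it, so routing is visible.
struct PeekResult {
  Message msg;
  std::string via;  // "tsf-pump" or "raw-queue"
};

// Hypothetical stand-in for the role ITfMessagePump plays: it gets to see
// each message before the application does.
class TsfPumpStub {
 public:
  explicit TsfPumpStub(std::deque<Message>* queue) : queue_(queue) {
    assert(queue_ != nullptr);
  }
  // Returns false when the queue is empty, otherwise pops one message.
  bool Peek(Message* out) {
    if (queue_->empty()) return false;
    *out = queue_->front();
    queue_->pop_front();
    return true;
  }

 private:
  std::deque<Message>* queue_;
};

// Single choke point for message retrieval. The raw queue is read directly
// only when no TSF pump is available.
bool PeekMessageTsfAware(std::deque<Message>& queue, TsfPumpStub* pump,
                         PeekResult* out) {
  if (pump != nullptr) {
    if (!pump->Peek(&out->msg)) return false;
    out->via = "tsf-pump";
    return true;
  }
  if (queue.empty()) return false;
  out->msg = queue.front();
  queue.pop_front();
  out->via = "raw-queue";
  return true;
}
```

The design point is only that call sites never read the queue themselves, so a component that bypasses the wrapper is easy to spot in review.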
Depends on: 1369419
Hmm, unfortunately, the fix for bug 1369419 couldn't help with this.

Comment 16

5 months ago
I used a clipboard manager when I tried to paste some text into a textfield when I got this crash:
https://crash-stats.mozilla.com/report/index/fbd8ad42-dea9-4317-a040-d5ff70170606

Comment 17

5 months ago
Hey Henrik, what clipboard manager were you using?
Flags: needinfo?(bugzilla)
Yoshida-san:

This crash occurs in textinputframework.dll. The crash signature looks really weird to us:
https://crash-stats.mozilla.com/report/index/f53ae844-412c-4651-ad3d-c887d0170606

When we call ::CallWindowProcW() (we're not sure what message is being handled), TSF crashes internally. I'd be happy if you'd check this too.
Looking
Comment hidden (typo)
Note this is the #7 Windows top crash for the June 6 nightly, with 16 distinct installations. Nightly has about 136 crashes in the last seven days.
Anyone know how I can get access to dumps?

Comment 23

4 months ago
(In reply to Jim Mathies [:jimm] from comment #17)
> Hey Henrik, what clipboard manager were you using?

I'm using Ditto, but I can't reproduce it again.
Flags: needinfo?(bugzilla)

Updated

4 months ago
Summary: Crash in `Microsoft::WRL::Module<T>::Create''::`2''::`dynamic atexit destructor for ''moduleSingleton'''' → Crash in `Microsoft::WRL::Module<T>::Create''::`2''::`dynamic atexit destructor for ''moduleSingleton'''' (win10 creators update)
Whiteboard: tpi:+ [platform-rel-Microsoft][platform-rel-Windows] → tpi:+ [platform-rel-Microsoft][platform-rel-Windows][tbird crash]
(In reply to ebadger@microsoft.com from comment #22)
> Anyone know how I can get access to dumps?

For privacy reasons we generally do not grant minidump access to non-Mozilla personnel. We can provide you with a specific minidump if the originating user consents to their minidump being shared with you.

Is there anything in particular that you would want to look for in these minidumps? Perhaps somebody within Mozilla could assist.
Flags: needinfo?(ebadger)
looking for a dump so that I can see the call stack for Windows components
Flags: needinfo?(ebadger)
(In reply to ebadger@microsoft.com from comment #25)
> looking for a dump so that I can see the call stack for Windows components

Just so everyone is on the same page, MS now has a dump - we were able to get permission from a user that was hitting the crash.
Adding 56 as affected. Currently #4 top browser crash on 56. Eric - any update on the dump we provided?

Also in early 55 beta this is #14 top crash.
status-firefox56: --- → affected
Flags: needinfo?(ebadger)
yes, bug was fixed - thanks for the dump
fix is ~3 weeks from being flighted on the insider program

The root cause is that when focus is placed in a control, we immediately query for the selection position. That selection query returns a failure code, and we die. If you can ensure that you can always return a selection range in response to a focus change, that will also solve it.
Flags: needinfo?(ebadger)
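The suggestion above can be pictured with a small, platform-independent sketch. Everything here is an assumption for illustration: `Result`, `SelectionRange`, `TextStoreState` and both functions are hypothetical, and the collapsed range at offset 0 is just one possible placeholder, not necessarily what the real patch returns.

```cpp
#include <cassert>

// Hypothetical result codes standing in for HRESULT values such as E_FAIL.
enum class Result { Ok, Fail };

struct SelectionRange {
  long start;
  long end;
};

// Hypothetical snapshot of a text store: the selection may not be known yet,
// for example right after a focus change.
struct TextStoreState {
  bool selectionReady;
  SelectionRange selection;
};

// Strict variant: fails when the selection is not ready. Per the comment
// above, a failure at this point is what the caller could not tolerate.
Result GetSelectionStrict(const TextStoreState& state, SelectionRange* out) {
  assert(out != nullptr);
  if (!state.selectionReady) return Result::Fail;
  *out = state.selection;
  return Result::Ok;
}

// Tolerant variant: always yields a range. When the real selection is not
// ready, it reports a collapsed placeholder range and flags that it did so.
Result GetSelectionWithFallback(const TextStoreState& state,
                                SelectionRange* out, bool* usedFallback) {
  assert(out != nullptr && usedFallback != nullptr);
  *usedFallback = (GetSelectionStrict(state, out) != Result::Ok);
  if (*usedFallback) {
    *out = SelectionRange{0, 0};  // assumed placeholder: caret at offset 0
  }
  return Result::Ok;
}
```

The `usedFallback` flag is there so a caller can log that a placeholder was returned, which makes it possible to tell from logs whether the workaround path was actually taken.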
(In reply to ebadger@microsoft.com from comment #28)
> The root cause is that when focus is placed in a control, we immediately
> query for the selection position. That selection query returns a failure
> code, and we die. If you can ensure that you can always return a selection
> range in response to a focus change, that will also solve it.

Thank you for the information. I think that it's possible.

On the other hand, IIRC, we do so now for a crash bug of MS-IME... I'll check it ASAP.
https://treeherder.mozilla.org/#/jobs?repo=try&revision=35a62ea8764cfde9745c80f758f78e556a3ba694
https://treeherder.mozilla.org/#/jobs?repo=try&revision=3359d63b94c0b1496da56134bfc791345463c1db
https://treeherder.mozilla.org/#/jobs?repo=try&revision=f6f16e679e621e770fc2ee00aeb1a43b83d53425
Comment hidden (mozreview-request)
https://treeherder.mozilla.org/#/jobs?repo=try&revision=d2712e35049e7412a39bf86ddd30f9d7812c823a
Comment hidden (mozreview-request)
Comment on attachment 8880327 [details]
Bug 1361132 TSFTextStore::GetSelection() shouldn't return if it runs on Win10 Anniversary Update or later

Hmm, I cannot reproduce bug 1312302 with the patched build (if it is reproduced, "TSFTextStore::GetSelection() returns fake selection range for avoiding a crash in TSF" is logged into the log file (nsTextStoreWidgets:3)).

Can you check it before review? (Use the last try build to test it.)
Attachment #8880327 - Flags: feedback?(m_kato)
Assignee: nobody → masayuki
Status: NEW → ASSIGNED
Comment on attachment 8880327 [details]
Bug 1361132 TSFTextStore::GetSelection() shouldn't return if it runs on Win10 Anniversary Update or later

(In reply to Masayuki Nakano [:masayuki] (JST, +0900) from comment #36)
> Comment on attachment 8880327 [details]
> Bug 1361132 TSFTextStore::GetSelection() shouldn't return if it runs on
> Win10 Anniversary Update or later and build 16215 or earlier.
> 
> Hmm, I cannot reproduce bug 1312302 with the patched build (if it is
> reproduced, "TSFTextStore::GetSelection() returns fake selection range for
> avoiding a crash in TSF" is logged into the log file (nsTextStoreWidgets:3)).

I cannot reproduce it without the patch now, since we don't have the same environment...
 
> Can you check it before review? (Use the last try build to test it.)

I verified it on a touch-enabled device.
Attachment #8880327 - Flags: feedback?(m_kato) → feedback+
Attachment #8880327 - Flags: review?(m_kato)
Thank you. Then, could you review this? After dogfooding the patch over this weekend, I'll land it.

Comment 39

4 months ago
mozreview-review
Comment on attachment 8880327 [details]
Bug 1361132 TSFTextStore::GetSelection() shouldn't return if it runs on Win10 Anniversary Update or later

https://reviewboard.mozilla.org/r/151692/#review157000

We might need to handle the TS_E_NOSELECTION case, but this is OK with me.
Attachment #8880327 - Flags: review?(m_kato) → review+
(In reply to Makoto Kato [:m_kato] from comment #39)
> We might need to handle the TS_E_NOSELECTION case, but this is OK with me.

Hmm, I think that both are safe at least for bug 1312302 since even if selection is not dirty when this is called, it returns the error in these cases.

# And I realized that I need to update the summary, because the patch stopped checking the fixed build number.
Comment hidden (mozreview-request)

Comment 42

4 months ago
Pushed by masayuki@d-toybox.com:
https://hg.mozilla.org/integration/autoland/rev/27778b8bc4cf
TSFTextStore::GetSelection() shouldn't return if it runs on Win10 Anniversary Update or later r=m_kato

Comment 43

4 months ago
bugherder
https://hg.mozilla.org/mozilla-central/rev/27778b8bc4cf
Status: ASSIGNED → RESOLVED
Last Resolved: 4 months ago
status-firefox56: affected → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla56
status-firefox53: affected → wontfix
status-firefox54: affected → wontfix
status-firefox-esr52: --- → affected

Comment 44

4 months ago
thank you, the crash signature is indeed gone in the latest nightly builds after this patch has landed. can you request an uplift once you deem fit to do so?
Yeah, it looks like it's really gone:
https://crash-stats.mozilla.com/signature/?product=Firefox&build_id=%3E%3D20170627000000&signature=%60Microsoft%3A%3AWRL%3A%3AModule%3CT%3E%3A%3ACreate%27%27%3A%3A%602%27%27%3A%3A%60dynamic%20atexit%20destructor%20for%20%27%27moduleSingleton%27%27%27%27&date=%3E%3D2017-06-26T02%3A30%3A52.000Z

I'll request uplift after checking whether it's graftable.
Comment on attachment 8880327 [details]
Bug 1361132 TSFTextStore::GetSelection() shouldn't return if it runs on Win10 Anniversary Update or later

Approval Request Comment
[Feature/Bug causing the regression]:
Regression of Win10 Anniversary Update, not ours.

[User impact if declined]:
This is the 68th top crash on 54.

[Is this code covered by automated tests?]:
No.

[Has the fix been verified in Nightly?]:
Yes. There are no crash reports after 20170627xxxxxx.

[Needs manual test from QE? If yes, steps to reproduce]: 
No because nobody is sure how to reproduce this crash.

[List of other uplifts needed for the feature/fix]:
No.

[Is the change risky?]:
As far as I've tested, no.

[Why is the change risky/not risky?]:
Bug 1312302 is the same crash, and its patch avoided the crash on only one specific path. This patch extends the fix to every path. (The TSF DLL crashes when we return E_FAIL because the selection is not ready yet. These patches return a fake selection range instead. Additionally, this patch restricts the workaround to Win10 Anniversary Update and later.)

[String changes made/needed]:
No.
Attachment #8880327 - Flags: approval-mozilla-beta?
Comment on attachment 8880327 [details]
Bug 1361132 TSFTextStore::GetSelection() shouldn't return if it runs on Win10 Anniversary Update or later

[Approval Request Comment]
If this is not a sec:{high,crit} bug, please state case for ESR consideration:
This bug is not a top crash on ESR52.2.0, but it is the 26th top crash for Thunderbird users. We could take this only for Thunderbird, but I think we should take it into ESR52 to avoid this Windows 10 crash.

User impact if declined:
Users on Win10 Anniversary Update or later may hit this crash when focus changes or when a window is opened or closed.

Fix Landed on Version:
56.

Risk to taking this patch (and alternatives if risky):
See previous comment (requesting to uplift to Beta55).

String or UUID changes made by this patch:
No.


Note that if we take this into ESR52, we also need to uplift the patch for bug 1368150, because for risk management this patch enables the hack only on specific OS versions.
Attachment #8880327 - Flags: approval-mozilla-esr52?
Depends on: 1368150
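As a rough sketch of what gating the workaround by OS version could look like (this is not the helper from bug 1368150): `OsVersion` and `IsWin10AnniversaryUpdateOrLater` are hypothetical names, and the check assumes the publicly documented build number 14393 for the Win10 Anniversary Update.

```cpp
#include <cassert>

// Hypothetical version triple; real code would obtain this from the OS.
struct OsVersion {
  unsigned major;
  unsigned minor;
  unsigned build;
};

// Assumption: Win10 Anniversary Update (version 1607) is 10.0 build 14393.
const unsigned kWin10AnniversaryBuild = 14393;

// True for Win10 Anniversary Update and anything newer, so the workaround
// stays disabled on older systems that do not show the crash.
inline bool IsWin10AnniversaryUpdateOrLater(const OsVersion& v) {
  if (v.major != 10) return v.major > 10;
  if (v.minor != 0) return true;
  return v.build >= kWin10AnniversaryBuild;
}
```

Keeping the check in one predicate means the workaround can be narrowed later (for example, to stop at a build where the OS-side fix ships) without touching the call sites.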
Comment on attachment 8880327 [details]
Bug 1361132 TSFTextStore::GetSelection() shouldn't return if it runs on Win10 Anniversary Update or later

crash fix, beta55+
Attachment #8880327 - Flags: approval-mozilla-beta? → approval-mozilla-beta+

Comment 49

4 months ago
bugherder uplift
https://hg.mozilla.org/releases/mozilla-beta/rev/ec2e245d18d1
status-firefox55: affected → fixed
Comment on attachment 8880327 [details]
Bug 1361132 TSFTextStore::GetSelection() shouldn't return if it runs on Win10 Anniversary Update or later

Since this is not a topcrash in ESR52, we can take this only for Thunderbird. ESR52-.
Attachment #8880327 - Flags: approval-mozilla-esr52? → approval-mozilla-esr52-
status-firefox-esr52: affected → wontfix

Comment 51

2 months ago
THUNDERBIRD_52_VERBRANCH:
https://hg.mozilla.org/releases/mozilla-esr52/rev/459150ea79f6be75c0228450b2ae15c28b5fbe1b
(In reply to Masayuki Nakano [:masayuki] (JST, +0900)(offline: 9/13 ~ 9/18) from comment #46)
> [Needs manual test from QE? If yes, steps to reproduce]: 
> No because nobody is sure how to reproduce this crash.

Since there is no proper way of reproducing this crash and based on Masayuki's assessment on manual testing needs, marking it as qe-verify-.
Flags: qe-verify-