Closed Bug 1856637 Opened 9 months ago Closed 8 months ago

Startup crash in [@ gfxDWriteFont::UpdateSystemTextVars ]

Categories

(Core :: Graphics, defect)

Firefox 118
Desktop
Windows
defect

Tracking

()

RESOLVED FIXED
120 Branch
Tracking Status
firefox-esr115 --- fixed
firefox118 + fixed
firefox119 + fixed
firefox120 --- fixed

People

(Reporter: yannis, Assigned: jfkthame)

References

Details

(Keywords: topcrash, topcrash-startup, Whiteboard: [win:stability])

Crash Data

Attachments

(1 file)

This main process startup crash started spiking with the release of 118.0, coming from Windows users. We reach gfxDWriteFont::UpdateSystemTextVars but gfxVars::sInstance.mRawPtr is still a nullptr, so we crash on gfxVars::SystemTextQuality() as defined here.

Example crash report: here

Crashing stack on the main thread:

 # Child-SP          RetAddr               Call Site
00 00000057`4adfb1d0 00007ffe`0361d230     xul!gfxDWriteFont::UpdateSystemTextVars+0x66 [/builds/worker/checkouts/gecko/gfx/thebes/gfxDWriteFonts.cpp @ 139] 
01 00000057`4adfb210 00007ffe`0361a568     xul!nsWindow::ProcessMessageInternal+0x23f0 [/builds/worker/checkouts/gecko/widget/windows/nsWindow.cpp @ 5081] 
02 (Inline Function) --------`--------     xul!nsWindow::ProcessMessage+0x1ca [/builds/worker/checkouts/gecko/widget/windows/nsWindow.cpp @ 4890] 
03 00000057`4adfb4b0 00007ffe`0361a234     xul!nsWindow::WindowProcInternal+0x2e8 [/builds/worker/checkouts/gecko/widget/windows/nsWindow.cpp @ 4842] 
04 (Inline Function) --------`--------     xul!CallWindowProcCrashProtected+0x10 [/builds/worker/checkouts/gecko/xpcom/base/nsCrashOnException.cpp @ 27] 
05 00000057`4adfb730 00007ffe`92f35c1d     xul!nsWindow::WindowProc+0x34 [/builds/worker/checkouts/gecko/widget/windows/nsWindow.cpp @ 4795] 
06 00000057`4adfb780 00007ffe`92f357ec     user32!UserCallWinProcCheckWow+0x2bd
07 00000057`4adfb910 00007ffe`92f41f83     user32!DispatchClientMessage+0x9c
08 00000057`4adfb970 00007ffe`935c0464     user32!_fnDWORD+0x33
09 00000057`4adfb9d0 00007ffe`908010c4     ntdll!KiUserCallbackDispatcherContinue
0a 00000057`4adfba58 00007ffe`92f394f2     win32u!NtUserPeekMessage+0x14
0b 00000057`4adfba60 00007ffe`92f39459     user32!_PeekMessage+0x42
0c 00000057`4adfbad0 00007ffe`92dffab0     user32!PeekMessageW+0x149
0d 00000057`4adfbb40 00007ffe`92e00457     msctf!CCtfClientPort::SendAsync+0x400
...
25 00000057`4adfc8c0 00007ffe`04a21f80     msctf!CThreadInputMgr::Activate+0x17
26 00000057`4adfc8f0 00007ffe`06687378     xul!mozilla::widget::TSFTextStore::Initialize+0xe0 [/builds/worker/checkouts/gecko/widget/windows/TSFTextStore.cpp @ 6796] 
27 00000057`4adfc970 00007ffe`03611c8b     xul!mozilla::widget::IMEHandler::Initialize+0x18 [/builds/worker/checkouts/gecko/widget/windows/WinIMEHandler.cpp @ 69] 
28 (Inline Function) --------`--------     xul!mozilla::widget::IMEHandler::InitInputContext+0x25d [/builds/worker/checkouts/gecko/widget/windows/WinIMEHandler.cpp @ 490] 
29 00000057`4adfc9b0 00007ffe`035b99c8     xul!nsWindow::Create+0x71b [/builds/worker/checkouts/gecko/widget/windows/nsWindow.cpp @ 1203] 
2a 00000057`4adfcb30 00007ffe`03926913     xul!nsIWidget::Create+0x128 [/builds/worker/checkouts/gecko/widget/nsIWidget.h @ 463] 
2b (Inline Function) --------`--------     xul!mozilla::AppWindow::Initialize+0x172 [/builds/worker/checkouts/gecko/xpfe/appshell/AppWindow.cpp @ 212] 
2c 00000057`4adfcc20 00007ffe`03925b91     xul!nsAppShellService::JustCreateTopWindow+0x673 [/builds/worker/checkouts/gecko/xpfe/appshell/nsAppShellService.cpp @ 671] 
2d 00000057`4adfcde0 00007ffe`039b4049     xul!nsAppShellService::CreateTopLevelWindow+0xa1 [/builds/worker/checkouts/gecko/xpfe/appshell/nsAppShellService.cpp @ 178] 
2e 00000057`4adfce90 00007ffe`039d3782     xul!nsAppStartup::CreateChromeWindow+0xf9 [/builds/worker/checkouts/gecko/toolkit/components/startup/nsAppStartup.cpp @ 759] 
2f 00000057`4adfcf30 00007ffe`039d83cf     xul!nsWindowWatcher::CreateChromeWindow+0x72 [/builds/worker/checkouts/gecko/toolkit/components/windowwatcher/nsWindowWatcher.cpp @ 439] 
30 00000057`4adfcfb0 00007ffe`039da757     xul!nsWindowWatcher::OpenWindowInternal+0x287f [/builds/worker/checkouts/gecko/toolkit/components/windowwatcher/nsWindowWatcher.cpp @ 1046] 
31 00000057`4adfd560 00007ffe`04a08ea2     xul!nsWindowWatcher::OpenWindow+0xd7 [/builds/worker/checkouts/gecko/toolkit/components/windowwatcher/nsWindowWatcher.cpp @ 295] 
32 00000057`4adfd640 00007ffe`042abb25     xul!XPTC__InvokebyIndex+0x72
33 (Inline Function) --------`--------     xul!NS_InvokeByIndex+0x5 [/builds/worker/checkouts/gecko/xpcom/reflect/xptcall/md/win32/xptcinvoke_x86_64.cpp @ 57] 
34 (Inline Function) --------`--------     xul!CallMethodHelper::Invoke+0x24 [/builds/worker/checkouts/gecko/js/xpconnect/src/XPCWrappedNative.cpp @ 1627] 
...
46 00000057`4adfe7f0 00007ffe`0722428c     xul!NS_CreateServicesFromCategory+0x375 [/builds/worker/checkouts/gecko/xpcom/components/nsCategoryManager.cpp @ 684] 
47 00000057`4adfea80 00007ffe`0722593c     xul!XREMain::XRE_mainRun+0x46c [/builds/worker/checkouts/gecko/toolkit/xre/nsAppRunner.cpp @ 5412] 
48 00000057`4adfeda0 00007ffe`04a0c935     xul!XREMain::XRE_main+0x31c [/builds/worker/checkouts/gecko/toolkit/xre/nsAppRunner.cpp @ 5874] 
49 00000057`4adfee50 00007ff6`9eb8f2e4     xul!XRE_main+0x85 [/builds/worker/checkouts/gecko/toolkit/xre/nsAppRunner.cpp @ 5930] 
4a (Inline Function) --------`--------     firefox!do_main+0xce [/builds/worker/checkouts/gecko/browser/app/nsBrowserApp.cpp @ 227] 
4b (Inline Function) --------`--------     firefox!NS_internal_main+0x4d0 [/builds/worker/checkouts/gecko/browser/app/nsBrowserApp.cpp @ 445] 

In all these crashes, the call to gfxDWriteFont::UpdateSystemTextVars seems to originate from a Windows-internal call to PeekMessage as part of CThreadMgr::Activate as called from our mozilla::widget::TSFTextStore::Initialize function. In order for gfxVars::sInstance.mRawPtr to be something different from nullptr, we would need to run after mDirProvider.DoStartup(); in XREMain::XRE_mainRun. However, the crash stack above occurs during the following call, which is right before that:

    // "app-startup" is the name of both the category and the event
    NS_CreateServicesFromCategory("app-startup", cmdLine, "app-startup",
                                  nullptr);
Severity: -- → S2
Flags: needinfo?(jfkthame)

After looking at more stack data, this occurs more precisely while nsBrowserGlue is observing app-startup, and we reach nsWindowWatcher::OpenWindow. This thus seems to point to the following call in _earlyBlankFirstPaint:

    let win = Services.ww.openWindow(
      null,
      "about:blank",
      null,
      browserWindowFeatures,
      null
    );

So I think the crash can be summarized as follows:

  • gfxVars::Initialize() is called as part of nsXREDirProvider::DoStartup();
  • nsBrowserGlue observes app-startup before that call, and that can reach into our nsWindow::WindowProc;
  • but our handling of window messages WM_NCPAINT and WM_SETTINGCHANGE assumes that gfxVars::Initialize() has necessarily already been called, thus we can crash as we dereference gfxVars::sInstance.mRawPtr if these specific window messages were queued.

This is very similar to bug 1851991 actually.

See Also: → 1851991

It seems that the fact that we call mozilla::widget::IMEHandler::Initialize() after our window has been created can sometimes have the side effect of flushing our window's message queue, due to how CThreadMgr::Activate works internally. This makes us reach our nsWindow::WindowProc earlier than we would like and seems to cause this bug as well as bug 1851991.

[:masayuki], mozilla::widget::IMEHandler::Initialize() looks like a function that just needs to be called once but is not required to be called during window initialization. Would it be possible to move that call outside of mozilla::widget::IMEHandler::InitInputContext, so that it occurs earlier, at a point where we are sure that no window has been created yet?

Flags: needinfo?(masayuki)

According to the MSDN document, it can be called later. In my understanding, it was intended that it's called as far as earlier in the widget module. However, I'm not sure which method can be the root caller of IMEHandler::Initialize(). And we should collect feedback from the testers for a while because there are a lot of 3rd party IMEs especially for Simplified Chinese and they might have some trouble with the later initialization.

Flags: needinfo?(masayuki)

Perhaps the simplest mitigation here would be to assume that if gfxDWriteFonts::UpdateSystemTextVars is called early, before the gfxVars have been initialized, we can safely ignore it; we're not painting any text yet, and when we eventually do, we'll get another chance to set up the relevant vars.

So just null-checking the gfxVars::sInstance pointer there and bailing out should be fine, I think.

Flags: needinfo?(jfkthame)
Assignee: nobody → jfkthame
Status: NEW → ASSIGNED

In the crash reports, we crash because the IME initialization code ends up calling PeekMessage, but even if we moved the IME initialization, there could be a PeekMessage coming from somewhere else.

(In reply to Jonathan Kew [:jfkthame] from comment #5)

So just null-checking the gfxVars::sInstance pointer there and bailing out should be fine, I think.

I think that should be the less risky way to stop the crash, so it's a good candidate for an uplift, like the patch we took for bug 1856637.

For longer term, would there be any risk with ensuring that the graphics vars are initialized earlier? On my machine, it seems that they get initialized as a byproduct of the first call to gfxPlatform::GetPlatform() -- but that first call doesn't seem explicitly intended as "the point in execution from which it is guaranteed that graphics stuff will be initialized", it just happens to be the first call.

Would it be possible to guarantee that the gfxVars are initialized before we create a window that uses nsWindow::WindowProc, by calling gfxPlatform::GetPlatform() either in nsWindow::Create before the lines that mention nsWindow::WindowProc, or in XREMain::XRE_mainRun before the call to NS_CreateServicesFromCategory?

I agree, it may be better to ensure the gfxVars initialization happens earlier; but I don't have a good handle on how this might interact with other startup things that are happening, and probably we're trying to paint the initial "skeleton" window as soon as possible; putting more initialization ahead of this could hurt perceived startup speed.

So revising the graphics initialization setup may well be a good thing to do, but seems potentially riskier and more disruptive, so I figured for a quick fix, we should at least band-aid the specific case that's causing the crash here.

Pushed by jkew@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/e26e0ccae944
Ignore gfxDWriteFont::UpdateSystemTextVars() call if the gfxVars are not yet initialized. r=gfx-reviewers,lsalzman
Status: ASSIGNED → RESOLVED
Closed: 8 months ago
Resolution: --- → FIXED
Target Milestone: --- → 120 Branch

The patch landed in nightly and beta is affected.
:jfkthame, is this bug important enough to require an uplift?

  • If yes, please nominate the patch for beta approval.
  • If no, please set status-firefox119 to wontfix.

For more information, please visit BugBot documentation.

Flags: needinfo?(jfkthame)

[Tracking Requested - why for this release]:

Given the uptick seen on crash-stats, I think we should uplift this -- the patch is essentially zero-risk. IMO it could even go into a 118 dot-release, if there's one going out.

Flags: needinfo?(jfkthame)

Comment on attachment 9356788 [details]
Bug 1856637 - Ignore gfxDWriteFont::UpdateSystemTextVars() call if the gfxVars are not yet initialized. r=#gfx-reviewers

Beta/Release Uplift Approval Request

  • User impact if declined: Possible startup crashes on Windows
  • Is this code covered by automated tests?: No
  • Has the fix been verified in Nightly?: No
  • Needs manual test from QE?: No
  • If yes, steps to reproduce: (no known STR; just startup crashes seen in the wild)
  • List of other uplifts needed: None
  • Risk to taking this patch: Low
  • Why is the change risky/not risky? (and alternatives if risky): Minimal risk, just adding a null-check and ignoring an event if it's too early in startup for us to handle it
  • String changes made/needed:
  • Is Android affected?: No
Attachment #9356788 - Flags: approval-mozilla-release?
Attachment #9356788 - Flags: approval-mozilla-beta?

Comment on attachment 9356788 [details]
Bug 1856637 - Ignore gfxDWriteFont::UpdateSystemTextVars() call if the gfxVars are not yet initialized. r=#gfx-reviewers

Approved for 119.0b7

Attachment #9356788 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
Attachment #9356788 - Flags: approval-mozilla-release? → approval-mozilla-release+
Attachment #9356788 - Flags: approval-mozilla-esr115?
Flags: qe-verify+
Flags: qe-verify+

Comment on attachment 9356788 [details]
Bug 1856637 - Ignore gfxDWriteFont::UpdateSystemTextVars() call if the gfxVars are not yet initialized. r=#gfx-reviewers

Approved for 115.4esr.

Attachment #9356788 - Flags: approval-mozilla-esr115? → approval-mozilla-esr115+
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: