Closed Bug 1675714 Opened 4 years ago Closed 4 years ago

Startup Crash in [@ gfxDWriteFontFamily::LocalizedName]

Categories

(Core :: Layout: Text and Fonts, defect)

Unspecified
Windows
defect

Tracking

()

RESOLVED FIXED
84 Branch
Tracking Status
firefox-esr78 --- unaffected
firefox82 --- unaffected
firefox83 --- fixed
firefox84 --- fixed

People

(Reporter: aryx, Assigned: jfkthame)

References

Details

(Keywords: crash)

Crash Data

Attachments

(2 files)

All crash reports are for late beta (83.0b7+) - depending on gfx.e10s.font-list.shared being false?
All crashes are hit in the first 5 minutes after startup.

Crash report: https://crash-stats.mozilla.org/report/index/c2e3d059-b7e3-4fb1-bfe4-453250201103

Reason: EXCEPTION_ACCESS_VIOLATION_READ

Top 10 frames of crashing thread:

0 xul.dll gfxDWriteFontFamily::LocalizedName gfx/thebes/gfxDWriteFontList.cpp:289
1 xul.dll gfxPlatformFontList::GetFontList gfx/thebes/gfxPlatformFontList.cpp:845
2 xul.dll gfxPlatform::GetFontList gfx/thebes/gfxPlatform.cpp:1795
3 xul.dll EnumerateFontsTask::Run gfx/src/nsThebesFontEnumerator.cpp:119
4 xul.dll nsThread::ProcessNextEvent xpcom/threads/nsThread.cpp:1197
5 xul.dll mozilla::ipc::MessagePumpForNonMainThreads::Run ipc/glue/MessagePump.cpp:332
6 xul.dll MessageLoop::RunHandler ipc/chromium/src/base/message_loop.cc:327
7 xul.dll MessageLoop::Run ipc/chromium/src/base/message_loop.cc:309
8 xul.dll static nsThread::ThreadFunc xpcom/threads/nsThread.cpp:442
9 nss3.dll _PR_NativeRunThread nsprpub/pr/src/threads/combined/pruthr.c:399
Flags: needinfo?(jfkthame)

We're crashing with a null-deref at https://hg.mozilla.org/releases/mozilla-beta/file/7b3b0961c32b79f75fb3e5f8f3ad4d63817a2dde/gfx/thebes/gfxDWriteFontList.cpp#l289, where we call

OSPreferences::GetInstance()->GetSystemLocale(locale);

which suggests we must be getting a null pointer from OSPreferences::GetInstance().

Initially, this didn't make any sense to me, as GetInstance() looks at first glance like it should be infallible. But then I realized this is happening on a non-main thread (FontEnumThread), and in every case I looked at, the main thread is in the midst of XPCOM shutdown at the same time. And the OSPreferences instance has been registered with ClearOnShutdown.

So what I think must be happening is that the FontEnumThread's attempt to use OSPreferences is racing with XPCOM-shutdown's deletion of the instance, and sometimes it loses the race. We should null-check the OSPreferences instance before using it, and ensure we're holding a strong ref here for safety.

The other issue here is that even the OSPreferences::GetInstanceAddRefed() accessor could be risky, because it uses GetInstance() internally and if it's called off-main-thread during shutdown, it'd be possible for the instance to be destroyed immediately after we check its existence but before the accessor takes a new strong reference to it.

(I think it's unlikely this is actually a regression from bug 1661247, which didn't significantly modify this code for the gfx.e10s.font-list.shared=false case. My guess is that it was triggered by changes in how shutdown is handled, but didn't immediately show up because of the font-list pref which meant this code wasn't being exercised on Nightly/early Beta.)

Flags: needinfo?(jfkthame)
Assignee: nobody → jfkthame
Status: NEW → ASSIGNED
Pushed by jkew@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/d9fdd572c1e1
Null-check the OSPreferences instance before potential off-main-thread usage. r=lsalzman
https://hg.mozilla.org/integration/autoland/rev/f57ef99c9c04
Improve reliability of OSPreferences instance getters. r=lsalzman
Status: ASSIGNED → RESOLVED
Closed: 4 years ago
Resolution: --- → FIXED
Target Milestone: --- → 84 Branch

Comment on attachment 9186262 [details]
Bug 1675714 - Null-check the OSPreferences instance before potential off-main-thread usage. r=lsalzman

Beta/Release Uplift Approval Request

  • User impact if declined: Possible shutdown-time crash for Windows users
  • Is this code covered by automated tests?: Yes
  • Has the fix been verified in Nightly?: No
  • Needs manual test from QE?: No
  • If yes, steps to reproduce: (Unclear how to reliably reproduce this, it looks like a race condition during shutdown)
  • List of other uplifts needed: None
  • Risk to taking this patch: Low
  • Why is the change risky/not risky? (and alternatives if risky): Trivial patch to null-check the OSPreferences service pointer. (Just uplifting the first patch should be sufficient to stop the crashes for now.)
  • String changes made/needed:
Attachment #9186262 - Flags: approval-mozilla-beta?
Attachment #9186262 - Flags: approval-mozilla-beta? → approval-mozilla-release?

Comment on attachment 9186262 [details]
Bug 1675714 - Null-check the OSPreferences instance before potential off-main-thread usage. r=lsalzman

Volume is not big but this is a startup crash and it is new in 83, taking for our RC build.

Attachment #9186262 - Flags: approval-mozilla-release? → approval-mozilla-release+
Crash Signature: [@ gfxDWriteFontFamily::LocalizedName] → [@ gfxDWriteFontFamily::LocalizedName] [@ mozilla::intl::OSPreferences::GetSystemLocales]
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: