starting Chrome makes Firefox content processes hang for about two minutes, due to FontConfig font rescanning (FcInitReinitialize)
Categories: Core :: Graphics: Text, defect, P3
People: Reporter: dbaron, Assigned: jfkthame
Keywords: perf
Attachments (2 files):
6.17 KB, text/plain
47 bytes, text/x-phabricator-request (RyanVM: approval-mozilla-beta+)
Sometime in the past few weeks I've noticed a regression: when I start Chromium (Ubuntu's packaged version of it) to test something, all of my Firefox content processes hang for about 2 minutes. This is pretty bad!

Today I decided to profile, and the profile at https://perfht.ml/2y8pYIS showed what I expected to be the problem: FontConfig. In particular, we spend basically all of the time inside FcInitReinitialize.

I'm not sure if the thing that caused the regression is something in Chromium (e.g., a change in version 69, which I got the upgrade to on 2018-09-12, which seems plausible for when this started happening), or whether the regression is on the Firefox side.

[Tracking Requested - why for this release]: Either way, spending this much time rescanning font directories just because I started Chromium doesn't seem acceptable, and seems likely to be problematic for any Web developers running Linux. On the flip side, it's possible my times are particularly long because I have a lot of fonts installed. (I don't have a particularly large number of active tabs -- probably about 20-30, though I have a very large number of dormant tabs that haven't been loaded but are part of my session. If we do something once per tab, that could certainly be a problem...)
The bulk of the time (again, see the profile at https://perf-html.io/public/e0db389afe6c3fdcafc8df3db7f0ec0447a1f432 ) is spent in stacks like this:
TT_New_Context
FT_Load_Glyph
FcPatternFormat
FcPatternFormat
FcFreeTypeQueryFace
FcFileIsDir
FcFileIsDir
FcDirScan
FcConfigAppFontClear
FcConfigBuildFonts
FcInitLoadConfig
FcInitReinitialize
gtk_settings_get_type
g_closure_invoke
g_signal_handler_disconnect
g_signal_emit_valist
g_signal_emit
g_cclosure_marshal_BOOLEAN__BOXED_BOXEDv
g_object_notify_by_pspec
gtk_main_do_event
gdk_event_free
gdk_x11_drag_context_get_type
g_main_context_dispatch
g_main_context_dispatch
g_main_context_iteration
nsAppShell::ProcessNextNativeEvent(bool)
nsBaseAppShell::OnProcessNextEvent(nsIThreadInternal*, bool)
non-virtual thunk to nsBaseAppShell::OnProcessNextEvent(nsIThreadInternal*, bool)
nsThread::ProcessNextEvent(bool, bool*)
NS_ProcessNextEvent(nsIThread*, bool)
mozilla::ipc::MessagePump::Run(base::MessagePump::Delegate*)
MessageLoop::Run()
nsBaseAppShell::Run()
XRE_RunAppShell()
MessageLoop::Run()
XRE_InitChildProcess(int, char**, XREChildData const*)
XRE_InitChildProcess
Assignee
Comment 1•6 years ago
Looks like this is not only within fontconfig, it's within fontconfig called from GTK code. That seems like it may not really be under our control. I guess what's happening is that when chromium is started, it's changing the (global) fontconfig configuration in such a way that fontconfig (in any process that uses it) decides it needs to reinitialize everything. What version of fontconfig is installed? This might be a fontconfig issue that is fixed in newer versions; see also bug 1411338 comment 11.
Reporter
Comment 2•6 years ago
$ dpkg -l 'fontconfig*' | grep "^ii"
ii fontconfig 2.12.6-0ubuntu2 amd64 generic font configuration library - support binaries
ii fontconfig-config 2.12.6-0ubuntu2 all generic font configuration library - configuration
Reporter
Comment 3•6 years ago
Here's a more accurate callstack from gdb. There seems to be a single call per child process, and that single call really does take a few minutes.
Reporter
Comment 4•6 years ago
Oh, and the version you really wanted is:
ii libfontconfig1:amd64 2.12.6-0ubuntu2 amd64 generic font configuration library - runtime
ii libfontconfig1:i386 2.12.6-0ubuntu2 i386 generic font configuration library - runtime
but it's the same.
Assignee
Comment 5•6 years ago
Judging by the discussion in bug 1411338, I suspect this is a fontconfig 2.12.x issue and the fix is to update to fontconfig-2.13 or later. Could you try that and confirm whether it resolves the problem?
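For anyone checking which side of that version boundary a system is on, the upstream part of the Debian package version can be compared against 2.13. The following is only a minimal sketch, not part of any patch in this bug: the helper names `upstream_version` and `predates_2_13` are hypothetical, and the parsing assumes the usual `[epoch:]upstream[-revision]` layout of Debian version strings.

```python
import re


def upstream_version(deb_version: str) -> tuple:
    """Return the leading numeric components of a Debian version string.

    Assumes the usual [epoch:]upstream[-revision] layout.
    """
    # Drop an optional epoch prefix and the trailing Debian revision.
    without_epoch = deb_version.split(":", 1)[-1]
    upstream = without_epoch.rsplit("-", 1)[0]
    # Keep only the leading dotted-number prefix of the upstream part.
    match = re.match(r"\d+(?:\.\d+)*", upstream)
    if match is None:
        raise ValueError(f"no numeric version in {deb_version!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def predates_2_13(deb_version: str) -> bool:
    """True if the package is older than the fontconfig 2.13 series."""
    return upstream_version(deb_version) < (2, 13)
```

The version string would come from the third column of the `dpkg -l` output for `libfontconfig1`.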
Reporter
Comment 6•6 years ago
(In reply to Jonathan Kew (:jfkthame) from comment #5)
> Judging by the discussion in bug 1411338, I suspect this is a fontconfig 2.12.x issue and the fix is to update to fontconfig-2.13 or later. Could you try that and confirm whether it resolves the problem?
That reduces the pause from 2-3 minutes to about 4-5 seconds, though there's still a pause. (I updated to the fontconfig packages from cosmic (i.e., Ubuntu 18.10), which are version 2.13.0-5ubuntu3.)
Comment 8•6 years ago
Is there anything we can do here to avoid/reduce the pause or is this a wontfix?
Assignee
Comment 9•6 years ago
I'm not sure. I don't see an obvious way forward, but the behavior is pretty bad. :( Are any other (gtk-using) applications similarly affected? Given that the pause occurs because a fontconfig re-scan appears to be triggered from inside gtk event-handling code, it seems like other processes might be suffering the same issue.
Comment 11•6 years ago
Just wanted to pipe up and confirm that installing the latest fontconfig packages from cosmic significantly reduced the wait time for me, which makes running my tests a whole lot more manageable :)
For the record I just installed these debs:
http://mirrors.kernel.org/ubuntu/pool/main/f/fontconfig/fontconfig-config_2.13.0-5ubuntu3_all.deb
http://mirrors.kernel.org/ubuntu/pool/main/f/fontconfig/libfontconfig1_2.13.0-5ubuntu3_amd64.deb
Comment 12•6 years ago
Is there anything more we can do here? Not sure what to do with the Fx64 tracking request since this sounds like an issue that's out of our control.
Assignee
Comment 13•6 years ago
I don't know of any way forward here, unfortunately. The one thing I'd be curious to know is whether other Gtk-based applications are similarly affected; it seems like they should be, given that the issue seems to be happening where fontconfig is used from within Gtk code rather than directly from Gecko. (It may be much less noticeable, though, if the app isn't visibly busy during the Chrome launch.) David, can you try to see whether other apps are affected on your system, especially when using the older fontconfig? If this actually is unique to Firefox, that would suggest there might be something about how we use Gtk that's provoking it.
Comment 15•6 years ago
FWIW I asked a contact on the Ubuntu desktop team a couple of weeks ago; it sounds like they hadn't seen this reported on their side, either for Firefox or for other applications.
Reporter
Comment 16•6 years ago
I'm reasonably confident that gnome-terminal wasn't affected, although I didn't retest (by downgrading) just now. Are there other apps that are more Firefox-like that you'd want me to check?
Reporter
Comment 17•6 years ago
I tried setting a breakpoint on settings_update_fontconfig in some other GTK apps (gnome-system-monitor, gnome-terminal-server), and I didn't even hit it when starting Chromium, whereas I do hit it in Firefox.
Assignee
Comment 18•6 years ago
I wonder if apps like gnome-system-monitor or terminal don't do as much with font configuration? How about something like Gedit or LibreOffice -- do they hit settings_update_fontconfig at all? Speculation: maybe it's related to the fact that we register a callback to be notified when there's a change in Gtk default settings (see <https://searchfox.org/mozilla-central/rev/007b66c1f5f7a1b9a900a2038652a16a020f010c/widget/gtk/nsWindow.cpp#4073-4081>). If you comment out that block of code, does it make any difference?
Reporter
Comment 19•6 years ago
So one difference I found between Firefox and gnome-system-monitor is that in Firefox, we execute this check_manager_window function that sets up the gdk_xsettings_manager_window_filter event filter that leads to an XEvent (property change) being turned into the fontconfig notification. However, in gnome-system-monitor there's a call to gtk_init but it doesn't reach check_manager_window.
Here's the Firefox callstack:
#0 0x00007f071684ebf0 in check_manager_window (x11_screen=0x5612c32c4020 [GdkX11Screen], notify_changes=notify_changes@entry=0) at ../../../../../gdk/x11/xsettings-client.c:527
#1 0x00007f071684edc6 in _gdk_x11_xsettings_init (x11_screen=<optimized out>) at ../../../../../gdk/x11/xsettings-client.c:618
#2 0x00007f0716830e3c in _gdk_x11_display_open (display_name=<optimized out>) at ../../../../../gdk/x11/gdkdisplay-x11.c:1608
#3 0x00007f0716805d5d in gdk_display_manager_open_display (manager=<optimized out>, name=0x5612c32a1480 ":0") at ../../../../gdk/gdkdisplaymanager.c:472
#4 0x00007f0716cf4ffa in gtk_init_check (argc=<optimized out>, argv=<optimized out>) at ../../../../gtk/gtkmain.c:1104
#5 0x00007f0716cf5029 in gtk_init (argc=<optimized out>, argv=<optimized out>) at ../../../../gtk/gtkmain.c:1161
#6 0x00007f070b12d90c in mozilla::dom::ContentChild::Init(MessageLoop*, int, char const*, IPC::Channel*, unsigned long, bool) (this=0x5612c329d060, aIOLoop=0x7f0703d66cc8, aParentPid=10218, aParentBuildID=0x7ffd28fa8a3a "20180917145248", aChannel=0x7f06fc002560, aChildID=1, aIsForBrowser=<optimized out>) at /home/dbaron/builds/mozilla-central/mozilla/dom/ipc/ContentChild.cpp:656
Reporter
Comment 20•6 years ago
gedit, like gnome-system-monitor, does not hit settings_update_fontconfig, and also does not call check_manager_window within gtk_init.
Reporter
Comment 21•6 years ago
And the difference is because in gedit, gdk_display_manager_open_display tries _gdk_wayland_display_open first, and it succeeds, so there's no x11 display at all. So, basically, I think it doesn't happen in gedit because the notification code for fontconfig changes was never hooked up for wayland.
Reporter
Comment 22•6 years ago
Not sure if the above leads to anything useful...
Comment 24•6 years ago
This is likely wontfix for 64 at this point, though it'd be good to figure out a way to avoid this.
Comment 27•5 years ago
(In reply to Hiroyuki Ikezoe (:hiro) from comment #26)
> FWIW, this issue probably also happens with Android Emulator as well.
Android does not use FontConfig, so is not affected by this.
Comment 28•5 years ago
(In reply to Lee Salzman [:lsalzman] from comment #27)
> (In reply to Hiroyuki Ikezoe (:hiro) from comment #26)
> > FWIW, this issue probably also happens with Android Emulator as well.
> Android does not use FontConfig, so is not affected by this.
Oh, you mean qemu or something like that doesn't use it at all on Linux? I've quite often seen a similar issue when I run the emulator.
Comment 29•5 years ago
Hmm, actually qemu-system-x86_64 isn't linked against libfontconfig.so (I am not sure whether it's dynamically loaded), so probably some other process is affecting Firefox.
Comment 30•5 years ago
(In reply to svansintjan from comment #11)
> Just wanted to pipe up and confirm that installing the latest fontconfig packages from cosmic significantly reduced the wait time for me, which makes running my tests a whole lot more manageable :)
> For the record I just installed these debs:
> http://mirrors.kernel.org/ubuntu/pool/main/f/fontconfig/fontconfig-config_2.13.0-5ubuntu3_all.deb
> http://mirrors.kernel.org/ubuntu/pool/main/f/fontconfig/libfontconfig1_2.13.0-5ubuntu3_amd64.deb
Looks like the files specified in these links have a problem. Firefox is complaining that the files have 'Virus or Malware' when trying to download them. I downloaded the same package from 'Launchpad.net', it did not complain - https://launchpad.net/ubuntu/disco/amd64/fontconfig/2.13.1-2ubuntu2
Probably the links need to be removed and people should download the packages from Launchpad.
Comment 33•5 years ago
I am using Ubuntu 18.04. Is it possible to fix this? I can't seem to install the deb files provided.
Comment 34•5 years ago
Hi folks,
> David, can you try to see whether other apps are affected on your system
Actually, in my spare time there are only three desktop apps that I use on my system: Firefox, Chromium, and Skype. There is no difference between them with respect to this problem: starting Skype often freezes Chromium and Firefox, and starting Chromium may freeze the others.
> This is likely wontfix for 64 at this point, though it'd be good to figure out a way to avoid this.
Well, we are at 71 now (in my distribution), and the problem is still there.
> That reduces the pause from 2-3 minutes to about 4-5 seconds, though there's still a pause.
Well, even 4-5 seconds is too much when there is a video or a real-time financial trader running in the browser, or a chat with a friend who has just asked you the "ultimate question" and you cannot answer for several minutes ....
Like user "contacto" above, I cannot install 2.13 on my system due to missing dependencies, and the next stable long-term release will not come that soon.
Can you please let us know why Firefox has to do this job synchronously and wait for fontconfig to finish? Can't that be done asynchronously?
Why does Firefox have to update the fonts anyway? If the user installs new fonts, why not just wait until the user restarts Firefox to see the new fonts in the app? How about a user option to opt out of this behaviour?
Finally: what can we do now to fix this problem?
Comment 35•5 years ago
I seem to have run into this when starting Atom, which uses Electron (which is itself based on Chromium).
Every firefox process is at max CPU. Looking at strace it seems to mainly hang out after opening a NotoSansCJK or NotoSerifCJK font. These fonts have 2^16 - 1 glyphs, the maximum for a truetype font. It's possible the delay wouldn't be perceivable without such large fonts installed.
Firefox 72.0.1+build1-0ubuntu0.18.04.1
libfontconfig1 2.12.6-0ubuntu2
Ubuntu 18.04.3 LTS
x86-64
I believe I'm using X, not Wayland.
Can't seem to trigger it with other GTK apps. Tried epiphany, gedit, evince, Gimp, Inkscape, QuodLibet, and LibreOffice Writer (not sure if that's GTK), and Thunderbird (although I don't have an account configured in Thunderbird).
Comment 36•5 years ago
Morgan,
yes, I have those fonts, too. But even with the Noto*CJK fonts deactivated, the load of the Firefox (sub)processes/threads goes up to 95% (while with the Noto*CJK fonts all of them went up to 100+%).
Still waiting here for an answer that I can use as a workaround, since I cannot install the 2.13 lib version.
Comment 39•5 years ago
Out of curiosity, does this repro with MOZ_DISABLE_CONTENT_SANDBOX=1? I expect it'd be at least much faster with that...
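One way to run that experiment without touching the rest of the session is to set the variable only in the environment of the launched browser. This is a rough sketch under assumptions: the helper names are hypothetical, and `firefox` is assumed to be on PATH (pass the path to a local build otherwise).

```python
import os
import subprocess


def sandbox_disabled_env(base_env=None):
    """Return a copy of the environment with the content sandbox disabled."""
    env = dict(os.environ if base_env is None else base_env)
    env["MOZ_DISABLE_CONTENT_SANDBOX"] = "1"
    return env


def launch_firefox(binary="firefox", extra_args=()):
    """Start Firefox with the modified environment and return the Popen handle.

    The binary name is an assumption; pass the path to a local build if needed.
    """
    return subprocess.Popen([binary, *extra_args], env=sandbox_disabled_env())
```

Calling `launch_firefox()` and then starting Chromium should show whether the pause still occurs with the sandbox out of the picture.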
Comment 41•5 years ago
So this may be because we route all the system calls through the parent process, which means that file reads from content have a lot of overhead... Next thing to figure out would be whether the FcInitReinitialize stuff still happens consistently with the sandbox disabled...
Jed, do you know if the broker thread is registered with the profiler? It'd be kinda cool to see where time is spent there if the sandboxing is indeed the culprit.
Comment 42•5 years ago
I don't think they are, but that could be changed; this is the top of the broker threads' main function.
But something else to consider: is fontconfig trying to shell out to fc-cache from the content processes? That will fail under sandboxing, and that might explain some of the difference between sandboxed and unsandboxed. One experiment to try there is to set security.sandbox.content.level to 1, which should block fork but allow direct (unbrokered) filesystem access.
Comment 43•5 years ago
(In reply to David Baron :dbaron: ⌚UTC-8 from comment #0)
> Today I decided to profile, and the profile at https://perfht.ml/2y8pYIS showed what I expected to be the problem: FontConfig. In particular, we spend basically all of the time inside FcInitReinitialize.
In particular, this appears to be spending almost all of the time doing computations in fontconfig itself, not blocked on opening files (seen as recvmsg with inverted call stacks), which (assuming the nature of the problem hasn't changed since then) supports the idea that the cause of the bad performance isn't the file brokering, but it might be fontconfig falling back to ignoring a cache after failing to run an external command to rebuild it.
Comment 44•5 years ago
I've kept security.sandbox.content.level at 4, but whitelisted the fontconfig caches:
user_pref("security.sandbox.content.read_path_whitelist", "/var/cache/fontconfig/,/home/lynn/.cache/fontconfig/");
Now touch ~/.local/share/fonts/ no longer makes Firefox eat 100% of the CPU.
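For other users who want to try the same workaround, the pref line has to name their own home directory. A small sketch that builds the line for a given home directory follows; the helper names are hypothetical, and it assumes the per-user cache lives in `~/.cache/fontconfig/` as in the pref shown here (a system with a custom cache location would need different paths).

```python
import posixpath


def fontconfig_cache_dirs(home):
    """Cache directories from the workaround: system-wide and per-user."""
    per_user = posixpath.join(home, ".cache", "fontconfig") + "/"
    return ["/var/cache/fontconfig/", per_user]


def whitelist_pref(home):
    """Build the user.js line adding those directories to the read whitelist."""
    value = ",".join(fontconfig_cache_dirs(home))
    return f'user_pref("security.sandbox.content.read_path_whitelist", "{value}");'
```

The resulting line would go into `user.js` in the Firefox profile directory, followed by a restart.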
Assignee
Comment 45•5 years ago
According to comment 44, it looks like the problem here arises because the sandbox is blocking fontconfig from reading its caches (which then results in it doing a full rescan in each process).
So if we add these cache dirs to the sandbox policy, it may help with this issue.
Assignee
Comment 46•5 years ago
Comment 47•5 years ago
(In reply to Alexey Ten (Lynn) from comment #44)
> I've keep security.sandbox.content.level to 4, but whitelisted fontconfig caches
> user_pref("security.sandbox.content.read_path_whitelist", "/var/cache/fontconfig/,/home/lynn/.cache/fontconfig/");
> Now touch ~/.local/share/fonts/ doesn't make firefox to eat 100% of CPU.
FWIW, with the pref value (though I used my home directory) Firefox crashes when a wpt finished running on my environment.
Assignee
Comment 48•5 years ago
> FWIW, with the pref value (though I used my home directory) Firefox crashes when a wpt finished running on my environment.
Huh, that's a bit worrying..... it looks a bit like bug 1627605, but that only applies when gfx.e10s.font-list.shared is enabled, which I don't see in your crash report. Setting needinfo? to remind myself to see if I can reproduce that locally.
Comment 49•5 years ago
Hmm, I don't recall when I set gfx.e10s.font-list.shared to false and then back to true, but so far today I can't reproduce the crash. I think I am using a Nightly that does not yet include bug 1627605.
Comment 50•5 years ago
Bugbug thinks this bug is a regression, but please revert this change in case of error.
Comment 51•5 years ago
Pushed by jkew@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/a58f96ac7e03
Add fontconfig cache directories to content-process sandbox read paths. r=jld
Comment 52•5 years ago
bugherder
Comment 53•4 years ago
Isn't firefox-77 a tad ahead? For example, Ubuntu 18.04 is an LTS release that should benefit from having stable software, and Firefox is at 75.
This bug is a bit of a show-stopper for people who have to develop websites and test them on both Firefox and Chromium, and/or simultaneously use an Electron-based editor such as the enormously popular VSCode.
What other workarounds are available in a shorter timeframe?
Comment 54•4 years ago
Dennis: there was a workaround posted in this bug thread that may help you, namely, upgrade libfontconfig1 to 2.13.*. I am also on Ubuntu 18.04/Firefox 75 and this bug isn't causing me problems now. You will probably have to grab a newer libfontconfig deb from a later release of Ubuntu.
Comment 55•4 years ago
(In reply to Dennis Mayr from comment #53)
Ubuntu LTS has a special policy for upgrading some programs (e.g. Firefox). I'm using Ubuntu 16.04 and have the latest stable Firefox. So this is not a problem.
What other workarounds are available in a shorter timeframe?
Assignee
Comment 59•4 years ago
Comment on attachment 9139105 [details]
Bug 1495900 - Add fontconfig cache directories to content-process sandbox read paths. r=jld
Beta/Release Uplift Approval Request
- User impact if declined: Long pause (jank) in Firefox as a result of starting up chrome (or possibly other triggers)
- Is this code covered by automated tests?: No
- Has the fix been verified in Nightly?: Yes
- Needs manual test from QE?: No
- If yes, steps to reproduce: On affected system, launch chrome while firefox is running, observe whether firefox pauses for a noticeable period. (Dependent on fontconfig version and possibly other aspects of system configuration.)
- List of other uplifts needed: None
- Risk to taking this patch: Low
- Why is the change risky/not risky? (and alternatives if risky): Trivial, just whitelisting directory where fontconfig caches may be stored
- String changes made/needed:
Comment 61•4 years ago
Comment on attachment 9139105 [details]
Bug 1495900 - Add fontconfig cache directories to content-process sandbox read paths. r=jld
May help ameliorate jank experienced by some users by whitelisting the fontconfig cache directory. Approved for 76.0rc1.
Comment 63•4 years ago
Hello,
Since I was unable to reproduce this issue in the first place, I cannot confirm whether it's indeed fixed or not.
I tried to reproduce it using Firefox 67.0, 75.0, 76.0b4 and 73.0a1 (2020-01-01) on multiple Ubuntu 18.04 x64 machines with the STR found in this bug and within the duplicates. I've also used a Firefox 68.7.0esr build from Ubuntu Store. Firefox worked as expected on each attempt.
I assume this might also be related to the older Chromium/Chrome versions that were available within that timeframe.
The Chromium version that is installed on the machines I've used is 81.0.4044.122.