Closed Bug 1495900 Opened 2 years ago Closed 4 months ago
starting Chrome makes Firefox content processes hang for about two minutes, due to Font
Config font rescanning (Fc Init Reinitialize)
Sometime in the past few weeks I've noticed a regression: when I start Chromium (Ubuntu's packaged version of it) to test something, all of my Firefox content processes hang for about 2 minutes. This is pretty bad! Today I decided to profile, and the profile at https://perfht.ml/2y8pYIS showed what I expected to be the problem: FontConfig. In particular, we spend basically all of the time inside FcInitReinitialize. I'm not sure if the thing that caused the regression is something in Chromium (e.g., a change in version 69, which I got the upgrade to on 2018-09-12, which seems plausible for when this started happening), or whether the regression is on the Firefox side. [Tracking Requested - why for this release]: Either way, spending this much time rescanning font directories just because I started Chromium doesn't seem acceptable, and seems likely to be problematic for any Web developers running Linux. On the flip side, it's possible my times are particularly long because I have a lot of fonts installed. (I don't have a particularly large number of active tabs -- probably about 20-30, though I have a very large number of dormant tabs that haven't been loaded but are part of my session. If we do something once per tab, that could certainly be a problem...) The bulk of the time (again, see the profile at https://perf-html.io/public/e0db389afe6c3fdcafc8df3db7f0ec0447a1f432 ) is spent in stacks like this: TT_New_Context FT_Load_Glyph FcPatternFormat FcPatternFormat FcFreeTypeQueryFace FcFileIsDir FcFileIsDir FcDirScan FcConfigAppFontClear FcConfigBuildFonts FcInitLoadConfig FcInitReinitialize gtk_settings_get_type g_closure_invoke g_signal_handler_disconnect g_signal_emit_valist g_signal_emit g_cclosure_marshal_BOOLEAN__BOXED_BOXEDv g_object_notify_by_pspec gtk_main_do_event gdk_event_free gdk_x11_drag_context_get_type g_main_context_dispatch g_main_context_dispatch g_main_context_iteration nsAppShell::ProcessNextNativeEvent(bool) nsBaseAppShell::OnProcessNextEvent(nsIThreadInternal*, bool) non-virtual thunk to nsBaseAppShell::OnProcessNextEvent(nsIThreadInternal*, bool) nsThread::ProcessNextEvent(bool, bool*) NS_ProcessNextEvent(nsIThread*, bool) mozilla::ipc::MessagePump::Run(base::MessagePump::Delegate*) MessageLoop::Run() nsBaseAppShell::Run() XRE_RunAppShell() MessageLoop::Run() XRE_InitChildProcess(int, char**, XREChildData const*) XRE_InitChildProcess
OS: Unspecified → Linux
Hardware: Unspecified → Desktop
Looks like this is not only within fontconfig, it's within fontconfig called from GTK code. That seems like it may not really be under our control. I guess what's happening is that when chromium is started, it's changing the (global) fontconfig configuration in such a way that fontconfig (in any process that uses it) decides it needs to reinitialize everything. What version of fontconfig is installed? This might be a fontconfig issue that is fixed in newer versions; see also bug 1411338 comment 11.
$ dpkg -l 'fontconfig*' | grep "^ii" ii fontconfig 2.12.6-0ubuntu2 amd64 generic font configuration library - support binaries ii fontconfig-config 2.12.6-0ubuntu2 all generic font configuration library - configuration
Here's a more accurate callstack from gdb. There seems to be a single call per child process, and that single call really does take a few minutes.
Oh, and the version you really wanted is: ii libfontconfig1:amd64 2.12.6-0ubuntu2 amd64 generic font configuration library - runtime ii libfontconfig1:i386 2.12.6-0ubuntu2 i386 generic font configuration library - runtime but it's the same.
Judging by the discussion in bug 1411338, I suspect this is a fontconfig 2.12.x issue and the fix is to update to fontconfig-2.13 or later. Could you try that and confirm whether it resolves the problem?
(In reply to Jonathan Kew (:jfkthame) from comment #5) > Judging by the discussion in bug 1411338, I suspect this is a fontconfig > 2.12.x issue and the fix is to update to fontconfig-2.13 or later. Could you > try that and confirm whether it resolves the problem? That reduces the pause from 2-3 minutes to about 4-5 seconds, though there's still a pause. (I updated to the fontconfig packages from cosmic (i.e., Ubuntu 18.10), which are version 2.13.0-5ubuntu3.)
Is there anything we can do here to avoid/reduce the pause or is this a wontfix?
I'm not sure. I don't see an obvious way forward, but the behavior is pretty bad. :( Are any other (gtk-using) applications similarly affected? Given that the pause occurs because a fontconfig re-scan appears to be triggered from inside gtk event-handling code, it seems like other processes might be suffering the same issue.
Just wanted to pipe up and confirm that installing the latest fontconfig packages from cosmic significantly reduced the wait time for me, which makes running my tests a whole lot more manageable :) For the record I just installed these debs: http://mirrors.kernel.org/ubuntu/pool/main/f/fontconfig/fontconfig-config_2.13.0-5ubuntu3_all.deb http://mirrors.kernel.org/ubuntu/pool/main/f/fontconfig/libfontconfig1_2.13.0-5ubuntu3_amd64.deb
Is there anything more we can do here? Not sure what to do with the Fx64 tracking request since this sounds like an issue that's out of our control.
I don't know of any way forward here, unfortunately. The one thing I'd be curious to know is whether other Gtk-based applications are similarly affected; it seems like they should be, given that the issue seems to be happening where fontconfig is used from within Gtk code rather than directly from Gecko. (It may be much less noticeable, though, if the app isn't visibly busy during the Chrome launch.) David, can you try to see whether other apps are affected on your system, especially when using the older fontconfig? If this actually is unique to Firefox, that would suggest there might be something about how we use Gtk that's provoking it.
Flags: needinfo?(jfkthame) → needinfo?(dbaron)
FWIW I asked a contact on the ubuntu desktop team a couple of weeks ago, it sounds like they hadn't seen this reported on their side, either for firefox or other applications.
I'm reasonably confident that gnome-terminal wasn't affected, although I didn't retest (by downgrading) just now. Are there other apps that are more Firefox-like that you'd want me to check?
I tried setting a breakpoint on settings_update_fontconfig in some other GTK apps (gnome-system-monitor, gnome-terminal-server), and I didn't even hit it when starting Chromium, whereas I do hit it in Firefox.
I wonder if apps like gnome-system-monitor or terminal don't do as much with font configuration? How about something like Gedit or LibreOffice -- do they hit settings_update_fontconfig at all? Speculation: maybe it's related to the fact that we register a callback to be notified when there's a change in Gtk default settings (see <https://searchfox.org/mozilla-central/rev/007b66c1f5f7a1b9a900a2038652a16a020f010c/widget/gtk/nsWindow.cpp#4073-4081>). If you comment out that block of code, does it make any difference?
So one difference I found between Firefox and gnome-system-monitor is that in Firefox, we execute this check_manager_window function that sets up the gdk_xsettings_manager_window_filter event filter that leads to an XEvent (property change) being turned into the fontconfig notification. However, in gnome-system-monitor there's a call to gtk_init but it doesn't reach check_manager_window. Here's the Firefox callstack: #0 0x00007f071684ebf0 in check_manager_window (x11_screen=0x5612c32c4020 [GdkX11Screen], notify_changes=notify_changes@entry=0) at ../../../../../gdk/x11/xsettings-client.c:527 #1 0x00007f071684edc6 in _gdk_x11_xsettings_init (x11_screen=<optimized out>) at ../../../../../gdk/x11/xsettings-client.c:618 #2 0x00007f0716830e3c in _gdk_x11_display_open (display_name=<optimized out>) at ../../../../../gdk/x11/gdkdisplay-x11.c:1608 #3 0x00007f0716805d5d in gdk_display_manager_open_display (manager=<optimized out>, name=0x5612c32a1480 ":0") at ../../../../gdk/gdkdisplaymanager.c:472 #4 0x00007f0716cf4ffa in gtk_init_check (argc=<optimized out>, argv=<optimized out>) at ../../../../gtk/gtkmain.c:1104 #5 0x00007f0716cf5029 in gtk_init (argc=<optimized out>, argv=<optimized out>) at ../../../../gtk/gtkmain.c:1161 #6 0x00007f070b12d90c in mozilla::dom::ContentChild::Init(MessageLoop*, int, char const*, IPC::Channel*, unsigned long, bool) (this=0x5612c329d060, aIOLoop= 0x7f0703d66cc8, aParentPid=10218, aParentBuildID=0x7ffd28fa8a3a "20180917145248", aChannel=0x7f06fc002560, aChildID=1, aIsForBrowser=<optimized out>) at /home/dbaron/builds/mozilla-central/mozilla/dom/ipc/ContentChild.cpp:656
gedit, like gnome-system-manager, does not hit settings_update_fontconfig, and also does not call check_manager_window within gtk_init.
And the difference is because in gedit, gdk_display_manager_open_display tries _gdk_wayland_display_open first, and it succeeds, so there's no x11 display at all. So, basically, I think it doesn't happen in gedit because the notification code for fontconfig changes was never hooked up for wayland.
Not sure if the above leads to anything useful...
This is likely wontfix for 64 at this point, though it'd be good to figure out a way to avoid this.
Pushed by firstname.lastname@example.org: https://hg.mozilla.org/integration/autoland/rev/a58f96ac7e03 Add fontconfig cache directories to content-process sandbox read paths. r=jld
Attachment #9139105 - Flags: approval-mozilla-beta?
You need to log in before you can comment on or make changes to this bug.