Open Bug 1411338 Opened 7 years ago Updated 2 years ago

Firefox on Linux uses 100% CPU after font installation/removal

Categories

(Core :: Graphics: Text, defect, P3)

58 Branch
Unspecified
Linux
defect

People

(Reporter: ye.jingchen, Unassigned)

Details

(Keywords: regression, Whiteboard: [gfx-noted])

User Agent: Mozilla/5.0 (X11; Linux x86_64; rv:58.0) Gecko/20100101 Firefox/58.0
Build ID: 20171023220222

Steps to reproduce:

I am using firefox-nightly 58.0a1.20171024.00-1 from the AUR on Arch Linux x86_64; the build ID is 20171023220222.
I use 4 content processes, as that is the recommended setting for my laptop (quad-core i7-6700HQ).

1. Open Firefox and open several websites (I had ~40 tabs open at the time, though not all of them were loaded).
2. Install or remove a font. In my experience, large CJK fonts with many variants reproduce this behavior more easily. One choice is adobe-source-han-sans-otc-fonts.
3. Run fc-cache, either without arguments if the font was installed/removed by an unprivileged user in their home directory, or with -s (system-wide only) if a system font package was installed/removed. If fc-cache runs automatically, this step can be skipped.


Actual results:

All Firefox processes start to use 100% CPU, and this lasts tens of seconds, sometimes a minute or more. During this period, web page elements cannot be interacted with, and the mouse pointer does not change shape when hovering over links or text.
After this period, CPU usage drops to a normal level, and the web page (and the browser) becomes responsive again.
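
To put numbers on "100% CPU for tens of seconds", one option is to sample `/proc/<pid>/stat` for each Firefox process before and after running fc-cache. This is only a rough sketch: the helper names are hypothetical, and the default of 100 ticks per second is an assumption (the real value is `os.sysconf("SC_CLK_TCK")` on the machine being measured).

```python
def cpu_ticks(stat_line):
    """Return utime + stime (in clock ticks) from one /proc/<pid>/stat line."""
    # The command name is wrapped in parentheses and may contain spaces
    # (e.g. "Web Content"), so split on the last ')' instead of on whitespace.
    rest = stat_line.rsplit(")", 1)[1].split()
    # rest[0] is field 3 (state); utime and stime are fields 14 and 15.
    return int(rest[11]) + int(rest[12])


def cpu_percent(ticks_before, ticks_after, seconds, ticks_per_second=100):
    """Average CPU use over an interval, as a percentage of one core."""
    return 100.0 * (ticks_after - ticks_before) / (ticks_per_second * seconds)
```

Reading the stat file of each process twice, a few seconds apart, and passing the two tick counts to `cpu_percent` gives a per-process figure that can be compared between fontconfig versions.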


Expected results:

Firefox stays as calm as normal.
Component: Untriaged → Widget: Gtk
OS: Unspecified → Linux
Product: Firefox → Core
Component: Widget: Gtk → Graphics: Text
I also noticed that when this happens, firefox processes are performing continuous disk IO at about 3M/s. When CPU usage drops to normal level, that disk IO also stops.
Whiteboard: [gfx-noted]
Status: UNCONFIRMED → NEW
Ever confirmed: true
This bug mentions that all content processes start taking CPU, but that may have changed with bug 1412090, because content processes should not be allowed to access arbitrary locations in the file system directly.

However, bug 1446756 was filed against 59, so there may still be a problem.

jfkthame, since you fixed bug 1412090, mind having a look at this as well?
Flags: needinfo?(jfkthame)
I've done some simple tests and it doesn't reproduce. I don't see every core at 100% for seconds after font changes. In fact, I don't see any CPU core going up to 100%, so I'm not sure whether the cache generation process has been triggered at all.
Could it be related to the FontConfig version?
This does indeed seem related to the FontConfig version. I got an update from 2.12.6 to 2.13.0 and the issue is essentially gone. Removing the entire Noto family and installing it again no longer seems to freeze Firefox.
I'm currently using 2.12.6+5+g665584a-1 and it doesn't reproduce. Five months ago when I could reproduce it reliably my fontconfig version was 2.12.6-1.

Ye Jingchen, can you still reproduce this?
Flags: needinfo?(ye.jingchen)
I haven't seen Firefox freeze after fc-cache recently, but I'm not sure when the problem went away.
I was indeed using fontconfig 2.12.6-1 when I filed this bug, but I vaguely remember that a fontconfig update didn't resolve it.
Maybe I can try installing fontconfig 2.12.6-1 from the Arch Linux Archive and see whether it reproduces.

Update logs:
➜  ~ rg fontconfig /var/log/pacman.log | rg -v lib32 | rg '(upgraded|installed)'
[2016-08-17 11:43] [ALPM] installed fontconfig (2.12.1-3)
[2016-10-22 05:12] [ALPM] reinstalled fontconfig (2.12.1-3)
[2017-01-12 19:09] [ALPM] upgraded fontconfig (2.12.1-3 -> 2.12.1-4)
[2017-06-01 08:10] [ALPM] upgraded fontconfig (2.12.1-4 -> 2.12.3-1)
[2017-07-29 13:00] [ALPM] upgraded fontconfig (2.12.3-1 -> 2.12.4-1)
[2017-09-10 01:32] [ALPM] upgraded fontconfig (2.12.4-1 -> 2.12.5-1)
[2017-09-22 09:03] [ALPM] upgraded fontconfig (2.12.5-1 -> 2.12.6-1) # <-- was here
[2017-11-17 11:07] [ALPM] upgraded fontconfig (2.12.6-1 -> 2.12.6+5+g665584a-1)
[2018-03-26 16:04] [ALPM] upgraded fontconfig (2.12.6+5+g665584a-1 -> 2.13.0+10+g58f5285-1)
➜  ~
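
For anyone cross-checking their own system, here is a small sketch that reads log lines in the format above and reports which fontconfig version was installed at a given moment. The function name is made up for illustration, and it assumes the older `[YYYY-MM-DD HH:MM]` timestamp format shown in this log and that lines appear in chronological order.

```python
import re
from datetime import datetime

# Matches lines such as:
# [2017-09-22 09:03] [ALPM] upgraded fontconfig (2.12.5-1 -> 2.12.6-1)
LINE_RE = re.compile(
    r"^\[(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2})\] \[ALPM\] "
    r"(?:installed|reinstalled|upgraded) fontconfig "
    r"\((?:\S+ -> )?(?P<new>\S+)\)"
)


def version_at(log_lines, when):
    """Return the fontconfig version in effect at `when`, or None."""
    current = None
    for line in log_lines:
        m = LINE_RE.match(line)
        if not m:
            continue  # not a fontconfig install/upgrade entry
        ts = datetime.strptime(m.group("ts"), "%Y-%m-%d %H:%M")
        if ts > when:
            break  # later entries do not affect the answer
        current = m.group("new")
    return current
```

Applied to the log above with a date of 2017-10-24 (around the reported build), this gives 2.12.6-1, which agrees with the "was here" marker.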
Flags: needinfo?(ye.jingchen)
It's sounding a lot like this was a fontconfig issue; the one thing I wonder is whether (on a system with the older fontconfig) something changed in firefox that caused it to become apparent. Was there a regression around FF58 that made this suddenly become a problem?
Flags: needinfo?(jfkthame)
On Ubuntu 18.04 with fontconfig 2.12.6, I can trigger this behavior by simply running

$ mkdir -p ~/.local/share/fonts && touch ~/.local/share/fonts

By running strace, I can see that this triggers fontconfig in the Firefox process(es) to rescan every font on my system.  This is seemingly overkill since all I am doing is updating the modified time on an empty directory!

By installing the latest fontconfig from git, the issue is resolved, which would suggest that fontconfig is indeed the culprit and that this is fixed in some update to it.
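
To illustrate why a bare `touch` could be enough, here is a sketch of a staleness check keyed purely on a directory's modification time. This is an assumption about the general mechanism, not fontconfig's actual code; `dir_looks_stale` and `demo` are hypothetical names, and the demo uses a temporary directory instead of ~/.local/share/fonts.

```python
import os
import tempfile


def dir_looks_stale(font_dir, recorded_mtime_ns):
    """Hypothetical check: a cache is stale if the directory mtime changed."""
    return os.stat(font_dir).st_mtime_ns != recorded_mtime_ns


def demo():
    with tempfile.TemporaryDirectory() as root:
        font_dir = os.path.join(root, "fonts")
        os.makedirs(font_dir, exist_ok=True)  # like `mkdir -p`
        recorded = os.stat(font_dir).st_mtime_ns
        before = dir_looks_stale(font_dir, recorded)
        # Like `touch`: move the mtime forward without adding any files.
        bumped = recorded + 1_000_000_000
        os.utime(font_dir, ns=(bumped, bumped))
        after = dir_looks_stale(font_dir, recorded)
        return before, after, os.listdir(font_dir)
```

`demo()` reports that the directory is not stale before the timestamp bump and stale after it, while the directory is still empty. Under a check like this, every process holding its own copy of the font list would redo the scan, which would fit the strace observation above.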

I can repro with those STR (though with a much smaller lag on newer fontconfig). I can take a poke with rr and try to see what's going on.

Flags: needinfo?(emilio)
See Also: → 1495900

I'm now not able to reproduce it consistently... Does this reproduce with MOZ_DISABLE_CONTENT_SANDBOX=1?

Flags: needinfo?(emilio)

MOZ_DISABLE_CONTENT_SANDBOX=1 fixed the bug.

Actually, I've narrowed it down to

user_pref("security.sandbox.content.read_path_whitelist", "/var/cache/fontconfig/,/home/lynn/.cache/fontconfig/");
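
For testing the same pref on another profile, the line can be generated for a different home directory. A minimal sketch: the helper name is hypothetical, and the `XDG_CACHE_HOME`-style override is an assumption about where the per-user fontconfig cache lives on a given system.

```python
import posixpath


def whitelist_pref(home, xdg_cache_home=None):
    """Build a user.js line listing the fontconfig cache directories."""
    # Assumption: the per-user cache sits under ~/.cache unless overridden.
    cache_root = xdg_cache_home or posixpath.join(home, ".cache")
    paths = [
        "/var/cache/fontconfig/",
        posixpath.join(cache_root, "fontconfig") + "/",
    ]
    return (
        'user_pref("security.sandbox.content.read_path_whitelist", "%s");'
        % ",".join(paths)
    )
```

Calling `whitelist_pref("/home/lynn")` reproduces the line quoted above; the result goes into user.js in the profile directory.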

Bugbug thinks this bug is a regression, but please revert this change in case of error.

Keywords: regression
Severity: normal → S3