280ms in gfxMacPlatformFontList::ActivateFontsFromDir during startup
Categories
(Core :: Graphics: Text, defect)
Tracking
()
Tracking | Status | |
---|---|---|
firefox90 | --- | fixed |
People
(Reporter: mstange, Assigned: jfkthame)
References
Details
Attachments
(1 file)
Profile: https://share.firefox.dev/2PuFVG0
This is a profile of startup on macOS captured by Florian.
Jonathan, is there anything we can do to make this faster?
Assignee | ||
Comment 1•4 years ago
|
||
Maybe.... I don't think we can avoid activating the language-support fonts, as that would break rendering for a bunch of writing systems, but perhaps we can arrange to do it on a separate thread early in the startup process. I'll try to prototype something and see if it looks feasible/useful.
Assignee | ||
Comment 2•4 years ago
|
||
Assignee | ||
Comment 3•4 years ago
|
||
I have a patch that seems to help at least a bit in my local testing; instrumentation indicates it shaves around 30-40ms off startup time in my local build (which is opt, but not pgo, so results with a real "shippable" build may be different).
Pushed to try to check that it doesn't break all the tests :) https://treeherder.mozilla.org/jobs?repo=try&revision=4c398da46cf6a2b2f171205fa43b12baf2385db3 ... Markus, if you (or Florian) have a chance to test and see what effect this has for your testcases, that would be great.
Comment 4•4 years ago
|
||
Assigning on Jonathan to keep this tracked :)
Assignee | ||
Comment 5•4 years ago
|
||
Florian, could you try re-profiling with the build from comment 3, to compare with your original profile and see if this helps? Thanks!
Comment 6•4 years ago
|
||
(In reply to Jonathan Kew (:jfkthame) from comment #5)
Thanks for working on this!
Florian, could you try re-profiling with the build from comment 3, to compare with your original profile and see if this helps? Thanks!
I applied locally the 2 patches in your try push. Here's a profile from the resulting build: https://share.firefox.dev/2Sd91dX
gfxPlatform::Init took 120ms instead of the 430ms in the profile from comment 0.
I did a second run, where this took 202ms, including 66ms waiting for PR_JoinThread
: https://share.firefox.dev/3eyoDA0
It looks like the new thread created by the patch is not registered with the profiler, so I can't see what it did and when. Could you name it and register it with the profiler? (see https://phabricator.services.mozilla.com/D106495 if an example of naming/registering can be helpful)
Assignee | ||
Comment 7•4 years ago
•
|
||
Hmm. I tried to register the thread (see https://treeherder.mozilla.org/jobs?repo=try&revision=4a0253b5a60d4e3259d7e5513e1287ff887419a2), but I still don't see it showing up at all in my profile; is there something else required to get it into the profiler?
I also tried using nsIThread APIs (NS_CreateNamedThread) instead of PRThread, but still didn't see it in the profiler; but I did see that I ended up waiting substantially longer for the thread to finish (where WaitForFontRegistration calls PR_JoinThread). I take this to mean that the nsIThread version doesn't actually start running the task early enough (maybe it's waiting for the event loop?).
In theory I think we could start the task a bit earlier yet, if it weren't for the need to use the Directory Service to get the gecko directory for ActivateBundledFonts. The system's supplemental fonts -- the main issue here -- could be handled purely via Core Foundation APIs instead of relying on gecko services to iterate over the files, which would mean we can start activating them even before those services are ready for use.
There'll still be the possibility that we end up having to wait in InitFontList for the registration thread to finish its work; but the wait should be significantly shorter than the time we're currently blocking on the font registration calls.
Comment 8•4 years ago
|
||
(In reply to Jonathan Kew (:jfkthame) from comment #7)
Hmm. I tried to register the thread (see https://treeherder.mozilla.org/jobs?repo=try&revision=4a0253b5a60d4e3259d7e5513e1287ff887419a2), but I still don't see it showing up at all in my profile; is there something else required to get it into the profiler?
You will need to set the MOZ_PROFILER_STARTUP_FILTERS
environment variable to have a non default set of profiled threads. MOZ_PROFILER_STARTUP=1 MOZ_PROFILER_STARTUP_FILTERS="GeckoMain,RegisterFonts" ./mach run
gave me this profile: https://share.firefox.dev/3e7Lvrr
Assignee | ||
Comment 9•4 years ago
|
||
I've updated the patch to try and start the RegisterFonts thread earlier; new try job is at https://treeherder.mozilla.org/#/jobs?repo=try&revision=bdbe002e2dde07b7df0b01f033f3b94f8f8a5b25. My profiling indicates this improves things a bit further, but it'd be great if you could also test it and compare results.
Comment 10•4 years ago
|
||
(In reply to Jonathan Kew (:jfkthame) from comment #9)
I've updated the patch to try and start the RegisterFonts thread earlier; new try job is at https://treeherder.mozilla.org/#/jobs?repo=try&revision=bdbe002e2dde07b7df0b01f033f3b94f8f8a5b25. My profiling indicates this improves things a bit further, but it'd be great if you could also test it and compare results.
Note: I'm using a packaged build (mach package
and then I use --packaged
with mach run
), MOZ_PROFILER_STARTUP=1 MOZ_PROFILER_STARTUP_FILTERS="GeckoMain,RegisterFonts" ./mach run --packaged
. Packaged builds should have faster I/O for chrome:// files, so they are more likely to wait for the font thread than non-packaged local builds.
First profile with this new patch: https://share.firefox.dev/3t5uG4C (11ms waiting for PR_JoinThread).
Second profile: https://share.firefox.dev/3u7nk1K 101ms waiting for PR_JoinThread
Third profile: https://share.firefox.dev/3t12DTT only 4.1ms waiting
4th profile: https://share.firefox.dev/3eG9YTH 133ms wait
After that I stopped uploading the profiles, but I've done a total of 20 attempts.
Here are my observations:
- I've not seen a single run where the main thread doesn't wait for the RegisterFonts thread to finish. The shortest wait was 4.1ms, the longest was 145ms. In most run it's between 50 and 100ms.
- The RegisterFonts thread is started very early (I don't think we can start it earlier), and uses CPU for 250 to 300ms (always a much longer time than the time when the main thread is blocked on PR_JoinThread).
Conclusions:
- the patch is a nice win!
- (assuming there's nothing inside the RegisterFonts thread that can be further optimized), the next way to optimize this to avoid waiting for the thread would be to delay the first
gfxPlatformMac::CreatePlatformFontList
call on the main thread. Currently this is triggered by telemetry accessing the nsIGfxInfo service, that doesn't seem to contain font related information. When is the first time we actually need info info? How much later is that?
Assignee | ||
Comment 11•4 years ago
|
||
Thanks for the testing; that pretty much confirms my observations/expectations. I'll ask for review on this patch, as I think it's about as good as it's going to get.
As you say, the next thing to check would be whether we can defer the initial creation of the font list, but that'll be a separate patch that I think we should do in a followup bug. Not sure yet whether we can gain a significant amount there, but it's at least worth investigating.
Updated•4 years ago
|
Comment 12•4 years ago
|
||
Comment 13•4 years ago
|
||
bugherder |
Updated•3 years ago
|
Description
•