Closed Bug 1707655 Opened 4 years ago Closed 4 years ago

280ms in gfxMacPlatformFontList::ActivateFontsFromDir during startup

Categories

(Core :: Graphics: Text, defect)

All
macOS
defect

Tracking

()

RESOLVED FIXED
90 Branch
Performance Impact ?
Tracking Status
firefox90 --- fixed

People

(Reporter: mstange, Assigned: jfkthame)

References

Details

Attachments

(1 file)

Profile: https://share.firefox.dev/2PuFVG0

This is a profile of startup on macOS captured by Florian.

Jonathan, is there anything we can do to make this faster?

Flags: needinfo?(jfkthame)

Maybe.... I don't think we can avoid activating the language-support fonts, as that would break rendering for a bunch of writing systems, but perhaps we can arrange to do it on a separate thread early in the startup process. I'll try to prototype something and see if it looks feasible/useful.

Flags: needinfo?(jfkthame)

I have a patch that seems to help at least a bit in my local testing; instrumentation indicates it shaves around 30-40ms off startup time in my local build (which is opt, but not pgo, so results with a real "shippable" build may be different).

Pushed to try to check that it doesn't break all the tests :) https://treeherder.mozilla.org/jobs?repo=try&revision=4c398da46cf6a2b2f171205fa43b12baf2385db3 ... Markus, if you (or Florian) have a chance to test and see what effect this has for your testcases, that would be great.

Assigning on Jonathan to keep this tracked :)

Assignee: nobody → jfkthame
Severity: -- → S3

Florian, could you try re-profiling with the build from comment 3, to compare with your original profile and see if this helps? Thanks!

Flags: needinfo?(florian)

(In reply to Jonathan Kew (:jfkthame) from comment #5)

Thanks for working on this!

Florian, could you try re-profiling with the build from comment 3, to compare with your original profile and see if this helps? Thanks!

I applied locally the 2 patches in your try push. Here's a profile from the resulting build: https://share.firefox.dev/2Sd91dX
gfxPlatform::Init took 120ms instead of the 430ms in the profile from comment 0.

I did a second run, where this took 202ms, including 66ms waiting for PR_JoinThread: https://share.firefox.dev/3eyoDA0

It looks like the new thread created by the patch is not registered with the profiler, so I can't see what it did and when. Could you name it and register it with the profiler? (see https://phabricator.services.mozilla.com/D106495 if an example of naming/registering can be helpful)

Flags: needinfo?(florian)

Hmm. I tried to register the thread (see https://treeherder.mozilla.org/jobs?repo=try&revision=4a0253b5a60d4e3259d7e5513e1287ff887419a2), but I still don't see it showing up at all in my profile; is there something else required to get it into the profiler?

I also tried using nsIThread APIs (NS_CreateNamedThread) instead of PRThread, but still didn't see it in the profiler; but I did see that I ended up waiting substantially longer for the thread to finish (where WaitForFontRegistration calls PR_JoinThread). I take this to mean that the nsIThread version doesn't actually start running the task early enough (maybe it's waiting for the event loop?).

In theory I think we could start the task a bit earlier yet, if it weren't for the need to use the Directory Service to get the gecko directory for ActivateBundledFonts. The system's supplemental fonts -- the main issue here -- could be handled purely via Core Foundation APIs instead of relying on gecko services to iterate over the files, which would mean we can start activating them even before those services are ready for use.

There'll still be the possibility that we end up having to wait in InitFontList for the registration thread to finish its work; but the wait should be significantly shorter than the time we're currently blocking on the font registration calls.

(In reply to Jonathan Kew (:jfkthame) from comment #7)

Hmm. I tried to register the thread (see https://treeherder.mozilla.org/jobs?repo=try&revision=4a0253b5a60d4e3259d7e5513e1287ff887419a2), but I still don't see it showing up at all in my profile; is there something else required to get it into the profiler?

You will need to set the MOZ_PROFILER_STARTUP_FILTERS environment variable to have a non default set of profiled threads. MOZ_PROFILER_STARTUP=1 MOZ_PROFILER_STARTUP_FILTERS="GeckoMain,RegisterFonts" ./mach run gave me this profile: https://share.firefox.dev/3e7Lvrr

I've updated the patch to try and start the RegisterFonts thread earlier; new try job is at https://treeherder.mozilla.org/#/jobs?repo=try&revision=bdbe002e2dde07b7df0b01f033f3b94f8f8a5b25. My profiling indicates this improves things a bit further, but it'd be great if you could also test it and compare results.

(In reply to Jonathan Kew (:jfkthame) from comment #9)

I've updated the patch to try and start the RegisterFonts thread earlier; new try job is at https://treeherder.mozilla.org/#/jobs?repo=try&revision=bdbe002e2dde07b7df0b01f033f3b94f8f8a5b25. My profiling indicates this improves things a bit further, but it'd be great if you could also test it and compare results.

Note: I'm using a packaged build (mach package and then I use --packaged with mach run), MOZ_PROFILER_STARTUP=1 MOZ_PROFILER_STARTUP_FILTERS="GeckoMain,RegisterFonts" ./mach run --packaged. Packaged builds should have faster I/O for chrome:// files, so they are more likely to wait for the font thread than non-packaged local builds.

First profile with this new patch: https://share.firefox.dev/3t5uG4C (11ms waiting for PR_JoinThread).
Second profile: https://share.firefox.dev/3u7nk1K 101ms waiting for PR_JoinThread
Third profile: https://share.firefox.dev/3t12DTT only 4.1ms waiting
4th profile: https://share.firefox.dev/3eG9YTH 133ms wait
After that I stopped uploading the profiles, but I've done a total of 20 attempts.

Here are my observations:

  • I've not seen a single run where the main thread doesn't wait for the RegisterFonts thread to finish. The shortest wait was 4.1ms, the longest was 145ms. In most run it's between 50 and 100ms.
  • The RegisterFonts thread is started very early (I don't think we can start it earlier), and uses CPU for 250 to 300ms (always a much longer time than the time when the main thread is blocked on PR_JoinThread).

Conclusions:

  • the patch is a nice win!
  • (assuming there's nothing inside the RegisterFonts thread that can be further optimized), the next way to optimize this to avoid waiting for the thread would be to delay the first gfxPlatformMac::CreatePlatformFontList call on the main thread. Currently this is triggered by telemetry accessing the nsIGfxInfo service, that doesn't seem to contain font related information. When is the first time we actually need info info? How much later is that?

Thanks for the testing; that pretty much confirms my observations/expectations. I'll ask for review on this patch, as I think it's about as good as it's going to get.

As you say, the next thing to check would be whether we can defer the initial creation of the font list, but that'll be a separate patch that I think we should do in a followup bug. Not sure yet whether we can gain a significant amount there, but it's at least worth investigating.

Attachment #9218725 - Attachment description: Bug 1707655 - Move the activation of supplemental fonts on macOS to a separate thread and try to get it started earlier. → Bug 1707655 - Move the activation of supplemental fonts on macOS to a separate thread and try to get it started earlier. r=lsalzman
Pushed by jkew@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/0895df8f4a77 Move the activation of supplemental fonts on macOS to a separate thread and try to get it started earlier. r=lsalzman
Blocks: 1708768
Status: NEW → RESOLVED
Closed: 4 years ago
Resolution: --- → FIXED
Target Milestone: --- → 90 Branch
Performance Impact: --- → ?
Whiteboard: [qf]
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: