Last Comment Bug 517045 - Perf regression in font initialization on the Mac hurts startup time
: Perf regression in font initialization on the Mac hurts startup time
Status: RESOLVED FIXED
[ts]
:
Product: Core
Classification: Components
Component: Graphics (show other bugs)
: Trunk
: x86 Mac OS X
: P1 normal (vote)
: ---
Assigned To: John Daggett (:jtd)
:
:
Mentors:
Depends on:
Blocks: 493280
  Show dependency treegraph
 
Reported: 2009-09-16 13:03 PDT by Joel Reymont (:joelr)
Modified: 2009-11-12 17:50 PST (History)
11 users (show)
mbeltzner: blocking1.9.2-
mbeltzner: wanted1.9.2+
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---
beta1-fixed


Attachments
patch, v.0.1, fix perf regression within InitSingleFaceList (1.28 KB, patch)
2009-09-27 21:35 PDT, John Daggett (:jtd)
jfkthame: review+
dietrich: approval1.9.2+
Details | Diff | Splinter Review
patch (WIP) to move font-list initialization out of the startup path (49.72 KB, patch)
2009-09-28 09:32 PDT, Jonathan Kew (:jfkthame)
no flags Details | Diff | Splinter Review

Description Joel Reymont (:joelr) 2009-09-16 13:03:16 PDT
According to bug 491283, font initialization on Windows happens 8s after startup. 

On the Mac the hidden window seems to be created during startup and font initialization is part of that. I'm wondering if she should kick this can down the road as font initialization takes 8% of startup time on my super-fast 2.94Ghz C2D MacBook Pro with 256Gb SSD.

	0.0%	8.2%	XUL	       nsCocoaWindow::Create(nsIWidget*, nsIntRect const&, nsEventStatus (*)(nsGUIEvent*), nsIDeviceContext*, nsIAppShell*, nsIToolkit*, nsWidgetInitData*)
	0.0%	8.2%	XUL	        nsCocoaWindow::StandardCreate(nsIWidget*, nsIntRect const&, nsEventStatus (*)(nsGUIEvent*), nsIDeviceContext*, nsIAppShell*, nsIToolkit*, nsWidgetInitData*, void*)
	0.0%	8.2%	XUL	         nsBaseWidget::BaseCreate(nsIWidget*, nsIntRect const&, nsEventStatus (*)(nsGUIEvent*), nsIDeviceContext*, nsIAppShell*, nsIToolkit*, nsWidgetInitData*)
	0.0%	8.2%	XUL	          CallCreateInstance(nsID const&, nsISupports*, nsID const&, void**)
	0.0%	8.2%	XUL	           nsComponentManagerImpl::CreateInstance(nsID const&, nsISupports*, nsID const&, void**)
	0.0%	8.2%	XUL	            nsFactoryEntry::GetFactory(nsIFactory**)
	0.0%	8.2%	XUL	             nsGenericModule::GetClassObject(nsIComponentManager*, nsID const&, nsID const&, void**)
	0.0%	8.2%	XUL	              nsGenericModule::Initialize(nsIComponentManager*)
	0.0%	8.2%	XUL	               gfxPlatform::Init()
	0.0%	8.1%	XUL	                gfxMacPlatformFontList::InitFontList()
	0.0%	5.5%	AppKit	                 -[NSFontManager availableMembersOfFontFamily:]
	0.0%	2.4%	XUL	                 gfxMacPlatformFontList::InitSingleFaceList()
	0.0%	0.2%	AppKit	                 -[NSFontManager availableFontFamilies]
Comment 1 John Daggett (:jtd) 2009-09-16 14:05:40 PDT
What you're listing here is font enumeration which takes place at startup.  The reading of font cmaps takes place 8s after startup on mac/windows, that's a distinct and separate event.

Joel, how many fonts are on your system?  Open FontBook, at the bottom of the window it lists "xxx fonts".  Did you run this test right after rebooting or after running/quitting FF a couple times?
Comment 2 Joel Reymont (:joelr) 2009-09-16 15:14:29 PDT
138 fonts on my system and I have a 256Gb SSD. 

I ran this test on the 2nd run after rebuilding as Taras told me that caches, etc. are rebuilt on the first run. Font initialization is 54% of the 1st run (below). Note the drastic difference between the call to the font manager above and all the heavy-duty work that's going on below.

	0.3%	54.1%	XUL	               gfxPlatform::Init()
	0.0%	53.8%	XUL	                gfxMacPlatformFontList::InitFontList()
	0.0%	40.1%	XUL	                 gfxMacPlatformFontList::InitSingleFaceList()
	0.0%	40.1%	XUL	                  gfxMacPlatformFontList::GetStandardFamilyName(nsAString_internal const&, nsAString_internal&)
	0.0%	39.8%	XUL	                   gfxPlatformFontList::FindFamily(nsAString_internal const&)
	0.0%	39.8%	XUL	                    PL_DHashTableEnumerate
	0.0%	39.8%	XUL	                     gfxPlatformFontList::InitOtherFamilyNamesProc(nsAString_internal const&, nsRefPtr<gfxFontFamily>&, void*)
	0.0%	39.8%	XUL	                      gfxFontFamily::ReadOtherFamilyNames(AddOtherFamilyNameFunctor&)
	0.0%	39.8%	XUL	                       gfxFontFamily::ReadOtherFamilyNamesForFace(AddOtherFamilyNameFunctor&, gfxFontEntry*, int)
	0.0%	39.5%	XUL	                        MacOSFontEntry::GetFontTable(unsigned int, nsTArray<unsigned char>&)
	0.0%	39.3%	ATS	                         ATSFontGetTable
	0.0%	39.3%	ATS	                          _eATSFontGetTable
	0.0%	25.9%	ATS	                           _eFOFindTableIndex
	0.0%	25.9%	ATS	                            FOFindTableInCache
	0.1%	25.8%	ATS	                             FOFindTableIndexInternal
	0.0%	19.2%	ATS	                              FTDataForkGetDirectory(privateFontObjectRecord const*)
	0.0%	19.2%	ATS	                               GetDataFontDirectory
	0.0%	19.2%	ATS	                                AssureMappedDataFork(unsigned short)
	0.0%	19.2%	ATS	                                 SendGetContainerMappingMessage
	0.0%	19.2%	ATS	                                  SendFontManagementMessageWithMessageStatus
Comment 3 John Daggett (:jtd) 2009-09-16 17:43:48 PDT
(In reply to comment #2)
> 0.3% 54.1% XUL  gfxPlatform::Init()
> 0.0% 53.8% XUL  gfxMacPlatformFontList::InitFontList()
> 0.0% 40.1% XUL  gfxMacPlatformFontList::InitSingleFaceList()

This is the first run after a *reboot*?  Or the first run after manually clearing Mozilla caches?  The reason this is important is the ATS (Apple Type Server) service caches font data, so fetching font info the first time after a reboot is much slower than the second time around, often by an order of magnitude.  Same if you run Photoshop, it spends a lot more time creating the font menu the first time after a reboot because of this.

That said, we shouldn't be spending a lot of time in InitSingleFaceList since that's typically just setting up data for a single font.  Looks like it's touching off InitOtherFamilyNamesProc which we should avoid at startup.
Comment 4 Joel Reymont (:joelr) 2009-09-17 02:02:12 PDT
This is first run after a rebuild of Firefox.  I'm not sure if this was a first run of Firefox after a reboot but I'm almost positive that it wasn't.
Comment 5 Joel Reymont (:joelr) 2009-09-18 05:34:30 PDT
Going back to my profile of 2nd run,

	0.0%	8.2%	XUL	               gfxPlatform::Init()
	0.0%	8.1%	XUL	                gfxMacPlatformFontList::InitFontList()
	0.0%	5.5%	AppKit	                 -[NSFontManager availableMembersOfFontFamily:]
	0.0%	2.4%	XUL	                 gfxMacPlatformFontList::InitSingleFaceList()
	0.0%	2.4%	XUL	                  gfxMacPlatformFontList::GetStandardFamilyName(nsAString_internal const&, nsAString_internal&)
	0.0%	2.4%	XUL	                   gfxPlatformFontList::FindFamily(nsAString_internal const&)
	0.0%	0.0%	libSystem.B.dylib	                   dyld_stub_binder_
	0.0%	0.0%	ATS	                   ATSFontFamilyFindFromQuickDrawName
	0.0%	0.2%	AppKit	                 -[NSFontManager availableFontFamilies]

Any suggestions of how this can be optimized?

Do we really need to init all the font families on startup or can we do it lazily, the first time a font is requested or the font menu is pulled up?
Comment 6 Joel Reymont (:joelr) 2009-09-22 05:46:34 PDT
Why not move font enumeration to the cmap callback 8s into startup, i.e.

http://mxr.mozilla.org/mozilla-central/source/gfx/thebes/src/gfxPlatformFontList.cpp#428

Are there any issues with doing this?
Comment 7 John Daggett (:jtd) 2009-09-22 13:56:16 PDT
(In reply to comment #6)
> Why not move font enumeration to the cmap callback 8s into startup, i.e.
> 
> http://mxr.mozilla.org/mozilla-central/source/gfx/thebes/src/gfxPlatformFontList.cpp#428
> 
> Are there any issues with doing this?

Yes, you need the family list for matching fonts (e.g. when drawing chrome or content text), this is unrelated to font menus, so you need this list before the cmap loader proc fires.

The time spent in InitSingleFaceList is probably a regression, I think we should be able to eliminate the time spent in that routine.  Beyond that we should be able to tune what is done per-family within InitFontList.

Joel, are you profiling with your own build or with Shark builds?  If your own, what's the mozconfig you're using?
Comment 8 Joel Reymont (:joelr) 2009-09-22 14:01:55 PDT
Snow Leopard, 2.93Ghz Core2Duo MacBook Pro, 256Gb SSD.

Here's the mozconfig I'm using...

export CC="gcc-4.2 -arch i386"
export CXX="g++-4.2 -arch i386"
ac_add_options --target=i386-apple-darwin8.0.0
ac_add_options --with-macos-sdk=/Developer/SDKs/MacOSX10.5.sdk
ac_add_options --enable-macos-target=10.5

export HOST_CC="gcc-4.2"
export HOST_CXX="g++-4.2"
export RANLIB=ranlib
export AR=ar
export AS=$CC
export LD=ld
export STRIP="strip -x -S"
export CROSS_COMPILE=1
export MOZ_DEBUG_SYMBOLS=1

. $topsrcdir/browser/config/mozconfig
mk_add_options MOZ_OBJDIR=@TOPSRCDIR@/objdir-release-profiling
mk_add_options MOZ_MAKE_FLAGS=-j2
ac_add_options --disable-tests
ac_add_options --disable-crashreporter
ac_add_options --enable-dtrace
ac_add_options --enable-libxul
ac_add_options --enable-optimize
ac_add_options --enable-debugger-info-modules
ac_add_options --disable-strip
ac_add_options --disable-install-strip

# mk_add_options AUTOCONF=autoconf213
Comment 9 John Daggett (:jtd) 2009-09-22 22:22:04 PDT
Timings on 10.5.8, Mac Mini 2.26GHz/4GB, trunk perf build (similar to tinderbox config).

After purge (cold start):

Total startup: 12809ms
Total time in InitFontList: 3083ms
  -- individual families: 487ms
  -- single face list: 2595ms
  -- preload names: 0ms
  -- other: 0ms

Total startup: 13175ms
Total time in InitFontList: 2751ms
  -- individual families: 444ms
  -- single face list: 2307ms
  -- preload names: 0ms
  -- other: 0ms

Subsequent runs (warm start):

Total startup: 677ms
Total time in InitFontList: 66ms
  -- individual families: 36ms
  -- single face list: 30ms
  -- preload names: 0ms
  -- other: 0ms

Total startup: 862ms
Total time in InitFontList: 70ms
  -- individual families: 38ms
  -- single face list: 32ms
  -- preload names: 0ms
  -- other: 0ms

Total startup: 674ms
Total time in InitFontList: 67ms
  -- individual families: 36ms
  -- single face list: 31ms
  -- preload names: 0ms
  -- other: 0ms

Same machine running 3.5.4

cold start: 11615ms, 10975ms, 11101ms
warm start: ~750ms

Looks like we have a roughly 2s regression in mac cold startup time on trunk.  This is likely due to changes from bug 493280.  Should be easy to eliminate time spent dealing with single face list, overall time spent initializing family list will be trickier.  Goal: cold startup InitFontList = 500ms, warm start = 40ms.
Comment 10 John Daggett (:jtd) 2009-09-22 22:34:18 PDT
Safari times on the same machine for comparison:

cold start: 10647ms, 6325ms, 6975ms
warm start: ~310ms

Measured startup time using Vlad's method:

  http://blog.vlad1.com/2009/07/28/measuring-startup/

The Safari numbers were calculated manually by dumping the time on the command line and in the startup page and taking the difference.
Comment 11 John Daggett (:jtd) 2009-09-27 21:35:57 PDT
Created attachment 403181 [details] [diff] [review]
patch, v.0.1, fix perf regression within InitSingleFaceList

As noted in comment 9, a lot of time is consumed when starting up Mac trunk builds in InitSingleFaceList.  The cause of this is the change in logic that occurred when Jonathan made his Mac font system changes in bug 493280.  As noted in bug 519149, this logic is currently broken, it fails to handle Osaka Mono correctly when running with Japanse as the default locale.  The code is currently calling GetStandardFamilyName.  Localized name initialization is handled by the deferred cmap loading process but if a font family lookup fails it gets called (otherwise we'd miss localized family name lookup at startup):

http://mxr.mozilla.org/mozilla-central/source/gfx/thebes/src/gfxPlatformFontList.cpp#349

This is why the original code in 3.5 gingerly looks up family names manually to avoid sparking unneeded name lookups to occur:

http://mxr.mozilla.org/mozilla1.9.1/source/gfx/thebes/src/gfxQuartzFontCache.mm#876

So the simple fix is just to eliminate the unneeded call to GetStandardFamilyName.  The correctness issue can be dealt with in bug 519149.

For more background on the general problems related to Osaka Mono, see bug 419370.
Comment 12 John Daggett (:jtd) 2009-09-27 21:50:19 PDT
Re-running the tests noted in comment 9 with the attached patch:

After purge (cold start):

Total startup: 10291ms
Total time in InitFontList: 639ms
  -- individual families: 478ms
  ---- method availableMembersOfFontFamily: 304ms
  -- single face list: 0ms
  -- preload names: 161ms
  -- other: 0ms

Total startup: 10394ms
Total time in InitFontList: 583ms
  -- individual families: 422ms
  ---- method availableMembersOfFontFamily: 271ms
  -- single face list: 0ms
  -- preload names: 161ms
  -- other: 0ms

Subsequent runs (warm start):

Total startup: 643ms
Total time in InitFontList: 38ms
  -- individual families: 36ms
  ---- method availableMembersOfFontFamily: 22ms
  -- single face list: 0ms
  -- preload names: 2ms
  -- other: 0ms

Total startup: 648ms
Total time in InitFontList: 38ms
  -- individual families: 36ms
  ---- method availableMembersOfFontFamily: 22ms
  -- single face list: 0ms
  -- preload names: 2ms
  -- other: 0ms

Total startup: 649ms
Total time in InitFontList: 38ms
  -- individual families: 36ms
  ---- method availableMembersOfFontFamily: 22ms
  -- single face list: 0ms
  -- preload names: 2ms
  -- other: 0ms

So the patch reduces overall cold startup time by roughly 20% and warm startup by roughly 4%.  

Note the time attributed to availableMembersOfFontFamily is the time needed to enumerate faces within a given family.  We could eliminate some of this time by enumerating faces more lazily but at a runtime cost within font matching code (i.e. a check to see whether a family has been initialized when looking up or deciding upon the particular face to use).  Jonathan may have a patch for this but this is much trickier and will definitely have tradeoffs.
Comment 13 Jonathan Kew (:jfkthame) 2009-09-28 04:40:45 PDT
Comment on attachment 403181 [details] [diff] [review]
patch, v.0.1, fix perf regression within InitSingleFaceList

Yes, this seems fine to address the startup regression, thanks.

With the addition of the patch I have posted at bug 519149, does Osaka-Mono work correctly for you, or are there further issues?
Comment 14 Jonathan Kew (:jfkthame) 2009-09-28 09:32:20 PDT
Created attachment 403260 [details] [diff] [review]
patch (WIP) to move font-list initialization out of the startup path

This patch attempts to address startup time by deferring the font-list initialization. The idea is to move the whole enumeration of platform fonts to the delayed loader; for fonts needed before the list is available, it attempts to look them up directly via Cocoa, without waiting for the full list to be built.

In most cases, the first fonts needed will be those used for chrome, along with a few common web fonts for the user's start page and/or saved tabs. If these can be found via the Cocoa APIs, without waiting to build the complete font list, we win. (In my tests, this patch cuts about 20% off overall warm-start time, as measured by dbaron's technique.)

The worst case occurs if we need the full font list immediately; this will happen if the user loads a page that requires a full system font fallback search (not just our language-based preferred fonts), or decides to open Preferences/Content immediately after launch, so that we have to build the font menu. In this case, the loader is "forced" to finish loading the font list, but the result should be no worse than our current behavior.

This patch is definitely not ready to land: for one thing, it has some bad interaction with @font-face. (For now, it disables the @font-face reftests, as otherwise it will cause some of them to fail, and others to hang.) I think I know why this happens, but have not decided how to deal with it yet. Nevertheless, if anyone wants to run some performance comparisons with this patch, it would be interesting to see the results.

There is a preference setting "gfx.lazy_fontlist_init" (defaults to true) to enable/disable the lazy font list, btw.
Comment 15 Mike Beltzner [:beltzner, not reading bugmail] 2009-09-28 11:46:55 PDT
This is not a blocker, IMO, but a very nice to have. Want to be clear; I don't think work should stop nor do I think we should abandon hope of this landing on 1.9.2 - I just think that if it comes down to deadline, we could ship without it.
Comment 16 Jonathan Kew (:jfkthame) 2009-09-28 11:59:47 PDT
Just to clarify, there are two things we need to distinguish here:
(1) John's small patch to address the regression from FF3.5, which I think we should go ahead and land on trunk ASAP, and then on branch assuming all goes well;
(2) The possible reworking of the font list to use lazy initialization. There's no way that will be ready for 1.9.2, as it still needs work to resolve some problems, and it's a significant and potentially risky internal change in behavior. So IMO, that part of the work is aimed at improving things for post-1.9.2 only.
Comment 17 Dietrich Ayala (:dietrich) 2009-09-28 12:50:07 PDT
Thanks Jonathan! Can you split out your work on #2 into a separate bug?
Comment 18 John Daggett (:jtd) 2009-09-28 17:43:26 PDT
Comment on attachment 403181 [details] [diff] [review]
patch, v.0.1, fix perf regression within InitSingleFaceList

Pushed perf regression fix to trunk
http://hg.mozilla.org/mozilla-central/rev/8243569f6b4f

Requesting approval to land on 1.9.2, this is a very minor change but reduces time spent in InitFontList by half and reduces the cold startup time by roughly 2s(!!!).
Comment 19 John Daggett (:jtd) 2009-09-28 17:47:28 PDT
(In reply to comment #13)
> With the addition of the patch I have posted at bug 519149, does Osaka-Mono
> work correctly for you, or are there further issues?

No, as noted on bug 519149, comment 6 additional work needs to be done to correctly show the Osaka Mono name in Japanese.
Comment 20 Jonathan Kew (:jfkthame) 2009-09-29 07:37:25 PDT
(In reply to comment #17)
> Thanks Jonathan! Can you split out your work on #2 into a separate bug?

Filed bug 519445 for further work on eliminating font startup time; I will also adjust the summary of this bug to reflect more accurately what's been done here already.
Comment 21 Dietrich Ayala (:dietrich) 2009-10-02 14:09:11 PDT
already landed on trunk -> FIXED.
Comment 22 John Daggett (:jtd) 2009-10-05 07:17:14 PDT
Pushed to trunk 9/29
http://hg.mozilla.org/mozilla-central/rev/8243569f6b4f
Comment 23 Dietrich Ayala (:dietrich) 2009-10-05 09:49:15 PDT
Comment on attachment 403181 [details] [diff] [review]
patch, v.0.1, fix perf regression within InitSingleFaceList

low risk, high win, had some bake time on trunk w/o regressions -> a+
Comment 24 John Daggett (:jtd) 2009-10-05 18:22:42 PDT
Pushed to 1.9.2
http://hg.mozilla.org/releases/mozilla-1.9.2/rev/cb4ef9030b12
Comment 25 Dietrich Ayala (:dietrich) 2009-11-11 18:10:45 PST
What's the risk level here for 1.9.1 consideration?
Comment 26 John Daggett (:jtd) 2009-11-11 18:17:54 PST
Jonathan did a rework of the Mac font code for 1.9.2 so we'd need to rework the patch for 1.9.1.  It's doable but I think there's still the risk of regression bugs that I didn't spot.
Comment 27 John Daggett (:jtd) 2009-11-12 17:50:29 PST
(In reply to comment #25)
> What's the risk level here for 1.9.1 consideration?

Argh, didn't look at which bug you added this comment to.  The specific bug here was a regression from 1.9.1 to 1.9.2 so there's no need for a backport to 1.9.1.x.  The issue of a how to make font initialization take less time is the subject of bug 519445.

Note You need to log in before you can comment on or make changes to this bug.