Last Comment Bug 519445 - Font initialization on the Mac is in the startup path
: Font initialization on the Mac is in the startup path
Status: RESOLVED FIXED
[ts]
:
Product: Core
Classification: Components
Component: Graphics (show other bugs)
: Trunk
: x86 Mac OS X
: -- normal (vote)
: ---
Assigned To: Jonathan Kew (:jfkthame)
:
Mentors:
Depends on:
Blocks: songbird
  Show dependency treegraph
 
Reported: 2009-09-29 07:36 PDT by Jonathan Kew (:jfkthame)
Modified: 2011-04-20 23:57 PDT (History)
10 users (show)
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---


Attachments
implement lazy initialization of the font list on Mac OS X (48.29 KB, patch)
2009-09-29 07:36 PDT, Jonathan Kew (:jfkthame)
no flags Details | Diff | Splinter Review
timing patch, dumps out various font list init timings and lookup timings (12.19 KB, patch)
2009-09-30 22:36 PDT, John Daggett (:jtd)
no flags Details | Diff | Splinter Review
updated lazy-font-list patch (48.65 KB, patch)
2009-10-01 03:17 PDT, Jonathan Kew (:jfkthame)
jd.bugzilla: review-
Details | Diff | Splinter Review
updated timing patch, lists number of font families and faces (13.14 KB, patch)
2009-10-01 20:06 PDT, John Daggett (:jtd)
no flags Details | Diff | Splinter Review
startup test page (1.07 KB, text/html)
2009-10-01 20:10 PDT, John Daggett (:jtd)
no flags Details
reworked the font list construction to defer enumeration of faces (36.87 KB, patch)
2009-10-12 16:57 PDT, Jonathan Kew (:jfkthame)
no flags Details | Diff | Splinter Review
updated following review comments (37.99 KB, patch)
2009-11-11 03:38 PST, Jonathan Kew (:jfkthame)
no flags Details | Diff | Splinter Review
patch, v4 - with fix from comment 19 (38.79 KB, patch)
2009-11-11 06:27 PST, Jonathan Kew (:jfkthame)
no flags Details | Diff | Splinter Review
lazy font list patch v5 - cache the shared NSFontManager for performance (39.76 KB, patch)
2009-11-17 07:10 PST, Jonathan Kew (:jfkthame)
jd.bugzilla: review+
Details | Diff | Splinter Review
lazy font list v6 - ensure styles are loaded in gfxFontFamily::FindFontForChar (40.39 KB, patch)
2009-11-20 01:53 PST, Jonathan Kew (:jfkthame)
jd.bugzilla: review+
Details | Diff | Splinter Review

Description Jonathan Kew (:jfkthame) 2009-09-29 07:36:05 PDT
Created attachment 403479 [details] [diff] [review]
implement lazy initialization of the font list on Mac OS X

(split out from bug 517045; the patch landed there addresses the regression from 3.5 but not the larger issue of font initialization being an expensive operation at startup)

We currently build a list of all available font families and faces on the system; this is needed to present a font menu to the user (e.g, in Preferences / Content), and also for font matching and fallback. However, if many fonts are installed, this makes a significant contribution to startup time.

To improve startup, I propose deferring the main font list creation until after the browser is up. For fonts that are needed for initial chrome and page display, before the list has been created, we can query the platform directly for the fonts we need, and in most cases proceed to display the page. In cases where the startup page requires system font fallback, we may still have to block until we have the full list, but for most users this should give a significant improvement to startup time.

This patch is a slightly updated version of what was originally posted on bug 517045 (attachment 403260 [details] [diff] [review]). It appears to pass reftests OK, and works for me in general browsing. The @font-face problems noted there turned out to be a result of having Leopard's "automatic font activation" feature turned on; this can upset some of our tests because the system "magically" enables reftest fonts even when we didn't expect them! With that option turned off, the tests pass as expected.
Comment 1 Jonathan Kew (:jfkthame) 2009-09-29 10:26:25 PDT
I've done a little timing of an optimized Minefield build with and without this patch. On my MBPro, with a fairly moderate number of fonts installed (Font Book reports that there are 157), it reduces warm-start time from 569ms to 522ms (average of 5 runs), and cold-start from approx 14.9sec to 10.7sec (average of two tests, but these vary quite a bit).

We can expect the speedup to vary depending on the number of fonts installed; those users with huge font libraries will get the most benefit, but even with just the original system fonts, there should be a significant gain.
Comment 2 John Daggett (:jtd) 2009-09-30 02:27:16 PDT
I'm still looking over the patch but I do have one comment, I think we should be very cautious about making performance assertions here.  The patch contains this code:

> +gfxFontFamily*
>  gfxPlatformFontList::FindFamily(const nsAString& aFamily)
>  {
>      nsAutoString key;
>      gfxFontFamily *familyEntry;
>      PRBool found;
>      GenerateFontListKey(aFamily, key);
>  
>      // lookup in canonical (i.e. English) family name list
> @@ -358,16 +367,26 @@ gfxPlatformFontList::FindFamily(const ns
>          return familyEntry;
>      }
>  
>      // lookup in other family names list (mostly localized names)
>      if ((familyEntry = mOtherFamilyNames.GetWeak(key, &found))) {
>          return familyEntry;
>      }
>  
> +    if (!mFontFamilyListLoaded) {
> +        // try a direct platform lookup (and add to the family list, if found)
> +        familyEntry = FindFamilyOnPlatform(aFamily);
> +        if (familyEntry)
> +            return familyEntry;
> +
> +        ForceLoadFontFamilyList();
> +        return FindFamily(aFamily);
> +    }
> +
>      // name not found and other family names not yet fully initialized so
>      // initialize the rest of the list and try again.  this is done lazily
>      // since reading name table entries is expensive
>      if (!mOtherFamilyNamesInitialized) {
>          InitOtherFamilyNames();
>          if ((familyEntry = mOtherFamilyNames.GetWeak(key, &found))) {
>              return familyEntry;
>          }

When a font family name miss occurs, the full font list is read in, not just when system fallback occurs.  Any user with a startup page that contains a Windows-only font at the beginning of a font list will see no performance difference using this method of lazy initialization.  You can't rely completely on the OS name lookup routines because they don't fully support localized name lookup (hence the code that follows related to localized name loading).

Example (from the Yahoo Japan homepage):

  font-family:"MS PGothic","Osaka",Arial,sans-serif;

There's no MS PGothic under Mac OS X, so the full font list will get pulled in for users with Yahoo Japan as their startup page.
Comment 3 Jonathan Kew (:jfkthame) 2009-09-30 04:35:02 PDT
Yes, there will certainly be users who get no benefit from this, because they immediately load a page that causes the full list to be loaded. Nevertheless, there will also be a large number (I'd guess it's the great majority) who do benefit.

A slightly more complex strategy we could take would be to treat this like @font-face loading: if the requested family isn't found, fall back to the next in the list, and then potentially refresh the page once the font list has been loaded. This introduces the potential for a disconcerting delayed font change, like what happens when a downloadable font finishes loading, but I suspect that in real-world usage the majority of cases would be like the Yahoo Japan one, where nothing would actually change after the font list is loaded, as we still wouldn't have a font called "MS PGothic".
Comment 4 John Daggett (:jtd) 2009-09-30 22:36:01 PDT
Created attachment 403964 [details] [diff] [review]
timing patch, dumps out various font list init timings and lookup timings

This will dump out a list of timings related to InitFontList and measures lookup times in FindFamily compared to NSFontManager calls.
Comment 5 Jonathan Kew (:jfkthame) 2009-10-01 03:14:52 PDT
Running with this timing patch supports the results I've been seeing with overall startup-time measurement: on my laptop, a warm start typically spends around 50-65ms in InitFontList, and on a cold start the time is easily an order of magnitude more.

ISTM that moving this out of the startup path for a large number of users -- and in particular, for those whose startup page is a simple page such as the default Google search page -- is a worthwhile win, even though there will be some users who don't get the benefit of it.

Of those who don't benefit from this, many will be users who have set a more complex, often graphic-rich startup page that loads more slowly anyway, or who have many tabs to restore; having to wait for the font list to load becomes a relatively less significant element of the overall startup experience in these cases.

We could consider experimenting with the strategy suggested in comment #3, but I think that carries a greater risk of possibly-unwanted side effects, and would be better done as a separate patch on top of this initial one, if we decide to pursue it.

I have updated the lazy-font-list patch to apply on top of the localized font name patch in bug 519149, as that is touching the same areas of code, and is needed regardless of any decision about this patch.
Comment 6 Jonathan Kew (:jfkthame) 2009-10-01 03:17:13 PDT
Created attachment 404006 [details] [diff] [review]
updated lazy-font-list patch

Note that this patch depends on attachment 403871 [details] [diff] [review] in bug 519149 being applied first.
Comment 7 John Daggett (:jtd) 2009-10-01 20:06:24 PDT
Created attachment 404175 [details] [diff] [review]
updated timing patch, lists number of font families and faces


Timings with this patch on mac mini running 10.5.  These are after several runs so this reflects warm font cache behavior:

Default 10.5 install:
Total time in InitFontList: 37ms (109 families, 228 fonts)
  -- individual families: 36ms
  ---- method availableFamilies: 13ms
  ---- face enumeration: 22ms
  ------ method availableMembersOfFontFamily: 22ms
  -- single face list: 0ms
  -- preload names: 2ms
  -- other: 0ms

With all fonts in Adobe Font Folio 11 in ~/Library/Fonts:
Total time in InitFontList: 6059ms (576 families, 2634 fonts)
  -- individual families: 6057ms
  ---- method availableFamilies: 364ms
  ---- face enumeration: 5691ms
  ------ method availableMembersOfFontFamily: 5682ms
  -- single face list: 0ms
  -- preload names: 2ms
  -- other: 0ms

FindFamily results, comparing system lookup to hashtable lookup for several popular sites (duplicates trimmed for brevity):

www.google.com/en

FindFamily: Lucida Grande - system: 10us hashtable: 1us
FindFamily: Times - system: 12us hashtable: 1us
FindFamily: arial - system: 9749us (not found) hashtable: 2us
FindFamily: Arial - system: 18us hashtable: 0us
FindFamily: Helvetica - system: 13us hashtable: 1us

www.yahoo.com

FindFamily: Lucida Grande - system: 10us hashtable: 1us
FindFamily: Times - system: 12us hashtable: 1us
FindFamily: arial - system: 11184us (not found) hashtable: 2us
FindFamily: Arial - system: 19us hashtable: 0us
FindFamily: helvetica - system: 8449us (not found) hashtable: 1us
FindFamily: Helvetica - system: 21us hashtable: 0us
FindFamily: clean - system: 8464us (not found) hashtable: 1us (not found)

www.facebook.com

FindFamily: Lucida Grande - system: 10us hashtable: 1us
FindFamily: Times - system: 11us hashtable: 1us
FindFamily: lucida grande - system: 12623us (not found) hashtable: 2us
FindFamily: Lucida Grande - system: 20us hashtable: 1us
FindFamily: tahoma - system: 8529us (not found) hashtable: 2us
FindFamily: Tahoma - system: 23us hashtable: 0us
FindFamily: verdana - system: 10675us (not found) hashtable: 2us
FindFamily: Verdana - system: 189us hashtable: 1us
FindFamily: arial - system: 9089us (not found) hashtable: 2us
FindFamily: Arial - system: 16us hashtable: 0us
FindFamily: Helvetica - system: 17us hashtable: 0us
Comment 8 John Daggett (:jtd) 2009-10-01 20:10:58 PDT
Created attachment 404177 [details]
startup test page

Startup page along the lines of Vlad's original version.  Includes an iframe to simulate the loading of an actual startpage rather than just a page with default text.

To time, run with this command line:

NO_EM_RESTART=1 ./firefox -no-remote -P profilename file:///path/to/this/startup.html#`python -c 'import time; print int(time.time() * 1000);'`

Note: replace 'profilename' and 'path/to/this' with appropriate values
Comment 9 John Daggett (:jtd) 2009-10-01 21:38:52 PDT
Comment on attachment 404006 [details] [diff] [review]
updated lazy-font-list patch

There are a number of key problems with this patch, it's not correct and there are problems with the lazy initialization strategy.  

The first correctness problem is that because several "fixup" procedures need to be delayed until after the font list is loaded, fonts that are loaded on an ad hoc basis will not be "fixed up".  This includes fonts affected by EliminateDuplicateFaces(), InitBadUnderlineList(), and the fixed pitch adjustments to Courier and Monaco.  

The second problem is in the way the deferred font list load has been set up, there's a bad interaction with ForceLoadFontFamilyList.  Font name loading is done via timer events on the main thread so if between two of those events something causes a font name lookup miss, ForceLoadFontFamilyList will fire and smash the state of the running name lookup routine and the retained mFamiliesToLoad will leak.

+void
+gfxMacPlatformFontList::ForceLoadFontFamilyList()
+{
+    if (mState == stateInitial) {
+        StartLoader(0, 0);
+    }
+    while (!RunLoader()) {
+        // RunLoader will return PR_TRUE when finished
+        ;
+    }
+    FinishLoader();
+}

On top of this, the way the font name loading code occurs in an overridden RunLoader method is really peculiar and hard to follow, it feels very hacky.  Did you view this as an interim step until a gfxWindowsFontPlatform object exists?

(In reply to comment #5)
> ISTM that moving this out of the startup path for a large number of users --
> and in particular, for those whose startup page is a simple page such as the
> default Google search page -- is a worthwhile win, even though there will be
> some users who don't get the benefit of it.

The strategy of looking up family names via [fontManager availableMembersOfFontFamily] doesn't quite work in practice, this routine does a case-sensitive lookup and these lookups need to be case-insensitive.  Font name misses are expensive with this routine, on the order of 5-10ms, which pretty much negates the savings from not initializing the font family name hashtable.  Our font group code tries to look up all names in a font list so there's a high probability of at least one font name miss, causing ForceLoadFontFamilyList to fire and negating any benefit of using this method.

A better approach would be to mimic the Windows appoach, enumerate font families at startup and lazily enumerate the faces for individual families as needed or as part of the deferred initialization of cmaps.  In fact, I really think the work here needs to be unified with the Windows code.  What's the status of gfxWindowsPlatformFontList work?

Here's what I would recommend:

- InitFontList creates a list of empty gfxFontFamily objects
- Rework AddFamilyToList as gfxMacFontFamily::EnumerateFaces
- Within FindFamily, check to see if a family is enumerated and enumerate if not
- Same for other methods that need a full font list
- gfxPlatformFontList::RunLoader calls the EnumerateFaces method
- rework the existing FindStyleVariations into EnumerateFaces
- preload families in InitFontList that need to be "fixed up" or mold the fixup routines into EnumerateFaces

This will ameliorate the amount of time spent enumerating faces without introducing regression bugs.

Longer term, I think we should revisit how system font fallback works.  Right now we look at *all* the cmaps for every character, we could probably distill this down to looking at a small set of fonts for some code ranges and hardwired fallback fonts in others but done so that the results match identically the process done today.  We could then store those results in the profile and check for font modifications at startup (not sure how hard/easy this is exactly, probably easy on Windows, harder on Mac, and potential nasty interactions with font manager software).  Then we would only need to load cmaps for fonts actually used.
Comment 10 Jonathan Kew (:jfkthame) 2009-10-05 01:22:47 PDT
(In reply to comment #9)
> The second problem is in the way the deferred font list load has been set up,
> there's a bad interaction with ForceLoadFontFamilyList.  Font name loading is
> done via timer events on the main thread so if between two of those events
> something causes a font name lookup miss, ForceLoadFontFamilyList will fire and
> smash the state of the running name lookup routine and the retained
> mFamiliesToLoad will leak.

I don't think this is correct; if ForceLoadFontFamilyList is triggered in this situation, it does not start a new lookup, it simply "accelerates" the lookup that is in progress by calling RunLoader repeatedly until the loader finishes.

> On top of this, the way the font name loading code occurs in an overridden
> RunLoader method is really peculiar and hard to follow, it feels very hacky. 
> Did you view this as an interim step until a gfxWindowsFontPlatform object
> exists?

Yes, this was a first step to see how it would perform, but consolidating so as to avoid the extra override would certainly be nicer.

> The strategy of looking up family names via [fontManager
> availableMembersOfFontFamily] doesn't quite work in practice, this routine does
> a case-sensitive lookup and these lookups need to be case-insensitive.

Yup. The case-sensitivity issue is a real problem here, and would negate the benefits in some (unknown but probably large) number of cases. :(

I'll look into reworking this along the lines suggested....

> Longer term, I think we should revisit how system font fallback works.  Right
> now we look at *all* the cmaps for every character,.....

Because we fall back to our language-based preferred fonts first, I have the impression that we don't hit the full-system-list fallback very often in practice. But to address this, I've done a really simple patch to "cache" fonts that are successfully used for fallback according to unicode block; this seems to work fine, so that after the first character in a given block has been seen, any subsequent ones will "fall back" almost instantly to the same font.

I want to get some performance figures with this, to see if I can detect a difference, but am having problems with an apparent regression on trunk that prevents it completing the long pageload test..... trying to isolate that so I can file it.
Comment 11 Jonathan Kew (:jfkthame) 2009-10-12 16:57:11 PDT
Created attachment 405952 [details] [diff] [review]
reworked the font list construction to defer enumeration of faces

New patch taking the approach suggested in comment #8. I think this is much closer to being correct now. On the Mac mini with a modest collection of fonts, I'm seeing about a 10% improvement in warm startup time with this patch.

John, I'd be interested to know whether it performs as expected with your large font collection. In principle, it seems like it ought to take 5.5s off your warm startup time with the whole Font Folio set installed.
Comment 12 Jonathan Kew (:jfkthame) 2009-10-14 03:27:38 PDT
I did a couple of cold-start timings as well (freshly rebooted Mac Mini, 125 fonts). Without the patch, I see times around 14-16 seconds; with it, 9-10 seconds. That's a pretty clear improvement! :)
Comment 13 Dietrich Ayala (:dietrich) 2009-11-03 11:08:47 PST
Mr Daggett, what's the ETA on a review here?
Comment 14 John Daggett (:jtd) 2009-11-03 12:11:37 PST
I'm on the road currently, I'm going to do testing and the review when I get back home.  ETA Friday.
Comment 15 John Daggett (:jtd) 2009-11-06 01:02:21 PST
Timings with the same setup as in comment 7

Default 10.5 install:
Total time in InitFontList: 14ms (109 families, 228 fonts)

With all fonts in Adobe Font Folio 11 in ~/Library/Fonts:
Total time in InitFontList: 335ms (576 families, 2634 fonts)

Compare with 37ms and 6059ms for current trunk code.  Big win for users with lots of fonts.
Comment 16 Jonathan Kew (:jfkthame) 2009-11-10 16:39:38 PST
(In reply to comment #15)
> Timings with the same setup as in comment 7
> 
> Default 10.5 install:
> Total time in InitFontList: 14ms (109 families, 228 fonts)
> 
> With all fonts in Adobe Font Folio 11 in ~/Library/Fonts:
> Total time in InitFontList: 335ms (576 families, 2634 fonts)
> 
> Compare with 37ms and 6059ms for current trunk code.  Big win for users with
> lots of fonts.

So my guess on the performance win (in comment #11) was pretty much on the mark. It'd be nice to get this landed on trunk, so that we can consider whether to take it for 1.9.2 as well.
Comment 17 John Daggett (:jtd) 2009-11-10 21:13:33 PST
Looks good, with the exception of the code noted below:

>      // clean up various minor 10.4 font problems for specific fonts
>      if (gfxPlatformMac::GetPlatform()->OSXVersion() < MAC_OS_X_VERSION_10_5_HEX) {
> -        // Cocoa calls report that italic faces exist for Courier and Helvetica,
> -        // even though only bold faces exist so test for this using ATS font refs (10.5 has proper faces)
> -        EliminateDuplicateFaces(NS_LITERAL_STRING("Courier"));
> -        EliminateDuplicateFaces(NS_LITERAL_STRING("Helvetica"));
> -        
>          // Cocoa reports that Courier and Monaco are not fixed-pitch fonts
>          // so explicitly tweak these settings
>          SetFixedPitch(NS_LITERAL_STRING("Courier"));
>          SetFixedPitch(NS_LITERAL_STRING("Monaco"));
>      }

These are both only used on 10.4, so rather than make EliminateDuplicateFaces a method of gfxMacFontFamily, just leave it as is and call FindStyleVariations to load the faces after looking up the family.  Same for SetFixedPitch, as structured now you will introduce a regression bug since the faces of the family are not populated at the time SetFixedPitch is called.  Once we drop support for 10.4 we can trim this code out cleanly.
Comment 18 Jonathan Kew (:jfkthame) 2009-11-11 03:38:28 PST
Created attachment 411649 [details] [diff] [review]
updated following review comments

This addresses the SetFixedPitch problem, by ensuring FindStyleVariations is called.

For EliminateDuplicateFaces, ISTM that the actual functionality belongs most properly in gfxMacFontFamily, rather than having the fontlist reach inside it and manipulate the family's list. So I have kept this, but reverted to calling it from InitFontList only for the problem families. Hope that seems OK to you.

(As you note, we can remove this completely once the decision to drop 10.4 support is considered irrevocable.)
Comment 19 John Daggett (:jtd) 2009-11-11 06:05:16 PST
Argh, I missed one.  For PreloadNamesList() to function correctly I think you need to add a call to FindStyleVariations() within gfxFontFamily::ReadOtherFamilyNames.

Also, I think you want me to review the patch, right...
Comment 20 Jonathan Kew (:jfkthame) 2009-11-11 06:27:34 PST
Created attachment 411662 [details] [diff] [review]
patch, v4 - with fix from comment 19

(In reply to comment #19)
> Argh, I missed one.  For PreloadNamesList() to function correctly I think you
> need to add a call to FindStyleVariations() within
> gfxFontFamily::ReadOtherFamilyNames.

Yes, looks like it. Added this.

> Also, I think you want me to review the patch, right...

Duh... yes please. I sure thought I put your name in there!
Comment 21 John Daggett (:jtd) 2009-11-12 00:26:22 PST
I ran a bunch of tests today to compare the performance, using the
startup page described in comment 8. This tests "effective" startup time
for different startup pages, including both initial startup and initial
page load.  The numbers below are formatted as ( startup time / initial
page load time ) in ms with a '|' character separating runs. I used the
'purge' command to simulate cold startup, that's reflected in the first
pair of numbers on each line.  I ran through the tests with and without
Adobe's entire font library, Font Folio 11, installed (see bug 519445,
comment 15).

For some reason that I don't quite understand, there appears to be a
page load time regression with the v4 code.  This needs to be tracked
down before it makes sense to land this.  Look at the numbers for
facebook startup time, the load time appears to increase with the v4
code.  For users with just default fonts installed there's effectively a
startup regression if the startup page is facebook.

Using default system fonts:

trunk sys: www.google.com/en | 31087 /  5406 |   641 /   350 |   640 /   333 |   635 /   337
lazy sys:  www.google.com/en | 11254 /  2313 |  1358 /   337 |   620 /   334 |   632 /   363 |   619 /   342

trunk sys: www.yahoo.com     |  8434 /  9109 |   638 /  1449 |   642 /  1705 |   638 /  1311
lazy sys:  www.yahoo.com     | 10111 /  8609 |   617 /  1518 |   627 /  1375 |   616 /  1751

trunk sys: www.facebook.com  |  9000 /  2595 |   638 /   605 |   637 /   737 |   640 /   599
lazy sys:  www.facebook.com  | 11311 /  2513 |   621 /   649 |   617 /   882 |   622 /   880

trunk sys: www.yahoo.co.jp   |  8494 /  6162 |   639 /   678 |   635 /   907 |   639 /   559
lazy sys:  www.yahoo.co.jp   |  9264 /  7586 |   927 /   665 |   615 /  1620 |   619 /   742

trunk sys: www.cnn.com       |  9570 /  9532 |   649 /  2055 |   649 /  2314 |   649 /  2086 |   648 /  1949
lazy sys:  www.cnn.com       |  8770 /  8832 |   669 /  2080 |   630 /  2513 |   623 /  2080 |   629 /  2034 |   623 /  2042

With font folio 11 installed:

trunk f11: www.google.com/en | 15664 /  1623 |  2472 /   372 |  1726 /   340 |  1723 /   343
lazy f11:  www.google.com/en |  7508 /  1814 |   951 /   403 |   953 /   340 |   954 /   348

trunk f11: www.yahoo.com     | 11176 / 34017 |  1799 /  2432 |  1792 /  2024 |  1790 /  2076
lazy f11:  www.yahoo.com     | 12067 / 81129 |   956 /  3235 |   952 /  2849 |  1064 /  2805

trunk f11: www.facebook.com  | 11305 /  2986 |  1726 /   758 |  1735 /   677 |  1730 /   715
lazy f11:  www.facebook.com  | 11711 /  2409 |   955 /  1014 |   959 /   945 |   953 /  1061

trunk f11: www.yahoo.co.jp   |  7750 / 30898 |  1860 /  1717 |  1731 /  1189 |  1733 /  1274
lazy f11:  www.yahoo.co.jp   | 11596 / 43261 |   956 /  2076 |   951 /  3167 |   957 /  2143

I also ran some tests with the single-page HTML5 spec.  This might be a
useful place to track down the page load time problem as it seems to
consistently load more slowly with the v4 patch, especially when lots of
fonts are installed.

HTML5 spec http://dev.w3.org/html5/spec/Overview.html

cold startup times:

trunk sys: HTML5 spec | 13351 / 18119 | 28337 / 16691 |  9111 / 19065 |  9437 / 17253 |  9914 / 17901 |  9081 / 17003
lazy sys:  HTML5 spec |  6631 / 17777 | 10364 / 18732 | 10465 / 18232 | 10502 / 18099 | 10358 / 18315
trunk f11: HTML5 spec | 11003 / 63522 | 11209 / 65383 | 11695 / 68816 | 11441 / 67743
lazy f11:  HTML5 spec | 16516 / 77336 | 13502 / 71152 | 10699 / 74628 | 11320 / 74328 | 10896 / 77017

warm startup times:

trunk sys: HTML5 spec |   650 / 10415 |   652 / 10370 |   654 / 10449 |   651 / 10404
lazy sys:  HTML5 spec |   627 / 10324 |   624 / 10394 |   820 / 10439 |   623 / 10350
trunk f11: HTML5 spec |  1742 / 13951 |  1736 / 14948 |  1739 / 15603
lazy f11:  HTML5 spec |  1093 / 20313 |   959 / 19463 |   966 / 19428 |   961 / 19430 |   964 / 19540
Comment 22 Jonathan Kew (:jfkthame) 2009-11-12 04:21:32 PST
Seems odd. I have not been able to reproduce such a page-load regression here. I installed a few hundred extra font families, and ran several dozen tests of loading the HTML5 (single-page) spec, but I see no consistent or significant difference between trunk and lazy times.

Mac mini, 342 font families installed, HTML5 (local file, networking disabled), page-load times as reported by the comment #8 page:

trunk, series of 20 warm page-loads:
10412  10181  10069  10408  10211
10385  10265  10549  10438  10559
10874  10415   9950  10651   9923
10781  10026  10687  10113  10697
avg: 10379

lazy, series of 20 warm page-loads:
10858   9051  10739   9886  10487
 9988  10320  10411  10130  10318
10427  10394  10441  10380  10509
10365  10674  10395  10539  10334
avg: 10332

Here, the patched version averages very slightly faster, though the difference is insignificant compared to the noise in the measurements. In particular, if we exclude the one noticeably faster measurement (9051) here, the average shifts to 10399, almost identical to the trunk measurement.

I also did some profiling with Shark, and as far as I can tell, the most significant factor in the timing variation seems to be GC-related activity. Comparing profiles of "faster" and "slower" page loads, the most noticeable difference is that in the slower case, I can see more CGGraphBuilder activity. So I'm guessing that the time difference relates primarily to how garbage-collection happens to fall in relation to the page load, rather than anything to do with the fonts.

I'll download some more fonts and do further testing, as I don't understand how you would be seeing a consistent regression as suggested above. Substantial variability in timings does seem to be normal.
Comment 23 Jonathan Kew (:jfkthame) 2009-11-12 05:21:21 PST
Now with 780 font families installed (not sure how many total faces that represents), I'm still not seeing any page-load regression on the HTML5 spec.

Averages of 10 timings with 780 font families:
  trunk: 10409 (range 9738-11848)
   lazy: 10356 (range 9833-11729)
Nothing significant here.
Comment 24 Jonathan Kew (:jfkthame) 2009-11-12 06:09:06 PST
Also ran tests loading Facebook, and no regression seen here. So I'm somewhat puzzled. I'm inclined to suspect that there was something odd about the test environment or procedures, and the apparent regression is not a real issue, but I don't see what else I can do to investigate this further at the moment.
Comment 25 John Daggett (:jtd) 2009-11-13 02:25:11 PST
I ran a bunch of automated tests today and they confirm that the v4 patch is a big win for simple start pages (Google search, Facebook home) but doesn't help or is slightly slower for more complex sites (Yahoo, CNN, wikipedia).  My guess is that system font fallback occurs during the load of complex sites but right now that's still just a hunch.  The analysis is here:

https://spreadsheets.google.com/ccc?key=0ArCKGq7OfNmMdGFvMTdHcDJoSm5ReXhkNVoyNVQwTVE&hl=en

I ran the tests on saved content for different sites with both default fonts plus a few local ones (sys) and after adding in Adobe's Font Folio 11 (f11).  I ran the tests on different machines using both 10.5 and 10.6:

MacBook Pro (macbookpro)
2.66 GHz Core 2 Duo
4GB memory
10.5.8

MacPro (macpro)
2 x 2.8 GHz Quad-Core Xeon
4GB memory
10.5.8

Mac Mini (mini106)
2.26 GHz Core 2 Duo
4GB memory
10.6

For each saved page, I ran the test a few times before recording six runs.  The startup time is the time until the page loads, the page load time is the difference between the startup time and the time at which onload fires.

I used these tryserver builds:

lazy font init v4: https://build.mozilla.org/tryserver-builds/jdaggett@mozilla.com-lazytest1/

trunk:
https://build.mozilla.org/tryserver-builds/jdaggett@mozilla.com-trunktest1/

The saved testpages I used (along with the builds above) are here:

http://people.mozilla.org/~jdaggett/testfiles/lazytesting.tar.gz

Instructions to run timing tests

1. Untar the file above and copy the 'lazytesting' folder to the Desktop
2. Install the two builds lazytest1 and trunktest1 into the lazytest and trunktest folders
3. Run the trunktest1 build and create a new profile 'lazytest'
4. In about:config create a boolean setting 'browser.dom.window.dump.enabled' and set it to true
5. Install the quitter extension: https://www.squarefree.com/extensions/quitter.xpi
6. Edit the script 'runtests.sh' to change the directory settings (e.g. /Users/jd)
7. Copy the runtests.sh script into the Contents/MacOS directory of the two builds
8. Change to the Contents/MacOS directory and run the script:

      ./runtests.sh trunktimings.out
     
The script will run and dump results into a file 'trunktimings.out'. 
For each page of saved content, the script opens the page and quits
several times before recording the results.

Sometimes the Quitter extension needs a nudge, when switching between
builds and using the same profile it doesn't quit the first time.
After manually quitting the first time it's fine (this has no effect on the timings recorded).
Comment 26 John Daggett (:jtd) 2009-11-13 02:42:41 PST
I think the way forward with this bug is to track down why sites like Wikipedia have great initial page load times with this patch.  My first guess would be that system fallback is occurring.  Not sure why system fallback would occur with the Yahoo home page.  

If the slowdown is due to system fallback occurring, then we should confirm that somehow the font system is initialized differently when that occurs, maybe we missed something along that path?

A more general point is that patches like this really require a lot of testing across sites, locales, machines and OS's and should be avoided for branches near release.
Comment 27 Jonathan Kew (:jfkthame) 2009-11-17 07:10:31 PST
Created attachment 412847 [details] [diff] [review]
lazy font list patch v5 - cache the shared NSFontManager for performance

As far as I can tell (modulo testing noise) this version eliminates the regression when using wikipedia.org as startup page and with 650+ font families (over 1000 faces) from FontSquirrel.org installed.

It seems that calling Cocoa's [NSFontManager sharedFontManager] within each family, when getting its faces and localized name, is a significant overhead. So this version of the patch simply caches the sharedFontManager in a static variable.
Comment 28 John Daggett (:jtd) 2009-11-19 21:55:06 PST
Comment on attachment 412847 [details] [diff] [review]
lazy font list patch v5 - cache the shared NSFontManager for performance

Ran tests with v5 patch, looks good, the startup page load times mimic more closely the times with trunk code.
Comment 29 Jonathan Kew (:jfkthame) 2009-11-20 01:53:42 PST
Created attachment 413568 [details] [diff] [review]
lazy font list v6 - ensure styles are loaded in gfxFontFamily::FindFontForChar

I found one further issue: we need to check that the faces are loaded in gfxFontFamily::FindFontForChar, otherwise it's possible for fallback to miss a potential candidate font.

This showed up (on my system - this is dependent on installed fonts) with reftest layout/bidi/bidi-005.html, which displayed hexboxes if loaded directly at startup, but showed Phoenecian characters from Code2001 when part of a full reftest run.

(As a followup, I'm intending to try a patch to improve fallback performance, but that's really a separate issue from the font list startup, it will apply whenever we hit the need for system fallback.)
Comment 30 Dietrich Ayala (:dietrich) 2009-11-24 10:52:01 PST
is this ready to land once the tree re-opens?
Comment 31 Jonathan Kew (:jfkthame) 2009-11-24 11:34:33 PST
(In reply to comment #30)
> is this ready to land once the tree re-opens?

Yes, should be. I was hoping to land it this week, but I guess it won't be till after Thanksgiving.
Comment 32 Jonathan Kew (:jfkthame) 2009-12-10 13:38:32 PST
Pushed http://hg.mozilla.org/mozilla-central/rev/92539d5c41b1
Comment 33 Dietrich Ayala (:dietrich) 2009-12-14 22:19:53 PST
From the graphs, looks like <1% change on Leopard. Not suprising given that it's only the default set of fonts, and the test loads an almost blank page.

Note You need to log in before you can comment on or make changes to this bug.