Closed Bug 205536 Opened 22 years ago Closed 21 years ago

nsFontMetricsGTK::GetTextDimensions is too slow

Categories

(Core Graveyard :: GFX: Gtk, defect)

x86
Linux
defect
Not set
normal

Tracking

(Not tracked)

RESOLVED INVALID

People

(Reporter: jim_nance, Assigned: blizzard)

References

Details

(Keywords: perf, testcase)

Attachments

(2 files)

When running jprofs on loading large pages of text I found that nsFontMetricsGTK::GetTextDimensions accounts for about 5% of the runtime. It appears that most of this time is spent looking for the font that each character is to be displayed with. A simple optimization of this process is possible. I instrumented the code and found that 97% of the time the font that is going to be used to display a character is the same as the font that was used to display the prior character. By testing for this case before the general font search case, runtime of this function decreases by about 3X: Before: Total hit count: 613 Count %Total Function Name 26 4.2 nsFontMetricsGTK::GetTextDimensions(unsigned short *, unsigned 9 1.5 uMapCode 9 1.5 nsCOMPtr_base::begin_assignment(void) 9 1.5 _init 7 1.1 nsCOMPtr_base::~nsCOMPtr_base(void) 6 1.0 nsRuleNode::GetStyleData(nsStyleStructID, nsStyleContext *, int) After: Total hit count: 600 Count %Total Function Name 16 2.7 nsContainerFrame::GetFrameForPointUsing(nsIPresContext *, 15 2.5 nsCOMPtr_base::~nsCOMPtr_base(void) 12 2.0 nsStyleContext::GetStyleData(nsStyleStructID) 12 2.0 uMapCode 10 1.7 nsFrame::GetFrameForPoint(nsIPresContext *, nsPoint &, 10 1.7 chunk_free 9 1.5 nsFontMetricsGTK::GetTextDimensions(unsigned short *, unsigned
This patch checks to see if the previous font is being reused before searching for the new font.
Keywords: perf
The patch isn't quite right. Even if prevFont contains the char, it could be that prevPrevFont (i.e., one of the other fonts before that) contains the char. In which case the prevPrevFont has to be used for CSS compliance. [BTW, since this happens with GetTextDimensions/GetWidth/DrawString/etc, the whole glyph resolution can really be best factored out as an enumerator callback (e.g., GfXWin, GfxXft). And so tuning the enumerator will benefit all its clients].
That is unfortunate. This was such a simple patch with such a nice result. I guess that is why it wasnt done this way in the first place :-( Im going to think about this for a while. No ideas are jumping into my head.
You might want to take a look at ResolveForwards in GfxWin or EnumerateGlyphs in GfxXft, both of which are based on an enumerator callback approach. In particular, the one in GfxWin has an optimization for the common case where all characters are found in the same first font.
I looked at ResolveForwards (though I could only find it for gtk, not win). I believe that the code I attempted to modify had the same optimizations as this code. It attempted to find sequences of characters that were all the same font rather than processing each char individually. It also automatically handled the case of the font being the first in the list because it stopped the list search when it found the first font. One comment you made make me think something may be wrong though. You (and the ResolveForwards comment) said that the common case was for the font to be first in the list. For my testcase almost every time this function is called there are 5 fonts in the list and we are the last font. Thats why it was so much faster to check prevFont first. If it had matched the first element, there would have been no speedup. So I am wondering if something is wrong with the list. The page I was viewing is a jprof page. Its a pretty straightforward table even though it is large. What do you think?
> I looked at ResolveForwards (though I could only find it for gtk, not win) As of now, LXR shows it at: http://lxr.mozilla.org/seamonkey/source/gfx/src/windows/nsFontMetricsWin.cpp#3734 > So I am wondering if something is wrong with the > list. The page I was viewing is a jprof page. Its a pretty straightforward > table even though it is large. What do you think? Something curious seems to be going on, then. If it is plain ASCII, the first font is often enough to cover all the characters.
I built for gtk2 & xft to see if anything was different. (Is that the direction you are moving in Chris?). We now spend a great deal of time in nsFontMetricsXft::GetTextDimensions(). This does not seem to have the problem with always needing to get the last element of the list. Now we almost always get the first element which is good. This method is using the enumerators like you wanted, but it is doing callbacks 1 char at a time. I may try and code up the optimization to send all chars that use the same font at the same time.
The enumerator in Xft has been improved to pass several characters. Please use the patch in bug 176290 which is planned to be checked in any time now.
I don't think there's any benefit to doing it more than one character at a time because under the covers it's done one character at a time. If you try to do it more than one at a time, you end up spending time building up a big string and passing it in and then have it disassembled, anyway. Just take out the middle man.
Sounds suboptimal, indeed. Maybe something that Xft-core can improve upon? Although arguably the font engine glyph compositer works one char at a time anyway, and so perhaps where a gain might come is in the fact that the higher function calls overhead have been minimized.
BTW, with things shifting to Xft, just want to remind that the discovery in comment 5 that, on a mere plaintext page, nsFontMetricsGTK::GetTextDimensions is slowed down due to several fonts in mLoadedFonts[] deserves a special attention by the GTK & Xlib maintainers. It is strange and unclear why mLoadedFonts[] has several fonts on a plain text page.
> Just take out the middle man. This makes perfect sense if the (sole) reason for building up a string in the patch for bug 176290 is to enhance the performance. However, it's not as we know well :-) > where a gain might come is in the fact that the higher > function calls overhead have been minimized. I guess so. > strange and unclear why mLoadedFonts[] has several fonts on > a plain text page. I have little idea, but could it have to do with the way mLoadedFonts[] is built up collecting fonts from various sources (the user pref., locale, css, and so forth). Hmm, that doesn't explain at all why it's the last font in the array that ends up being used to render plain ASCII text. Jim, does it make any difference if you change View|Character Coding to other values? How about launching Mozilla under C locale ('LC_ALL=C mozilla') vs say, Japanese locale ('LC_ALL=ja_JP.UTF-8 mozilla)? You can also check what you have in Pref|Navigator|Languages. All these things can affect the font selection in Gfx:GTK(X11core), but no matter what, it still doesn't seem right that 5 fonts are loaded and the last one is used for plain ASCII text.
> Hmm, that doesn't explain at all why it's the last font in the array > that ends up being used to render plain ASCII text. That is actually what I mean. Why is the first font(s) so poor -- to the point that several fonts have to be loaded before reaching a font with ASCII characters. The load (there) is lazy. And so if the first font isn't too poor, the load of other fonts wouldn't kick in.
I redid the test with mLoadedFonts, and something definitly looks strange to me. I instrumented the gtk code as such: diff -u -r1.248 nsFontMetricsGTK.cpp --- gfx/src/gtk/nsFontMetricsGTK.cpp 28 Apr 2003 04:16:15 -0000 1.248 +++ gfx/src/gtk/nsFontMetricsGTK.cpp 17 May 2003 00:34:53 -0000 @@ -3915,7 +3915,8 @@ nsFontGTK** font = mLoadedFonts; nsFontGTK** end = &mLoadedFonts[mLoadedFontsCount]; - while (font < end) { + int cnt=0; + while (++cnt && (font < end)) { if (CCMAP_HAS_CHAR_EXT((*font)->mCCMap, c)) { currFont = *font; goto FoundFont; // for speed -- avoid "if" statement @@ -3925,6 +3926,8 @@ currFont = FindFont(c); FoundFont: + static FILE *fp = fopen("/tmp/look", "w"); + fprintf(fp, "%d/%d\n", cnt, mLoadedFontsCount); // XXX avoid this test by duplicating code -- erik if (prevFont) { if (currFont != prevFont) { I loaded my test page (which I will attach) and almost all the lines in the log file indicate 5 fonts with the last one being selected: tricia> sort /tmp/look | uniq -c 38 1/1 1512291 5/5 I looked at Pref|Navigator|Languages and it indicates [en-us] and [en]. Ill try with the other languages in a few minutes.
care to tell the names of those fonts?
Jim, thanks for the update. It's the current value of View|Character Coding and the locale under which Mozilla is running that are more important than Pref|Nav|Languageus. Can you make sure that View|Character Coding is set to 'Western' when you run your test? Can you also tell me under what locale you're conducting your test? Is it en_US.UTF-8? Unlike truetype fonts, a lot of X11 BDFs (for CJK) don't have ASCII coverage. For instance, fonts with XLFD jisx0208-19xx don't have any ASCII character(they do have 'double width' replica, though). Therefore, it would certainly help to print out XLFDs of mLoadedFonts[]. nsFontMetricsGFX will emit verbose messages if you activate logging by setting an environment variable whose name is escaping me at the moment.
Blocks: 203448
Keywords: testcase
I got the font names: -arabic-newspaper-medium-r-normal--32-246-100-100-p-*-iso10646-1 -daewoo-gothic-medium-r-normal--16-120-100-100-c-*-ksc5601.1987-0 -isas-fangsong ti-medium-r-normal--16-160-72-72-c-*-gb2312.1980-0 -misc-fixed-medium-r-normal--14-130-75-75-c-*-jisx0208.1983-0 -misc-fixed-medium-r-normal--13-100-100-100-c-*-iso8859-1 The fangsong font does have a space in the name. I double checked that one because it looked so strange. I do not have any LC environment variables set. Is this what you mean when you ask about the local? The character coding is ISO-8859-1
unix.js has the following settings: pref("font.name.serif.x-western", "adobe-times-iso8859-1"); pref("font.name.sans-serif.x-western", "adobe-helvetica-iso8859-1"); pref("font.name.monospace.x-western", "adobe-courier-iso8859-1"); Meaning the first font tried should have been 'courier'. gisburn, simon, any clues what is happening here? Why is that the font.name pref isn't honored?
> I do not have any LC environment variables set. Is this > what you mean when you ask about the locale? How about the output of $ env | egrep '(LC_|LANG|LING)' You may have LANG set to ll_CC.UTF-8.
Does running Mozilla this way make any difference? $ LC_ALL=C mozilla ( sh/bash/ksh) $ env LC_ALL=C mozilla ( sh/bash/ksh/tcsh/csh)
Another experiment (well, we don't have to do this because gdb or the diagnostic output would be sufficient, but I'm lazy here...) to do: Does specifying lang=en make any difference? That is, at the beginning of attachment attachment 123554 [details], you can replace '<html>' with '<html lang=en>' and see if there's any difference. I guess there will be for both cases (with LC_ALL=C and lang=en in html).
I suspect mLangGroup is unset in what follows and GFX:Gtk falls back to selecting fonts based on the current locale which happens to be ll_CC.UTF-8. Alternatively, mLangGroup may have been already set to x-unicode by the time |nsFontMetricsGTK::FindStyleSheetGenericFont| is invoked. http://lxr.mozilla.org/seamonkey/source/gfx/src/gtk/nsFontMetricsGTK.cpp#5931 5931 font = FindLangGroupPrefFont(mLangGroup, aChar); 5932 if (font) { 5933 NS_ASSERTION(font->SupportsChar(aChar), "font supposed to support this char"); 5934 return font; 5935 } http://lxr.mozilla.org/seamonkey/source/gfx/src/gtk/nsFontMetricsGTK.cpp#6009 6009 // find font based on user's locale's lang group 6010 // if different from documents locale 6011 if (gUsersLocale != mLangGroup) { 6012 FIND_FONT_PRINTF((" find font based on user's locale's lang group")); 6013 font = FindLangGroupPrefFont(gUsersLocale, aChar); 6014 if (font) { 6015 NS_ASSERTION(font->SupportsChar(aChar), "font supposed to support this char"); 6016 return font; 6017 } 6018 }
>I suspect mLangGroup is unset in what follows and GFX:Gtk falls back to >selecting fonts based on the current locale which happens to be ll_CC.UTF-8. Sounds like an incentive for GfxGTK/GfxXlib to fix bug 202921, and to share the "language detector" for vanilla documents which don't specify their language or mix characters from different languages (bug 116030, bug 184848, bug 206123).
I'm afraid my comment #23 (and a few comments before that) was off the mark. 'x-western' is used regardless of the locale if I set View|Character Coding to Western(ISO-8859-1). What I don't understand is that GFX:Gtk looks for fonts like 'Luxi Mono' even though font.*.*.x-western is set to -adobe-courier-...... Where the hell does it pick up 'Luxi Mono'? I have no idea. Is it because I compiled in Xft but ran Mozilla with 'GDK_USE_XFT=0'? That doesn't make much sense, but I couldn't think of any other 'source' of 'Luxi Mono'... I've gotta try a build without Xft...
'Luxi Mono' and others like it came from prefs.js that was written by Mozilla-Xft. Although I visited Pref|Appearance|Font|Western, I killed Mozilla-X11core during a long delay after pressing 'OK' so that new values were not saved in prefs.js. Once I made sure that prefs.js don't have any font-setting like 'Luxi Mono', the first font in the loaded font list became '*-iso8859-1'. Jim, why don't you run Mozilla-x11core with a new profile (if you don't want to delete font-setting for Mozilla-Xft)? You can use '-P' cmd. line option to create a new profile. I believe if you do, it'll be always the first font that's used for Western documents. And, the time spent in |..TextDimension| will be reduced dramatically because it doesn't have to check iso10646 fonts.
Running with a new profile fixed this problem. I am going to mark this bug as invalid. Thanks for all your help.
Status: NEW → RESOLVED
Closed: 21 years ago
Resolution: --- → INVALID
Product: Core → Core Graveyard
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: