Closed
Bug 416725
Opened 16 years ago
Closed 16 years ago
font selection for numerals affected by characters of different language elsewhere on the page
Categories
(Core :: Graphics, defect, P2)
Tracking
()
RESOLVED
FIXED
mozilla1.9
People
(Reporter: karlt, Assigned: karlt)
References
Details
(Keywords: perf, regression, testcase)
Attachments
(10 files, 6 obsolete files)
475 bytes,
text/html
|
Details | |
2.08 KB,
image/png
|
Details | |
571 bytes,
text/html
|
Details | |
2.73 KB,
image/png
|
Details | |
569 bytes,
text/html
|
Details | |
156.32 KB,
image/png
|
Details | |
170.76 KB,
image/png
|
Details | |
3.60 KB,
patch
|
Details | Diff | Splinter Review | |
153.06 KB,
image/png
|
Details | |
14.83 KB,
patch
|
pavlov
:
review+
pavlov
:
superreview+
pavlov
:
approval1.9+
|
Details | Diff | Splinter Review |
In the attached testcase the two numbers are rendered with different fonts due to the Japanese characters in the page. pango_itemize selects fonts based on the script/language of the character, but for common characters (belonging to more than one script) the language/font is selected based on surrounding characters, which may be in different words. This is (debatably) helpful behavior when there is text from more than one script in the page. However, the TextRunWordCache constructs pseudo-sentences of words which have not already been cached, so the text that pango_itemize gets can be very different from the original sentence.
Assignee | ||
Comment 1•16 years ago
|
||
Assignee | ||
Comment 2•16 years ago
|
||
Suggesting P2 (even though this situation may not happen too often) because a fix for this will fix bug 416475.
Blocks: 416475
Assignee | ||
Updated•16 years ago
|
Flags: blocking1.9?
Priority: -- → P2
Assignee | ||
Comment 3•16 years ago
|
||
This can be fixed simply by ensuring that pango_itemize receives only individual words if it is TextRuns for individual words that are wanted. The "right" place to fix this would seem to be TextRunWordCache::MakeTextRun as that is what is constructing pseudo-sentences of unrelated words. Stuart and John, would you expect calling gfxFontGroup::MakeTextRun with individual words instead of a space-separated list to adversely affect performance on MS and Mac.
Assignee | ||
Comment 4•16 years ago
|
||
Rob, I see you went to quite some effort to delay generating word entries so that they could all be generated from one string. Sorry to suggest removing this. I'll do a bit of performance testing.
Assignee | ||
Comment 5•16 years ago
|
||
(fixes up placement of preprocessor conditional) single run Talos Tp Results on Linux without patch: 149.1, 149.6 with patch: 148.1, 150.0
Attachment #302703 -
Attachment is obsolete: true
Attachment #303015 -
Flags: review?(roc)
Try a page that's just a big page of Unicode (say Chinese) text? This patch affects all platforms so it might be a performance issue on some other platform. But I'm OK with the concept, we might just have to check it in and see what Talos says officially.
Maybe not Chinese. Maybe just English text with spaces?
Flags: blocking1.9? → blocking1.9+
Comment on attachment 303015 [details] [diff] [review] gfxFontGroup::MakeTextRun with individual words v1.1 Why are you pulling tempString into the loop? More efficient to leave it outside the loop and just clear it at the end of each iteration. Other than that it looks fine.
Attachment #303015 -
Flags: superreview+
Attachment #303015 -
Flags: review?(roc)
Attachment #303015 -
Flags: review+
Assignee | ||
Comment 9•16 years ago
|
||
Comment on attachment 303015 [details] [diff] [review] gfxFontGroup::MakeTextRun with individual words v1.1 checked this in http://bonsai.mozilla.org/cvsquery.cgi?module=PhoenixTinderbox&branch=HEAD&cvsroot=%2Fcvsroot&date=explicit&mindate=1203284400&maxdate=1203284830&who=karlt%2B%25karlt.net but backed out due to performance issues on MS Windows Tp 440 -> 465.
Assignee | ||
Comment 10•16 years ago
|
||
Talos machines didn't notice any difference but Tp2 did: 355 -> 375.
Assignee | ||
Comment 11•16 years ago
|
||
The word cache is also interfering with Pango's algorithm for matching fonts of paired characters. The word cache can cause the opening and closing characters to no longer balance.
Assignee | ||
Comment 12•16 years ago
|
||
This won't be fixed by ensuring that pango_itemize receives only individual words.
Assignee | ||
Comment 13•16 years ago
|
||
I think the best solution is to ensure that pango_itemize uses the same font for the first font in the fontset irrespective of the language of the characters (as was the case before changes in bug 401988).
Blocks: 401988
Assignee | ||
Comment 14•16 years ago
|
||
Implements the solution suggested in comment 13. This gives about a 5% improvement in local talos page load time. With browser.display.auto_quality_min_font_size = 0, the improvement is about 10%. The difference between auto_quality_min_font_size = 20 and = 0 drops from about 8% to 3%.
Attachment #303015 -
Attachment is obsolete: true
Attachment #304657 -
Flags: review?(roc)
Assignee | ||
Comment 15•16 years ago
|
||
For the patch to perform completely correctly we need to fix a separate issue (with similar symptoms) in that generic fonts [for GetFontAt(0)->GetPangoFont()] are being selected based on the first language used rather than the current DOM language.
Keywords: perf
Assignee | ||
Comment 16•16 years ago
|
||
The problem is that the gfxPangoFontCache keys fonts against PangoFontDescription but the selection also depends on PangoLanguage. Including the language in the key increases the number of calls to pango_context_load_font() during a talos pageset run from 2132 to 2375. This seems to make the page load metric 1% worse. The gfxPangoFontCache also has the issue that a hash is used as a key and there is no check that even the PangoFontDescription matches. I'm tempted to remove this cache and increase the expiry time on the gfxFontCache, which serves a similar purpose, and expiry seems nicer than filling up 5000 entries then clearing them all. However, even with a very large expiry time on the gfxFontCache, this increases the number of calls to pango_context_load_font() from 2375 to 2616. The performance affect from this actually seems less noticeable perhaps due to the PangoFcFontMap pattern cache, but there may be performance effect. The difference may be due to different weights in gfxFontStyles mapping to the same weight in the font description, or perhaps the double size in gfxFontStyle mapping to a fixed point size (with a scale factor of 1000). There are no PangoFontDescriptions that share the same PangoFontDescription hash.
Comment on attachment 304657 [details] [diff] [review] PangoFontMap that provides the same primary font irrespective of script Looks OK to me but behdad really should review this I'd think. +NewGfxPangoFontMap(PangoFontMap *aChildFontMap, PangoFont *aBaseFont); + +void +SetBaseFont(gfxPangoFontMap *aFontMap, PangoFont *aBaseFont); For namespace purity, should make these static methods of class gfxPangoFontMap
Attachment #304657 -
Flags: review?(roc) → superreview+
Assignee | ||
Comment 18•16 years ago
|
||
Comment on attachment 304657 [details] [diff] [review] PangoFontMap that provides the same primary font irrespective of script Would you be able to look over this, please, Behdad? Comment 0 and comment 11 explain why there is a problem, and comment 12 to 14 are relevant. Our word cache assumes that the rendering of each word doesn't depend on surrounding context. "words" here means groups of characters separated by spaces. This patch provides/causes consistent font selection for script-common characters (assuming that the base font supports the common characters). This behavior is more in line with the "UA-dependent default 'font-family'" in Step 5 of the CSS font matching algorithm, which Mozilla should follow. http://www.w3.org/TR/CSS21/fonts.html#algorithm and with the (implied) concept of a generic family being a (fixed) chosen real family. http://www.w3.org/TR/CSS21/fonts.html#generic-font-families Authors will need to use the "lang" attribute if they want to indicate a change in the generic/default font for the same character. (The shape engine selection may still be corrupted by the adjacent unrelated words. If this is a problem then we'll also need to have a separate call to pango_itemize for each word [or group of words in the same script].)
Attachment #304657 -
Flags: review?(mozilla)
Comment 19•16 years ago
|
||
(In reply to comment #14) > Created an attachment (id=304657) [details] > PangoFontMap that provides the same primary font irrespective of script > > Implements the solution suggested in comment 13. > > This gives about a 5% improvement in local talos page load time. > With browser.display.auto_quality_min_font_size = 0, > the improvement is about 10%. > > The difference between auto_quality_min_font_size = 20 and = 0 drops from about > 8% to 3%. I tried this patch. It is not good for me. Two or more fonts are applied for alphabets and numerical characters. I will attach screenshots.
Comment 20•16 years ago
|
||
This is the screen shot with current trunk build with the patch for http://hidenosuke.org/antenna/ . It is not good.
Comment 21•16 years ago
|
||
This is the screen shot with 080212(JST) build for http://hidenosuke.org/antenna/ . It is good for me.
Assignee | ||
Comment 22•16 years ago
|
||
(In reply to comment #19) > I tried this patch. > It is not good for me. Thank you for testing. I now notice that I failed to attach the necessary changes to the Makefile to compile the new files. I would have expected unresolved symbols without these. (In reply to comment #20) > This is the screen shot with current trunk build > with the patch for http://hidenosuke.org/antenna/ . I'll attach a new patch. I can't reproduce the behavior with this patch (but I do see that behavior without the patch).
Assignee | ||
Comment 23•16 years ago
|
||
Included Makefile.in changes. Also keeping Pango's behavior for system fonts, because system fonts should mimic what the system does. There wouldn't be much difference if the language of the style for the UI were the language of the selected localization (i.e. the strings). However, it seems that the language of the style for the UI comes from the locale, so, for non-localized apps, the language is not representative of the English strings, and the generic fonts for the language are different. (In reply to comment #17) > +NewGfxPangoFontMap(PangoFontMap *aChildFontMap, PangoFont *aBaseFont); > + > +void > +SetBaseFont(gfxPangoFontMap *aFontMap, PangoFont *aBaseFont); > > For namespace purity, should make these static methods of class > gfxPangoFontMap Defined struct gfxPangoFontMap in the header so that these methods can be static PangoFontMap * gfxPangoFontMap::NewFontMap(PangoFontMap *aChildFontMap, PangoFont *aBaseFont); void gfxPangoFontMap::SetBaseFont(PangoFont *aBaseFont);
Attachment #304657 -
Attachment is obsolete: true
Attachment #305296 -
Flags: review?(mozilla)
Attachment #304657 -
Flags: review?(mozilla)
Assignee | ||
Comment 24•16 years ago
|
||
This is useful for testing but not ready to check in. It hacks around the language issue mentioned in comment 15 and 16.
Assignee | ||
Comment 25•16 years ago
|
||
corrected imbalanced parentheses
Attachment #305297 -
Attachment is obsolete: true
Comment 26•16 years ago
|
||
I tried new patch(305296). This is a screen shot for the build with new patch for http://hidenosuke.org/antenna/ . This is better than previous patch. By the way, I think only Japanese fonts are used for all characters. (Please look at attachment 305161 [details].) Is this intended?
Assignee | ||
Comment 27•16 years ago
|
||
(In reply to comment #26) > By the way, I think only Japanese fonts are used for all characters. It is expected. It is not always possible to determine language from the characters, and Mozilla's word cache means only the context of the word is available, so any attempt to determine the language of the words independently is likely to result in inconsistent looking text. What this patch does is use the language specified in the document to select the (primary) font. The document at http://www.hidenosuke.org/antenna specifies lang=ja_JP and font-family:monospace. With default Mozilla preferences, this CSS generic family maps to the system/fontconfig generic pseudo-family also called "monospace". For lang=ja, fontconfig provides a Japanese font for the generic pseudo-family. > (Please look at attachment 305161 [details].) The behavior of 081202 used no language in resolving the fontconfig generic, and so Western fonts tended to be selected as primary fonts. I can see that this produced a pleasant result for this web page (and probably many others). This however is not the best behavior for Hebrew or Arabic text, and it can mean that a Chinese font may be used to render Han characters in a Japanese document. Different people will have different opinions on what the best behavior is for a mix of Latin and Japanese characters. If there were a Japanese font that provided good glyphs for both Latin and Japanese characters, then I think that would be the best font to use as the glyphs would be designed to be used together. This is particularly important for monospace text as a mix of fonts would provide a mix of widths (though I see that the Japanese characters are about twice the width of Latin characters with the ja monospace font from fontconfig here). However, some fonts designed for Western characters render Latin characters better than some (many?) Japanese fonts. It is possible to have Latin characters in Japanese documents rendered with Western fonts, either * by the author using the "lang=en" attribute on elements that contain (only!) Latin content, or * by the user setting up font preferences for the Japanese generic fonts in such a way that a Western font is tried first. The Western font will then be used for the characters that it supports but (provided the Western font doesn't support Japanese characters) a different font will be used for Japansese characters. The user can set up font preferences either in Mozilla (as was done with attachment #XXXXXXXX, and I expect that is what Bill's extension in Bug XXXXX Comment XXXXX does) or with fontconfig. Here's an example fontconfig element that could be in ~/.fonts.conf: <match target="pattern" > <test name="lang" > <string>ja</string> </test> <test name="family" > <string>sans-serif</string> </test> <edit mode="prepend" binding="strong" name="family" > <string>DejaVu Sans</string> </edit> <edit mode="prepend" binding="weak" name="family" > <string>Sazanami Gothic</string> </edit> </match> A similar element could be added for family=serif, but I wouldn't recommend doing this for monospace.
If the author specifies a Western font first in CSS 'font-family', that works too, right? A lot of pages do that. Maybe we should specify some Western fonts first in our prefs for ja?
Assignee | ||
Comment 29•16 years ago
|
||
(In reply to comment #27) > The user can set up font preferences either in Mozilla (as was done with > attachment #XXXXXXXX, and I expect that is what Bill's extension in Bug XXXXX > Comment XXXXX does) attachment 299736 [details] [diff] [review] and bug 401988 comment 59, that is!
Assignee | ||
Comment 30•16 years ago
|
||
(In reply to comment #28) > If the author specifies a Western font first in CSS 'font-family', that works > too, right? A lot of pages do that. Yes, provided the fonts exist on the platform. fontconfig <alias> elements have "weak" binding, so the language overrides the alias. > Maybe we should specify some Western fonts first in our prefs for ja? Probably, but it caused a performance hit (bug 416068) and the name-list preferences are not very visible so can be confusing. The performance hit would be less now though with attachment 302746 [details] [diff] [review] (bug 414692).
Assignee | ||
Comment 31•16 years ago
|
||
Implemented gfx_pango_fontset_get_font() and corrected some indentation. The only time a PangoContext is being passed to a PangoFontMap of different type from get_fontmap(context) is pango_font_map_load_fontset(fontmap, context,...) (where the fontmap is explicitly passed), and the only assumptions I've found that get_fontmap(context) returns a fontmap of typeof(self) are in render_layout functions. But if you'd like to have get_fontmap(context) in fontmaps always return a fontmap of typeof(self) then the fontmap can be set temporarily on the context. (I can't see a good way to copy a PangoContext.) The reasoning behind this patch is that a word is not enough context to reliably and consistently detect language. So the document/element language is used to select the primary font and the word language is used if that font doesn't support the chars. This is pretty consistent with HTML specification that implies that the lang attribute should be applied to most of the content but individual "characters that are atypical for a particular language" are still rendered with a best attempt. IIUC it is also what we do on other platforms. http://www.w3.org/TR/html401/struct/dirlang.html#h-8.1
Attachment #305296 -
Attachment is obsolete: true
Attachment #305686 -
Flags: review?(mozilla)
Attachment #305296 -
Flags: review?(mozilla)
Assignee | ||
Comment 32•16 years ago
|
||
+ if (childLevel > baseLevel || !self->mBaseFont) The !self->mBaseFont should not (need not) be here.
Assignee | ||
Comment 33•16 years ago
|
||
I recommend that this should be fixed for 1.9 so that we don't have behavior as in attachment 305160 [details] (where the date and time should be one monospace font).
Target Milestone: --- → mozilla1.9
Updated•16 years ago
|
Flags: tracking1.9+ → blocking1.9+
Karl, you should ping behdad on irc in case he's not seeing the review request.
Comment 35•16 years ago
|
||
(In reply to comment #31) Hi Karl, Waht's going on? attachment 305686 [details] [diff] [review] works fine as far as I tested.
Assignee | ||
Updated•16 years ago
|
Attachment #305686 -
Attachment description: PangoFontMap that provides the same primary font irrespective of script v1.2 → PangoFontMap that provides the same primary font from document language rather than adjacent characters v1.2
Assignee | ||
Comment 36•16 years ago
|
||
(In reply to comment #35) Thanks for testing. The patch needs review. Behdad would be the best person to review, but he is also a busy person. Behdad: Mozilla's situation is slightly different from normal Pango use for two reasons: 1. We only have one word of context available to determine the script. 2. HTML provides for specifying the language of the text, so we should use that when it makes sense, even for common characters.
Assignee | ||
Updated•16 years ago
|
Attachment #305686 -
Flags: review?(mozilla) → review?(pavlov)
Comment 37•16 years ago
|
||
Comment on attachment 305686 [details] [diff] [review] PangoFontMap that provides the same primary font from document language rather than adjacent characters v1.2 I would suggest doing thist without adding new files. Besides the first font for the language, it doesn't seem like this addresses using different fonts if the "base" font doesn't work? I have some thoughts on that, but maybe this is enough for now?
Comment 38•16 years ago
|
||
This isn't something we'd stop the release for. We'll take a good patch. Moving to wanted1.9.0.x+.
Flags: wanted1.9.0.x+
Flags: blocking1.9-
Flags: blocking1.9+
Assignee | ||
Comment 39•16 years ago
|
||
Personally I feel we should make sure this is fixed for the release, but I won't renom as I have no new information beyond comment 33.
Assignee | ||
Updated•16 years ago
|
Keywords: regression
Assignee | ||
Comment 40•16 years ago
|
||
(In reply to comment #37) > I would suggest doing this without adding new files. Moved to single file. > Besides the first font for the language, it doesn't seem like this > addresses using different fonts if the "base" font doesn't work? Yes, if the prefs for the document language specify only a fontconfig generic name, and a common character near a character from a script of different language cannot be represented using the base font for the doc language, then that different language sometimes get used to choose the font. I doubt that happens so often. This patch should cover most cases.
Attachment #305686 -
Attachment is obsolete: true
Attachment #314516 -
Flags: review?(pavlov)
Attachment #305686 -
Flags: review?(pavlov)
Updated•16 years ago
|
Attachment #314516 -
Flags: superreview+
Attachment #314516 -
Flags: review?(pavlov)
Attachment #314516 -
Flags: review+
Updated•16 years ago
|
Attachment #314516 -
Flags: approval1.9+
Assignee | ||
Comment 41•16 years ago
|
||
http://bonsai.mozilla.org/cvsquery.cgi?module=PhoenixTinderbox&branch=HEAD&cvsroot=%2Fcvsroot&date=explicit&mindate=1207724062&maxdate=1207724165&who=karlt%2B%25karlt.net
Status: ASSIGNED → RESOLVED
Closed: 16 years ago
Flags: in-testsuite?
Keywords: testcase
Resolution: --- → FIXED
Assignee | ||
Comment 42•16 years ago
|
||
This may have been responsible for a 16-17% reduction in Tp.
Comment 43•16 years ago
|
||
(In reply to comment #42) > This may have been responsible for a 16-17% reduction in Tp. > Have you confirmed this? Have a link to talos? That's a huge hit that can't go unchecked.
That's not a hit, it's an improvement :-).
Assignee | ||
Comment 45•16 years ago
|
||
It is sort of visible here but the times plotted are up to 2 hours after the build ids. http://graphs.mozilla.org/#spst=range&spss=1207712982.4914286&spse=1207734847.627143&spstart=1198095381&spend=1208548750&bpst=cursor&bpstart=1207712982.4914286&bpend=1207734847.627143&m1tid=61148&m1bl=0&m1avg=0&m2tid=61164&m2bl=0&m2avg=0&m3tid=61214&m3bl=0&m3avg=0&m4tid=146480&m4bl=0&m4avg=0 The buildids in the csv data are not so helpful either http://graphs.mozilla.org/dumpdata.cgi?setid=61148&setid=61164&setid=61214&setid=146480 "61148","qm-mini-ubuntu01","1.9","tp_loadtime_avg","0" "61148","1207722738","625.911","200804082258","None" "61148","1207727648","533.444","200804090020","None" "61164","qm-mini-ubuntu02","1.9","tp_loadtime_avg","0" "61164","1207721355","623.73","200804082234","None" "61164","1207726249","527.273","200804090007","None" "61164","1207730879","527.719","200804090119","None" "61214","qm-mini-ubuntu05","1.9","tp_loadtime_avg","0" "61214","1207725989" (Wed Apr 9 00:26:29 2008),"629.122","200804082314","None" "61214","1207730929","528.941","200804090034","None" "61214","1207735498","529.459","200804090146","None" "146480","qm-mini-ubuntu03","1.9","tp_loadtime_avg","0" "146480","1207722494","530.398","200804082150","None" "146480","1207726869","437.033","200804082258","None" "146480","1207730924","430.444","200804090020","None" As they imply a non-existent range: <200804082258 >200804082314 But analysing what was reported on Tinderbox http://tinderbox.mozilla.org/showbuilds.cgi?tree=Firefox&maxdate=1207729498&legend=0&norules=1 gives this range: http://bonsai.mozilla.org/cvsquery.cgi?module=PhoenixTinderbox&date=explicit&mindate=1207723740&maxdate=1207724819 which would indicate attachment 314516 [details] [diff] [review] (as expected).
Updated•16 years ago
|
Flags: wanted1.9.0.x+
You need to log in
before you can comment on or make changes to this bug.
Description
•