<a class="header-button" href="https://bugzilla.mozilla.org/home" title="Go to home page"> Bugzilla

Comment 2

•

23 years ago

Shanjian, I believe we can use NS_FONT_DEBUG to find out what lang group the font code thinks the document is in and where in the font search path the font is found. Could you explain to them how to do this? Once we know what is happening we can try to determine what can/should be done. Thanks.

Assignee

Comment 3

•

23 years ago

For utf-8 and other unicode encodings, we are currently using user's locale charset to figure out the language. This is because we lack of a mechinism to specify/recognize language in xul (and probably other xml files). Before that is fixed, we could not do much to this bug.

Target Milestone: --- → Future

ji

Comment 4

•

23 years ago

This happens on Windows too. Changed the platform to ALL. On my Simplified Chinese Windows XP, UTF-8 page display is using Simplified Chinese fonts.

Hardware: PC → All

Frank Tang

Comment 5

•

23 years ago

give to shanjian

Assignee: yokoyama → shanjian

Joaquin Menchaca

Comment 6

•

23 years ago

I'm not sure I understand what's the exact problem. I have multiple languages inside inside a UTF page, and Mozilla auto-senses the region and displays the appropriate fonts for the appropriate language. Mozilla's seems to use multiple fonts in one page. URL: http://www.realmspace.com/unicode/ut/h/utf8.html

Reporter

Comment 7

•

23 years ago

Joaquin Menchaca has one example of a multilingual page that works, but one working example doesn't mean the code is correct in general. In my original report I gave quite exhaustive instructions on how to reproduce the problem. For Unicode characters not present in non-Unicode fonts, Mozilla is clearly and unambiguously doing the wrong thing. Joaquin, please surf over to <http://www.cs.berkeley.edu/~liblit/>, and look at the first word in the title: "Ben's". Do you see a vertical apostrophe, or do you see a curved single right quote? If you see a vertical apostrophe, then Mozilla is doing the wrong thing.

Comment 8

•

23 years ago

I see 2 issues here: 1) Having Unicode in the list of font language groups in the font prefs seems inappropiate. The rest of the entries are language groups (excluding "User Defined" which is there to in the hope that it will allow people to trick the browser into working for unsupported languages). I suspect that the Unicode entry is a leftover from NS 4x days when the code did not support Unicode. 2) The font system tries to avoid iso10646 fonts because it is so expensive to determine which chars they support and we do not have any good way to tell which language group they are appropiate for. It would be great if we could tell what chars are in iso10646 fonts but we cannot without doing a XLoadQueryFont (or XQueryFont) which is very expensive. When I added the TrueType support I ended up writing 4000+ lines of code to address this issue of getting the list of supported chars in a font. I was able to cache the info because I had access to the TrueType font file timestamps and could tell if the files had changed. Unfortuantely the X font API provides no way to tell if the fonts have changes so if we were to cache which chars ae X iso10646 font had we would never be able to tell if it was stale. If the info is stale we would get complaints that we did the wrong thing. Until we have a reasonable way to get the list of chars in iso10646 fonts we we either have to choose to be very inefficient when searching for glyphs (all languages) or to have less than perfect Unicode support.

Assignee

Comment 9

•

23 years ago

Let me restate the problem to make it more clear. When we choosing fonts, we use the language group information guiding our font search. This info can be provided through HTML attribute "lang". When it is missing from doc, in most cases we are able to figure it out from document's encoding, ie. charset. For unicode encodings, this approach does not work and we can only mark the document as unicode. Since all of mozilla's xul files (UI implementation) are in unicode encoding, and we don't want to use unicode font in such situation, we put a hack and current locale language replaces unicode. If you run mozilla in western locale, western language group will be use, thus western font will be selected. I do plan to fix this, but my effort is blocked by xml's incapability of handling "lang". Keeping XUL files working well is a priority. Anyway bug has been filed and I am waiting for it. For characters like ’ “ ”, Their glyphs could not be found in western font. If we choose to use asian font, glyph will be too wide. Unicode font is too expensive (as bstell suggest in his last comment) and we always try to avoid it. So the current approach is, if we cann't find them in western font, we transliterate them and using subsitute glyph found in western font to replace it. I have been thinking if we should try 10646 font or not. There is some other bugs filed against those problem and should not be the concern of this one. to brian, Current unicode language group is really misleading and practically does not work at all. It might be a good idea to eliminate it for now. But for future, I guess it might be useful in certain situation. I have no strong opinion about this issue.

Comment 10

•

23 years ago

Ben: in the css could you try adding "adobe-times-iso10646-1" to the font list?

Reporter

Comment 11

•

23 years ago

Per Brian's request, I tried adding the following CSS rule: html { font-family: adobe-times-iso10646-1 } With this change, the various curly Unicode quotes do appear as intended. I'm not sure if Brian was trying to debug things or was suggesting a workound. I wouldn't really consider this to be a viable workaround, because it has an additional unwanted side effect (selecting a Times font regardless of the user's defaults). If there were a way to specify the "iso10646-1" part without the "adobe-times" part, that might be a reasonable workaround.

Reporter

Comment 12

•

23 years ago

Is this a duplicate of bug #91190? A blocker of it? Dependent upon it? I think both reports are basically talking about the same issue.

Reporter

Comment 13

•

23 years ago

If I understand things correctly, when a character is not mapped to a specific non-Unicode language group, Mozilla falls back on the language group associated with the current locale. If the character is not actually defined in that locale's fonts, then Mozilla performs a reasonable best-effort substitution. What about adding one more stage to this logic? Before doing the best-effort substitution, check to see if that character is defined in the iso10646 font. If it is, then use it. If it's missing from there too, then fall back on the best-effort substitution. That should fix the sort of problems I'm seeing without changing the behavior of anything that was already working correctly. Can this be done in a way which is efficient relative to Brian Stell's concerns about XQueryFont() inefficiency and such?

Comment 14

•

23 years ago

this looks like a dup of bug 91190

Comment 15

•

23 years ago

> Before doing the best-effort substitution, check to see if that character is > defined in the iso10646 font. ... Can this be done in a way which is > efficient relative to Brian Stell's concerns about XQueryFont() inefficiency The problem *is* that to check if a iso10646 font has the char is very expensive. Thats why we only do it when we are desparate (such as when tranliteration fails). This is a problem with trying to use X's XLFD for iso10646 (Unicode) fonts. All other encoding (mostly) fill in all possible chars. Unicode does not. Thus we are stuck needing to get the list of chars via XLoadQueryFont (or XQueryFont). For a long time now we have talked about caching the data but without a way to check if the cached data is stale this is not safe to do. For the TrueType fonts I was able to check for stale data because I have access to the font file timestamps (if timestamp is not the same as when the data was generated then the data is stale).

Roland Mainz

Comment 16

•

23 years ago

bstell wrote: > This is a problem with trying to use X's XLFD for iso10646 (Unicode) fonts. > All other encoding (mostly) fill in all possible chars. Unicode does not. > Thus we are stuck needing to get the list of chars via XLoadQueryFont (or > XQueryFont). Actually the XLFD standard _allows_ to peek if a char is available in the font or not... For example: '-misc-fixed-medium-r-normal--0-0-0-0-c-0-iso8859-1[65 70 80_92]' tells the font source (Xserver or xfs) that the client is interested only in characters 65, 70, and 80-92. Question is whether major vendors like XFree86 implement that correctly...

Comment 17

•

23 years ago

Peeking like this implies a round trip to the X server per font which is also expensive. Perhaps for local X servers we could detect that the font info cache is stale by checking the X font path and the files on that path. If the path or the files on the path change we could update the cached font info. I have very limited time and I'm am working on TrueType printing. If someone would care to volunteer to work on caching the X font info I think I can guide them. I'd guess that it would only take about a week to get working code and another 2-3 weeks to bring it up for production grade.

Assignee

Comment 18

•

23 years ago

accept.

Status: NEW → ASSIGNED

Reporter

Comment 19

•

23 years ago

If the only problematic issue here is when to invalidate the cache, why not invalidate at Mozilla exit? I.e., cache for the lifetime of the process. Fonts don't change all *that* often, so it seems reasonable to require a quit/restart cycle to pick up changes. Or flush the cache whenever font prefs change. Anything more sophisticated, such as monitoring the font search path, is bonus work that shouldn't prevent us from getting something simple up and running that will do the job for most people in most common usage scenarios.

Comment 20

•

23 years ago

> If the only problematic issue here is when to invalidate the cache, why not > invalidate at Mozilla exit? Generating the data is extremely expensive (in the multiple minute range). Thus we cannot regenerate it every startup (unless we want a multiple minute delay on startup). Because of the huge time cost to be useful we would need to generate it once only and then only check if the data needs to be updated (as I do for the TrueType fonts).

Reporter

Comment 21

•

23 years ago

Egad. I knew it was bad, but I didn't know it was *that* bad. Thanks for the info.

Reporter

Comment 22

•

22 years ago

I just went back and revisited the cited URL (<http://www.cs.berkeley.edu/~liblit/>) using Mozilla 1.0, and the curvy quotes show up correctly. The Western font preference is still used for the majority of text on the page, but a proper Unicode font is being used for the Unicode-only characters (quotes, in this case). Is this bug now fixed? Or has it merely changed in some curious way?

Assignee

Comment 23

•

22 years ago

That is probably because of the support of freetype.

Reporter

Comment 24

•

22 years ago

No, I don't think this is because of freetype support: I'm using the prebuilt Red Hat RPM's, which supposedly do not include freetype support. Perhaps the addition of conditional freetype support affected font handling elsewhere, though, causing this change even without freetype support in my binary,

Comment 25

•

22 years ago

Actually the mozilla moz (non Redhat) has direct FreeType2 (Truetype) support and I believe the Redhat rpms have FreeType2 via Xft (there was/is a long discussion on whether Xft was/is ready for mozilla) so you might have Truetype working. You could use 'xmag' to capture/enlarge the pixels and see if they have "grey" pixels on the edges (while the direct FreeType2 code does use the Truetype embedded bitmaps if available I believe that the Xft version cannot).