Open Bug 91190 Opened 23 years ago Updated 2 years ago

Locale font is used for Unicode, UI font pref is ignored.

Categories

(Core :: Internationalization, defect)

Tracking

()

Future

People

(Reporter: nhottanscp, Unassigned)

References

(Blocks 1 open bug)

Details

(Keywords: intl)

Attachments

(2 files)

For Unicode pages (e.g. UTF-8), the locale's language group overrides "x-unicode",
so the font of the system locale (e.g. Western on a US system) is used instead of the
Unicode font specified by the font pref.

http://lxr.mozilla.org/seamonkey/source/intl/locale/src/nsLanguageAtomService.cpp
  if (langGroup.get() == mUnicode.get()) {
    res = GetLocaleLanguageGroup(getter_AddRefs(langGroup));
    NS_ENSURE_SUCCESS(res, res);
  }

This was done to make XUL use the locale font instead of the Unicode font.
But as a result the Unicode font setting in the pref UI is ignored.
In mail/news, this causes message display to be scaled twice if the
message's charset belongs to the default locale (bug 62756).
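
A minimal sketch of the override described above, written in Python rather than the actual C++. The pref table and the function names are hypothetical and only model the behaviour; they are not the nsLanguageAtomService API.

```python
# Illustrative model only. FONT_PREFS and both functions are hypothetical.
FONT_PREFS = {
    "x-western": "Times New Roman",   # font pref for the Western group
    "x-unicode": "Arial Unicode MS",  # font pref for the Unicode group
}


def resolve_lang_group(doc_group, locale_group, override_unicode=True):
    """Return the language group that font lookup will use."""
    # This mirrors the quoted check: "x-unicode" is replaced by the
    # language group of the system locale.
    if override_unicode and doc_group == "x-unicode":
        return locale_group
    return doc_group


def font_for(doc_group, locale_group, override_unicode=True):
    """Look up the font pref for the resolved language group."""
    return FONT_PREFS[resolve_lang_group(doc_group, locale_group, override_unicode)]


# With the override, a UTF-8 page on a US system gets the Western font.
print(font_for("x-unicode", "x-western"))
# Without it, the Unicode font pref is honoured.
print(font_for("x-unicode", "x-western", override_unicode=False))
```

With the override in place the "x-unicode" entry can never be selected, which is the symptom reported here.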
Status: NEW → ASSIGNED
QA Contact: andreasb → ylong
give to shanjian
Assignee: nhotta → shanjian
Status: ASSIGNED → NEW
After I removed those lines, the browser still looks fine. So what
difference do those lines make? Naoki, can you attach a mail that shows the
problem? 
Status: NEW → ASSIGNED
CCing yokoyama since nhotta is on sabbatical.
Shanjian, the original mail problem (bug 62756) was resolved by not specifying
the font size inside libmime. So it is not reproducible with recent builds.
Please contact Frank about the original reason for overriding the Unicode font
setting with the locale font.
Blocks: 103743
future this one. 
Target Milestone: --- → Future
Keywords: intl
So... here is what happens (and why the code was there):
1. All the XUL is in UTF-8.
2. In our font engine, we take a hint from the upper layer to decide which font
to try first (especially between TC, SC, KO and JA fonts, since Unicode Han
Unification uses the same code point for some characters shared between them, but
the glyphs differ slightly between them).
3. We pass down one of two hints: (1) the LANG attribute, if it exists, mapped to
a language group; (2) if (1) does not exist, the charset, converted to a
language group.
4. So for all the XUL, since it is UTF-8 without a lang attribute, the
information we pass down is "x-unicode", which is not helpful.
5. Now we need to display the text for all the XUL. Say we have a Japanese XUL:
since we only get "x-unicode" as the langgroup and the font setting for
x-unicode may or may not be Japanese, we may pick a Simplified Chinese
font first from the fallback list, and then the following happens:
a. Japanese characters may be displayed from different fonts, mostly from a
Simplified Chinese font, some from a Korean font, and some from a Japanese font. The
inconsistent typeface style makes it look very ugly. While it is OK to
use different fonts to display characters as a fallback, it is not acceptable
in general UI.
b. Some glyphs have different variants between the CJK languages. Even with the
same typeface, we still want the correct variant for that language. So we
would rather use a Japanese Gothic than a Korean Gothic to display Japanese,
even if they come from the same foundry and the same typeface. This is especially
important for the fullwidth period and comma, whose position is very different
between Traditional Chinese and Japanese.
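
The hint selection in steps 3 to 5 can be sketched roughly as follows. This is a Python model, not the font engine's code; the mapping tables, the font names, and the fallback order are made-up examples.

```python
# Hypothetical mapping tables; the real ones live in the intl code.
LANG_TO_GROUP = {"ja": "ja", "ko": "ko", "zh-CN": "zh-CN", "zh-TW": "zh-TW"}
CHARSET_TO_GROUP = {
    "Shift_JIS": "ja",
    "EUC-KR": "ko",
    "GB2312": "zh-CN",
    "Big5": "zh-TW",
    "UTF-8": "x-unicode",
}

# Hypothetical CJK fallback list as (language group, font name) pairs.
CJK_FALLBACK = [
    ("zh-CN", "SimSun"),
    ("zh-TW", "MingLiU"),
    ("ko", "Gulim"),
    ("ja", "MS Gothic"),
]


def lang_group_hint(lang_attr, charset):
    """Step 3: prefer the LANG attribute, else derive the group from the charset."""
    if lang_attr in LANG_TO_GROUP:
        return LANG_TO_GROUP[lang_attr]
    return CHARSET_TO_GROUP.get(charset, "x-unicode")


def font_search_order(group):
    """Step 2: fonts of the hinted group are tried before the other CJK fonts."""
    hinted = [font for g, font in CJK_FALLBACK if g == group]
    others = [font for g, font in CJK_FALLBACK if g != group]
    return hinted + others


# Japanese XUL in UTF-8 with no lang attribute: the hint is useless,
# so the Simplified Chinese font happens to come first.
print(font_search_order(lang_group_hint(None, "UTF-8")))
# The same document with lang="ja": the Japanese font is tried first.
print(font_search_order(lang_group_hint("ja", "UTF-8")))
```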

So... the real problem is that we are missing the lang information from the XUL.
Maybe we should address it by attaching a global lang attribute to the XUL. 

You might not see a big difference until you try a Japanese localized build on 
Unix. That is where most of the problems appear.
Blocks: 107217
It seems like much of this problem comes from overloading the meaning of
"x-unicode".  In XUL, we use that to mean "whatever is appropriate for the
current locale", while in truly UTF-8 pages, it should really mean Unicode.

Perhaps the XUL pages should all be given their own language, such as "x-xul" or
"x-use-default-for-locale", so that "x-unicode" can be released to really mean
Unicode.

Or perhaps it's all more subtle than this....
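
As a rough Python sketch of that proposal (the function is hypothetical; "x-use-default-for-locale" is just the placeholder name suggested above): only documents that explicitly ask for the locale default get it, and "x-unicode" keeps its literal meaning.

```python
# Placeholder group name taken from the suggestion above.
LOCALE_DEFAULT = "x-use-default-for-locale"


def lang_group_for_document(declared_group, locale_group):
    """Resolve a document's language group without overloading x-unicode."""
    if declared_group == LOCALE_DEFAULT:
        # XUL chrome opts in to the locale's language group.
        return locale_group
    # Everything else, including real UTF-8 content, is left alone.
    return declared_group


print(lang_group_for_document(LOCALE_DEFAULT, "ja"))  # chrome on a Japanese locale
print(lang_group_for_document("x-unicode", "ja"))     # a UTF-8 web page
```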
this looks the same as bug 122436
Anything in the making here, guys? From ftang's analysis in comment 6 and from
what shanjian summarized in bug 122436 comment 9, the problem appears to come
from the fact that XUL documents don't specify a language.

re: comment 6:
> So... the real problem is that we are missing the lang information from the XUL.
> Maybe we should address it by attaching a global lang attribute to the XUL. 

Fortunately, there is already the xmlns:lang attribute which is a predefined
attribute in XML and which is ultimately passed down the rendering code. So just
as there are localizable .title, .etc, one could now add a configurable .lang in
the main XUL DTD as well. And then doing:

<window id="main-window"
        xmlns="http://www.mozilla.org/keymaster/gatekeeper/there.is.only.xul"
        xmlns:lang="&mainWindow.lang;" <!--- TO BE ADDED EVERYWHERE --->
        title="&mainWindow.title;"
        [...]
        >
[...]
</window>
*** Bug 122436 has been marked as a duplicate of this bug. ***
*** Bug 185523 has been marked as a duplicate of this bug. ***
they said that presContext is informed (or may be) if it is used by XUL, so in
theory it can provide different lang info for XUL and page rendering. Or?
> Fortunately, there is already the xmlns:lang attribute which is a predefined
> attribute in XML and which is ultimately passed down the rendering code.

   How is xmlns:lang related with xml:lang? bug 41978 (bug 115121) is a long
standing bug to make the font selection mechanism get 'hints' from the value of
xml:lang. 
> How is xmlns:lang related with xml:lang?

via a typo :-)
> xml:lang .... which is ultimately passed down the rendering code.

  Is this the case or not now? If so, what is keeping us from fixing bug 41978?
I think that bug was already fixed during the work done for bug 35768.
Thanks. Indeed the work done for bug 35768 fixed bug 41978 (which should have
been a dupe of bug 35768). As for this bug, it seems to be a job for a script?
Is it safe to assume
that where there's 'title=.....;' we need to add 'xml:lang=&mainWindow.lang'? 
>that where there's 'title=.....;' we need to add 'xml:lang=&mainWindow.lang'? 

Yep, and combing the diff will help to catch unwanted changes, if any. Since
xml:lang is inherited, <window xml:lang="..."> will apply to all elements --
until it is overwritten by an inner descendant.
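
To illustrate the inheritance, here is a small Python sketch using the standard xml.etree.ElementTree parser. It only models how xml:lang propagates down a tree; it says nothing about how the layout code reads it, and the sample document is made up.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:lang under the predefined XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

DOC = """<window xml:lang="ja">
  <box><label value="a"/></box>
  <box xml:lang="en"><label value="b"/></box>
</window>"""


def effective_langs(elem, inherited=None, out=None):
    """Return (tag, effective language) pairs in document order."""
    if out is None:
        out = []
    # An element's own xml:lang wins; otherwise it inherits from its parent.
    lang = elem.get(XML_LANG, inherited)
    out.append((elem.tag, lang))
    for child in elem:
        effective_langs(child, lang, out)
    return out


for tag, lang in effective_langs(ET.fromstring(DOC)):
    print(tag, lang)
```

The first box and its label inherit "ja" from the window, while the second box and its label use "en".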
Thank you for the answer.

>one could now add a configurable .lang in the main XUL DTD

It just occurred to me that we may already have this (if in a different name).

Adding Tao to CC to ask him whether we have and what it is if so.
> xml:lang is inherited, <window xml:lang="..."> will apply to all
> elements -- until it is overwritten by an inner descendant.

In addition to <window ...>, I found dialog, page and wizard have
xmlns explicitly set to "http://.....is.only.xul". They also seem
to be 'topmost' elements and need 'xml:lang'. Am I right?
As a rule of thumb, I might suggest that every top element in a .xul file that
references a configurable dtd (e.g. title="&mainWindow.title;") is a good spot
because it means that the configurable lang="&mainWindow.lang;" can be added
there as well.

If the top element doesn't reference a dtd, it usually means that it is going to
be included as a child somewhere. And so there isn't much point messing with it.
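
A script could apply this rule of thumb roughly as in the Python sketch below. It works on the raw text with regular expressions (the entities are unresolved, so a normal XML parser would reject the file), it is hypothetical and untested against the real tree, and it would be fooled by a '>' inside an attribute value.

```python
import re

START_TAG = re.compile(r"<([A-Za-z][\w:.-]*)([^<>]*)>", re.S)
ENTITY_REF = re.compile(r"&[\w.:-]+;")


def root_uses_entities(xul_text):
    """True if the top element has an attribute value like "&mainWindow.title;"."""
    # Drop comments, processing instructions and the DOCTYPE (with any
    # internal subset) so that the first start tag left is the top element.
    text = re.sub(r"<!--.*?-->", "", xul_text, flags=re.S)
    text = re.sub(r"<\?.*?\?>", "", text, flags=re.S)
    text = re.sub(r"<!DOCTYPE[^\[>]*(\[.*?\])?\s*>", "", text, flags=re.S)
    match = START_TAG.search(text)
    return bool(match and ENTITY_REF.search(match.group(2)))


CANDIDATE = """<?xml version="1.0"?>
<!DOCTYPE window SYSTEM "chrome://messenger/locale/fieldMapImport.dtd">
<window title="&mainWindow.title;"
        xmlns="http://www.mozilla.org/keymaster/gatekeeper/there.is.only.xul">
</window>"""

NOT_A_CANDIDATE = """<overlay id="example"
        xmlns="http://www.mozilla.org/keymaster/gatekeeper/there.is.only.xul">
  <box/>
</overlay>"""

print(root_uses_entities(CANDIDATE))
print(root_uses_entities(NOT_A_CANDIDATE))
```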
This looks trickier than I thought at first unless there's an easy universal way
to retrieve the value of 'lang'. What follows is only relevant if there's no easy
universal way to get 'lang'. 

Some xul files have DOCTYPE that includes 'brand.dtd' (which might be 
one of places to add '&locale/lang..' or equivalent), but the majority
of xul files have DOCTYPE like this:

<!DOCTYPE window SYSTEM "chrome://messenger/locale/fieldMapImport.dtd">

Still others have 
<!DOCTYPE window [
<!ENTITY % deviceManangerDTD SYSTEM "chrome://pippki/locale/deviceManager.dtd">
%deviceManangerDTD;
<!ENTITY % pippkiDTD SYSTEM "chrome://pippki/locale/pippki.dtd" >
%pippkiDTD;
]>

In the first case, we can just modify brand.dtd. For the third group, changing
the 'top-level' dtd works. For the second group, each individual dtd has to be
modified. This is doable with a script, but it seems too redundant to define
'lang/locale' in over 200 dtd files all over the place. L10N people wouldn't like
this much. 

As an alternative, we can create lang.dtd (or locale.dtd) and include it in DOCTYPE
declaration in xul files. (or add 'lang/locale' to brand.dtd and include it in
DOCTYPE where it's not yet).
> As an alternative, we can create lang.dtd (or locale.dtd)

Or just add the entity in |region.dtd| which is already included in the xul
files of relevance through the magic of xul overlays (utilityOverlay.xul in this
case).
> |region.dtd| which is already included in the xul files of relevance through
the magic of xul overlays 

  So, I don't have to modify DOCTYPE lines A to make them read like B, do I?  

A. 
<!DOCTYPE window SYSTEM "chrome://messenger/locale/fieldMapImport.dtd">

B.

<!DOCTYPE window [
<!ENTITY % dtd1 SYSTEM "chrome://global-region/locale/region.dtd"> %dtd1;
<!ENTITY % dtd2 SYSTEM "chrome://messenger/locale/fieldMapImport.dtd"> %dtd2;
]>
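
If the rewrite from form A to form B does turn out to be necessary, a script could do it along these lines. This is a Python sketch under the assumption that every simple DOCTYPE has exactly the shape of A; the function name is made up, and dtd1/dtd2 are just the entity names used in B.

```python
import re

# Matches form A: <!DOCTYPE root SYSTEM "url">
SIMPLE_DOCTYPE = re.compile(r'<!DOCTYPE\s+(\w+)\s+SYSTEM\s+"([^"]+)"\s*>')
REGION_DTD = "chrome://global-region/locale/region.dtd"


def add_region_dtd(xul_text):
    """Rewrite a form-A DOCTYPE into form B; leave anything else untouched."""
    def expand(match):
        root, url = match.group(1), match.group(2)
        return (
            f"<!DOCTYPE {root} [\n"
            f'<!ENTITY % dtd1 SYSTEM "{REGION_DTD}"> %dtd1;\n'
            f'<!ENTITY % dtd2 SYSTEM "{url}"> %dtd2;\n'
            "]>"
        )
    return SIMPLE_DOCTYPE.sub(expand, xul_text, count=1)


FORM_A = '<!DOCTYPE window SYSTEM "chrome://messenger/locale/fieldMapImport.dtd">'
print(add_region_dtd(FORM_A))
```

Files that already use an internal subset do not match the pattern and are returned unchanged, so they would need separate handling.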

You will need some experimentation to confirm/deny. I suspect you might have to
do it the hard way.
Not all xul/dtd 'trees' include utilityOverlay.xul or other overlay files
that refer to region.dtd (or brand.dtd). For instance, fieldMapImport.xul
includes dialogOverlay.xul, which in turn includes platformDialogOverlay.xul,
which neither includes utilityOverlay.xul nor refers to region.dtd. However, 
I'm hoping to be able to identify a few 'top-level' xul files (such as 
dialogOverlay.xul) that can cover the whole tree of xul files and add 
'region.dtd' (or brand.dtd) to them.  
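
Whether a given xul file picks up region.dtd through its overlays is a reachability question on the include graph. A minimal Python sketch, with a hand-written graph that contains only the files named in this bug (a real script would build it by scanning the tree):

```python
# Hand-written include graph: file -> overlays and dtds it references directly.
INCLUDES = {
    "fieldMapImport.xul": ["fieldMapImport.dtd", "dialogOverlay.xul"],
    "dialogOverlay.xul": ["platformDialogOverlay.xul"],
    "platformDialogOverlay.xul": [],
    "utilityOverlay.xul": ["region.dtd"],
}


def reaches(start, target, includes):
    """True if `target` is included by `start`, directly or via overlays."""
    seen = set()
    stack = [start]
    while stack:
        node = stack.pop()
        if node == target:
            return True
        if node in seen:
            continue
        seen.add(node)
        stack.extend(includes.get(node, ()))
    return False


print(reaches("fieldMapImport.xul", "region.dtd", INCLUDES))

# Adding region.dtd to one shared overlay covers every file that uses it.
patched = dict(INCLUDES)
patched["dialogOverlay.xul"] = INCLUDES["dialogOverlay.xul"] + ["region.dtd"]
print(reaches("fieldMapImport.xul", "region.dtd", patched))
```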
Below is the tally of xul files I obtained with a script:

Total : 607

1. has just <overlay .... /> : 7
2. No top element referring to a configurable dtd : 345
3. top element referencing a configurable dtd : 255
   (1) brand.dtd or region.dtd is included directly/indirectly : 63
      a. brand.dtd alone : 45 (directly 40, indirectly : 5)
      b. region dtd alone : 3 (all directly)
      c. both region/brand : 15 (directly 1, indirectly : 14)
   (2) brand.dtd/region.dtd not included : 192
      a. dialogOverlay.xul is included : 50
      b. EdDialogOverlay.xul is included : 30
      c. Others : 112 (The 'histogram' of included 
         overlay files  may cut down the number of files to be
         modified)

Group 1 and group 2 do not need xml:lang (per rbs' rule of thumb in comment #21).

Depending on where we decide to add 'lang' (brand.dtd, region.dtd, or
yet-to-be-made lang.dtd) some or all of Group 3(1) have to be given an
additional dtd reference. Regardless of where we put 'lang', files in group
3(2)c need to be given an additional dtd reference. For files in group 3(2)a and
3(2)b, we can just edit dialogOverlay.xul and EdDialogOverlay.xul. 

The total number of files that need to be 'edited' is 117(brand.dtd) or 
159(region.dtd) or 164(new lang.dtd). This number can go down (files like
dialogOverlay.xul can be identified.) Anyway, it won't be done by hand so that
164 or 117 doesn't matter much unless adding a dtd reference (it'll be very
simple) to a large number of files could have a performance issue. 
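
For anyone checking the numbers, the tally is self-consistent, and the brand.dtd and region.dtd totals can be reproduced in Python under one reading of the above: the files to edit are those in 3(1) that lack the chosen dtd, plus everything in 3(2)c, plus the two shared overlay files. That reading is an assumption. The 164 for a new lang.dtd is not rederived here.

```python
# Figures copied from the tally above.
total = 607
overlay_only = 7        # group 1
no_dtd_ref = 345        # group 2
dtd_ref = 255           # group 3

brand_alone, region_alone, both = 45, 3, 15   # group 3(1), 63 files
via_dialog, via_ed_dialog, others = 50, 30, 112  # group 3(2), 192 files

# Consistency checks on the tally itself.
assert overlay_only + no_dtd_ref + dtd_ref == total
assert (brand_alone + region_alone + both) + (via_dialog + via_ed_dialog + others) == dtd_ref

# Assumed reading: 3(2)a and 3(2)b are covered by editing the two overlay files.
shared_overlays = 2  # dialogOverlay.xul and EdDialogOverlay.xul

edits_if_brand = region_alone + others + shared_overlays
edits_if_region = brand_alone + others + shared_overlays

print(edits_if_brand, edits_if_region)
```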

So, which one is the best 'conceptually'? I'm slightly more inclined to lang.dtd
than to the others, but not by much. 
Adding a new file with just a single line (+ the wordy copyright) looks like a
waste to me.
Well, both brand.dtd and region.dtd don't have any copyright notice. They're
just 4 lines and 2 lines long. Given this, lang.dtd can be just a 1-liner if
it's deemed the best. 
Note to myself (sorry for spamming):

http://lxr.mozilla.org/seamonkey/source/modules/libpref/src/init/all.js#686

686 pref("intl.charset.detector", "chrome://navigator/locale/navigator.properties");
687 pref("intl.charset.default", "chrome://navigator-platform/locale/navigator.properties");
688 pref("intl.content.langcode", "chrome://communicator-region/locale/region.properties");
689 pref("intl.locale.matchOS", false);
assigning to myself (i should've changed it in the prev. step. sorry)
Assignee: shanjian → jshin
Status: ASSIGNED → NEW
Flags: blocking1.6a+
Flags: blocking1.6a+
*** Bug 222777 has been marked as a duplicate of this bug. ***
This bug as originally described is indeed a dupe of bug 222777. It also affects
plain text mail messages in UTF-8, with the serious consequence that such
messages are unreadable if they contain characters not in the system default
font, or badly displayed in that font. I don't quite understand where all the
proposed fixes are going, but please make sure that they do fix this problem.
Peter, a temporary workaround is to set the 'Western' font (assuming your default
locale is en-GB) to a font with the most comprehensive coverage. Fixing this bug
involves modifying over 100 files across the tree. It has to be done with a
script, but I haven't gotten around to writing one.  

> messages are unreadable if they contain characters not in the system default

  unreadable? That's odd. They should be readable although they may be rendered
in a ransom-note style (mixing several fonts with different styles). 
Thanks for the tip in comment 34.

As for "unreadable", this happens in several circumstances:

1) The selected font may claim to cover a certain Unicode block but not do so
completely. For example, Courier New claims to cover Hebrew but has no glyphs
for Hebrew accents, which are not used in modern Hebrew but are one of my
research interests. So I get empty boxes instead of the accents I want to see.

2) The glyphs in a font for a particular Unicode block may be so bad as to be
unreadable, at the selected font size. This is almost true of Courier New 10
point (which is fine for English) with the rest of Hebrew: its Hebrew base
consonants are almost unreadable and vowel point distinctions are lost because
of the screen resolution.

3) When the selected font does not cover the relevant Unicode block, the system
chooses a substitute font apparently at random. Among the fonts it sometimes
chooses on my system is one, not intended for reading text, in which Unicode
characters are represented by glyph numbers. Others might have a "Last Resort"
font which indicates only the Unicode block, not the actual character. This
renders text unreadable, of course. I would like to be able to select my own
list of substitutes, but that's a separate issue.
>I would like to be able to select my own list of substitutes

about:config

and look for: 

font.name-list

-- this is where to specify your ordered list (a-la CSS font-family) of
preferred substitutes before a further fallback to other substitutes.
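
Conceptually, such an ordered list behaves like the Python sketch below. It only models "try the selected font, then each listed substitute in order"; the coverage table and the font "Example Hebrew" are invented for illustration, and nothing here reflects the actual font engine.

```python
def parse_name_list(value):
    """Split a comma-separated, CSS font-family style list into font names."""
    return [name.strip() for name in value.split(",") if name.strip()]


def pick_font(ch, selected, name_list, coverage):
    """Return the first font, in preference order, that has a glyph for ch."""
    for font in [selected, *parse_name_list(name_list)]:
        if ch in coverage.get(font, set()):
            return font
    return None  # the engine would go on to further fallbacks


# Invented coverage: the selected font has Hebrew letters but no accents.
COVERAGE = {
    "Courier New": set("abc\u05d0"),
    "Example Hebrew": set("\u05d0\u0591"),
}

# The letter alef stays in the selected font.
print(pick_font("\u05d0", "Courier New", "Example Hebrew, Other Font", COVERAGE))
# The accent U+0591 falls through to the first substitute that has it.
print(pick_font("\u0591", "Courier New", "Example Hebrew, Other Font", COVERAGE))
```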
The code to be modified has moved to layout/base/src/nsPresContext.cpp in bug 245770
*** Bug 252386 has been marked as a duplicate of this bug. ***
*** Bug 280001 has been marked as a duplicate of this bug. ***
How can I work around this?  Is there a file I can alter to customize the plain-text UTF-8 display (in Thunderbird)?  Or perhaps a gnome setting?

I can't seem to find anything mentioned here or in the duplicates, and my poking around gnome hasn't yielded any results either.
Assignee: jshin1987 → smontagu
QA Contact: amyy → i18n
Blocks: 461734
Wow, this bug is going on 8 yrs old, damn!

I just spent an hour or two trying to change the default display font of UTF-8 emails, and was moments away from posting yet another random plea for help, when I finally found what I wanted buried in "duplicate" bug 382273: To change UTF-8 fonts in Thunderbird set fonts for "Other Languages".

Not only is that workaround "non-obvious", as the title declares, but it was also non-obvious (to me) to go look at bugs that had been marked as duplicates. Perhaps a bit off-topic, but seeing as THIS bug is linked directly from some Thunderbird beginner tutorials, I figure more people will stumble this way, so I hope this helps them.

(and best of luck to those of you working on the core issue... yuck!)
Bug 655330, which mainly concerns Firefox, has been marked as a duplicate of bug 91190.  I think this is pretty important for users viewing CJK content.

I noticed this issue yesterday when I tried to adjust the accessibility settings for my parent, who has difficulty reading web pages at their default sizes.  I am using Firefox 6.0.

The system locale on my computer is English. I set the "Western" proportional font size to 30 and left all other languages at the default of 16.

I have attached two valid Japanese XHTML 1.1 files (both UTF-8 encoded). The first specifies the meta content-language, xml:lang, and lang as "ja", while the second does not.  The first is displayed with the font-family from "Japanese" and the second with the font-family from "Western", which is OK.  But both files are displayed with font-size 30 (from "Western").  In neither case did Firefox use the settings from "Other Languages", which are the x-unicode settings in about:config.

Why would Firefox use font-family and font-size from different font setting sets?  If it picks up font-size from Western because the system locale is English, why does it use font-family from Japanese?

It is good that Firefox honors the meta content-language, xml:lang and lang attributes in choosing the font to display the document, but it should also use the font size specified for that language.
(In reply to Anson Ng from comment #50)
> Created attachment 553882 [details]
> Japanese valid XHTML 1.1 with xml:lang, lang, and meta content-language
> specified as "ja"
Valid XHTML 1.1 documents cannot have lang attributes.
(In reply to Masatoshi Kimura [:emk] from comment #52)
> (In reply to Anson Ng from comment #50)
> > Created attachment 553882 [details]
> > Japanese valid XHTML 1.1 with xml:lang, lang, and meta content-language
> > specified as "ja"
> Valid XHTML 1.1 documents cannot have lang attributes.

Only if it is served as application/xhtml+xml.  But if it is served as text/html, it should be fine, shouldn't it?  The W3C validator reports the test page as valid XHTML 1.1.
See Also: → 1142378
Blocks: 1056479
Blocks: 458497
See Also: → 1044461

The bug assignee didn't login in Bugzilla in the last 7 months, so the assignee is being reset.

Assignee: smontagu → nobody
Severity: normal → S3

The severity field for this bug is relatively low, S3. However, the bug has 13 duplicates and 19 votes.
:m_kato, could you consider increasing the bug severity?

For more information, please visit auto_nag documentation.

Flags: needinfo?(m_kato)

The last needinfo from me was triggered in error by recent activity on the bug. I'm clearing the needinfo since this is a very old bug and I don't know if it's still relevant.

Flags: needinfo?(m_kato)