unrecognized lang tag should be regarded as x-unicode instead of x-western




15 years ago
13 years ago


(Reporter: Jungshik Shin, Assigned: Jungshik Shin)




Firefox Tracking Flags

(Not tracked)



(1 attachment)



15 years ago
This bug is related to bug 91190, but is a separate issue.

Currently, unrecognized lang tags are regarded as x-western [1]. However, I think
it's better to regard them as x-unicode because most, if not all, languages
covered by ISO-8859-1/ISO-8859-15 are listed in 

When a lang tag is not listed there, it's much more likely that it's
only supported by Unicode than that it's supported by ISO 8859-1/15.  Now that
Mozilla-Win can properly display Indic scripts (and Mozilla-Xft will follow
soon: see bug 204439, bug 204286), there should be a way to designate
separate fonts for those scripts. At the moment, when a page is tagged as
'ta' (Tamil), the fonts used to render it are those set for x-western
(other fonts are searched only if the coverage of the western fonts is
insufficient). Therefore, Tamil speakers have to set the fonts for Western
in order to choose the fonts for Tamil pages (tagged with 'ta'). That's not
very intuitive.
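
To make the behaviour described above concrete, here is a minimal, self-contained sketch of "use the lang group's font first, search other fonts only when coverage is missing". It is not Mozilla's Gfx code: the Font struct, the single code point range per font, and the PickFont name are all assumptions made for illustration, and real font selection is platform-dependent.

```cpp
#include <string>
#include <vector>

// Hypothetical model of a font: a name plus one inclusive code point range
// that it covers. Real fonts cover many ranges; this is only a sketch.
struct Font {
  std::string name;
  char32_t first;
  char32_t last;
  bool Covers(char32_t ch) const { return ch >= first && ch <= last; }
};

// Try the font configured for the page's lang group first. Only when it
// lacks the character are the remaining fonts searched, in order.
// Returns nullptr if no font covers the character.
const Font* PickFont(const Font& groupFont,
                     const std::vector<Font>& others,
                     char32_t ch) {
  if (groupFont.Covers(ch)) return &groupFont;
  for (const Font& f : others) {
    if (f.Covers(ch)) return &f;
  }
  return nullptr;
}
```

With a page tagged 'ta' resolved to x-western, groupFont is whatever the user set for Western, so a Tamil font is only reached through the search over the other fonts, and Latin text on the same page always comes from the Western font.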

There are three ways to solve this problem.

  1. Add a whole bunch of lang tags to langGroups.properties mapping 
     them to x-unicode lang group. This is not such a good idea because
     eventually (#3 below) we have to map them to their own langgroup.

  2. Fix 
     to assign x-unicode to unrecognized lang tags instead of x-western.
     It has to be checked whether this has any unexpected result somewhere.

  3. Add new lang tags to pref-fonts.xul and pref-fonts.dtd [2] as support for
     them is added. This is a bit tricky because the list of supported scripts
     is platform-dependent. On Win2k/XP, virtually all scripts supported by
     are also supported by Mozilla. On other platforms, Mozilla is not quite
     there yet.

I think we have to push forward #2 and #3 in parallel.   

[0] IMHO, 'script(name/group)' would have been a better choice than 'langgroup'.
'x-dv' would be a scriptgroup not just for Hindi(hi) but also for other
languages written with Devanagari just as 'x-western'(latin1), 'x-baltic',
'x-centraleuro', and 'x-cyrillic' are for a number of European languages.   


      res = mLangGroups->GetStringFromName(lowered.get(),
      if (NS_FAILED(res)) {
        PRInt32 hyphen = lowered.FindChar('-');
        if (hyphen >= 0) {
          nsAutoString truncated(lowered);
          truncated.Truncate(hyphen);
          res = mLangGroups->GetStringFromName(truncated.get(),
          if (NS_FAILED(res)) {
            langGroupStr.Assign(NS_LITERAL_STRING("x-western"));
          }
        } else {
          langGroupStr.Assign(NS_LITERAL_STRING("x-western"));
        }
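
For readers without the tree at hand, the lookup order in the excerpt above (full tag, then the part before the first hyphen, then a default) can be sketched in standalone C++. This is only an illustration under assumptions: LangGroupTable and ResolveLangGroup are hypothetical names, a plain std::unordered_map stands in for the string bundle loaded from langGroups.properties, and the fallback is passed as a parameter so that the current default (x-western) and the proposed one (x-unicode) can be compared.

```cpp
#include <string>
#include <unordered_map>

// Hypothetical stand-in for the table loaded from langGroups.properties.
using LangGroupTable = std::unordered_map<std::string, std::string>;

// Resolve an already lowercased lang tag to a lang group:
//   1. look up the full tag;
//   2. if that fails and the tag contains a hyphen, look up the part
//      before the first hyphen;
//   3. otherwise return the fallback group.
std::string ResolveLangGroup(const LangGroupTable& table,
                             const std::string& loweredTag,
                             const std::string& fallback) {
  auto it = table.find(loweredTag);
  if (it != table.end()) return it->second;

  const std::string::size_type hyphen = loweredTag.find('-');
  if (hyphen != std::string::npos) {
    it = table.find(loweredTag.substr(0, hyphen));
    if (it != table.end()) return it->second;
  }
  return fallback;
}
```

Given a table that contains "en" but not "ta" (illustrative entries only), ResolveLangGroup(table, "en-us", "x-unicode") still yields the group listed for "en", while "ta" and "ta-in" yield whatever fallback is passed in. Option 2 amounts to changing that one argument from "x-western" to "x-unicode".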

Comment 1

15 years ago
Created attachment 122848
a patch

I added several European and African languages supported by the x-western
langgroup to the langGroups.properties file. I also changed the default
langGroup for unrecognized lang tags to x-unicode.
As I mentioned earlier, this will enable speakers of languages with
unrecognized lang tags to choose the fonts used for lang-tagged web pages
in UTF-8 by setting fonts for x-unicode (instead of x-western) until we add
those languages to the font preference menu.

Comment 2

15 years ago
If these changes get in, we have to release-note that 'Unicode' fonts
have to be set to control fonts for pages lang-tagged with languages not
covered by the font-pref. menu. They include, but are not limited to,
languages written with Tamil, Bengali, and other Indic scripts. Otherwise,
users have to change Western fonts, which is not desirable in most cases
because Western European text had better be rendered with high-quality
Latin fonts refined over the years instead of non-Latin fonts that happen
to cover Latin.

Needless to say, as more and more web pages are put up in Unicode, this is not
the best either (because European text in Unicode will be rendered with
less-than-optimal fonts). Therefore, we have to implement item #3 soon. The only
roadblock is the platform-dependent coverage of scripts. I wonder if a
platform-dependent XUL overlay can solve this problem.

Comment 3

15 years ago
Implementing #1 and #3 alone doesn't work. Gfx ports for different platforms
have different ways of selecting fonts based on langGroup, and most Gfx ports
currently support only 'major' scripts (Latin, Greek, Cyrillic, CJK, Thai,
Hebrew and Arabic). I filed bug 206123 for Gfx-Win on this issue. In my patch to
bug 204039, I partly fixed it for Tamil and Devanagari.

Comment 4

13 years ago
langGroups.properties changes were already checked in.
I forgot I had filed this bug and filed bug 256383 where I uploaded a patch and
asked for r/sr.

*** This bug has been marked as a duplicate of 256383 ***
Last Resolved: 13 years ago
Resolution: --- → DUPLICATE