Closed Bug 237434 Opened 21 years ago Closed 20 years ago

Add 'langGroup's for languages written in non-complex scripts and with 'well-defined' script association

Categories

(Core :: Internationalization, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED

People

(Reporter: jshin1987, Assigned: jshin1987)

References

Details

(Keywords: intl)

Attachments

(1 file, 4 obsolete files)

Armenian, Georgian, Ethiopic, Unified Canadian Aboriginal Syllabics (and some more scripts in the SMP) don't require any special handling other than fonts with the necessary glyphs. Indic and other South and Southeast Asian scripts differ from these scripts in that they need complex character-to-glyph transformation. For these 'non-complex' scripts, we have to add 'langGroups' (script groups) so that fonts for them can be designated.
This is just to see what we need to change and how much. I haven't tested it yet. Besides, I used the 'x-i' prefix for no specific reason...
For Camino, see bug 222919.
Status: NEW → ASSIGNED
I was wondering just the other day whether it would make sense to move the whole langGroup mechanism over to ISO-15924 codes (and maybe rename it to "scriptCode"). Backward compatibility might be sticky, and there are some cases where ISO-15924 is finer-grained than we would want (e.g. Hiragana and Katakana have separate codes) or coarser-grained than we would want (I don't see a way to distinguish our current Western, Central European and Baltic varieties of "Latn"), but I think those problems are solvable.
Simon, do you expect to have a structural change or just a name mapping between our current scheme and the ISO 15924 scheme? I think it's more of the latter than of the former. As for the difference in granularity between the two, I guess we have to resort to 'x-blah' (or any other user-defined extension method). For instance, we have to keep 'x-western', 'x-central-euro' (well, that's one of the relics of XLFD which we don't need any more in GFX ports other than X11core). For Japanese, we may have to add a new 'x-japanese' or something (hmm....) Basically, if our current name matches an ISO 15924 name (and the two are well-aligned with each other), just use the ISO 15924 name. Otherwise, we have to use 'x-blahblah'.
Yes, I agree that what's needed is more of a name mapping. Do we actually need to keep x-western or can we map it to Latn? Here's a draft mapping:

ar              Arab
el              Grek
he              Hebr
ja              x-japanese (or maybe Hrkt?)
ko              Hang
th              Thai
tr              x-turkish
zh-CN           Hans
zh-Hans         Hans
zh-Hant         Hant
zh-HK           ????
zh-TW           Hant
x-baltic        x-baltic
x-central-euro  x-central-euro
x-cyrillic      Cyrl
x-gurmukhi      Guru
x-devanagari    Deva
x-tamil         Taml
x-western       Latn
x-unicode       x-unicode
x-userdef       x-userdef

Another option is to use codes in the private use block Qaaa-Qabx instead of x-blahblah, and Zyyy (undetermined script) instead of x-unicode and Zzzz (uncoded script) instead of x-userdef. That's probably not such a great idea, since the codes will appear in prefs and can be set by a user in about:config, so the more informative x-blahblah names are probably better.
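For illustration only, the draft mapping in the comment above could be written as a plain lookup table. This is a sketch, not code from any patch on this bug: the names `DRAFT_LANGGROUP_TO_SCRIPT` and `to_script_code` are hypothetical, zh-HK is left as `None` because the draft marks it as undecided, and passing unknown names through unchanged is an assumption.

```python
# Hypothetical sketch of the draft langGroup -> script-code mapping.
# Entries are copied from the draft; nothing here is taken from a real patch.
DRAFT_LANGGROUP_TO_SCRIPT = {
    "ar": "Arab",
    "el": "Grek",
    "he": "Hebr",
    "ja": "x-japanese",   # the draft also floats "Hrkt" as a possibility
    "ko": "Hang",
    "th": "Thai",
    "tr": "x-turkish",
    "zh-CN": "Hans",
    "zh-Hans": "Hans",
    "zh-Hant": "Hant",
    "zh-HK": None,        # "????" in the draft: not decided yet
    "zh-TW": "Hant",
    "x-baltic": "x-baltic",
    "x-central-euro": "x-central-euro",
    "x-cyrillic": "Cyrl",
    "x-gurmukhi": "Guru",
    "x-devanagari": "Deva",
    "x-tamil": "Taml",
    "x-western": "Latn",
    "x-unicode": "x-unicode",
    "x-userdef": "x-userdef",
}


def to_script_code(lang_group):
    """Return the draft script code for a langGroup name.

    Names that are not in the draft table are returned unchanged
    (an assumption made for this sketch); undecided entries give None.
    """
    return DRAFT_LANGGROUP_TO_SCRIPT.get(lang_group, lang_group)
```

Keeping the 'x-' names as values makes the point of the discussion visible: only the entries that align cleanly with ISO 15924 change name, and everything else stays a user-defined extension.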
Can you articulate why this new scheme would offer a definite advantage over the old one? BTW, Japanese can't be Hrkt; zh-hk would be Hant.
(In reply to comment #6)
> Can you articulate why this new scheme would offer a definite advantage over the
> old one? BTW, Japanese can't be Hrkt; zh-hk would be Hant.

I think it's cleaner and expresses better what we really mean. We've made mistakes in the past by thinking in terms of languages instead of script groups (see bug 232487). I wasn't sure about zh-hk because we currently have it as a separate category from zh-tw. Can they both be Hant or do we need to keep them separate somehow?
(In reply to comment #7)
> (In reply to comment #6)
>
> I wasn't sure about zh-hk because we currently have it as a separate category
> from zh-tw. Can they both be Hant or do we need to keep them separate somehow?

I think this is a font selection question. zh-hk uses the Big5 traditional Chinese character set plus 3000 or so additional characters. They have a special font for this. I would think that this is the reason for the separation. The UI for font selection necessarily reflects language/font availability. Unless this situation changes we may end up creating many x-lang_yyy categories.
(In reply to comment #8)
> I think this is a font selection question. zh-hk uses Big5 traditional chinese
> character set + 3000 or so additional characters. They have a special font for
> this. I would think that this is the reason for the separation.

That's why I was requested to separate zh-HK from zh-TW. However, that's *partly* an artifact of the 'ancient' X11core font system (XLFD-based), in a sense just as the distinction between x-western and x-central-euro is an artifact of the X11core font system and the Mac OS classic font system. For modern font systems (Windows, Xft, and Mac OS X [1], although we don't fully exploit that on Mac OS X yet) the distinction is somewhat (not entirely) moot.

> The UI for font selection necessarily reflects language/font availability.
> Unless this situation changes we may end up creating many x-lang_yyy
> categories.

Yes, our 'langGroup' is overloaded to mean both language and scriptGroup. As for the proliferation of 'x-lang_yyy', I don't think we're gonna have more than what we have now. I know what you have in mind, but I don't think we'll ever put all these (how many? hundreds, thousands, tens of thousands? [2]) languages in the font preference. Instead, I believe we'll keep them coarse-grained while sending down from layout to gfx the 'lang' explicitly specified by the author (along with our coarse-grained lang/scriptGroup) so that Gfx implementations capable of taking advantage of fine-grained lang distinctions (e.g. Xft and Pango) can do so.

[1] Mac OS classic had 'Times CE' (Central Europe), 'Times CY' (Cyrillic) etc., but Mac OS X consolidated 'Times CE', 'Times CY' and 'Times' into 'Times'. The same is true of Helvetica and Courier.
[2] I wouldn't be wrong to say that the number of scripts is considerably smaller than the number of languages, would I?
Am I correct in understanding that currently there is no way to tell Gecko what font to use for these languages? (And therefore it just "guesses at random"?) I'm trying to figure out whether bug 288571 reported against Camino about Armenian text displaying as ???? is because there's no UI (or even user/prefs.js entry, this bug?) to set a default font or if it's because Gfx:Mac can't handle/recognize the sole Mac font with Armenian glyphs (loosely bug 246527, according to what I've been told). Sorry for the noise/stupid question, and thanks for any help you can provide.
(In reply to comment #10)
> Am I correct in understanding that currently there is no way to tell Gecko what
> font to use for these languages? (And therefore it just "guesses at random"?)

Actually, there's a way if 'lang=xy' is specified. Fonts set for 'Unicode' (or in recent nightlies, 'Other scripts') will be used for those scripts for which we don't have a UI, yet.

> I'm trying to figure out whether bug 288571 reported against Camino about
> Armenian text displaying as ???? is because there's no UI (or even user/prefs.js
> entry, this bug?) to set a default font or if it's because Gfx:Mac can't
> handle/recognize the sole Mac font with Armenian glyphs (loosely bug 246527,
> according to what I've been told).

Camino in particular and Mac products in general have lagged behind other ports in terms of font and rendering. I'll see what I can do in bug 288571.
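The fallback described above (scripts without their own font UI pick up the fonts set for 'Unicode') can be sketched roughly as follows. This is a simplified illustration and not Gecko code: `resolve_font_pref` is a hypothetical helper, the `font.name.<generic>.<langGroup>` key pattern and the `x-armn` group name are assumptions made for the sketch, and the font names are placeholders.

```python
def resolve_font_pref(prefs, generic, lang_group, fallback_group="x-unicode"):
    """Look up a font for (generic family, langGroup) in a prefs mapping.

    If the langGroup has no entry of its own, fall back to the entry for
    the 'Unicode' group. Returns None when neither entry exists.
    """
    key = f"font.name.{generic}.{lang_group}"
    if key in prefs:
        return prefs[key]
    return prefs.get(f"font.name.{generic}.{fallback_group}")


# Placeholder prefs: no Armenian-specific entry exists here.
example_prefs = {
    "font.name.serif.x-western": "Example Serif",
    "font.name.serif.x-unicode": "Example Pan-Unicode",
}

armenian_font = resolve_font_pref(example_prefs, "serif", "x-armn")
western_font = resolve_font_pref(example_prefs, "serif", "x-western")
```

Adding a dedicated langGroup for a script then amounts to giving it its own key, so the user can choose a font for it instead of relying on the fallback.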
Depends on: 288638
Attached patch patch (still work in progress) (obsolete) — Splinter Review
Updated the patch to the trunk and included the gfx:win part while excluding the gfx:mac part. I didn't change gfx:gtk/gfx:xlib. I used ISO 15924 script names for the new 'langGroups' (with an 'x-' prefix instead of an 'x-i-' prefix). Perhaps it's better to combine this patch with Simon's patch for bug 248690.
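As a small illustration of the naming scheme mentioned above (an ISO 15924 script name with an 'x-' prefix), a helper might look like the sketch below. `lang_group_for_script` is a hypothetical name, and lowercasing the code is an assumption; the comment only states that the 'x-' prefix is used.

```python
import re


def lang_group_for_script(iso15924_code):
    """Build a langGroup name from a four-letter ISO 15924 script code.

    Lowercasing is an assumption of this sketch, not something stated
    in the patch description.
    """
    if not re.fullmatch(r"[A-Za-z]{4}", iso15924_code):
        raise ValueError(f"not a four-letter script code: {iso15924_code!r}")
    return "x-" + iso15924_code.lower()
```

For example, `lang_group_for_script("Armn")` gives `"x-armn"` under these assumptions.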
Attachment #154870 - Attachment is obsolete: true
Attached patch patch (update) (obsolete) — Splinter Review
My TB tree was not up to date. After making it up to date, I made a new patch.
Attachment #179284 - Attachment is obsolete: true
On Windows, it works 'well'. Need to test on Linux and Mac OS X. I added 5 more scripts (Malayalam, Gujarati, Gurmukhi, Bengali, Khmer). I didn't add Kannada, Telugu, and so forth because I couldn't find fonts for them that cover Basic Latin completely. Fonts for them have only punctuation marks and digits in the Basic Latin range. Some of our Gfx implementations assume that Basic Latin is fully covered by a font.
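The Basic Latin coverage concern above can be checked mechanically. The sketch below assumes a font's coverage is available as a collection of code points; how that collection is obtained from a real font is outside its scope, and both function names are hypothetical.

```python
# Printable Basic Latin: U+0020 (space) through U+007E (tilde).
BASIC_LATIN_PRINTABLE = range(0x20, 0x7F)


def missing_basic_latin(covered_code_points):
    """Return the printable Basic Latin code points a font does not cover."""
    covered = set(covered_code_points)
    return [cp for cp in BASIC_LATIN_PRINTABLE if cp not in covered]


def covers_basic_latin(covered_code_points):
    """True only if every printable Basic Latin code point is covered."""
    return not missing_basic_latin(covered_code_points)


# A made-up font that has only digits and a few punctuation marks
# from Basic Latin, plus the Kannada block.
partial_font = (
    {ord(c) for c in "0123456789 .,;:!?()-"} | set(range(0x0C80, 0x0D00))
)
full_font = set(range(0x20, 0x7F)) | set(range(0x0A00, 0x0A80))
```

With these inputs, `covers_basic_latin(partial_font)` is false because the letters are missing, which is the situation described for the Kannada and Telugu fonts.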
Attachment #179354 - Attachment is obsolete: true
Attached patch update Splinter Review
I added changes for gfx:mac and fixed a couple of mistakes. What remains to be done:
1. Mac OS X font pref. However, bug 246527 and other related bugs need to be fixed. According to bug 288571 comment #4, some pan-Unicode TTFs for Windows work on Mac OS X for some of the scripts being added, but fonts shipped by Apple don't work due to bug 246527 and friends.
2. Gfx:Gtk/Xlib fixes, which would be kinda like 'placeholders' because, for the new scripts added, iso10646-1 is the only sensible XLFD (charset-encoding part) entry.
3. Camino fix: Adding menu items to Camino is a 'black art' to me. I asked how, but practitioners don't seem to want to reveal the secret. :-)
Because we don't build Gfx:Gtk/xlib any more for tier-1 platforms, this patch is good enough for now. Btw, I didn't change nsFontMetricsPango and nsFontMetricsCairoXft because the change made here will be automatically propagated once bug 288634 and bug 277656 are fixed.
Attachment #179474 - Attachment is obsolete: true
Attachment #179708 - Flags: superreview?(bryner)
Attachment #179708 - Flags: review?(smontagu)
Comment on attachment 179708 [details] [diff] [review] update r=smontagu
Attachment #179708 - Flags: review?(smontagu) → review+
(In reply to comment #15)
> 3. Camino fix : Adding menu items to Camino is a 'black art' to me. I asked
> how, but practitioners don't seem to want to reveal the secret. :-)

Here's smfr's checkin for adding zh-HK to Camino's langGroup font prefs menu:
http://bonsai.mozilla.org/cvsquery.cgi?treeid=default&module=camino&branch=&branchtype=match&dir=%2Fmozilla%2Fcamino&file=&filetype=match&who=smfr*&whotype=regexp&sortby=Date&hours=2&date=explicit&mindate=2005-03-01+21%3A31%3A00&maxdate=2005-03-02+00%3A00%3A00&cvsroot=%2Fcvsroot

If you're also adding these to the View: Text Encoding (selection/override) menu, I can't even provide a pointer.
Comment on attachment 179708 [details] [diff] [review] update Asking dbaron for sr.
Attachment #179708 - Flags: superreview?(bryner) → superreview?(dbaron)
(In reply to comment #17)
> Here's smfr's checkin for adding zh-HK to Camino's langGroup font prefs menu
> http://bonsai.mozilla.org/cvsquery.cgi?treeid=default&module=camino&branch=&branchtype=match&dir=%2Fmozilla%2Fcamino&file=&filetype=match&who=smfr*&whotype=regexp&sortby=Date&hours=2&date=explicit&mindate=2005-03-01+21%3A31%3A00&maxdate=2005-03-02+00%3A00%3A00&cvsroot=%2Fcvsroot

I am aware of that check-in, but it doesn't help because a binary file was modified and my question as to how to do that hasn't been answered.
Attachment #179708 - Flags: superreview?(dbaron) → superreview+
Comment on attachment 179708 [details] [diff] [review] update asking for approval to aviary 1.1a and suite 1.8b2. This is a low risk patch adding a bunch of scripts to the font selection menu and languages written in them to the language selection menu. For some of them, there are (un)official language packs so that not offering font selection menu for them leads to kinda 'mismatch'. Btw, Localizers have to translate the names of newly added scripts and languages.
Attachment #179708 - Flags: approval1.8b2?
Attachment #179708 - Flags: approval-aviary1.1a?
Comment on attachment 179708 [details] [diff] [review] update a=asa
Attachment #179708 - Flags: approval1.8b2?
Attachment #179708 - Flags: approval1.8b2+
Attachment #179708 - Flags: approval-aviary1.1a?
Attachment #179708 - Flags: approval-aviary1.1a+
I just filed bug 292416 for Camino; I hope I got all the new scripts correct. (I'll double-check when the next new Mac Fx nightly appears and see which [other] ones are missing.)
resolving as fixed. (landed on the trunk)
Status: ASSIGNED → RESOLVED
Closed: 20 years ago
Resolution: --- → FIXED
Haven't you forgotten to add the new entities to messenger/locale/preferences/fonts.dtd?
Thanks for catching my mistake. It was updated a moment ago.
Checking in mail/locales/en-US/chrome/messenger/preferences/fonts.dtd;
/cvsroot/mozilla/mail/locales/en-US/chrome/messenger/preferences/fonts.dtd,v <-- fonts.dtd
new revision: 1.2; previous revision: 1.1
done
Hi, I've just checked out the latest nightly (9 May) and seems to be working fine for Gurmukhi. However, is it possible to get the scripts listed in alphabetical order? The current order is a bit strange. Sukh
Sorry about replying to myself... but danda and double danda (U+0964 and U+0965) seem to use a different font in Gurmukhi. Even though these are in the Devanagari block, they should be counted as being Gurmukhi when viewed with Gurmukhi text.
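One way to think about the danda problem above is that U+0964 and U+0965 should take the script of the text around them rather than the script of the block they are encoded in. The sketch below is only an illustration of that idea, not how Gecko resolves fonts: the function names are hypothetical, only the Devanagari and Gurmukhi blocks are modeled, and inheriting from the preceding script is an assumption.

```python
# Danda and double danda are encoded in the Devanagari block but are
# shared by several Indic scripts.
SHARED_PUNCTUATION = {0x0964, 0x0965}

# Only two blocks are modeled, for the sake of the example.
BLOCKS = [
    ("Deva", 0x0900, 0x097F),
    ("Guru", 0x0A00, 0x0A7F),
]


def block_script(ch):
    """Return the script implied by the character's Unicode block, or None."""
    cp = ord(ch)
    for name, low, high in BLOCKS:
        if low <= cp <= high:
            return name
    return None


def resolve_scripts(text):
    """Assign a script to each character, letting dandas inherit the
    script of the nearest preceding character that has one."""
    scripts = []
    previous = None
    for ch in text:
        if ord(ch) in SHARED_PUNCTUATION and previous is not None:
            scripts.append(previous)
            continue
        script = block_script(ch)
        scripts.append(script)
        if script is not None:
            previous = script
    return scripts


# Two Gurmukhi letters followed by a danda.
gurmukhi_sentence_end = "\u0A38\u0A24\u0964"
```

A danda with no preceding script context still falls back to its own block (Devanagari) in this sketch, which is one reason the general problem is harder than the example suggests.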
(In reply to comment #27)
> Sorry about replying to myself... but danda and double danda (U+0964 and
> U+0965) seem to use a different font in Gurmukhi. Even though these are in the
> Devanagari block, they should be counted as being Gurmukhi when viewed with
> Gurmukhi text.

Depending on the situation, that's a non-trivial problem, I'm afraid. Actually, it should work most of the time. Can you make up a very simple test case and upload a screenshot? (Well, I can make one.) If you really have a case for a bug, please file a new bug and assign it to me (and also note it here for others).

(In reply to comment #26)
> I've just checked out the latest nightly (9 May) and seems to be working fine
> for Gurmukhi.

Thanks for testing. It must have been on Windows XP, right?

> However, is it possible to get the scripts listed in alphabetical
> order? The current order is a bit strange.

Filed bug 293499.
(In reply to comment #28)
> (In reply to comment #27)
> > I've just checked out the latest nightly (9 May) and seems to be working fine
> > for Gurmukhi.
>
> Thanks for testing. It must have been on Windows XP, right?

Yes, it was on Windows XP. The latest nightly also seemed to fix issues with selecting Gurmukhi text, although the problems with justified text are still there.
See bug 293511 for details on the Dandas.