Closed
Bug 237434
Opened 21 years ago
Closed 20 years ago
Add 'langGroup's for languages written in non-complex scripts and with 'well-defined' script association
Categories
(Core :: Internationalization, defect)
RESOLVED
FIXED
People
(Reporter: jshin1987, Assigned: jshin1987)
Details
(Keywords: intl)
Attachments
(1 file, 4 obsolete files)
39.51 KB, patch
smontagu: review+
dbaron: superreview+
asa: approval-aviary1.1a1+
asa: approval1.8b2+
Armenian, Georgian, Ethiopic, Unified Canadian Aboriginal Syllabics (and some
more in the SMP) don't require any special handling other than fonts with
glyphs. Indic and other South and Southeast Asian scripts differ from these
scripts in that they need complex character-to-glyph transformation.
For these 'non-complex' scripts, we have to add 'langGroups' (script groups) so
that fonts for them can be designated.
Comment 1•20 years ago (Assignee)
This is just to see what we need to change, and how much.
I haven't tested it yet. Besides, I used the 'x-i' prefix for no specific
reason...
Comment 3•20 years ago
I was wondering just the other day whether it would make sense to move the whole
langGroup mechanism over to ISO-15924 codes (and maybe rename it to "scriptCode").
Backward compatibility might be sticky, and there are some cases where ISO-15924
is finer-grained than we would want (e.g. Hiragana and Katakana have separate
codes) or coarser-grained than we would want (I don't see a way to distinguish
our current Western, Central European and Baltic varieties of "Latn"), but I
think those problems are solvable.
Comment 4•20 years ago (Assignee)
Simon, do you expect to have a structural change or just a name mapping between
our current scheme and ISO 15924 scheme? I think it's more of the latter than of
the former.
As for the difference in 'granularity' between the two, I guess we have to resort to
'x-blah' (or any other user-defined extension method). For instance, we have to
keep 'x-western', 'x-central-euro' (well, that's one of relics of XLFD which we
don't need any more in GFX ports other than X11core). For Japanese, we may have
to add a new 'x-japanese' or something (hmm....)
Basically, if our current name matches the ISO 15924 name (and the two are
well-aligned with each other), just use the ISO 15924 name. Otherwise, we have
to use 'x-blahblah'.
Comment 5•20 years ago
Yes, I agree that what's needed is more of a name mapping.
Do we actually need to keep x-western or can we map it to Latn? Here's a draft
mapping:
ar Arab
el Grek
he Hebr
ja x-japanese (or maybe Hrkt?)
ko Hang
th Thai
tr x-turkish
zh-CN Hans
zh-Hans Hans
zh-Hant Hant
zh-HK ????
zh-TW Hant
x-baltic x-baltic
x-central-euro x-central-euro
x-cyrillic Cyrl
x-gurmukhi Guru
x-devanagari Deva
x-tamil Taml
x-western Latn
x-unicode x-unicode
x-userdef x-userdef
Another option is to use codes in the private use block Qaaa-Qabx instead of
x-blahblah, and Zyyy (undetermined script) instead of x-unicode and Zzzz
(uncoded script) instead of x-userdef. That's probably not such a great idea,
since the codes will appear in prefs and can be set by a user in about:config,
so the more informative x-blahblah names are probably better.
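For readers who want to experiment with the draft above, here is a minimal sketch that writes the mapping as a lookup table. It only transcribes the list from this comment; the names `DRAFT_SCRIPT_CODES`, `script_code_for`, and `is_private_name` are hypothetical and do not come from the Gecko source, and the undecided zh-HK entry is kept as `None` rather than guessed.

```python
# Draft langGroup -> script code mapping, transcribed from the list above.
# All identifiers here are illustrative, not Gecko APIs.
DRAFT_SCRIPT_CODES = {
    "ar": "Arab",
    "el": "Grek",
    "he": "Hebr",
    "ja": "x-japanese",   # the draft also floats "Hrkt" with a question mark
    "ko": "Hang",
    "th": "Thai",
    "tr": "x-turkish",
    "zh-CN": "Hans",
    "zh-Hans": "Hans",
    "zh-Hant": "Hant",
    "zh-HK": None,        # left open in the draft ("????")
    "zh-TW": "Hant",
    "x-baltic": "x-baltic",
    "x-central-euro": "x-central-euro",
    "x-cyrillic": "Cyrl",
    "x-gurmukhi": "Guru",
    "x-devanagari": "Deva",
    "x-tamil": "Taml",
    "x-western": "Latn",
    "x-unicode": "x-unicode",
    "x-userdef": "x-userdef",
}


def script_code_for(lang_group):
    """Return the draft code for a langGroup, or None if the draft leaves it open.

    Raises KeyError for a langGroup the draft does not list at all.
    """
    return DRAFT_SCRIPT_CODES[lang_group]


def is_private_name(code):
    """True for the 'x-blahblah' style names that have no ISO 15924 equivalent."""
    return code is not None and code.startswith("x-")
```

Keeping the table as plain data makes it easy to see which entries would still need an 'x-' name under this proposal, for example `[k for k, v in DRAFT_SCRIPT_CODES.items() if is_private_name(v)]`.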
Comment 6•20 years ago
Can you articulate why this new scheme would offer a definite advantage over the
old one? BTW, Japanese can't be Hrkt; zh-hk would be Hant.
Comment 7•20 years ago
(In reply to comment #6)
> Can you articulate why this new scheme would offer a definite advantage over the
> old one? BTW, Japanese can't be Hrkt; zh-hk would be Hant.
I think it's cleaner and expresses better what we really mean. We've made
mistakes in the past by thinking in terms of languages instead of script groups
(see bug 232487).
I wasn't sure about zh-hk because we currently have it as a separate category
from zh-tw. Can they both be Hant or do we need to keep them separate somehow?
Comment 8•20 years ago
(In reply to comment #7)
> (In reply to comment #6)
>
> I wasn't sure about zh-hk because we currently have it as a separate category
> from zh-tw. Can they both be Hant or do we need to keep them separate somehow?
>
I think this is a font selection question. zh-hk uses the Big5 Traditional
Chinese character set plus 3000 or so additional characters. They have a special
font for this. I would think that this is the reason for the separation. The UI
for font selection necessarily reflects language/font availability. Unless this
situation changes, we may end up creating many x-lang_yyy categories.
Comment 9•20 years ago (Assignee)
(In reply to comment #8)
> I think this is a font selection question. zh-hk uses the Big5 Traditional
> Chinese character set plus 3000 or so additional characters. They have a special
> font for this. I would think that this is the reason for the separation.
That's why I was requested to separate zh-HK from zh-TW. However, that's
*partly* an artifact of the 'ancient' X11core font system (XLFD-based), in a
sense, just as the distinction between x-western and x-central-euro is an
artifact of the X11core font system and the Mac OS classic font system. For
modern font systems (Windows, Xft, and Mac OS X [1], although we don't fully
exploit that on Mac OS X yet) the distinction is somewhat (not entirely) moot.
> The UI for font selection necessarily reflects language/font availability.
> Unless this situation changes, we may end up creating many x-lang_yyy
> categories.
Yes, our 'langGroup' is overloaded to mean both language and scriptGroup. As
for the proliferation of 'x-lang_yyy', I don't think we're gonna have more than
what we have now. I know what you have in mind, but I don't think we'll ever put
all these (how many? hundreds, thousands, tens of thousands? [2]) languages in
the font preference. Instead, I believe we'll keep them coarse-grained while
sending down from layout to gfx 'lang' explicitly specified by the author (along
with our coarse-grained lang/scriptGroup) so that Gfx implementations capable of
taking advantage of fine-grained lang distinctions (e.g. Xft and Pango) can do so.
[1] Mac OS classic had 'Times CE'(Central Europe), 'Times CY'(Cyrillic) etc, but
Mac OS X consolidated 'Times CE', 'Times CY' and 'Times' to 'Times'. The same is
true of Helvetica and Courier.
[2] I wouldn't be wrong to say that the number of scripts is considerably
smaller than the number of languages, would I?
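The idea in this comment, keeping the font prefs coarse-grained while passing the author's explicit 'lang' down to gfx, could be sketched roughly as follows. This is only an illustration of the data flow, not Gecko code: `FontRequest`, `selection_key`, and the `backend_uses_lang` flag are hypothetical names, and the example values are made up.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class FontRequest:
    """What layout would hand to gfx in this sketch (hypothetical structure)."""
    lang_group: str              # coarse script group used by the font prefs
    lang: Optional[str] = None   # author-specified lang attribute, if any


def selection_key(request, backend_uses_lang):
    """Pick the key a backend would select fonts by.

    A backend that can exploit fine-grained language information uses the
    explicit lang when one was given; every other backend falls back to the
    coarse langGroup.
    """
    if backend_uses_lang and request.lang:
        return request.lang
    return request.lang_group
```

With this shape, `selection_key(FontRequest("x-cyrillic", "sr"), backend_uses_lang=True)` yields `"sr"`, while a backend without fine-grained support gets `"x-cyrillic"` for the same request, so the preference UI never has to list individual languages.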
Comment 10•20 years ago
Am I correct in understanding that currently there is no way to tell Gecko what
font to use for these languages? (And therefore it just "guesses at random"?)
I'm trying to figure out whether bug 288571 reported against Camino about
Armenian text displaying as ???? is because there's no UI (or even user/prefs.js
entry, this bug?) to set a default font or if it's because Gfx:Mac can't
handle/recognize the sole Mac font with Armenian glyphs (loosely bug 246527,
according to what I've been told).
Sorry for the noise/stupid question, and thanks for any help you can provide.
Comment 11•20 years ago (Assignee)
(In reply to comment #10)
> Am I correct in understanding that currently there is no way to tell Gecko what
> font to use for these languages? (And therefore it just "guesses at random"?)
Actually, there's a way if 'lang=xy' is specified. Fonts set for 'Unicode' (or
in recent nightlies, 'Other scripts') will be used for those scripts for which
we don't have a UI, yet.
> I'm trying to figure out whether bug 288571 reported against Camino about
> Armenian text displaying as ???? is because there's no UI (or even user/prefs.js
> entry, this bug?) to set a default font or if it's because Gfx:Mac can't
> handle/recognize the sole Mac font with Armenian glyphs (loosely bug 246527,
> according to what I've been told).
Camino in particular and Mac products in general have lagged behind other ports
in terms of font and rendering. I'll see what I can do in bug 288571.
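The fallback described in the first reply above (fonts set for 'Unicode' are used for scripts that have no UI of their own) amounts to a two-step lookup. The sketch below assumes a pref key layout of `font.name.<generic>.<langGroup>` and uses a plain dict in place of the real pref service; `font_for`, the font names, and the `x-armn` group name are illustrative assumptions.

```python
def font_for(prefs, generic, lang_group):
    """Look up a font for a langGroup, falling back to the 'x-unicode' entry.

    `prefs` is a plain dict standing in for the preference store. The key
    layout is an assumption made for this sketch.
    """
    specific = prefs.get(f"font.name.{generic}.{lang_group}")
    if specific:
        return specific
    # No entry for this script group: use whatever is set for 'Unicode'.
    return prefs.get(f"font.name.{generic}.x-unicode")


# Illustrative prefs; the font names are placeholders.
example_prefs = {
    "font.name.serif.x-western": "Western Serif",
    "font.name.serif.x-unicode": "Fallback Serif",
}
```

Here `font_for(example_prefs, "serif", "x-armn")` returns the 'Unicode' font because no Armenian entry exists, which is the situation this bug sets out to change by adding dedicated groups.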
Comment 12•20 years ago (Assignee)
Updated the patch to the trunk and included the gfx:win part while excluding
the gfx:mac part. I didn't change gfx:gtk/gfx:xlib. I used ISO 15924 script
names for the new 'langGroups' (with an 'x-' prefix instead of the 'x-i-'
prefix). Perhaps it's better to combine this patch with Simon's patch for
bug 248690.
Attachment #154870 -
Attachment is obsolete: true
Comment 13•20 years ago (Assignee)
My TB tree was not up to date. After making it up to date, I made a new patch.
Attachment #179284 -
Attachment is obsolete: true
Comment 14•20 years ago (Assignee)
On Windows, it works 'well'. Need to test on Linux and Mac OS X.
I added 5 more scripts (Malayalam, Gujarati, Gurmukhi, Bengali, Khmer). I
didn't add Kannada, Telugu, and so forth because I couldn't find fonts for them
that cover Basic Latin completely. Fonts for them have only punctuation marks
and numbers in the Basic Latin range. Some of our Gfx implementations assume
that Basic Latin is fully covered by a font.
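The Basic Latin criterion used above can be expressed as a simple set check. This is a minimal sketch, assuming the font's coverage is already available as a set of code points (how that set is obtained from a real font is out of scope here); the function names are hypothetical.

```python
# Printable Basic Latin: U+0020 (space) through U+007E (tilde).
BASIC_LATIN_PRINTABLE = frozenset(range(0x20, 0x7F))


def missing_basic_latin(covered):
    """Return the sorted Basic Latin code points absent from `covered`."""
    return sorted(BASIC_LATIN_PRINTABLE - set(covered))


def covers_basic_latin(covered):
    """True if every printable Basic Latin code point is in `covered`."""
    return not missing_basic_latin(covered)
```

A font that carries only punctuation and digits, for instance one covering U+0020 through U+003F plus its own script block, fails the check, and `missing_basic_latin` lists the letters it lacks.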
Updated•20 years ago (Assignee)
Attachment #179354 -
Attachment is obsolete: true
Comment 15•20 years ago (Assignee)
I added changes for gfx:mac and fixed a couple of mistakes. What remains to be
done:
1. Mac OS X font pref. However, bug 246527 and other related bugs need to be
fixed. According to bug 288571 comment #4, some pan-Unicode TTFs for Windows
work on Mac OS X for some of the scripts being added, but fonts shipped by
Apple don't work due to bug 246527 and friends.
2. Gfx:Gtk/Xlib fixes that would be kinda like 'place holders' because, for the
new scripts added, iso10646-1 is the only sensible XLFD (charset-encoding part)
entry.
3. Camino fix: Adding menu items to Camino is a 'black art' to me. I asked
how, but practitioners don't seem to want to reveal the secret. :-)
Because we don't build Gfx:Gtk/xlib any more for tier-1 platforms, this patch
is good enough for now.
Btw, I didn't change nsFontMetricsPango and nsFontMetricsCairoXft because the
change made here will be automatically propagated once bug 288634 and bug
277656 are fixed.
Updated•20 years ago (Assignee)
Attachment #179474 -
Attachment is obsolete: true
Attachment #179708 -
Flags: superreview?(bryner)
Attachment #179708 -
Flags: review?(smontagu)
Comment 16•20 years ago
Comment on attachment 179708 [details] [diff] [review]
update
r=smontagu
Attachment #179708 -
Flags: review?(smontagu) → review+
Comment 17•20 years ago
(In reply to comment #15)
> 3. Camino fix: Adding menu items to Camino is a 'black art' to me. I asked
> how, but practitioners don't seem to want to reveal the secret. :-)
Here's smfr's checkin for adding zh-HK to Camino's langGroup font prefs menu
http://bonsai.mozilla.org/cvsquery.cgi?treeid=default&module=camino&branch=&branchtype=match&dir=%2Fmozilla%2Fcamino&file=&filetype=match&who=smfr*&whotype=regexp&sortby=Date&hours=2&date=explicit&mindate=2005-03-01+21%3A31%3A00&maxdate=2005-03-02+00%3A00%3A00&cvsroot=%2Fcvsroot
If you're also adding these to the View: Text Encoding (selection/override)
menu, I can't even provide a pointer.
Comment 18•20 years ago (Assignee)
Comment on attachment 179708 [details] [diff] [review]
update
Asking dbaron for sr.
Attachment #179708 -
Flags: superreview?(bryner) → superreview?(dbaron)
Comment 19•20 years ago (Assignee)
(In reply to comment #17)
> Here's smfr's checkin for adding zh-HK to Camino's langGroup font prefs menu
>
http://bonsai.mozilla.org/cvsquery.cgi?treeid=default&module=camino&branch=&branchtype=match&dir=%2Fmozilla%2Fcamino&file=&filetype=match&who=smfr*&whotype=regexp&sortby=Date&hours=2&date=explicit&mindate=2005-03-01+21%3A31%3A00&maxdate=2005-03-02+00%3A00%3A00&cvsroot=%2Fcvsroot
I am aware of that check-in, but it doesn't help because a binary file was
modified and my question as to how to do that hasn't been answered.
Attachment #179708 -
Flags: superreview?(dbaron) → superreview+
Comment 20•20 years ago (Assignee)
Comment on attachment 179708 [details] [diff] [review]
update
asking for approval to aviary 1.1a and suite 1.8b2.
This is a low risk patch adding a bunch of scripts to the font selection menu
and languages written in them to the language selection menu. For some of
them, there are (un)official language packs so that not offering font selection
menu for them leads to kinda 'mismatch'.
Btw, Localizers have to translate the names of newly added scripts and
languages.
Attachment #179708 -
Flags: approval1.8b2?
Attachment #179708 -
Flags: approval-aviary1.1a?
Comment 21•20 years ago
Comment on attachment 179708 [details] [diff] [review]
update
a=asa
Attachment #179708 -
Flags: approval1.8b2?
Attachment #179708 -
Flags: approval1.8b2+
Attachment #179708 -
Flags: approval-aviary1.1a?
Attachment #179708 -
Flags: approval-aviary1.1a+
Comment 22•20 years ago
I just filed bug 292416 for Camino; I hope I got all the new scripts correct.
(I'll double-check when the next new Mac Fx nightly appears and see which
[other] ones are missing.)
Comment 23•20 years ago (Assignee)
resolving as fixed. (landed on the trunk)
Status: ASSIGNED → RESOLVED
Closed: 20 years ago
Resolution: --- → FIXED
Comment 24•20 years ago
Haven't you forgotten to add the new entities to messenger/locale/preferences/fonts.dtd?
Comment 25•20 years ago (Assignee)
Thanks for catching my mistake. It was updated a moment ago.
Checking in mail/locales/en-US/chrome/messenger/preferences/fonts.dtd;
/cvsroot/mozilla/mail/locales/en-US/chrome/messenger/preferences/fonts.dtd,v
<-- fonts.dtd
new revision: 1.2; previous revision: 1.1
done
Comment 26•20 years ago
Hi,
I've just checked out the latest nightly (9 May) and seems to be working fine
for Gurmukhi. However, is it possible to get the scripts listed in alphabetical
order? The current order is a bit strange.
Sukh
Comment 27•20 years ago
Sorry about replying to myself... but danda and double danda (U+0964 and
U+0965) seem to use a different font in Gurmukhi. Even though these are in the
Devanagari block, they should be counted as being Gurmukhi when viewed with
Gurmukhi text.
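The behaviour requested here, treating danda and double danda as belonging to whatever script surrounds them, can be sketched as a small run-resolution pass. This is not the fix tracked in any bug; it is only an illustration under simplifying assumptions: scripts are guessed from two hard-coded Unicode block ranges, and a danda inherits the script of the nearest preceding script character. All names are hypothetical.

```python
# Block ranges used for this sketch only (Devanagari and Gurmukhi blocks).
SCRIPT_BLOCKS = [
    (0x0900, 0x097F, "Devanagari"),
    (0x0A00, 0x0A7F, "Gurmukhi"),
]

# Danda (U+0964) and double danda (U+0965), shared across Indic scripts.
SHARED_PUNCTUATION = {0x0964, 0x0965}


def block_script(cp):
    """Return the script name for a code point by block, or None if unknown."""
    for start, end, name in SCRIPT_BLOCKS:
        if start <= cp <= end:
            return name
    return None


def resolve_scripts(text):
    """Assign a script to each character, letting dandas inherit from context."""
    resolved = []
    previous = None  # script of the most recent non-shared script character
    for ch in text:
        cp = ord(ch)
        script = block_script(cp)
        if cp in SHARED_PUNCTUATION and previous is not None:
            script = previous      # danda follows the surrounding script
        elif script is not None and cp not in SHARED_PUNCTUATION:
            previous = script
        resolved.append(script)
    return resolved
```

With this pass, a danda after Gurmukhi letters resolves to Gurmukhi, so a font chooser keyed on the resolved script would pick the same font for it as for the neighbouring text; a danda with no preceding context keeps its block default.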
Comment 28•20 years ago (Assignee)
(In reply to comment #27)
> Sorry about replying to myself... but danda and double danda (U+0964 and
> U+0965) seem to use a different font in Gurmukhi. Even though these are in the
> Devanagari block, they should be counted as being Gurmukhi when viewed with
> Gurmukhi text.
Depending on the situation, that's a non-trivial problem, I'm afraid. Actually,
it should work most of the time. Can you make up a very simple test case and
upload a screenshot? (Well, I can make one.) If you really have a case for a
bug, please file a new bug and assign it to me (also note it here for others).
(In reply to comment #26)
> I've just checked out the latest nightly (9 May) and seems to be working fine
> for Gurmukhi.
Thanks for testing. It must have been on Windows XP, right?
> However, is it possible to get the scripts listed in alphabetical
> order? The current order is a bit strange.
Filed bug 293499
Comment 29•20 years ago
See bug 293511 for details on the Dandas.
(In reply to comment #28)
> (In reply to comment #27)
> > I've just checked out the latest nightly (9 May) and seems to be working fine
> > for Gurmukhi.
>
> Thanks for testing. It must have been on Windows XP, right?
Yes, it was on Windows XP. The latest nightly also seemed to fix issues with
selecting Gurmukhi text, although the problems with justified text are still there.
Comment 30•20 years ago
See bug 293511 for details on the Dandas.