need a way to specify script-specific fonts for many other scripts

RESOLVED FIXED

Status

Core Graveyard
GFX: Win32
RESOLVED FIXED
15 years ago
9 years ago

People

(Reporter: Jungshik Shin, Assigned: Kevin McCluskey (gone))

Tracking

({intl})

Trunk
x86
Windows 2000

Firefox Tracking Flags

(Not tracked)

Details

Attachments

(1 attachment, 5 obsolete attachments)

(Reporter)

Description

15 years ago
Mozilla-Win under Win2k/XP can render complex scripts (especially Indic scripts) 
reasonably well with the help of the native OS support of those scripts.
With the patch for bug 204039, Tamil also can be rendered under Win9x/ME. 
However, there's no way to specify fonts for those scripts. Specifying fonts for
Unicode works (in case of GFX-Win), but we need to let users specify fonts per
script as is possible with MS IE(Options|Tools|Font)

There are two parts, UI part(cross platform : see the patch for bug 204039 for
adding Tamil) and platform(GFX implementation)-dependent part. This bug is about
the latter for GFX-Win. An easy way to see the problem is to go
to Pref|Appearance|Font and select Hindi. In Moz-Win, no font can be specified
even if there are fonts covering Devanagari on the system. 

Most 'infrastructure' is already there for GFX-Win
(nsUnicodeRange.cpp [1] and nsFontMetricsWin.cpp), but currently UnicodeRange
to LangGroup mapping works only for 'major'(?) scripts(Latin, Cyrillic, Greek,
CJK, Arabic, Hebrew and Thai). This needs
to be extended. Another problem is UnicodeRange is not fine-grained enough
in some cases. For instance, two Indic scripts are grouped into a single
unicodeRange (e.g. kDevanagariBengali instead of separate kDevanagari and kBengali).



[1] We may have to move the code for UnicodeRange and related tasks to gfx/share
or some other places so that they can be used by other GFX implementations (and
other parts of Mozilla)
(Reporter)

Comment 1

15 years ago
Created attachment 123651 [details] [diff] [review]
a patch (not yet working)

I thought this patch would work, but didn't. I still got 'no fonts available
for ...' when I selected 'Devanagari' (Hindi) in Pref|Appearnace|Font.
This patch assumes that the patches for bug 206146 and bug 204039 are
committed. However, by just replacing 'x-devanagari' with 'hi-IN', this should
be applied to the current cvs.

Devanagari does not have 'ANSI' codepage because it's never supported on
Win9x/ME. I thought this wouldn't be a problem because we don't care about
bitmap fonts (gCharset?Info[].. in nsFontMetricsWin.cpp is only for bitmap
fonts, right?). nsFontList::AvailableFonts (-->
nsFontEnumerator::EnumerateFonts)
seems to return null.. I'll try to figure out what's going on

Comment 2

15 years ago
> we don't care about bitmap fonts (gCharset?Info[].. in nsFontMetricsWin.cpp 

Nope, bitmap fonts are supported as well.

> I still got 'no fonts available for ...' when I selected 'Devanagari' (Hindi)

Watch bug 142511 in which I am going to remove 'no fonts available for ...'.
However, you still need to find out why the sublist isn't found. This is still
used to group fonts in bug 142511.
(Reporter)

Comment 3

15 years ago
I'm sorry I was too terse..I know bitmap fonts are used as well. What I meant is
that gCharsetinfo[] is only for bitmap fonts so that Devanagari NOT being listed
in that array should not mattter, which turned out to be wrong.  

Anyway, I figured out why no font is returned. That's because the above
assumption was wrong. In |SingatureMatchesLangGroup|
gCharsetInfo[...].mLangGroup is compared against aLangGroup. We have to find out
a better way to determine the lang. coverage of a given font. As it stands, we
can't determine the coverage of scripts that were never supported as 'ANSI'
charset (i.e. on Win9x/ME).  I guess there  is a Win32 API  we can use for that.
(Reporter)

Comment 4

15 years ago
http://msdn.microsoft.com/library/default.asp?url=/library/en-us/intl/unicode_3h0y.asp

I guess we have to use fsUsb field of FONTSIGNATURE after trying fsCsb field.

As you wrote, even with bug 142511 fixed, this still has to be done because bug
142511 is about style-difference while this is about script coverage.
(Reporter)

Comment 5

15 years ago
Created attachment 123669 [details] [diff] [review]
a new patch that works

After trying to read between  the lines in MSDN documents, I managed to figure
out  how to convince Visual C++ to compile |enumProc|  with a new signature...
Attachment #123651 - Attachment is obsolete: true
(Reporter)

Comment 6

15 years ago
 @@ -5213,7 +5366,7 @@
   if (!strcmp(aLangGroup, "ar"))    return 1;
 
   switch (aFont->logFont.lfPitchAndFamily & 0xF0) {
-    case FF_DONTCARE:   return 0;
+    case FF_DONTCARE:   return 1;
     case FF_ROMAN:      return !strcmp(aGeneric, "serif");
     case FF_SWISS:      return !strcmp(aGeneric, "sans-serif");

I had to make the above change because Mangal(the default opentype for
Devanagari) and Raghu(sp?, GPL'd opentype fonts for Devanagari) fall to
FF_DONTCARE category. Moreover, if a font says DONTCARE, shouldn't it match any
generic type? 
It seems to make sense, and I think the spec can be interpreted to make that a
valid implementation. rbs?
(Reporter)

Comment 8

15 years ago
Created attachment 123761 [details] [diff] [review]
a new patch (with range check fix)

There was a stupid mistake in nsUnicodeRange.cpp that led tirtiary ranges not
to work. I was too preocuppied with making the font-pref working to notice
that. Anyway, with this patch, fonts to render pages in Hindi (written in
Devanagari script) with are controlled by
Pref|Appeareance|Fonts|Devanagari(with the patch for bug 206146) instead of
'Pref...fonts|Unicode'.[1] This can be tried at sites like
http://www.bbchindi.com

[1] In the current CVS snapshot of Mozilla-Win, Unicode font-pref works for
scripts not covered by any codepages supported in Win9x/ME (e.g. Devanagari)
because two 'negatives' make  a positive :-). Due to bug 91190, 'x-unicode'
langGroup is mapped to the langGroup of the current locale, fonts for which
don't cover  scripts like Devanagari. Therefore, |FindGenericFont| fails and
x-unicode pref. values are refered to as the last resort in |FindPrefFont|. 
Even after bug 91190 is fixed, this has to work this way because
|FindGenericFont|  works only for 'legacy code	 pages'(Latin,Cyriilic,Greek,
Hebrew, Arabic, Thai, CJK)  while |FindPrefFont|  also works for any scripts we
have font-pref. UI for thanks to this patch.  This way, 'x-unicode' font-pref
values would be used only for scripts we don't have font-pref. UI for.	I added
a block of code that's blocked for the now with a comment that reads 'XXX : to
be activated when bug 91190 is fixed.'.
Attachment #123669 - Attachment is obsolete: true
(Reporter)

Comment 9

15 years ago
Comment on attachment 123761 [details] [diff] [review]
a new patch (with range check fix)

Asking for r/sr.
Attachment #123761 - Flags: superreview?(rbs)
Attachment #123761 - Flags: review?(kmcclusk)

Comment 10

15 years ago
With bug 142511 checked in, care to attach a new patch in sync with the trunk?

The FF_DONTCARE change might not be crucial anymore to you since all the fonts
are shown on the pref panel now. But you can leave it there if you want.
(Reporter)

Comment 11

15 years ago
Created attachment 123857 [details] [diff] [review]
a new patch (against the updated cvs + with Devanagari entries for winpref.js

Basically the same as attachment 123761 [details] [diff] [review] except for some offsets in line
numbers. I added Devanagari entries to winpref.js. 

As for FF_DONTCARE,  I'd rather keep the change I made to get Devanagari fonts
(Mangal and Raghindi) to show up at the top of the list and above the
separator.
Otherwise, I'd have to go down the long list of fonts to pick either of them if
I want to change fonts.
Attachment #123761 - Attachment is obsolete: true
(Reporter)

Updated

15 years ago
Attachment #123761 - Flags: superreview?(rbs)
Attachment #123761 - Flags: review?(kmcclusk)
(Reporter)

Comment 12

15 years ago
Comment on attachment 123857 [details] [diff] [review]
a new patch (against the updated cvs + with Devanagari entries for winpref.js

Asking Simon for r because this part was previously worked on by Shanjian so
that it rather belongs to intl.
Attachment #123857 - Flags: superreview?(rbs)
Attachment #123857 - Flags: review?(smontagu)

Comment 13

15 years ago
Comment on attachment 123857 [details] [diff] [review]
a new patch (against the updated cvs + with Devanagari entries for winpref.js

Index: gfx/src/windows/nsUnicodeRange.cpp
===================================================================

 /**********************************************************************
@@ -224,24 +226,24 @@
     kRangeCyrillic,	      //u04xx
     kRangeHebrew,	      //u05xx	  XXX 0530-058f is in fact
kRangeArmenian
     kRangeArabic,	      //u06xx
-    kRangeSriacThaana,       //u07xx
+    kRangeTableTirtiary,     //u07xx

s/kRangeTableTirtiary/kRangeTertiaryTable/g


+// Between U+0700 and U+16FF, most ranges are 0x80 code points long.
+#define SUBTABLE2_SIZE ((0x17 - 0x07) * 2)

I don't understand (both the comment and the size calculation).

+
+static PRUint8 gUnicodeSubrangeTable2[SUBTABLE2_SIZE] =

Got a better name than 'Table2' ?!?

+const PRUint8	 kRangeYi		    = 63;
+const PRUint8	 kRangeCombiningDiacriticalMarks = 64;
+const PRUint8	 kRangeArmenian 	       = 65;

nit: last indentation

 const PRUint8	 kRangeTableBase   = 128;    //values over 127 are reserved for
internal use only
+const PRUint8	 kRangeTableTirtiary	 = 192; 

where does the magic '192' come from? 

Index: gfx/src/windows/nsFontMetricsWin.cpp
===================================================================

@@ -3213,6 +3332,23 @@
   if (mLangGroup) {
     const char* langGroup;
     mLangGroup->GetUTF8String(&langGroup);
+  
+    // x-unicode psuedo-langGroup should be the last resort to turn to.
+    // That is, it should be refered to only when we don't  recognize 
+    // |langGroup| specified by the authors of documents and  the 
+    // determination of |langGroup| based  on Unicode range also fails 
+    // in |FindPrefFont|. At this stage, we just have to skip it. (bug 206123)
+    // because |FindPrefFont| will refers to it after trying the Unicode-range
+    // based attempt. As of 2003-05-19, this code has no effect because 
+    // x-unicode is mapped to the langGroup of the current locale 
+    // (see bug 91190).
+    // XXX : has to be turned on when bug 91190 is resolved.
+#if 0
+    if (!strcmp(langGroup, "x-unicode")) {
+      mTriedAllGenerics = 1;
+      return nsnull;
+    }
+#endif


I don't see what/where is the problem if this be enabled right now.

@@ -5197,6 +5336,39 @@
     }
   }

+  // For aLangGroup corresponding to one of 'ANSI' codepages, the negative
+  // result from fsCsb should be considered final. Otherwise, we risk getting
+  // a false positive from fsUsb, which could lead to 'ransom note' 
+  // style mix of glyphs from different fonts.
+
+  if (!strcmp("x-western", aLangGroup)) 
+    return 0;
+  if (!strcmp("x-central-euro", aLangGroup)) 
+    return 0;
+  for (PRUint8 range = 0; range <= kRangeMaxANSI; ++range) {
+    if (!strcmp(gUnicodeRangeToLangGroupTable[range], aLangGroup)) {
+      return 0;
+    }
+  }

Simply use OR in the two first cases. In fact, what is the problem
with checking against the list in gCharsetInfo[] instead?

These are changes that should have happened in a more relax period. You are
going to wear me out with all those 1.4 reviews :-)
Attachment #123857 - Flags: superreview?(rbs) → superreview-
(Reporter)

Comment 14

15 years ago
I'm sorry to ask you for too many reviews over a short time period near the
release point and thanks for your thorough review.

+#if 0
+    if (!strcmp(langGroup, "x-unicode")) {
+      mTriedAllGenerics = 1;
+      return nsnull;
+    }
+#endif

> I don't see what/where is the problem if this be enabled right now.

No difference in Mozilla's behavior except for one more test that would always
be false because before reaching this part of the code, 'x-unicode' is replaced
by langGroup of the current locale until we fix 91190. I'll just remove '#if 0'

+const PRUint8	 kRangeTableTirtiary	 = 192; 

> where does the magic '192' come from?

Nowhere :-) kRangeTableBase is 128 and there are some values assigned up to
133(or 134), 'kRangeTableBase+5'(or +6). I wanted to leave a sufficiently large
gap beteween kRangeTableBase and kRangeTertitaryTable. It was unnecessarily too
large a gap. Perhaps, 144 (128 + 16: allowing at most 16 subtables) is better.
When we add non-BMP ranges, we probably have to use a separate array.  

+// Between U+0700 and U+16FF, most ranges are 0x80 code points long.
+#define SUBTABLE2_SIZE ((0x17 - 0x07) * 2)

> I don't understand (both the comment and the size calculation).

  Most scripts in the range (U+0700 - U+16FF) are assigned  128 code points each
so that the number of entries in the tirtiray range table' (for that range) is
obtained by dividing the number of code points in the range by 128. Would you
like '(0x1700 - 0x0700) / 128' better with a comment similar to what I'm
writting now? 

As for 'SubrangeTable2', how about 'nsUnicodeTirtiaryRangeTable' and
'TIRTIARY_TABLE_SIZE'?  
 
(Reporter)

Comment 15

15 years ago
Created attachment 123875 [details] [diff] [review]
a patch addressing rbs' concerns
Attachment #123857 - Attachment is obsolete: true
(Reporter)

Updated

15 years ago
Attachment #123857 - Flags: review?(smontagu)
(Reporter)

Updated

15 years ago
Attachment #123875 - Flags: superreview?(rbs)
Attachment #123875 - Flags: review?(smontagu)

Comment 16

15 years ago
Comment on attachment 123875 [details] [diff] [review]
a patch addressing rbs' concerns

Index: gfx/src/windows/nsUnicodeRange.cpp
===================================================================
+    kRangeTirtiaryTable,     //u07xx

I think the spelling is 'Tertiary' as I mentioned earlier.

+// Most scripts in the range between U+0700 and U+16FF take 128 (0x80) 

Do you mean
+// Scripts in the range between U+0700 and U+16FF take at most 128 (0x80) 
?

Index: gfx/src/windows/nsFontMetricsWin.cpp
===================================================================
+// the mapping from bitfield in fsUsb (of FONTSIGNATURE) to UnicodeRange
+// defined in nsUnicodeRange.h. Only the first 96 bits are mapped here. 

because... ?  Add a comment explaining why.


+    // x-unicode pseudo-langGroup should be the last resort to turn to.
+    // That is, it should be refered to only when we don't recognize 
+    // |langGroup| specified by the authors of documents and the 
+    // determination of |langGroup| based on Unicode range also fails 
+    // in |FindPrefFont|.
+
+    if (!strcmp(langGroup, "x-unicode")) {
+      mTriedAllGenerics = 1;
+      return nsnull;
+    }

Just use the above for the 'x-unicode' stuff (with typo in 'psuedo' corrected).
What is happening is logical and fits well with the rest. And we won't have to
come back here anymore. [Other comments, although useful to known, don't help.
Instead, they confuse a bit. Leave the remaining criss-cross to LXR/Bonsai.]

+  // For aLangGroup corresponding to one of 'ANSI' codepages, the negative
+  // result from fsCsb should be considered final. Otherwise, we risk getting
+  // a false positive from fsUsb, [which could lead to 'ransom note' 
+  // style mix of glyphs from different fonts.]

Just say: which could lead unnecessarily to a mix of glyphs
from different fonts.

+  for (i = 1; i < 14; i++) // x-western .. zh-TW. (exclude JOHAB)

retain the comment and use |++i| as done next to you with the existing
enums, i.e.,
// x-western .. zh-TW. (exclude JOHAB)
for (i = eCharset_ANSI; i <= eCharset_CHINESEBIG5; ++i)
...

A third iteration addressing those comments, and you should be fine with me.
Attachment #123875 - Flags: superreview?(rbs) → superreview-
(Reporter)

Comment 17

15 years ago
Created attachment 123938 [details] [diff] [review]
a new patch with revised comments per rbs


+// Most scripts in the range between U+0700 and U+16FF take 128 (0x80) 

> Do you mean
+// Scripts in the range between U+0700 and U+16FF take at most 128 (0x80) 
?

No, I meant what I wrote (of course, there are a lot of reserved points in each
block, but that's not our concern for the purpose at hand.) Most of scripts in
the range are assigned a chunk of 128 code points each. Exceptions are Tibetan
(2 chunks), Canadian aboriginal syllabaries (5? chunks), Korean (2 chunks) and
Ogham and Runic (sub-chunk). The way code points are allocated in that range
makes it rather natural to use a block of 128 code points as a 'unit'.
Attachment #123875 - Attachment is obsolete: true
(Reporter)

Comment 18

15 years ago
Comment on attachment 123938 [details] [diff] [review]
a new patch with revised comments per rbs

Hope it's the last iteration :-)
Attachment #123938 - Flags: superreview?(rbs)
Attachment #123938 - Flags: review?(smontagu)
(Reporter)

Comment 19

15 years ago
Comment on attachment 123875 [details] [diff] [review]
a patch addressing rbs' concerns

sorry for spamming...
Attachment #123875 - Flags: review?(smontagu)

Comment 20

15 years ago
Comment on attachment 123938 [details] [diff] [review]
a new patch with revised comments per rbs

sr=rbs
Attachment #123938 - Flags: superreview?(rbs) → superreview+
Comment on attachment 123938 [details] [diff] [review]
a new patch with revised comments per rbs

r=smontagu
Attachment #123938 - Flags: review?(smontagu) → review+
(Reporter)

Comment 22

15 years ago
Comment on attachment 123938 [details] [diff] [review]
a new patch with revised comments per rbs

Since 1.2.x, the font-pref. UI for Hindi(now changed to Devanagari) has been
present in Mozilla, but it's kinda dummy because there's no actual code
associated with it.  With this low-risk changes (virtually no risk to users of
other scripts), users of two major scripts (Devanagari and Tamil) in India will
have Mozilla that treats them on par with other scripts (under Win2k/XP.) in
terms of rendering.
Attachment #123938 - Flags: approval1.4-
(Reporter)

Comment 23

15 years ago
Comment on attachment 123938 [details] [diff] [review]
a new patch with revised comments per rbs

Since 1.2.x, the font-pref. UI for Hindi(now changed to Devanagari) has been
present in Mozilla, but it's kinda dummy because there's no actual code
associated with it.  With this low-risk changes (virtually no risk to users of
other scripts), users of two major scripts (Devanagari and Tamil) in India will
have Mozilla that treats them on par with other scripts (under Win2k/XP.) in
terms of rendering.

Sorry for spamming to others.
Attachment #123938 - Flags: approval1.4- → approval1.4?

Comment 24

15 years ago
Comment on attachment 123938 [details] [diff] [review]
a new patch with revised comments per rbs

a=mkaply for checkin to 1.4
Attachment #123938 - Flags: approval1.4? → approval1.4+

Comment 25

15 years ago
Comment on attachment 123938 [details] [diff] [review]
a new patch with revised comments per rbs

a=mkaply for checkin to 1.4
(Reporter)

Comment 26

15 years ago
Thanks all, Fix checked in
Status: NEW → ASSIGNED
(Reporter)

Comment 27

15 years ago
Sorry for spamming...
Status: ASSIGNED → RESOLVED
Last Resolved: 15 years ago
Resolution: --- → FIXED
Product: Core → Core Graveyard
You need to log in before you can comment on or make changes to this bug.