Closed Bug 231426 Opened 21 years ago Closed 19 years ago

CJK native font names are not recognized on non-CJK Windows and English names are not on CJK windows

Categories

(Core Graveyard :: GFX: Win32, defect)

x86
Windows 2000
defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: jshin1987, Assigned: jshin1987)

References

Details

(Keywords: intl)

Attachments

(2 files, 4 obsolete files)

Thanks for the reference [0] (see bug 203745 comment #11 and bug 203745 comment
#10). That's new to me. MS must have shuffled around MSDN pages. When I wrote
comment #8, I tried to give a reference. I'm sure I have seen it in the
reference page for EnumFontFamiliesEx [1], but they changed the page and there's
no mention of 'native' font names any more.

Now according to the page you cited, on Win2k/XP, everything should just work
fine because CreateFont/CreateFontIndirect take care of it as long as we use 'W'
version of APIs, which we don't at the moment. The MSDN doesn't make it clear,
but that's the only way 'CreateFont/CreateFontIndirect' can work with both
'native' ('localized name' used in MSDN is a misnomer when refering to CJK names
of CJK fonts) names and  English names on Win2k/XP regardless of the current
system locale. 

On Win 9x/ME and NT4, unless we hard-code a mapping table between 'native' names
 and English names for Windows core fonts for SC, TC, J and K, I think we're out
of luck. We can bypass Win32 APIs  to retrieve whatever we want from a
(truetype) font, but that seems to be too much work (for too little gain). So,
we can do two things on Windows:

1. a. detect the Windows version in |GlobalInit| of nsFontMetricsWin.
   b. use 'W' APIs and Unicode font names (as passed by layout)  on Win 2k/XP
and 'A' APIs and font names converted to the system codepage (with
WideCharToMultiByte) on Win 9x/ME and NT4 [2]
   c. make pretty extensive changes because we use 'A' version of logFont with
|lfFaceName| in a C string throughout. We have to use 'W' version now.

2. This is optional and necessary only on Win 9x/ME and NT4.
   a. add the mapping table mentioned above. 
   b. if CreateFont* fails and lfFaceName of logFont is non-ASCII, look it up in
the mapping table (hash it if necessary) for the corresponding English name. [3]
   c. copy 'W' logFont to 'A' logFont except for lfFaceName, for which we have
to use the English name obtained in step b. 
   d. call CreateFont* (A version) with 'A' logFont

I'm tempted to do just the first of two because 1) that alone may (or may not)
be quite a work 2) the number of those who can benefit from this are relatively
small (users of Win 9x/ME/NT4 the language of which don't match the language of
fonts. e.g. English Win 9x/ME users who want to view Chinese pages). We can do
the second later if we determine that we can't ignore those cases.

[0]
http://msdn.microsoft.com/library/default.asp?url=/library/en-us/gdi/fontext_4rw4.asp
[1]
http://msdn.microsoft.com/library/default.asp?url=/library/en-us/gdi/fontext_4mk8.asp
[2]
http://lxr.mozilla.org/seamonkey/source/gfx/src/windows/nsFontMetricsWin.cpp#2388
    Here, font names in UTF-16 are converted to C string in the current system
code page.
    If we can assume that every Win9x/ME user has MS IE 5.x or higher installed,
we're safe to use 'W' APIs even on Win 9x/ME. In similar cases, however, ftang
and roy didn't make that assumption. Instead, they wrote an equivalent of MSLU
(Microsoft Layer for Unicode).We can do the same here to avoid |if (Win2k/XP)
use W APIs else use A APIs| pattern.
[3] In theory, we have to do the opposite as well. That is, we have to make
English font names get recognized on Win 9x/ME (CJK) and NT4 (with CJK locale).
Therefore, we have to find the corresponding native name in the system code
page. In practice, the need for that is not greate because most CJK web pages
use native font names.
No longer blocks: 203745
Perhaps, switching to CreateFontIndirectW "just works" on Win9x/ME with the CJK
native name, thus recoving the parity with WinXP.

The MSDN artcile says that CreateFontIndirectW needs UnicoWS.dll on Win9x/ME
(the "Microsoft Layer for Unicode on Windows 95/98/Me Systems"). If if "just
works", it might be tempting to switch over, and request users on those "old"
plaforms to get it...

But if that isn't the case and the native font names don't work with IE either,
I wouldn't mind giving the bug a very low priority. It seems to involve rather
intensive changes whichever bullet point you listed.
> Perhaps, switching to CreateFontIndirectW "just works" on Win9x/ME with the CJK
> native name, thus recoving the parity with WinXP.

  As it is now, Korean fonts names work on Korean Win 9x/ME and Win 2k/XP with
Korean locale. However, they don't work on non-Korean Win 9x/ME and Win 2k/XP
with non-Korean locale.  The other way around, English names for Korean fonts
are not recognized on Korean Win 9x/ME and Win 2k/XP with Korean locale. Both
are true of SC, TC and J. 

Switching to 'W' APIs solve both problems for Win 2k/XP. However, it doesn't
solve either of the problem on Win 9x/ME and NT4. 'W' APIs on Win 9x/ME
supported by MSLU are just wrappers over 'A' APIs with WideCharToMultiByte
called on our behalf for wchar_t strings and can't do anything beyond what
native 'A' APIs can do on that platform. Therefore, requiring MSLU doesn't earn
much for us.

 A way to "solve' this problem on Win 9x/ME and NT4 is to hard-code a mapping
table for widely used fonts (i.e. the solution is not universal) and try
alternative names if the first attempt fails. Alternative is to poke into fonts
ourselves.  This is #2 item in comment #0.


As for MS IE 6, I'm pretty sure MS IE didn't work either until recently [1].
However, the latest version works on WIn 2k/XP (I'll attach a test case to bug
203745). I have to boot to Windows ME (English version) and see how MS IE works
there. What Opera does on Win 2k/XP is inconclusive...At least it can't
recognize English names for Korean fonts on Win2k/XP (with Korean locale), but
it seems like it can recognize C/J names for C/J fonts on Win2k/XP with Korean
locale. 

Given that numerous CJK web pages specify only native font names  and MS IE now
works correctly on Win 2k/XP. I think we have to do the first of two in comment
#0 (maybe not very urgent but nonetheless rather soon). The second item should
be given a very low priority as you wrote. 

[1] It could have been the case, in a typical MS fashion, that they had known
the 'trick' (undocumented) and used it in MS IE while leaving relevant MSDN
pages out of date.
Status: NEW → ASSIGNED
>  As it is now, Korean fonts names work on Korean Win 9x/ME and Win 2k/XP with
>Korean locale.

So, this side of the equation is fine.

> However, they don't work on non-Korean Win 9x/ME and Win 2k/XP
>with non-Korean locale.

This is the other side that prompted my earlier query in the other bug. The MSDN
article seemed to imply that, in this situation, given the CJK name on input,
EnumerateFontEx will return the English name on output, which can then be
subsequently used in CreateFont, thereby saving from hardcoding a mapping table.
That's is why I asked if this was also your reading of that article?

> The other way around, English names for Korean fonts
>are not recognized on Korean Win 9x/ME and Win 2k/XP with Korean locale. Both
>are true of SC, TC and J.

This case remains open, knowing that the first goal is to find a cheap way to
support the CJK native name, "given that numerous CJK web pages specify only
native font names" as you said.
I guess you referred to the following :
> To get the appropriate font on different language versions of the OS, call 
> EnumFontFamiliesEx with the desired font characteristics in the LOGFONT 
> structure, retrieve the appropriate typeface name, and create the font using 
> CreateFont or CreateFontIndirect.

It's not very clear (it seems to imply that). Anyway, assuming it does, this
will solve the problem on 2k/XP (if we use 'W' APIs), but I suspect it will have
no effect on Win 9x/ME  because there's no _genuine_ W API on Win 9x/ME. For
instance, suppose  you pass a Korean font name to EnumFontFamilesExW on English
Win 9x/ME (with MSLU installed). The Korean font name in UTF-16 will be
converted to CP1252 before EnumFontFamilesExA is called. Because CP1252 doesn't
have any Korean character, the font name will turn to '?' or sth like that and
EnumFontFamilesExA can't do any magic with that. 
> there's no _genuine_ W API on Win 9x/ME

Some (such as ExtOutW) are indeed genuine. Should EnumFontFamilesExW be genuine
too, it would do the magic.
You're right. After wrting comment #4, I realized that ExtTextOutW is genuine on
Win 9x/ME. So, I expected MSDN not to mention the need for MSLU  in the
reference page of ExtTextOutW, but it is not the case. Anyway, the only way to
figure it out is to experiment. 
Blocks: 145288
http://www.trigeminal.com/samples/font_choices.html
has some information on this issue.
http://msdn.microsoft.com/library/default.asp?url=/library/en-us/mslu/winprog/other_existing_unicode_support.asp
has the list of 'W' APIs natively implemented on Win 9x/ME. 
Depends on: mzlu
rbs, I didn't realize that you had made this a blocker to bug 145288 until a
moment ago. Anyway, I hit upon basically the same idea as yours (bug 145288,
comment #28).
I didn't know about gFamilyName so that my patch just uses a new persistent
properties (gFontNameProperties), but I'll change that. Fortunately, as a proof
of concept, I didn't yet implement caching. It'd have been a total waste of my
name because everything is there for me (gFamilyName) :-)
Attached file testcase with CJK Windows 'core' fonts (obsolete) —
Attachment #173505 - Attachment is patch: false
Attachment #173505 - Attachment mime type: text/plain → text/html; charset=UTF-8
Attached file test case(corrected)
Sorry for bug spam
Attachment #173505 - Attachment is obsolete: true
Attached patch patch (obsolete) — Splinter Review
This works well in all the cases as far as fonts listed in
fontNameMap.properties are concerned. It also 'fixes' bug 145288.

rbs, can you take a look? 

With CJK Windows core fonts installed (regardless of the system locale), all 4
combinations (for the 5th line, I only specify 'lang=xx' so that it's normal
that it looks different) for each font should look identical (btw, I had a typo
in my script that generated  attachment 173506 [details]. The corrected version is at
http://jshin.net/moztest/cjkfonts2.html )
Attachment #173544 - Flags: review?(rbs)
Comment on attachment 173544 [details] [diff] [review]
patch

It seems that your patch is not equivalent to what was there.

The winName used to behave as if it was a full-blown substitution directly into
the given CSS font-family list.

Thus, it was used in
LookForFontWeightTable(aDC, aName) as well.

Now, your patch skips that. Beware to check other places too.
Attached patch patch update (obsolete) — Splinter Review
> It seems that your patch is not equivalent to what was there.

In my patch, the mapping function is called later (when LogFont struct is
filled up for Win32 'A' APIs. WideCharToMultiByte was used in that place) than
it was before (FindLocalFont). As you pointed out, I missed the other place
where I should have called LookupWinFontName (as a substitute for
WideCharToMultiByte). In this patch, I took care of it.
Attachment #173544 - Attachment is obsolete: true
Attached patch patch (alternative) (obsolete) — Splinter Review
In this patch, LookupWinFontName is used in place of PL_HashLookup() in
FindLocalFont(). WideCharToMultiByte() has to be used down the road because
with this patch, nsString is hashed instead of nsCString (the latter can  be
passed over directly to 'A' APIs while the former need go through WCtoMB()).

The code size is slightly smaller this way, but perf-wise, it's a little
(perhaps a very little) slower than attachment 174029 [details] [diff] [review]. In addition, this patch
has a lower risk of regression.

rbs, can you review whichever you like better?
Attachment #173544 - Flags: review?(rbs)
Comment on attachment 174029 [details] [diff] [review]
patch update 

I prefer this one, with the following comments.

 PLHashTable* nsFontMetricsWin::gFontMaps = nsnull;
 PRUint16* nsFontMetricsWin::gEmptyCCMap = nsnull;
+nsString nsFontMetricsWin::gCodepageStr; 

Better to use a pointer (like the other variables next to you).

==================

I find |LookupWinFontName| rather convoluted. Why have both
|gFamilyNames| and |gFontNameMapProperties|, and go to great 
lengths to add things in the hash on OOM conditions. Adding things can 
only trigger other OOM conditions.

I suggest to only have the properties map (it is a hash), and change
the signature to:

/* static */ void
LookupWinFontName(const nsAString& aName, nsAString& aWinName)
{
   aWinName = aName;

+  [...] // early check

+  if (!gFontNameMapProperties)
+    NS_LoadPersistentPropertiesFromURISpec(&gFontNameMapProperties,
+      NS_LITERAL_CSTRING("resource://gre/res/fonts/fontNameMap.properties"));

   if (gFontNameMapProperties) {
     ... look up there
   }
}

=================

Also, I think it might be easier to simply the property file as:

AsciiName.cp = CJKName 
CJKName = AsciiName.cp

with the lookup becoming:

if (IsASCII(aName)) {
  lookup value of name.cp
}
else {
  lookup value of name
  check that value ends with cp (with the existing function StringEndsWith)
  and truncate
}

==================

 typedef struct nsFontFamilyName
 {
   char* mName;
   char* mWinName;
 } nsFontFamilyName;

-static nsFontFamilyName gFamilyNameTable[] =

The struct is not needed anymore since nobody uses it.

===================

+  // There's little worry about characters not covered by the current 
+  // codepage because LookupWinFontName invoked in FindLocalFont 
+  // takes care of most cases. 


Suggesting:

The risk of loosing characters not covered by the current 
codepage is reduced because LookupWinFontName invoked earlier 
has taken care of most cases.
Thanks for review and suggestion.

(In reply to comment #16)
> (From update of attachment 174029 [details] [diff] [review] [edit])
> I prefer this one, with the following comments.

attachment 174029 [details] [diff] [review] doesn't have "There's little worry ....." you rewrote. It's
only in attachment 174030 [details] [diff] [review]. So, I'm a little confused about your preference. 

> I find |LookupWinFontName| rather convoluted. Why have both
> |gFamilyNames| and |gFontNameMapProperties|, and go to great 

Yes, it's convoluted because I wanted to do 'lowercase', 'stripwhitespace',
'code page' comparison and WCtoMB conversion (in case of attachment 174029 [details] [diff] [review]) only
for the first time. With your suggestion for the properties file, it's simpler
than before, but still I have to call 'lowercase' (not just for the ASCII range
but for the whole Unicode repertoire which is quite expensive) and
'stripwhitespace' (and WCtoMB in case of 174029) everytime. Do you think we can
live with perf. lose arising from that? 

I think I didn't pick the right attachment. I meant attachment 174030 [details] [diff] [review].

I don't anticipate a perf hit. I feared the premture optimization instead. There
is 1) the early check that you had 2) the relatively fast path for the common
ASCII case and then, 3) the fall through for the rare CJK case.
Attachment #174029 - Attachment is obsolete: true
Attachment #174030 - Attachment is obsolete: true
Attachment #174441 - Flags: superreview?(dbaron)
Attachment #174441 - Flags: review?(rbs)
Comment on attachment 174441 [details] [diff] [review]
patch without caching per rbs' comment

On seeing that you opted not to assign/truncate anything at the start of
LookupWinFontName, you might want to then use a |PRBool LookupWinFontName|, and
do:

     // the familyNameQuirks should not affect generic & global fonts
+    nsAutoString winName; 
+    PRBool found = LookupWinFontName(*name, winName); 
-    nsFontWin* font = LoadFont(aDC, *winName, mFont.familyNameQuirks);
+    nsFontWin* font = LoadFont(aDC, found? winName : *name,
mFont.familyNameQuirks);

(this way, there is no implied dependency on the caller passing an empty
string.)

r=rbs with that.
Attachment #174441 - Flags: review?(rbs) → review+
Comment on attachment 174441 [details] [diff] [review]
patch without caching per rbs' comment

thanks. I made the change in my tree. Besides, I got rid of hash callback
functions (for gFamilyNames).
This patch looks like it removes all the mapping that was represented by
gFamilyNameTable.

Also, I'm not crazy about using a .properties file for something that could
easily be in the .cpp file, although I guess having to encode UTF-8 or UTF-16 in
the .cpp file makes it a bit of a pain.
The mappings in gFamilyNameTable don't have much value anymore (see bug 145288). 

As for a .properties, yeah I don't have strong feelings either way too. I had
initially thought that jshin could just leverage gFamilyNameTable and hardcode
the mappings there. (That's why I marked bug 145288 dependent on this - see bug
145288 comment 28).

jshin, I guess you decided to use a .properties because it eases extensibility
down the track, and you preferred to resolve the problem once for all?
It seems like Visual C++ doesn't like having UTF-8 literal strings in the source
code. (I think I tried them only to get a strange error.) Moreover, other people
who work on the file need to be careful when editing the file. Well, I can use
'\xE..\x8......blah blah', but that's not pretty. More importantly, as rbs
mentioned, I want to let end-users (or packagers) to add fonts they need. Hmm, I
may have to change the wording about 'localizability' in the properties file.
'One should not translate anything, but can add fonts'. 
 
Localizers shouldn't be able to add fonts; if fonts need to be added, they
should be added for everyone.

Furthermore, since this isn't in a localization JAR package or chrome URL,
localizers shouldn't even get to a point where they consider translating it.
Comment on attachment 174441 [details] [diff] [review]
patch without caching per rbs' comment

I'm a little skeptical about removing something like that during a beta cycle,
but sr=dbaron.

(Does the "to speed start-up" case vary by user, or is it always teh exact same
fonts?)
Attachment #174441 - Flags: superreview?(dbaron) → superreview+
landed on the trunk on March 6th
Status: ASSIGNED → RESOLVED
Closed: 19 years ago
Resolution: --- → FIXED
Product: Core → Core Graveyard
You need to log in before you can comment on or make changes to this bug.