Last Comment Bug 90804 - Fizzilla Doesn't Display Unicode Text that OmniWeb can
: Fizzilla Doesn't Display Unicode Text that OmniWeb can
Status: RESOLVED FIXED
MacOS X
: intl
Product: Core
Classification: Components
Component: Internationalization (show other bugs)
: Trunk
: PowerPC Mac OS X
: P3 enhancement (vote)
: mozilla0.9.6
Assigned To: Frank Tang
: Yuying Long
Mentors:
http://www.methymna.com/unicode/texts...
Depends on: 94745
Blocks: 103669 102998 107067
  Show dependency treegraph
 
Reported: 2001-07-14 06:07 PDT by PTRourke
Modified: 2002-06-15 15:08 PDT (History)
8 users (show)
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---


Attachments
patch (4.14 KB, patch)
2001-07-26 16:20 PDT, Frank Tang
no flags Details | Diff | Review
new file mozilla/gfx/src/mac/nsMacUnicodeFontInfo.h (1.20 KB, text/plain)
2001-08-09 09:21 PDT, Frank Tang
no flags Details
new file mozilla/gfx/src/mac/nsMacUnicodeFontInfo.cpp (9.88 KB, text/plain)
2001-08-09 09:22 PDT, Frank Tang
no flags Details
diff (1.59 KB, patch)
2001-08-09 09:24 PDT, Frank Tang
no flags Details | Diff | Review
2nd version of new file mozilla/gfx/src/mac/nsMacUnicodeFontInfo.cpp new file mozilla/gfx/src/mac/nsMacUnicodeFontInfo.cpp new file mozilla/gfx/s (9.80 KB, text/plain)
2001-08-09 09:50 PDT, Frank Tang
no flags Details
3rd version of new file mozilla/gfx/src/mac/nsMacUnicodeFontInfo.cpp new file mozilla/gfx/src/mac/nsMacUnicodeFontInfo.cpp new file mozilla/gfx/s (10.25 KB, text/plain)
2001-08-15 17:40 PDT, Frank Tang
no flags Details
nsMacUnicodeFontInfo.h (1.98 KB, text/plain)
2001-10-19 14:40 PDT, Frank Tang
no flags Details
nsMacUnicodeFontInfo.cpp (10.14 KB, text/plain)
2001-10-19 14:41 PDT, Frank Tang
no flags Details
diff v2 (1.54 KB, patch)
2001-10-19 14:42 PDT, Frank Tang
no flags Details | Diff | Review
v3 of nsMacUnicodeFontInfo.cpp which include shutdown observer (11.38 KB, text/plain)
2001-10-22 10:42 PDT, Frank Tang
no flags Details
v4 of nsMacUnicodeFontInfo.cpp (14.47 KB, text/plain)
2001-10-24 18:24 PDT, Frank Tang
mikepinkerton: review+
Details
v5 nsMacUnicodeFontInfo.cpp (14.03 KB, patch)
2001-10-26 18:07 PDT, Frank Tang
no flags Details | Diff | Review

Description PTRourke 2001-07-14 06:07:41 PDT
Fizilla's repertoire of Unicode characters it can render is significantly
smaller than OmniWeb's.  On  the same system, the page indicated above renders
perfectly in OmniWeb, but renders without the Greek Extended characters in
Fizilla.  However, if I copy the text from the Fizilla window and paste it into
TextEdit, it is rendered perfectly. Note that Internet Explorer 5.1 beta has the
same limitations. It doesn't render properly in OS9.1 either, but OS9.1 has far
greater limitations in this regard, so for OS9.1 I would expect it to be the
OS's fault.

Page renders perfectly in Mozilla for Windows. 

See also Alan Wood's Unicode pages at
http://www.hclrss.demon.co.uk/unicode/index.html . Was unable to view many of
the ranges.
Comment 1 PTRourke 2001-07-17 06:56:22 PDT
Reporter changed severity to "enhancement" based on information from Unicode 
list:

>> Thus, to name one example, Internet Explorer is Carbonized but not
>> ATSUI-savvy, and so can't handle all of Unicode. Currently, most
>> Carbonized applications are not ATSUI-savvy.

> Same as Mozilla/ Netscape6. We only use ATSUI in a very
> limited way. . . . I heard that problem [with ATSUI] has been
> addressed with MacOSX. But I have not have chance to verify it.

Comment 2 Henri Sivonen (:hsivonen) 2001-07-19 01:54:18 PDT
An out-of-the-box (non-Greek) Mac OS X installation doesn't come with sufficient
fonts. ptrourke@ziplink.net, could you give a pointer to some ATSUI-compatible
free Greek font? (Does such a font exist?)

Sending over to i18n.
Comment 3 PTRourke 2001-07-19 06:35:25 PDT
TrueType font Athena Unicode by Jeffrey Rusten works with TextEdit, WorldText, 
OmniWeb, and OSX Mail, so I would assume that it's ATSUI compatible. I do not 
know how "free" it is, however. It is "free" as in "free lunch." His website is
http://www.arts.cornell.edu/classics/Faculty/Rusten/greekkeys/faq.htm; the font
can still be downloaded from http://www.jiffycomp.com/smr/unicode/athena.zip .


There are also shareware fonts: Code2000 by James Kass
(http://home.att.net/~jameskass/) I have tested; I have not tested the TrueType
Unicode fonts (i.e., have not tested on OSX yet) Vusillus Old Face Italic, by
Ralph Hancock (http://www.users.dircon.co.uk/~hancock/antioch.htm), or Titus
Cyberbit Basic, by the Titus Project
(http://titus.fkidg1.uni-frankfurt.de/unicode/tituut.asp). I think there's a 
licensing arrangement between TP and Bitstream, though.


Freeware fonts (whether "free" or free, I do not know) I have not tested include
fonts Georgia Greek, by Richard Spaulding Jr., http://www.uvm.edu/~rspauldi/, or
Cardo, by David Perry, http://members.telocity.com/~perryd/cardofnt.html.


I would be happy to test each of the above listed (TrueType) fonts in the
ATSUI-aware applications I listed at the top if so desired; I can provide email
addresses for most of these folks privately if requested.


I assume you're aware of http://fonts.apple.com/LastResort/LastResort.html?


Thanks!!!
Comment 4 nhottanscp 2001-07-19 14:10:22 PDT
Confirm, it shows on Windows 2000 but not on MacOS X.
Comment 5 nhottanscp 2001-07-19 14:13:08 PDT
Reassign to ftang.
Frank, who is showing '?' when viewing
http://www.methymna.com/unicode/texts/sappho31.html, mozilla or ATSUI?
Comment 6 Frank Tang 2001-07-23 14:13:17 PDT
We have code control which characters could be used with ATSUI in 
gfx/src/mac/nsUnicodeRenderingToolkit.cpp
Look at nsUnicodeRenderingToolkit :: ATSUIFallbackDrawChar
and nsUnicodeRenderingToolkit :: ATSUIFallbackGetWidth

The if condiction in the following lines control which character will be used by
ATSUI
275 if (nsATSUIUtils::IsAvailable()  
276 && (IN_STANDARD_MAC_ROMAN_FONT(*aCharPt)
||IN_SYMBOL_FONT(*aCharPt)||SPECIAL_IN_SYMBOL_FONT(*aCharPt)))
277 {


The reason that we use if condiction is because in 8.5 the ATSUI is very slow
(hang the app for several minutes) to display a page which have characters that
there are no font to render 

Try change line 276 and 309 to ")" and see what will happen on MacOS X.
We should not do that if it is very vey slow.
Maybe we should do that for only MacOS X
Comment 7 Frank Tang 2001-07-23 14:16:35 PDT
you can try to see how slow/fast of display different Unicode by visiting
http://people.netscape.com/ftang/demo/ncr.cgi?r=0x2&c=0x0
The extended Greek block is in 1f00 block 

http://people.netscape.com/ftang/demo/ncr.cgi?r=0x1&c=0xf
Comment 8 Frank Tang 2001-07-26 16:20:08 PDT
OK, I made a patch, this patch will make ASTUI for all character if ATSUI take 
less than one minutes to measure all Unicode code point as total. 

Comment 9 Frank Tang 2001-07-26 16:20:42 PDT
Created attachment 43725 [details] [diff] [review]
patch
Comment 10 Frank Tang 2001-07-26 16:39:07 PDT
I try  OmniWeb to view http://people.netscape.com/ftang/demo/ncr.cgi?r=0x1&c=0xf

 It basiscally hang OmniWeb for about 10 second on my G3. 
I think ATSUI still very very bad for the case of no glyph on any font. Untill
that get fix, there are no way we can use ATSUI for general purpose. The current
patch is still good, and while the ATSUI performance for worst case is fixed, it
will then use it automatically. 
Comment 11 Mike Pinkerton (not reading bugmail) 2001-07-26 16:53:09 PDT
will this init of atsui (the runthrough of 0000 to FFFE) occur the first time we 
try to render some non-roman font, or will this happen the first time we try to 
render any text at all?
Comment 12 Frank Tang 2001-08-07 11:01:35 PDT
It will be called the first time that we have no way to render the text using
ANY available Mac Script code. So... for example, if your Mac have MacRoman and
Japanese installed, the code will be init the first time we try to
measure/display some characters which cannot be displayed by MacRoman neither
Japanese (say Arabic , or the ISO-8859-1 "1/4" sign which are not part of
MacRoman). The good thing is it loop from 0x0000 to 0xffff one by one and once
the *total* time spend more than 1 second it will quit the loop and return fail.
We may need to tweak the value to a different value later. So... in the worst
case, we won't spend more than one second in this routine in the life time of
the application.  (Unless a single call to ATSUI to measure one character take
more than one second, but it will quit the loop right after that) 
Comment 13 Frank Tang 2001-08-09 08:56:32 PDT
I am close to have a better fix for MacOS X. I use FMGetFontTable to get the 
CMAP and reuse some of the cmap parsing code erik implemented on 
nsFontMetricsWin.cpp to decide which unicode could be render and put into a 
global bitmask cache (8K) . The performance seems ok ( need to measure) and I 
can see Yi characters after I install SIL Yi fonts. 

Test proceudre:
I. Install SIL Yi font:
I.1  download SILYI.ZIP from 
http://www.sil.org/computing/fonts/silyi/download.htm
I.2 extract the silyi.ttf, make sure the file is 'sfnt'
I.3 on MacOS X , drag the silyi.ttf into /Library/Fonts directory

II. See the Yi characters:
visit http://people.netscape.com/ftang/demo/ncr.cgi?r=0xa&c=0x0
the Yi characters is encoded in U+A000 ~U+A4C?

I will attach a patch later. 
Comment 14 Frank Tang 2001-08-09 09:20:14 PDT
The new patch include three attachment. Two new files and one diff.
Comment 15 Frank Tang 2001-08-09 09:21:55 PDT
Created attachment 45238 [details]
new file mozilla/gfx/src/mac/nsMacUnicodeFontInfo.h
Comment 16 Frank Tang 2001-08-09 09:22:38 PDT
Created attachment 45240 [details]
new file mozilla/gfx/src/mac/nsMacUnicodeFontInfo.cpp
Comment 17 Frank Tang 2001-08-09 09:24:49 PDT
Created attachment 45241 [details] [diff] [review]
diff
Comment 18 Frank Tang 2001-08-09 09:32:10 PDT
My measurement show it take 1.894383539 second to run through 
nsMacUnicodeFontInfo::InitGolbal
on my G3 which have 128M RAM running MacOS 9.1 and have all language pack
installed (about 82 fonts). This ~ 2 seconds hit will only happen when we try to
render an unicode characters which cannot be converted to any available scripts
code. and after that hit, it run very smoothly. (because we won't ask ATSUI to
render any characters which do not have a glyph on the system)

The measurement is done before I add the Gestalt version code which make it only
run on MacOS X so if you want to measure by yourself, you need to #if 0 that
part of code and uncomment the line 
//#define TRACK_INIT_PERFORMANCE

it will print the nanosecond spend on the init code.

I cannot get a MacOS X number since I don't know how to show to console on MacOS
X yet. It feel liks similar speed. 

With this patch, we enable more uniocde rendering with Window TTF fonts. 
Comment 19 Frank Tang 2001-08-09 09:32:35 PDT
remove future 
Comment 20 Frank Tang 2001-08-09 09:48:53 PDT
The last mozilla/gfx/src/mac/nsMacUnicodeFontInfo.cpp  won't link correctly on 
non Carbon build. Use the new one. 
Comment 21 Frank Tang 2001-08-09 09:50:12 PDT
Created attachment 45248 [details]
2nd version of new file mozilla/gfx/src/mac/nsMacUnicodeFontInfo.cpp new file mozilla/gfx/src/mac/nsMacUnicodeFontInfo.cpp new file mozilla/gfx/s
Comment 22 Jean-Marc Desperrier 2001-08-10 04:39:04 PDT
Franck,

I think it would be very useful if there was a way that someone, who uses a Mac 
OS version that has the performance problem, could force Mozilla to use ATSUI 
nevertheless, because they know they have installed fonts with the support for 
all the characters they need to display.

I know a number of people who really need a browser they can use to display 
ancient greek and can't switch to Mac OS X right now. 
They can install some of the TTF fonts ptrourke has listed, so that the 
performance problem will not hit them, because all the characters will be 
presents in fonts installed in the system.

So if there was an option : "force system to use ATSUI for unknown characters", 
in Apperance/fonts that would be great. 
Even an option, for that that would not be accessible through the GUI would be 
very useful.

In your patch http://bugzilla.mozilla.org/showattachment.cgi?attach_id=43725, 
that would be a way to force gUseATSUIInGeneral to 1, without calling  
TestATSUIForGeneralUsage();
Comment 23 Jean-Marc Desperrier 2001-08-10 04:42:43 PDT
By the way, up to now, under Mac OS, in appearance/font, I have never seen the 
TTF fonts listed.
Is this a bug or an enhancement needed ? 
Is this different in Mac OS X ?
Comment 24 Frank Tang 2001-08-10 11:30:02 PDT
 Jean-Marc Desperrier: Did you build?
can you try the patch in attachment (id=45248), it is much better . I can see 
those Unicode Yi characters there from the SIL YI font. I believe it will solve 
the Greek problem too.

Yes, the font name issue sound a seperate bug, could you file a new bug against 
me. thanks. It will be better if you can be explicit which font name is missing. 
That will help us debug. Thanks.  I am not sure which font have this issue right 
now. 

Comment 25 Jean-Marc Desperrier 2001-08-10 12:12:52 PDT
I have access to an eMac with Mac OS 9.1 FR, but without Code Warrior.
And I don't have access to it right now, maybe sometime next week.
I'd love to test that patch still, if you could make me a binary available.

I thought the support for FMGetFontTable was only in Mac OS X, and the solution 
applied only to Mac OS X. 
I've found that the doc says it's there from Mac OS 9 and up.
It won't work for Mac OS 8.5, 8.6 users, but that's getting better already.

About the font question : I have never seen any TTF font name appear in the list 
in appearance/font. I thought it was normal as ATSUI was not used.
Comment 26 Frank Tang 2001-08-10 14:10:56 PDT
>I've found that the doc says it's there from Mac OS 9 and up.
Somehow I cannot link in the non Carbon build with those function. Maybe we have 
not upgrade the library to the latest one yet.

Let's target this for MacOS X first. (as what I always said- inventor ignore). 
And once we hit that, then we try to bring it back for MacOS 9 users. How is 
that. If it is too difficult, at least the MacOS X users can get it.

>I have never seen any TTF font name appear in the list 
in appearance/font
Which TTF font name you see from other app ? Tell me some name so I can debug. 
thanks. File another bug for that, please. 
Comment 27 PTRourke 2001-08-10 14:58:46 PDT
> Let's target this for MacOS X first. 
> (as what I always said- inventor ignore). 
> And once we hit that, then we try to bring 
> it back for MacOS 9 users. How is that. If 
> it is too difficult, at least the MacOS X 
> users can get it.

As a user, that makes perfect sense to me; it will make sure that future Mac
users will have it (since all the new Macs have OS X); then you can worry later
about what will soon be in effect a legacy OS.  

Probably should open a new bug for that (Preference for ATSUI-based Unicode
support for OS9) when appropriate? 

Thanks.

Comment 28 Frank Tang 2001-08-15 12:24:45 PDT
Which TTF font name you see from other app ? Tell me some name so I can debug. 
thanks. File another bug for that, please. 
Comment 29 Frank Tang 2001-08-15 17:40:38 PDT
Created attachment 46016 [details]
3rd version of new file mozilla/gfx/src/mac/nsMacUnicodeFontInfo.cpp new file mozilla/gfx/src/mac/nsMacUnicodeFontInfo.cpp new file
                                                      mozilla/gfx/s
Comment 30 Frank Tang 2001-08-15 17:44:25 PDT
please review the code
the needed stuff is
1. mozilla/gfx/src/mac/nsMacUnicodeFontInfo.h see 
http://bugzilla.mozilla.org/showattachment.cgi?attach_id=45238
2. 3rd version of mozilla/gfx/src/mac/nsMacUnicodeFontInfo.cpp
http://bugzilla.mozilla.org/showattachment.cgi?attach_id=46016
3. diff in mozilla/gfx/src/mac/nsUnicodeRenderingToolkit.cpp
http://bugzilla.mozilla.org/showattachment.cgi?attach_id=45241

pinkerton and sfraser, can you r= and sr= this one ?
on MacOS 9, it is usually a 1.8 second hit on my G3
on MacOS X, it is about a 2.3 second hit on my G3

We won't hit that 1.8 / 2.3 second untill we hit a characters which cannot be 
encoded in any traditional Mac script code. And that hit is one time per 
application launch. it is lazy .
Comment 31 Frank Tang 2001-08-16 10:54:08 PDT
A Big chunk of code in nsMacUnicodeFontInfo.cpp ('cmap' parsing )
are copy and modifie from nsFontMetricsWin.cpp wrote by erik

I hope we can land this by m94. Should be low risk 
Comment 32 Henri Sivonen (:hsivonen) 2001-08-20 11:57:23 PDT
BTW, the Hiragino font family that comes with Mac OS X has pretty nice extended
Greek glyphs. (I hadn't noticed until now. It has glyphs for the Russian subset
of Cyrillic, too.)
Comment 33 Henri Sivonen (:hsivonen) 2001-08-21 07:05:26 PDT
Oops. I looked at isolated glyphs. In the Hiragino family, the width Greek
glyphs is tuned for mixing with Han/Kanji blocks making the Hiragino fonts
unusable for pages where Greek is the dominant script.

Lucida Grande has variable-width glyphs for Greek.
Comment 34 Mike Pinkerton (not reading bugmail) 2001-08-21 16:46:06 PDT
a couple of things:
- InitGolbal(). Should that be InitGlobal()?
- I don't really care for macros. Use inline functions or c++ const decls 
instead. Macros don't give you the benefit of type-checking and they're a pain to 
wade through when there are compiler errors.
- you shouldn't be using PR_Malloc(). Use nsMemory::Alloc()/Free() instead.

fix those and r=pink.
Comment 35 Frank Tang 2001-08-30 17:42:48 PDT
I think it is too late for m0.9.4. change it to m0.9.5
Comment 36 Frank Tang 2001-08-31 13:16:21 PDT
mark p3.
Comment 37 msanz 2001-09-10 22:21:11 PDT
nsbranch- since Frank moved it to 0.9.5
Comment 38 J Luh 2001-09-23 13:56:06 PDT
Mozilla doesn't display the check mark character, hex 2713. The ? character
shows up instead. The TextEdit application displays this character properly.
Comment 39 Jean-Marc Desperrier 2001-10-04 10:10:59 PDT
Nothing new happening on this bug since end of August ?
Now it has missed the 0.9.5 Milestone.
Comment 40 Frank Tang 2001-10-08 10:37:48 PDT
move to 0.9.7
Comment 41 Frank Tang 2001-10-19 14:40:43 PDT
Created attachment 54269 [details]
nsMacUnicodeFontInfo.h
Comment 42 Frank Tang 2001-10-19 14:41:33 PDT
Created attachment 54270 [details]
nsMacUnicodeFontInfo.cpp
Comment 43 Frank Tang 2001-10-19 14:42:32 PDT
Created attachment 54271 [details] [diff] [review]
diff v2
Comment 44 Frank Tang 2001-10-19 14:46:20 PDT
pinkerton- can you nicely review this again ? I now use the compressed char map
bstell developed to save some space. The 8K map still needed on the heap.
Comment 45 Kathleen Brade 2001-10-22 08:59:00 PDT
I don't understand this method in the API:
  FreeGlobal
No code appears to call this (yet?)
Comment 46 Frank Tang 2001-10-22 10:29:40 PDT
>I don't understand this method in the API:
>  FreeGlobal
>No code appears to call this (yet?)
You are right, I forgot to hook it up.
Comment 47 Frank Tang 2001-10-22 10:41:52 PDT
Comment on attachment 54270 [details]
nsMacUnicodeFontInfo.cpp

forget to include code to free global
Comment 48 Frank Tang 2001-10-22 10:42:56 PDT
Created attachment 54499 [details]
v3 of nsMacUnicodeFontInfo.cpp which include shutdown observer
Comment 49 Frank Tang 2001-10-22 10:44:43 PDT
just update the nsMacUnicodeFontInfo.cpp to v3 so it include the code to clean
up the global memory when we shutdown. (code copy & modified from
nsFontMetricsWin.cpp)
Comment 50 Frank Tang 2001-10-22 15:51:33 PDT
pinkerton- can you review the code?
Comment 51 Frank Tang 2001-10-23 09:11:43 PDT
>a couple of things:
>- InitGolbal(). Should that be InitGlobal()?
addressed

>- I don't really care for macros. Use inline functions or c++ const decls 
>instead. Macros don't give you the benefit of type-checking and they're a pain to 
>wade through when there are compiler errors.
Try to sync with Window and GTK code. don't want to change this.

>- you shouldn't be using PR_Malloc(). Use nsMemory::Alloc()/Free() instead.
changed.
Comment 52 Mike Pinkerton (not reading bugmail) 2001-10-23 10:11:55 PDT
a couple of comments:

- there are several places where you increment by a constant. I'm assuming those
are sizes of characters or longs. use sizeof(PRUnichar) or sizeof(long) (or
whatever is correct) instead of the bare constants. If not, then comment why
you're adding the constant.
    PRUint8* p = buf + 2;
    ...
    p += 2;
    PRUint16 encodingID = GET_SHORT(p);
    p += 2;
    offset = GET_LONG(p);
    p += 4;

- there are some tabs mixed in with InitGlobalCCMap()
- use enums/constants for the format and platformID codes rather than just using
the comment to tell what they mean in FillFontInfoFromCMap()

Comment 53 Jean-Marc Desperrier 2001-10-23 11:05:53 PDT
I believe the constants here are values that are garanteed to be 2 or 4 byte 
long by the TTF format definition, not characters size.
But SIZE_SHORT and SIZE_LONG instead of 2 and 4 would probably be more explicit.

This said, what follows below does not look very good. That code could end up 
being reused on another architecture/the processor could change. 
Isn't there already a byte reversal macro somewhere that will do nothing if not 
needed ?

// this is an big endian version, not good for little endian machine
#undef GET_SHORT
#define GET_SHORT(p) (*((PRUint16*)p))
#undef GET_LONG
#define GET_LONG(p)  (*((PRUint32*)p))
Comment 54 Frank Tang 2001-10-23 15:04:57 PDT
>- there are several places where you increment by a constant. I'm assuming those
>are sizes of characters or longs. use sizeof(PRUnichar) or sizeof(long) (or
>whatever is correct) instead of the bare constants. If not, then comment why
>you're adding the constant.
>    PRUint8* p = buf + 2;
>    ...
>    p += 2;
>    PRUint16 encodingID = GET_SHORT(p);
>    p += 2;
>    offset = GET_LONG(p);
>    p += 4;

See nsFontMetricsWin.cpp . The code is copy from there. We try to sync with
window code there. They are offset in TrueType.

>- there are some tabs mixed in with InitGlobalCCMap()
I will fix that. 
>- use enums/constants for the format and platformID codes rather than just using
>the comment to tell what they mean in FillFontInfoFromCMap()
There are no meaning enums or constants for the format ID. The TrueType spec say
the following : 
(see 
http://developer.apple.com/fonts/TTRefMan/RM06/Chap6cmap.html
for details)

The 'cmap' formats

The Macintosh standard character to glyph mapping is supported by format 0.
Format 2 supports a mixed 8/16 bit mapping useful for Japanese, Chinese and
Korean. Format 4 is used for 16 bit mappings. Format 6 is used for dense 16 bit
mappings.

Formats 8, 10, and 12 (properly 8.0, 10.0, and 12.0) are used for mixed
16/32-bit and pure 32-bit mappings. This supports text encoded with surrogates
in Unicode 2.0 and later.
'cmap' format 0

Format 0 is suitable for fonts whose character codes and glyph indices are
restricted to a single byte. It is the standard Apple character to glyph index
mapping table.

Table 8: 'cmap' format 0
.....

'cmap' format 2

the best enum for it will be 'kTTCMAPFormat4' for 4, which are not different
from '4'

we probably can do something for platform ID

>This said, what follows below does not look very good. That code could end up 
>being reused on another architecture/the processor could change. 
>Isn't there already a byte reversal macro somewhere that will do nothing if not 
>needed ?
>
>// this is an big endian version, not good for little endian machine
>#undef GET_SHORT
>#define GET_SHORT(p) (*((PRUint16*)p))
>#undef GET_LONG
>#define GET_LONG(p)  (*((PRUint32*)p))

Again, try to sync with Window code. Any suggestion how to make it better? If
not, I will keep it this way. 


Comment 55 Mike Pinkerton (not reading bugmail) 2001-10-23 15:09:11 PDT
> See nsFontMetricsWin.cpp . The code is copy from there. 

well then perhaps we should fix them both, or add comments. Why are you adding
2? What's the significance of 2?

> There are no meaning enums or constants for the format ID

There might not be apple-defined constants, but that doesn't stop you from
creating your own. makes the code much more readable.
Comment 56 Jean-Marc Desperrier 2001-10-23 15:50:48 PDT
>>// this is an big endian version, not good for little endian machine
>>#undef GET_SHORT
>>#define GET_SHORT(p) (*((PRUint16*)p))
>Again, try to sync with Window code. 

x86's are little endian. 
If there's exactly the same in the Window code, then there's a problem in the 
comment (I assume the code works :-) ).

htonl, htons, ntohl, ntohs could do the trick, but they will not be in the same 
include file, depending on the platform.
If the data is indeed big endian (network order is big endian), you'd need  :
#define GET_SHORT(p) ntohs(*((PRUint16*)p))
#define GET_LONGT(p) ntohl(*((PRUint16*)p))

If we were to convert from host order to little endian, these macros are no use 
and something like this 
http://developer.gnome.org/doc/API/glib/glib-byte-order-macros.html
would be useful.

I suggest you go for kTTCMAPFormat4 for the format constants.
I is not different from the value 4, but it does say _what_ this value is.
Comment 57 Jean-Marc Desperrier 2001-10-24 07:20:06 PDT
I checked how this is handled in other Mozilla module.

There's a IS_BIG_ENDIAN and a IS_LITTLE_ENDIAN constant defined.

Suggested code replacement for both nsMacUnicodeFontInfo.cpp and  
nsFontMetricsWin.cpp so they will be synchronized :

#ifdef IS_BIG_ENDIAN 
# undef GET_SHORT
# define GET_SHORT(p) (*((PRUint16*)p))
# undef GET_LONG
# define GET_LONG(p)  (*((PRUint32*)p))
#else
# ifdef IS_LITTLE_ENDIAN 
#  undef GET_SHORT
#  define GET_SHORT(p) (((p)[0] << 8) | (p)[1])
#  undef GET_LONG
#  define GET_LONG(p) (((p)[0] << 24) | ((p)[1] << 16) | ((p)[2] << 8) | (p)[3])
# endif
#endif
Comment 58 Frank Tang 2001-10-24 10:36:33 PDT
ok, we focus on fixing mac now. I will address the review comment for the mac
code now. and push the change of window code into bug. Will fix it after the mac
after that. file bug 106488 for that. will bring the fix in this bug to window
on 106488.
Comment 59 Mike Pinkerton (not reading bugmail) 2001-10-24 10:40:49 PDT
ok, r=pink
Comment 60 Frank Tang 2001-10-24 18:24:15 PDT
Created attachment 54980 [details]
v4 of nsMacUnicodeFontInfo.cpp
Comment 61 Frank Tang 2001-10-24 18:26:19 PDT
pinkerton- I put a new version (v4) of the cpp file here . It include a lot of
comment and also define the enum fo rit. 
Please r= it . Thanks
Comment 62 Mike Pinkerton (not reading bugmail) 2001-10-25 07:06:59 PDT
Comment on attachment 54980 [details]
v4 of nsMacUnicodeFontInfo.cpp

r=pink. more verbose, but that's what i like.
Comment 63 Simon Fraser 2001-10-25 16:31:11 PDT
Comments on attachment 54980 [details]:

NS_IMETHODIMP nsFontCleanupObserver::Observe(nsISupports *aSubject, const 
PRUnichar *aTopic, const PRUnichar *someData)

You need to update your tree. The nsIObserer API changed to take a char* as the 
second parameter.


# endif
#endif
enum {
  eTTPlatformIDUnicode = 0,

Need a blank line here.

#define HEAD FOUR_CHAR_CODE('head')
#define LOCA FOUR_CHAR_CODE('loca')
#define CMAP FOUR_CHAR_CODE('cmap')

Would be nicer in an enum (like mac headers use).

  int isLong = GetIndexToLocFormat(aFont);

GetIndexToLocFormat returns a PRInt32 (which is a 'long' on Mac, not an 'int'). 
Also, the return value of GetIndexToLocFormat seems to be treated like a boolean. 
Why not make it one?

The code in GetSpaces() seems overly longwinded and fragile. Why not allocate a 
PRUint16 or PRUInt32 array, or use stack arrays, or wrap this detail in a class? 
Are the array so long that you have to malloc? And why is GetIndexToLocFormat() 
needed, when all it does is call FMGetFontTable?

static int spacesInitialized = 0;
static PRUint32 spaces[2048];

Static variables like this should have names reflecting that they are globals, 
like "gSpaces", "gSpacesInitialized".

  PRUint16 segCount = s[3] / 2;
  PRUint16* endCode = &s[7];

Still lots of magic numbers here. Why? Comments?

Generally, this code seems pretty involved and messy. Can it be made cleaner with 
some nice C++ classes that handle the different font types etc?
Comment 64 Frank Tang 2001-10-25 17:45:28 PDT
>NS_IMETHODIMP nsFontCleanupObserver::Observe(nsISupports *aSubject, const 
>PRUnichar *aTopic, const PRUnichar *someData)
>
>You need to update your tree. The nsIObserer API changed to take a char* as the 
>second parameter.

Will do in the last minutes. 

># endif
>#endif
>enum {
>  eTTPlatformIDUnicode = 0,
>
>Need a blank line here.

Where ? after #endif ?

>#define HEAD FOUR_CHAR_CODE('head')
>#define LOCA FOUR_CHAR_CODE('loca')
>#define CMAP FOUR_CHAR_CODE('cmap')
>
>Would be nicer in an enum (like mac headers use).

will do
>  int isLong = GetIndexToLocFormat(aFont);
>
>GetIndexToLocFormat returns a PRInt32 (which is a 'long' on Mac, not an 'int'). 
>Also, the return value of GetIndexToLocFormat seems to be treated like a 
>boolean. 
>Why not make it one?
not really, it could return -1 , 0 and 1
I will make it a PRInt32 instead


>The code in GetSpaces() seems overly longwinded and fragile. 
Code copy from nsFontMetricsWin.cpp. Origionally written by erik.

>Why not allocate a PRUint16 or PRUInt32 array, 
According to the TrueType spec, the format could be 16 bits or 32 bits, 
depend the return value of GetIndexToLocFormat . We can duplicate the 
allocation code and make two version and push into the   if (isLong) and else
block. Is that what you want to see?

> or use stack arrays
The size of the table will depend on the font. We cannot predict the max size of 
it.

> or wrap this detail in a class? 
I am not sure that will help.

>Are the array so long that you have to malloc? 
I think so. for those Truetype font on my Window box, the 'loca' table could 
have 76 bytes or 105K bytes. 

>And why is GetIndexToLocFormat() 
>needed, when all it does is call FMGetFontTable?
It call FMGetFontTable to get the 'head' table to find out the the format for
'loca' table. I copy it from the code erik wrote for nsFontMetricsWin.cpp

>static int spacesInitialized = 0;
>static PRUint32 spaces[2048];
>
>Static variables like this should have names reflecting that they are globals, 
>like "gSpaces", "gSpacesInitialized".
Well change.

>  PRUint16 segCount = s[3] / 2;
>  PRUint16* endCode = &s[7];
>Still lots of magic numbers here. Why? Comments?
You beat me. I just copy erik's code from nsFontMetricsWin.cpp
It is all document in the TrueType specification. see 
 // http://www.microsoft.com/typography/otspec/cmap.htm
I mention at the source code already
see the section "Format 4: Segment mapping to delta values "

I don't understand these code at all. I only know erik use it on window to 
handle format 4 and it work. 

>Generally, this code seems pretty involved and messy. 
as I said, it's copy from nsFontMetricsWin.cpp . blame erik, not me :)

>Can it be made cleaner with some nice C++ classes that handle the 
>different font types etc?
different font types? not sure what you mean. Please be more explicit.
Comment 65 Frank Tang 2001-10-26 18:05:29 PDT
sfraser, can you also review 

nsMacUnicodeFontInfo.h : 10/19/01 14:40
http://bugzilla.mozilla.org/attachment.cgi?id=54269&action=view

diff v2: 10/19/01 14:42
http://bugzilla.mozilla.org/attachment.cgi?id=54271&action=view

I will resubmit a v4 of nsMacUnicodeFontInfo.cpp
Comment 66 Frank Tang 2001-10-26 18:06:02 PDT
Comment on attachment 54980 [details]
v4 of nsMacUnicodeFontInfo.cpp

see v5
Comment 67 Frank Tang 2001-10-26 18:07:56 PDT
Created attachment 55346 [details] [diff] [review]
v5 nsMacUnicodeFontInfo.cpp
Comment 68 Frank Tang 2001-10-26 18:11:57 PDT
sfraser, please sr= the v5 patch. Thanks

>NS_IMETHODIMP nsFontCleanupObserver::Observe(nsISupports *aSubject, const 
>PRUnichar *aTopic, const PRUnichar *someData)
>
>You need to update your tree. The nsIObserer API changed to take a char* as 
>the second parameter.

addressed in v5

># endif
>#endif
>enum {
>  eTTPlatformIDUnicode = 0,
>
>Need a blank line here.

adderssed in v5 

>#define HEAD FOUR_CHAR_CODE('head')
>#define LOCA FOUR_CHAR_CODE('loca')
>#define CMAP FOUR_CHAR_CODE('cmap')
>Would be nicer in an enum (like mac headers use).

addressed in v5

>  int isLong = GetIndexToLocFormat(aFont);
>
>GetIndexToLocFormat returns a PRInt32 (which is a 'long' on Mac, not an
> 'int'). 
>Also, the return value of GetIndexToLocFormat seems to be treated like a >boolean. 
>Why not make it one?

change to PRInt8 in v5, as I said earlier, it is NOT a boolean.

>The code in GetSpaces() seems overly longwinded and fragile. Why not 
>allocate a PRUint16 or PRUInt32 array, or use stack arrays, or wrap this
>detail in a class? 
>Are the array so long that you have to malloc? And why is 
>GetIndexToLocFormat() 
>needed, when all it does is call FMGetFontTable?

see previous comment.


>  PRUint16 segCount = s[3] / 2;
>  PRUint16* endCode = &s[7];
>
>Still lots of magic numbers here. Why? Comments?
>
>Generally, this code seems pretty involved and messy. Can it be made cleaner >with 
>some nice C++ classes that handle the different font types etc?

see previous coment
Comment 69 Frank Tang 2001-10-29 10:08:04 PST
>static int spacesInitialized = 0;
>static PRUint32 spaces[2048];
>
>Static variables like this should have names reflecting that they are globals, 
>like "gSpaces", "gSpacesInitialized".
sorry, forget to change. Will do it next turn if you have other thing want me to
address.

Comment 70 Simon Fraser 2001-10-29 16:43:06 PST
sr=sfraser
Comment 71 Frank Tang 2001-10-30 17:37:06 PST
fixed and check in
Comment 72 Yuying Long 2001-11-15 16:56:01 PST
By looking attached in URL page:
http://www.methymna.com/unicode/texts/sappho31.html

I still see quite a few "?" displayed on Mac OS X (Mac9.1, WinME also) / 11-14
trunk build, while 11-14 linux trunk build and WinXP display fine.

Is that page good for test this feature?  Thanks! 
Comment 73 Jean-Marc Desperrier 2001-11-16 03:29:55 PST
I think that Windows XP has by default the arial unicode font which has 
almost all the unicode 3.0 characters.
The default fonts of Mac OS X apparently miss the ancient greek characters, but 
just installing any font that has them, including arial unicode, would solve 
that.
A friend of mine has already tested with success support for this range of 
characters.
If you can test if adding TTF font with support for this characters would work 
under Mac OS 9.1, that would be interesting.

Frank Tang was targetting Mac OS X only , but according to the Apple 
documentation the function that he uses in the patch should be available in 
Carbon under Mac OS 9.1. I don't know if the Mac OS 9.1 build has this patch 
enabled and if it could be compiled with the correct Carbon include to use this 
functions.

For reference :
http://developer.apple.com/techpubs/macosx/Carbon/text/FontManager/Font_Manager/
Functions/Accessing_Font_Objects.html

Mac OS X header: ApplicationServices/ApplicationServices.h 
Mac OS 9 header: Fonts.h

Availability
Supported in Carbon. Available in Carbon 1.0.2 and later when running Mac OS 9 
or later.
Comment 74 PTRourke 2001-11-16 05:38:24 PST
Ylong, please see above comments with regard to fonts.  The question marks are 
probably the Extended Greek range Unicode characters; the page reads fine, 
except for one glitch (the letter pi is displayed with the wrong glyph, 
different glyph in each font, bug 109461), on my computer iBook with the Cardo 
and Athena fonts installed in ~/Library/Fonts/ .
Comment 75 PTRourke 2001-11-16 05:55:18 PST
I have tested the OS9 builds; as of a week ago, they did not have this support 
enabled.  

Note You need to log in before you can comment on or make changes to this bug.