Closed Bug 176290 Opened 22 years ago Closed 21 years ago

enabling custom font encoding support in Xft-enabled Mozilla (mathml on xft)

Categories

(Core :: Internationalization, defect)

x86
Linux
defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla1.4final

People

(Reporter: jshin1987, Assigned: blizzard)

References

(Blocks 1 open bug, )

Details

(Keywords: fixed1.4.2, intl)

Attachments

(8 files, 30 obsolete files)

130.66 KB, image/jpeg
Details
9.16 KB, text/html
Details
2.03 KB, text/plain
Details
68.63 KB, patch
Details | Diff | Splinter Review
68.83 KB, patch
blizzard
: review+
blizzard
: approval1.4.1+
Details | Diff | Splinter Review
3.15 KB, patch
Details | Diff | Splinter Review
561 bytes, patch
blizzard
: review+
netscape
: superreview+
Details | Diff | Splinter Review
539 bytes, patch
Details | Diff | Splinter Review
This is copied from bug 126919. 

There are some 'dumb' TTFs with hack-encoding which can be used
to dramatically improve rendering of pre-1933 orthography Korean text. 
Those fonts have all the necessary glyphs, but either their encoding is
non-standard (atlhgouth advertised as Unicode) or  or they use
PUA code points. In both cases, a sequence of characters (Unicode)
has to be converted to a sequence of non-standard code points. 

For X11 BDF fonts, something similar has been done for 'johab?-1'  X11
fonts. With X11 BDF fonts, it's relatively easy to do because all 
that need to be done is to register 'font-encoding' (johab?-1) 
and to add a Unicode converter.  Now, I'm wondering how to do 
the same thing with Xft-enabled Mozilla. This font/encoding
has to be font-specific and Mozilla should use different
'converters' based-on 'family-name' of a font in use. 
(see Gnome bug 95708 at http://bugzilla.gnome.org/show_bug.cgi?id=95708
and my patch there)  

i------------------quote from bug 126919 -----------------
My question is whether it's possible to provide a 'dumb' TTF 
with some intelligence. What do I mean by this? It's kinda like 'font-hack'
for X11 core fonts with (non-)standard/custom encoding. (see bug 126919 comment
#112).

There are some Korean TTFs included in Old Korean tools from Microsoft
for pre-1933 orthography text. They're just
plain TTFs with no intelligence in the sense that they don't have any
opentype tables (gsub, gpos, etc) for the proper rendering of Hangul Jamos
(U+1100). However, they sorta have more or less all the necessary glyphs 
in PUA (and some bad ones have glyphs assigned code space of BMP).
What's missing is a *font-specific* mapping from a sequence of Unicode 
code points to a sequence of glyphs in PUA. I added these mappings
to Yudit(http://www.yudit.org) and Lambda(Unicode-enabled LaTeX) and
have been working on adding them to Pango
1.1(http://bugzilla.gnome.org/show_bug.cgi?id=95708). 

With X11 core fonts,
it's relatively easy to support this custom-encoding and it can be done
pretty quickly (as is done for 'johab*-1' fonts mentioned in comment #112).
Now, I'm wondering whether it's doable with Xft and how hard it would be and
where I have to look if possible. 

The reason I thought this might be of interest to others is:  
until OTF support is in place and working OTFs are available,
the proper support of Indic script is hard to come by. However,
recently a set of free fonts for various Indic scripts were released
under GPL. Unfortunately, they're not OTFs but plain TTFs with
custome-encoding. Still they have necessary glyphs for rendering
of the target script, but without mapping from a seq. of
Unicode char. to a seq. of glyphs, they're useless. The situation seems to be
similar to Korean situation aforementioned. 

Any idea or comment would be appreciated. 
 


------- Additional Comment #316 From Jungshik Shin 2002-10-16 06:41 -------

comment bug 126912 comment #87 by Owen gave a generic answer (ans. to q. 4, for
example) 
and it reminded me that Mathml is another area in which some sort of 'font-specific'
hack is needed (at least for now).
 

> If fonts are encoded using some sort of standard,it should just be
> a layout level issue. If it involves recognizing specific fonts, and
> knowing how they are encoded, it will be a little bit difficult to
> fit that sort of hackery on top o fthe fontconifg framework, 

> though Mozilla 
> could certainly just ask for those fonts specifically.

  Suppose a specific font with custom encoding (i.e. the mapping
from a seq. of Unicode characters to a seq. of glyphs in the font
is not trivial one-to-one, but the font is dumb and is not an
OTF but a plain TTF) is explicitly specified via CSS font-family 
in a html document. fontconfig would just hand in that font to
Mozilla.. then, taking care of this font-specific mapping/encoding
is  the responsibility of what you're refering to as layout... 
I've gotta figure out how it's done with hopefully some help ...
--------------quote-end-----------
adding some people to CC. 
Summary: enabling font-spefic 'hacks' (for hack-encoded fonts) in Xft-enabled Mozilla → enabling font-spefic 'hacks' (for hack-encoded fonts) in Xft-enabled Mozilla
->blizzard@mozilla.org
Assignee: yokoyama → blizzard
Blocks: xft_tracking
Status: UNCONFIRMED → NEW
Ever confirmed: true
Related to this bug is bug 128153, which is in turn related to 
bug 127063 and bug 35236. Probably, bug 127063 should give
some good hints. Rendering MathML requires 'font-specific
hacks' and the same is true here.
Keywords: intl
QA Contact: ruixu → ylong
Adding ftang to Cc because he appears to be interested in rendering Indic
scripts with
truetype fonts in non-Unicode/non-standard (from the point of view of TTF standard)
encodings according to his messag to Unicode list.

BTW, I filed bug 176315 for converters to use for Old Korean text rendering when
this bug is solved. 

FYI, 35 GPLed Indic truetype fonts with non-Unicode/non-standard Cmap are available
at http://www.akruti.com/freedom  
GNU India(?) reportedly plans to convert
them to OTFs eventually, but it'll take a while. In the meantime,  I think users
of Indic scripts
will benefit a lot if this bug can be resolved. 

This must be  obvious to all here. Anyway, for the record...
Something similar to what's done for bug 155703 (MathML in
MacOS) and what's done in nsFontMetricsWin.cpp for MathML and custom-encoded 
fonts has to be done for Xft. 
Just wanna put this somewhere safer than my disk.
Attached patch a patch (obsolete) — Splinter Review
This one is actually working although it needs some (if not a lot of)
polishing (code clean-up, hashing fontface - encoding pairs, etc).
Glyph positioning also needs some more work.

It uses fontEncoding.properties file in the same format
as used by Mozilla under MS Windows and MacOS for MathML 
and other custom-encoded fonts (see bug 176315 and bug 177877 for 
Korean case).

Anyway, with this, I believe Mozilla-Xft will be able to
support rendering Indic scripts with custom-encoded fonts
(for which we need a converter. If there's a TTF with
Sun-Indic encoding, we might be able to test it now....)
as well as Korean. In case of MathML, for sure more work
is necessary, but IMHO, the foundation is laid.   

Chris, could you take a look?
Attached file a sample fontEncoding.properties file (obsolete) —
Mozilla-Xft's (with a couple of problems in attachment 105283 [details] [diff] [review]  fixed)
rendering of http://jshin.net/i18n/korean/hunmin-ng.html
A comparable screenshot is attachment 105012 [details] (for bug 177877) taken with
Mozilla under MS Windows.
Attachment #105188 - Attachment is obsolete: true
Do you have some good test pages for testing this?  Better yet, can you upload
some as attachments to this bug?
Yes, I have several :-)
http://jshin.net/i18n/korean/hunmin-ng.html  
http://jshin.net/i18n/korean/hunmin-og.html
http://jshin.net/i18n/korean/ogtest.html
http://jshin.net/i18n/korean/ngtest.html

You have to install Ogulim and Ngulim fonts (the instruction is given
in bug 176315). They're just TTF's with custom encoding so that
once you download them, you can just install them in an usual way.

To see them in action,
you also have to apply my patches for bug 176315(there's one minor
change I've made to one of new files in that bug since. I'll
update that file after writing this.) as well as my
latest patch to this bug (which I'm gonna upload soon). The screenshot
uploade last night was taken after I made a ccouploe of crucial changes in 
attachment 105283 [details] [diff] [review]. 

Alternatively, you can install Mathematica Truetype and TeX computer
modern *truetype*(as opposed to T1) fonts (http://www.mozilla.org/projects/mathml).
In that case, you have to replace 'math[1-5]' in  fontEncoding.properties file
with 'mathematica[1-5]' . This is required because family names
exported by fontconfig are 'mathematica[1-5]' instead of 'math[1-5]'.
I don't know how this works under Windows because I suspect 
MS Windows would do the same. If you like, instead of
modifying fontEncoding.properties file,  you can alias
'math[1-5]' to 'mathematica[1-5]' in fonts.conf file of fontconfig.
Fonts like 'Math1Bold' still don't work, though with either of 
two methods. (Can I alias 'Math1Bold' to 'Mathematica1:Bold' in
fonts.conf?) 

 Then you can
put up a page with math symbols in Unicode code points
(instead of font-specific code points) to see if the  conversion from Unicode
code points
for math symbols to font-specific code points work. Well, I'll make one and
attach here soon. 
Attached patch a new patch (obsolete) — Splinter Review
I planned to do some clean-up and implement hashing(if necessary), 
but let's put it off for a while to let Chris play with it.
Attachment #105283 - Attachment is obsolete: true
I'll look at this very soon since I'd like to have something in place that does
this kind of thing.

I'm not hot on the idea of subclassing nsFontXft - there might be a cleaner way
to do it.  If I come up with something better I'll try to implement it.
Attached file fontEncoding.properties (obsolete) —
If you leave fonts.conf alone, you have to put this in res/fonts. 
'math' in 'math[1-5]' is replaced by 'mathematica'.
This is a very simple test file I _manually_
made out of cmr10.uf file. Later, I might put up 
a set of html files testing all custom fonts
used in MathML (no promise)

To make this work, a new patch I'm gonna upload
has to be applied. The last patch doesn't handle
'narrow' fonts in nsFontXftCustom::FillSpecBuffer.
This is just to show that it works with 'narrow' fonts
as well as wide ones. I forgot to take care of 'narrow'
fonts in my prev. patch. 

Chris, 
As for subclassing nsFontXft, it'd be great if you
come up with a better solution.
Attachment #105284 - Attachment is obsolete: true
Attachment #105330 - Attachment is obsolete: true
Attached file Mathematica1 font test page (obsolete) —
more interesting than CM Roman.
Some chars. are rendered as unknown glyphs or hollow boxes.
I have to figure out what's going on...
Attachment #105345 - Attachment is obsolete: true
This should be rendered like attachment 105289 [details].
(or screenshots at bug 176315 and bug 177877)
Attached file Computer Modern Roman (cmr10) (obsolete) —
Sorry for spamming...
Having too many fonts (with ext. coverage) makes testing this
rather hard for MathML fonts. In a new test page
for CM Roman, I added a column with font-family
set to 'serif' for comparison. The result is not what I expected.
Perhaps, gdb is better for making sure this is working...
Attachment #105350 - Attachment is obsolete: true
Comment on attachment 105357 [details]
Computer Modern Roman (cmr10)

I'm sorry I  forgot about uf file format (now I refreshed my  memory) and
produced totally useless test pages for MathML fonts. Please, ignore all
MathML font testing cases.
Attachment #105357 - Attachment is obsolete: true
> In that case, you have to replace 'math[1-5]' in  fontEncoding.properties file
> with 'mathematica[1-5]' . This is required because family names
> exported by fontconfig are 'mathematica[1-5]' instead of 'math[1-5]'.
> I don't know how this works under Windows because I suspect 
> MS Windows would do the same.

You need to install the *Mathematica 4.1 fonts*... Otherwise the names and the
.uf files don't match up. Wolfram renamed their 4.2 fonts and changed their
encoding :-(

For testcases, see http://www.mozilla.org/projects/mathml/fonts/encoding/
The font tables are given from the links on the right sidebar. For example, the
character map that is displayed on the fly by Mozilla should match,
glyphIndex-by-glyphIndex, with the screenshot of the font's own character map:
http://www.mozilla.org/projects/mathml/fonts/encoding/math1-encoding.html
http://www.mozilla.org/projects/mathml/fonts/encoding/math1.gif
Thank you for info. on MathML visual encoding pages and Mathematica
fonts. Before your comment arrived in my mailbox, without
knowing about visual encoding pages (I should have looked
around MathML pages), I ended up reinventing something 
similar(much less pretty and less extensive) by hacking nsconv.cpp.
My test pages do things from the opposite direction 
(from Unicode to font-encoding of MathML fonts). Anyway,
attachment 105347 [details] [diff] [review] works fine with the following change
except for characters in US-ASCII whose 'glyph codepoint'
in math fonts are different from US-ASCII/Unicode codepoints.
(see below the change necessary for attachment 105347 [details] [diff] [review])



There's a problem with sign extension in attachment 105347 [details] [diff] [review]. 

+    if (! mIsWide) {  // 'narrow' fonts as used by MathML
+        char* str = buf;
+        char* pstr = str;
+        while (pstr < str + destLength) {
+            (*aSpecBuffer)[*aSpecBufferLen].font = mXftFont;
+            (*aSpecBuffer)[*aSpecBufferLen].ucs4 = *pstr;
+
+            nscoord x = aX + *aXoffset;

The following line 
+            (*aSpecBuffer)[*aSpecBufferLen].ucs4 = *pstr;

has to be

+            (*aSpecBuffer)[*aSpecBufferLen].ucs4 = PRUint8(*pstr);


Back to rendering problem with chars. in US-ASCII....
For instance, in math1 font, U+0022 is at 0x8d instead of
0x22. Because DrawString(char *,...) is not modified
to take care of custom-encoded fonts(only
DrawString(PRUnichar *,...) was modified),
U+0022 is drawn with the glyph at 0x22 instead of
the glyph at 0x8d. DrawString(char *,...) can be modified
similarly to the way DrawString(PRUnichar *...) is modified,
but it's optimized for rendering US-AsCII chars.(< U+0080)
so that we'd better not touch it unless MathML needs this change.
(I don't think there's any font other than math fonts that
has non-standard/custom encoding for characters below U+0080
and I guess we can assume that nobody would use fonts like
Computer Modern to render US-ASCII characters except
for MathML visual encoding test pages).


Therefore, my question to rbs is if MathML layout engine
ever calls nsTextFrame::PaintAsciiText() or (in an unlikely
case) directly nsRenderingContextGTK::DrawString(char *). 
(they're where nsFontMetricsXft::DrawString(char *) comes
from). If not (i.e. nsFontMetrics...:DrawString(char *)
or nsTextFrame::PaintAsciiText() is never used by 
MathML layout engine), I think we
can just leave nsFontMetricsXft::DrawString(char *) as it
is now.

Another related issue has to do with the way 'em width' and various
other metrics are measured. It's assumed that
 every font has US-ASCII characters
(like 'a','W', 'M') at their standard codepoints. This
assumption breaks down for math fonts. Again 'the custom
encoding' might be taken into account, but it would slow
down things and I'm wondering if we have to. 
 
> Therefore, my question to rbs is if MathML layout engine
> ever calls nsTextFrame::PaintAsciiText() or (in an unlikely
> case) directly nsRenderingContextGTK::DrawString(char *). 

This is bug 73539. Don't worry about that for now. (The solution is that GFX
should first check that the current font that is advertised as ASCII has all the
ASCII characters. If not, it should kick off the font-swithing in the usual way.)

> Another related issue has to do with the way 'em width' and various
> other metrics are measured. It's assumed that
>  every font has US-ASCII characters
> (like 'a','W', 'M') at their standard codepoints. This
> assumption breaks down for math fonts.

Don't worry. There is no problem. The CSS metrics are always fetched from an
ASCII font. See RealizeFont() and CacheFontMetrics(). Once |mWesternFont =
FindFont('a')| is called, the CSS metrics are taken from there). So, for
example, suppose you have <span style="font-family:Symbol"> ...a symbol text...
</span>.
GFX will hunt for the first ASCII font that it can find in your system,
forgetting your 'Symbol' font. If the font-family list happens to include a font
that contains the ASCII set, then fine, that's the one used since FindFont('a')
will find it. If not, your symbolic fonts are forgotten and the CSS metrics
('em', 'ex', ...) may be picked from any other font on your system with the
ASCII set. This guarantees that the CSS metrics are always sensible.

(But when working on bug 155703, I recently noted that the Mac is lagging behind
and doesn't work like that. Unfortunately, the Mac doesn't receive as much
attention, scrutiny and iterations as in the other platforms.)
Thank you for your detailed reply. 

> GFX will hunt for the first ASCII font that it can find in your system,
> forgetting your 'Symbol' font. 

  My cursory look didn't lead me to think that GFX 'forgets' about 'symbol' font
when hunting for mWestern. Anyway, now I have one 
fewer to worry about thanks to your assurance.

I've been wrestling with mathematica fonts and Symbol (included in
MS Windows) for about two hours because
with my patch applied, Computer Modern and MathExtra work just as
expected. However, Mathematica fonts have trouble with any 
chars positioned above 0x80 ('glyph indices'). Mozilla's converter
does its job in nsFontXftCustom::FillSpecBuffer well. That is,
U+00A9 is replaced by 0xd3 for Symbol font. However, U+00A9 is
drawn not with glyph at 0xd3(as shown by pfaedit) but with glyph
at 0xee.  I couldn't help wondering what's going on. 
Then, I realized that mathematica
fonts and Symbol font may not have Unicode Cmap. That turned out to
be the case. Mathematica fonts have Mac Roman cmap (platform ID = 1,
encoding id=0). As for Symbol font, it has both Mac Roman cmap
and Unicode cmap(Pid=3, eid=1), but somehow MacRoman Cmap
is given a higher priority (perhaps, Xft/fontconfig uses
the first Cmap found...)  

Anyway, here's what I found. When a glyph for U+00A9 is requested,
it's  replaced by 0xD3 by nsUnicodeToSymbol converter. Then, this
is fed into Xft which treats 0xD3 as a Unicode code point and
map it to the glyph index of Symbol font using MacRoman Cmap. 
U+00D3 in MacRoman is 0xee and U+00A9 is drawn with the glyph
at 0xee in Symbol font. The same thing happens to other characters.
In case of U+00AC, the path is 0xd8 -> 0xAF (a glyph for U+2193 
in Symbol font). 

Xft is doing its job as it's supposed to do and we can't blame it.
There are two solutions to this problem. One is fix Mathematica
and Symbol fonts, but they're not in our hands so that probably
we can't do anything about them. The other is to introduce
a new set of converters for math[1-5] and symbol fonts.
For this option, we have another branch points. We can 
assign separate set of internal encoding names (for instance, x-math-xft-[1-5])
to a new set of converters. Alternatively,
use different converstion tables for identical internal
encoding names (x-mathematica-[1-5], and Adobe-Symbol-Encoding)
depending on Xft option is turned on or not. Because
turning Xft means not using X11core at all, this doesn't present
any problem. It also avoids increasing the size of shared library
for ucvmath module.   Therefore, I'd propose going with
the second option. This should be easily done by just generating 
6 *uf files and inserting #ifdef's in 6 files in ucvmath directory.

I'll file a new bug for this. I'll just do that if rbs agrees with
me on this issue.   

The most sensible solution to me seems to instead use other APIs of Xft that
would accept the indices as is. On Windows, that's why we use the 'A' functions
as you
saw in bug 177877.
I agree with rbs -- if you can't use real Unicode values, and must use
font-specific encoding information, you should be using the glyph APIs to Xft
instead of the Unicode ones.  The Unicode API is a simple wrapper above the
glyph index API, so there's no performance reason for attempting to take the
longer path.

Of course, I'd prefer to (somehow) use Unicode throughout, but that doesn't look
doable right now, given the state of the available math fonts.
I agree, too.  For these fonts in particular we should just use the glyph APIs
since we can't use pure-unicode APIs.
Thank you for your comments.

Perhaps, I'm on the losing side :-), but let me make my case while the jury
is still out.

I'm afraid using XftDrawDrawGlyphFontSpec instead of XftDrawCharFontSpec
only for 'custom-encoded fonts' (actually *only* Mathematica and Symbol
fonts fall to the category) would introduce yet another branch point
in nsFontMetricsXft and make it more complicated (than otherwise).
It has the following comment : ((somehow, I like this quite a lot...)

     // We use XftDrawCharFontSpec to do our drawing here so that even
    // though we might have many different fonts at many different
    // locations due to spacing, we can do it all in one shot.  This
    // is much faster from a protocol and code standpoint.

And down the line, it has :

    // Go forth and blit!
    XftDrawCharFontSpec(draw, &color, specBuffer, specBufferLen);


DrawString() fills SpecBuffer with multiple text segments to be
rendered with different fonts among which there are fonts requiring
custom-encoding and direct gindex access. Using XftDrawDrawGlyphFontSpec
for fonts requiring direct gindex access means breaking up
a run of text into multiple segments and invoking
XftDrawGlyphFontSpec or  XftDrawCharFontSpec separately depending
on fonts required for rendering them.  If it's regarded as the
most sensible, that should be done(DrawString has to be
 rewritten the way it's written in Windows/X11core). However,
IMHO, it's not necessary.

  Just *inverting* (in advance) what Xft/FT2 would do (converting
what's regarded as UCS4 to MacRoman encoding) for fonts
that advertise themselves as having MacRoman Cmap when they do NOT
involves a simple change (although it took me some
eye-blearing hours to come up with the correct mapping table for that.
Mozilla's MacRoman encoder/decoder are not up to date while FT2/Xft
apparently use the newest MacRoman converter and all MathML
fonts mysteriously disappeared from the list of fonts offered
by fontconfig getting me confused a lot.) in
6 files (nsUnicodeToMathematica[1-5].cpp and nsUnicodeToSymbol.cpp),
adding 6 new uf files to the source tree and adding '.wide' suffix
to entries for mathematica fonts and symbol fonts in fontEncoding.properties
file (assuming my patch here is used at least
in terms of identifying 'wide' fonts). Because Mozilla-X11core
doesn't use fontEncoding file at all and new uf files are conditionally
compiled in only for Mozilla-Xft, I think the impact would
be minimal.

The only problem I see here is that when going from
genuine Unicode codepoints to glyph indices, two consecutive
steps are the inversion of each other leading to no-op. in
net. However, there isn't much extra cost involved. Because
the first step (genuine Unicode code points to x-mathematica-*/symbol
charset) has to be done in any case. Currently, Unicode -> x-matehmatica/symbol
converters are single byte converters, but
for Xft *only*, they have to be double byte converters (because
now we have to map Unicode characters to Pseudo Unicode code
points in x-mathematica charset and then convert
them 'back to Unicode codepoints' pretending that they're
MacRoman code points some of which correspond to Unicode characters
above 0xff. Of course, all these steps are done in generating
uf files and at run-time the conversion is still a single
step.). They're still small and compact and I think little
harm is done to make them double byte converters for Mozilla-Xft.
In addition, this may sound like a bit too much hackery, but does it
really? 

      
By directly using glyph indices, we can save sometimes
at the Xft/FT2 layer because Xft/FT2 doesn't have to worry
about char to glyph mapping. However, wouldn't it be offsetted
by additional complexities in DrawString (using
 XftDrawCharFontSpec and XftDrawGlyphFontSpec depending on
the kind of fonts)? 

Whichever way may be eventually taken as the best,  I'm going
to upload 6 uf files along with patches to nsUnicodeToXXX.cpp files
and a revised fontEncoding.properties file for you to take a look
at and play with if you want to. 

Attachment #105347 - Attachment is obsolete: true
You can always use XftDrawGlyphFontSpec and eliminate the XftDrawCharFontSpec
path.  XftDrawCharFontSpec is just a wrapper on top of XftDrawGLyphFontSpec
anyways.  One extra line of code is needed to convert the character into a glyph
index.
@see XftDrawCharFontSpec() at 
http://cvsweb.xfree86.org/cvsweb/~checkout~/xc/lib/Xft/xftdraw.c

<revision>
Re: comment 25
> > GFX will hunt for the first ASCII font that it can find in your system,
> > forgetting your 'Symbol' font. 
>
>  My cursory look didn't lead me to think that GFX 'forgets' about 'symbol' font
> when hunting for mWestern. Anyway, now I have one 
> fewer to worry about thanks to your assurance.

I re-read the code and your are right indeed. In fact I saw 2 or 3 remaining 
core issues with the current code which might need revisting.
1) It only looks inside the local font-family list. On the other GFX ports, if
your CSS has "font-family: Symbol, serif", it will boil down to something like:
  FindFont(c) means:
    FindLocalFont(c); // i.e., lookup the element's own font-family list: Symbol
    if (!found) FindGenericFont(c); // lookup fonts that are of 'serif' type
        if (found) FindGlobalFont(c); // lookup other global fonts in the system
(then if might fallback to a substitute font if everything fails)

The current code seems to be only doing the equivalent of FindLocalFont(c).

2) Another problem with the current code is that it isn't _lazy_. So if your CSS
has "font-family: Verdana, Arial, serif" as folks often do on the web, and you
want to render 'a', then the code will load _both_ Verdana and Arial, even
though the letter 'a' is in the first font and so loading all subsequent fonts
is wasteful (especially with embedding in mind). If done lazily, the code would
be optimized by the fact that fonts would be loaded on demand, if/when
font-switching becomes necessary -- something rare in the common case (Western
documents). From what I could tell from  bug 155703 the Mac has this problem too.

So further iterations might be needed at some point in these areas.
</revision>
Thanks a lot. I was too much into that gidx has to be used *only*
for fonts with broken cmap (mathematica and symbol) to come up with
an obvious idea that we can  eliminate  'char-drawing' path entirely,
I'll make a new patch after booting back to Linux after working a while
on Mozilla-Windows.
Blocks: 128153
> The most sensible solution to me seems to instead use other APIs of Xft 
> that would accept the indices as is. On Windows, that's why we use the 
> 'A' functions

  When 'Xft*DrawChar*' is used, there are two conversions involved.
One is the conversion from UCS(pseudo or genuine) to the encoding
of the Cmap that a font claim to have(let's call it 'f1'). In Math[1-5] font, it's
MacRoman. The second one('f2') is MacRoman code points to glyph indices.


  On Windows, using 'A' function means ignoring what kind of Cmap
a font claims to have and just feeding 'indices' to 'char to glyph
index' converter. That is, even though Math[1-5] fonts claim to
have MacRoman Cmap, 'A' function does NOT convert pseudo-unicode
code points from Mozilla to MacRoman code point. It just 
hands over 'pseudo-Unicode' code points to 'char to glyph index
converter' . In other words, it bypasses 'f1' and just invokes 'f2'.

  In Xft, using 'Xft...DrawGlyph..' means bypassing both 
'f1' and 'f2'. Obviously, that doesn't work. Is there an Xft API
that bypasses only 'f1' but still does 'f2'?  

  Otherwise, I'm afraid we'd have to resort to what I suggested.

  My proposed solution is to invert 'f1' in advance (not at run-time
but when 'uf' files are generated) so that even though
'Xft..DrawChar..' (or XftCharIndex() ) does both 'f1' and 'f2',
the effect of 'f1' is cancelled out and only 'f2' gets effective.
>Is there an Xft API that bypasses only 'f1' but still does 'f2'? 

 More specifically, I'm looking for a version of XftCharIndex
that bypasses 'f1' but still does 'f2'.  
Most TrueType fonts provide multiple charmap tables.  Fontconfig unifies them to
map as much of the font as possible from Unicode. I have many fonts which
provide an  Apple-Roman table as well as a Unicode table; the Unicode table
providing no entries for characters found in the Apple-Roman table.

If you want to use a specific mapping, you can use the FreeType APIs directly
(FT_Select_Charmap and FT_Get_Char_Index).  As Fontconfig also uses these APIs,
you'll have to take care to ensure the desired mapping is still valid across
calls into Xft or Fontconfig.
Thank you.  It now works well. I also had to  use 
XftGlyphExtents instead of XftTextExtents8 for math fonts.
Besides, nsFontXft is now an abstract class inherited by
nsFontXft(Unicode|Custom|CustoWide). 
Do I have to cache 'fontinfo'(fonttype, cmap,etc)? Here, they're not as heavily
used as in nsFontMetricsWin
Attachment #105628 - Attachment is obsolete: true
Attachment #105629 - Attachment is obsolete: true
Attachment #105630 - Attachment is obsolete: true
Attachment #105837 - Attachment is obsolete: true
I'm sorry for spamming last Tuesday by pressing the submit button twice.
Attachment 105969 [details] [diff]
and attachment 105970 [details] [diff] [review] are identical.
Anyway, Chris, keith, and rbs, could you review my latest patch? 

BTW, I came across bug 94319 (while making a fresh patch for bug 177877) and then
bug 179946. I have an idea to fix bug 179946 for Xft, but I guess we can deal
with it
after this bug is taken care of. 
Attached patch a new patch (obsolete) — Splinter Review
I made a new patch against the current CVS. There are also a couple of minor
changes. 

Chris, could we move this forward?  Pls, let me know and work on/change if
there's anything
you don't like in the patch.
Attachment #105969 - Attachment is obsolete: true
Attachment #105970 - Attachment is obsolete: true
Yeah, I haven't fogotten about this.  Sorry, I've been working on gtk2 bugs. 
I'm really interested in seeing something like this go into the tree.
Glad that you didn't forget this bug :-). I'm really eager to see this added to
Mozilla. 
So, when you have a little bit of time, can you review my patch and suggest what 
you want me to do differently ?  Thank you.
Yeah, I might go ahead and check in small parts of it as well.  Like using
glyphs instead of the character bits.

I'm not sure if I want to use seperate subclassed font classes either.  We'll
see.  I need to look at the problems in depth.  Hopefully I can do this soon.  I
have a few more crashes to fix first before I go adding new features.
Yeah... certainly, it's better to understand the problem in depth than to rush
in sth. 
Just as a data point, I borrowed the idea of mapping different 'types' of fonts
to separate subclasses from Windows code because it seemed pretty natural to me. 
Mind iterating on your patch in the meantime? E.g., aggressively recast the
system in terms of glyph indices across the board to avoid the confusion of this
|FILLSPECBUFFER|. Also, |mCharset| seems restricted since certain fonts (e.g.,
Arial) do embed several subfonts themselves. Hence using mCMap for all fonts
seems richer and more reliable. In other words, the more you make it ressemble
the Windows code which has endured the test/refinement of time, the better.
(No need to attach all the patches of your experimentations. Only those that you
like the best :-)
Thanks for your suggestion.
I'd love to, but I'm afraid I can't because
I'm away from my place and won't be back until late January
(although I'll try to come up with a way to make a build-env.
while being away). 
Blocks: 179946
Attached patch a new patch (obsolete) — Splinter Review
I combined two subclasses of nsFontXft for custom-narrow and custom-wide
fonts into one |nsFontXftCustom| (as was done while refinining
my patch for bug 177877) and made FILLSPECBUFFER macro much less
scary than before. I also introduced |nsAutoCharBuffer| for
automatic alloc/dealloc. It's not so helpful as it is, but
will help more when |GetBoundingMetrics| is implemented for MathML.
It'd be even more so if I could use template as I did
in bug 177877, but I guess I'm not allowed to for gtk builds
to support multitude of compilers.
Attachment #108480 - Attachment is obsolete: true
Ooops. I didn't notice that Chris had fixed bug 182877. attachment 116173 [details] [diff] [review] was
against the cvs before
that patch was committed. Moreover, it had a flaw in |FillSpecBuffer| because by
a slip
of fingers, I uploaded the second newest version (instead of the newest version)
among my patches.
Anyway, I'll reconsile my patch with the current cvs.
 
Thanks for the updated patch.  I've finally got some time this week to spend on
it and I'm going to try to do some testing and hacking on it.
Glad that you'll have some time to work on it.

Here's my latest patch against the current CVS.
Reconsiling my patch with the patch for non-BMP support turned out
to have a subtle point (causing off-by-one error)
which doesn't show up unless one tries to view a page with a mix 
of BMP and non-BMP characters in *multiple* planes. For instance, 
pages like http://www.alanwood.net/unicode/old_italic.html
isn't likely to  reveal the problem.
I used my test page(not so complex but still a bit more complex than
a simple char. table)  with a mix of chars.
drawn from plane 1, 2, 3 and above at http://jshin.net/i18n/utftest
for testing.

BTW, there's still a mystery of which the cause I can't figure out
although it might have to do with CCMAP_HAS_CHAR and plane 2 and higher
(there's a related bug for Mozilla-FT2 : bug 195154).
Attachment #116173 - Attachment is obsolete: true
This code makes me think that I want to break out converting the incoming
unichar string into a ucs4 array before figuring out what glyphs to use on those
character codes.  I'm not sure what the cost of doing a two-pass system is, I'll
have to do a little measurement.  Some of that code is just getting waaaay to
complicated and fragile.
Comment on attachment 116195 [details] [diff] [review]
a patch against the current cvs with Non-BMP support

>Index: gfx/src/gtk/Makefile.in
>===================================================================
>RCS file: /cvsroot/mozilla/gfx/src/gtk/Makefile.in,v
>retrieving revision 1.97
>diff -u -r1.97 Makefile.in
>--- gfx/src/gtk/Makefile.in	28 Dec 2002 01:13:36 -0000	1.97
>+++ gfx/src/gtk/Makefile.in	4 Mar 2003 15:45:09 -0000
>@@ -137,6 +137,12 @@
> 
> include $(topsrcdir)/config/rules.mk
> 
>+libs:: fontEncoding.properties
>+	$(INSTALL) $^ $(DIST)/bin/res/fonts
>+
>+install:: fontEncoding.properties
>+	$(INSTALL) $^ $(DESTDIR)$(mozappdir)/res/fonts
>+
> ifdef MOZ_ENABLE_XINERAMA
> GFX_XINERAMA_LIBS += $(MOZ_XINERAMA_LIBS)
> endif
>Index: gfx/src/gtk/nsFontMetricsXft.cpp
>===================================================================
>RCS file: /cvsroot/mozilla/gfx/src/gtk/nsFontMetricsXft.cpp,v
>retrieving revision 1.11
>diff -u -r1.11 nsFontMetricsXft.cpp
>--- gfx/src/gtk/nsFontMetricsXft.cpp	3 Mar 2003 15:26:57 -0000	1.11
>+++ gfx/src/gtk/nsFontMetricsXft.cpp	4 Mar 2003 15:45:13 -0000
>@@ -25,6 +25,7 @@
>  *   Roland Mainz <roland.mainz@informatik.med.uni-giessen.de>
>  *   Brian Stell <bstell@ix.netcom.com>
>  *   Morten Nilsen <morten@nilsen.com>
>+ *   Jungshik Shin <jshin@mailaps.org>
>  *
>  * Alternatively, the contents of this file may be used under the terms of
>  * either the GNU General Public License Version 2 or later (the "GPL"), or
>@@ -54,37 +55,62 @@
> #include "nsDeviceContextGTK.h"
> #include "nsReadableUtils.h"
> #include "nsUnicharUtils.h"
>+#include "nsICharsetConverterManager.h"
>+#include "nsICharsetConverterManager2.h"
>+#include "nsICharRepresentable.h"
>+#include "nsIPersistentProperties2.h"
>+#include "nsCompressedCharMap.h"
>+#include "nsNetUtil.h"
>+#include "nsIURI.h"
>+#include "nsHashtable.h"
> 
> #include <gdk/gdkx.h>
> #include <freetype/tttables.h>
>+#include <freetype/freetype.h>
> 
> #define FORCE_PR_LOG
> #include "prlog.h"
>-
>-// Class nsFontXft is an actual instance of a font.  The
>-// |nsFontMetricsXft| class is made up of a collection of these little
>-// fonts, really.
>+  
>+// Abstract class nsFontXft is the base class for nsFontXftUnicode,
>+// nsFontXftCustom and nsFontXftCustomWide one of which is an instance 
>+// of fonts.  The |nsFontMetricsXft| class is made up of a collection of 
>+// these little fonts, really.
> 
> class nsFontXft {
> public:
>     nsFontXft(FcPattern *aPattern, FcPattern *aFontName);
>-    ~nsFontXft();
>+    virtual ~nsFontXft() = 0;
> 
>     XftFont   *GetXftFont (void);
>-    gint       GetWidth8  (const char *aString, PRUint32 aLength);
>-    gint       GetWidth16 (const PRUnichar *aString, PRUint32 aLength);
>+    virtual gint       GetWidth8  (const char *aString, PRUint32 aLength);
>+    virtual gint       GetWidth16 (const PRUnichar *aString, 
>+                                   PRUint32 aLength, 
>+                                   PRBool aNeedConv = PR_TRUE) = 0;
> 
> #ifdef MOZ_MATHML
>-    void       GetBoundingMetrics8  (const char *aString, PRUint32 aLength,
>-                                     nsBoundingMetrics &aBoundingMetrics);
>-    void       GetBoundingMetrics16 (const PRUnichar *aString,
>-                                     PRUint32 aLength,
>-                                     nsBoundingMetrics &aBoundingMetrics);
>+    virtual void       GetBoundingMetrics8  (const char *aString, PRUint32 aLength,
>+                                             nsBoundingMetrics &aBoundingMetrics);
>+    virtual void       GetBoundingMetrics16 (const PRUnichar *aString,
>+                                             PRUint32 aLength,
>+                                             nsBoundingMetrics &aBoundingMetrics);
> #endif /* MOZ_MATHML */
> 
>     PRInt16    GetMaxAscent(void);
>     PRInt16    GetMaxDescent(void);
> 
>+    virtual PRBool     HasChar(PRUint32 aChar) = 0;
>+
>+    virtual nsresult   FillSpecBuffer(const PRUnichar* aString, 
>+                                      const PRUint32 aLength, 
>+                                      nscoord aX, nscoord aY,
>+                                      nscoord* aXoffset,
>+                                      nsRenderingContextGTK *aContext,
>+                                      XftGlyphFontSpec** specBuffer, 
>+                                      PRUint32* aSpecBufferLen, 
>+                                      PRUint32* aSpecBufferSize, float P2T,
>+                                      const nscoord* aSpacing,
>+                                      PRBool* aFoundGlyph) = 0;
>+
>     // a reference to the font loaded and information about it.  
>     XftFont   *mXftFont;
>     FcPattern *mPattern;
>@@ -92,6 +118,96 @@
>     FcCharSet *mCharset;
> };
> 
>+class nsFontXftInfo;
>+
>+// class for regular 'Unicode' fonts that are actually what they claim to be. 
>+class nsFontXftUnicode : public nsFontXft {
>+public:
>+    nsFontXftUnicode(FcPattern *aPattern, FcPattern *aFontName);
>+    virtual ~nsFontXftUnicode();
>+
>+    virtual gint       GetWidth16(const PRUnichar *aString, 
>+                                  PRUint32 aLength, 
>+                                  PRBool aNeedConv = PR_TRUE);
>+    virtual nsresult  FillSpecBuffer(const PRUnichar* aString, 
>+                                     const PRUint32 aLength, 
>+                                     nscoord aX, nscoord aY,
>+                                     nscoord* aXoffset,
>+                                     nsRenderingContextGTK *aContext,
>+                                     XftGlyphFontSpec** specBuffer, 
>+                                     PRUint32* aSpecBufferLen, 
>+                                     PRUint32* aSpecBufferSize, float P2T,
>+                                     const nscoord* aSpacing,
>+                                     PRBool* aFoundGlyph);
>+    virtual PRBool     HasChar(PRUint32 aChar);
>+};
>+
>+// class for custom encoded fonts that pretend to have Unicode cmap.
>+// There are two kinds of them: 
>+// 1. 'narrow' custom encoded fonts such as  mathematica fonts, 
>+// TeX Computer Roman fonts, Symbol font, etc.
>+// 2. 'wide' custom  encoded fonts such as 
>+// Korean Jamo fonts, non-opentype fonts for Indic scripts.
>+//
>+// The biggest difference between 'narrow' and 'wide' fonts is that 
>+// the result of calling mConverter for 'wide' fonts is a sequence of 
>+// 16bit pseudo-Unicode code points.  For 'narrow' fonts, it's 
>+// a sequence of 8bit code points. 
>+class nsFontXftCustom : public nsFontXft {
>+public:
>+    nsFontXftCustom(FcPattern* aPattern, 
>+                    FcPattern* aFontName, 
>+                    nsFontXftInfo* aFontInfo);
>+
>+    virtual ~nsFontXftCustom();
>+
>+    virtual gint       GetWidth16(const PRUnichar *aString, 
>+                                  PRUint32 aLength, 
>+                                  PRBool aNeedConv = PR_TRUE);
>+
>+    virtual nsresult  FillSpecBuffer(const PRUnichar* aString, 
>+                                     const PRUint32 aLength, 
>+                                     nscoord aX, nscoord aY,
>+                                     nscoord* aXoffset,
>+                                     nsRenderingContextGTK *aContext,
>+                                     XftGlyphFontSpec** specBuffer, 
>+                                     PRUint32* aSpecBufferLen, 
>+                                     PRUint32* aSpecBufferSize, float P2T,
>+                                     const nscoord* aSpacing,
>+                                     PRBool* aFoundGlyph);
>+    virtual PRBool     HasChar(PRUint32 aChar);
>+
>+
>+private: 
>+    nsFontXftInfo* mFontInfo; 
>+
>+    // freetype fontface : used for direct access to glyph indices
>+    // in GetWidth16() and FillSpecBuffer().
>+    FT_Face mFT_Face;
>+    nsresult SetFT_FaceCharmap (void);
>+};
>+
>+// a class to hold some essential information about font. 
>+// it's shared and cached with 'family name' as hash key.
>+class nsFontXftInfo {
>+    public:
>+        nsFontXftInfo() : mCCMap(nsnull), mConverter(0), 
>+                          mFontType(eFontTypeUnicode) 
>+                          { }
>+
>+        ~nsFontXftInfo() {
>+            if (mCCMap)
>+                FreeCCMap(mCCMap);
>+        }
>+
>+    // Char. coverage map(replacing mCharset in nsFontXft)
>+        PRUint16*                   mCCMap;
>+    // converter from Unicode to font-specific encoding
>+        nsCOMPtr<nsIUnicodeEncoder> mConverter;
>+
>+        eFontType                   mFontType;
>+};
>+
> struct MozXftLangGroup {
>     const char    *mozLangGroup;
>     FcChar32       character;
>@@ -148,9 +264,52 @@
> 
> #define IS_NON_BMP(c) ((c) >> 16)
> 
>+// a helper class for automatic char buffer allocation
>+class nsAutoCharBuffer {
>+public:
>+    nsAutoCharBuffer();
>+    ~nsAutoCharBuffer();
>+    char* GetArray(PRInt32 aMinLength = 0);
>+
>+private:
>+    char* mArray;
>+    char  mAutoArray[FONT_SPEC_BUFFER_SIZE];
>+    PRInt32  mCount;
>+};
>+
>+nsAutoCharBuffer::nsAutoCharBuffer()
>+     : mArray(mAutoArray),
>+       mCount(FONT_SPEC_BUFFER_SIZE)
>+{
>+}
>+
>+nsAutoCharBuffer::~nsAutoCharBuffer()
>+{
>+    if (mArray && (mArray != mAutoArray)) {
>+        delete [] mArray;
>+    }
>+}
>+
>+char* nsAutoCharBuffer::GetArray(PRInt32 aMinCount)
>+{
>+    if (aMinCount > mCount) {
>+        char* newArray = new char[aMinCount];
>+        if (!newArray) {
>+          return nsnull;
>+        }
>+        if (mArray != mAutoArray) {
>+          delete [] mArray;
>+        }
>+        mArray = newArray;
>+        mCount = aMinCount;
>+    }
>+    return mArray;
>+}
>+
> static XftGlyphFontSpec gFontSpecBuffer[FONT_SPEC_BUFFER_SIZE];
> 
> PRLogModuleInfo *gXftFontLoad = nsnull;
>+static int gNumInstances = 0;
> 
> // a function pointer to point to either |XftTextExtentsUtf16| if available
> // in Xft library or |MozXftTextExtentsUtf16| defined as a static below if not.
>@@ -171,8 +330,34 @@
> static nsresult
> EnumFontsXft(nsIAtom* aLangGroup, const char* aGeneric,
>              PRUint32* aCount, PRUnichar*** aResult);
>+ 
>+static 
>+NS_DEFINE_CID(kCharsetConverterManagerCID, NS_ICHARSETCONVERTERMANAGER_CID);
>+static PRBool gInitialized = PR_FALSE;
>+static nsIPersistentProperties* gFontEncodingProperties = nsnull;
>+static nsICharsetConverterManager2* gCharsetManager = nsnull;
>+static nsHashtable* gFontXftMaps = nsnull;
>+
>+static nsresult 
>+GetEncoding(const char* aFontName, nsString& aValue);
>+static nsresult
>+GetConverter(nsAutoString& aEncoding, nsIUnicodeEncoder** aConverter);
>+static nsresult FreeGlobals(void);
>+static nsresult InitFontEncodingProperties(void);
>+static nsFontXftInfo* GetFontXftInfo(FcPattern* aPattern);
>+static nsresult
>+ConvertUnicodeToCustom(const PRUnichar* aSrc,  PRInt32 aSrcLength,
>+                       PRInt32& aDestLength, nsIUnicodeEncoder* aConverter,
>+                       PRBool aIsWide, nsAutoCharBuffer& aResult);
>+static nsresult
>+FillSpecBufferCommon(const PRUnichar* aString, PRUint32 aLength, 
>+                     nscoord aX, nscoord aY, nscoord* aXoffset,
>+                     nsRenderingContextGTK *aContext,
>+                     XftGlyphFontSpec* aSpecBuffer, float P2T,
>+                     const nscoord* aSpacing, PRBool* aFoundGlyph, 
>+                     nsFontXft* aFont);
> 
>-nsFontMetricsXft::nsFontMetricsXft()
>+nsFontMetricsXft::nsFontMetricsXft() : mFontIsCustomEncoded(2,50)
> {
>     if (!gXftFontLoad)
>         gXftFontLoad = PR_NewLogModule("XftFontLoad");
>@@ -194,6 +379,7 @@
>         if (!gXftTextExtentsUtf16)  // use the internally defined one.
>             gXftTextExtentsUtf16 = MozXftTextExtentsUtf16; 
>     }
>+    ++gNumInstances;
> }
> 
> nsFontMetricsXft::~nsFontMetricsXft()
>@@ -214,12 +400,13 @@
>     if (mMiniFont)
>         XftFontClose(GDK_DISPLAY(), mMiniFont);
> 
>-#ifdef DEBUG_XFT_MEMORY
>     if (--gNumInstances == 0) {
>+        FreeGlobals();
>+#ifdef DEBUG_XFT_MEMORY
>         XftMemReport();
>         FcMemReport();
>-    }
> #endif
>+    }
> }
> 
> NS_IMPL_ISUPPORTS1(nsFontMetricsXft, nsIFontMetrics)
>@@ -313,9 +500,26 @@
>         mPointSize = 1;
>     }
> 
>+    if (!gInitialized) {
>+        nsServiceManager::GetService(kCharsetConverterManagerCID,
>+        NS_GET_IID(nsICharsetConverterManager2), (nsISupports**) &gCharsetManager);
>+        if (!gCharsetManager) {
>+            FreeGlobals();
>+            return NS_ERROR_FAILURE;
>+        }
>+        gFontXftMaps = new nsHashtable();
>+        if (!gFontXftMaps) {
>+            FreeGlobals();
>+            return NS_ERROR_OUT_OF_MEMORY;
>+        }
>+        gInitialized = PR_TRUE;
>+    }
>+
>     if (NS_FAILED(RealizeFont()))
>         return NS_ERROR_FAILURE;
> 
>+//      NS_ASSERTION(NS_SUCCEEDED(res), "No font at all has been created");
>+
>     return NS_OK;
> }
> 
>@@ -437,7 +641,7 @@
>         // this character.
>         for (PRInt32 j = 0, end = mLoadedFonts.Count(); j < end; ++j) {
>             font = (nsFontXft *)mLoadedFonts.ElementAt(j);
>-            if (FcCharSetHasChar(font->mCharset, c)) {
>+            if (font->HasChar(c)) {
>                 currFont = font;
>                 goto FoundFont; // for speed -- avoid "if" statement
>             }
>@@ -657,6 +861,34 @@
>     return NS_OK;
> }
> 
>+  
>+#define FILLSPECBUFFER                                                    \
>+    do {                                                                  \
>+        XftGlyphFontSpec* oldSpecBuffer = specBuffer;                     \
>+        PRUint32 oldSpecBufferLen = specBufferLen;                        \
>+        res = prevFont->FillSpecBuffer(&inString[start], i - start,       \
>+                    aX, aY, &xOffset,                                     \
>+                    aContext, &specBuffer, &specBufferLen,                \
>+                    &specBufferSize, P2T, aSpacing, &foundGlyph);         \
>+                                                                          \
>+        if (NS_FAILED(res)) {                                             \
>+            if (specBuffer && specBuffer != oldSpecBuffer) {              \
>+                delete [] specBuffer;                                     \
>+            }                                                             \
>+            /* Put the old value back for delete [] */                    \
>+            specBuffer = oldSpecBuffer;                                   \
>+            specBufferLen = oldSpecBufferLen;                             \
>+            goto Error;                                                   \
>+        }                                                                 \
>+                                                                          \
>+        if (specBuffer != oldSpecBuffer) {                                \
>+            if (useLocalSpecBuffer)                                       \
>+                delete [] oldSpecBuffer;                                  \
>+                useLocalSpecBuffer = PR_TRUE;                             \
>+        }                                                                 \
>+    } while (0)                                                               
>+
>+
> nsresult
> nsFontMetricsXft::DrawString(const PRUnichar* aString, PRUint32 aLength,
>                              nscoord aX, nscoord aY,
>@@ -668,10 +900,16 @@
>     // set up our colors and clip regions
>     XftDraw *draw;
>     XftColor color;
>+    nsresult res = NS_OK;
>     PrepareToDraw(aContext, aSurface, &draw, color);
> 
>-    // We use XftDrawCharFontSpec to do our drawing here so that even
>-    // though we might have many different fonts at many different
>+    // We used to use XftDrawCharFontSpec, but some custom encoded
>+    // math fonts  have broken Cmaps that need to be worked around 
>+    // by directly accessing glyph indices.(see bug 176290).
>+    // Therefore, we use XftDrawGlyphFontSpec to do our drawing 
>+    // here. The former is a simple wrapper for the latter and we don't lose
>+    // anything by using the latter while keeping advantages which are: 
>+    // Even though we might have many different fonts at many different
>     // locations due to spacing, we can do it all in one shot.  This
>     // is much faster from a protocol and code standpoint.
> 
>@@ -685,12 +923,25 @@
>     // we keep track of the size of the buffer seperately.
>     PRUint32 specBufferLen = 0;
> 
>+    // The spec buffer may have to be expanded if custom-encoded font 
>+    // requires a buffer larger than aLength so that we have to keep 
>+    // track of the maximum  buffer size  to increase it if necessary.
>+    PRUint32 specBufferSize = FONT_SPEC_BUFFER_SIZE;
>+
>+    // When mMiniFont is not loaded, we have to put UCS2_SPACE in aString, 
>+    // but it's const so that we need to make a local copy of it in 
>+    // such a case.
>+    PRBool useLocalInputBuffer = PR_FALSE;
>+    PRUnichar* inString = (PRUnichar *) aString; // cast away const
>+
>     if (aLength > FONT_SPEC_BUFFER_SIZE) {
>         useLocalSpecBuffer = PR_TRUE;
>         specBuffer = new XftGlyphFontSpec[aLength];
> 
>         if (!specBuffer)
>             return NS_ERROR_FAILURE;
>+
>+        specBufferSize = aLength;
>     }
> 
>     float P2T;
>@@ -701,23 +952,22 @@
> 
>     // Set up the mini font, if it hasn't already been done before.
>     // If it fails, we'll just substitute the space character below.
>+    if (!mMiniFont)
>+        SetupMiniFont();
> 
>     // When we walk the list of fonts this is the offset from the
>     // original x coordinate that we need to render.
>     nscoord xOffset = 0;
>-
>-    for (PRUint32 i = 0; i < aLength; ++i) {
>-        PRUint32 c = aString[i];
>+    nsFontXft* prevFont = nsnull;
>+    PRUint32 start = 0;
>+    PRUint32 i;
>+    PRUint8  isLastNonBMP = 0;
>+  
>+    for (i = 0; i < aLength; i += 1 + isLastNonBMP) {
>+        PRUint32 c = inString[i];
>         nsFontXft *currFont = nsnull;
>         nsFontXft *font = nsnull;
>-
>-        // position in X is the location offset in the string plus
>-        // whatever offset is required for the spacing argument
>-        nscoord x = aX + xOffset;
>-        nscoord y = aY;
>-
>-        // convert this into device coordinates
>-        aContext->GetTranMatrix()->TransformCoord(&x, &y);
>+        isLastNonBMP = 0;
> 
>         // Convert any surrogate paris into a single ucs4 character
>         if (IS_LOW_SURROGATE(c))  // an unpaired low surrogate
>@@ -726,7 +976,7 @@
>         if (IS_HIGH_SURROGATE(c)) {  
>             if (i + 1 < aLength && IS_LOW_SURROGATE(aString[i + 1])) {
>                 c = SURROGATE_TO_UCS4(c,aString[i + 1]);
>-                ++i;
>+                isLastNonBMP = 1;
>             }
>             else {  // an unpaired high surrogate
>                 goto NotFound;
>@@ -736,7 +986,7 @@
>         // this character.
>         for (PRInt32 j = 0, end = mLoadedFonts.Count(); j < end; ++j) {
>             font = (nsFontXft *)mLoadedFonts.ElementAt(j);
>-            if (FcCharSetHasChar(font->mCharset, c)) {
>+            if (font->HasChar(c)) {
>                 currFont = font;
>                 goto FoundFont; // for speed -- avoid "if" statement
>             }
>@@ -753,76 +1003,91 @@
>         if (!mMiniFont) {
>             c = UCS2_SPACE;
>             currFont = FindFont(c);
>-            goto FoundFont;
>-        }
>-
>-        // Draw the unknown glyph, advance our x offset and jump to
>-        // the top of the loop.
>-
>-        // Note that we offset the height of this drawing so that it's
>-        // drawn on the average baseline of the font, not the actual
>-        // baseline since a lot of characters actually go below the
>-        // baseline and if the unknown glyph box is on the baseline it
>-        // looks higher than all the other fonts in a string.
>-        DrawUnknownGlyph(c, x, y + mMiniFontYOffset, &color, draw);
>-
>-        if (aSpacing) {
>-            xOffset += *aSpacing;
>-            aSpacing += (IS_NON_BMP(c) ? 2 : 1);
>+            // Put UCS2_SPACE at inString[i] after making a fresh copy. 
>+            // If malloc fails, just use the original.
>+            if (! useLocalInputBuffer) {
>+                PRUnichar* oldString = inString; // not necessary, but just for readability
>+                if ((inString = new PRUnichar[aLength])) {
>+                    memcpy(inString, oldString, aLength * sizeof(PRUnichar));
>+                    useLocalInputBuffer = PR_TRUE;
>+                    inString[i] = c;
>+                }
>+            }
>+            else 
>+                inString[i] = c;
>         }
>         else {
>-            xOffset +=
>-                NSToCoordRound((mMiniFontWidth * (IS_NON_BMP(c) ? 3 : 2) +
>-                                mMiniFontPadding * (IS_NON_BMP(c) ? 6 : 5)) * P2T);
>-        }
>+            // If we got here it means that we couldn't find a font for
>+            // this character so we need to draw the unknown glyph box,
>+            // after drawing any left over text.
>+            if (prevFont) {
>+                FILLSPECBUFFER;
>+                prevFont = nsnull;
>+            } 
>+
>+            // position in X is the location offset in the string plus
>+            // whatever offset is required for the spacing argument
>+            nscoord x = aX + xOffset;
>+            nscoord y = aY;
>+
>+            // convert this into device coordinates
>+            aContext->GetTranMatrix()->TransformCoord(&x, &y);
>+
>+            // Note that we offset the height of this drawing so that it's
>+            // drawn on the average baseline of the font, not the actual
>+            // baseline since a lot of characters actually go below the
>+            // baseline and if the unknown glyph box is on the baseline it
>+            // looks higher than all the other fonts in a string.
>+            DrawUnknownGlyph(c, x, y + mMiniFontYOffset, &color, draw);
>+
>+            if (aSpacing) {
>+                xOffset += *aSpacing;
>+                aSpacing += (IS_NON_BMP(c) ? 2 : 1);
>+            }
>+            else {
>+                xOffset +=
>+                    NSToCoordRound((mMiniFontWidth * (IS_NON_BMP(c) ? 3 : 2) +
>+                                    mMiniFontPadding * (IS_NON_BMP(c) ? 6 : 5)) * P2T);
>+            }
> 
>-        // jump to the top of the loop
>-        continue;
>+            // jump to the top of the loop
>+            continue;
>+        }
> 
>     FoundFont:
>-        // use current found font to render this character
>-        specBuffer[specBufferLen].x = x;
>-        specBuffer[specBufferLen].y = y;
>-        specBuffer[specBufferLen].font = currFont->GetXftFont();
>-        specBuffer[specBufferLen].glyph = XftCharIndex(GDK_DISPLAY(),
>-                                                       currFont->GetXftFont(),
>-                                                       c);
>-
>-        // check to see if this glyph is non-empty
>-        if (!foundGlyph) {
>-            XGlyphInfo info;
>-            XftGlyphExtents(GDK_DISPLAY(), specBuffer[specBufferLen].font,
>-                            &specBuffer[specBufferLen].glyph,
>-                            1, &info);
>-
>-            if (info.width && info.height)
>-                foundGlyph = PR_TRUE;
>+        // use prevFont to render a substring up to this point.
>+        if (prevFont) { 
>+            if (currFont != prevFont) {
>+                FILLSPECBUFFER;
>+                if (NS_FAILED(res))
>+                    goto Error; 
>+                prevFont = currFont;
>+                start = i;
>+            } 
>+        }   
>+        else {
>+            prevFont = currFont;
>+            start = i;
>         }
>+    }
> 
>-        ++specBufferLen;
>-
>-        if (aSpacing) {
>-            xOffset += *aSpacing;
>-            aSpacing += (IS_NON_BMP(c) ? 2 : 1);
>-        }
>-        else if (IS_NON_BMP(c)) {
>-            xOffset +=
>-                NSToCoordRound(currFont->GetWidth16(&aString[i - 1], 2) * P2T);
>-        }
>-        else 
>-            xOffset +=
>-                NSToCoordRound(currFont->GetWidth16(&aString[i], 1) * P2T);
>+    if (prevFont) {
>+        FILLSPECBUFFER;
>     }
> 
>     // Go forth and blit!
>     if (foundGlyph)
>         XftDrawGlyphFontSpec(draw, &color, specBuffer, specBufferLen);
> 
>+Error: 
>     // only free memory if we didn't use the static buffer
>     if (useLocalSpecBuffer)
>         delete [] specBuffer;
>+    // only free memory if inString is a local copy of aString.
>+    if (useLocalInputBuffer)
>+        delete [] inString;
> 
>-    return NS_OK;
>+    return res;
> }
> 
> #ifdef MOZ_MATHML
>@@ -1198,12 +1463,27 @@
>             printf("\t%s\n", name);
>         }
> 
>-        nsFontXft *font = new nsFontXft(mPattern, set->fonts[i]);
>+        nsFontXft *font;
>+        nsFontXftInfo *info;
>+        nsCOMPtr<nsIUnicodeEncoder> converter = 0;
>+
>+        info = GetFontXftInfo(set->fonts[i]);
>+        if (info) {
>+            if (info->mFontType == eFontTypeUnicode)
>+                font = new nsFontXftUnicode(mPattern, set->fonts[i]);
>+            else
>+                font = new nsFontXftCustom(mPattern, set->fonts[i], info);
>+        }
>+        else  // if null is returned, treat it as Unicode font. 
>+            font = new nsFontXftUnicode(mPattern, set->fonts[i]);
>+
>         if (!font)
>             goto loser;
> 
>         // append this font to our list of loaded fonts
>         mLoadedFonts.AppendElement((void *)font);
>+        mFontIsCustomEncoded.AppendValue(
>+                (info && info->mFontType != eFontTypeUnicode) ? 1 : 0);
>     }
> 
>     // we're done with the set now
>@@ -1225,6 +1505,7 @@
>         nsFontXft *font = (nsFontXft *)mLoadedFonts.ElementAt(i);
>         mLoadedFonts.RemoveElementAt(i);
>         delete font;
>+        mFontIsCustomEncoded.RemoveValueAt(i);
>     }
> }
> 
>@@ -1268,7 +1549,7 @@
>         // this character.
>         for (PRInt32 j = 0, end = mLoadedFonts.Count(); j < end; ++j) {
>             font = (nsFontXft *)mLoadedFonts.ElementAt(j);
>-            if (FcCharSetHasChar(font->mCharset, c)) {
>+            if (font->HasChar(c)) {
>                 currFont = font;
>                 goto FoundFont; // for speed -- avoid "if" statement
>             }
>@@ -1756,8 +2037,90 @@
>     return glyphInfo.xOff;
> }
> 
>+#ifdef MOZ_MATHML
>+void
>+nsFontXft::GetBoundingMetrics8 (const char *aString, PRUint32 aLength,
>+                                nsBoundingMetrics &aBoundingMetrics)
>+{
>+    NS_NOTREACHED("GetBoundingMetrics8");
>+}
>+
>+void
>+nsFontXft::GetBoundingMetrics16 (const PRUnichar *aString,
>+                                 PRUint32 aLength,
>+                                 nsBoundingMetrics &aBoundingMetrics)
>+{
>+    NS_NOTREACHED("GetBoundingMetrics16");
>+}
>+#endif /* MOZ_MATHML */
>+
>+PRInt16
>+nsFontXft::GetMaxAscent(void)
>+{
>+    if (!mXftFont)
>+        GetXftFont();
>+
>+    return mXftFont->ascent;
>+}
>+
>+
>+PRInt16
>+nsFontXft::GetMaxDescent(void)
>+{
>+    if (!mXftFont)
>+        GetXftFont();
>+
>+    return mXftFont->descent;
>+}
>+
>+// class nsFontXftUnicode impl
>+  
>+nsFontXftUnicode::nsFontXftUnicode(FcPattern* aPattern, FcPattern* aFontName) :
>+                                 nsFontXft(aPattern, aFontName)
>+{
>+}
>+
>+nsFontXftUnicode::~nsFontXftUnicode()
>+{
>+}
>+
>+nsresult
>+nsFontXftUnicode::FillSpecBuffer(const PRUnichar* aString, 
>+                                 const PRUint32 aLength, 
>+                                 nscoord aX, nscoord aY,
>+                                 nscoord* aXoffset,
>+                                 nsRenderingContextGTK *aContext,
>+                                 XftGlyphFontSpec** aSpecBuffer, 
>+                                 PRUint32* aSpecBufferLen, 
>+                                 PRUint32* aSpecBufferSize, float P2T,
>+                                 const nscoord* aSpacing,
>+                                 PRBool* aFoundGlyph)
>+{
>+    if (*aSpecBufferLen + aLength  > *aSpecBufferSize) {
>+        XftGlyphFontSpec* oldSpecBuffer = *aSpecBuffer;
>+        *aSpecBufferSize += 2 * aLength;
>+        *aSpecBuffer = new XftGlyphFontSpec[*aSpecBufferSize];
>+        if (! *aSpecBuffer) {
>+            *aSpecBufferSize -= 2 * aLength;
>+            return NS_ERROR_OUT_OF_MEMORY;
>+        }
>+        memcpy(*aSpecBuffer, oldSpecBuffer, 
>+               *aSpecBufferLen * sizeof(XftGlyphFontSpec));
>+    }
>+
>+    if (!mXftFont)
>+        GetXftFont();
>+
>+    FillSpecBufferCommon(aString, aLength, aX, aY, aXoffset, aContext,
>+                         *aSpecBuffer + *aSpecBufferLen, P2T, aSpacing,
>+                         aFoundGlyph, this);
>+    *aSpecBufferLen += aLength;
>+    return NS_OK;
>+}
>+  
>+// aNeedConv is a dummy argument for nsFontXftUnicode
> gint
>-nsFontXft::GetWidth16(const PRUnichar *aString, PRUint32 aLength)
>+nsFontXftUnicode::GetWidth16(const PRUnichar *aString, PRUint32 aLength, PRBool aNeedConv)
> {
>     XGlyphInfo glyphInfo;
>     if (!mXftFont)
>@@ -1775,39 +2138,208 @@
>     return glyphInfo.xOff;
> }
> 
>-#ifdef MOZ_MATHML
>-void
>-nsFontXft::GetBoundingMetrics8 (const char *aString, PRUint32 aLength,
>-                                nsBoundingMetrics &aBoundingMetrics)
>+PRBool
>+nsFontXftUnicode::HasChar(PRUint32 aChar)
> {
>-    NS_NOTREACHED("GetBoundingMetrics8");
>+  return FcCharSetHasChar(mCharset, (FcChar32) aChar);
> }
> 
>-void
>-nsFontXft::GetBoundingMetrics16 (const PRUnichar *aString,
>-                                 PRUint32 aLength,
>-                                 nsBoundingMetrics &aBoundingMetrics)
>+// class nsFontXftCustom impl
>+
>+nsFontXftCustom::nsFontXftCustom(FcPattern* aPattern, FcPattern* aFontName, 
>+                                 nsFontXftInfo* aFontInfo) :
>+                                 nsFontXft(aPattern, aFontName), 
>+                                 mFontInfo(aFontInfo),mFT_Face(nsnull)
> {
>-    NS_NOTREACHED("GetBoundingMetrics16");
> }
>-#endif /* MOZ_MATHML */
> 
>-PRInt16
>-nsFontXft::GetMaxAscent(void)
>+nsFontXftCustom::~nsFontXftCustom()
>+{
>+    if (mXftFont && mFT_Face)
>+       XftUnlockFace(mXftFont);
>+}
>+
>+
>+gint
>+nsFontXftCustom::GetWidth16(const PRUnichar *aString, PRUint32 aLength, PRBool aNeedConv)
> {
>     if (!mXftFont)
>         GetXftFont();
> 
>-    return mXftFont->ascent;
>+    // No need to invoke converter. aString has to be taken as they are 
>+    // without any conversion. Invoked with PR_TRUE by FillSpecBuffer().  
>+    // No known custom font encoding uses UTF-16 so that we can just
>+    // use Xft API for UCS-2 here.
>+    if (! aNeedConv) {
>+        XGlyphInfo glyphInfo;
>+        XftTextExtents16 (GDK_DISPLAY(), mXftFont, (FcChar16 *)aString,
>+                              aLength, &glyphInfo);
>+        return glyphInfo.xOff;
>+    }
>+
>+    nsAutoCharBuffer buffer;
>+    PRInt32 destLength = aLength;
>+    PRBool isWide = mFontInfo->mFontType == eFontTypeCustomWide; 
>+    if (NS_FAILED(ConvertUnicodeToCustom(aString, aLength, destLength, 
>+                                         mFontInfo->mConverter, isWide,
>+                                         buffer)))
>+        return 0;
>+
>+    char* str = buffer.GetArray();
>+    XGlyphInfo glyphInfo;
>+
>+    if (NS_FAILED(SetFT_FaceCharmap())) 
>+        return 0;
>+
>+    if (isWide) { 
>+        XftTextExtents16(GDK_DISPLAY(), mXftFont, (FcChar16 *)str,
>+                         destLength >> 1, &glyphInfo);
>+    }
>+    else {
>+        // If FT_SelectCharMap succeeds for MacRoman or Unicode,
>+        // use glyph indices directly for 'narrow' custom encoded fonts.
>+        FT_UInt *glyphs, glyphs_local[FONT_SPEC_BUFFER_SIZE];
>+        if (destLength <= FONT_SPEC_BUFFER_SIZE)
>+            glyphs = glyphs_local;
>+        else {
>+            glyphs = new FT_UInt[destLength];
>+            if (! glyphs)
>+                return 0;
>+        }
>+        char* pstr = str;
>+        FT_UInt* pglyph = glyphs;
>+        while (pstr < str + destLength) 
>+            *pglyph++ = FT_Get_Char_Index (mFT_Face, FT_ULong(PRUint8(*pstr++)));
>+
>+        XftGlyphExtents(GDK_DISPLAY(), mXftFont, glyphs, destLength, &glyphInfo);
>+        if (glyphs != glyphs_local) 
>+            delete[] glyphs;
>+    }
>+
>+    return glyphInfo.xOff;
> }
> 
>-PRInt16
>-nsFontXft::GetMaxDescent(void)
>+PRBool
>+nsFontXftCustom::HasChar(PRUint32 aChar)
>+{
>+  return (mFontInfo->mCCMap && CCMAP_HAS_CHAR_EXT(mFontInfo->mCCMap, aChar)); 
>+}
>+
>+nsresult
>+nsFontXftCustom::SetFT_FaceCharmap(void)
> {
>+    if (! mXftFont)
>+        GetXftFont();
>+    if (! mFT_Face) {
>+        mFT_Face = XftLockFace(mXftFont);
>+        if (FT_Select_Charmap (mFT_Face, ft_encoding_unicode)
>+            &&  FT_Select_Charmap (mFT_Face, ft_encoding_apple_roman))
>+            return NS_ERROR_UNEXPECTED;
>+    }
>+    return NS_OK;
>+}
>+
>+
>+
>+nsresult
>+nsFontXftCustom::FillSpecBuffer(const PRUnichar* aString, 
>+                                const PRUint32 aLength, 
>+                                nscoord aX, nscoord aY,
>+                                nscoord* aXoffset,
>+                                nsRenderingContextGTK *aContext,
>+                                XftGlyphFontSpec** aSpecBuffer, 
>+                                PRUint32* aSpecBufferLen, 
>+                                PRUint32* aSpecBufferSize, float P2T,
>+                                const nscoord* aSpacing,
>+                                PRBool* aFoundGlyph)
>+{
>+    nsresult rv = NS_OK;
>+    nsAutoCharBuffer buffer;
>+    PRInt32 destLength = aLength;
>+    PRBool isWide = (mFontInfo->mFontType == eFontTypeCustomWide); 
>+
>+    rv = ConvertUnicodeToCustom(aString, aLength, destLength, 
>+                                mFontInfo->mConverter, isWide, buffer);
>+    if (NS_FAILED(rv))
>+        return rv;
>+
>     if (!mXftFont)
>         GetXftFont();
>+      
>+    if (! isWide) {
>+        // For narrow fonts, it's assumed that what we have in str 
>+        // after the conversion  is in the encoding of a specific 
>+        // FT_Charmap (Apple Roman) instead of Unicode so that we 
>+        // should not call XftCharIndex() which regards input string 
>+        // as in Unicode. Instead, we have to directly invoke 
>+        // FT_Get_Char_Index() with FT_Face corresponding to XftFont 
>+        // after setting FT_Charmap to the cmap of our choice(Apple Roman).
>+        // This is true of all 'narrow' custom encoded fonts 
>+        // (TeX CM fonts, Symbol fonts and Math[1-5] fonts) at the moment.
>+        rv = SetFT_FaceCharmap(); 
>+        if (NS_FAILED(rv)) {
>+            return NS_ERROR_UNEXPECTED;
>+        }
>+    }
>+    else {
>+        // The string after the conversion can be longer than the original.
>+        // destLength is in byte while we count PRUnichar's(2bytes) and
>+        // the result of the conversion comes in PRUnichar for 'wide' fonts.
>+        if (*aSpecBufferLen + (destLength >> 1) > *aSpecBufferSize) {
>+            XftGlyphFontSpec* oldSpecBuffer = *aSpecBuffer;
>+            PRUint32 sizeIncrement = 2 * PR_MAX(destLength - aLength, 
>+                     *aSpecBufferLen + (destLength >> 1) - *aSpecBufferSize);
>+            *aSpecBufferSize += sizeIncrement;
>+            *aSpecBuffer = new XftGlyphFontSpec[*aSpecBufferSize];
>+            if (! *aSpecBuffer) {
>+                *aSpecBufferSize -= sizeIncrement;
>+                return  NS_ERROR_OUT_OF_MEMORY;
>+            }
>+            memcpy(*aSpecBuffer, oldSpecBuffer, 
>+                   *aSpecBufferLen * sizeof(XftGlyphFontSpec));
>+        }
>+    }
> 
>-    return mXftFont->descent;
>+    char* str = buffer.GetArray();
>+    if (! isWide) {
>+        for (char* pstr=str; pstr < str + destLength; ++pstr) {
>+            (*aSpecBuffer)[*aSpecBufferLen].font = mXftFont;
>+            (*aSpecBuffer)[*aSpecBufferLen].glyph = 
>+            FT_Get_Char_Index (mFT_Face, FT_ULong(PRUint8(*pstr)));
>+    
>+            /* check to see if this glyph is non-empty */ 
>+            if (!*aFoundGlyph) {
>+                XGlyphInfo info;
>+                XftGlyphExtents(GDK_DISPLAY(),
>+                                (*aSpecBuffer)[*aSpecBufferLen].font,
>+                                &((*aSpecBuffer)[*aSpecBufferLen].glyph),
>+                                1, &info);                       
>+                if (info.width && info.height)                  
>+                    *aFoundGlyph = PR_TRUE;                      
>+            }
>+    
>+            nscoord x = aX + *aXoffset;
>+            nscoord y = aY;
>+    
>+            // Convert to device coordinate
>+            aContext->GetTranMatrix()->TransformCoord(&x, &y);
>+    
>+            (*aSpecBuffer)[*aSpecBufferLen].x = x;
>+            (*aSpecBuffer)[*aSpecBufferLen].y = y;
>+    
>+            *aXoffset += NSToCoordRound(GetWidth8(pstr, 1) * P2T);
>+            (*aSpecBufferLen)++;
>+        }
>+    }
>+    else {
>+        FillSpecBufferCommon((PRUnichar*) str, destLength >> 1, aX, aY, 
>+                             aXoffset, aContext, 
>+                             *aSpecBuffer + *aSpecBufferLen,
>+                             P2T, nsnull, aFoundGlyph, this);
>+        *aSpecBufferLen += (destLength >> 1);
>+    }
>+
>+    return rv;
> }
> 
> // Static functions
>@@ -2208,3 +2740,274 @@
>     g_free(rects);
> }
> #endif /* MOZ_WIDGET_GTK2 */
>+
>+
>+// Helper to determine if a font has a private encoding that we know something about
>+/* static */
>+nsresult
>+GetEncoding(const char* aFontName, nsString& aValue)
>+{
>+  // below is a list of common used name for startup
>+    if ((!strcmp(aFontName, "Helvetica" )) ||
>+         (!strcmp(aFontName, "Times" )) ||
>+         (!strcmp(aFontName, "Times New Roman" )) ||
>+         (!strcmp(aFontName, "Courier New" )) ||
>+         (!strcmp(aFontName, "Courier" )) ||
>+         (!strcmp(aFontName, "Arial" )) ||
>+         (!strcmp(aFontName, "MS P Gothic" )) ||
>+         (!strcmp(aFontName, "Verdana" )))
>+        return NS_ERROR_NOT_AVAILABLE; // error mean do not get a special encoding
>+
>+    nsCAutoString name;
>+    name.Assign(NS_LITERAL_CSTRING("encoding.") + nsDependentCString(aFontName) + 
>+        NS_LITERAL_CSTRING(".ttf"));
>+
>+    name.StripWhitespace();
>+    ToLowerCase(name);
>+
>+    // if we have not init the property yet, init it right now.
>+    if (! gFontEncodingProperties)
>+        InitFontEncodingProperties();
>+
>+    if (gFontEncodingProperties)
>+        return gFontEncodingProperties->GetStringProperty(name, aValue);
>+
>+    return NS_ERROR_NOT_AVAILABLE;
>+}
>+
>+/* static */
>+static nsresult
>+GetConverter(nsAutoString& aEncoding, nsIUnicodeEncoder** aConverter)
>+{
>+    nsresult rv;
>+    nsCOMPtr<nsIAtom> charset;
>+
>+    if (! gCharsetManager) {
>+        nsServiceManager::GetService(kCharsetConverterManagerCID,
>+        NS_GET_IID(nsICharsetConverterManager2), (nsISupports**) &gCharsetManager);
>+        if (! gCharsetManager) {
>+            FreeGlobals();
>+            return NS_ERROR_FAILURE;
>+        }
>+    }
>+
>+    nsAutoString encoding = aEncoding;
>+    if (encoding.Length() > 5 &&
>+        Substring(encoding, encoding.Length() - 5, 5) ==  NS_LITERAL_STRING(".wide")) {
>+        encoding.Truncate(encoding.Length() - 5);
>+    }
>+
>+    rv = gCharsetManager->GetCharsetAtom(encoding.get(), getter_AddRefs(charset));
>+    NS_ENSURE_SUCCESS(rv, rv);
>+
>+    rv = gCharsetManager->GetUnicodeEncoder(charset, aConverter);
>+    NS_ENSURE_SUCCESS(rv, rv);
>+
>+    return (*aConverter)->SetOutputErrorBehavior((*aConverter)->kOnError_Replace, nsnull, '?');
>+}
>+
>+
>+/* static */
>+nsresult
>+FreeGlobals(void)
>+{
>+  gInitialized = 0;
>+
>+  NS_IF_RELEASE(gFontEncodingProperties);
>+  NS_IF_RELEASE(gCharsetManager);
>+  if (gFontXftMaps)
>+      delete gFontXftMaps;
>+  return NS_OK;
>+}
>+
>+/* static */
>+nsresult
>+InitFontEncodingProperties(void)
>+{
>+    nsresult rv;
>+    // load the special encoding resolver
>+    nsCOMPtr<nsIURI> uri;
>+    rv = NS_NewURI(getter_AddRefs(uri), "resource:/res/fonts/fontEncoding.properties");
>+    if (NS_SUCCEEDED(rv)) {
>+        nsCOMPtr<nsIInputStream> in;
>+        rv = NS_OpenURI(getter_AddRefs(in), uri);
>+        if (NS_SUCCEEDED(rv)) {
>+            rv = nsComponentManager::
>+                 CreateInstance(NS_PERSISTENTPROPERTIES_CONTRACTID, nsnull,
>+                                NS_GET_IID(nsIPersistentProperties),
>+                                (void**)&gFontEncodingProperties);
>+            if (NS_SUCCEEDED(rv)) {
>+                rv = gFontEncodingProperties->Load(in);
>+            }
>+        }
>+    }
>+    return rv;
>+}
>+
>+/* static */
>+nsFontXftInfo*
>+GetFontXftInfo(FcPattern* aPattern)
>+{
>+    const char* family;
>+
>+    // If there's no family, just treat it as if it's a normal Unicode font
>+    if ( FcPatternGetString (aPattern, FC_FAMILY, 0, (FcChar8 **) &family) 
>+         != FcResultMatch) {
>+        return nsnull;
>+    }
>+
>+    nsFontXftInfo* info;
>+
>+    NS_ASSERTION(gFontXftMaps, "gFontXMaps should not be null by now.");
>+
>+    // 'family' returned by FcPatternGetString is just a reference. 
>+    nsCStringKey key = nsCStringKey(family, -1, nsCStringKey::NEVER_OWN);
>+    info = NS_STATIC_CAST(nsFontXftInfo *, gFontXftMaps -> Get(&key));
>+
>+    // cached entry found. 
>+    if (info) {
>+        return info;
>+    }
>+
>+    
>+    info = new nsFontXftInfo; 
>+    if (! info) 
>+        return nsnull; 
>+
>+    // See if a font has a custom/hacked encoding by matching
>+    // its family name against the list in fontEncoding.properties 
>+    // with GetEncoding(). Then get the converter and see if has
>+    // a valid coverage map. Finally, set the fonttype according
>+    // to the encoding name. '.wide' suffix indicates it's a wide font.
>+
>+    PRUint16* ccmap = nsnull;
>+    nsCOMPtr<nsIUnicodeEncoder> converter;
>+    nsAutoString encoding;
>+
>+    info->mFontType = eFontTypeUnicode;
>+    
>+    if (NS_SUCCEEDED(GetEncoding(family, encoding)) &&
>+        NS_SUCCEEDED(GetConverter(encoding, getter_AddRefs(converter)))) {
>+        nsCOMPtr<nsICharRepresentable> mapper(do_QueryInterface(converter));
>+        if (mapper) {
>+            ccmap = MapperToCCMap(mapper);
>+            if (encoding.Length() > 5 && 
>+                Substring(encoding, encoding.Length() - 5, 5) == 
>+                NS_LITERAL_STRING(".wide")) 
>+                info->mFontType = eFontTypeCustomWide;
>+            else 
>+                info->mFontType = eFontTypeCustom;
>+            info->mCCMap =  ccmap;  // will be freed by ~nsFontXftInfo().
>+            info->mConverter = converter;
>+        }
>+    }
>+
>+    gFontXftMaps -> Put(&key, info); 
>+
>+    return info;
>+}
>+
>+/* static */
>+nsresult
>+ConvertUnicodeToCustom(const PRUnichar* aSrc,  PRInt32 aSrcLength,
>+                       PRInt32& aDestLength, nsIUnicodeEncoder* aConverter,
>+                       PRBool aIsWide, nsAutoCharBuffer& aResult)
>+{
>+    nsresult rv;
>+
>+    nsCOMPtr<nsIUnicodeEncoder> converter = aConverter;
>+    if (! converter )
>+        return NS_ERROR_UNEXPECTED;
>+    if (aIsWide &&
>+        NS_FAILED(aConverter->GetMaxLength(aSrc, aSrcLength, &aDestLength))) 
>+        return NS_ERROR_UNEXPECTED;
>+
>+    char* str = aResult.GetArray(aDestLength); 
>+    if (!str) 
>+        return NS_ERROR_OUT_OF_MEMORY;
>+
>+    // Convert aString from Unicode to font-specific encoding with mConverter.
>+    rv = converter->Convert(aSrc, &aSrcLength, str, &aDestLength);
>+    if (NS_FAILED(rv)) 
>+        return rv;
>+
>+#ifdef IS_LITTLE_ENDIAN
>+    // Convert BE UCS2 to LE UCS2 for 'wide' custome fonts
>+    if (aIsWide) {
>+        char* pstr = str;
>+        while (pstr < str + aDestLength) {
>+            PRUint8 tmp = pstr[0];
>+            pstr[0] = pstr[1];
>+            pstr[1] = tmp;
>+            pstr += 2; // swap every two bytes
>+        }
>+    }
>+#endif
>+    return NS_OK;
>+}
>+
>+/* static */
>+nsresult
>+FillSpecBufferCommon(const PRUnichar* aString, PRUint32 aLength, 
>+                     nscoord aX, nscoord aY, nscoord* aXoffset,
>+                     nsRenderingContextGTK *aContext,
>+                     XftGlyphFontSpec* aSpecBuffer, 
>+                     float P2T, const nscoord* aSpacing,
>+                     PRBool* aFoundGlyph, nsFontXft* aFont)
>+{
>+    XftFont* xftFont = aFont->mXftFont;
>+
>+    const PRUnichar* str = aString;
>+    const PRUnichar* end = str + aLength;
>+    while(str < end) {
>+        nscoord x = aX + *aXoffset;               
>+        nscoord y = aY;                        
>+        /* Convert to device coordinate. */   
>+        aContext->GetTranMatrix()->TransformCoord(&x, &y);
>+                                                                 
>+        /* position in X is the location offset in the string 
>+           plus whatever offset is required for the spacing   
>+           argument                                           
>+         */                                                  
>+
>+        aSpecBuffer->x = x;                    
>+        aSpecBuffer->y = y;                   
>+        aSpecBuffer->font = xftFont;
>+        PRUint32 c = *str;
>+
>+        // Convert any surrogate paris into a single ucs4 character
>+        // NO unpaired surrogate point should reach here !!
>+        NS_ASSERTION(! IS_LOW_SURROGATE(c), "an unpaired low surrogate."); 
>+        if (IS_HIGH_SURROGATE(c)) {  
>+            NS_ASSERTION(str + 1 < end && IS_LOW_SURROGATE(str[1]),
>+                         "an unpaired high surrogate.");
>+            c = SURROGATE_TO_UCS4(c,str[1]);
>+        }
>+        aSpecBuffer->glyph = XftCharIndex(GDK_DISPLAY(), xftFont, c);
>+                                                               
>+        /* check to see if this glyph is non-empty */ 
>+        if (!*aFoundGlyph) {                           
>+            XGlyphInfo info;                        
>+            XftGlyphExtents(GDK_DISPLAY(), xftFont, 
>+                            &(aSpecBuffer->glyph), 1, &info);
>+            if (info.width && info.height)                  
>+              *aFoundGlyph = PR_TRUE;                      
>+        }                                                 
>+
>+        if (aSpacing) {
>+            *aXoffset += *aSpacing;
>+            aSpacing += (IS_NON_BMP(c) ? 2 : 1);
>+        }
>+        else if (IS_NON_BMP(c)) {
>+            *aXoffset +=
>+                NSToCoordRound(aFont->GetWidth16(str, 2) * P2T);
>+        }
>+        else 
>+            *aXoffset +=
>+                NSToCoordRound(aFont->GetWidth16(str, 1) * P2T);
>+
>+        ++aSpecBuffer;
>+        str += (IS_NON_BMP(c) ? 2 : 1);
>+    }                                                          
>+    return NS_OK;
>+}
>Index: gfx/src/gtk/nsFontMetricsXft.h
>===================================================================
>RCS file: /cvsroot/mozilla/gfx/src/gtk/nsFontMetricsXft.h,v
>retrieving revision 1.4
>diff -u -r1.4 nsFontMetricsXft.h
>--- gfx/src/gtk/nsFontMetricsXft.h	3 Mar 2003 15:26:57 -0000	1.4
>+++ gfx/src/gtk/nsFontMetricsXft.h	4 Mar 2003 15:45:13 -0000
>@@ -42,6 +42,7 @@
> #include "nsIAtom.h"
> #include "nsString.h"
> #include "nsVoidArray.h"
>+#include "nsValueArray.h"
> #include "nsIFontMetricsGTK.h"
> 
> #include <X11/Xlib.h>
>@@ -50,6 +51,12 @@
> 
> class nsFontXft;
> 
>+enum eFontType {
>+     eFontTypeUnicode,
>+     eFontTypeCustom,
>+     eFontTypeCustomWide
>+};
>+
> class nsFontMetricsXft : public nsIFontMetricsGTK
> {
> public:
>@@ -242,6 +249,8 @@
>     nsCAutoString        mDefaultFont;
> 
>     nsVoidArray          mLoadedFonts;
>+
>+    nsValueArray         mFontIsCustomEncoded;
> 
>     // Xft-related items
>     nsFontXft           *mWesternFont;
I'm terribly sorry for making the long list of comments even longer with
my misuse of 'Edit as comment' (I thought it would add only a diff...)

> BTW, there's still a mystery of which the cause I can't figure out
> although it might have to do with CCMAP_HAS_CHAR and plane 2 and higher

 Using CCMAP_HAS_CHAR_EXT in place of CCMAP_HAS_CHAR
in |nsFontXftCustom::HasChar| solved the problem.

As for converting to UCS4, it'd simplify things somehow and it'd not
complicate custom-encoded font support much (|nsConvertUnicodeToCustom|
has to be changed a bit).
Just to post an update here...

I've got code in my local tree that converts all the incoming UCS2 into UCS4 and
have moved all of the text measurement to use glyphs instead of characters. 
There doesn't seem to be a noticable impact on performance in either direction
yet from my measurements.

I've also managed to eliminate the use of |XftTextExtentsUtf16| which means that
I can delete that helper function.

Next to move everything, including text measurement, to using glyphs instead of
characters.

 nsFontMetricsXft.cpp |  270 +++++++++++++++++++++++----------------------------
nsFontMetricsXft.h   |    2
 2 files changed, 125 insertions(+), 147 deletions(-)

So far it's a net loss of code, which I'm pretty happy about.  I'll work more on
this this afternoon.
Thanks for the update.

>converts all the incoming UCS2 into UCS4

You meant UTF-16 instead of UCS2, didn't you? Did you use
mozilla uconv or roll out your own? 

> to eliminate the use of |XftTextExtentsUtf16|

Yup, with everything converted to UCS4, we can use 
|Xft....32|

> it's a net loss of code, which I'm pretty happy about.

  That's good news especially considering it's likely that
it comes with no noticable performance loss. How about
the net change in the memory footprint? Changing PRUnichar
to PRUint32 in some places may increase the footprint,
which may be  kinda cancelled out  by the net loss of code 
you wrote about. 
Yeah, sorry - I meant UTF-16.  I'm converting incoming characters, including
surrogates, from the incoming string into a string of ucs4 characters.

As for memory footprint, I'm not sure.  We'll have to see.  I do have to add a
couple more static buffers to have an area where you can do these translations
and avoid doing huge allocations for every call to any text measurement or
drawing functions.  I think this includes adding another 6k of static buffers,
but that number can be tuned a bit.  Also, if we have decent allocators on the
system it might be worth it just to allocate for every pass, but it's hard to
say without real numbers.

Anyway, I'm seeing if this makes sense because if it works and the tradeoff is
very small it's worth it to get rid of the huge code complexity and repeated
patterns we're dealing with right now.  Once I get a pure-glyph system up and
running, I'll do some more measurements and post an interem patch here so you
guys can have a look.
Summary: enabling font-spefic 'hacks' (for hack-encoded fonts) in Xft-enabled Mozilla → enabling font-spefic 'hacks' (for hack-encoded fonts) in Xft-enabled Mozilla (mathml on xft)
Attached patch patch (obsolete) — Splinter Review
Checkpointing changes.	Just saving this here before I go in and hammer the
ucs4 -> glyph lookup code.
@@ -807,11 +784,12 @@
         }
         else if (IS_NON_BMP(c)) {
             xOffset +=
-                NSToCoordRound(currFont->GetWidth16(&aString[i - 1], 2) * P2T);
+                NSToCoordRound(currFont->GetWidth32(&chars[i - 1], 2) * P2T);
         }
-        else 
+        else {
             xOffset +=
-                NSToCoordRound(currFont->GetWidth16(&aString[i], 1) * P2T);
+                NSToCoordRound(currFont->GetWidth32(&chars[i], 1) * P2T);
+        }

We don't need the distinction between BMP and non-BMP any more here,
do we? Instead of the above, the following line would do it:

  xoffset += NSToCoordRound(currFont->GetWidth32(&chars[i], 1) * P2T);

+            else { // Unpaired high surrogate
+                outBuffer[outLen] = c;
+            }
         }
-        else {
-            glyphs[nglyphs++] = XftCharIndex (aDisplay, aFont, wstr[i]);
+        else if (IS_LOW_SURROGATE(aString[i])) { // Unpaired low surrogate?
+            outBuffer[outLen] = c;
  

I guess it's a bit better to replace unpaired surroggate code
points with U+FFFD (replacement character). BTW, have you
tried using uconv, instead? Speedwise, it'd hurt, but
in terms of memory footprint, it'd help a bit. Besides,
I think the loss in performance may be made rather small by init'ing
the conveter once when |nsFontMetricsXft| is init'd. 

  
Attached patch patch #2 (obsolete) — Splinter Review
Checkpointing changes #2 - This patch uses an enumerator for walking the list
of found fonts.  This saves on complexity a lot and is more or less the same in
terms of lines of code and memory footprint.

It's not using glyphs exclusively yet.	It's still using XftFont + ucs4
character instead.
Attachment #116765 - Attachment is obsolete: true
> We don't need the distinction between BMP and non-BMP any more here,
> do we? Instead of the above, the following line would do it:

No, I already fixed that in the patch I just uploaded.

> I guess it's a bit better to replace unpaired surroggate code
> points with U+FFFD (replacement character). BTW, have you

Yeah, I wondered about that.  Leaving the unpaired surrogate did seem odd to me.

> tried using uconv, instead? Speedwise, it'd hurt, but
> in terms of memory footprint, it'd help a bit. Besides,
> I think the loss in performance may be made rather small by init'ing
> the conveter once when |nsFontMetricsXft| is init'd. 
> 

I thought about it, but I'm going for architecture changes here instead.  In any
case, does uconv have a UCS4 target?
> I thought about it, but I'm going for architecture changes here instead.  

  I meant using uconv converters along with architecture change.
That is, wrapping them up in |ConvertUnicharToUCS4|  Perhaps, it's not
worth bothering to use them because a simple conversion would be
sufficient here.  

> In any case, does uconv have a UCS4 target?

  If you meant converters between UTF-16(mozilla's native) and UCS4/UTF-32,
the answer is yes. 
> static XftGlyphFontSpec gFontSpecBuffer[FONT_SPEC_BUFFER_SIZE];
> +static FcChar32         gFontUCS4Buffer[FONT_SPEC_BUFFER_SIZE];

 BTW, it's not thread-safe to use two static buffers, is it? or we don't care?
How about defining a couple of helper classes for buffer alloc/dealloc? 
 
This code is only accessed from the UI thread, so it doesn't need to be thread safe.
So I haven't seen any problems with this so far.  I'm tempted to check it in.
Yeah, it looks nice with iterator. Before checking it in, don't
forget to put U+FFFD in place of unpaired surrogate codepoints.
After checking in your patch(which is not really about this bug
per se), are you gonna work on custom-encoded font support? 
Adapting my patch to the new architecture seems pretty easy. 

btw, it appears that using iterator in |nsFontMetricsGTK|
would be nice, too. 
Attached patch final patch (obsolete) — Splinter Review
This is the patch that I checked in for the sake of posterity.
Attachment #116792 - Attachment is obsolete: true
Featurewise, this is equivalent to attachement 116195 but it's now
casted in iterators and UCS4 added by Chris in attachment 117081 [details] [diff] [review].
This is still a work in progress. (I think I opened up a potential way
to support complex scripts with opentype fonts as opposed to using hacked 
truetype fonts. well, it may or may not work. too much is unknown. anyway I 
gues OT support is not for 1.4) I'm gonna also try to implement a few methods
for mathml.
Attachment #116195 - Attachment is obsolete: true
I implemented GetBoundingMetrics() (I grabbed Chris' old patch from
last spring and fitted it into the current patch). I may have gotten signs 
wrong. MathML sorta works(??), but something is definitely wrong with the 
vertical dimension.
Attachment #120369 - Attachment is obsolete: true
I got GetBoundingMetrics to work correctly as shown in bug 128153. 
Featurewise, I regard this patch as complete. There are some
code clean-ups and streamlining to do. One of them
is whether to leave nsFontXftCustom alone or split it into
nsFontXftCustomNarrow and nsFontXftCustomWide. It seems
like splitting it into two would make it a bit more simplification 
because there are some characteristics shared by CustomWide fonts
and genuine Unicode fonts but not by CustomNarrow fonts.

Anyway, could you take a look, Chris, Roger and anyone interested
in this? 
Thank you
Attachment #120405 - Attachment is obsolete: true
Target Milestone: --- → mozilla1.4beta
Comment on attachment 120561 [details] [diff] [review]
a new patch with working MathML support

Some comments:
===========================
Many #if 0 that obfuscate the code. If something is gone, you might 
as well leave it out. There is the CVS history in case some people wish
to look back at it.

===========================
My preference is for just two classes: Unicode and Custom, 
with |mIsWide| as a member variable of Custom. Doing so
will yield a code that is much easier to read.

===========================
|FillSpecBuffer|. A better name is probably |FillDrawStringSpec|
since it is only meant to be used by |DrawString|.

===========================
Now let's look at the details of this |FillSpecBuffer| :

+nsresult
+nsFontXftCustom::FillSpecBuffer(const FcChar32* aString, PRUint32 aLen,
+				 void* aData)
+{
+    DrawStringData *data = (DrawStringData *)aData;
+
+    nsresult rv = NS_OK;
+    nsAutoBuffer buffer;
+    PRUint32 destLen = aLen;
+    PRBool isWide = (mFontInfo->mFontType == eFontTypeCustomWide); 
+
+    rv = ConvertUCS4ToCustom(aString, aLen, destLen, mFontInfo->mConverter, 
+			      isWide, buffer);
+    NS_ENSURE_SUCCESS(rv, rv);
+
+    if (!mXftFont)
+	 GetXftFont();
+      
+    if (!isWide) {
+	 // For some narrow fonts(Mathematica, Symbol, and MTExtra),  
+	 // what we get back  after the conversion  is in the encoding 
+	 // of a specific  FT_Charmap (Apple Roman) instead of Unicode 
+	 // so that we	can't call XftCharIndex() which regards input 
+	 // as a Unicode codepoint. Instead, we have to directly invoke 
+	 // FT_Get_Char_Index() with FT_Face corresponding to XftFont 
+	 // after setting FT_Charmap to the cmap of our choice(Apple Roman).
+	 rv = SetFT_FaceCharmap(); 

I don't understand why you are special-casing. Note that the glyphs exist
in _duplicate_ sets in the Mathematica fonts. There are glyphs at U+F000
and the _same_ glyphs at U+0100. That's why there is nothing special
about them in GfxWin. We just use the narrow 'A' functions.

+	 NS_ENSURE_SUCCESS(rv, rv);
+    }
+    // The string after the conversion can be longer than the original.
+    if (destLen > aLen && 
+	 data->specBufferLen + destLen > data->specBufferSize) {
+	 data->specBuffer = ReallocSpecBuffer(data->specBufferSize, 
+			     data->specBufferSize + (destLen - aLen) * 2,
+			     data->specBuffer);
+	 if (!data->specBuffer)
+	     return NS_ERROR_OUT_OF_MEMORY;
+	 data->specBufferSize += (destLen - aLen) * 2;
+    }
+
+    FcChar32 *str = NS_STATIC_CAST(FcChar32 *, buffer.GetArray());
+
===========================
+    // XXX : this is identical to the loop in nsFontXft::FillSpecBuffer
+    // except for FT_Get_Char_Index in place of XftCharIndex. 
+    // Consider factoring out using |CharToGlyphIndex|.
+    // |CharsToGlyphIndices| might be better !

It looks to me that all you need here is a nsFontXft::CharToGlyphIndex()
which is overriden by the derived class (like you did with nsFontXft::HasChar),
and re-write a single fill spec stuff around that helper. 

+    if (!isWide) {
+	 for (FcChar32 *pstr=str; pstr < str + destLen; ++pstr) {
+	     data->specBuffer[data->specBufferLen].font = mXftFont;
+	     FT_UInt glyph = FT_Get_Char_Index (mFT_Face, *pstr);
+	     data->specBuffer[data->specBufferLen].glyph = glyph;
+    
+	     /* check to see if this glyph is non-empty */ 
+	     if (!data->foundGlyph) {
+		 XGlyphInfo info;
+		 XftGlyphExtents(GDK_DISPLAY(), mXftFont, &glyph, 1, &info);    
+		 if (info.width && info.height) 		 
+		     data->foundGlyph = PR_TRUE;		      
+	     }
+    
+	     nscoord x = data->x + data->xOffset;
+	     nscoord y = data->y; 
+    
+	     // Convert to device coordinate
+	     data->context->GetTranMatrix()->TransformCoord(&x, &y);
+    
+	     data->specBuffer[data->specBufferLen].x = x;
+	     data->specBuffer[data->specBufferLen].y = y;
+    
+	     XGlyphInfo info;
+	     XftGlyphExtents(GDK_DISPLAY(), mXftFont, &glyph, 1, &info);
+	     data->xOffset += NSToCoordRound(info.xOff * data->p2t);
+	     ++data->specBufferLen;
+	 }
+    }
+    else {
+	 rv = nsFontXft::FillSpecBuffer(str, destLen, aData);
+    }
+
+    return rv;
+}

===========================
+    nsAutoBuffer utf16Buffer;
+    PRUnichar* utf16Src  = NS_STATIC_CAST(PRUnichar*,
utf16Buffer.GetArray(aSrcLen * 4));

Now that you have the auto array, I have been wondering why not 
you didn't capitatize on it with the other spec allocs, etc. E.g.,
specBuffer.GetArray(aSrcLen * sizeof(...)));
Doable?

===========================
+    // See if a font has a custom/hacked encoding by matching

I don't see why you use the 'hacked' terminology. By definition,
not all 'glyphs' will ever be referenced in Unicode (since it is a 
'character' system). For special things, e.g., ink signature, or 
other various symbols, font designers have no alternative other than 
to use a private encoding. Simply use 'custom/private encoding'.

===========================
+    if (NS_SUCCEEDED(GetEncoding(family, encoding)) &&
+	 NS_SUCCEEDED(GetConverter(encoding, getter_AddRefs(converter)))) {
+	 nsCOMPtr<nsICharRepresentable> mapper(do_QueryInterface(converter));
+	 if (mapper) {
+	     ccmap = MapperToCCMap(mapper);
+	     if (encoding.Length() > 5 && 
+		 Substring(encoding, encoding.Length() - 5, 5) == 
+		 NS_LITERAL_STRING(".wide")) 

Several parsings which can be avoided if you had GetEncoding(, , &isWide)

===========================
+	     info->mCCMap =  ccmap;  // will be freed by ~nsFontXftInfo().

What about the maps that are shared?
Attachment #120561 - Attachment is obsolete: true
Thanks a lot for review and comment.

> I don't understand why you are special-casing. Note that the glyphs exist
> in _duplicate_ sets in the Mathematica fonts. There are glyphs at U+F000
> and the _same_ glyphs at U+0100. That's why there is nothing special
> about them in GfxWin. We just use the narrow 'A' functions.

  Windows narrow 'A' function seems to work roughly at the level of 
FT_* API in this particular case while XftCharIndex() works at a higher 
level. 

  You can trust me here :-). I spent a lot of hours figuring out
this with Keith's help last November. (see comment #36, comment #37, 
comment #38, comment #39). XftCharIndex() expects input to be in Unicode 
and converts input to the encoding of the cmap "selected". Let me take an 
example of Math1. Math1 Mozilla converter maps U+2122 (trade mark sign) to 0xD4.
Math1 font has two cmaps, Unicode cmap(PID=3, EID=1) and Apple Roman cmap
(PID=1, EID=0) as you know well. The glyph ID of the glyph for TM sign in the
font is
0xB7. Now the Unicode cmap maps character code U+F0D4 to the gid 0xB7
while Apple Roman cmap maps character code 0xD4 to 0xB7. Because the Mozilla
math1 converter is a single byte converter and gives us 0xD4 for U+2122,
we have to use Apple Roman cmap. Why can't we use XftCharIndex() even then?
That's because XftCharIndex unifies all the cmaps present in
a given font and does its own conversion to the encoding of the
"current" cmap  regarding input as Unicode codepoint. That is, 0xD4 is regarded 
as U+00D4(instead of 0xD4 in Apple Roman) and converts it to 0xEF 
(U+00D4 in Apple Roman is 0xEF.) if Apple Roman is 'the current' cmap.
This 0xEF is fed to FT_Get_Char_Index 
instead of 0xD4 and we get back GID 0xd2 instead of GID 0xb7.
To bypass the conversion by XftCharIndex, we have to directly call 
FT_Get_Char_Index.

You may ask why not then "lock" on Unicode cmap(, which we can't with Xft
API because it only exposes 'the unified' cmap instead of individual
cmaps as in FT API.) and call XftCharIndex after
adding 0xF000 to the value returned by Mozilla math1 converter. That would work
for math[1-5] fonts. However, that doesn't work for TeX Computer Modern fonts.
Using "Unicode cmap"(even though it's a completely "bogus") is fine for them, 
but then we  should not add 0xF000 for TeX fonts. Symbol font is another story
(it does not have Unicode cmap but has Apple Roman and MS Symbol cmap of which
we need to use Apple Roman cmap) and I forgot about Mathtype font. Perhaps, 
it's possible to use XftCharIndex by treating all 4 categories separately, but
that's 
as much special-casing as is currently done, isn't it? 

> It looks to me that all you need here is a nsFontXft::CharToGlyphIndex()
> which is overriden by the derived class (like you did with nsFontXft::HasChar),
> and re-write a single fill spec stuff around that helper. 

  Yes, that was behind my comment about it. What about my comment
on CharsToGlyphIndices()? That's because I was worried that everytime 
nsFontXftCustom::CharToGlyphIndex() is called (which is very often), 
we need to check mIsWide(or its equivalent) to decide which to use,XftCharIndex 
or FT_Get_Char_Index, which was one of a few reasons I was pondering
over splitting ns..FontCustom into two.
I thought using CharsToGlyphIndices() would reduce the number 
of that checking/branching. Anyway, I have something similar implemented 
and I'll upload my new patch after addressing your other concerns(renaming
FillSpecBuffer,
changing comment, GetEncoding(....&aIsWide)...)

> My preference is for just two classes: Unicode and Custom, with |mIsWide| as a 
> member variable of Custom.

 Yeah, I settled on using just two classes. BTW, what's equivalent to
|mIsWide| is stored in |mFontInfo->mFontType|. Do you think it's all right
(or desirable) to replicate that in |mIsWide| in ctor ? 

> I have been wondering why not you didn't capitatize on it with the other spec
allocs

I wasn't sure whether that would be a win speedwise compared with inlined
AllocSpecBuffer,
and Free*Buffer. 

> font designers have no alternative other than 
> to use a private encoding. Simply use 'custom/private encoding'.

  I'll change the comment as you suggested. However, I still think
they're "hacked" fonts for the following reason. They can use Unicode  
PUA codepoints by NOT violating any standard. What's done in Math* fonts 
with Apple Roman cmap for characters not representable in Apple Roman 
character set (it's fine for them to use U+F000 codeblock in Unicode cmap 
because it's in PUA) and in TeX TTF with Unicode cmap are certainly a 
hackery. That's probably a legacy of by-gone era when Win95/98
were widely used without 'W' API 'emulation' so that most programs had
to use Win32 'A' API instead of 'W' API.
Even then, I still don't understand why they put   a "bogus" Unicode cmap
with only characters below U+0100 (in addition to Apple Roman cmap) in 
TeX CM TTFs. Using 'Apple Roman' cmap for single byte 'custom/private
encoding' would have been sufficient as is done in Mathematica font. 

+	     info->mCCMap =  ccmap;  // will be freed by ~nsFontXftInfo().

> What about the maps that are shared?

Currently, sharing is incomplete in that two fonts with different family names
but the same CCMap don't share the map. In any case, |FreeCCMap| checks 'nullness'
before freeing it and nsFontXft instances stored in |gFontXftMaps| hash live as
long as 
|nsFontMetricsXft| of which dtor deletes |gFontXftMaps|. During the destruction
of gFontXftMaps(nsHashTable), I believe nsFontXft's stored get deleted as well.
Hmmm. they  
can't and don't. I was leaking....(although it might not matter because
presumably Mozilla
would die right after nsFontMetrics die..) I'll look into this..

As for gFontXftMaps, I've just found nsTHashtable (hash table with template). I'll
use one of its derivatives.
What happens when you do (for all 'narrow' fonts)
  FT_Select_Charmap (mFT_Face, ft_encoding_apple_roman)
  FT_Get_Char_Index()

Doesn't that give the intended GID in all cases?

[Note: the |if (mIsWide)| is immaterial. It is noise even compared to just the
context-switch needed to call the embedding function. Beware of premature
optimization...]
> What happens when you do (for all 'narrow' fonts)
>  FT_Select_Charmap (mFT_Face, ft_encoding_apple_roman)
>  FT_Get_Char_Index()

> Doesn't that give the intended GID in all cases?

  For truetype TeX computer modern fonts, we have to use Unicode cmap
(ft_encoding_unicode) because that's what Mozilla MathML converters for
TeX fonts return. Are you worried about this? I'm afraid we can't
avoid it. 

+        FT_Encoding encoding;
+        if (nsCRT::strncasecmp(mFT_Face->family_name, "cm", 2))
+            encoding = ft_encoding_apple_roman; // Mathematica, Symbol fonts 
+        else 
+            encoding = ft_encoding_unicode; // TeX Computer Modern fonts
 
BTW, thank you for the note about premature optimization. Now I'm considering
reversing changes(related to char->gid conversion)  I made. 
Echoing what rbs said, and I was about to, don't over-optimize.  If it's a perf
problem we'll deal with it later but code simplicity here is pretty important. 
I set that code up so that you could do all of your mathml and hacked font
conversions in one place - where characters are mapped to glyphs.

That being said iirc, I was still using a unicode char to indicate the glyph so
that interface might need to be changed to a font/glyph pair.  I'm not afraid of
doing so, if you want.
Attached patch a new patch (obsolete) — Splinter Review
Changes from the last patch:
									       
  
  - uses virtual CharToGlyphIndex to remove code duplication in
    |FillDrawStringSpec|
									       
  
  - uses in-place replacement wherever possible. For instance,
    conversion to UTF-16 from UCS-4 in ConvertUCS4ToCustom  and
    converstion to glyph indices in nsFontXftCustom::GetTextExtents32()
    replace the current buffer in place.
									       
  
  - switched from nsHashtable to a low level PLHash
    (copied from nsFontMetricsWin.cpp)
    to avoid the leak at the end of nsFontMetricXft's life. nsClassHashtable
    would be very nice if we want to avoid adding all these PLHash handling
    stuffs, but I haven't managed to compile nsClassHashtable.h(yes, I couldn't

    compile the header file ! It's brand new and nobody has deployed it
    in Mozilla code, yet.)  with g++ 3.2.
									       
  
  - switched to nsAutoBuffer for all char-like buffer handling so that
    static gUCS4Buffer[] and FreeUCS4Buffer are gone. gFontSpecBuffer[]
    and (Alloc|Realloc|Free)SpecBuffer	are kept because of the following:
									       
  
    1. the size of XftGlyphFontSpec is three times(or four times on 64bit
     platform) as large as the largest char-like type(UCS4) so that "sharing"
     (of course, it's not sharing...)
     the same static buffer (3000 bytes) would effectively reduce its capacity
     down to  250 elements. (in practice, 250 might or might not be
     large enough...)
    2. The current approach of using the global static buffer and
       Alloc/Free (Realloc added by me) seems simple enough except that
       it has to be freed explicitly.
									       
  
    If I can use templates as in nsFontMetricsWin.cpp, the conclusion could be
    different, but I can't. Alternatives include increasing the default
    buffer size of nsAutoBuffer to 10kB or so and defining a separate
    nsAutoSpecBuffer. The former has a drawback of the momentary memory
    footprint increase of 7kB times # of autobuffers opened simultaneously.
									       
  
  - TextDimensionCallback, nsFontXft::GetWidth32, and
    nsFontXft::GetBoundingMetrics32 invoke nsFontXft::GetTextExtents32
    that in turn calls Xft(Text|Glyph)Extents after the conversion to
    font custom encoding if necessary.
									       
  
  - there are other miscellaneous changes and clean-up
Attached patch alternative (obsolete) — Splinter Review
This is alternative to attachment 120800 [details] [diff] [review]. I made this one before attachment
120800 [details] [diff] [review], but decided to go with attachment 120800 [details] [diff] [review]. In light of Chris' 
comment, I thought this might be of interest to him and rbs.
 
I didn't put UCS4 to GID's in a single function, but I could do
that rather easily by making ConvertUCS4ToCustom a member function
of nsFontXftCustom and putting (currently) separate GetGlyphsWithFT
and GetGlyphWithXft into the new (renamed : 
e.g. ConvertUCS4ToGlyphIndices) member function.

Anyway, with this change, I also had to preserve  |non-BMPness|
which is lost when chars converted to glyph indices. There are two
ways I came up with. One is to encode non-BMPness in one of unused
high bit in glyph indices (even an opentype font can have at most
64k glyphs although  they can be used for non-BMP characters)
and check it later when filling up specbuffer to decide whether
or not to skip aSpacing. The other is to 'align' aSpacing with UCS-4 
string and copy the result to a new buffer. The former is more complex
than the latter in that with the latter approach, we don't have
to check IS_NON_BMP down the road. On the other hand,
the latter makes the calling sequence for ConvertUnicharToUCS4 a 
bit more complex and requires a new buffer(most of time  no heap
allocation should be required.)  This 'preserving' of non-BMPness 
done in ConvertUnicharToUCS4 in this patch might as well be 
done in ConvertUCS4ToGlyphIndices although I prefer the current patch.

BTW, this brings up an issue I've been	ignoring so far. 
Unlike Pango, there's no easy way to move along 'attributes' of characters
(in this case 'aSpacing') when characters in a given string
undergo metamorphosis(conversion to custom font code,
conversion to glyph indices, etc) and the length is not preserved,
which is common in complex script processing. 

Anyway, if you like this patch (with changes I mentioned here)
better than attachment 120800 [details] [diff] [review], I'll make a new patch with
changes. Otherwise, could you review attachment 120800 [details] [diff] [review]?
> undergo metamorphosis(conversion to custom font code,
> conversion to glyph indices, etc) and the length is not preserved,

  I should have added  'or glyphs/characters are reordered'. Anyway, this can be
another bug and we don't have to deal with it here. 
Comment on attachment 120800 [details] [diff] [review]
a new patch

=================
+nsresult
+nsFontMetricsXft::TextDimensionsCallback(const FcChar32 *aString, PRUint32
aLen,
+					  nsFontXft *aFont, void *aData)

+    data->dimensions->width += glyphInfo.xOff;
+    if (data->dimensions->ascent < glyphInfo.y)
+	 data->dimensions->ascent = glyphInfo.y;
+    if (data->dimensions->descent < glyphInfo.height - glyphInfo.y)
+	 data->dimensions->descent = glyphInfo.height - glyphInfo.y;

-    if (data->dimensions->ascent < tmpMaxAscent)
-	 data->dimensions->ascent = tmpMaxAscent;
-    if (data->dimensions->descent < tmpMaxDescent)
-	 data->dimensions->descent = tmpMaxDescent;
+    return NS_OK;
[...]
+nsFontXft::GetBoundingMetrics32(const FcChar32*    aString,
+				 PRUint32	    aLength,
+				 nsBoundingMetrics& aBoundingMetrics)
+{
+    aBoundingMetrics.Clear ();
+
+    if (aString && aLength) {
+	 XGlyphInfo glyphInfo;
+	 GetTextExtents32 (aString, aLength, glyphInfo);
+	 aBoundingMetrics.leftBearing = - glyphInfo.x;
+	 aBoundingMetrics.rightBearing = glyphInfo.width - glyphInfo.x;
+	 aBoundingMetrics.ascent = glyphInfo.y;
+	 aBoundingMetrics.descent = glyphInfo.height - glyphInfo.y;
+	 aBoundingMetrics.width = glyphInfo.xOff;
+    }
+    return NS_OK;
 }

You are returning the same ascent/descent in both 
GetTextDimensions and GetBoundingMetrics. They are not the same thing.
The previous code of GetTextDimensions was correct. (Things might
escape your testing if you don't have a mixture of characters from
different languages on the same text run).

==================
To avoid that hard special-casing (which I am having some misgivings
about), you might use another property:

encoding.cmex10.FT_Select_Charmap = ft_encoding_unicode
encoding.math1.FT_Select_Charmap = ft_encoding_apple_roman

then stash FT_Encoding mFT_Encoding in the fontInfo struct.

At least this way, you don't have to hardwire the special-casing
and you can move the comments to that effect in the property file.

=====================
Apart from these, the patch looks good.
Sorry for spamming. This one should replace attachment 120800 [details] [diff] [review]. By mistake,
I uploaded the second newest patch rather than the most recent patch.
attachment 120800 [details] [diff] [review] refered to in my comment for attachment 120803 [details] [diff] [review]
should be understood as refering to this patch, instead. 

BTW, I added a patch to Makefile.in as well to install fontEncoding.properties
file in gfx/src/windows directory.
Attachment #120800 - Attachment is obsolete: true
What/Where is the different chunk of code between the two? It takes a toll to
review that much code in succession.
Thank you for your prompt review and sorry for the confusion. I realized that
the difference between
attachment 120800 [details] [diff] [review] and attachment 120805 [details] [diff] [review] is very minor (I should have just explained
it instead of uploading it). Several changes in comment and one fix of typo
(PRUint -> PRUint32).
The only other actual code change occurred about 130 lines from the end. I just
removed the 
following unnecessary lines in |GetFontXftInfo| and moved |nsFontXftInfo* info|
just 
before it's used. 

-+    info = new nsFontXftInfo;
-+    if (!info)
-+        return nsnull;

I should've asked about TextDimension before changing it.
I wondered about it, but later forgot that.
                                                                                  
> encoding.cmex10.FT_Select_Charmap = ft_encoding_unicode
> encoding.math1.FT_Select_Charmap = ft_encoding_apple_roman
                                                                                  
> then stash FT_Encoding mFT_Encoding in the fontInfo struct.
                                                                                  
This looks like a nice idea. Do you think it's worth making it generic
using FT_ENC_TAG (see
http://freetype.sourceforge.net/freetype2/docs/reference/ft2-base_interface.html#FT_Encoding)?
I can read in the property value(such as 'symb(ol)', 'unic(ode)') and convert it to
C string and take the only first 4 letters to get FT_Encoding, which I can later
hand over to FT_Select_Charmap.
Doing so might make it easier (not that it's very hard as it is now)  to fix
bug 179946.
~

P.S. I just noticed that nsAutoBuffer has unused Realloc method and I'll remove it.
> Do you think it's worth making it generic using FT_ENC_TAG

Seems that it will add extra levels of indirection that don't buy us much, but
instead make things unnecessarily cryptic. I wouldn't bother with such extra
level of indirections that resolve to the same thing. Whereas with
|encoding.cmex10.FT_Select_Charmap = ft_encoding_unicode|, it is immediately
clear to the reader (now or a few months later...) that the intention is to
instruct FreeType to select the specified cmap when processing that font.
Why do we need to preserve the BMP status of a particular character if it's
already been converted to a glyph?
In comment #81, I think, I explained why. In short, we need to preserve it
because of
the following code:

        if (data->spacing) {
            data->xOffset += *data->spacing;
            data->spacing += IS_NON_BMP(*pstr) ? 2 : 1;
        }
        
Attached patch a new patch (obsolete) — Splinter Review
This is the latest. Not much has changed from attachment 120805 [details] [diff] [review]
other than fixing TextDimensionCallback
and putting FTCharmap list in fontEncoding.properties file.
Makefile.in and fontEncoding.properites file are included in the patch.
Attachment #120805 - Attachment is obsolete: true
Although some of the following have been mentioned before, it might be
handy to list  test cases here:

1.
MathML font encoding test:
   http://www.mozilla.org/projects/mathml/fonts/encoding/
                                                                                
2. Bound. Metrics test case
http://bugzilla.mozilla.org/attachment.cgi?id=72918
                                                                                
3. Math ML torture page
http://www.mozilla.org/projects/mathml/demo/texvsmml.xml
                                                                                
4. Old Korean (needs patch for bug 176315)
http://jshin.net/i18n/korean/hunmin-og.html
                                                                                
5. Plane 1 and 2. 4digit and 6digit unknown glyph boxes
http://jshin.net/i18n/utftest/bom_utf32be.html
                                                                                
6. Plane 1 and RTL plane 1
http://www.i18nguy.com/unicode/plane1-utf-16.html

5 and 6 are not directly related with custom font-encoding
support, but need to be checked because my patch changes
some code related with non-BMP char. support. 

Anyway, Chris, it'd be nice if you can go through my latest patch(or
alternative which is less refined) and
comment on it. I feel pretty satisfied with it, but there may be something
I missed.  It's been pretty long and let's hope that we can be done with
this soon :-)

BTW, in the latest patch, I forgot to replace x-mathematica-5 with
apple-roman for encoding.math5*.ftcmap entries.

Thank you in advance :-)
Comment on attachment 120830 [details] [diff] [review]
a new patch

sr=rbs, I would have preferred
|encoding.cmex10.FT_Select_Charmap = ft_encoding_unicode|
rather than your nomenclature which looks a bit too "semi-official" for my
taste, but...

Back to blizzard if he is okay to r= that the code agrees with his plans.
Attachment #120830 - Flags: superreview+
BTW, looks like nsFontMetricsXft::FamilyExists() is testing for ASCII names and
so excludes your Korean fonts.

+# For the Korean locale, Korean family names have to be
+# listed in UTF-8.
+encoding.새굴림.ttf = x-hykoreanjamo-1.wide
+encoding.굴림옛한글자모.ttf = x-hykoreanjamo-0.wide
rbs, thank you for sr.

Now that fontEncoding.properties file for
Xft became different from that for GFXWin
I'll remove unnecessary 
Korean lines.
 
As for semi-official lines, I decided to use
them because FT_Select...* and ft-encoding-* 
seem a bit too 'mouthful' :-).Of course,
I can go, either way.

blizzard, now I'm looking forward to your 
review. Thank you! 
This looks like it almost solves the problem of Thai shaping for Xft-enabled
Mozilla.  Thai shaping was provided by the Sun CTL extension for the Xlib font
code, but with Xft there's no shaping. Proper Thai shaping requires some
context-dependent glyph substitution.  By convention in Windows Thai TTF fonts,
the additional glyphs required are mapped into part of the PUA (F700 - F71A)
(the nominal glyphs are correctly mapped in accordance with Unicode); I think
Mac Thai fonts use a different PUA range; most Thai fonts use the Windows
convention. This patch would allow proper Thai shaping to occur once appropriate
converters were written.

The problem I see is with the requirement that a user would have to add every
Thai font family for which they want proper shaping to fontEncoding.properties.
 This seems to me to negate one of the main advantages of fontconfig/Xft: a user
can simply cp a .ttf file to their fonts directory and it "just works".  The way
both Windows (Uniscribe) and Pango (on top of Xft) work is that the Thai shaping
engines know about the PUA convention for Thai fonts; they see if the unicode
cmap of the font contains entries for PUA: if it does it, they automatically
perform these substitutions.  Pango knows both the Windows and Mac convention
and automatically does the right thing.  Some fonts might not contain the PUA
chars, but no Thai font is going to use the PUA inconsistently with at least the
the Windows convention, because it would fail to work with almost all important
applications.  As far as I know, there aren't any Thai TTF fonts that use
OpenType GSUB, and there aren't any Thai aware formatters that use GSUB: the PUA
convention *the* to get proper Thai shaping.

I think Mozilla should be able to do choose a converter based on the language
group and the characters (typically PUA) available in the
fontfontEncoding.properties should be able to declare the conventions used, i.e.
it should be able to say a font for such and such a language group that contains
a particular range of PUA characters should use a particular converter. 
Alternatively this convention should be hard-coded.  At any rate, users should
NOT have to fiddle with obscure config files to get their language to display
properly with their fonts.
Yes, I had South Asian and SouthEast Asian scripts (as well as MathML and
Korean) very much in mind when I filed this bug. Thai appears to be the easiest
to handle of South and Southeast Asian scripts partly because TIS-620
and Unicode Thai range don't follow phonetic order but follow visual order
(thus, there's no need to reorder letters and simple-minded overstriking
works pretty well as proven in xterm's support of Thai with X11 bitmap fonts)
Anyway, as a test case, I may write  converters for Thai and Tamil.
Is there any good link to PUA font encoding for Thai ? I think I know
the one I mentioned in bug 127755, but having another wouldn't hurt.

BTW, have you seen http://www.microsoft.com/typography/otfntdev/thaiot/default.htm?
If you have Windows XP and a Thai OT font, Mozilla-Win *might* work
as well as MS IE6. 

Of course, the availability of opentype fonts for Thai wouldn't help
until Mozill-Xft can make use of OTFs. It seems that the best way to do it
is to take advantage of Pango somehow. See my comments in bug 186463 and
the link there to Mozilla+Graphite project). However, it'll take time and in 
the meantime, we have to rely on 'custom-fonts' as this bug tries to. 

> fontfontEncoding.properties should be able to declare the conventions used, i.e.
> it should be able to say a font for such and such a language group that contains
> a particular range of PUA characters should use a particular converter.

   This might work (but it could break for Pan-Unicode fonts like Code2000
with almost full PUA range). Anyway, let's get the patch
for this bug landed first :-) When we can finally get Mozilla 'married' to Pango, 
we will be able to delegate most chores to Pango....
I've attached a document I wrote a while back; it's written to assume no
knowledge of Thai.  The algorithm assumes valid Thai combining sequences. 
Invalid sequences should be detected and displayed using the WTT 2 rules as
described on Thep's page.

My description hasn't been reviewed by anybody, so it may have bugs.

Thanks for the pointer to the OT Thai rules.
Doing further extensions is probably another bug. But it seems to me that the
task of supporting arbitrary fonts is best left to the font-engine (Xft), with a
dynamic/extensible fontconfig, and other accessories. 

Linux/Unix has been lagging behind. As now acknowledged on Windows (or
elsewhere with extensions), it is with interoperability & cooperation between
the systems that the biggest win comes, and that's what we are trying, but not
so much as Mozilla taking over and becoming a font server.

From a Mozilla/browser perspective, all that is really important at the end is
to have some fallback glyphs so as to cover Unicode -- as a palliative to the
limitations of fonts & font-engines (often on Linux/Unix). This way, a web
reader can at least go on the web out there. This at least motivates to try
further extra if the trade-off is worth it.

But if Mozilla (or other clients) could just pass the Unicode Thai string, and
Xft does the magic-mapping for the rendering/measuring, that would be a win to
all clients in the long run. Xft people: hear, hear.

And between hard-coding or putting in a file, be it obscure, the latter doesn't
seem that bad on further thougths since it is at least configurable.
> Doing further extensions is probably another bug. But it seems to me that the
> task of supporting arbitrary fonts is best left to the font-engine (Xft), with a
> dynamic/extensible fontconfig, and other accessories. 
> 

We should probably be delegating to pango for this kind of thing.  It's written,
well tested and it works.  It also doesn't have to handle the actual rendering,
if you don't want.  It's device independent (think printing.)
> Doing further extensions is probably  another bug.
                                                                                   
  I agree. Writing converters for custom font encodings of SE/S. Asian
scripts doesn't belong to this bug just like bug 176315 is a separate
bug. Neither does automatic font discovery or something similar even if
we decide to do it.
                                                                                   
> as a palliative to the  limitations of fonts &  font-engines (often
> on Linux/Unix).
                                                                                   
  Actually, this is not a fair descrpition of the rendering support in
Linux/Unix in 2003. I would have agreed with you if you had told me that
in 2001. freetype2/fontconfig/Xft/Pango changed everything in a single
stroke. They're  the best thing that has ever come to Unix/X11 in the last
10 years.  What's missing is the link between Mozilla and Pango. It might
be possible to implement most methods of nsIFontMetrics as rather simple
wrappers over Pango APIs because Pango can do a lot more than Uniscribe
and OTLS combined under Win32.  As for Mozilla-Win, it's still unclear to
me as to what magic ExtTextOutW can do. Does it invoke Uniscribe and OTLS
on its own without callers of ExtTextOutW doing anything special? It's
not likely because Mozilla doesn't seem to take advantage of opentype
fonts while on the same system MS IE6 can do. The rendering by MS IE6 and
that by Mozilla-Win are different for complex scripts like Indic scripts

and Thai.  If we (who don't know those scripts) don't take a closer look,
it's easy to miss the difference although to native speakers, it's all
but impossible to miss that huge difference just at a glance.
                                                                                   
                                                                                   
> But if Mozilla (or other clients) could just pass the Unicode Thai
> string, and  Xft does the magic-mapping for the rendering/measuring, that
> would be a win to  all clients in the long run. Xft people: hear, hear.
                                                                                   
  That's not what Xft is about :-) That's Pango's job and it does it
wonderfully well.  Basically, if you throw a random Unicode string at it,
it'll 'analyze' the string to split it into as many pieces as necessary
(per lang/font/shaping engine coverage), search for fonts for each piece
and measure/draw them. Well, you have to call 'analyzer' separately before
invoking shaper. A potential competitor to freetype2/fontconfig/Xft/Pango
is Sun's Standard Type Service Framework (STSF).
See bug 203052 for Thai converter
that can be used along with the patch
for this.
Owen has said that pango is kind of slow, which is why I didn't go directly from
the old core code to pango.  Pango relies on fontconfig's matching these days,
so we do the same matching right now.  However, for complex scripts, pango is
the way to go, that's for sure.
>  Actually, this is not a fair descrpition of the rendering support in
>Linux/Unix in 2003.

I recognize that. And that's why I said that it was lagging behind (until those
fresh developments and the next challenge is to catch up with OpenType -- to be
done by fonts themselves as well as applications). But we are getting off track.
Better concentrate on the checkin of this little gem.
Blocks: 203052
Blocks: 204039
I would like this feature to be available in Mozilla 1.4. Then I can recommend
people create tamil webpages in unicode format. Users using Linux can use one of
the many tscii encoded fonts to view the webpages until mozilla has full unicode
support for tamil.

Manmathan
Blocks: 204286
Blocks: 204439
blizzard, can you review and approve the patch (attachment 120830 [details] [diff] [review]) for 1.4f
branch as well as for the trunk? A lot of people (MathML, Thai, Devanagari,
Tamil and Korean) want this feature in 1.4f and they have been waiting pretty
long. Thank you
Summary: enabling font-spefic 'hacks' (for hack-encoded fonts) in Xft-enabled Mozilla (mathml on xft) → enabling custom font encoding support in Xft-enabled Mozilla (mathml on xft)
Target Milestone: mozilla1.4beta → mozilla1.4final
If its any help I've been using this patch now for about a month as per comment
62 in bug 128153 ("need to get MathML fonts working with Xft"). I havent noticed
any other problems as a result ...
THis ia a patch against the trunk but virtually nothing has changed in the
actual content. 

With the patch for bug 206811 checked in, I need a little update. The function
signature of GetEncoding() was a bit simplified because we don't use nsIAtom
any more for charset. manager. I also added Hindi, Thai and Tamil font entries
to fontEncoding.properties file (bug 204039, bug 203052, bug 204286) and
removed the unnecessary entry for Korean font (with Korean name). It now uses
'FT_Select....' as suggested by rbs. BTW, I added a few PR_LOG lines for custom
encoded  fonts.

Summing up, attachment 120830 [details] [diff] [review] is for 1.4 branch and this is for the trunk.
Comment on attachment 125513 [details] [diff] [review]
a new patch (basically the same) against the trunk

assuming rbs's review still stands, I'm now asking blizzard for review. 

BTW, it's not bug 206811 but bug 206379 that made it necessary to update the
patch.
Attachment #125513 - Flags: review?(blizzard)
Comment on attachment 125513 [details] [diff] [review]
a new patch (basically the same) against the trunk

sr=rbs

blizzard, please help with the drive, all interested folks are waiting on you
to lead now, thanks.

jshin, I had second thoughts about the naming after actually seeing how it
looks like in the patch.  The long /.FT_Select.../ hurts the eyes. Your earlier
/.ftcmap = unicode | apple-roman/
is more appealing, please revert when you check in...
Attachment #125513 - Flags: superreview+
...ring... ...ring... ...ring... 

Chris, please pickup, know you're there...
Yeah, sorry.  All kinds-of-busy.
OK, I've looked through this patch.  Here are my first set of comments from a
high level:

We're not keeping a clean seperation between measuring glyphs and picking fonts
and converting character points into the proper glyphs.  I'd like to try to
maintain that it we can.  I realize that the current code doesn't do a good job
of this at the moment since I pick the font in EnumerateGlyphs.

It seems that the root of the problem is that the way that what converter we
choose depends on the font that is picked to render the character.  Because of
this, I suggest a two-pass system where a full UTF-16 -> glyph conversion is
done, then the operation in question is run on the resulting glyphs.  Hmm, is
this not possible because of the calls to FT_Select_Charmap()?  I think I need
to read and think about this a bit more.
Thanks forlooking at it.

> I suggest a two-pass system where a full UTF-16 -> glyph conversion is
> done, then the operation in question is run on the resulting glyphs.

  Is this along the same line as your comment #79? Please, see my responses to
that (in comment #81, comment #82, comment #89 and attachment 120803 [details] [diff] [review]). 
I did some thinking about how to solve the problems I've brought up and I don't
think they are serious enough to warrant not taking this patch.  I'd like to
revisit some of the issues, but not right now.  This has been sitting long
enough on my watch.

I've got a copy of this patch in my tree that has a bunch of style cleanups. 
I'm just going to check it in instead of doing another round trip.
attachment 120830 [details] [diff] [review] has an old fontEncoding.properties file. This one
has to be used, instead, for _both_ 1.4branch (attachment 120830 [details] [diff] [review]) 
and trunk (attachment 125513 [details] [diff] [review])

In addition, the following change has to be made to attchment 125513:

>+	      nsAutoString ftCharMap; 
>+	      nsresult rv = gFontEncodingProperties->GetStringProperty(
>+			    Substring(name, 0, name.Length() - 4) + 
>+			    NS_LITERAL_CSTRING(".FT_Select_Charmap"),
ftCharMap);

			    NS_LITERAL_CSTRING(".ftcmap"), ftCharMap);

>+	    
>+	      if (NS_FAILED(rv)) 
>+		  aFTEncoding = ft_encoding_none;
>+	      else if (ftCharMap.EqualsIgnoreCase("ft_encoding_apple_roman"))

	      else if (ftCharMap.EqualsIgnoreCase("mac_roman"))

>+		  aFTEncoding = ft_encoding_apple_roman;
>+	      else if (ftCharMap.EqualsIgnoreCase("ft_encoding_unicode"))

	      else if (ftCharMap.EqualsIgnoreCase("unicode"))

>+		  aFTEncoding = ft_encoding_unicode;
>+	  }
Attached patch final patchSplinter Review
Final patch saved while I move my home computer.
Attachment #125513 - Attachment is obsolete: true
Attachment #126002 - Attachment is obsolete: true
Thanks for the style clean-up. I ported it back to 1.4branch, built and tested
it. Both trunk and 1.4branch  work fine. I'll check in attachment 126190 [details] [diff] [review] if you
don't mind. And, I'll seek 'a1.4' for this assuming r/sr for the trunk patch
hold for it as well because the only difference between two patches  is due to
bug 206379 (removing nsIAtom from the charset. conveter manager)
Attachment #105334 - Attachment is obsolete: true
Attachment #117081 - Attachment is obsolete: true
Attachment #120803 - Attachment is obsolete: true
Attachment #120830 - Attachment is obsolete: true
I landed  attachment 126190 [details] [diff] [review] in the trunk. blizzard and rbs, can I get
review stamps for the final patch to 1.4 branch so that I can ask for a1.4
Thanks !
My gtk2/libxft2 build of Thunderbird now fails, probably because of this change.
The compile error is below.

../../../mozilla-config.h -Wp,-MD,.deps/nsFontMetricsXft.pp
/home/andre/devel/cvs/mozilla/gfx/src/gtk/nsFontMetricsXft.cpp
/home/andre/devel/cvs/mozilla/gfx/src/gtk/nsFontMetricsXft.cpp:249: error: syntax
   error before `*' token
/home/andre/devel/cvs/mozilla/gfx/src/gtk/nsFontMetricsXft.cpp: In member
   function `nsresult nsFontMetricsXft::BoundingMetricsCallback(const
   FcChar32*, unsigned int, nsFontXft*, void*)':
/home/andre/devel/cvs/mozilla/gfx/src/gtk/nsFontMetricsXft.cpp:1684: error: `
   nsBoundingMetrics' undeclared (first use this function)
/home/andre/devel/cvs/mozilla/gfx/src/gtk/nsFontMetricsXft.cpp:1684: error: (Each
   undeclared identifier is reported only once for each function it appears
   in.)
/home/andre/devel/cvs/mozilla/gfx/src/gtk/nsFontMetricsXft.cpp:1684: error: parse
   error before `;' token
/home/andre/devel/cvs/mozilla/gfx/src/gtk/nsFontMetricsXft.cpp:1688: error: `bm
   ' undeclared (first use this function)
/home/andre/devel/cvs/mozilla/gfx/src/gtk/nsFontMetricsXft.cpp:1696: error: `
   GetBoundingMetrics32' undeclared (first use this function)
/home/andre/devel/cvs/mozilla/gfx/src/gtk/nsFontMetricsXft.cpp:1701: error: '
   struct _BoundingMetricsData' has no member named 'bm'
/home/andre/devel/cvs/mozilla/gfx/src/gtk/nsFontMetricsXft.cpp:1705: error: '
   struct _BoundingMetricsData' has no member named 'bm'
../../../dist/include/string/nsBufferHandle.h: At top level:
/home/andre/devel/cvs/mozilla/gfx/src/gtk/nsFontMetricsXft.cpp:2797: warning: `
   nsresult StaticBoundingMetricsCallback(const FcChar32*, unsigned int,
   nsFontXft*, void*)' defined but not used
Thanks for spotting the problem. Because in both my local build and xft
tinderbox, MathML is enabled, I didn't notice that I forgot to enclose some
code blocks with "#ifdef MOZ_MATHML". Here's the patch. Can you try it on your
tree and let me know the result? Adding '#undef MOZ_MATHML' manually to ns*Xft
files to test the patch didn't work (I guess I have to do that all over the
place in gfx/src/gtk)
Tried the patch, and now I get a compile failure in a different location. The
gtk2/xft2 tinderbox has the same build failure:

http://tinderbox.mozilla.org/showlog.cgi?log=SeaMonkey-Ports/1056591415.11559.gz


 ../../../mozilla-config.h -Wp,-MD,.deps/nsAccessibleText.pp
/home/andre/devel/cvs/mozilla/accessible/src/atk/nsAccessibleText.cpp
In file included from
/home/andre/devel/cvs/mozilla/accessible/src/atk/nsAccessibleText.h:43,
                 from
/home/andre/devel/cvs/mozilla/accessible/src/atk/nsAccessibleText.cpp:41:
/home/andre/devel/cvs/mozilla/accessible/src/base/nsAccessibleEventData.h:49:
warning: `
   class nsAccessibleEventData' has virtual functions but non-virtual
   destructor
In file included from
/home/andre/devel/cvs/mozilla/accessible/src/atk/nsAccessibleWrap.h:46,
                 from
/home/andre/devel/cvs/mozilla/accessible/src/base/nsBaseWidgetAccessible.h:43,
                 from
/home/andre/devel/cvs/mozilla/accessible/src/atk/nsAccessibleText.h:44,
                 from
/home/andre/devel/cvs/mozilla/accessible/src/atk/nsAccessibleText.cpp:41:
/home/andre/devel/cvs/mozilla/accessible/src/base/nsAccessNode.h:102: warning: `
   virtual nsresult nsAccessNode::GetChildAt(int, nsIAccessNode**)' was hidden
/home/andre/devel/cvs/mozilla/accessible/src/base/nsAccessible.h:71: warning:
   by `virtual nsresult nsAccessible::GetChildAt(int, nsIAccessible**)'
/home/andre/devel/cvs/mozilla/accessible/src/atk/nsAccessibleText.cpp: In
   member function `virtual nsresult nsAccessibleText::GetCharacterExtents(int,
   PRInt32*, PRInt32*, PRInt32*, PRInt32*, int)':
/home/andre/devel/cvs/mozilla/accessible/src/atk/nsAccessibleText.cpp:738: error: no
   matching function for call to `nsIFrame::GetWindow(
   nsCOMPtr<nsIPresContext>&, nsIWidget**)'
../../../dist/include/layout/nsIFrame.h:1010: error: candidates are: virtual
   nsIWidget* nsIFrame::GetWindow() const
Thanks for testing.
 
As for the new falure, it has nothing to do with this, does it? Why don't you
check with roc+ (or possibly danm) who had landed right before the failed build
began?
Comment on attachment 126257 [details] [diff] [review]
blizzard's cleaned-up patch backported to 1.4 branch

sr=rbs applies from now on, don't worry asking/waiting for me. Looks like 1.4f
is around the corner and if you wait any longer before trying to get approval,
you might miss the cut off, and will have to patch later, in an out-of-sync
manner.

[BTW, you might do well to check in the fix for the build bustage which is
often a high priority as it saves time to people by avoiding broken builds
half-way in the massive compilation, especially for those who let it build
overnight.]
Attachment #126257 - Flags: superreview+
Comment on attachment 126257 [details] [diff] [review]
blizzard's cleaned-up patch backported to 1.4 branch

Asking for a1.4 (since backporting blizzard's final style-cleanup patch to
1.4branch is trivial, I'm assuming blizzard's r but asking him for it anyway.) 
This patch is for Xft build (not a default build) but a lot of people have been
waiting for this to see in 1.4final because it enables Xft-build to render
MathML and complex scripts (Devanagari for Hindi and several other Indian
languages, Tamil, Thai and Korean Hangul). This patch (with and without
blizzard's style-cleanup) has been field-tested over two months by me and a few
others and no problem has been found. The risk to those who will never see web
pages in complex scripts or MathML is pretty low. 

BTW, I've already incorporated attachment 126514 [details] [diff] [review] (in my 1.4tree) to get it
compiled when MathML(which is on by default) is disabled.
Attachment #126257 - Flags: review?(blizzard)
Attachment #126257 - Flags: approval1.4?
I've just clobber-built 1.5a with a fresh cvs pull. Done some browsing,
including the MathML start and torture test pages. Excellent anti-aliased
Mathml, no other problems so far.  I has also previously applied this patch to
my tree as part (bug 128153 comment 62). 

So overall, works for me (many thanks).
Thank you for sr, rbs. I checked in attachment 126514 [details] [diff] [review] (build-bustage fix) after
making a clobber build with MathML disabled). 

Thank you for testing and testimonial :-), Geoff
Attachment #125513 - Flags: review?(blizzard)
Hi Jungshik Shin!

Thank you very much for the wonderful 1.4 branch patch.

I have build a patched RH-8.0 rpm and found one error. It seems that when you
only use '$(INSTALL) $^' in Makefile.in it will copy the link and not the file
itself. I suppose you should have used '$(SYSINSTALL) $(IFLAGS1) $^' instead as
done with 
mathfont properties files?
Too bad this missed the 1.4 train.  If there aren't any problems found on the
trunk over the next week or two, we can probably make 1.4.1.  Does that sound
reasonable?
Comment on attachment 126257 [details] [diff] [review]
blizzard's cleaned-up patch backported to 1.4 branch

moving approval request forward.
Attachment #126257 - Flags: approval1.4? → approval1.4.x?
Chris, it'd be great to see 1.4.1 in a week or two with this patch in.  Can you
help with a1.4.x?

Manmathan, it seems you're right. I've never built a mozilla rpm (although I
made dist-build once). SYSINSTALL was introduced about a year ago for GRE
(http://bonsai.mozilla.org/cvsview2.cgi?diff_mode=context&whitespace_mode=show&file=Makefile.in&branch=&root=/cvsroot&subdir=mozilla/layout/mathml/base/src&command=DIFF_FRAMESET&rev1=1.34&rev2=1.35)
'make dist' and making rpm appear differnt... 

Roger, we need SYSINSTALL on Win32 (gfx/src/windows/Makefile.in), don't we?
Maybe not ...
This should be applied to both trunk and 1.4 branch(in case of branch, it's to
be applied on top of attachment 126257 [details] [diff] [review])
Comment on attachment 126983 [details] [diff] [review]
Makefile.in patch for rpm(?)/GRE

r=blizzard on that
Attachment #126983 - Flags: review+
Comment on attachment 126983 [details] [diff] [review]
Makefile.in patch for rpm(?)/GRE

Thanks for r, blizzard.
As he didn't add sr, I'm asking seawood for sr (build issue)
Attachment #126983 - Flags: superreview?(cls)
Comment on attachment 126983 [details] [diff] [review]
Makefile.in patch for rpm(?)/GRE

Thanks for r, blizzard.
As he didn't add sr, I'm asking seawood for sr (build issue)
Attachment #126983 - Flags: superreview?(cls)
Attachment #126983 - Flags: superreview+
Attachement 126983 was checked in.

In attachment 127013 [details] [diff] [review] for bug 179834, I found fontEncoding.properties files for
Win32 and Mac listed in basebrower-win.pkg and basebrowser-mac-cfm, judging
from which  I guess I should add it to basebrowser-unix, as well although it
may have to be commented out for now as optional (because it's only used for
Xft-build but  fixing 208213 may change that)
For printing output  with a closer match to Xft's screen rendering, [1] see bug
190031 and 211763. Nothing has been done yet, but I thought some people here
might be interested in them.

[1] Even now Xprint module can be used to print out mathml and complex script
text because it shares the code with Xlib/Gtk for which the custom encoding font
support has been available for a while. However, as the font selection mechanism
of Xlib/Gtk (based on XLFD), among other things, is different from that of Xft
module (with fontconfig), the printing output is not a very faithful replica of
the screen rendering by Xft.
Did the code that would allow you to embed the glyphs for the used fonts ever
get written?  I thought it had, but I'm not sure.
Comment on attachment 126257 [details] [diff] [review]
blizzard's cleaned-up patch backported to 1.4 branch

good to go
Attachment #126257 - Flags: review?(blizzard)
Attachment #126257 - Flags: review+
Attachment #126257 - Flags: approval1.4.x?
Attachment #126257 - Flags: approval1.4.x+
Thanks, Chris. Fix checked into the 1.4 branch. Marking as fixed. 
Status: NEW → RESOLVED
Closed: 21 years ago
Resolution: --- → FIXED
Keywords: fixed1.4.1
Blocks: 224532
Comment on attachment 126983 [details] [diff] [review]
Makefile.in patch for rpm(?)/GRE

asking for 1.4.2 approval. 

I should have asked for a1.4.1, but forgot to ask in July. This is the cause of
the problem mentioned in bug 
120198 comment #68 (MathML not getting rendered even though all the fonts are
installed)
Attachment #126983 - Flags: approval1.4.2?
Comment on attachment 126983 [details] [diff] [review]
Makefile.in patch for rpm(?)/GRE

a=mkaply

if this is the only change going on the branch
Attachment #126983 - Flags: approval1.4.2? → approval1.4.2+
Makefile.in change checked into 1.4.2 branch. Thanks for a.
Keywords: fixed1.4.1fixed1.4.2
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: