Closed Bug 203052 Opened 21 years ago Closed 21 years ago

write a simple wrapper over tis620-2 converter to use with Xft build for Thai shaping


(Core :: Layout: Text and Fonts, defect)

(Reporter: jshin1987, Assigned: jshin1987)


(Blocks 1 open bug)


(Keywords: intl)


(3 files, 3 obsolete files)

Writing a simple converter that is a thin wrapper around
the existing tis620-2 converter (which is turned on when CTL is
compiled in) would enable Mozilla-Xft to render Thai the same
way Mozilla with X11 core fonts does. Because the tis620-2 converter
already does all the shaping work, all this converter
has to do is translate tis620-2 codepoints to the Unicode
Thai range and the PUA codepoints used in Thai TTFs.
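
To make the translation step concrete, here is a minimal sketch, not the code from the patch. It covers only the part that is fixed by the standard: plain TIS-620 bytes 0xA1-0xDA and 0xDF-0xFB sit at a constant offset of 0x0D60 from U+0E01-U+0E3A and U+0E3F-U+0E5B. The names `Tis620ByteToUnicode`, `ConvertBuffer` and `kNoMapping` are hypothetical, and the bytes that carry shaped glyph variants are deliberately left unmapped because their PUA targets come from a font-specific table that is not reproduced here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical sentinel for bytes this sketch does not map.
const std::uint32_t kNoMapping = 0xFFFFFFFFu;

// Maps one byte of converter output to a Unicode code point.
std::uint32_t Tis620ByteToUnicode(std::uint8_t b) {
  if (b < 0x80) return b;  // US-ASCII passes through unchanged
  if ((b >= 0xA1 && b <= 0xDA) || (b >= 0xDF && b <= 0xFB))
    return b + 0x0D60u;    // plain TIS-620 Thai -> Unicode Thai block
  return kNoMapping;       // shaped variants: would need a PUA lookup table
}

// Converts `len` bytes and returns how many code points were written to
// `out`. Unmapped bytes are skipped in this sketch.
std::size_t ConvertBuffer(const std::uint8_t* src, std::size_t len,
                          std::uint32_t* out) {
  assert(src != nullptr && out != nullptr);
  std::size_t written = 0;
  for (std::size_t i = 0; i < len; ++i) {
    std::uint32_t u = Tis620ByteToUnicode(src[i]);
    if (u != kNoMapping) out[written++] = u;
  }
  return written;
}
```

A real wrapper would replace the `kNoMapping` branch with the table lookup for the PUA glyphs; the sketch only shows why the wrapper itself can stay thin.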

See bug 176290.
The Jamo converter dealt with in bug 176315 is an example of a converter that can
take advantage of my patch for bug 176290.
Attached patch: a patch (not yet tested) (obsolete)
With this patch and patch for bug 176290, Thai web pages
should get rendered with shaping. CTL has to be compiled
in. For each Thai TTF, a line in the following format
has to be added in (dist/res/fonts) 

# Thai TTFs (TIS620-2)
encoding.norasi.ttf = x-thaittf-0.wide

I haven't tested the patch yet because I forgot
to enable CTL in my build and had to rebuild
with CTL enabled.
Component: Internationalization → Complex Text Layout
Keywords: intl
After applying the patch, it doesn't compile.
Here's what I've tried so far:

- line 86: 2nd argument of GetMaxLength() should be *aSrcLength
- line 90: "char* med" is undeclared
- line 70: GetConverter() is declared static but never defined (I just commented it out)

- should #include "nsUnicodeToThaiTTF.h"

(But I don't know how to. So, I'm stuck at this.)
Attached patch: a nwe patch that get compiled (obsolete)
Oops, I'm sorry, thep. I should have uploaded this patch earlier to save you
some time.
This one should compile. I didn't upload it because I had trouble with
linking when CTL is enabled. I posted an article to the mozilla i18n/build
newsgroup about this. The article is news://$
Attachment #121438 - Attachment is obsolete: true
Comment on attachment 121593
a nwe patch that get compiled 

>+GetMaxLength(const PRUnichar * aSrc, PRInt32 aSrcLength, PRInt32 * aDestLength)

  I forgot to qualify GetMaxLength with the class name. 
The above should be: 


Now I am able to compile and link without a problem
(on one of my machines I still have the linking
problem, while on the other it links
fine...), but none of the encoders in the CTL module is available to use... I'll
keep trying.
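
For readers unfamiliar with the class-name issue mentioned in the comment on attachment 121593, here is a generic sketch of it. It is not the corrected line from the patch. `DemoEncoder` and `DemoMaxLengthFor` are hypothetical names, standard integer types stand in for PRUnichar and PRInt32, and the "two output units per input unit" bound is an assumption made only so the example returns something.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical class; only the member name echoes the thread.
class DemoEncoder {
 public:
  bool GetMaxLength(const char16_t* aSrc, std::int32_t aSrcLength,
                    std::int32_t* aDestLength) const;
};

// The `DemoEncoder::` qualifier makes this the definition of the member
// declared above. Without it, the same lines would define an unrelated free
// function, and any call to the member would fail at link time.
bool DemoEncoder::GetMaxLength(const char16_t* aSrc, std::int32_t aSrcLength,
                               std::int32_t* aDestLength) const {
  assert(aDestLength != nullptr);
  (void)aSrc;                      // unused in this sketch
  *aDestLength = aSrcLength * 2;   // assumed worst case, for illustration only
  return true;
}

// Small helper so the member can be exercised with a single call.
std::int32_t DemoMaxLengthFor(std::int32_t srcLength) {
  DemoEncoder enc;
  std::int32_t out = 0;
  bool ok = enc.GetMaxLength(nullptr, srcLength, &out);
  assert(ok);
  (void)ok;
  return out;
}
```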
This is the first working screenshot. I'm gonna upload a working patch.
Attached patch: a working patch (obsolete)
thep, can you verify that attachment 121617 shows the correct rendering?
It'd also be nice if you could try this patch along with my patch for bug 176290.

If you verify that it works, I'll ask for review so that we can put this
in before 1.4. BTW, how does Thai rendering work under Windows and Mac?
bug 177877 is similar to bug 176290. If you need this converter
under Windows as well, you can simply add lines to
file (gfx/src/windows) of Windows build. Let me know if you need that, too.

Another btw, can you give me a list of 5 or 6 representative Thai TrueType fonts
(with Windows encoding) to add to file?
I'll add them to the file and land it when I get review from blizzard for
bug 176290.
Attachment #121593 - Attachment is obsolete: true
Yes, the screenshot is a perfect Thai rendering. Thank you.
I will try your patches soon.
Thank you for verifying. And, please vote for bug 176290. We need your support :-)

I have the following for file. Let me know if you want to add
more fonts. Style variants (Italic, Bold, Bold Italic) don't need to be listed.

# Thai TTFs
# glyph arrangement : TIS620-2 Windows encoding,
# code points used : Unicode Thai block + about 10 PUA code points in U+F700,
# coverage : US-ASCII, Thai (U+0E00-U+0E7F), punctuation marks
#            (U+2018, U+2019, U+201c, U+201d, U+2022, U+2013, U+2014)
encoding.norasi.ttf = x-thaittf-0.wide
encoding.garuda.ttf = x-thaittf-0.wide
encoding.dbthaitext.ttf = x-thaittf-0.wide

BTW, do you want me to make this converter (x-thaittf-0)
cover the general punctuation marks in Unicode listed above?
I found that Windows Thai fonts have them, and it'd be more
consistent to take them from Thai fonts than from other
fonts, so I'm inclined to add them to FillInfo in
nsUnicodeToTIS620.cpp, which is inherited by nsUnicodeToThaiTTF.
Or, I can just add them to nsUnicodeToThaiTTF if X11 core
fonts for TIS620-2 don't have them. If I go with the latter,
the code size will be slightly larger. So, I'd like to
modify the parent class.
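
The trade-off can be sketched as follows. This is an illustration under assumptions, not the Mozilla code: `Coverage`, `DemoTis620Encoder`, `DemoThaiTtfEncoder` and `Covers` are hypothetical stand-ins, and a `std::bitset` replaces whatever structure the real FillInfo fills. The point is only that marking the punctuation in the parent's FillInfo gives the child the same coverage with no extra code.

```cpp
#include <bitset>
#include <cassert>
#include <cstdint>

// One bit per BMP code point; a stand-in for the real coverage structure.
using Coverage = std::bitset<0x10000>;

// Hypothetical stand-in for the parent (TIS620) encoder.
struct DemoTis620Encoder {
  virtual ~DemoTis620Encoder() = default;
  virtual void FillInfo(Coverage& info) const {
    // Assigned code points of the Unicode Thai block.
    for (std::uint32_t c = 0x0E01; c <= 0x0E3A; ++c) info.set(c);
    for (std::uint32_t c = 0x0E3F; c <= 0x0E5B; ++c) info.set(c);
    // Punctuation marks listed in the font-encoding comment above.
    static const std::uint32_t kMarks[] = {0x2018, 0x2019, 0x201C, 0x201D,
                                           0x2022, 0x2013, 0x2014};
    for (std::uint32_t m : kMarks) info.set(m);
  }
};

// Hypothetical child: inherits FillInfo unchanged, so it adds no code.
struct DemoThaiTtfEncoder : DemoTis620Encoder {};

// Reports whether `enc` claims to cover the code point `cp`.
bool Covers(const DemoTis620Encoder& enc, std::uint32_t cp) {
  assert(cp < 0x10000);
  Coverage info;
  enc.FillInfo(info);
  return info.test(cp);
}
```

Putting the punctuation only in the child would instead need an override that calls the parent and then sets the extra bits, which is where the slightly larger code size comes from.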

Target Milestone: --- → mozilla1.4beta
Yes, tis620-2.enc in XFree86 distribution does cover those punctuations,
because it was designed with Windows fonts in mind.
> BTW, how does Thai rendering works under Windows [and Mac]? 

Seems fine here. There has been a translation of the MathML Start Page in Thai...
rbs, I'm afraid that Mozilla-win does not render Thai correctly.
In the case of Thai script, I think it's easier to be fooled by apparently
correct (but not correct) rendering than in other Brahmi-derived scripts in
South/Southeast Asia because simple overstriking with _nominal_ glyphs *almost*
works. The easiest way to see that is to compare how
Mozilla renders this test case with how MS IE renders it (or with
the screenshot in attachment 121617 of Mozilla-Xft, with this patch and the patch
for bug 176290 applied, rendering the test case). See
attachment 121415 and you can easily tell that the substitutions described by
lower(), lower_left(), left(), remove_below() are not taken care of by
Mozilla-Win. Neither
is the decomposition of U+0E33 into U+0E4D and U+0E32 and the
subsequent reordering (operation #5 in James' write-up).
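
A minimal sketch of that decomposition step, assuming the usual convention that the nikhahit (U+0E4D) is moved in front of a tone mark that directly precedes U+0E33. Whether this matches operation #5 in James' write-up in every detail is an assumption, and the function names are hypothetical.

```cpp
#include <cassert>
#include <string>

// Thai tone marks MAI EK .. MAI CHATTAWA.
bool IsToneMark(char32_t c) { return c >= 0x0E48 && c <= 0x0E4B; }

// Replaces each SARA AM (U+0E33) with NIKHAHIT (U+0E4D) + SARA AA (U+0E32).
// If a tone mark directly precedes SARA AM, NIKHAHIT is placed before it so
// that the tone mark ends up stacked on top of the nikhahit.
std::u32string DecomposeSaraAm(const std::u32string& in) {
  std::u32string out;
  for (char32_t c : in) {
    if (c != 0x0E33) {
      out.push_back(c);
      continue;
    }
    if (!out.empty() && IsToneMark(out.back())) {
      char32_t tone = out.back();
      out.back() = 0x0E4D;   // nikhahit takes the tone mark's slot
      out.push_back(tone);   // tone mark follows it
    } else {
      out.push_back(0x0E4D);
    }
    out.push_back(0x0E32);
  }
  assert(out.size() >= in.size());  // decomposition never shortens the text
  return out;
}
```

For example, the sequence U+0E19 U+0E49 U+0E33 comes out as U+0E19 U+0E4D U+0E49 U+0E32.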

Although I alleged that ExtTextOutW in Win32 might do some magic,
it turns out that I was wrong. [1] ExtTextOutW is not so smart and
OpenType fonts are not as intelligent as Apple's AAT fonts. A lot
of work (invoking Uniscribe and optionally OTLS under Windows and
calling Pango APIs under Linux) has to be done by 'clients' (like Mozilla, MS
IE, MS Word) to take advantage of OpenType
fonts. Mozilla-Win doesn't do any of this as far as I can tell.
(Otherwise, why would ftang have begun the Graphite-Mozilla project?
Well, Graphite of SIL supports more
complex scripts than Uniscribe currently does, so there would
have been an incentive for the project anyway.) Please correct me if I'm wrong.

In conclusion, file for Mozilla-Win
also needs additional entries for Thai. BTW, enable-ctl
seems to be effective only on Unix. If it's confirmed that
we need this also under Windows and other platforms,
we have to change that, too.

[1] ExtTextOutW in Windows XP may be smarter than ExtTextOutW in
Win2k. There's a report that a test version of Korean opentype
font works with both MS IE 6 and Mozilla under Win XP while
it only works with MS IE 6 under Win2k.
I don't read Thai and so can't distinguish what is "apparently correct (but not
correct)"... What a way to put it :-). Cc:ing Arthit Suriyawongkul,
who did the Thai translation of the MathML Start Page.
> I don't read Thai and so can't distinguish what is "apparently correct 
> (but not correct)"... What a way to put it :-)

  You didn't think I could read Thai, did you? I can't, either
(although I like some Thai dishes a lot :-)). However, the Thai
code chart, Thep's excellent write-up
(see the comparison between tis620-0 and tis620-1,2 in 'Shaping')
and James' write-up (attachment 121415) are enough for me to tell one from the
other. (Actually, I don't think it'll take long to learn to _read_ Thai script.)
By 'apparently correct', I meant that combining characters are applied to base
characters but their positions
(and shapes) are not quite right for _some_ sequences.
Blocks: thai
Depends on: 176290
Does this hack work if the font family name has spaces in?  One of the most
popular Thai fonts is "Angsana New" (from Windows).

Thai text *does* get shaped properly on Windows XP with the standard Mozilla 1.3
binaries.  Indeed, complex scripts in general appear to display properly.
However, selection and caret operations don't work properly, but that needs a
separate bug.

It's not hard to tell whether Thai text is getting shaped properly.  If you look
at a page of Thai text you will see several characters per line typically have a
diacritic mark above them aligned to the right edge of the glyph, which is
either a small vertical bar or looks a little like a script number 2. In
improperly shaped Thai text, these diacritic marks are always at the same
height.  In properly shaped Thai text, some of them will be lowered so that
they are just a little above the glyph they apply to.  Improperly shaped Thai
text is unaesthetic but not uncommon and perfectly readable: I am watching a
movie on Star movies right now with Thai subtitles and the text is not properly
shaped.  Proper shaping is not as important in Thai as in Indic scripts. 
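
The lowering rule described above can be sketched roughly like this. It is a simplification under assumptions: it only decides whether a tone mark should sit low (nothing else above the base) or stay raised (an upper vowel already occupies that space), it ignores the left-shifted variants needed next to tall consonants, and all names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

enum class TonePlacement { NotATone, Raised, Lowered };

// Marks drawn above the base consonant: MAI HAN-AKAT, SARA I .. SARA UEE,
// MAITAIKHU and NIKHAHIT.
bool IsUpperVowel(char32_t c) {
  return c == 0x0E31 || (c >= 0x0E34 && c <= 0x0E37) || c == 0x0E47 ||
         c == 0x0E4D;
}

// Tone marks MAI EK .. MAI CHATTAWA.
bool IsToneMark(char32_t c) { return c >= 0x0E48 && c <= 0x0E4B; }

// Decides where the mark at position `i` of `text` should be drawn.
TonePlacement PlaceToneAt(const std::u32string& text, std::size_t i) {
  assert(i < text.size());
  if (!IsToneMark(text[i])) return TonePlacement::NotATone;
  bool above_occupied = i > 0 && IsUpperVowel(text[i - 1]);
  return above_occupied ? TonePlacement::Raised : TonePlacement::Lowered;
}
```

An unshaped renderer effectively returns `Raised` for every tone mark, which is why all the marks end up at the same height.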
> Does this hack work if the font family name has spaces in?  One of the most
> popular Thai fonts is "Angsana New" (from Windows).

Yes, it does. You can add

encoding.angsananew.ttf = x-thaittf-0 
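
As a sketch of how such a key could be derived from a family name: the rule below (lower-case the name, drop spaces, wrap in "encoding." and ".ttf") is inferred from the "Angsana New" example and the earlier norasi/garuda entries, so treat it as an assumption and not as the actual lookup code. `FontEncodingKey` is a hypothetical name.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Builds the property key for a font family, e.g. "Angsana New" ->
// "encoding.angsananew.ttf". Assumes an ASCII family name.
std::string FontEncodingKey(const std::string& family) {
  assert(!family.empty());
  std::string key = "encoding.";
  for (unsigned char ch : family) {
    if (ch == ' ') continue;  // spaces are dropped from the key
    key.push_back(static_cast<char>(std::tolower(ch)));
  }
  key += ".ttf";
  return key;
}
```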

> Improperly shaped Thai text is unaesthetic but not uncommon and perfectly 
> readable: 

  I think it depends on fonts. If they're designed specifically for
simple overstriking to work (glyphs are designed NOT to bump over
one another when overstruck together by making them take disjoint
spaces), the above would be the case. However, I found glyphs in fonts
designed to be used with shaping (as opposed to simple overstriking)
tend to overlap each other when put together without any glyph
substitutions (shaping). (this is also the case of Korean Hangul.)
So, another way to tell shaped Thai rendering from unshaped one is to see 
whether there's any overlap of glyphs. Well, it's hard to notice this unless 
you know what they should look like, but it looks too 'crowded' if there's 
a 'collision'. 

> Thai text *does* get shaped properly on Windows XP with the standard Mozilla 
> binaries.  Indeed, complex scripts in general appear to display properly.

This is the second report I've got about Mozilla's ability to render complex
scripts properly under Win XP (see footnote [1] in comment #11).
According to an MSDN article dating back to 1998, calling Win32 Text APIs
(ExtTextOutW and others) is one of four ways to render/measure/lay out complex
scripts and Unicode strings.
So, Win32 Text APIs do some magic at least under WinXP. Then, why not
under Win2k? Something wrong with my Win2k set-up? Win2k is basically the
same OS as WinXP. How about Win9x/ME? The MSDN article mentioned that
it should work there, too, for a Unicode application (which Mozilla is now).
> However, selection and caret operations don't work properly, but that needs a
> separate bug.

  According to the article, we have to use Uniscribe for this. 

Anyway, I'm getting off-topic here. We have to move this issue (Mozilla-Win
and complex script rendering) to i18n newsgroup/mailing list. 
I posted an article about complex script rendering in Mozilla-Win.
I found out why calling Win32 Text APIs didn't work for me under
Win2k while it worked for James and my friend under WinXP.
See news://$
for details.  
Attached patch: patch for review
I think this patch is simple enough to go in (even after 1.4b release)
Attachment #121618 - Attachment is obsolete: true
Comment on attachment 122738
patch for review

One thing to resolve is how to credit
authors of thai-xft.c (in Pango) for
the conversion table (TIS 620-2 to Unicode +
PUA). Perhaps, I'll ask
staff about it if necessary.
Attachment #122738 - Flags: superreview?(rbs)
Attachment #122738 - Flags: review?(prabhat.hegde)
BTW, this patch can also be used by users of non-Thai versions of Windows 9x/ME
once CTL is enabled by default. Under the Thai version of Win 9x/ME and any language
version of Win2k/XP, Mozilla can rely on the native Thai support offered by the
OS. I guess there aren't many non-Thai Win9x/ME users who need to view Thai
pages, but nonetheless we may add this to the I18N release notes if this patch
and CTL-enabling are in for 1.4.
Attachment #122738 - Flags: review?(prabhat.hegde) → review+
Comment on attachment 122738
patch for review

I understand this patch only works with CTL-enabled builds since it is bundled

>One thing to resolve is how to credit
>authors of thai-xft.c (in Pango)

An acknowledgment/reference (as you did) seems fine to me.
Attachment #122738 - Flags: superreview?(rbs) → superreview+
Comment on attachment 122738
patch for review

Thank you for r/sr, prabhat and rbs.
Asking for a. This is not a part of the default build, so it has no effect
on the default build. When CTL is enabled, it increases the code size a bit,
but otherwise it has little interaction with other parts.
Attachment #122738 - Flags: approval1.4?
Comment on attachment 122738
patch for review

a=asa (on behalf of drivers) for checkin to 1.4
Attachment #122738 - Flags: approval1.4? → approval1.4+
fix checked in. Thanks all. Now time to get in the patch for bug 176290 :-)
Closed: 21 years ago
Resolution: --- → FIXED
Component: Layout: CTL → Layout: Text
QA Contact: amyy → layout.fonts-and-text