Closed Bug 203052 Opened 22 years ago Closed 22 years ago

write a simple wrapper over tis620-2 converter to use with Xft build for Thai shaping

Categories

(Core :: Layout: Text and Fonts, defect)

x86
Linux
defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla1.4beta

People

(Reporter: jshin1987, Assigned: jshin1987)

References

Details

(Keywords: intl)

Attachments

(3 files, 3 obsolete files)

Writing a simple converter that is a thin wrapper around the existing tis620-2 converter (which is built only when CTL is compiled in) would enable Mozilla-Xft to render Thai the same way Mozilla with X11 core fonts does. Because the tis620-2 converter already does all the shaping work, this converter only has to translate tis620-2 code points to the Unicode Thai range and to the PUA code points used in Thai TTFs. See bug 176290. The Jamo converter dealt with in bug 176315 is an example of a converter that can take advantage of my patch for bug 176290.
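To make the translation step concrete, here is a minimal sketch of the byte-to-code-point mapping. It is not the code from the patch: the function name and the `puaTable` parameter are hypothetical, and the byte-to-PUA entries for the shaped glyph variants are deliberately left to the caller because the real conversion table is not reproduced in this bug. The only mapping hard-coded below is the standard TIS-620 one, where bytes 0xA1-0xDA and 0xDF-0xFB correspond to U+0E01-U+0E3A and U+0E3F-U+0E5B.

```cpp
#include <cstdint>
#include <map>

// Hypothetical helper (not the Mozilla implementation): maps one byte of
// already-shaped tis620-2 output to the code point used to index a Thai TTF.
// puaTable is an assumed, caller-supplied byte -> PUA mapping for the shaped
// glyph variants; its real contents are not reproduced here.
char32_t ThaiByteToFontCodePoint(std::uint8_t byte,
                                 const std::map<std::uint8_t, char32_t>& puaTable)
{
    // Entries supplied by the caller take precedence.
    auto it = puaTable.find(byte);
    if (it != puaTable.end()) {
        return it->second;
    }

    // US-ASCII passes through unchanged.
    if (byte < 0x80) {
        return static_cast<char32_t>(byte);
    }

    // Standard TIS-620 range: a fixed offset into the Unicode Thai block.
    if ((byte >= 0xA1 && byte <= 0xDA) || (byte >= 0xDF && byte <= 0xFB)) {
        return static_cast<char32_t>(0x0E00 + (byte - 0xA0));
    }

    // Anything else is unmapped in this sketch.
    return 0xFFFD;
}
```

Keeping the PUA entries in a separate table keeps the font-specific part of the mapping apart from the part fixed by the TIS-620 standard.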
Attached patch: a patch (not yet tested) (obsolete)
With this patch and the patch for bug 176290, Thai web pages should get rendered with shaping. CTL has to be compiled in. For each Thai TTF, a line in the following format has to be added to fontEncoding.properties (dist/res/fonts):

# Thai TTFs (TIS620-2)
encoding.norasi.ttf = x-thaittf-0.wide

I haven't tested the patch yet because I forgot to enable CTL in my build and had to rebuild with CTL enabled.
Component: Internationalization → Complex Text Layout
Keywords: intl
After applying the patch, it doesn't compile. Here's what I've tried so far:

nsUnicodeToThaiTTF.cpp:
- line 86: 2nd argument of GetMaxLength() should be *aSrcLength
- line 90: "char* med" is undeclared
- line 70: GetConverter() is declared static but never defined (I just commented it out)

nsCtlLEModule.cpp:
- should #include "nsUnicodeToThaiTTF.h"

intl/ctl/public/nsCtlCIID.h:
- should #define NS_UNICODETOTHAITTF_CID (but I don't know how to, so I'm stuck here.)
Attached patch: a new patch that gets compiled (obsolete)
Oops, I'm sorry, thep. I should have uploaded this patch earlier to save you some time. This one should compile. I didn't upload it because I had trouble with linking when CTL is enabled. I posted an article to the mozilla i18n/build newsgroup about this issue. The article is news://news.mozilla.org:119/b88vki$cbs7@ripley.netscape.com
Attachment #121438 - Attachment is obsolete: true
Comment on attachment 121593: a new patch that gets compiled

>+
>+NS_IMETHODIMP
>+GetMaxLength(const PRUnichar * aSrc, PRInt32 aSrcLength, PRInt32 * aDestLength)

I forgot to qualify GetMaxLength with the class name. The above should be:

+NS_IMETHODIMP
+nsUnicodeToThaiTTF::GetMaxLength(........)

Now I am able to compile and link without a problem (on one of my machines I still have the linking problem, while on the other it links fine), but none of the encoders in the CTL module is available for use... I'll keep trying.
This is the first working screenshot. I'm gonna upload a working patch shortly.
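For readers unfamiliar with why the missing qualifier matters, here is a small stand-alone sketch. The class name, the plain `bool` return type and the placeholder length bound are all hypothetical and chosen only to keep the example self-contained; the real method uses NS_IMETHODIMP and the signature quoted above.

```cpp
#include <cstdint>

// Hypothetical stand-in for the encoder class; only the shape matters here.
class ThaiEncoderSketch {
public:
    // Declared in the class, defined out of line below.
    bool GetMaxLength(std::int32_t srcLength, std::int32_t* destLength) const;
};

// The "ThaiEncoderSketch::" qualifier ties this definition to the member
// declared above. Without it, the definition would not belong to the class
// and the member itself would stay undefined.
bool ThaiEncoderSketch::GetMaxLength(std::int32_t srcLength,
                                     std::int32_t* destLength) const
{
    if (destLength == nullptr || srcLength < 0) {
        return false;
    }
    *destLength = srcLength;  // placeholder bound, not the real converter's
    return true;
}
```

With a non-const method like the one in the patch, the unqualified form is accepted as an unrelated free function, so the mistake tends to show up only later, for example as an undefined symbol at link time.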
Attached patch: a working patch (obsolete)
thep, can you verify that attachment 121617 shows the correct rendering? It'd also be nice if you could try this patch along with my patch for bug 176290. If you verify that it works, I'll ask for review so that we can put this in before 1.4.

BTW, how does Thai rendering work under Windows and Mac? Bug 177877 is similar to bug 176290. If you need this converter under Windows as well, you can simply add lines to the fontEncoding.properties file (gfx/src/windows) of the Windows build. Let me know if you need that, too.

Another BTW: can you give me a list of 5 or 6 representative Thai TrueType fonts (with Windows encoding) to add to the fontEncoding.properties file? I'll add them to the file and land it when I get review from blizzard for bug 176290.
Attachment #121593 - Attachment is obsolete: true
Yes, the screenshot is a perfect Thai rendering. Thank you. I will try your patches soon.
Thank you for verifying. And please vote for bug 176290. We need your support :-)

I have the following for the fontEncoding.properties file. Let me know if you want to add more fonts. Style variants (Italic, Bold, Bold Italic) don't need to be listed separately.

# Thai TTFs
# glyph arrangement : TIS620-2 Windows encoding,
# code points used : Unicode Thai block + about 10 PUA code points in U+F700,
# coverage : US-ASCII, Thai (U+0E00-U+0E7F), punctuation marks
# (U+2018, U+2019, U+201c, U+201d, U+2022, U+2013, U+2014)
encoding.norasi.ttf = x-thaittf-0.wide
encoding.garuda.ttf = x-thaittf-0.wide
encoding.dbthaitext.ttf = x-thaittf-0.wide

BTW, do you want me to make this converter (x-thaittf-0) cover the general punctuation marks in Unicode listed above? I found that Windows Thai fonts have them, and it'd be more consistent to use them from Thai fonts than from other fonts, so I'm inclined to add them to FillInfo in nsUnicodeToTIS620.cpp, which is inherited by nsUnicodeToThaiTTF. Or I can just add them to nsUnicodeToThaiTTF if the X11 core fonts for TIS620-2 don't have them. If I go with the latter, the code size will be slightly larger, so I'd like to modify the parent class.
Target Milestone: --- → mozilla1.4beta
Yes, tis620-2.enc in the XFree86 distribution does cover those punctuation marks, because it was designed with Windows fonts in mind.
> BTW, how does Thai rendering work under Windows [and Mac]?

Seems fine here. There has been a translation of the MathML Start Page into Thai... http://www.mozilla.org/projects/mathml/start-thai.xml
rbs, I'm afraid that Mozilla-win does not render Thai correctly. In the case of the Thai script, I think it's easier to be fooled by apparently correct (but not correct) rendering than with other Brahmi-derived scripts in South/Southeast Asia, because simple overstriking with _nominal_ glyphs *almost* works. The easiest way to see that is to compare how Mozilla renders this test case with how MS IE renders it (or with the screenshot in attachment 121617, which shows Mozilla-Xft rendering the test case with this patch and the patch for bug 176290 applied).

See attachment 121415 and you can easily tell that the substitutions described by lower(), lower_left(), left() and remove_below() are not taken care of by Mozilla-Win. Neither are the decomposition of U+0E33 into U+0E4D and U+0E32 and the subsequent reordering (operation #5 in James' write-up).

Although I alleged that ExtTextOutW in Win32 might do some magic, it turns out that I was wrong. [1] ExtTextOutW is not so smart, and OpenType fonts are not as intelligent as Apple's AAT fonts. A lot of work (invoking Uniscribe and optionally OTLS under Windows, and calling Pango APIs under Linux) has to be done by 'clients' (like Mozilla, MS IE, MS Word) to take advantage of OpenType fonts. Mozilla-Win doesn't do any of this as far as I can tell. (Otherwise, why would ftang have begun the Graphite-Mozilla project at http://sila.mozdev.org? Well, SIL's Graphite supports more complex scripts than Uniscribe currently does, so there would have been an incentive for the project anyway.) Please correct me if I'm wrong.

In conclusion, the fontEncoding.properties file for Mozilla-Win also needs additional entries for Thai. BTW, enable-ctl seems to be effective only on Unix. If it's confirmed that we need this under Windows and other platforms as well, we have to change that, too.

[1] ExtTextOutW in Windows XP may be smarter than ExtTextOutW in Win2k. There's a report that a test version of a Korean OpenType font works with both MS IE 6 and Mozilla under Win XP, while it only works with MS IE 6 under Win2k.
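The U+0E33 step mentioned above can be sketched as follows. This is an illustration under stated assumptions, not the shaper's actual code: the function name is hypothetical, and the reordering rule used here (NIKHAHIT is placed before an immediately preceding tone mark in U+0E48-U+0E4B) is one common reading of the "subsequent reordering"; the exact rule in James' write-up is not reproduced in this bug.

```cpp
#include <string>

// Hypothetical helper: splits SARA AM (U+0E33) into NIKHAHIT (U+0E4D) and
// SARA AA (U+0E32). Assumed reordering rule: if a tone mark
// (U+0E48..U+0E4B) immediately precedes SARA AM, NIKHAHIT is emitted
// before that tone mark so the tone mark can stack on top of it.
std::u32string DecomposeSaraAm(const std::u32string& in)
{
    const char32_t kSaraAm = 0x0E33;
    const char32_t kNikhahit = 0x0E4D;
    const char32_t kSaraAa = 0x0E32;

    std::u32string out;
    for (char32_t c : in) {
        if (c != kSaraAm) {
            out.push_back(c);
            continue;
        }
        const bool afterTone =
            !out.empty() && out.back() >= 0x0E48 && out.back() <= 0x0E4B;
        if (afterTone) {
            const char32_t tone = out.back();
            out.pop_back();
            out.push_back(kNikhahit);  // NIKHAHIT goes under the tone mark
            out.push_back(tone);
        } else {
            out.push_back(kNikhahit);
        }
        out.push_back(kSaraAa);
    }
    return out;
}
```

For example, under this rule the sequence U+0E19 U+0E49 U+0E33 becomes U+0E19 U+0E4D U+0E49 U+0E32, while U+0E01 U+0E33 simply becomes U+0E01 U+0E4D U+0E32.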
I don't read Thai and so can't distinguish what is "apparently correct (but not correct)"... What a way to put it :-). Cc:ing Arthit Suriyawongkul <art@siit.net, art@geegen.com> who did the Thai translation of the MathML Start Page.
> I don't read Thai and so can't distinguish what is "apparently correct
> (but not correct)"... What a way to put it :-)

You didn't think I could read Thai, did you? I can't, either (although I like some Thai dishes a lot :-)). However, the Thai code chart (http://www.unicode.org/charts/PDF/U0E00.pdf), Thep's excellent write-up at http://linux.thai.net/~thep/thaisupp/ (see the comparison between tis620-0 and tis620-1,2 in 'Shaping') and James' write-up (attachment 121415) are enough for me to tell one from the other. (Actually, I don't think it'll take long to learn to _read_ Thai script.)

By 'apparently correct', I meant that combining characters are applied to base characters but their positions (and shapes) are not quite right for _some_ sequences.
Blocks: thai
Depends on: 176290
Does this hack work if the font family name has spaces in? One of the most popular Thai fonts is "Angsana New" (from Windows).

Thai text *does* get shaped properly on Windows XP with the standard Mozilla 1.3 binaries. Indeed, complex scripts in general appear to display properly. However, selection and caret operations don't work properly, but that needs a separate bug.

It's not hard to tell whether Thai text is getting shaped properly. If you look at a page of Thai text you will see that several characters per line typically have a diacritic mark above them aligned to the right edge of the glyph, which is either a small vertical bar or looks a little like a script number 2. In improperly shaped Thai text, these diacritic marks are always at the same height. In properly shaped Thai text, some of them will be lowered so that they are just a little above the glyph they apply to.

Improperly shaped Thai text is unaesthetic but not uncommon and perfectly readable: I am watching a movie on Star movies right now with Thai subtitles and the text is not properly shaped. Proper shaping is not as important in Thai as in Indic scripts.
> Does this hack work if the font family name has spaces in? One of the most
> popular Thai fonts is "Angsana New" (from Windows).

Yes, it does. See http://lxr.mozilla.org/seamonkey/source/gfx/src/windows/fontEncoding.properties#32
You can add

encoding.angsananew.ttf = x-thaittf-0

> Improperly shaped Thai text is unaesthetic but not uncommon and perfectly
> readable:

I think it depends on the font. If the glyphs are designed specifically for simple overstriking to work (they are designed NOT to bump into one another when overstruck, by making them occupy disjoint spaces), the above would be the case. However, I found that glyphs in fonts designed to be used with shaping (as opposed to simple overstriking) tend to overlap each other when put together without any glyph substitutions (shaping). (This is also the case for Korean Hangul.) So, another way to tell shaped Thai rendering from unshaped rendering is to see whether there's any overlap of glyphs. Well, it's hard to notice this unless you know what they should look like, but it looks too 'crowded' if there's a 'collision'.

> Thai text *does* get shaped properly on Windows XP with the standard Mozilla 1.3
> binaries. Indeed, complex scripts in general appear to display properly.

This is the second report I got about Mozilla's ability to render complex scripts properly under Win XP (see footnote [1] in comment #11). According to an MSDN article dating back to 1998, calling Win32 Text APIs (ExtTextOutW and others) is one of four ways to render/measure/lay out complex scripts and Unicode strings (http://www.microsoft.com/msj/defaultframe.asp?page=/msj/1198/multilang/multilang.htm&nav=/msj/1198/newnav.htm). So, Win32 Text APIs do some magic at least under WinXP. Then why not under Win2k? Is something wrong with my Win2k set-up? Win2k is basically the same OS as WinXP. How about Win9x/ME? The MSDN article mentioned that it should work there, too, for a Unicode application (which Mozilla is now).

> However, selection and caret operations don't work properly, but that needs a
> separate bug.

According to the article, we have to use Uniscribe for this. Anyway, I'm getting off-topic here. We have to move this issue (Mozilla-Win and complex script rendering) to the i18n newsgroup/mailing list.
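As an illustration of how a family name with spaces ends up as a properties key, here is a small sketch. The function name is hypothetical and the rule it applies (drop whitespace, lowercase, wrap in "encoding." and ".ttf") is only inferred from the "Angsana New" example above; the actual lookup code in Mozilla has not been checked here.

```cpp
#include <cctype>
#include <string>

// Hypothetical helper: builds a fontEncoding.properties key from a font
// family name. The normalization rule is an assumption inferred from the
// single example "Angsana New" -> encoding.angsananew.ttf.
std::string FontEncodingKey(const std::string& familyName)
{
    std::string key = "encoding.";
    for (unsigned char ch : familyName) {
        if (std::isspace(ch)) {
            continue;  // whitespace in the family name is dropped
        }
        key.push_back(static_cast<char>(std::tolower(ch)));
    }
    key += ".ttf";
    return key;
}
```

Calling FontEncodingKey("Angsana New") yields the key used in the line above, to which the converter name (x-thaittf-0) is then assigned.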
Status: NEW → ASSIGNED
I posted an article about complex script rendering in Mozilla-Win. I found out why calling Win32 Text APIs didn't work for me under Win2k while it worked for James and my friend under WinXP. See news://news.mozilla.org:119/b8fsm8$dli3@ripley.netscape.com for details.
Attached patch: patch for review
I think this patch is simple enough to go in (even after 1.4b release)
Attachment #121618 - Attachment is obsolete: true
Comment on attachment 122738: patch for review

One thing to resolve is how to credit the authors of thai-xft.c (in Pango) for the conversion table (TIS 620-2 to Unicode + PUA). Perhaps I'll ask mozilla.org staff about it if necessary.
Attachment #122738 - Flags: superreview?(rbs)
Attachment #122738 - Flags: review?(prabhat.hegde)
BTW, this patch can be used by users of non-Thai versions of Windows 9x/ME as well, once CTL is enabled by default. Under the Thai version of Win 9x/ME and any language version of Win2k/XP, Mozilla can rely on the native Thai support offered by the OS. I guess there aren't many non-Thai Win9x/ME users who need to view Thai pages, but nonetheless we may add this to the I18N release notes if this patch and CTL-enabling are in for 1.4.
Attachment #122738 - Flags: review?(prabhat.hegde) → review+
Comment on attachment 122738: patch for review

sr=rbs

I understand this patch only works with CTL-enabled builds since it is bundled there.

> One thing to resolve is how to credit authors of thai-xft.c (in Pango)

An acknowledgment/reference (as you did) seems fine to me.
Attachment #122738 - Flags: superreview?(rbs) → superreview+
Comment on attachment 122738: patch for review

Thank you for r/sr, prabhat and rbs. Asking for approval (a=). This is not part of the default build, so it has no effect on it. When CTL is enabled, it increases the code size a bit, but otherwise it has little interaction with other parts.
Attachment #122738 - Flags: approval1.4?
Comment on attachment 122738: patch for review

a=asa (on behalf of drivers) for checkin to 1.4
Attachment #122738 - Flags: approval1.4? → approval1.4+
fix checked in. Thanks all. Now time to get in the patch for bug 176290 :-)
Status: ASSIGNED → RESOLVED
Closed: 22 years ago
Resolution: --- → FIXED
Component: Layout: CTL → Layout: Text
QA Contact: amyy → layout.fonts-and-text