Closed Bug 205476 Opened 17 years ago Closed 13 years ago

Devanagari(Hindi,etc), Telugu, Bengali, Tamil and other Indic scripts support

Categories

(Core Graveyard :: GFX: Mac, defect)

PowerPC
macOS
defect
Not set

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: denty, Assigned: jaas)

References

()

Details

(Keywords: intl)

Attachments

(10 files, 1 obsolete file)

User-Agent:       Mozilla/5.0 (Macintosh; U; PPC Mac OS X Mach-O; en-US; rv:1.4b) Gecko/20030501 Camino/0.7+
Build Identifier: Mozilla/5.0 (Macintosh; U; PPC Mac OS X Mach-O; en-US; rv:1.4b) Gecko/20030501 Camino/0.7+

In Devanagari (Hindi) fonts, the short E character, choti E matra, should be
rendered before the character it follows.

This thread started in http://bugzilla.mozilla.org/show_bug.cgi?id=202667 but it
appears it may be an integration issue with Mac OS X.

The attachments to the above bug demonstrate the mis-rendering.

- Mozilla's rendering (and Camino on OS X demonstrates the same):
http://bugzilla.mozilla.org/attachment.cgi?id=121086&action=view
- Correct rendering http://bugzilla.mozilla.org/attachment.cgi?id=121087&action=view

OS X can natively render Devanagari correctly. Indeed, the OS X title bar often
displays the title of a BBC story correctly where it is incorrectly displayed on
the page.

Explanation:

In Hindi, mozilla is written mozilA (phonetically) but it must be displayed
moizlA (in glyph terms) because the short i (choti E matra) is drawn before the
character it follows. This differs from the long i (bari E matra) and most
other glyphs, which are drawn after the character they follow, as you would expect.
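
The reordering described above can be sketched in a few lines. This is only an illustration, not how ATSUI or Gecko actually shape text: the function name `ToVisualOrder` and the helper `IsDevanagariConsonant` are made up for this example, and it assumes the simplest possible cluster (one consonant, optionally followed by a nukta). Real Devanagari shaping also has to handle conjuncts, half-forms and virama, which this sketch ignores.

```cpp
#include <cstddef>
#include <string>

// Devanagari code points used in this sketch.
constexpr char16_t kVowelSignI = 0x093F;  // short i (choti i matra)
constexpr char16_t kNukta = 0x093C;       // the dot that turns ja into za

// Hypothetical helper: true for the basic and additional Devanagari consonants.
bool IsDevanagariConsonant(char16_t c) {
  return (c >= 0x0915 && c <= 0x0939) || (c >= 0x0958 && c <= 0x095F);
}

// Hypothetical helper: moves each short-i vowel sign in front of the
// consonant (plus optional nukta) that it follows in logical order.
// Simplification: conjuncts joined by virama are NOT handled.
std::u16string ToVisualOrder(const std::u16string& logical) {
  std::u16string visual;
  for (char16_t c : logical) {
    if (c != kVowelSignI) {
      visual.push_back(c);
      continue;
    }
    std::size_t pos = visual.size();
    if (pos > 0 && visual[pos - 1] == kNukta) {
      --pos;  // step over the nukta so it stays attached to its consonant
    }
    if (pos > 0 && IsDevanagariConsonant(visual[pos - 1])) {
      --pos;  // insert in front of the consonant
    } else {
      pos = visual.size();  // no consonant to reorder around; leave in place
    }
    visual.insert(pos, 1, c);
  }
  return visual;
}
```

For the word in question, the logical sequence ma, o, ja, nukta, i, la, aa (U+092E U+094B U+091C U+093C U+093F U+0932 U+093E) comes out with U+093F moved in front of U+091C, which matches the "moizlA" glyph order described above.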

In the thread above, Jungshik Shin (Cc'd) had some questions:

Just out of curiosity, does Camino take
advantage of AAT fonts and ATSUI? If not, could you tell
me what approach you're taking to enable Devanagari rendering
in Camino?

I don't know enough to answer his queries but I am happy to help in
testing/resolving this bug.

Reproducible: Always

Steps to Reproduce:
1. Go to the BBC website, http://www.bbc.co.uk/hindi/
Actual Results:  
Incorrect rendering of choti E matra. The glyph is rendered after the glyph it
modifies when it should precede it.

Expected Results:  
The glyph should have preceded it. Interestingly, a cut/paste of the bad text
into OS X's textpad results in the correct rendering of the text.
Can you take a screenshot of attachment 122383 [details] (attached to bug 204286)
and attach it here? I suspect it's not just the reordering problem
with 'choti e (i) matra'. 

The test page explicitly specifies the font.
To render the page with Devanagari fonts on your system,
you have to disable font overriding by web pages
in Edit|Preference|Appearance|Fonts|Hindi.

If my memory serves me right, ftang mentioned a couple of years ago that
Mozilla didn't use ATSUI, but that was probably about MacOS 9.
What does Camino use to 'draw strings'? 

Keywords: intl
Mozilla and Camino share the same text rendering code now (which uses QuickDraw,
with ATSUI fallbacks). It's probably the case that rendering with ATSUI would
fix this.
Status: UNCONFIRMED → NEW
Component: OS Integration → GFX: Mac
Ever confirmed: true
Product: Camino → Browser
Version: unspecified → Trunk
Shows three applications (textedit, camino and mozilla) rendering the word
Mozilla in Hindi.
The HTML generated by Mozilla's composer.
Attached file Mozilla in Hindi (RTF)
The RTF generated by Mac OS X's textedit application.
In the attached screengrab, from left to right:

1. Mac OS X's textedit application: this is the control and the rendering is
correct.
2. Camino: the o and a dependent vowels are rendered correctly following their
associated consonant. The i is also rendered following its associated consonant
where it should precede it. (Actually, the dot is also rendered incorrectly -
this is a pain but it is a matter of taste as to whether to distinguish between
z and j and so doesn't occur so often.)
3. Mozilla's composer window: same behavior as Camino.

Procedure:

1. The text was written in textedit using OS X's native multi-lingual input
method and saved as RTF.
2. The text was copied from textedit and pasted into Mozilla's composer and
saved as HTML.
3. The HTML file was opened in Camino.
Re. comment #1, checking in my preferences for Camino I find no Hindi option.
> It's probably the case that rendering with ATSUI would
> fix this.

 As sfraser wrote, I guess Devanagari shaping will be taken care of
by ATSUI. This includes Virama positioning, reordering of vowel sign
'I' and other vowels, consonant conjuncts (your example didn't include these),
handling of halant and half-forms and so forth. The fact that TextEdit
can handle all of them is a good sign that Camino can also do it by invoking
ATSUI APIs at the right place. To be clear, ATSUI is just my guess (based on
sfraser's comment and some pieces of info I happened to have) because I don't
have access to a Mac OS development environment.

  BTW, I'm changing the summary line to be more generic because it's not just 
about vowel sign 'I' but about Devanagari script in general. 

  As for Hindi font-pref. menu in Camino, I should have known that Camino is 
different from Mozilla in some UI. If you want to test attachment 122383 [details], you 
can edit the file to remove the font specification. Perhaps, I'll upload a new 
test file with the font-spec. removed. 



Summary: Devanagari (Hindi) fonts rendered incorrectly (choti e (i) matra) → Devanagari (Hindi) script support
Loaded attachment 122383 [details] into Camino and, as you suggest, it highlights a number
of rendering shortcomings. (Tried removing the font-spec to no noticeable effect.)

Textedit renders the copy/pasted content properly.
It's natural that font-spec doesn't affect the result (I realize that my prev.
comment about it was misleading.) because Camino currently does not take
advantage of  ATSUI (responsible for the rendering of complex scripts like
Devanagari, Tamil, Korean among other things) for Devanagari. 
Generated by Camino build 2003050105.
I thought I'd see if I could dig into this one and here are the results.

I tried to prove Jungshik's theory that ATSUI could handle the rendering of the
complex script alone. And it appears that it can.

I wasn't able to get the BBC site to render correctly without modification. (I
haven't posted the results of modifying one of their pages for copyright
reasons.) However, it appears that removing the style-sheet reference is enough
to get the site to work. I don't know why this should be the case...? Anyhow,
compare attachments #123592 and #123593.

Perhaps more importantly, the rendering of mozilla in Hindi now looks as
it should - see attachment #123594 [details].

The 'patch' I uploaded is not intended to show how it should be done (lol). I
don't really understand what's going on enough to write an ideal implementation.
Also, the continued misrendering of the unmodified BBC site suggests there is
something else going on too.
Following up to my previous comment regarding the BBC's Hindi site not rendering
properly without removing the stylesheet reference. Having removed the fonts
Mangal and Raghu, it appears to render the text itself slightly better (although
it fails to compute the font spacing correctly).
Wow, it's great that it works with that simple patch.
 
> The 'patch' I uploaded is not intended to show how it should 
> be done (lol)

  Why don't you ask sfraser for help with this? BTW, your patch is a bit hard to
read because it uses mixed style diff. Can you make your patch with '-u' option? 


> Also, the continued misrendering of the unmodified BBC site 
> suggests there is something else going on too.

  I guess ATSUI doesn't understand the GSUB/GPOS tables in the Mangal and Raghu fonts.
However, ATSUI does know how to take advantage of the AAT fonts that are included in
MacOS X. Does TextEdit work correctly with either Mangal or Raghu?

In the stylesheet used at BBC Hindi site, Mangal and Raghu are specified
explicitly. You can configure Mozilla (I don't know whether Camino has a similar
feature) NOT to honor the server-side font specification (in CSS). Go to
Pref|Appearance|Font|Hindi and make sure 'Allow document to use other fonts' is
NOT checked. Also specify Devanagari fonts (the ones that come with MacOS X; there
are a couple of them, right?) for Hindi.
With that option NOT checked, my guess is that BBC Hindi site would render well
with your patch applied even if Mangal and Raghu are present.

Adds ENABLE_ATSUI_PRIMARY option (similar in form to DISABLE_*_FALLBACK) that
selects ATSUI rendering ahead of Mozilla-native support.

Seems to do the trick for Devanagari and doesn't appear to unduly affect other
languages. Doesn't seem to help for Arabic/Urdu, though, which still looks
somewhat odd to my eyes. Another story, I guess.

Interested in comments on how to move forward on this one.

d.
Attachment #123596 - Attachment is obsolete: true
> Does TextEdit work correctly with either Mangal or Raghu?

I hadn't tried this until you suggested it but, you're right: TextEdit does not
work with Mangal. Also, the OS doesn't seem to be recognising Raghu at all. I
brought it over from Linux some time ago so it's possible it's incompatible.
> TextEdit does not work with Mangal. 

This is an indication that ATSUI doesn't take advantage of GSUB/GPOS tables.
It's not definitive, though. ATSUI appears to need mort(?) tables present in AAT
fonts.
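
One way to check this guess for a given font file is to look at which layout tables it carries. The sketch below walks the sfnt table directory (a 12-byte header followed by 16-byte records that each start with a 4-byte tag) and reports whether a tag such as 'mort', 'GSUB' or 'GPOS' is present. `HasSfntTable` is a name invented for this example, and it assumes the buffer holds a bare TrueType/OpenType file rather than a collection or a resource-fork font. It only tells you that a table exists, not whether ATSUI will actually use it.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper: returns true if the sfnt table directory in `font`
// contains a table with the given 4-character tag (e.g. "mort", "GSUB").
// Assumes `font` is a single bare sfnt file, not a TTC or .dfont wrapper.
bool HasSfntTable(const std::vector<std::uint8_t>& font, const std::string& tag) {
  if (tag.size() != 4 || font.size() < 12) {
    return false;
  }
  // numTables is a big-endian uint16 at byte offset 4 of the header.
  const std::size_t numTables =
      (static_cast<std::size_t>(font[4]) << 8) | static_cast<std::size_t>(font[5]);
  for (std::size_t i = 0; i < numTables; ++i) {
    const std::size_t rec = 12 + i * 16;  // each directory record is 16 bytes
    if (rec + 16 > font.size()) {
      return false;  // truncated directory
    }
    // The first 4 bytes of the record are the ASCII table tag.
    if (std::equal(tag.begin(), tag.end(), font.begin() + rec)) {
      return true;
    }
  }
  return false;
}
```

A font like Mangal would be expected to report GSUB/GPOS but no 'mort' (or its extended form 'morx'), which would be consistent with the behavior seen in TextEdit, though that is still an inference rather than proof.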

> Also, the OS doesn't seem to be recognising Raghu at all.

Perhaps, there's something MacOS X doesn't like about it. Is there a font
validator (similar to the one available for Windows from Microsoft VOLT
community) for Mac OS X? If there's no such tool and  you have a Windows box
around, you may download 'font validator?' (you have to subscribe to VOLT
community : http://www.microsoft.com/typography) to test Raghu.
Regarding ATSUI and the need for mort tables, yes, this article would tend to
support your suggestion. http://developer.apple.com/fonts/WhitePapers/IUC15CG.pdf

The Mangal font is one written by Microsoft.

Regarding the rogue Raghu font. I wasn't able to find a font validator for OS X
in a casual search of the web. Font Manager Pro (trial version) didn't seem to
recognise it and so I'm removing it. I don't think it's pertinent to this bug though.

So, is ATSUI the right way to go for the Mozilla/Camino renderer? I confess to
being a bit stuck as to what to do now. I don't think I know enough about fonts
to register the Mangal mis-rendering by ATSUI as a bug with Apple.
Simon,

I wonder if you could do a code review of the patch I submitted.

Thanks,
denty.
Comment on attachment 123815 [details] [diff] [review]
Try ATSUI rendering before other methods

I don't have a Mac (yet). So, I just had a casual look at the first third of
the patch and here's my comment. 

>--- nsATSUIUtils.h.~1.10~	Fri Aug 16 01:54:03 2002
>+++ nsATSUIUtils.h	Sun May 18 14:26:11 2003

>+  static
>+  nsresult nsATSUIToolkit::ConvertToUnicode(const char *in, PRUint32 inLen, PRUnichar *&out);

 Do you really need this function? What it does is just convert a MacRoman
char[] to PRUnichar[], right? And you're calling it from functions that I
believe work only on US-ASCII (e.g. GetWidth(const char* aString,....) is not
meant for anything other than US-ASCII, afaik).


>--- nsRenderingContextMac.cpp.~1.152~	Sat Apr  5 00:37:31 2003

> nsRenderingContextMac::GetWidth(const char* aString, PRUint32 aLength, nscoord& aWidth)
.....
>+    PRUnichar *uString;
>+    nsresult res = nsATSUIToolkit::ConvertToUnicode (aString, aLength, uString);
.....

If I'm right above (I could well be wrong not having programmed for Mac),
there's no need for this.
If all this patch does is make Mozilla use ATSUI primarily, then there are other
patches which are both cleaner, and more efficient in bug 121540.

Any changes to text rendering require extensive page load performance tests to
ensure that they don't degrade performance.
Hi guys,

Thanks for the comments.

You're right in that nsATSUIToolkit::ConvertToUnicode is certainly non-optimal
and not an ideal thing to be doing (twice). Nonetheless, something similar is
necessary as ATSUI doesn't have any native char */US-ASCII rendering code - only
UTF-16 and a whole bunch of converters. #121540 talks about a similar problem.

Taking your comments on board and looking through #121540, I am of the opinion
that whether or not to do a wholly-ATSUI rendering implementation for OS X is
more a policy/design decision than a bug. (Of course, the matter of mis-rendering
Devanagari is still a bug.)

With that in mind, - and to lay my cards on the table, I am keen to pursue an
ATSUI approach - is this the best forum to discuss this on?

Cheers,
d.
Can |GetWidth|, |DrawString| and so forth with 'char *' just use non-ATSUI APIs
while ATSUI APIs are used for Unicode strings? They belong to separate code
paths, don't they?

Even if that's not the case, you don't have to use the TextConverter API. You can
just use CopyASCIItoUTF16, ToNewUnicode or similar because all you need is
a byte expansion of ASCII char*.
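
For readers unfamiliar with what "byte expansion" means here, the following standalone sketch shows the idea: every ASCII byte maps to the UTF-16 code unit with the same value, so no converter object is needed. `ExpandAsciiToUtf16` is a hypothetical name used only for this example; it is not the Mozilla string API mentioned above, and it assumes the caller guarantees the input really is US-ASCII (bytes of 0x80 and above would be widened as-is, which is not a correct MacRoman conversion).

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper, not a Mozilla API: widens each ASCII byte to one
// UTF-16 code unit. Input is assumed to be pure US-ASCII.
std::u16string ExpandAsciiToUtf16(const char* in, std::size_t len) {
  std::u16string out;
  out.reserve(len);
  for (std::size_t i = 0; i < len; ++i) {
    // Go through unsigned char so a byte >= 0x80 is not sign-extended.
    out.push_back(static_cast<char16_t>(static_cast<unsigned char>(in[i])));
  }
  return out;
}
```

Because the mapping is one byte to one code unit, the output length always equals the input length, which is what makes this cheaper than a general-purpose text converter.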

Anyway, we have to resolve bug 121540 first. Then, there'll be very little to do
here.
Depends on: atsui
*** Bug 249618 has been marked as a duplicate of this bug. ***
Summary: Devanagari (Hindi) script support → Devanagari(Hindi,etc), Telugu, Bengali, Tamil and other Indic scripts support
Bug #249618 is about Firefox and Mozilla not displaying any Indic text at all.
See the screenshot there of the Hindi Wikipedia in Firefox and Safari. Safari
displays it OK and Firefox displays ????? (question marks) all over the place.
I have verified the same behavior with your test site, BBC Hindi, as well.
(Oh, btw, Safari has the problem of the short i not being displayed before the
character.)

In short, as far as I can see, Indic character support in Mozilla on OS X seems
non-existent.
Please increase the severity of this bug to "major feature is broken", the major
feature being Indic support. I think international text support is arguably a
major feature.
I want to say again that Bug #249618 (the bug I reported) is not about a vowel
being in the wrong place (that's what most of the discussion here seems to be about)
but about Mozilla not displaying *any Indic text at all*.

Thanx
Spundun
I am sorry to have caused you guys trouble. On my machine the issue that I had
got fixed. The cause was different... I had my system frameworks hosed up.
Here is what I typed on the IRC channel about it:
probably you guys don't care but here is the story: me running out of room on the
main partition moved /Developer directory to separate partition /Volumes/Shared
and softlinked it... when I installed Java SDK it replaced that softlink with
new /Developer directory and put the java stuff in... so the other got left
dangling in the other partition

Safari still managed to work with the framework messed up.
Thanx.
p.s.: now I do see the problem about the i vowel being in the wrong place. Hope it
gets fixed soon... Good luck.
(In reply to comment #18)
> Created an attachment (id=123815)
> Try ATSUI rendering before other methods
You are the man denty!!..

That patch works great (hold on! not perfect! :( )... It displays both Gujarati and Devanagari text and
doesn't have all the spelling errors in rendering (i.e. sequencing of glyphs is great). So this patch is a
huge step forward for me :).

However it screws up the smoothness of all the text, including English. It also screws up the spacing
between English characters. I will attach the screenshots of the gu.wikipedia.org main page for Safari,
Firefox 1.0PR and Firefox with your patch... so you can compare.

Thanx for the great work... let me know if you want me to test anything else.

Spundun
oh and If you want to know what build config I used... I have those posted in
bug #263212
Mozilla/5.0 (Macintosh; U; PPC Mac OS X Mach-O; rv:1.7.3) Gecko/20040913
Firefox/0.10.1
Hi Spundun,

Thanks for the encouragement. I have to be honest - I kind of gave up on this
one some time ago. I came to the conclusion that I would have to do quite a lot
of work in other areas before I could make my own patch (a) work properly and
(b) be performant. That was my own conclusion - perhaps if any others are
reading this who have more knowledge about Gecko, you could confirm this???

If the amount of work is smallish, I could pick it up again.

I was quite happy about the rendering (although the spacing issues were a bit
annoying). It was more a proof of concept than anything else.

d.
Why doesn't attachment (id=123815) work correctly?

ATSUI requires applications to create an ATSUTextLayout.
An ATSUTextLayout holds information about spacing, alignment,
justification, ligatures and so on.

To build that information, the framework needs the "real" string.
Apple's documentation on creating an ATSUTextLayout says:
> For example, with a buffer of less than a full paragraph,
> ATSUI can neither reliably obtain the context for bidirectional
> processing nor reliably generate accent attachments and ligature
> formations for Roman text.
http://developer.apple.com/documentation/Carbon/Reference/ATSUI_Reference/atsu_reference_Reference/function_group_5.html#//apple_ref/doc/uid/TP30000309/F00224

When nsATSUIUtils.cpp creates an ATSUTextLayout, it passes a local array
(this may be a null-terminated string) to the ATSUI runtime.
http://lxr.mozilla.org/mozilla/source/gfx/src/mac/nsATSUIUtils.cpp#263
After that, nsATSUIUtils.cpp adds the font properties.
http://lxr.mozilla.org/mozilla/source/gfx/src/mac/nsATSUIUtils.cpp#338
Then nsATSUIUtils.cpp caches this ATSUTextLayout object.
http://lxr.mozilla.org/mozilla/source/gfx/src/mac/nsATSUIUtils.cpp#356
When nsATSUIUtils.cpp needs an ATSUTextLayout object, it reuses one
from the cache. The lookup keys are fontNum, size, bold,
italic and color. The string is not part of the key.
http://lxr.mozilla.org/mozilla/source/gfx/src/mac/nsATSUIUtils.cpp#260
If this works correctly, it is magic.
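
The concern in this comment can be illustrated with a toy cache. This is a sketch of the worry only, under the assumption that nothing refreshes the text on a cache hit; it is not the actual nsATSUIUtils.cpp code, and `StyleKey`, `Layout` and `StyleOnlyLayoutCache` are names invented for the example (`Layout` is a plain stand-in for an ATSUTextLayout).

```cpp
#include <map>
#include <string>
#include <tuple>

// Hypothetical cache key: style attributes only, as described in the comment.
struct StyleKey {
  short fontNum;
  short size;
  bool bold;
  bool italic;
  unsigned color;

  bool operator<(const StyleKey& o) const {
    return std::tie(fontNum, size, bold, italic, color) <
           std::tie(o.fontNum, o.size, o.bold, o.italic, o.color);
  }
};

// Stand-in for an ATSUTextLayout: remembers the text it was built from.
struct Layout {
  std::u16string text;
};

// Toy cache keyed on style only. On a hit, the text passed in is ignored,
// so the returned layout still describes whatever string created it.
class StyleOnlyLayoutCache {
 public:
  const Layout& Get(const StyleKey& key, const std::u16string& text) {
    auto it = cache_.find(key);
    if (it == cache_.end()) {
      it = cache_.emplace(key, Layout{text}).first;
    }
    return it->second;
  }

 private:
  std::map<StyleKey, Layout> cache_;
};
```

With this toy cache, asking for the same style twice with two different strings returns a layout built from the first string both times. A real implementation would have to either rebind the text on every hit or include the text in the key, and whether the Mozilla code rebinds it elsewhere is not established by this thread.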

The core programmers know all about this.
The first report in bug 157967 says,
> Right now, when rendering with ATSUI, we're driving
> a paragraph-layout-API with
> word-sized chunks of text, which is very inefficient.

To use ATSUI, we may need a new architecture.
Also see bug 167469. The two bugs are similar but filed under different components.
*** Bug 167469 has been marked as a duplicate of this bug. ***
Assignee: sfraser_bugs → joshmoz
QA Contact: chrispetersen → mac
*** Bug 314355 has been marked as a duplicate of this bug. ***
I am able to see Devanagari just fine in Safari, but in firefox 1.5rc2 I see nothing but question marks wherever there should be Devanagari script. I have tried setting the font preferences, but this doesn't help. I've tried manually changing the encoding, but that doesn't help. It seems to be a very serious bug.
Flags: blocking1.8.1?
*** Bug 327566 has been marked as a duplicate of this bug. ***
*** Bug 334066 has been marked as a duplicate of this bug. ***
*** Bug 339443 has been marked as a duplicate of this bug. ***
Minusing for now since we probably wouldn't block 1.8.1 for this.  Having said that, Simon or Josh, any chance you could get a patch together?
Flags: blocking1.8.1? → blocking1.8.1-
(In reply to comment #44)
> Minusing for now since we probably wouldn't block 1.8.1 for this.  Having said
> that simon or Josh any chance you could get a patch together?
> 

This bug depends (whatever that means) on bug #121540. Yamashita Makoto has already created a patch for that bug, and if you apply that patch then Mozilla rendering for Indic scripts works great! I have tested with Devanagari and Gujarati.

Just apply his patch and close these two bugs pleeease! :)
Do we need another patch for this?  What is wrong with the current one?  This seems important to get in as soon as possible as India is a big and important user base!
See comment #31, comment #35 and comment #36. This patch is old, so maybe there has been some new development and it may now just work. Who knows. I haven't tested this patch with Firefox for over a year. I would be more than happy to test it for Hindi and Gujarati if someone created a build for me.

On the other hand, bug #121540 has a patch begging to be reviewed. The patch is not trivial but has been thoroughly cleaned up afaik. So either way, as soon as one of the patches goes in, I will be very happy.
fixed on trunk by cairo
Status: NEW → RESOLVED
Closed: 13 years ago
Resolution: --- → FIXED
Duplicate of this bug: 338659
Duplicate of this bug: 333070
Product: Core → Core Graveyard