ZWJ, ZWNJ with devanagari characters does not display correct glyphs

RESOLVED FIXED

defect
17 years ago
11 years ago

People

(Reporter: alkuma, Assigned: prabhat.hegde)

Tracking

({intl})

Trunk
x86
All

Attachments

(4 attachments)

User-Agent:       Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.3) Gecko/20030312
Build Identifier: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.3) Gecko/20030312

No way of knowing if this form will be submitted in utf-8 encoding. 
Please see description of bug #2 and bug #3 on
http://groups.google.com/groups?dq=&hl=hi&lr=&ie=UTF-8&oe=UTF-8&selm=BAY1-DAV15mrEkkRv6o0006c891%40hotmail.com

Bug #2: Mozilla (and Composer) doesn't understand or display the
ZeroWidthJoiner and ZeroWidthNonJoiner combinations correctly:

1. क् + त = क्त (a glyph of kta)
2. क् + ZWJ + त = क्‍त (half-ka & ta)
3. क् + ZWNJ + त = क्‌त (ka-halant & ta)

Bug #3: Mozilla (and Composer) doesn't understand or display the Marathi
crescent-R sequence correctly:

र् + ZWJ + य = र्‍य (crescent-R & ya)

In Mozilla, the crescent-R is only displayed as an
r-halant-r-halant-consonant combo. For example: र्र्ह. (i.e. र् + र् + ह)
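
For reference, the sequences described in bug #2 and bug #3 can be spelled out by code point. This is a minimal sketch, not part of the original report: the variable names and the `code_points` helper are illustrative, and the labels simply repeat the wording used above. How each sequence is drawn still depends on the font and the shaping engine.

```python
# Code points for the characters discussed in this report.
KA, TA, RA, YA = "\u0915", "\u0924", "\u0930", "\u092F"
VIRAMA = "\u094D"                # the halant
ZWJ, ZWNJ = "\u200D", "\u200C"   # zero width joiner / non-joiner

# Labels follow the wording of bug #2 and bug #3.
SEQUENCES = {
    "kta glyph": KA + VIRAMA + TA,
    "half-ka & ta": KA + VIRAMA + ZWJ + TA,
    "ka-halant & ta": KA + VIRAMA + ZWNJ + TA,
    "crescent-R & ya": RA + VIRAMA + ZWJ + YA,
}

def code_points(text):
    """Return the text as space-separated U+XXXX code points."""
    return " ".join(f"U+{ord(ch):04X}" for ch in text)

for label, seq in SEQUENCES.items():
    print(f"{label}: {code_points(seq)}")
```

The three ka/ta sequences differ only in the invisible joiner after the halant, so any layer that drops that character makes them indistinguishable.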

Reproducible: Always

Steps to Reproduce:
1. Make sure you are able to see devanagari using
http://geocities.com/alkuma/seehindi.html
2. View
http://groups.google.com/groups?dq=&hl=hi&lr=&ie=UTF-8&oe=UTF-8&selm=BAY1-DAV15mrEkkRv6o0006c891%40hotmail.com
using IE6 on Win2k or XP with Indic installed - description of bugs 2 and 3
3. View the same url with Mozilla 1.3 - they are not displayed properly.

Actual Results:  
On Linux RH 7.1 with indix using Moz 1.3,
the glyphs for ka + halant + ta with ZWJ before ta are not displayed
correctly.

On XP with Indic, using Moz 1.1,
ka + halant + ta with (a) ZWJ before ta and (b) ZWNJ before ta are not displayed
correctly.
ra + halant + ZWJ + ya is not displayed correctly.



Expected Results:  
View the same page on IE6.
See
  
http://lxr.mozilla.org/seamonkey/source/layout/html/base/src/nsTextTransformer.cpp#1080

Before the rendering routines (implementations of
nsIFontMetrics: nsFontMetricsXft, nsFontMetricsWin,
nsFontMetricsOS2, nsFontMetricsMac, and so forth) see ZWNJ/ZWJ,
it seems to be stripped away in nsTextTransformer. The comment
in nsTextTransformer.cpp says that we need to strip away
ZWNJ/ZWJ for non-Arabic text because they may appear in
pre-shaped text. I'm not sure exactly what that means.
I'll take a look at bug 192088.


Mozilla-Win under Windows XP/2k should be able to handle ZWJ/ZWNJ
correctly for Indic scripts (see bug 166520), but they're stripped 
away so that they're just lost before reaching nsFontMetricsWin.
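
The effect of the stripping described above can be modelled in a few lines. This is a hypothetical sketch, not the nsTextTransformer code: the function names are invented, and treating "Arabic" as the U+0600..U+06FF block is an assumption made only for illustration.

```python
ZWJ, ZWNJ = "\u200D", "\u200C"

def contains_arabic(text):
    # Assumption: "Arabic" is approximated by the basic Arabic block.
    return any("\u0600" <= ch <= "\u06FF" for ch in text)

def strip_joiners_for_non_arabic(text):
    """Hypothetical model of the stripping step: drop ZWJ/ZWNJ unless
    the text contains Arabic characters."""
    if contains_arabic(text):
        return text
    return text.replace(ZWJ, "").replace(ZWNJ, "")

kta = "\u0915\u094D\u0924"                 # ka + halant + ta
half_ka_ta = "\u0915\u094D\u200D\u0924"    # ka + halant + ZWJ + ta
ka_halant_ta = "\u0915\u094D\u200C\u0924"  # ka + halant + ZWNJ + ta

# After stripping, all three sequences are the same string, so the
# font backend has no way to choose a different glyph form.
print(strip_joiners_for_non_arabic(half_ka_ta) == kta)
print(strip_joiners_for_non_arabic(ka_halant_ta) == kta)
```

Under this model the three Devanagari sequences collapse into one before any platform text API is called, which matches the symptom reported for bug #2.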
Status: UNCONFIRMED → NEW
Ever confirmed: true
Keywords: intl
Depends on: 192088
With the patch for bug 192088 backed out, bug #2 is fixed under
Windows 2k/XP. Please note that this issue (a part of complex script
rendering) is highly dependent on platform/OS and toolkit, so
working correctly under Win2k/XP does NOT mean that it also works
for Mozilla-Xft, Mozilla-MacClassic, Mozilla-Win under Win9x/ME,
Fizilla, etc. For example, Mozilla-Xft does not handle ZWJ/ZWNJ
correctly, let alone render Indic scripts.

What is strange is that bug #3 is not fixed (in the sense that MS IE
and Mozilla render the example given in bug #3 differently). At the moment,
Mozilla-Win doesn't do anything special for complex script rendering (with the
possible exception of BIDI handling). That is,
it delegates most of the work to the Win32 standard text APIs. On the
other hand, MS IE apparently does some of the work on its own
(more exactly, it uses a different set of APIs: Uniscribe APIs
instead of Win32 standard text APIs). That may be the cause
of the difference in bug #3. If that's the case, I'm afraid there's
not much Mozilla can do, because the Win32 standard text APIs are supposed
to handle this. Of course, Mozilla-Win could switch over to
using Uniscribe APIs as MS IE does, but I guess that's beyond the
scope of this bug.

> No way of knowing if this form will be submitted in utf-8 encoding.
If you set the encoding (Character Coding in Mozilla) to UTF-8 before
submitting the form, you can enter UTF-8 text into Bugzilla without
a problem.
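
The reason the form encoding matters can be shown with a small round-trip check. This is only a sketch: the `survives` helper is illustrative, and ISO-8859-1 stands in for any legacy encoding that cannot represent these characters.

```python
ZWJ = "\u200D"
half_ka_ta = "\u0915\u094D" + ZWJ + "\u0924"  # ka + halant + ZWJ + ta

def survives(text, encoding):
    """Return True if text round-trips through the given encoding unchanged."""
    try:
        return text.encode(encoding).decode(encoding) == text
    except UnicodeEncodeError:
        return False

# UTF-8 represents every code point, including the invisible joiner.
print(half_ka_ta.encode("utf-8").hex())
print(survives(half_ka_ta, "utf-8"))
# A single-byte legacy encoding cannot represent Devanagari or ZWJ.
print(survives(half_ka_ta, "iso-8859-1"))
```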
The ZWJ problem also exists in Mozilla-1.7.2 for Malayalam characters. I am
using RH9.0 with Pango-1.5.2 and mozilla-1.7.2. Gedit renders the
consonant+virama+ZWJ combination well, but Mozilla displays them split,
with a rectangle for the ZWJ. Why has the problem not been handled yet?
 
Mozilla-1.7.2 doesn't rely on Pango, so having Pango 1.5.2 installed doesn't
help you in any way. If you want to take advantage of Pango, you have to do one
of the following two:

1. get the trunk source from the CVS and compile with '--enable-pango'

2. get either the trunk source or 1.7 branch source and apply my patch for bug
215219 and compile with '--enable-xft' 

With the new textframe code, I get the same result in IE 6 and Fx.
Also, it's now possible to include the text to display inside the bug:

1. क् + त = क्त (a glyph of kta)
2. क् + ZWJ + त = क्‍त (half-ka & ta)
3. क् + ZWNJ + त = क्‌त (ka-halant & ta) 

र् + ZWJ + य = र्‍य (crescent-R & ya) 
र्र्ह. (i.e. र् + र् + ह) 
The presence of a ZWNJ messes up word rendering completely in Malayalam. Screenshots of how text with a ZWNJ renders in Firefox 2 and 3 are attached.

You can test this URL ( http://fci.wikia.com/wiki/SMC ) in both versions of the browser.
Posted image zwnj display in ff3
messed up display in firefox 3 when a ZWNJ occurs (quite common in Malayalam words)
Posted image zwnj display in ff2
correct display of zwnj in firefox 2 with pango enabled (Fedora Core 8)
I'm not so sure that what Praveen is seeing is a dup of 405393. Praveen is seeing a Windows bug, and 405393 so far seems to be Mac/Linux only.

In the attachment here, the display of the characters that *are* displayed seems much better under fx3 than fx2. I personally don't see a single character missing on the SMC page with fx3.

The screen capture Praveen added on bug 405393 shows that his computer does not display "ല്‍" properly in the string "ത്തില്‍ നിന്നും" (also "ര്‍ത്ഥ" and "ണ്‌" are broken inside "ഭക്ഷണപദാര്‍ത്ഥമാണ്‌ പൊറോട്ട").
It works for me under Windows XP.
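
To check whether a copied string still carries its joiners, the invisible characters can be listed directly. This is a minimal sketch: the `find_joiners` helper is illustrative, and the two sample strings are built from code points on the assumption that the problem clusters are consonant + virama + ZWJ and consonant + virama + ZWNJ.

```python
import unicodedata

JOINERS = {"\u200C", "\u200D"}  # ZWNJ, ZWJ

def find_joiners(text):
    """Return (index, Unicode name) for every ZWJ/ZWNJ in text."""
    return [(i, unicodedata.name(ch))
            for i, ch in enumerate(text) if ch in JOINERS]

# Assumed encodings of two of the clusters mentioned above.
la_virama_zwj = "\u0D32\u0D4D\u200D"     # la + virama + ZWJ
nna_virama_zwnj = "\u0D23\u0D4D\u200C"   # nna + virama + ZWNJ

print(find_joiners(la_virama_zwj))
print(find_joiners(nna_virama_zwnj))
```

An empty result for text copied from a page would suggest the joiners were lost before display, rather than a font problem.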

Two possibilities : 
- Praveen has no font installed to display "ല്‍", "ര്‍ത്ഥ" and "ണ്‌", or 
- Fx is broken WRT the font he is using.

Praveen, what result do you get for those words with IE?
If it's broken too, then it's probably a problem on your computer.
If it does work, then try to copy/paste "ഭക്ഷണപദാര്‍ത്ഥമാണ്‌ പൊറോട്ട" into WordPad, and display it at a large size using every Malayalam font you have installed, to find out by visual comparison which font IE and Fx are trying to use to display Malayalam.

If WordPad cannot display the string properly using any of your installed fonts, then it's a problem on your side; you need better fonts.


Sorry, forget about the above comment; I got things seriously confused.
Do we still have this issue on Mac? The Linux issue is resolved and I think it is not an issue on Windows. So we could close this if it is no longer an issue on Mac. Can someone test this on Mac?
On Mac I think this was fixed by bug 396137 at least for AAT fonts. See also bug 408897.
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
(In reply to comment #12)
>  I think it
> is not an issue on Windows. 

It is still an issue on Windows. Tested on Firefox 2.0.0.11.

Attaching screenshots using the same font (Mangal) on Windows XP.

This attachment and the previous one for 2.0.0.11 show the difference in rendering between Firefox 2.0.0.11 and IE6 on Windows XP, with the IE6 rendering being correct when the same font is used in both browsers.
This problem is fixed in the Fx 3.0 beta and nightly releases.
It will not be fixed for any release of Fx 2.0
Component: Layout: CTL → Layout: Text
QA Contact: arthit → layout.fonts-and-text