Open Bug 324615 Opened 19 years ago Updated 2 years ago

ZWNJ is stripped from LTR languages causing incorrect rendering in languages like Myanmar

Categories

(Core :: Layout: Text and Fonts, defect)

x86
All
defect

People

(Reporter: devel, Unassigned)

Details

(Keywords: intl, rtl)

Attachments

(1 file)

User-Agent:       Mozilla/5.0 (X11; U; Linux i686; my-MM; rv:1.8) Gecko/20060113 Firefox/1.5
Build Identifier: Mozilla/5.0 (X11; U; Linux i686; my-MM; rv:1.8) Gecko/20060113 Firefox/1.5

ZWNJ (and ZWJ) is stripped from Myanmar text, causing the text to be rendered as if it were a completely different word. The problem arises with U+1039 U+200C + consonant. If the U+200C is present, the U+1039 is visible and the following consonant is rendered normally. If the U+200C is stripped, the consonant is rendered differently, usually underneath the preceding consonant, and the U+1039 character is not visible.
Normal web pages do not strip the ZWNJ/ZWJ and so the text is rendered correctly.
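
To make the sequence concrete, here is a minimal Python sketch of what stripping does at the code point level. The choice of U+1000 (MYANMAR LETTER KA) as the consonant and the helper names strip_joiners and code_points are assumptions for illustration only; this is not the Mozilla code.

```python
ZWNJ = "\u200c"    # ZERO WIDTH NON-JOINER
ZWJ = "\u200d"     # ZERO WIDTH JOINER
VIRAMA = "\u1039"  # MYANMAR SIGN VIRAMA
KA = "\u1000"      # MYANMAR LETTER KA (example consonant, chosen arbitrarily)


def strip_joiners(text: str) -> str:
    """Remove ZWNJ and ZWJ, mimicking the stripping described in this bug."""
    return text.replace(ZWNJ, "").replace(ZWJ, "")


def code_points(text: str) -> str:
    """Format a string as a space-separated list of U+XXXX code points."""
    return " ".join(f"U+{ord(ch):04X}" for ch in text)


# Consonant + virama + ZWNJ + consonant: the ZWNJ requests a visible virama.
visible = KA + VIRAMA + ZWNJ + KA
# After stripping, the virama directly precedes the consonant,
# which a Myanmar-aware renderer treats as a stacked form.
stripped = strip_joiners(visible)

print(code_points(visible))   # U+1000 U+1039 U+200C U+1000
print(code_points(stripped))  # U+1000 U+1039 U+1000
```

The two strings differ only by the invisible U+200C, which is why the stripped text is shaped as a different word.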

Reproducible: Always

Steps to Reproduce:
1. Use a version of Mozilla capable of rendering Myanmar Unicode, e.g. Pango with the Pangographite module on Linux or http://sila.mozdev.org/grFirefox.html
2. Display Myanmar text in a sidebar, dialog control etc.

Actual Results:  
Consonants are rendered underneath the preceding consonant and the visible virama is hidden.

Expected Results:  
The visible virama marked by a ZWNJ should be displayed and the following consonant should not be rendered underneath the preceding consonant.

Patch to follow.
Patch changes nsBidiUtils to strip control characters only if RTL or Arabic shaping is being used. This should mean that ZWNJ and ZWJ will not be stripped in LTR languages such as Myanmar that require them for correct rendering.
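
The idea behind the patch can be sketched roughly as follows, in Python rather than the C++ of the actual diff. The function names, the is_rtl and do_shape parameters, and the exact set of characters treated as controls are all assumptions made for illustration; they are not taken from the attachment.

```python
# Assumed set of characters to strip (illustrative only):
# U+200C..U+200F (ZWNJ, ZWJ, LRM, RLM) and U+202A..U+202E (embeddings/overrides).
CONTROL_CHARS = (
    {chr(cp) for cp in range(0x200C, 0x2010)}
    | {chr(cp) for cp in range(0x202A, 0x202F)}
)


def strip_control_characters(text: str) -> str:
    """Drop every character in the assumed control set."""
    return "".join(ch for ch in text if ch not in CONTROL_CHARS)


def prepare_for_rendering(text: str, is_rtl: bool, do_shape: bool) -> str:
    """Strip controls only when RTL handling or Arabic shaping applies.

    Plain LTR text (such as Myanmar) is returned untouched, so its
    ZWNJ/ZWJ characters survive and reach the font renderer.
    """
    if is_rtl or do_shape:
        return strip_control_characters(text)
    return text


myanmar = "\u1000\u1039\u200c\u1000"  # consonant, virama, ZWNJ, consonant
print(prepare_for_rendering(myanmar, is_rtl=False, do_shape=False) == myanmar)
```

With both flags false the Myanmar string passes through unchanged; with either flag set, the U+200C is removed as before.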
maybe related:

Bug 202352
ZWJ, ZWNJ with devanagari characters does not display correct glyphs

Bug 274152
ECMA-262 Edition 3 specifies ignoring ZWNJ and ZWJ along with other Unicode format-control characters
Keywords: intl
Comment on attachment 209563
controlChar-krs-20060125.diff

Maybe the call to StripBidiControlCharacters should go two lines higher, inside the if (doShape) {} block.
(In reply to comment #3)
> (From update of attachment 209563)
> Maybe the call to StripBidiControlCharacters should go two lines higher, 
> inside the if (doShape) {} block
> 
Since the call to mBidiEngine->WriteReverse does not set the NSBIDI_REMOVE_BIDI_CONTROLS option, that would make sense, but I'm not very familiar with RTL text.
(In reply to comment #2)
> maybe related:
> 
> Bug 202352
> ZWJ, ZWNJ with devanagari characters does not display correct glyphs
202352 may be related, but the description implies that it applies to normal web pages as well. This bug only applies to tree frames, drop down lists and similar controls. I think these go through a different rendering path to nsTextFrame.
> 
> Bug 274152
> ECMA-262 Edition 3 specifies ignoring ZWNJ and ZWJ along with other Unicode
> format-control characters
Bug 274152 does affect Myanmar, but it is still present with the above patch. (I get around it using \u200c in JavaScript code.) I think this is a separate bug.

==> Bidi
Assignee: general → mozilla
Status: UNCONFIRMED → NEW
Component: General → Layout: BiDi Hebrew & Arabic
Ever confirmed: true
Product: Mozilla Application Suite → Core
QA Contact: general → zach
Version: unspecified → Trunk
I want to make a more general solution for this in bug 280936
Depends on: 280936
Presumably this was fixed by the fix for bug 280936. Can anyone verify?
Mass-assigning the new rtl keyword to RTL-related bugs (see bug 349193).
Keywords: rtl
Component: Layout: BiDi Hebrew & Arabic → Layout: Text
QA Contact: zach → layout.fonts-and-text
Assignee: mozilla → nobody
Severity: normal → S3