Open Bug 490534 Opened 15 years ago Updated 2 years ago

ZWJ and NNBSP rendered incorrectly in scripts like Mongolian

Categories

(Core :: Layout: Text and Fonts, defect)

x86
Windows Vista
defect

UNCONFIRMED

People

(Reporter: nad.pot, Unassigned)

Details

Attachments

(2 files)

User-Agent:       Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv:1.9.0.10) Gecko/2009042316 Firefox/3.0.10
Build Identifier: Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv:1.9.0.10) Gecko/2009042316 Firefox/3.0.10

ZWJ (U+200D Zero Width Joiner) and NNBSP (U+202F Narrow No-Break Space) in Mongolian text fail to change the form of the letters following them as they should. The problem with NNBSP is very annoying because the character is ubiquitous in Mongolian. Firefox seems to use the default font to display NNBSP even when it is surrounded by Mongolian letters that the default font does not support, and to use the default font to display ZWJ even when it is at the beginning of a Mongolian word. This prevents them from combining with the following character.

The problem disappears when the author or the reader explicitly chooses a Mongolian font. However, this is not a viable solution for the author if he doesn't know what Mongolian fonts the reader has installed and not a good solution for the reader of multilingual documents.

IE8 has the same problem with ZWJ but it handles NNBSP correctly.

Reproducible: Always

Steps to Reproduce:
1. Use a system supporting Mongolian script (Windows Vista, for example) and make sure that at least one font supporting Mongolian (Mongolian Baiti, for example) is installed. Do not select this font as the default font of Firefox.
2. Use Firefox to display the following text or enter it in a text box: ‍ᠣ ᡳ (ZWJ+o+NNBSP+Manchu i)
Actual Results:  
Both ᠣ (o) and ᡳ (Manchu i) are displayed in their isolated forms.

Expected Results:  
Both ᠣ (o) and ᡳ (Manchu i) should be displayed in their final forms.
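
Because ZWJ and NNBSP are invisible when pasted, it can help to write the test string from step 2 with explicit escapes and check each code point. The following is a minimal Python sketch; the names `TEST_STRING` and `describe` are illustrative and do not come from the report.

```python
import unicodedata

# Test string from step 2: ZWJ + Mongolian o + NNBSP + Manchu i.
TEST_STRING = "\u200d\u1823\u202f\u1873"


def describe(text):
    """Return (code point, Unicode name) pairs for each character in text."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch)) for ch in text]


for code_point, name in describe(TEST_STRING):
    print(code_point, name)
```

Running this against text copied from a page is a quick way to confirm that the two control characters survived copy and paste before testing the rendering.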

Maybe related to bugs 324615 and 408437.
Component: General → Layout: Text
Product: Firefox → Core
QA Contact: general → layout.fonts-and-text
Attached file ZWJ o NNBSP i
The second line is identical to the first, except that the font is explicitly set to Mongolian Baiti.
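
A page of this shape can be regenerated with a short script. The sketch below assumes a layout of two paragraphs, the second with an explicit `font-family`; it is not the attachment's actual markup, and `build_test_page` is a hypothetical helper name.

```python
TEST_STRING = "\u200d\u1823\u202f\u1873"  # ZWJ + o + NNBSP + Manchu i


def build_test_page(text=TEST_STRING, font="Mongolian Baiti"):
    """Build a two-line HTML page: fallback font first, explicit font second."""
    return (
        "<!DOCTYPE html>\n"
        '<html><head><meta charset="utf-8"><title>ZWJ o NNBSP i</title></head>\n'
        "<body>\n"
        f"<p>{text}</p>\n"
        f"<p style=\"font-family: '{font}'\">{text}</p>\n"
        "</body></html>\n"
    )


# Print the page; redirect the output to a .html file to open it in a browser.
print(build_test_page())
```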
This bug is also related to 665352
As described, I believe the bug is invalid. 
The proper solution is to tell the browser what the language of the Unicode-encoded page is, and the proper way to do that is to use a lang="mn" attribute and *certainly* *not* to use the font-family property to set an explicit font name (as done in the attachment).

See http://www.w3schools.com/tags/ref_language_codes.asp

There are many languages that cannot be displayed properly if Fx doesn't know the text is in that language, so I'm not sure the Mongolian case means special hooks should be added to display ZWJ/NNBSP with the same font as the surrounding text.
OTOH Mongolian is likely not the only language that would be helped by this when not properly tagged.

This being said, I'm not certain Firefox recognizes the mn value and knows which font to use for it. A quick check shows me that Fx 15 is missing the proper font.name-list*** properties that would tell it that "baiti" is the name of the correct font to use.

A list of the usual fonts that should be used for Mongolian on various operating systems, with an order of preference, would be needed in order to know what should be put in those preferences for mn.
@Jean-Marc Desperrier :

Mongolian uses several scripts: mn-Mong is for Mongol bichig, which is our subject (~800-900 years old, mainly used in China); mn-Cyrl is for Cyrillic (~60 years old, mainly used in Mongolia and Russia); and mn-Latn is for Latin transliteration. However, some think mn-Cyrl is the default (associating it with the country of Mongolia, which is not where the majority of Mongolians live), and this is assumed in some Wikipedia tags, for example. The four-letter ISO 15924 script subtags should therefore also be given.
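
As a rough illustration of tagging text with both a language and a script subtag, here is a minimal Python sketch. The helper names `split_tag` and `lang_span` are hypothetical, and whether a given browser uses the script subtag when choosing a fallback font is an assumption not established in this thread.

```python
from html import escape

# Language tags named in the comment above (language subtag + ISO 15924 script).
MONGOLIAN_TAGS = {
    "mn-Mong": "traditional Mongolian script (Mongol bichig)",
    "mn-Cyrl": "Cyrillic",
    "mn-Latn": "Latin transliteration",
}


def split_tag(tag):
    """Split a two-part tag such as 'mn-Mong' into (language, script)."""
    language, _, script = tag.partition("-")
    return language, script or None


def lang_span(text, tag):
    """Wrap text in a <span> carrying an explicit lang attribute."""
    return f'<span lang="{escape(tag, quote=True)}">{escape(text)}</span>'


print(split_tag("mn-Mong"))
print(lang_span("\u1823\u202f\u1873", "mn-Mong"))
```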

Due to the history of this special *vertical* writing on computers, based on vowels with special ligatures, a lot of tricks are still needed. Firefox <30 had a well-known bug regarding this writing (see bug 665352). There are several methods (CMS, for example, overrode Latin fonts before Unicode; now there are special ligature cases depending on implementations and keystrokes). The author of HarfBuzz (which Firefox uses today), Behdad Esfahbod, explained this in a nice slide deck called "Unicode, OpenType, and HarfBuzz: Closing the Circle" (from 2014, so 2 years after your post). There are still some ligature problems with most (but not all) fonts today.

The slide on the Mongolian case (though it is probably better to read the whole deck):
https://docs.google.com/presentation/d/1x97pfbB1gbD53Yhz6-_yBUozQMVJ_5yMqqR_D-R7b7I/present#slide=id.g4e63d4d9db789ef90
Currently, to avoid this issue, you should specify which font(s) to use for Mongolian text in the HTML, as the following pages do:
https://en.wikipedia.org/wiki/Qing_dynasty (look at the section “Notes”)
https://en.wikipedia.org/wiki/Resident_Identity_Card

The best solution for this issue, I think, is to implement an option such as “font.name.serif.x-mong” in about:config, defaulting to Mongolian Baiti.
In Firefox 49.0, NNBSP works properly when I open the attachment, even where the font is not specified, BUT ZWJ still depends on a specific font family.

Additionally, I found that NNBSP also works on this page:
https://wikisource.org/wiki/%E1%A0%A0%E1%A0%AF%E1%A1%B3%E1%A0%A8_%E1%A1%B3_%E1%A1%A8%E1%A0%A3%E1%A1%B4%E1%A0%B0%E1%A0%A3_%E1%A1%A9%E1%A1%9D_%E1%A1%A5%E1%A0%A0%E1%A0%AF%E1%A1%A5%E1%A1%A1%E1%A0%A8_%E1%A0%AA%E1%A1%9D_%E1%A0%B5%E1%A0%A0%E1%A1%B3%E1%A0%AF%E1%A0%A0%E1%A1%A5%E1%A0%A0_%E1%A1%A7%E1%A1%B3
On this page, however, a suffix after NNBSP is rendered incorrectly in the page title; no Mongolian/Manchu font is specified there.
Attached file Test results.zip
With attachment 374937 [details], I see that IE, Firefox, Edge and Chrome behave differently when no font face is specified for the text.

Firefox 54.01:
1st line: U+1823 (o) is displayed in its initial form, but U+1873 (Manchu i) is displayed in its final form
2nd line: both U+1823 (o) and U+1873 (Manchu i) are displayed in their final forms

Chrome 60.0.3122.78:
1st line: U+1823 (o) is displayed in its final form, but U+1873 (Manchu i) is displayed in its initial form
2nd line: both U+1823 (o) and U+1873 (Manchu i) are displayed in their final forms

Edge 40.15063.0.0:
1st line: both U+1823 (o) and U+1873 (Manchu i) are displayed in their initial forms
2nd line: both U+1823 (o) and U+1873 (Manchu i) are displayed in their final forms

IE 11.483.15063.0:
1st line: both U+1823 (o) and U+1873 (Manchu i) are displayed in their initial forms
2nd line: both U+1823 (o) and U+1873 (Manchu i) are displayed in their final forms

In conclusion, when Mongolian/Manchu text is displayed with fallback fonts, Firefox handles NNBSP but not ZWJ, Chrome handles ZWJ but not NNBSP, and Edge and IE handle neither NNBSP nor ZWJ.
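
The reasoning behind this conclusion can be written out as a small table lookup. The sketch below records the first-line observations listed above (browser labels shortened to major versions) and treats the final form as the expected result for both letters; the names `FIRST_LINE` and `summarize` are illustrative.

```python
# Observed forms on the first line (no font specified), as reported above.
# Key: browser; value: (form of o after ZWJ, form of Manchu i after NNBSP).
FIRST_LINE = {
    "Firefox 54": ("initial", "final"),
    "Chrome 60": ("final", "initial"),
    "Edge 40": ("initial", "initial"),
    "IE 11": ("initial", "initial"),
}

EXPECTED = "final"  # both letters should take their final forms


def summarize(results):
    """Map each browser to which control characters shaped correctly."""
    return {
        browser: {"ZWJ": o_form == EXPECTED, "NNBSP": i_form == EXPECTED}
        for browser, (o_form, i_form) in results.items()
    }


for browser, handled in summarize(FIRST_LINE).items():
    print(browser, handled)
```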
Severity: normal → S3