Khmer script rendering issues (misplaced and erroneously duplicated glyphs) with some fonts since Firefox switched to harfbuzz for all text shaping on Linux

VERIFIED FIXED in Firefox 19

Status

Core
Layout: Text
VERIFIED FIXED
5 years ago
5 years ago

People

(Reporter: Mathieu Pellerin, Assigned: jfkthame)

Tracking

Trunk
mozilla20
x86
Linux
Points:
---

Firefox Tracking Flags

(firefox19 verified)

Details

Attachments

(3 attachments, 1 obsolete attachment)

(Reporter)

Description

5 years ago
Created attachment 669890 [details]
harfbuzz-ng (wrong; left) and old pango (proper; right)

Just spotted a pretty bad rendering bug while testing out firefox linux's use of harfbuzz-ng as a replacement of pango (in change committed Oct 9; see bug 797398).

Harfbuzz-ng misplaces glyphs and adds unwanted duplicates when rendering Khmer script using Kh-Battambang.ttf, while pango renders the same Khmer script correctly.

I've created a test case here: http://licadho-cambodia.org/mapnik/testcase-refine.html -- you can download the font on the page, as well as retrieve Khmer script linked to this issue (and a <img> of proper rendering).

A bug has been filed at freedesktop: https://bugs.freedesktop.org/show_bug.cgi?id=55827

Due to the negative impact on readability of Khmer script, firefox should synch with harfbuzz when the issue is fixed.
Confirmed with Mozilla/5.0 (X11; Linux i686; rv:19.0) Gecko/19.0 Firefox/19.0 ID:20121010030605
Status: UNCONFIRMED → NEW
Ever confirmed: true
CCing our Khmer localizer so as to inform him of the regression for his builds and in case we need a second tester in a different timezone :)
(Assignee)

Comment 3

5 years ago
(In reply to Mathieu Pellerin from comment #0)
> Created attachment 669890 [details]
> harfbuzz-ng (wrong; left) and old pango (proper; right)
> 
> Just spotted a pretty bad rendering bug while testing out firefox linux's
> use of harfbuzz-ng as a replacement of pango (in change committed Oct 9; see
> bug 797398).
> 
> Harfbuzz-ng misplaces and adds unwanted duplicate glyphs when rendering Khmer
> script using Kh-Battambang.ttf, while pango renders the same Khmer script
> correctly.

We've identified the cause of this: it looks like that font attempts to support Khmer under "generic" shapers as well as actual Khmer-specific engines, by using a ton of contextual substitutions to "fake" the reordering of vowel components. This is all placed in the 'liga' feature, so that generic engines will apply the lookups. Unfortunately, this means that when harfbuzz handles the Khmer vowel split/reordering itself, *and* then applies the 'liga' feature (as it's a general-purpose feature that should be usable for any script), you get these unwanted "duplications".

It seems that Uniscribe does *not* apply 'liga' to Indic/SEAsian scripts, so I guess we'll change harfbuzz to match. That should resolve this issue.
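
To make the mechanism above concrete, here is a toy sketch of how applying the reordering twice can produce a duplicate pre-base glyph. Everything in it is an assumption for illustration: glyphs are plain strings, `shaper_reorder` and `font_liga` are hypothetical stand-ins, and the real font's lookups and harfbuzz's Khmer handling are more involved than this.

```python
# Toy model: glyphs are plain strings, and both "lookups" are hypothetical
# simplifications, not the real harfbuzz code or the real font tables.
PRE = "PRE_E"        # pre-base vowel part, drawn to the left of the consonant
SPLIT = "VOWEL_OO"   # two-part vowel sign, typed after the consonant


def shaper_reorder(glyphs):
    """Shaper-side handling: put a pre-base part in front of the consonant."""
    out = []
    for g in glyphs:
        if g == SPLIT and out:
            base = out.pop()
            out.extend([PRE, base, SPLIT])
        else:
            out.append(g)
    return out


def font_liga(glyphs):
    """Font-side 'liga' trick: for a consonant followed by the split vowel,
    insert a pre-base glyph before the consonant."""
    out = []
    for i, g in enumerate(glyphs):
        nxt = glyphs[i + 1] if i + 1 < len(glyphs) else None
        if nxt == SPLIT and g not in (PRE, SPLIT):
            out.extend([PRE, g])
        else:
            out.append(g)
    return out


text = ["KA", SPLIT]
generic_engine = font_liga(text)           # only the font fakes the reordering
khmer_engine = shaper_reorder(text)        # only the shaper reorders, 'liga' skipped
both = font_liga(shaper_reorder(text))     # reordering applied twice
print(generic_engine, khmer_engine, both)
```

In this model either mechanism alone yields one pre-base glyph, while running both yields two, which is the kind of duplication seen in the screenshot. Skipping 'liga' for the script, as proposed, corresponds to the `khmer_engine` case.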
(Reporter)

Comment 4

5 years ago
Jonathan, great news, thanks for working on this. Also happy to hear this might end up having a positive impact on other Indic scripts.

I've raised a small esthetic issue with Khmer script alongside this bug on freedesktop's harfbuzz bug tracker. I didn't think it was worth duplicating that one over here, but while you're discussing Khmer script with Behdad, it might be worth mentioning. It's this one: https://bugs.freedesktop.org/show_bug.cgi?id=55824, with this screenshot showing the issue: https://bugs.freedesktop.org/attachment.cgi?id=68388.

It's a minuscule visual regression when moving away from pango that does _not_ impact on readability.

Comment 5

5 years ago
Hi Mathieu,

We looked into that too.  That's a limitation of our fallback positioning logic.  It's not really possible to improve it.  We can either do fallback positioning, which has some limitations, or not do fallback positioning, which has more problems.  We decided that if people want perfect positioning they should add a GPOS table, so we are not going to improve the case you raised.
(Reporter)

Comment 6

5 years ago
Behdad, thanks for looking into it. It's something Khmer readers can all live with, and as you mention it can be properly dealt with at the font level. Also, for the record, Uniscribe renders the vowels the same way Harfbuzz does.

Just out of personal curiosity, what is/are the differences (when compared to Harfbuzz) that allowed Pango to position the ុ and ូ vowels properly?

Comment 7

5 years ago
Hi Mathieu,

Pango did not do fallback mark positioning, which means it would just render marks the way the font was designed. Multiple marks could well overlap. We have fixed that in HarfBuzz.

Now, there's some evidence building that this may be wrong for Khmer, so Jonathan and I will discuss it again and reconsider today.  We may overturn this decision.
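
As a rough illustration of the trade-off described above, the sketch below contrasts drawing marks at the offsets baked into their outlines with a very simple fallback that stacks them below the base. It is a toy model under stated assumptions: the function names, the font-unit numbers, and the stacking rule are all hypothetical and do not reproduce HarfBuzz's actual fallback logic.

```python
def stack_below_marks(base_bottom, mark_heights, gap=0):
    """Return the top edge (y, in font units, y grows upward) of each
    below-base mark, stacked downward so that no two marks overlap."""
    tops = []
    y = base_bottom
    for height in mark_heights:
        y -= gap
        tops.append(y)
        y -= height
    return tops


def any_overlap(tops, heights):
    """True if any two marks share vertical space."""
    spans = [(top - h, top) for top, h in zip(tops, heights)]
    for i in range(len(spans)):
        for j in range(i + 1, len(spans)):
            lo_i, hi_i = spans[i]
            lo_j, hi_j = spans[j]
            if hi_i > lo_j and hi_j > lo_i:
                return True
    return False


heights = [200, 150]            # hypothetical heights of two below-base marks
designed_tops = [-50, -50]      # no fallback: both drawn where the outlines put them
stacked_tops = stack_below_marks(0, heights, gap=20)

print(any_overlap(designed_tops, heights))  # marks collide
print(any_overlap(stacked_tops, heights))   # marks are separated
```

The stacked result avoids collisions but can push a mark further from the base than the font designer intended, which is the kind of limitation mentioned in comment 5.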
(Reporter)

Comment 8

5 years ago
Behdad, appreciate the explanation, thanks. Once you guys settle on the behavior, I'll give it an extensive try and report issues, if any.
(Assignee)

Comment 9

5 years ago
(In reply to Mathieu Pellerin from comment #6)
> Just out of personal curiosity, what is/are the differences (when compared
> to Harfbuzz) that allowed Pango to position the ុ and ូ vowels properly?

In the case of the KhmerOSSys font, at least, these vowels are moved downwards in the context of the pre-base Ra by the 'pres' feature. This seems a bit surprising - it's not the feature I would have expected to handle them - but more importantly, Uniscribe (and DirectWrite) applies it in the same way, and renders the vowel "too low" in examples like ក្រូ and ក្រុ. This can be seen by loading the http://licadho-cambodia.org/mapnik/testcase-refine.html page in any of Firefox, Chrome or IE on Windows 7.

I don't have an old version of pango on hand for comparison, but my assumption is that it must not have been applying this feature. (Maybe the WinXP version of uniscribe didn't, either?) So older systems may have appeared to work, but only because they failed to apply all the font's features.

So our position in this case is that harfbuzz is working correctly and producing the same result as Microsoft's implementation; if the resulting vowel positions are poor, this is a design problem in the font. We are reluctant to "fix" this particular example by disabling the feature involved, as that will cause our rendering to differ from Uniscribe's in the case where a font correctly uses the full set of features.
(Assignee)

Comment 10

5 years ago
The harfbuzz update in bug 801410 fixed the rendering of the example here, but unfortunately appears to have broken some other fonts such as the "hanuman" font used on http://khmer.rfa.org/.

Proposed fix is to revert the Khmer part of upstream commit 981748cb2e9b48b77177b19ec1f972cab7afda89, so that fonts where the 'khmr' script includes a 'liga' feature will still go through the indic shaper. This will slightly "degrade" the rendering of vowels in examples such as KhmerOS "ក្រូ", but that's a minor issue (and according to comment 6, it matches Uniscribe behavior). This seems like a better compromise than the current situation where pre-base and below-base forms in hanuman, for example, are completely broken because they depend on using the indic shaper.

(If only font developers would stick to standards.....!)
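
The two routing policies discussed in this comment can be sketched as follows. This is a simplified model, not the harfbuzz source: `choose_khmer_shaper`, its parameters, and the example feature set are hypothetical, and real shaper selection looks up features per script and language in the font's GSUB table.

```python
def choose_khmer_shaper(gsub_features, liga_bypasses_indic):
    """Pick a shaper for a Khmer font from its (hypothetical) feature tags.

    liga_bypasses_indic=True models the behavior after upstream commit
    981748cb as described in this comment; False models the proposed revert.
    Returns (shaper_name, features_to_skip).
    """
    if liga_bypasses_indic and "liga" in gsub_features:
        # The font is trusted to do its own reordering through 'liga'.
        return ("generic", set())
    # The indic shaper handles reordering itself, so 'liga' is skipped.
    return ("indic", {"liga"})


font_features = {"liga", "blwf"}  # hypothetical tags, not read from a real font

after_commit = choose_khmer_shaper(font_features, liga_bypasses_indic=True)
after_revert = choose_khmer_shaper(font_features, liga_bypasses_indic=False)
print(after_commit, after_revert)
```

Under the first policy a font that has 'liga' but also relies on the indic shaper for pre-base and below-base forms loses that shaping, which matches the breakage reported for hanuman. Under the revert it goes back through the indic shaper with 'liga' ignored.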
(Assignee)

Comment 11

5 years ago
Created attachment 683093 [details] [diff] [review]
partially revert hb commit 981748cb2e9b48b77177b19ec1f972cab7afda89, so we shape khmer fonts with 'liga' using the indic shaper (but ignore that feature)

Behdad, do you agree this is our best option at this point?
Attachment #683093 - Flags: review?(mozilla)

Comment 12

5 years ago
I think we were seeing Kh-Battambang and family fail to do prebase reordering because they don't have a 'pref' feature.  We started adding the hardcoded-Ra support back for 'pref' stuff, but gave up since that was ugly.  Do you think we should pursue that, or break that family?
(Reporter)

Comment 13

5 years ago
Jonathan & Behdad, I've updated the above testcase (http://licadho-cambodia.org/mapnik/testcase-refine.html) to feature Khmer script rendered using the hanuman font, in addition to the already present Khmer OS System and Kh-Battambang.

Also, this might be useful:
- Khmer site using Kh-battambang: http://www.cen.com.kh/ [* broken->working since latest harfbuzz synch]
- Khmer site using Hanuman: http://khmer.rfa.org/ [! working->broken since latest harfbuzz synch]
- Khmer site using Khmer OS System: http://khmer.voanews.com/ [ working->working; you must have font installed in your system]

For the record, both pango and uniscribe engines are able to display the 3 fonts in the above testcase without problems. Sorry if I sound like a never-ending complainer, I'm not :) just trying to be as useful as I can.
(Reporter)

Comment 14

5 years ago
Created attachment 683421 [details]
hanuman font rendering issues under latest harfbuzz

Attaching a screenshot of Khmer script improperly rendered using hanuman font + latest harfbuzz with indications as to what is missing and/or misplaced.
(Assignee)

Comment 15

5 years ago
See also http://people.mozilla.org/~jkew/kh/test.html, and discussion on the harfbuzz mailing list. Once we arrive at a solution within hb, we'll take a further update in gecko.
(Reporter)

Comment 16

5 years ago
Jonathan, a few points:
- this ongoing issue is both a regression on firefox linux desktop (as both the kh-battambang and hanuman were properly rendered using the pango shaping engine) and a webkit-parity bug (as both fonts are rendered properly using android 4.1 stock browser).
- using a firefox linux desktop v18 (which still uses pango as its shaping engine), all fonts (except the KhUni* sets) are rendered properly on your test page.

Whatever approach pango took on this issue, it appears to be the right one. Might be worth having a look at its source to see how things were done there. 

BTW, the google web fonts service offers 21 Khmer fonts, including hanuman. Other fonts have been seen in the wild and might be worth adding to your test page. http://www.google.com/webfonts
(Assignee)

Comment 17

5 years ago
Created attachment 683905 [details] [diff] [review]
[harfbuzz] improve heuristic for choosing between shapers for khmer fonts

As reported on the mailing list, this gives us the desired results for all the various flavors of Khmer fonts we've encountered. Behdad, if you agree with this and (intend to) take it upstream as well, I'd like to get it into our tree so that it goes out in nightly builds for wider testing ASAP.
Attachment #683905 - Flags: review?(mozilla)
(Assignee)

Updated

5 years ago
Attachment #683093 - Attachment is obsolete: true
Attachment #683093 - Flags: review?(mozilla)

Comment 19

5 years ago
Comment on attachment 683905 [details] [diff] [review]
[harfbuzz] improve heuristic for choosing between shapers for khmer fonts

Upstream already.
Attachment #683905 - Flags: review?(mozilla) → review+
(Assignee)

Comment 20

5 years ago
Thanks, Behdad. Pushed to mozilla-inbound:
https://hg.mozilla.org/integration/mozilla-inbound/rev/31a649d3f731
Assignee: nobody → jfkthame
Target Milestone: --- → mozilla20
https://hg.mozilla.org/mozilla-central/rev/31a649d3f731
Status: NEW → RESOLVED
Last Resolved: 5 years ago
Resolution: --- → FIXED
(Reporter)

Comment 22

5 years ago
Yay, fantastic. Jonathan, I think this should also be committed to Aurora, as the "problematic" synch with harfbuzz occurred in v19.

This concludes everything that needed to be fixed for Khmer script to be fully supported by harfbuzz / firefox mobile / firefox OS. Fantastic week.
(Assignee)

Comment 23

5 years ago
Comment on attachment 683905 [details] [diff] [review]
[harfbuzz] improve heuristic for choosing between shapers for khmer fonts

[Approval Request Comment]
Bug caused by (feature/regressing bug #): Harfbuzz update in bug 801410 fixed the rendering of some previously-broken Khmer fonts, but regressed others.

User impact if declined: Incorrect rendering (garbled text) with certain Khmer fonts that are used as webfonts on some major Khmer-language sites.

Testing completed (on m-c, etc.): Verified to work correctly with 140+ available Khmer fonts from a variety of sources.

Risk to taking this patch (and alternatives if risky): Minimal risk: only affects text in Khmer script; introduces no new codepaths, just alters the condition used to choose between two implementations. (Alternative of backing out 801410 would be bad because although it would fix the Khmer fonts that regressed, it would "un-fix" many others, as well as regressing unrelated issues for various other scripts.)

String or UUID changes made by this patch: None.
Attachment #683905 - Flags: approval-mozilla-aurora?
Comment on attachment 683905 [details] [diff] [review]
[harfbuzz] improve heuristic for choosing between shapers for khmer fonts

khmer-only fix for a FF19 regression. Approving for Aurora.
Attachment #683905 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Mozilla/5.0 (X11; Linux i686; rv:19.0) Gecko/20100101 Firefox/19.0

Verified as fixed in Firefox 19.0 RC (buildID: 20130215130331) using the info provided in comment 13. Compared how FF 19.0a1 (buildID: 20121010030605, the build from comment 1, on which the bug is still reproducible) and FF 19.0 RC display the test case and the links provided.
Status: RESOLVED → VERIFIED
status-firefox19: fixed → verified

Comment 27

5 years ago
Hi all,

I can still see some problems displaying some khmer fonts. 
Mac OS 10.8.2 / Firefox 19.0.2
http://s15.postimage.org/od3twdlaz/firefox_19_khmer_display.png

Chrome 25 displays all fonts correctly
http://s24.postimage.org/5ttwgtwcl/chrome_25_khmer_display.png
(Assignee)

Comment 28

5 years ago
The samples colored red on the http://people.mozilla.org/~jkew/kh/test.html page are known to display incorrectly, because the fonts involved do not use the OpenType features correctly. There are so many differing, non-standardized implementations of Khmer fonts that it is essentially impossible to create a rendering engine that handles them all "correctly".

Note that your image from Chrome 25 also shows a large number of fonts that are displaying *incorrectly* there (I count over 30 failures, at a glance). See for example Kh-Battambang, Kh-Metal-Chrieng, Kh-Muol-Pali and Kh-Seimreap from KhUnicode210, and many others from the KhmerUnicodeFonts set and even the Google Webfonts set.

Comment 29

5 years ago
Thanks for your quick reply! 
Well, I used to have no problem with earlier versions of Firefox. 
It does not display khmer fonts correctly in major websites like Google, Youtube, or Facebook (I don't attach Facebook screenshots for privacy reasons).

http://s11.postimage.org/x377iaheb/google_khmer.png
http://postimage.org/image/ks9yyum7r/

Yes, some fonts are flagged in red in the Chrome screenshot, there may be errors, but as for the khmer text sample, it is still readable normally.

http://s21.postimage.org/wwzugz55z/chrome_google_khmer.png
http://postimage.org/image/4vlt39zvn/

From a user's point of view, I cannot consider this issue fixed, sorry about this!
Firefox thumbs webpages titles do not display correctly either.

Comment 30

5 years ago
You are right for Kh-Battambang, Kh-Metal-Chrieng, Kh-Muol-Pali and Kh-Siemreap etc.

Anyway, is there a way to have Firefox display khmer text correctly on major websites like Google, Youtube, or Facebook?
(Assignee)

Comment 31

5 years ago
OK, it looks like we did break one aspect of this in Firefox 19 (due to bug 797402), so that Khmer fonts that use AAT will no longer be handled properly. I've filed bug 851495 to fix that.

AFAICT, of the AAT Khmer fonts on OS X 10.7, only one of them (Khmer Sangam MN) actually renders correctly using Core Text; the other (Khmer MN) still seems to have some problems. So even after bug 851495 is fixed, the result may depend on which font is chosen. (I don't know if OS X 10.8 will show the same problem.)

In the meantime, you could possibly work around the problem by setting gfx.font_rendering.harfbuzz.scripts back to 71 in about:config (instead of the new default of -1). I believe that should restore the pre-FF19 behavior for the Mac OS X fonts - though it will regress the rendering of some OpenType fonts that are used as webfonts on various sites.
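
For readers wondering what the value 71 means: the pref is a bitmask, and the sketch below decodes it. The arithmetic is certain (71 = 64 + 4 + 2 + 1), but the bit names are an assumption recalled from Gecko's shaping-type categories of that era and are not stated in this bug, so treat them as illustrative.

```python
# Assumed bit meanings for gfx.font_rendering.harfbuzz.scripts.
# These names are recalled, not taken from this bug report.
SHAPING_BITS = {
    1: "default",
    2: "arabic",
    4: "hebrew",
    8: "hangul",
    16: "mongolian",
    32: "indic",
    64: "thai",
}


def enabled_categories(pref_value):
    """List the script categories whose bit is set in the pref value."""
    return [name for bit, name in SHAPING_BITS.items() if pref_value & bit]


old_default = enabled_categories(71)   # pre-FF19 value suggested as a workaround
new_default = enabled_categories(-1)   # -1 has every bit set

print(old_default)
print(new_default)
```

If Khmer is grouped under the "indic" bit, as the workaround implies, then 71 leaves that bit clear and Khmer text is handed to the platform shaper (Core Text on OS X) instead of harfbuzz.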

Comment 32

5 years ago
Thanks, it is better: Khmer text now displays properly on Youtube, certain websites and in the thumbs webpage titles. However, it still does not work with Facebook and Google.
(Assignee)

Comment 33

5 years ago
I suspect that means you're getting the "Khmer MN" font on those pages, which Core Text doesn't seem to handle properly. You could try disabling Khmer MN in Font Book.app, so that the pages will fall back to Khmer Sangam MN, as that appears to work better AFAICT.

Comment 34

5 years ago
Yes indeed, it now displays properly on Google and Facebook, thank you!