Closed Bug 799869 Opened 9 years ago Closed 9 years ago

khmer script rendering issues (misplacement and erroneous duplicate of glyphs) with some fonts since firefox switch to harfbuzz for all text shaping on linux


(Core :: Layout: Text and Fonts, defect)




Tracking Status
firefox19 --- verified


(Reporter:, Assigned: jfkthame)



(3 files, 1 obsolete file)

Just spotted a pretty bad rendering bug while testing out firefox linux's use of harfbuzz-ng as a replacement of pango (in change committed Oct 9; see bug 797398).

Harfbuzz-ng misplaces glyphs and adds unwanted duplicates when rendering Khmer script using Kh-Battambang.ttf, while pango renders the same Khmer script correctly.

I've created a test case here: -- you can download the font from the page, as well as retrieve the Khmer text linked to this issue (and an <img> of the proper rendering).

A bug has been filed at freedesktop:

Due to the negative impact on readability of Khmer script, firefox should synch with harfbuzz when the issue is fixed.
Confirmed with Mozilla/5.0 (X11; Linux i686; rv:19.0) Gecko/19.0 Firefox/19.0 ID:20121010030605
Ever confirmed: true
CCing our Khmer localizer so as to inform him of the regression for his builds and in case we need a second tester in a different timezone :)
(In reply to Mathieu Pellerin from comment #0)
> Created attachment 669890 [details]
> harfbuzz-ng (wrong; left) and old pango (proper; right)
> Just spotted a pretty bad rendering bug while testing out firefox linux's
> use of harfbuzz-ng as a replacement of pango (in change committed Oct 9; see
> bug 797398).
> Harfbuzz-ng misplaces and add unwanted duplicate glyphs when rendering Khmer
> script using Kh-Battambang.ttf, while pango renders the same Khmer script
> correctly.

We've identified the cause of this: it looks like that font attempts to support Khmer under "generic" shapers as well as actual Khmer-specific engines, by using a ton of contextual substitutions to "fake" the reordering of vowel components. This is all placed in the 'liga' feature, so that generic engines will apply the lookups. Unfortunately, this means that when harfbuzz handles the Khmer vowel split/reordering itself, *and* then applies the 'liga' feature (as it's a general-purpose feature that should be usable for any script), you get these unwanted "duplications".

It seems that Uniscribe does *not* apply 'liga' to Indic/SEAsian scripts, so I guess we'll change harfbuzz to match. That should resolve this issue.
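To illustrate the double application described above, here is a toy Python model. It is only a sketch: the glyph names and both functions are hypothetical stand-ins, not HarfBuzz code and not the font's actual lookups. It shows how a pre-base vowel component ends up duplicated when a shaper reorders the split vowel and the font's 'liga' lookups then "fake" the same reordering again.

```python
# Toy model only. All names below are hypothetical, chosen for illustration.
BASE = "ka"                 # a base consonant glyph
SPLIT_VOWEL = "vowel_oe"    # a two-part vowel that follows the base in the text
PRE_PART = "vowel_e_pre"    # the vowel component that is drawn before the base


def shaper_split_and_reorder(glyphs):
    """Mimic a Khmer-aware shaper: emit the pre-base part ahead of the base."""
    out = []
    for g in glyphs:
        if g == SPLIT_VOWEL and out:
            base = out.pop()
            out += [PRE_PART, base, SPLIT_VOWEL]
        else:
            out.append(g)
    return out


def font_liga_fake_reorder(glyphs):
    """Mimic the font's 'liga' contextual substitutions, written for generic
    shapers: any glyph followed by the split vowel gets a pre-base part
    inserted in front of it."""
    out = []
    i = 0
    while i < len(glyphs):
        if i + 1 < len(glyphs) and glyphs[i + 1] == SPLIT_VOWEL:
            out += [PRE_PART, glyphs[i], SPLIT_VOWEL]
            i += 2
        else:
            out.append(glyphs[i])
            i += 1
    return out


text = [BASE, SPLIT_VOWEL]
print(font_liga_fake_reorder(text))                            # generic shaper + 'liga'
print(shaper_split_and_reorder(text))                          # Khmer shaper, no 'liga'
print(font_liga_fake_reorder(shaper_split_and_reorder(text)))  # both: duplicate pre-base part
```

Either step alone yields one pre-base component; running both yields two, which is the kind of duplication seen in the screenshot. Skipping 'liga' for these scripts corresponds to dropping the second call.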
Jonathan, great news, thanks for working on this. Also happy to hear this might end up having a positive impact on other Indic scripts.

I've raised a small aesthetic issue with Khmer script alongside this bug on freedesktop's harfbuzz bug tracker. I didn't think it was worth duplicating it over here. But while you're discussing Khmer script with Behdad, it might be worth mentioning. It's this one:, with this screenshot showing the issue over here:

It's a minuscule visual regression when moving away from pango that does _not_ impact on readability.
Hi Mathieu,

We looked into that too.  That's a limitation of our fallback positioning logic.  It's not really possible to improve it.  We can either do fallback positioning, which has some limitations, or not do fallback positioning, which has more problems.  We decided that if people want perfect positioning they should add a GPOS table, so we are not going to improve the case you raised.
Behdad, thanks for looking into it. It's something Khmer readers can all live with, and as you mention it can be properly dealt with at the font level. Also, for the record, Uniscribe renders the vowels the same way Harfbuzz does.

Just out of personal curiosity, what is/are the differences (when compared to Harfbuzz) that allowed Pango to position the ុ and ូ vowels properly?
Hi Mathieu,

Pango did not do fallback mark positioning.  Which means it would just render marks the way the font was designed.  And multiple marks could well overlap.  We have fixed that in HarfBuzz.

Now, there's some evidence building that this may be wrong for Khmer, so Jonathan and I will discuss it again and reconsider today.  We may overturn this decision.
Behdad, appreciate the explanation, thanks. Once you guys settle on the behavior, I'll give it an extensive try and report issues, if any.
(In reply to Mathieu Pellerin from comment #6)
> Just out of personal curiosity, what is/are the differences (when compared
> to Harfbuzz) that allowed Pango to position the ុ and ូ vowels properly?

In the case of the KhmerOSSys font, at least, these vowels are moved downwards in the context of the pre-base Ra by the 'pres' feature. This seems a bit surprising - it's not the feature I would have expected to handle them - but more importantly, Uniscribe (and DirectWrite) applies it in the same way, and renders the vowel "too low" in examples like ក្រូ and ក្រុ. This can be seen by loading the page in any of Firefox, Chrome or IE on Windows 7.

I don't have an old version of pango on hand for comparison, but my assumption is that it must not have been applying this feature. (Maybe the WinXP version of uniscribe didn't, either?) So older systems may have appeared to work, but only because they failed to apply all the font's features.

So our position in this case is that harfbuzz is working correctly and producing the same result as Microsoft's implementation; if the resulting vowel positions are poor, this is a design problem in the font. We are reluctant to "fix" this particular example by disabling the feature involved, as that will cause our rendering to differ from Uniscribe's in the case where a font correctly uses the full set of features.
The harfbuzz update in bug 801410 fixed the rendering of the example here, but unfortunately appears to have broken some other fonts such as the "hanuman" font used on

Proposed fix is to revert the Khmer part of upstream commit 981748cb2e9b48b77177b19ec1f972cab7afda89, so that fonts where the 'khmr' script includes a 'liga' feature will still go through the indic shaper. This will slightly "degrade" the rendering of vowels in examples such as KhmerOS "ក្រូ", but that's a minor issue (and according to comment 6, it matches Uniscribe behavior). This seems like a better compromise than the current situation where pre-base and below-base forms in hanuman, for example, are completely broken because they depend on using the indic shaper.

(If only font developers would stick to standards.....!)
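The shaper-selection condition discussed above can be sketched roughly as follows. This is a minimal illustration in Python, not the upstream HarfBuzz code: the function name, the `liga_implies_generic` flag, and the example feature sets are all hypothetical. It only models the rule as described in this comment, where a font whose 'khmr' script includes 'liga' is either routed to the generic shaper (the upstream behavior) or kept on the indic shaper (the proposed revert).

```python
def choose_khmer_shaper(khmr_features, liga_implies_generic):
    """Pick a shaper name for a Khmer font.

    khmr_features: set of OpenType feature tags the font registers under
        the 'khmr' script (hypothetical input; a real engine would read
        these from the font's GSUB table).
    liga_implies_generic: True models the upstream behavior described
        above, False models the proposed revert.
    """
    if liga_implies_generic and "liga" in khmr_features:
        return "generic"
    return "indic"


# Hypothetical feature set for a font that relies on the indic shaper for
# its pre-base and below-base forms but also happens to ship 'liga'.
features = {"liga", "pref", "blwf"}

print(choose_khmer_shaper(features, liga_implies_generic=True))   # generic
print(choose_khmer_shaper(features, liga_implies_generic=False))  # indic
```

The point of the sketch is that the presence of 'liga' alone is a weak signal: a font can ship 'liga' and still depend on the indic shaper for everything else.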
I think we were seeing Kh-Battambang and family fail to do prebase reordering because they don't have a 'pref' feature.  We started adding the hardcoded-Ra support back for 'pref' stuff, but gave up since that was ugly.  Do you think we should pursue that, or break that family?
Jonathan & Behdad, I've updated the above testcase to feature Khmer script rendered using the hanuman font, in addition to the already present Khmer OS System and Kh-Battambang.

Also, this might be useful:
- Khmer site using Kh-battambang: [* broken->working since latest harfbuzz synch]
- Khmer site using Hanuman: [! working->broken since latest harfbuzz synch]
- Khmer site using Khmer OS System: [ working->working; you must have font installed in your system]

For the record, both pango and uniscribe engines are able to display the 3 fonts in the above testcase without problems. Sorry if I sound like a never-ending complainer, I'm not :) just trying to be as useful as I can.
Attaching a screenshot of Khmer script improperly rendered using hanuman font + latest harfbuzz with indications as to what is missing and/or misplaced.
See also, and discussion on the harfbuzz mailing list. Once we arrive at a solution within hb, we'll take a further update in gecko.
Jonathan, few points:
- this ongoing issue is both a regression on firefox linux desktop (as both the kh-battambang and hanuman were properly rendered using the pango shaping engine) and a webkit-parity bug (as both fonts are rendered properly using android 4.1 stock browser).
- using a firefox linux desktop v18 (which still uses pango as its shaping engine), all fonts (except the KhUni* sets) are rendered properly on your test page.

Whatever approach pango took on this issue, it appears to be the right one. Might be worth having a look at its source to see how things were done there. 

BTW, the google web fonts service offers 21 Khmer fonts, including hanuman. Other fonts have been seen in the wild and might be worth adding to your test page.
As reported on the mailing list, this gives us the desired results for all the various flavors of Khmer fonts we've encountered. Behdad, if you agree with this and (intend to) take it upstream as well, I'd like to get it into our tree so that it goes out in nightly builds for wider testing ASAP.
Attachment #683905 - Flags: review?(mozilla)
Attachment #683093 - Attachment is obsolete: true
Attachment #683093 - Flags: review?(mozilla)
Comment on attachment 683905 [details] [diff] [review]
[harfbuzz] improve heuristic for choosing between shapers for khmer fonts

Upstream already.
Attachment #683905 - Flags: review?(mozilla) → review+
Thanks, Behdad. Pushed to mozilla-inbound:
Assignee: nobody → jfkthame
Target Milestone: --- → mozilla20
Closed: 9 years ago
Resolution: --- → FIXED
Yay, fantastic. Jonathan, I think this should also be committed to Aurora, as the "problematic" synch with harfbuzz occurred in v19.

This concludes everything that needed to be fixed for Khmer script to be fully supported by harfbuzz / firefox mobile / firefox OS. Fantastic week.
Comment on attachment 683905 [details] [diff] [review]
[harfbuzz] improve heuristic for choosing between shapers for khmer fonts

[Approval Request Comment]
Bug caused by (feature/regressing bug #): Harfbuzz update in bug 801410 fixed the rendering of some previously-broken Khmer fonts, but regressed others.

User impact if declined: Incorrect rendering (garbled text) with certain Khmer fonts that are used as webfonts on some major Khmer-language sites.

Testing completed (on m-c, etc.): Verified to work correctly with 140+ available Khmer fonts from a variety of sources.

Risk to taking this patch (and alternatives if risky): Minimal risk: only affects text in Khmer script; introduces no new codepaths, just alters the condition used to choose between two implementations. (Alternative of backing out 801410 would be bad because although it would fix the Khmer fonts that regressed, it would "un-fix" many others, as well as regressing unrelated issues for various other scripts.)

String or UUID changes made by this patch: None.
Attachment #683905 - Flags: approval-mozilla-aurora?
Comment on attachment 683905 [details] [diff] [review]
[harfbuzz] improve heuristic for choosing between shapers for khmer fonts

khmer-only fix for a FF19 regression. Approving for Aurora.
Attachment #683905 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Mozilla/5.0 (X11; Linux i686; rv:19.0) Gecko/20100101 Firefox/19.0

Verified as fixed in Firefox 19.0 RC (buildID: 20130215130331) using the info provided in comment 13. I compared how FF 19.0a1 (buildID: 20121010030605, the build cited in comment 1, where the bug is still reproducible) and FF 19.0 RC display the test case and the links provided.
Hi all,

I can still see some problems displaying some khmer fonts. 
Mac OS 10.8.2 / Firefox 19.0.2

Chrome 25 displays all fonts correctly
The samples colored red on the page are known to display incorrectly, because the fonts involved do not use the OpenType features correctly. There are so many differing, non-standardized implementations of Khmer fonts that it is essentially impossible to create a rendering engine that handles them all "correctly".

Note that your image from Chrome 25 also shows a large number of fonts that are displaying *incorrectly* there (I count over 30 failures, at a glance). See for example Kh-Battambang, Kh-Metal-Chrieng, Kh-Muol-Pali and Kh-Seimreap from KhUnicode210, and many others from the KhmerUnicodeFonts set and even the Google Webfonts set.
Thanks for your quick reply! 
Well, I used to have no problem with earlier versions of Firefox. 
It does not display khmer fonts correctly in major websites like Google, Youtube, or Facebook (I don't attach Facebook screenshots for privacy reasons).

Yes, some fonts are flagged in red in the Chrome screenshot, there may be errors, but as for the khmer text sample, it is still readable normally.

From a user point of view, I cannot consider this issue is fixed, sorry about this! 
Firefox thumbs webpages titles do not display correctly either.
You are right for Kh-Battambang, Kh-Metal-Chrieng, Kh-Muol-Pali and Kh-Siemreap etc.

Anyway, is there a way to have Firefox displaying khmer text correctly on major websites like Google, Youtube, or Facebook?
OK, it looks like we did break one aspect of this in Firefox 19 (due to bug 797402), so that Khmer fonts that use AAT will no longer be handled properly. I've filed bug 851495 to fix that.

AFAICT, of the AAT Khmer fonts on OS X 10.7, only one of them (Khmer Sangam MN) actually renders correctly using Core Text; the other (Khmer MN) still seems to have some problems. So even after bug 851495 is fixed, the result may depend on which font is chosen. (I don't know if OS X 10.8 will show the same problem.)

In the meantime, you could possibly work around the problem by setting gfx.font_rendering.harfbuzz.scripts back to 71 in about:config (instead of the new default of -1). I believe that should restore the pre-FF19 behavior for the Mac OS X fonts - though it will regress the rendering of some OpenType fonts that are used as webfonts on various sites.
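For anyone wondering how 71 and -1 relate, here is a small Python sketch. It assumes the pref is a 32-bit bitmask with one bit per script group, and the helper name `set_bits` is made up for this example. The meaning of each bit is not spelled out in this bug, so the sketch only lists which bit positions are set and does not name them.

```python
def set_bits(value, width=32):
    """Return the positions of the bits set in value, treating it as a
    two's-complement integer of the given width."""
    masked = value & ((1 << width) - 1)
    return [i for i in range(width) if (masked >> i) & 1]


print(set_bits(71))        # [0, 1, 2, 6], i.e. 1 + 2 + 4 + 64
print(len(set_bits(-1)))   # 32, every bit is set
```

Under that assumption, -1 switches harfbuzz on for every script group, while 71 enables it only for the four groups whose bits are set.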
Thanks, it is better: Khmer text now displays properly on Youtube, certain websites and in the thumbs webpage titles. However, still does not work with Facebook and Google.
I suspect that means you're getting the "Khmer MN" font on those pages, which Core Text doesn't seem to handle properly. You could try disabling Khmer MN in Font Book, so that the pages will fall back to Khmer Sangam MN, as that appears to work better AFAICT.
Yes indeed, it now displays properly on Google and Facebook, thank you!