Balinese taling mark displayed on wrong side of base character

RESOLVED FIXED in Firefox 41

Status

()

Core
Graphics: Text
RESOLVED FIXED
3 years ago
2 years ago

People

(Reporter: Norbert Lindenberg, Assigned: jfkthame)

Tracking

Trunk
mozilla41
x86_64
Mac OS X
Points:
---

Firefox Tracking Flags

(firefox40 affected, firefox41 fixed)

Details

(URL)

Attachments

(4 attachments)

(Reporter)

Description

3 years ago
Created attachment 8601903 [details]
Firefox rendering of test page

The precomposed Balinese characters U+1B40, taling tedung, and U+1B41, taling repa tedung, are rendered incorrectly when using the Noto Sans Balinese font: They should be treated as split vowels, with the taling or taling repa component rendered to the left of the base character and the tedung component to the right. Instead, they are entirely rendered to the right of the base character.

To see the problem, go to
http://lindenbergsoftware.com/google/noto/bali-taling.html
The table on this page shows the different Balinese taling marks combined with different base characters. In each cell of the table, the first line uses Noto Sans Balinese, the second line the Aksara Bali font (a font that goes to extraordinary lengths to render Balinese correctly in the absence of any support from the rendering engine).

I reported this in September 2014 to Google as a bug in the Noto Sans Balinese font (https://code.google.com/p/noto/issues/detail?id=163), but now believe that it's a bug in Firefox:

① In February 2015, the Universal Shaping Engine specification became available, and I assume it will govern shaping of Balinese text with OpenType fonts from now on. The specification requires that the shaping engine decomposes split vowels as defined in UnicodeData.txt, and according to Andrew Glass all characters with the Indic Syllabic Category Vowel_Dependent are candidates for this decomposition. The relevant entries in the Unicode Character Database are:

# IndicSyllabicCategory-7.0.0.txt
1B3D..1B41    ; Vowel_Dependent # Mc   [5] BALINESE VOWEL SIGN LA LENGA TEDUNG..BALINESE VOWEL SIGN TALING REPA TEDUNG

# UnicodeData-7.0.0.txt
1B40;BALINESE VOWEL SIGN TALING TEDUNG;Mc;0;L;1B3E 1B35;;;;N;;;;;
1B41;BALINESE VOWEL SIGN TALING REPA TEDUNG;Mc;0;L;1B3F 1B35;;;;N;;;;;

This decomposition is the first step of shaping, so that the subsequent character reordering step sees U+1B3E U+1B35 instead of U+1B40, and U+1B3F U+1B35 instead of U+1B41. Firefox renders U+1B3E U+1B35 correctly, as seen in the test page.

② The current version of Chrome (42.0.2311.135) renders the test page correctly.
(Reporter)

Comment 1

3 years ago
Created attachment 8601904 [details]
Chrome rendering of test page

Comment 2

3 years ago
We do the decomposition.  Either Jonathan or me have to debug this.
(Assignee)

Comment 3

3 years ago
It renders as expected with standalone harfbuzz; this seems to imply we're doing something strange on the Gecko side. I'll look into it.
Assignee: nobody → jfkthame
(Assignee)

Comment 4

3 years ago
Oh, I know what's happening here. In Gecko, we have an obsolete version of the Unicode normalization data, and updating it is non-trivial due to other issues -- see bug 728180. And therefore the harfbuzz decomposition callback we use, based on nsUnicodeNormalizer, doesn't know about the Balinese characters.

Switching the harfbuzz callback to use ICU would fix this, but IIUC we are not yet building with ICU on all platforms, so that's not an option -- at least not universally. Though it sounds like this may change soon.

Building harfbuzz with its own UCDN module instead of callbacks based on Gecko's Unicode data would also fix this, but that means we'd be carrying around yet another copy of a bunch of Unicode data. Yuck.
(Assignee)

Comment 5

3 years ago
https://treeherder.mozilla.org/#/jobs?repo=try&revision=9dc07d8e51ba
(Assignee)

Comment 6

3 years ago
Created attachment 8602050 [details] [diff] [review]
Use ICU normalization support during shaping if available, to support decomposable characters in more recently-encoded scripts

Here's a patch to make gfxHarfBuzzShaper use the ICU normalization support on platforms where we have it available. This fixes the Balinese rendering for desktop Firefox. (Other scripts such as Grantha will benefit similarly.)
Attachment #8602050 - Flags: review?(jdaggett)
(Assignee)

Comment 7

3 years ago
Created attachment 8602051 [details] [diff] [review]
Basic reftest to check that Balinese two-part vowel characters are shaped properly

And a simple reftest to go with it. I'm expecting this to fail on Android, where we don't yet build with ICU, so will need to add a fails-if annotation before this lands. Once that's fixed, we'll be able to remove the annotation again, and clean up the #ifdef'd code in gfxHarfBuzzShaper.
Attachment #8602051 - Flags: review?(jdaggett)
(Assignee)

Comment 8

3 years ago
Comment on attachment 8602050 [details] [diff] [review]
Use ICU normalization support during shaping if available, to support decomposable characters in more recently-encoded scripts

Also flagging :gps for r? on the moz.build part of this.
Attachment #8602050 - Flags: review?(gps)

Comment 9

3 years ago
Comment on attachment 8602051 [details] [diff] [review]
Basic reftest to check that Balinese two-part vowel characters are shaped properly

Review of attachment 8602051 [details] [diff] [review]:
-----------------------------------------------------------------

r+ with proper fails-if annotation for Android
Attachment #8602051 - Flags: review?(jdaggett) → review+

Updated

3 years ago
Attachment #8602050 - Flags: review?(jdaggett) → review+

Updated

3 years ago
Attachment #8602050 - Flags: review?(gps) → review+
(Assignee)

Comment 10

3 years ago
https://hg.mozilla.org/integration/mozilla-inbound/rev/ea5522b52e14
https://hg.mozilla.org/integration/mozilla-inbound/rev/57def6a002ea
https://hg.mozilla.org/mozilla-central/rev/ea5522b52e14
https://hg.mozilla.org/mozilla-central/rev/57def6a002ea
Status: NEW → RESOLVED
Last Resolved: 3 years ago
status-firefox41: --- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla41
Blocks: 728180
You need to log in before you can comment on or make changes to this bug.