Closed Bug 1161900 Opened 9 years ago Closed 9 years ago

Balinese taling mark displayed on wrong side of base character

Categories

(Core :: Graphics: Text, defect)

x86_64
macOS
defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla41
Tracking Status
firefox40 --- affected
firefox41 --- fixed

People

(Reporter: mozillabugs, Assigned: jfkthame)

References

()

Details

Attachments

(4 files)

The precomposed Balinese characters U+1B40, taling tedung, and U+1B41, taling repa tedung, are rendered incorrectly when using the Noto Sans Balinese font: They should be treated as split vowels, with the taling or taling repa component rendered to the left of the base character and the tedung component to the right. Instead, they are entirely rendered to the right of the base character.

To see the problem, go to
http://lindenbergsoftware.com/google/noto/bali-taling.html
The table on this page shows the different Balinese taling marks combined with different base characters. In each cell of the table, the first line uses Noto Sans Balinese, the second line the Aksara Bali font (a font that goes to extraordinary lengths to render Balinese correctly in the absence of any support from the rendering engine).

I reported this in September 2014 to Google as a bug in the Noto Sans Balinese font (https://code.google.com/p/noto/issues/detail?id=163), but now believe that it's a bug in Firefox:

① In February 2015, the Universal Shaping Engine specification became available, and I assume it will govern shaping of Balinese text with OpenType fonts from now on. The specification requires that the shaping engine decomposes split vowels as defined in UnicodeData.txt, and according to Andrew Glass all characters with the Indic Syllabic Category Vowel_Dependent are candidates for this decomposition. The relevant entries in the Unicode Character Database are:

# IndicSyllabicCategory-7.0.0.txt
1B3D..1B41    ; Vowel_Dependent # Mc   [5] BALINESE VOWEL SIGN LA LENGA TEDUNG..BALINESE VOWEL SIGN TALING REPA TEDUNG

# UnicodeData-7.0.0.txt
1B40;BALINESE VOWEL SIGN TALING TEDUNG;Mc;0;L;1B3E 1B35;;;;N;;;;;
1B41;BALINESE VOWEL SIGN TALING REPA TEDUNG;Mc;0;L;1B3F 1B35;;;;N;;;;;

This decomposition is the first step of shaping, so that the subsequent character reordering step sees U+1B3E U+1B35 instead of U+1B40, and U+1B3F U+1B35 instead of U+1B41. Firefox renders U+1B3E U+1B35 correctly, as seen in the test page.

② The current version of Chrome (42.0.2311.135) renders the test page correctly.
We do the decomposition.  Either Jonathan or me have to debug this.
It renders as expected with standalone harfbuzz; this seems to imply we're doing something strange on the Gecko side. I'll look into it.
Assignee: nobody → jfkthame
Oh, I know what's happening here. In Gecko, we have an obsolete version of the Unicode normalization data, and updating it is non-trivial due to other issues -- see bug 728180. And therefore the harfbuzz decomposition callback we use, based on nsUnicodeNormalizer, doesn't know about the Balinese characters.

Switching the harfbuzz callback to use ICU would fix this, but IIUC we are not yet building with ICU on all platforms, so that's not an option -- at least not universally. Though it sounds like this may change soon.

Building harfbuzz with its own UCDN module instead of callbacks based on Gecko's Unicode data would also fix this, but that means we'd be carrying around yet another copy of a bunch of Unicode data. Yuck.
Here's a patch to make gfxHarfBuzzShaper use the ICU normalization support on platforms where we have it available. This fixes the Balinese rendering for desktop Firefox. (Other scripts such as Grantha will benefit similarly.)
Attachment #8602050 - Flags: review?(jdaggett)
And a simple reftest to go with it. I'm expecting this to fail on Android, where we don't yet build with ICU, so will need to add a fails-if annotation before this lands. Once that's fixed, we'll be able to remove the annotation again, and clean up the #ifdef'd code in gfxHarfBuzzShaper.
Attachment #8602051 - Flags: review?(jdaggett)
Comment on attachment 8602050 [details] [diff] [review]
Use ICU normalization support during shaping if available, to support decomposable characters in more recently-encoded scripts

Also flagging :gps for r? on the moz.build part of this.
Attachment #8602050 - Flags: review?(gps)
Comment on attachment 8602051 [details] [diff] [review]
Basic reftest to check that Balinese two-part vowel characters are shaped properly

Review of attachment 8602051 [details] [diff] [review]:
-----------------------------------------------------------------

r+ with proper fails-if annotation for Android
Attachment #8602051 - Flags: review?(jdaggett) → review+
Attachment #8602050 - Flags: review?(jdaggett) → review+
Attachment #8602050 - Flags: review?(gps) → review+
https://hg.mozilla.org/mozilla-central/rev/ea5522b52e14
https://hg.mozilla.org/mozilla-central/rev/57def6a002ea
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla41
Blocks: 728180
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: