Apply NFC normalization in preference to falling back to a different font for combining marks [was: U+0303 COMBINING TILDE character has a broken rendering with "Source Sans Pro" font]

NEW
Unassigned

Status

P3
normal
2 years ago
8 months ago

People

(Reporter: jdpc557, Unassigned)

Tracking

49 Branch
Points:
---

Firefox Tracking Flags

(Not tracked)

(Reporter)

Description

2 years ago
User Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.11; rv:49.0) Gecko/20100101 Firefox/49.0
Build ID: 20160922113459

Steps to reproduce:

My name is "João Costa", and that is the name set in my git config (user.name).

In my git config, the name is encoded as "4a 6f 61 cc 83 6f 20 43 6f 73 74 61 0a"
(Note that "ã" is encoded as two characters, "LATIN SMALL LETTER A" followed by "COMBINING TILDE", instead of the single character "LATIN SMALL LETTER A WITH TILDE")
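
To check which characters those bytes actually contain, the hex dump above can be decoded and each code point named. This is a minimal Python sketch using only the standard library; the variable names are illustrative and not part of the report.

```python
import unicodedata

# Bytes exactly as quoted in the report (the trailing 0a is a newline).
raw = bytes.fromhex("4a 6f 61 cc 83 6f 20 43 6f 73 74 61 0a")
text = raw.decode("utf-8")

# One (code point, name) pair per character. Control characters such as
# the newline have no Unicode name, so a placeholder is used for them.
code_points = [
    (f"U+{ord(ch):04X}", unicodedata.name(ch, "<control>")) for ch in text
]

for cp, name in code_points:
    print(cp, name)
```

The third and fourth entries are U+0061 LATIN SMALL LETTER A and U+0303 COMBINING TILDE, i.e. the decomposed form rather than the single code point U+00E3.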

When I open Travis or gitbucket, my name is rendered incorrectly. Other browsers render the text just fine.

Tested on Firefox 49.0.1 (MacOS X 10.11.6)


Actual results:

The name is rendered incorrectly, see: https://i.imgur.com/0yHav3g.png


Expected results:

The name should be rendered in the same way as with "LATIN SMALL LETTER A WITH TILDE", see: https://i.imgur.com/Dc7udpI.png
Can you provide an example HTML page or the URL?
The rendering depends on the font, and maybe the actual encoding on the page (may be converted on server side).
Flags: needinfo?(jdpc557)
Component: Untriaged → Layout: Text
Product: Firefox → Core
(Reporter)

Comment 2

2 years ago
(In reply to Tooru Fujisawa [:arai] from comment #1)
> Can you provide an example HTML page or the URL?
> The rendering depends on the font, and maybe the actual encoding on the page
> (may be converted on server side).

The URL is https://travis-ci.org/ShiftForward/apso/builds/167409170

The font used is "Source Sans Pro"

I'm not sure if pasting the HTML will help, because I don't know if the UTF-8 encoding will change.
Thanks.
Confirmed the issue on Firefox Nightly 52.0a1 (2016-10-12) (64-bit) on OS X 10.11.6.
The text on the page is not changed; it is the sequence U+0061 U+0303.
(Sorry, I meant normalization forms, not encoding.)
Status: UNCONFIRMED → NEW
Ever confirmed: true
Flags: needinfo?(jdpc557)
Summary: U+0303 COMBINING TILDE character has a broken rendering → U+0303 COMBINING TILDE character has a broken rendering with "Source Sans Pro" font
The problem occurs because the site is using a webfont for Source Sans Pro that does not support the combining tilde character. Checking the CSS provided (from https://fonts.googleapis.com/css?family=Source+Sans+Pro:300,400,600), we see a bunch of resources for different Unicode ranges, but none of them includes the Combining Diacritical Marks block, U+0300 to U+036F (the tilde is U+0303).
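
As a rough illustration of that check, the sketch below parses a CSS unicode-range value and tests whether a character falls inside it. The range string is a made-up example in the style of a Latin subset, NOT the actual value from the Google Fonts stylesheet, and the helper names are hypothetical.

```python
def parse_unicode_range(value):
    """Parse a CSS unicode-range value into a list of (low, high) pairs."""
    ranges = []
    for token in value.split(","):
        token = token.strip()
        if not token:
            continue
        body = token[2:]  # drop the leading "U+"
        if "-" in body:
            low, high = body.split("-")
            ranges.append((int(low, 16), int(high, 16)))
        elif "?" in body:
            # Wildcard form, e.g. U+4?? means U+0400 through U+04FF.
            low = int(body.replace("?", "0"), 16)
            high = int(body.replace("?", "F"), 16)
            ranges.append((low, high))
        else:
            cp = int(body, 16)
            ranges.append((cp, cp))
    return ranges


def covers(ranges, ch):
    """Return True if the character's code point lies in any range."""
    return any(low <= ord(ch) <= high for low, high in ranges)


# Hypothetical subset declaration, for illustration only.
EXAMPLE_RANGE = "U+0000-00FF, U+0131, U+0152-0153, U+2000-206F"
ranges = parse_unicode_range(EXAMPLE_RANGE)

print(covers(ranges, "a"))       # base letter
print(covers(ranges, "\u0303"))  # COMBINING TILDE
print(covers(ranges, "\u00e3"))  # precomposed a with tilde
```

With a range like this one, the base letter and the precomposed U+00E3 are covered while U+0303 is not, which is the situation described above.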

Therefore, while the base characters "Joao" are rendered with Source Sans Pro, as styled, the combining tilde falls back to a different font (what font you get may depend on your system/configuration). And positioning of diacritics will generally not work well across font boundaries.

It displays OK in Chrome, I expect, because they apply NFC normalization to the text prior to rendering, which replaces the sequence <a, combining tilde> with the single character U+00E3 'ã', which IS present in the Source Sans Pro font.

The simplest solution is to use the precomposed LATIN SMALL LETTER A WITH TILDE instead of the decomposed representation, as this is much more widely supported; relatively few fonts have good support for Unicode combining marks. (And in general, normalization form NFC is the recommended form for text on the web. See http://www.w3.org/International/questions/qa-html-css-normalization.)
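
For anyone producing such text, converting to NFC before publishing is a one-liner in most languages. A minimal Python sketch with the standard library, using the name from this report as input:

```python
import unicodedata

# Decomposed form as found in the report: 'a' followed by COMBINING TILDE.
decomposed = "Joa\u0303o Costa"

# NFC composes <U+0061, U+0303> into the single code point U+00E3.
composed = unicodedata.normalize("NFC", decomposed)

print(len(decomposed), len(composed))
print([f"U+{ord(ch):04X}" for ch in composed[:4]])
```

The composed string is one code point shorter, and its third character is U+00E3, which the comment above says is present in Source Sans Pro.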

Comment 5

2 years ago
Chrome shapes with HarfBuzz first, then does font fallback based on the shaping result (at least in the “complex path”; I'm not sure if they have switched Latin to it yet), so I think the composition is done by HarfBuzz here.

I think it might be worthwhile to do the same at some point.
Priority: -- → P3
Where the "current" font (i.e. the font chosen by the font-matching algorithm for the base character) doesn't support the combining mark(s) that follow(s), it would be better to try NFC normalization before falling back to a different font. That would fix the example here, and we get a steady trickle of reports of such cases in various languages and fonts.
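
The proposed order of operations could be sketched roughly as follows. This is only an illustration of the idea, not Gecko code: shape_cluster and font_supports are hypothetical names, and the "font" is modelled as a simple predicate over characters.

```python
import unicodedata


def shape_cluster(cluster, font_supports):
    """Decide how to render one base character plus its combining marks.

    Returns (text_to_render, needs_fallback). font_supports is a callable
    that reports whether the current font has a glyph for a character.
    """
    # Case 1: the current font covers the base and every mark as-is.
    if all(font_supports(ch) for ch in cluster):
        return cluster, False

    # Case 2: try the NFC form before giving up on the current font.
    composed = unicodedata.normalize("NFC", cluster)
    if composed != cluster and all(font_supports(ch) for ch in composed):
        return composed, False

    # Case 3: no luck, so font fallback is needed for this cluster.
    return cluster, True


def latin1_only(ch):
    # Assumed stand-in for a font that covers only U+0000 to U+00FF.
    return ord(ch) <= 0xFF


print(shape_cluster("a\u0303", latin1_only))  # composes to U+00E3
print(shape_cluster("x\u0303", latin1_only))  # no precomposed form exists
```

In this model <a, combining tilde> stays in the current font via U+00E3, while <x, combining tilde> has no precomposed form and still needs fallback.
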
Summary: U+0303 COMBINING TILDE character has a broken rendering with "Source Sans Pro" font → Apply NFC normalization in preference to falling back to a different font for combining marks [was: U+0303 COMBINING TILDE character has a broken rendering with "Source Sans Pro" font]
Duplicate of this bug: 1343737
Duplicate of this bug: 1395339
Duplicate of this bug: 1408732