Closed Bug 1503157 Opened Last year Closed Last year
Subtags in lang attributes should take precedence when selecting fonts
User Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.11; rv:63.0) Gecko/20100101 Firefox/63.0 Steps to reproduce: Go to https://en.wiktionary.org/wiki/%E0%A4%AE%E0%A4%BE%E0%A4%B2%E0%A4%BF%E0%A4%95%E0%A4%BE%E0%A4%A8%E0%A4%BE, scroll down and expand the quotations box. Actual results: The italicized romanization of the Hindi text looks bad (see attached screenshot) because it is a mix of two fonts. This is due to Firefox seeing it tagged as Hindi text, and the font being applied—Devanagari Sangam MN—not containing diacritics. Expected results: However, the bigger issue is that a Hindi font is being applied to text explicitly tagged as lang="hi-Latn". The fact that the script of the text is being designated as "Latn" should make it use a default Latin font instead of prioritizing the "hi" language portion, since languages can be written in different scripts and are commonly romanized. System information: Firefox 63.0 on MacOS 10.11.5 Note: Chrome and Safari don't try to do anything clever here and instead just inherit "font-family: sans-serif" from the body styling.
I can't see a difference between how Chrome and Firefox display these fonts and I think I'm missing some important bit of knowledge, so I will set this issue's component as (Core) Layout: Text and Fonts. Please "need info" me if I can help.
Component: Untriaged → Layout: Text and Fonts
Product: Firefox → Core
This is the sort of issue that the restructuring proposed in bug 556237 would be expected to address. It might be possible to do a more localized fix in the meantime, though, perhaps just handling the -Latn subtag (which is the most likely script subtag to be used in this way, though others are certainly possible).
Status: UNCONFIRMED → NEW
Ever confirmed: true
See Also: → 556237
The problem here is that we use font prefs based on the mapping of lang to langGroup provided by nsLanguageAtomService, based on the list in langGroups.properties, and (with a couple of ad hoc exceptions) this ignores script subtags. Now that we have BCP47-based Locale support, we can do better by discarding that big list of language codes and working primarily off of the script subtag, using AddLikelySubtags to add it when not explicitly present. Mapping the script code to a langGroup then needs only a much smaller list, which I've moved directly into the source file as it doesn't seem worth maintaining in separate .properties form at this point. This makes the example reported here render correctly, using the Latin font; as a bonus, it also fixes an existing WPT testcase that is currently marked as failing.
Actually we might as well simplify what's left of the kLangGroups array a bit further. (Apologies for the patch churn.)
Assignee: nobody → jfkthame
Status: NEW → ASSIGNED
Actually, we can improve this further by mapping the script tags directly to gecko atoms, rather than mapping to langGroup strings that we then atomize as a separate step.
Attachment #9026939 - Flags: review?(m_kato)
Pushed by firstname.lastname@example.org: https://hg.mozilla.org/integration/mozilla-inbound/rev/6a0012913373 Make selection of font preferences respect a script subtag in the lang attribute, if present. r=m_kato
Pushed by email@example.com: https://hg.mozilla.org/integration/mozilla-inbound/rev/90e20d0a24f5 followup - don't attempt to create a Locale from an empty lang string. r=me to fix crashtest on a CLOSED TREE
Jonathan, can you explain what and how exactly I should verify your change, please? Thanks.
Using a fragment of text from the page mentioned in comment 0, and default font preferences on macOS, compare the rendering of these three examples: (1) data:text/html;charset=utf-8,<h1 style="font-family:sans-serif" lang="en"><i>hariyāṇā mẽ sthānīya nikāy vibhāg kī dukānõ (2) data:text/html;charset=utf-8,<h1 style="font-family:sans-serif" lang="hi"><i>hariyāṇā mẽ sthānīya nikāy vibhāg kī dukānõ (3) data:text/html;charset=utf-8,<h1 style="font-family:sans-serif" lang="hi-Latn"><i>hariyāṇā mẽ sthānīya nikāy vibhāg kī dukānõ Expected results: (1) looks fine, using the Helvetica font throughout (2) looks ugly, with one font used for the plain ASCII letters and a different one for the accented letters (a mix of Devanagari Sangam MN and Times) (3) is the interesting testcase: before this patch, it was ugly like (2), but after the fix it should look like (1). Similar results on Win10, too (the exact fonts used are different, but case 3 renders poorly before the fix).
I can confirm this fix in Windows 10, Ubuntu 16 and Mac OS 10.14 on Nighlty v65.0a1 from 2018-12-3. The texts (1) and (3) look alike, while text (2) looks somewhat ugly. Thanks.
You need to log in before you can comment on or make changes to this bug.