Consider accepting `lang` attribute with separator `_` instead of `-` when determining which font prefs to use
Categories
(Core :: Layout: Text and Fonts, defect, P3)
Tracking
| | Tracking | Status |
|---|---|---|
| firefox99 | --- | fixed |
People
(Reporter: jfkthame, Assigned: jfkthame)
Attachments
(2 files)
See bug 1643536 comment 10: Project Gutenberg pages such as https://www.gutenberg.org/ebooks/search/ have things like <html lang="en_GB"> or <html lang="en_US">.
These lang attributes fail to parse when we're trying to determine what langGroup to use to resolve generic font preferences, because they're not well-formed BCP 47 tags, as per https://datatracker.ietf.org/doc/html/rfc5646#section-2. The only valid subtag separator is HYPHEN, not UNDERSCORE.
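To illustrate why such a tag is rejected, here is a minimal sketch of a shape check, assuming a simplified reading of RFC 5646 in which a tag is a sequence of 1 to 8 character ASCII alphanumeric subtags joined by hyphens. The function name `has_bcp47_shape` is hypothetical, and the real grammar (and Gecko's actual parser) is stricter than this.

```python
def has_bcp47_shape(tag: str) -> bool:
    """Simplified shape check, not the full RFC 5646 grammar.

    A tag passes if every hyphen-separated subtag is 1-8 ASCII
    letters or digits.
    """
    subtags = tag.split("-")
    return all(
        1 <= len(s) <= 8 and s.isascii() and s.isalnum()
        for s in subtags
    )


# "en_GB" contains no hyphen, so it is treated as one subtag, "en_GB",
# and the underscore makes that subtag non-alphanumeric.
print(has_bcp47_shape("en-GB"))
print(has_bcp47_shape("en_GB"))
```

Because the underscore is not a separator, the whole string is seen as a single invalid subtag rather than as a language plus a region.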
Given that there are systems (in particular ICU) that do use underscore-separated tags, it's not very surprising that this error shows up. And testing indicates that both Blink and WebKit do take account of such malformed tags for the purpose of font selection.
So although the "correct" solution is for sites (like Gutenberg) to fix their pages to have well-formed lang attributes, I think we should also consider adding a check for this type of error and "fixing" it internally (as Postel's law would suggest), so that users see fonts for the intended language.
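The kind of internal "fix" proposed here could look roughly like the following sketch. It is not the patch that landed; the function name `lang_tag_for_font_prefs` and the simplified subtag pattern are assumptions made for illustration. The idea is to try the attribute value as written first, and only if that fails, retry with underscores replaced by hyphens.

```python
import re
from typing import Optional

# Simplified shape: 1-8 ASCII alphanumerics per subtag, hyphen-separated.
# This is an assumption for illustration; RFC 5646 is stricter.
_TAG_SHAPE = re.compile(r"[A-Za-z0-9]{1,8}(?:-[A-Za-z0-9]{1,8})*")


def lang_tag_for_font_prefs(raw: str) -> Optional[str]:
    """Return a usable tag for font-pref lookup, or None if unrecoverable."""
    tag = raw.strip()
    if _TAG_SHAPE.fullmatch(tag):
        return tag  # already has the expected shape; leave it alone
    # Lenient fallback: treat "_" as if the author had written "-".
    repaired = tag.replace("_", "-")
    if _TAG_SHAPE.fullmatch(repaired):
        return repaired
    return None


for value in ("en_GB", "en-US", "zh_Hant_TW", "en__GB"):
    print(value, "->", lang_tag_for_font_prefs(value))
```

Trying the original value first keeps well-formed tags untouched, so the fallback only affects pages whose `lang` attribute would otherwise be ignored for font selection.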
Comment 1 (Assignee) • 4 years ago
Testcase showing the effect of lang tags on font selection. Each test line is actually a pair of overlaid lines, one of which (red) has a correct CJK lang tag, and one (green) has a tag with underscore instead of hyphen. If these result in different fonts being used, some bits of the red glyphs will show through.
Comment 2 (Assignee) • 4 years ago
Comment 4•4 years ago