CSS text-transform:uppercase and JavaScript toUpperCase() not up to date on German ß/ẞ (U+1E9E)
Categories
(Core :: Internationalization, defect, P3)
Tracking
| | Tracking | Status |
|---|---|---|
| firefox114 | --- | fixed |
People
(Reporter: info, Assigned: jfkthame)
Attachments
(1 file)
User Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:87.0) Gecko/20100101 Firefox/87.0
Steps to reproduce:
The capital variant of ß, ẞ (U+1E9E), has been part of the ISO standard for over 10 years. (The letter itself is much older.)
Although its use is not mandatory, it was finally made official by the German language council 4 years ago.
It makes far more sense to use it, and it is therefore becoming more popular now that it exists in digital fonts; before that, print designers had to handcraft the letter themselves.
There are cases where ss and ß have different meanings. Names should also generally be spelled correctly.
So ẞ should be considered the new norm, and SS the still-permitted form.
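To illustrate the point about ss and ß carrying different meanings, here is a small sketch using one well-known word pair, "Maße" (measurements) and "Masse" (mass). The variable names are only for illustration. With the default uppercase mapping both words collapse into the same string:

```javascript
// Two German words that differ only in ß vs. ss.
const measurements = "Maße"; // plural of "Maß"
const mass = "Masse";

// The default Unicode uppercase mapping turns ß into SS,
// so the distinction is lost after uppercasing.
const upperMeasurements = measurements.toUpperCase();
const upperMass = mass.toUpperCase();

console.log(upperMeasurements);               // MASSE
console.log(upperMass);                       // MASSE
console.log(upperMeasurements === upperMass); // true
```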
Actual results:
CSS text-transform:uppercase and JavaScript toUpperCase() still use the old substitute, SS, when capitalizing.
The more you get used to ẞ, the more distracting those SS become.
Every good font nowadays supports U+1E9E.
Firefox itself handles this issue differently in different places:
https://developer.mozilla.org/en-US/docs/Web/CSS/text-transform
https://developer.mozilla.org/en-US/docs/Web/CSS/font-variant-caps
This might be due to the CSS specs? Probably it is that way just because the latter is a more recent addition.
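The JavaScript side of the behavior reported above can be checked with a few lines. This is a minimal sketch of what current engines return. The locale-aware call is included on the assumption that no German-specific tailoring exists in the engine's case mapping, which is the case as far as I know:

```javascript
const word = "straße";

// Default (locale-independent) uppercase mapping: ß becomes SS.
const defaultUpper = word.toUpperCase();

// Locale-aware variant with a German locale tag; assumed to behave the same,
// since German has no tailored uppercase mapping for ß.
const germanUpper = word.toLocaleUpperCase("de");

// The result is one character longer than the input.
const lengthChange = defaultUpper.length - word.length;

console.log(defaultUpper); // STRASSE
console.log(germanUpper);  // STRASSE
console.log(lengthChange); // 1
```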
Expected results:
I believe the standard behavior should be to use U+1E9E.
It could fall back if the glyph is not present.
There could be an option for forcing SS using font-variant:historical-forms or something like that, to suit conservatives.
Well, excuse the pun, but hopefully I can do my part with this to eliminate the German SS.
Comment 1•4 years ago
Hi,
I am assigning a component to this issue in order to involve the development team and get an opinion on this. If this is not the correct component, please feel free to change it to an appropriate one.
Thanks!
Comment 2•4 years ago (Assignee)
What do other browsers do? (As far as I'm aware, all browsers currently implement ß -> SS for the uppercase transform or the toUpperCase() function.) What font-variant does will depend on what's implemented by the specific font being used; this is up to the font designer rather than the browser.
I'd be hesitant to make a change here without a broader discussion in the standards community (Unicode, CLDR, DIN, CSS WG, ....) so that there's clear agreement as to what is expected. Can you point to any clear guidance about these operations, e.g. from DIN? Would this depend on language/locale? (E.g. are expectations the same in Austria and Switzerland as in Germany?)
Comment 3•4 years ago (Assignee)
Moving this to Internationalization, as it's relevant to both Layout and JS, but i18n is where the actual mapping should live.
Comment 4•4 years ago
(In reply to Jonathan Kew (:jfkthame) from comment #2)
Hi,
An implementation that depends on U+1E9E being present in the font, as with font-variant-caps, would be nice.
But that's rather a matter of a complete or incomplete font, i.e. a fallback, not a design decision by the font designer, right? (In that case the designer would build an SS or SZ for U+1E9E; some fonts actually do that.)
As I wrote, I guess the standard for uppercase is just too old, whereas font-variant-caps does it right simply because it's newer.
I found the official document: http://www.rechtschreibrat.com/DOX/rfdr_Bericht_2011-2016.pdf (Page 8, chapter I A 2).
As I wrote: ẞ is optional in general, but it has, for example, been mandatory in official documents and passports for much longer, often rendered with the lowercase glyph for lack of the uppercase one.
As far as I know, ß only exists in German, so language/locale doesn't matter. Switzerland abandoned ß completely.
That council includes members from all German-speaking countries.
I also found a Unicode-casing-guideline. That guideline is stating "The default case mapping operations follow standard German orthography, which uses the string “SS” as the regular uppercase mapping for U+00DF ß latin small letter sharp s. In contrast, the alternate, single character uppercase form, U+1E9E latin capital letter sharp s, is intended for typographical representations of signage and uppercase titles, and in other environments where users require the sharp s to be preserved in uppercase. Overall, such usage is uncommon. Thus, when using the default Unicode casing operations, capital sharp s will lowercase to small sharp s, but not vice versa: small sharp s uppercases to “SS”, as shown in Figure 5-16. A tailored casing operation is needed in circumstances requiring small sharp s to uppercase to capital sharp s."
source: http://www.unicode.org/versions/Unicode6.2.0/ch05.pdf (Page 31, chapter 5.18)
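The asymmetry described in the quoted Unicode text (capital sharp s lowercases to small sharp s, but not vice versa) can be observed directly in JavaScript, whose case functions follow the default Unicode mappings. A minimal sketch, with variable names chosen only for illustration:

```javascript
const smallSharpS = "\u00DF";   // ß
const capitalSharpS = "\u1E9E"; // ẞ

// Capital sharp s lowercases to small sharp s ...
const lowered = capitalSharpS.toLowerCase();

// ... but small sharp s uppercases to the two-letter string "SS".
const uppered = smallSharpS.toUpperCase();

// So a lowercase/uppercase round trip does not get back to ẞ.
const roundTrip = capitalSharpS.toLowerCase().toUpperCase();

console.log(lowered === smallSharpS);     // true
console.log(uppered);                     // SS
console.log(roundTrip === capitalSharpS); // false
```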
First, "follow standard German orthography" is not really up to date.
But, more importantly, the reasoning is circular: usage is "uncommon" because typewriters and digital fonts didn't have the glyph. We didn't use it because we couldn't!
The statement "typographical representations of signage and uppercase titles, and in other environments where users require the sharp s to be preserved in uppercase" is obviously a misinterpretation of reality. Those "environments" are just more professional, so a fallback to SS or lowercase ß is simply not acceptable there. Professionals previously had to craft the glyph themselves for advertisements and the like, while official papers just used the lowercase letter, which looked bad but was correct. A normal person on a typewriter or in MS Word chose SS or ß depending on whether they preferred a correct all-caps form or a correct spelling.
Another reason for "uncommon" and "environment": usually you don't write in all caps, and ß can never be at the beginning of a word. So traditionally there was no use for a capital ß, so it didn't exist in the first place, so it wasn't on typewriters, so it didn't make it into ASCII, and so on... But as soon as you write in all caps you realize that something is missing. That's why the SS rule was introduced. It's just a backward-compatibility bug, so to speak. Keeping that in mind, continuing with that fallback behavior is actually just wrong (although officially accepted in unofficial use cases) now that we have the Unicode glyph.
So while I would insist on implementing it correctly right away in the browser of my choice, since the facts and rules are clear, that discussion should probably be ported to those standards forums to get it right everywhere. But I wouldn't know how.
Oh, and other browsers: IE11 was doing it right (and it's so beautiful in my current project; the customer loved it, haha), but since Edge is now Chrome-based, no other browser is doing it... (That's for uppercase; I didn't check font-variant-caps.)
Comment 6•2 years ago (Assignee)
The Unicode case mapping data still just has the ß -> SS mapping (see https://www.unicode.org/Public/UCD/latest/ucd/SpecialCasing.txt), but mapping to ẞ (U+1E9E) instead would be a trivial override.
However, I'm still concerned that the results will often be poor, because many fonts in common use do not support that codepoint and so fallback will happen (in the case of text-transform), or a missing-glyph box will appear (in the case of font-variant, which happens too late in the rendering pipeline for fallback to come into play).
Therefore, I'm going to suggest that we implement the "new" mapping behind a pref, so that people interested in this can begin to experiment with it, but until there's broad consensus about updating this behavior across the industry, I don't think we should unilaterally make the change by default.
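As a rough illustration of the "trivial override behind a pref" idea, here is a minimal sketch in JavaScript. It is not the actual Gecko patch. The function name `uppercaseForDisplay` and the option `useCapitalSharpS` are hypothetical stand-ins for the rendering code and the pref, and the option is off by default to mirror the proposal:

```javascript
const SMALL_SHARP_S = /\u00DF/g;  // ß
const CAPITAL_SHARP_S = "\u1E9E"; // ẞ

// Hypothetical helper: `useCapitalSharpS` stands in for the proposed pref.
// When it is off (the default), the standard ß -> SS mapping applies.
function uppercaseForDisplay(text, { useCapitalSharpS = false } = {}) {
  // Tailoring step: swap ß for ẞ before the default mapping runs,
  // so the default mapping never sees a small sharp s.
  const source = useCapitalSharpS
    ? text.replace(SMALL_SHARP_S, CAPITAL_SHARP_S)
    : text;
  return source.toUpperCase();
}

const byDefault = uppercaseForDisplay("Straße");
const withPref = uppercaseForDisplay("Straße", { useCapitalSharpS: true });

console.log(byDefault); // STRASSE
console.log(withPref);  // STRAẞE
```

Whether the second result renders well still depends on the font covering U+1E9E, which is the concern raised above.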
Comment 7•2 years ago (Assignee)
Comment 8•2 years ago
(In reply to nikö from comment #4)
> I also found a Unicode-casing-guideline. That guideline is stating "The default case mapping operations follow standard German orthography, which uses the string “SS” as the regular uppercase mapping for U+00DF ß latin small letter sharp s. In contrast, the alternate, single character uppercase form, U+1E9E latin capital letter sharp s, is intended for typographical representations of signage and uppercase titles, and in other environments where users require the sharp s to be preserved in uppercase. [...]
> At first "follow standard German orthography" is not really up to date.
The Unicode text is up-to-date with the latest orthography rules, cf. https://www.rechtschreibrat.com/DOX/rfdr_Regeln_2016_redigiert_2018.pdf, 2.3 §25, E3:
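Bei Schreibung mit Großbuchstaben schreibt man SS. Daneben ist auch die Verwendung des Großbuchstabens ẞ möglich. Beispiel: Straße – STRASSE – STRAẞE.
(In English: "When writing in capital letters, one writes SS. Alongside this, use of the capital letter ẞ is also possible. Example: Straße – STRASSE – STRAẞE.")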
This matches exactly what's written in the Unicode text: Standard upper case for ß is SS, but it's also possible to instead use ẞ.
Comment 9•2 years ago (Assignee)
Yes - and in view of this (along with limited font support), I certainly don't think we should deploy this by default. But given that we've had a number of people asking for this behavior over the years, I'm OK with offering it as a pref for those who want to try it.
Comment 10•2 years ago
Comment 11•2 years ago (Assignee)
Just for the record, note that the patch here will not affect the JavaScript toUpperCase() function, even when the new pref is enabled; it applies only to rendering.
Comment 12•2 years ago
Backed out for causing reftest failures on 1425243-2.html
Comment 13•2 years ago (Assignee)
(In reply to Norisz Fay [:noriszfay] from comment #12)
> Backed out for causing reftest failures on 1425243-2.html
I don't think this was actually the cause of that failure; the test there doesn't have any connection to the code being touched here. And according to https://bugzilla.mozilla.org/show_bug.cgi?id=1798725#c5 it sounds like the backout didn't fix things anyhow. So I'm going to try re-landing this patch.
Comment 14•2 years ago
Comment 15•2 years ago
| bugherder | ||