Open Bug 1706879 Opened 4 years ago Updated 5 months ago

Hyphens: auto doesn't work with capitalized words

Categories

(Core :: Layout: Text and Fonts, defect)

Firefox 88
defect

Tracking Status
firefox88 --- affected
firefox89 --- affected
firefox90 --- affected

People

(Reporter: u684083, Unassigned)

Attachments

(2 files)

User Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:88.0) Gecko/20100101 Firefox/88.0

Steps to reproduce:

Apply the CSS property "hyphens: auto" to a single word in a container that is narrower than the word. Use a long word that starts with a capital letter.

Actual results:

Word is not hyphenated.

Expected results:

The word should be hyphenated, as in other browsers. Interestingly, the lowercase word "management" is hyphenated correctly.
Code example: https://jsfiddle.net/61g0pnbo/1/

Managed to reproduce this issue on Windows 10 x64, macOS 10.15, Ubuntu 20.04 and on Windows 7 x64.

Severity: -- → S4
Status: UNCONFIRMED → NEW
Component: Untriaged → Layout: Text and Fonts
Ever confirmed: true
OS: Unspecified → All
Product: Firefox → Core
Hardware: Unspecified → All

This is by design, to avoid hyphenating personal and corporate names, etc. (The behavior is different in German, where all nouns are capitalized.) See bug 1550532.

You can enable hyphenation of capitalized words by adding a boolean preference intl.hyphenate-capitalized.en-US (or for other locale as appropriate) to about:config, with its value set to true.

(In reply to Jonathan Kew (:jfkthame) from comment #2)

> This is by design, to avoid hyphenating personal and corporate names, etc. (The behavior is different in German, where all nouns are capitalized.) See bug 1550532.
>
> You can enable hyphenation of capitalized words by adding a boolean preference intl.hyphenate-capitalized.en-US (or for other locale as appropriate) to about:config, with its value set to true.

So corporations would really rather have their name disappear off the page than have it hyphenated in ALL CASES?

So if I have a title starting with e.g. "AmericanAirlines is hiring." with hyphens set to auto because it's supposed to wrap on mobile;
they would really prefer it to say:

AmericanAi
is hiring
(because the rest of the word doesn't fit and flies off the page, perhaps completely hidden by overflow: hidden;)

or perhaps they prefer

AmericanAi
rlines is hir
ing
using word break instead because this "by design" fix broke hyphenation completely?

I'm pretty sure they would prefer this:

American-
Airlines is
hiring

if the only alternative is clipping...

Why is this the default behaviour, when it works totally differently from other browsers?

And is there a better workaround than setting a lang="de" attribute on the containing element or making everything lower case?

I would imagine providing non-language specific -moz-hyphens: force or something similar would be a good thing to have if there's conflicting defaults across browsers.

There definitely should be a way to disable this functionality completely. The default behavior breaks very often with the Finnish language (and presumably with many other languages as well).

It is very difficult to maintain regular headings in languages other than German. For example, "Kodin elektroniikka" with hyphens: auto would result in:

"Kodin elek-
troniikka"

or in other situations even compound words such as

"Konsolipelaamine
n"

(it doesn't even include a hyphen!)

In my opinion it is completely bonkers that this essential functionality now behaves completely differently from other browsers (as @kai.kulju suggested).

Flags: needinfo?(jfkthame)

Out of curiosity, I just tried Chrome, and I'm seeing similar behavior to Firefox: a capitalized word does not get hyphenated.

Testcase:

data:text/html;charset=utf-8,<p lang="en" style="width:80px;hyphens:auto;border:1px solid gray">AmericanAirlines is hiring.<br><br>americanairlines is hiring.

In both Chrome and Firefox, this gives me

AmericanAirlines
is hiring.

ameri-
canairlines
is hiring.

where "Airlines" on the first line extends beyond the box. (And would be clipped if we used overflow: clip or similar.)

So I think the claim that Firefox's behavior is "completely different to other browsers" is perhaps overstated.

For Finnish specifically, one thing you can try is creating a new preference in about:config named intl.hyphenate-capitalized.fi with type Boolean with its value set to true.
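To repeat the testcase above with other languages or words, the data: URL can be generated rather than typed by hand. The following is a minimal sketch in Python; the function name and its parameters are made up for illustration, and it only mirrors the markup of the testcase above (a narrow box with hyphens:auto, showing the text as given and then in lowercase).

```python
from html import escape
from urllib.parse import quote


def hyphenation_testcase_url(lang: str, text: str, width_px: int = 80) -> str:
    """Build a data: URL showing `text` as given and in lowercase in a narrow box.

    Hypothetical helper; the markup mirrors the testcase in this comment.
    """
    safe_text = escape(text)  # treat the input as plain text, not markup
    html = (
        f'<p lang="{escape(lang)}" '
        f'style="width:{width_px}px;hyphens:auto;border:1px solid gray">'
        f"{safe_text}<br><br>{safe_text.lower()}"
    )
    # Percent-encode the markup so the URL survives copy/paste.
    return "data:text/html;charset=utf-8," + quote(html)


if __name__ == "__main__":
    print(hyphenation_testcase_url("fi", "Kodin elektroniikka"))
```

Pasting the printed URL into the address bar of each browser gives a side-by-side comparison for that lang value.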

Flags: needinfo?(jfkthame)

I would argue the logic is quite drastically different for every first word of a sentence as seen from my attachment. Chrome is on the left and Firefox on the right. Both locale attributes have been set to "fi".

If I changed the locale to German, the result would be as expected. This would, however, interfere with visitors' screen readers and other accessibility tools.

From one point of view I can appreciate the suggestion to just change the behavior on my own computer. Unfortunately, as a developer, I also need to take into consideration other users who might not be as tech savvy as me. From their point of view, the current default behavior and/or localized version of the functionality is not correct.

example site
https://developer.mozilla.org/en-US/docs/Web/CSS/hyphens

This appears to be lang-specific in Chrome: if you change the lang attribute to en, you'll see that it also leaves the word "Extraordinarily" unhyphenated, and overflowing the box.

So the difference in behavior seems to be that Chrome applies (its equivalent of) intl.hyphenate-capitalized for fi in addition to de, whereas Firefox currently only has that setting (by default) for de.

We can certainly consider setting that pref by default for Finnish, if long words are common enough there that it seems beneficial.

[edit: indeed, bug 1787442 is on file about that. Let's go ahead and do it there...]
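The rule as described in this thread can be summarised in a few lines of Python. This is only a simplified model and not Gecko's actual code: it assumes "capitalized" means that the first character is uppercase, and the only default it encodes is the one stated above (enabled for de). The function name and the dictionary-based pref lookup are made up for illustration.

```python
from typing import Dict, Optional

# Simplified model of the behaviour described in this thread; NOT Gecko's code.
# The only default taken from the thread: capitalized words hyphenate in "de".
DEFAULT_HYPHENATE_CAPITALIZED: Dict[str, bool] = {"de": True}


def may_auto_hyphenate(
    word: str, locale: str, user_prefs: Optional[Dict[str, bool]] = None
) -> bool:
    """Return True if hyphens:auto may break `word` under this model."""
    # Assumption: a word counts as capitalized if its first character is uppercase.
    if not word or not word[0].isupper():
        return True  # lowercase words are always eligible
    pref_name = f"intl.hyphenate-capitalized.{locale}"
    # A pref set by the user in about:config overrides the built-in default.
    if user_prefs and pref_name in user_prefs:
        return user_prefs[pref_name]
    return DEFAULT_HYPHENATE_CAPITALIZED.get(locale, False)
```

Under this model, "Management" with lang="en-US" is not eligible unless intl.hyphenate-capitalized.en-US is set to true, while "management" always is.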

See Also: → 1787442

(In reply to Jonathan Kew [:jfkthame] from comment #6)

> Out of curiosity, I just tried Chrome, and I'm seeing similar behavior to Firefox: a capitalized word does not get hyphenated.
>
> Testcase:
>
> data:text/html;charset=utf-8,<p lang="en" style="width:80px;hyphens:auto;border:1px solid gray">AmericanAirlines is hiring.<br><br>americanairlines is hiring.
>
> In both Chrome and Firefox, this gives me
>
> AmericanAirlines
> is hiring.
>
> ameri-
> canairlines
> is hiring.
>
> where "Airlines" on the first line extends beyond the box. (And would be clipped if we used overflow: clip or similar.)
>
> So I think the claim that Firefox's behavior is "completely different to other browsers" is perhaps overstated.
>
> For Finnish specifically, one thing you can try is creating a new preference in about:config named intl.hyphenate-capitalized.fi with type Boolean with its value set to true.

Please note that Finnish hyphenation in Chromium works differently on Mac than on Windows/Linux ( https://bugs.chromium.org/p/chromium/issues/detail?id=652964 ). While Chromium has hyphenation support on all platforms, not all languages are supported on all platforms. Outside Mac, Finnish is not on the list of supported hyphenation languages: https://source.chromium.org/chromium/chromium/src/+/main:third_party/hyphenation-patterns/hyb/ . The screenshots above are from macOS, AFAIK.

(In reply to Jonathan Kew [:jfkthame] from comment #8)

> This appears to be lang-specific in Chrome: if you change the lang attribute to en, you'll see that it also leaves the word "Extraordinarily" unhyphenated, and overflowing the box.
>
> So the difference in behavior seems to be that Chrome applies (its equivalent of) intl.hyphenate-capitalized for fi in addition to de, whereas Firefox currently only has that setting (by default) for de.
>
> We can certainly consider setting that pref by default for Finnish, if long words are common enough there that it seems beneficial.
>
> [edit: indeed, bug 1787442 is on file about that. Let's go ahead and do it there...]

Yes, Finnish is known for the use of long words, so much so that it's a source for some memes/jokes.

Because I have access, here are some stats for unique words in Finnish from www.hel.fi, the City of Helsinki (our capital) :

  • Words: 143306
  • Median: 11
  • Average: 11.91
  • Min: 2
  • Max: 50

Sure, in such a big list there are typos and other problems, but I think we get the gist. For example, "koulunkäynninohjaajaharjoittelijoiden" is a real 37-character word that appears on this school page: https://www.hel.fi/fi/kasvatus-ja-koulutus/kankarepuiston-peruskoulu#harjoittelijakoodinaattori . It means roughly "trainee school assistants", and it does not fit on a single line on a 320px mobile screen with 16px side padding and an 18px font size.

I can provide more examples of long words in Finnish, but I'd imagine there are many more languages in the world that have similar problems. Not hyphenating capitalised words seems short-sighted to me if no override property is provided.
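For anyone who wants comparable numbers for another site or language, figures like the ones listed above can be approximated with a short script. This is a minimal sketch and not the method used for the hel.fi numbers: the function name is made up, and it assumes that a "word" is any run of letters and that uniqueness is case-insensitive.

```python
import re
from statistics import mean, median


def unique_word_length_stats(text: str) -> dict:
    """Length statistics over the unique words in `text`.

    Assumptions (not taken from the thread): a word is a run of Unicode
    letters, and words are compared case-insensitively. The text must
    contain at least one word, otherwise `median` raises StatisticsError.
    """
    words = {w.lower() for w in re.findall(r"[^\W\d_]+", text)}
    lengths = [len(w) for w in words]
    return {
        "words": len(lengths),
        "median": median(lengths),
        "average": round(mean(lengths), 2),
        "min": min(lengths),
        "max": max(lengths),
    }


if __name__ == "__main__":
    print(unique_word_length_stats("Kodin elektroniikka ja konsolipelaaminen"))
```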

Duplicate of this bug: 1877571

I just tested this again in a couple of Blink-based browsers and Firefox with a handful of language codes, and it seems that Blink has now implemented this misfeature more "correctly". That is, hyphenation of Title Case is still completely broken "by design" in English, but apparently words in any case are properly hyphenated in all other languages, as long as the language has hyphenation support.

I can't tell if they have done the sensible thing (made hyphenation work by default, and enabled this stupid "don't hyphenate titles and first words of sentences" rule only when the language is "en"), or if they have simply added an exception similar to "intl.hyphenate-capitalized.*" for every single language they have hyphenation for except English. The latter would be a truly silly way to do it, since this absurd preference for having proper nouns overflow rather than be hyphenated is, as far as I know, exclusively an English-language preference. But hey, at least hyphenation now works properly for everyone except English in Blink browsers!

But Firefox still stubbornly makes hyphenation completely useless in all languages... Why is this so hard to fix? It's been 5 years since you broke it!
This is just embarrassing!


The reason I say this stupid "don't break proper nouns (and incidentally also every other word with any uppercase letter for any reason)" rule makes auto hyphenation completely useless (even in English) is that it mainly and severely affects titles, because they tend to use very large font sizes. And when the text is set in a very large font, you really need hyphenation. Breaking a title into one word per line just because some words are rather long doesn't look very nice, and having single words that are longer than the screen width overflow, causing horizontal scrolling or, worse, being completely hidden by overflow:hidden, is terrible UX.

And since title fields tend to be implemented as plain-text (i.e. non-HTML) fields in most CMSes, inserting soft hyphens is rather difficult: you can't just tell the editors to type the HTML entity &shy;. You'd have to teach them how to type an invisible character using complex, platform-specific keyboard shortcuts, or a (platform-specific) character map app, or by copy-pasting an invisible character from a document; and then they would somehow have to avoid deleting this invisible character when they later edit the title. So asking a non-technical editor to insert soft hyphens is basically a non-solution to the problem.
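One way to take that burden off editors would be to insert the soft hyphens (U+00AD) in code when the title is rendered, instead of storing them in the CMS field. The following is only a sketch of that idea under stated assumptions: the function name is made up, and the table of break offsets per word is hypothetical and would have to come from somewhere else (a hyphenation dictionary or a hand-maintained list).

```python
from typing import Dict, List

SOFT_HYPHEN = "\u00ad"  # invisible unless the line actually breaks there


def insert_soft_hyphens(title: str, break_points: Dict[str, List[int]]) -> str:
    """Insert U+00AD into known words of a plain-text title.

    `break_points` is a hypothetical lookup table mapping an exact word to
    the character offsets where a break is allowed.
    """
    out = []
    for word in title.split(" "):
        # Insert from the right so that earlier offsets stay valid.
        for offset in sorted(set(break_points.get(word, [])), reverse=True):
            if 0 < offset < len(word):
                word = word[:offset] + SOFT_HYPHEN + word[offset:]
        out.append(word)
    return " ".join(out)


if __name__ == "__main__":
    rendered = insert_soft_hyphens(
        "AmericanAirlines is hiring.", {"AmericanAirlines": [8]}
    )
    print(repr(rendered))
```

This keeps the stored title free of invisible characters, at the cost of maintaining the break table, so it is a workaround and not a replacement for working automatic hyphenation.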

And in titles English has this habit of capitalising every word, and even if you don't, the first word of every sentence is normally capitalised in all European languages I know of.
And even if this misfeature were somehow fixed with a magical "proper noun detector", so that it actually applied only to proper nouns and not to every random string with capital letters, it would still be quite broken in English. Proper nouns are one of the types of words that, even in English, tend to be written as a single word for stylistic reasons, which means they can be quite long even though they're compound words that are perfectly possible to hyphenate properly.

Hyphenation in body text is just a "nice to have" feature: it mostly just looks slightly nicer, and this is where a "don't hyphenate proper nouns" style rule would be fine. But if you have this style preference for small-font body text, then just don't apply hyphens: auto to body text! You'll only get a faintly more ragged edge in English, with its weird habit of writing compound words as separate short words anyway.
Forcing this English style preference into the hyphenation engine is rather silly, because you then sometimes break the layout a little in titles. But forcing this purely English style preference onto every language in the world (except German, as if that's the only other language in the world that has long words, lol), where it breaks the layout of large-text titles almost all the time, is just extremely arrogant. Just reverse the logic already: make this broken rule opt-in and apply it only to English, so the rest of the world can actually use hyphenation properly!
