Open Bug 656750 Opened 13 years ago Updated 2 years ago

[meta] Enhance hyphenation

Categories

(Core :: Layout: Text and Fonts, enhancement)

People

(Reporter: mnater, Unassigned)

References

(Depends on 6 open bugs, Blocks 1 open bug)

Details

(Keywords: meta)

User-Agent:       Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_6_7; de-de) AppleWebKit/533.21.1 (KHTML, like Gecko) Version/5.0.5 Safari/533.21.1
Build Identifier: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6; rv:6.0a1) Gecko/20110510 Firefox/6.0a1

Hi

Justified text, and consequently hyphenation, has been a common request from customers (especially for German-language websites).
That is why I hacked Hyphenator.js (http://code.google.com/p/hyphenator/), which I have always considered a crutch until the lack of hyphenation in browsers is fixed. I'm glad to see that the WebKit guys and you guys are working on this!
(https://bugzilla.mozilla.org/show_bug.cgi?id=253317)

I just quickly scanned the comments above and would like to share some thoughts:

a) IMHO spell-checking and hyphenation have to be separate. As a user I'd install spell-checkers only for the languages I use actively. Spell-checking doesn't affect my perception of a website.
In contrast, hyphenation affects the layout of pages and thus the perception of a website. As a user I also visit sites in languages other than those I actively use. As a web developer I'd like to rely on text being hyphenated on the client, regardless of the language I'm using.
Conclusion: So that web designers can rely on hyphenation, make sure that the patterns are available without any interaction by the user!
Idea: Expose the availability of hyphenation patterns to JavaScript and implement 'hyphenate-resource'. It's then up to the user to make sites load faster by installing patterns locally.

b) Feedback on Hyphenator.js revealed that misplaced break points matter a great deal to customers. A faulty line break inside a word is a spelling error and thus not acceptable. Automatic hyphenation cannot be perfect!
Conclusion: Give web developers good ways to control and fix hyphenation.
Ideas:
- there should be a way to display all hyphenation points in a word (cf. \showhyphens in TeX).
- don't hyphenate words that are hyphenated manually (i.e. contain soft hyphens) -> word-level corrections
- expose an interface for exceptions -> page-level corrections
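
The precedence suggested by the last two ideas could look roughly like the JavaScript sketch below. It is only an illustration: `hyphenateWord`, `applyException`, `autoHyphenate`, and the `exceptions` map are hypothetical names, not part of Hyphenator.js or Gecko, and `autoHyphenate` is a placeholder for a real pattern-based hyphenator.

```javascript
const SHY = "\u00AD"; // soft hyphen (conditional hyphenation character)

// Hypothetical page-level exception list: lowercase word -> hyphenated form.
const exceptions = new Map([["legend", "le-gend"]]);

// Placeholder for a pattern-based hyphenator (assumption: returns the word
// with soft hyphens inserted; here it simply returns the word unchanged).
function autoHyphenate(word) {
  return word;
}

// Copy the break positions of an exception such as "le-gend" onto the
// original word, so that the original capitalization is preserved.
// Assumes the exception has the same letters as the word.
function applyException(word, hyphenated) {
  let out = "";
  let i = 0;
  for (const ch of hyphenated) {
    if (ch === "-") {
      out += SHY;
    } else {
      out += word[i];
      i += 1;
    }
  }
  return out;
}

function hyphenateWord(word) {
  // 1. Word-level correction: a manually hyphenated word is left untouched.
  if (word.includes(SHY)) {
    return word;
  }
  // 2. Page-level correction: an exception overrides automatic hyphenation.
  const exception = exceptions.get(word.toLowerCase());
  if (exception !== undefined) {
    return applyException(word, exception);
  }
  // 3. Otherwise fall back to automatic hyphenation.
  return autoHyphenate(word);
}
```

With this ordering, a word that already contains a soft hyphen is never re-hyphenated, and a listed exception wins over whatever the patterns would produce.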

c) A major problem in narrow columns is path-like strings (links, e-mail addresses). There should be an opportunity to break those strings, too. In Hyphenator.js I'm using the zero width space as the hyphen character for path-like strings and a simple regexp to find breakpoints conforming to TeX's url package (http://mirror.switch.ch/ftp/mirror/tex/macros/latex/contrib/url/url.pdf).
Idea: Implement 'hyphenate-character' and 'hyphenate-resource' and optionally include patterns for path-like strings
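
As a rough illustration of the zero width space approach, a minimal JavaScript sketch follows. The set of break characters is an assumption chosen for the example; it is neither the actual regexp used in Hyphenator.js nor the exact list from TeX's url package, and `breakPathLike` is a hypothetical name.

```javascript
const ZWSP = "\u200B"; // zero width space: allows a line break, shows no hyphen

// Illustrative set of characters after which a break is allowed (assumption).
const BREAK_CHARS = "\\/.@_?&=#:\\-";

// Match a run of break characters that is followed by an ordinary character,
// so that "://" stays together and nothing is inserted at the end of a string.
const BREAK_AFTER = new RegExp(
  `([${BREAK_CHARS}]+)(?=[^\\s${BREAK_CHARS}])`,
  "g"
);

// Insert a zero width space after each run of break characters.
// Intended for strings already identified as path-like (URLs, paths,
// e-mail addresses), not for running text.
function breakPathLike(text) {
  return text.replace(BREAK_AFTER, "$1" + ZWSP);
}
```

For example, `breakPathLike("user@example.org")` inserts a break opportunity after `@` and after the dot, without adding any visible character to the address.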

Reproducible: Always
> There should be an opportunity to break those strings, too.

We already do, I thought.  And that's even without word-wrap:break-word.
(In reply to comment #0)

> a) IMHO spell-checking and hyphenation have to be separate.

They are. They're controlled separately, and depend on separate resources.

> - there should be a way to display all hyphenation points in a word (cf.
> \showhyphens in TeX).

Would it be sufficient to do something like

data:text/html,<div style="width:0;-moz-hyphens:auto" lang="en-us">supercalifragilisticexpialidocious

> - don't hyphenate words that are hyphenated manually (i.e. contain
> soft hyphens) -> word-level corrections

Yes, this is something we should handle, but don't currently.

> - expose an interface for exceptions -> page-level corrections

An interesting idea; probably worth discussing ideas for how to specify this in CSS on the www-style list.
(In reply to comment #1)
> > There should be an opportunity to break those strings, too.
> 
> We already do, I thought.  And that's even without word-wrap:break-word.

OK.
I didn't know that.
Just checked: it works for URLs and paths but not for e-mail addresses.
(In reply to comment #2)
> They are. They're controlled separately, and depend on separate resources.

Thanks for the clarification.

> Would it be sufficient to do something like
> 
> data:text/html,<div style="width:0;-moz-hyphens:auto"
> lang="en-us">supercalifragilisticexpialidocious

Oh! Yes. Thanks!

> > - don't hyphenate words that are hyphenated manually (i.e. contain
> > soft hyphens) -> word-level corrections
> 
> Yes, this is something we should handle, but don't currently.

The WD says:
"Conditional hyphenation characters inside a word, if present, take priority over automatic resources when determining hyphenation points within the word."
(http://www.w3.org/TR/2011/WD-css3-text-20110412/#hyphens)

> > - expose an interface for exceptions -> page-level corrections
> 
> An interesting idea; probably worth discussing ideas for how to specify this
> in CSS on the www-style list.

See http://lists.w3.org/Archives/Public/www-style/2011Mar/0721.html
Another possibility would be to edit the patterns:
.l8e9g8e8n8d. -> le-gend (instead of leg-end which is irritating)

But this is complicated and requires hyphenate-resource…
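
For readers unfamiliar with the pattern notation: in Liang-style patterns the digits between letters are priorities, the highest value at each position wins, and a break is allowed only where the winning value is odd. The JavaScript sketch below shows how the pattern above would force le-gend. It is a simplified illustration, not Gecko's or Hyphenator.js's implementation: `parsePattern` and `hyphenate` are hypothetical names, the competing pattern `g1e` in the usage note is made up for the example, and minimum prefix and suffix lengths are ignored.

```javascript
// Split a pattern such as ".l8e9g8e8n8d." into its letters and the values
// between them; values[i] is the value in front of letters[i].
function parsePattern(pattern) {
  const letters = [];
  const values = [0];
  for (const ch of pattern) {
    if (ch >= "0" && ch <= "9") {
      values[values.length - 1] = Number(ch);
    } else {
      letters.push(ch);
      values.push(0);
    }
  }
  return { letters: letters.join(""), values };
}

// Apply all patterns to a word and mark the allowed breaks with "-".
// Assumes lowercasing does not change the length of the word.
function hyphenate(word, patterns) {
  const padded = "." + word.toLowerCase() + "."; // dots mark the word edges
  const scores = new Array(padded.length + 1).fill(0);
  for (const pattern of patterns) {
    const { letters, values } = parsePattern(pattern);
    let start = padded.indexOf(letters);
    while (start !== -1) {
      // The highest value seen at each position wins.
      values.forEach((value, k) => {
        scores[start + k] = Math.max(scores[start + k], value);
      });
      start = padded.indexOf(letters, start + 1);
    }
  }
  let out = "";
  for (let j = 0; j < word.length; j++) {
    // scores[j + 1] is the position in front of word[j]; odd means "break".
    if (j > 0 && scores[j + 1] % 2 === 1) {
      out += "-";
    }
    out += word[j];
  }
  return out;
}
```

With only the made-up pattern, `hyphenate("legend", ["g1e"])` gives `leg-end`; adding `.l8e9g8e8n8d.` to the list raises the value between g and e to the even 8 and sets an odd 9 between e and g, so the result becomes `le-gend`.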
Opened discussion on www-style:
http://lists.w3.org/Archives/Public/www-style/2011May/0337.html
(In reply to comment #4)
> (In reply to comment #2)
> > (In reply to comment #0)
> > > - there should be a way to display all hyphenation points in a word (cf.
> > > \showhyphens in TeX).
> > 
> > Would it be sufficient to do something like
> > 
> > data:text/html,<div style="width:0;-moz-hyphens:auto"
> > lang="en-us">supercalifragilisticexpialidocious
> 
> Oh! Yes. Thanks!

Isn't this what 'hyphens: all' is intended to be? (The 'all' value was removed in bug 655198.)

Also, I think it's safe to confirm this RFE. And I've added a few dependencies to ensure this doesn't get lost.
Status: UNCONFIRMED → NEW
Depends on: 253317, 655198, bcp47
Ever confirmed: true
Blocks: bcp47
Depends on: 666662
No longer depends on: bcp47
Depends on: 672320
(In reply to comment #6)
> > > > - there should be a way to display all hyphenation points in a word (cf.
> > > > \showhyphens in TeX).
> 
> Isn't this what 'hyphens: all' is intended to be? (The 'all' value was
> removed in bug 655198.)

Yes, "hyphens:all" would do that, but we have not implemented it (and probably won't). It doesn't have a clear use case, given that for debugging purposes there are techniques available that don't require us to add special code to Gecko. So the cost/benefit balance doesn't look favorable.

Note that http://www.w3.org/TR/css3-text/ mentions that "hyphens:all" is considered at-risk for removal.
Blocks: 953408
Depends on: 956213
Depends on: 656879
Depends on: 767279
Depends on: 727658
Keywords: meta
Blocks: 958173
Blocks: 958187
Depends on: 987668
Depends on: 987674
Blocks: 1240277
Depends on: 1401575
No longer blocks: 958173, 958187, 953408, 1240277
Depends on: 958173, 958187, 953408, 1240277
Summary: Enhance hyphenation → [meta] Enhance hyphenation
Severity: normal → S3