Closed Bug 1240277 Opened 8 years ago Closed 2 years ago

Add hyphenation patterns for Indian languages

Categories

(Core :: Internationalization, enhancement)

enhancement
Not set
normal

RESOLVED FIXED
97 Branch
Tracking Status
firefox97 --- fixed

People

(Reporter: santhosh.thottingal, Assigned: jfkthame)

References

(Blocks 1 open bug)

Details

(Keywords: dev-doc-needed)

Attachments

(1 file)

User Agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/47.0.2526.106 Safari/537.36

Steps to reproduce:

In Bug 672320, while hyphenation patterns were being added for many locales, the question of Indian languages was left unresolved: the patterns were licensed under the LGPL, which was not compatible for inclusion.

Licensing should no longer be an issue, since I have relicensed the packages under the permissive MIT license.
Patterns are available at https://github.com/santhoshtr/hyphenation

Please add them to Firefox.
Component: Untriaged → Internationalization
Product: Firefox → Core
Severity: normal → enhancement
Depends on: 656750
Keywords: dev-doc-needed

Is it possible to move this forward? It seems that Santhosh has done much of the work.

See also https://w3c.github.io/iip/gap-analysis/taml-gap#issue79_hyphenation

Tamil is a language that really needs hyphenation support, because it has long words, and there are others that are similar, such as Malayalam (see https://r12a.github.io/scripts/malayalam/#linebreak).

Jonathan, I think the question is for you :)

Flags: needinfo?(jfkthame)

Yes, I think we could try adding these patterns. I'll put up a patch.

It's unclear to me how much review and testing these have actually had among the relevant communities. They're extremely simple rule-based patterns (apparently derived, without attribution, from a simple proof-of-concept that I originally posted to the XeTeX mailing list back in 2004), and they may not fully handle cases that go beyond the simple "orthographic cluster" structure of these scripts. Still, they should be sufficient as a starting point.
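To illustrate roughly what "orthographic cluster" rules amount to, here is a minimal Python sketch. It is not the pattern set discussed in this bug, and the function names `split_clusters` and `with_soft_hyphens` are hypothetical. It assumes that a cluster is a base letter plus any following combining marks, and that a letter directly after a virama (canonical combining class 9) joins the preceding cluster as a conjunct. That assumption does not hold equally for every Indic script, and the sketch ignores ZWJ/ZWNJ and other special cases.

```python
import unicodedata


def split_clusters(word: str) -> list[str]:
    """Split a word into approximate orthographic clusters.

    Assumption (illustrative only): a cluster is a base character followed
    by any combining marks, and a character right after a virama stays in
    the same cluster as part of a conjunct.
    """
    clusters: list[str] = []
    for ch in word:
        # Mn/Mc cover vowel signs, anusvara, visarga, virama and similar marks.
        is_mark = unicodedata.category(ch) in ("Mn", "Mc")
        # Canonical combining class 9 identifies virama-type signs.
        after_virama = (
            bool(clusters) and unicodedata.combining(clusters[-1][-1]) == 9
        )
        if clusters and (is_mark or after_virama):
            clusters[-1] += ch
        else:
            clusters.append(ch)
    return clusters


def with_soft_hyphens(word: str) -> str:
    """Mark every cluster boundary with U+00AD SOFT HYPHEN."""
    return "\u00ad".join(split_clusters(word))


if __name__ == "__main__":
    # Devanagari "hindi": HA, VOWEL SIGN I, NA, VIRAMA, DA, VOWEL SIGN II
    print(split_clusters("\u0939\u093f\u0928\u094d\u0926\u0940"))
```

A real pattern set would also enforce minimum fragment lengths at the start and end of the word, and would need script-specific exceptions that this sketch does not attempt.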

Flags: needinfo?(jfkthame)

Using hyphenation patterns from https://github.com/santhoshtr/hyphenation.

The tests here are implemented as Mozilla reftests rather than added to WPT because I don't think
we can reasonably have such tests in WPT. The specific set of languages for which the UA supports
auto-hyphenation is not a normative requirement, and nor is the particular dictionary or algorithm
that will be used for any specific language. As such, the exact results are not defined by the
spec. (They may also change over time, if the hyphenation rules we use are updated, in which case
the tests will have to change accordingly.)

Assignee: nobody → jfkthame
Pushed by jkew@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/2a669757a3cd
Add hyphenation patterns for Indic languages. r=platform-i18n-reviewers,dminor
Status: UNCONFIRMED → RESOLVED
Closed: 2 years ago
Resolution: --- → FIXED
Target Milestone: --- → 97 Branch
Blocks: 656750
No longer depends on: 656750