Open Bug 1427032 Opened 6 years ago Updated 1 year ago

::first-letter { text-transform: capitalize } breaks Arabic rendering

Categories

(Core :: Layout: Text and Fonts, defect, P3)

58 Branch


People

(Reporter: amir.aharoni, Unassigned)

References

(Depends on 1 open bug)

Details

Attachments

(1 file)

User Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.13; rv:59.0) Gecko/20100101 Firefox/59.0
Build ID: 20171224140229

Steps to reproduce:

I applied the CSS rule "::first-letter { text-transform: capitalize }" to Arabic text. See the attached file.


Actual results:

The first letter of the Arabic word appeared disconnected from the rest of the word.


Expected results:

The letter is definitely not supposed to be disconnected.

The CSS rule in question should have no effect on text in writing systems that don't have letter case. The only writing systems that have letter case are Latin, Cyrillic, Greek, Coptic, Armenian, Adlam, Warang Citi, Cherokee, and Osage (and perhaps also Georgian, though how to define case there is disputed). In all other writing systems, including Arabic, Hebrew, Devanagari, and Chinese, letter case and capitalization are irrelevant. In some of them the rule does nothing anyway, but in those that involve complex font shaping, such as Arabic, it actually causes harm.

This is broken in both Firefox and Chrome, so maybe this is something that should be clarified in the CSS standard. I submitted an issue to CSS as well: https://github.com/w3c/csswg-drafts/issues/2135
(I forgot to mention a few other writing systems that have letter case: Old Hungarian, Glagolitic, and Deseret. They are very rare, but still valid.)
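To illustrate the point about caseless scripts, here is a minimal Python sketch. It uses Python's built-in Unicode case mappings as a stand-in for what text-transform does; the helper names `has_case` and `capitalize_first_letter` are made up for this example, and this is not how any browser implements the property.

```python
def has_case(ch: str) -> bool:
    """True if Unicode case mapping changes ch in either direction."""
    return ch.upper() != ch or ch.lower() != ch


def capitalize_first_letter(word: str) -> str:
    """Rough approximation of 'capitalize' applied to the first letter only."""
    return word[:1].upper() + word[1:]


# One sample letter per script, written as escapes to avoid bidi confusion.
samples = {
    "Latin": "a",
    "Cyrillic": "\u0431",    # be
    "Greek": "\u03b1",       # alpha
    "Arabic": "\u0645",      # meem
    "Hebrew": "\u05d0",      # alef
    "Devanagari": "\u0915",  # ka
}

for script, ch in samples.items():
    print(f"{script}: {'cased' if has_case(ch) else 'caseless'}")

arabic_word = "\u0633\u0644\u0627\u0645"  # seen, lam, alef, meem
# The transformation leaves the Arabic word unchanged, character for character.
print(capitalize_first_letter(arabic_word) == arabic_word)
```

Since the transformed string is identical to the input for Arabic, any visible change in rendering has to come from how the text is split and shaped, not from the case mapping itself.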
Summary: ::first-letter: { text-transform: capitalize } breaks Arabic rendering → ::first-letter { text-transform: capitalize } breaks Arabic rendering
Component: Untriaged → Layout: Text
Product: Firefox → Core
A change in the text-transform property always causes textrun separation (in current Gecko code), as text-transform is implemented by creating a special "transformed text run" that implements the case-change, almost as if it were a font feature. The creation of transformed text runs happens prior to script run analysis, so at that point, we don't know that the text-transform change will actually be a no-op for the text involved. (And more generally, the element involved might contain Latin characters as well as Arabic, in which case we'd definitely have to create transformed textruns.)

More generally, text shaping works across element (or pseudo-element) boundaries only when the style properties on the two sides of the boundary are the same, for all properties that are known to potentially affect font shaping. And because text-transform is a font-rendering effect (not a transformation of the actual DOM content), it causes such an interruption in shaping. It's comparable to specifying a font-variant-* property on the ::first-letter, for example.
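The run-splitting behavior described above can be modeled roughly as follows. This is a simplified sketch, not Gecko's code: `Span`, `SHAPING_PROPERTIES`, and `split_into_runs` are hypothetical names, and the property list is an illustrative assumption rather than the real set Gecko checks.

```python
from dataclasses import dataclass, field
from itertools import groupby

# Assumed, illustrative subset of properties that can affect shaping.
SHAPING_PROPERTIES = ("font-family", "font-variant", "text-transform")


@dataclass
class Span:
    """A piece of text from one element or pseudo-element, with its style."""
    text: str
    style: dict = field(default_factory=dict)


def shaping_key(style: dict) -> tuple:
    """Only the shaping-relevant properties decide whether spans can merge."""
    return tuple(style.get(prop) for prop in SHAPING_PROPERTIES)


def split_into_runs(spans: list) -> list:
    """Merge adjacent spans with equal shaping keys; each result is shaped alone."""
    return [
        "".join(span.text for span in group)
        for _, group in groupby(spans, key=lambda s: shaping_key(s.style))
    ]


first, rest = "\u0633", "\u0644\u0627\u0645"  # seen | lam, alef, meem

# ::first-letter with text-transform: the word is cut into two runs.
print(split_into_runs([Span(first, {"text-transform": "capitalize"}), Span(rest)]))

# ::first-letter with only a color change: one run, shaped as a whole.
print(split_into_runs([Span(first, {"color": "red"}), Span(rest)]))
```

In this model the decision is made from style alone, before anyone looks at the characters, which is why a text-transform that turns out to be a no-op for Arabic still splits the word.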

Ideally, I guess the basic Arabic-script features (init/medi/fina/isol) would work across boundaries, by propagating appropriate context, but we don't currently do that. And it would still be less than entirely satisfactory, as something like text-transform:capitalize would still break things like ligatures that want to actually span the boundary. To fix that, I think we'd need to rework the implementation of transformed textruns more radically.
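As a rough illustration of what propagating context across the boundary would mean, the sketch below picks a positional form (init/medi/fina/isol) for each letter. It is heavily simplified: the `JOINING_TYPE` table covers only a handful of letters, transparent marks and ligatures are ignored, and the `before`/`after` parameters are a hypothetical way of passing neighboring text, not an existing shaper API.

```python
# Joining types for a few Arabic letters: D = dual-joining, R = right-joining.
JOINING_TYPE = {
    "\u0628": "D",  # beh
    "\u0633": "D",  # seen
    "\u0644": "D",  # lam
    "\u0645": "D",  # meem
    "\u0627": "R",  # alef
    "\u062f": "R",  # dal
}


def joining_forms(text: str, before: str = "", after: str = "") -> list:
    """Return the positional form of each letter in text.

    before/after hold text from neighboring runs; only the adjacent
    character on each side is consulted.
    """
    chars = list(before[-1:]) + list(text) + list(after[:1])
    offset = len(before[-1:])
    forms = []
    for i in range(offset, offset + len(text)):
        cur = JOINING_TYPE.get(chars[i])
        prev = JOINING_TYPE.get(chars[i - 1]) if i > 0 else None
        nxt = JOINING_TYPE.get(chars[i + 1]) if i + 1 < len(chars) else None
        joins_prev = cur in ("D", "R") and prev == "D"
        joins_next = cur == "D" and nxt in ("D", "R")
        if joins_prev and joins_next:
            forms.append("medi")
        elif joins_next:
            forms.append("init")
        elif joins_prev:
            forms.append("fina")
        else:
            forms.append("isol")
    return forms


word = "\u0633\u0644\u0627\u0645"  # seen, lam, alef, meem
first, rest = word[:1], word[1:]

print(joining_forms(word))                          # whole word shaped together
print(joining_forms(first) + joining_forms(rest))   # split, no context
print(joining_forms(first, after=rest)
      + joining_forms(rest, before=first))          # split, context propagated
```

Without context, the first letter comes out as isol and the second as init, which matches the disconnected rendering reported here; with context passed in, the forms match the unsplit word. As noted above, this would still not cover ligatures that need to span the boundary.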
Priority: -- → P3
Severity: normal → S3

The text-shaping refactoring that I'm contemplating in bug 1748636 might help here, by handling script-run analysis earlier. I'll mark this as depending on that bug, to remind us to look into it further.

Status: UNCONFIRMED → NEW
Depends on: 1748636
Ever confirmed: true