Open Bug 1817862 Opened 2 years ago Updated 2 years ago

Zero-width joiner does not take effect across direction-run boundaries [was: is ignored]

Categories

(Core :: Layout: Text and Fonts, defect)

Firefox 109

People

(Reporter: aprilop, Unassigned)

Attachments

(2 files)

Attached file zwj.html

User Agent: Mozilla/5.0 (Android 11; Mobile; rv:109.0) Gecko/109.0 Firefox/109.0

Steps to reproduce:

Write
<h1> ﻌ </h1>
<h1> ‍ع‍ </h1>
into an HTML file.

Actual results:

The two lines look different.

Expected results:

The two lines should look the same.

Component: Untriaged → Layout
Product: Firefox → Core

Using ZWJ to force the joining forms of Arabic letters may not work at direction-run boundaries, or at the start or end of the text when the surrounding context is left-to-right. If bidi processing assigns the ZWJ to the adjacent LTR run rather than the RTL run, the shaping of the Arabic text won't "see" it.
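
For reference, the bidi classes of the characters involved can be inspected with Python's standard `unicodedata` module. This is only an illustrative sketch: the labels in `CHARS` are made up here, and it assumes the letter in the test case is U+0639 (ARABIC LETTER AIN). ZWJ is Boundary Neutral, so it has no strong direction of its own, while the Arabic letter and RLM are strongly right-to-left.

```python
import unicodedata

# Characters from the test case and the suggested workaround.
# Assumption: the Arabic letter in the attachment is U+0639 (AIN).
CHARS = {
    "ZWJ": "\u200d",
    "AIN": "\u0639",
    "AIN_MEDIAL_FORM": "\ufecc",
    "RLM": "\u200f",
    "LATIN_A": "a",
}

def bidi_classes(chars):
    """Map each label to the Unicode Bidi_Class of its character."""
    return {label: unicodedata.bidirectional(ch) for label, ch in chars.items()}

for label, bidi_class in bidi_classes(CHARS).items():
    print(f"{label:16} {bidi_class}")
```

Which run a neutral character ends up in depends on its neighbours and the paragraph direction, which is consistent with the behaviour described above.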

To improve the chance of this working as intended, you can include a Right-to-Left Mark character (U+200F) before the initial ZWJ and after the final one, so that they unambiguously remain within the same RTL run of text. This will enable the joiners to apply to the Arabic letter as expected.
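
A minimal sketch of that workaround, assuming the letter is U+0639 as in the attached test case. The helper names `wrap_with_joiners` and `to_html_heading` are hypothetical and exist only for this illustration; the invisible characters are written as numeric character references so they stay visible in the markup.

```python
RLM = "\u200f"  # RIGHT-TO-LEFT MARK
ZWJ = "\u200d"  # ZERO WIDTH JOINER

def wrap_with_joiners(letter, guard_with_rlm=True):
    """Surround a letter with ZWJ, optionally guarded by RLM on both sides."""
    core = f"{ZWJ}{letter}{ZWJ}"
    return f"{RLM}{core}{RLM}" if guard_with_rlm else core

def to_html_heading(text):
    """Emit an <h1>, writing RLM and ZWJ as numeric character references."""
    escaped = "".join(
        f"&#x{ord(ch):04X};" if ch in (RLM, ZWJ) else ch for ch in text
    )
    return f"<h1>{escaped}</h1>"

print(to_html_heading(wrap_with_joiners("\u0639")))
```

The generated heading has no whitespace padding inside the element, unlike the original test case.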

(I do think this is a valid issue, and ideally we would find a way to ensure that a joiner at a direction-run boundary can affect shaping appropriately on either side, despite the runs being shaped separately.)

Severity: -- → S3
Status: UNCONFIRMED → NEW
Component: Layout → Layout: Text and Fonts
Ever confirmed: true
Summary: Zero-width joiner is ignored → Zero-width joiner does not take effect across direction-run boundaries [was: is ignored]
Attached file zwj2.html

Adding RLM or dir=rtl does not completely solve the problem in Firefox;
see the new attachment zwj2.html.

Yes, I noticed this. The reason is that the presence of the directional control or joining control character is preventing the preceding space (at the beginning of the element's text) from being collapsed; when evaluating whether a space is "trimmable", we check if there are any following characters that would cluster with it, and if so, the space is retained. I think the main reason for this is to support the case where the space is used as the base character for a combining mark; in that case we need to keep the space in order for the mark to have somewhere to render.

It may be that we can exclude the directional controls and joiners from this, so that the leading space here does get stripped as expected. I'll need to run a test to see if that has any unexpected side-effects.
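
As a rough toy model of the check described above (this is not Gecko's actual code; the function names, the use of general categories as a stand-in for "would cluster with it", and the exact set of excluded characters are all assumptions made for illustration):

```python
import unicodedata

# Assumption: the joiners and directional marks that might be excluded.
JOINERS_AND_DIRECTIONAL_MARKS = {"\u200c", "\u200d", "\u200e", "\u200f"}

def extends_cluster(ch):
    """Toy stand-in for 'would cluster with the preceding character'."""
    return unicodedata.category(ch) in {"Mn", "Me", "Cf"}

def leading_space_is_trimmable(text, exclude_controls):
    """Decide whether a leading space may be collapsed away."""
    if not text.startswith(" "):
        return False
    following = text[1:2]
    if not following:
        return True
    if exclude_controls and following in JOINERS_AND_DIRECTIONAL_MARKS:
        return True
    # Keep the space if the next character needs it as a base.
    return not extends_cluster(following)

# Space followed by ZWJ: retained today, trimmed with the proposed exclusion.
print(leading_space_is_trimmable(" \u200d\u0639", exclude_controls=False))
print(leading_space_is_trimmable(" \u200d\u0639", exclude_controls=True))
```

In this model a space followed by a combining mark is still retained in both modes, which is the case the existing check is meant to protect.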

See also the thread "Zero-Width Joiner U+200D" at
https://corp.unicode.org/pipermail/unicode/2023-February/thread.html

See Also: → 1819025