Zero-width joiner does not take effect across direction-run boundaries [was: is ignored]
Categories
(Core :: Layout: Text and Fonts, defect)
People
(Reporter: aprilop, Unassigned)
Attachments
(2 files)
User Agent: Mozilla/5.0 (Android 11; Mobile; rv:109.0) Gecko/109.0 Firefox/109.0
Steps to reproduce:
Write
<h1> ﻌ </h1>
<h1> ع </h1>
into an HTML file.
Actual results:
The two lines look different.
Expected results:
The two lines should look the same.
Updated•2 years ago
Comment 1•2 years ago
Using ZWJ to force joining forms of Arabic letters may not work at direction-run boundaries, or at the start/end of the text if the surrounding context is left-to-right; if the bidi processing includes the ZWJ in the adjacent LTR run rather than the RTL run, then the shaping of the Arabic text won't "see" it.
To improve the chance of this working as intended, you can include a Right-to-Left Mark character (U+200F) before the initial ZWJ and after the final one, so that they unambiguously remain within the same RTL run of text. This will enable the joiners to apply to the Arabic letter as expected.
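The workaround described above could look roughly like the following sketch. It assumes the Arabic letter is ain (U+0639) with a joiner on each side, and it writes the invisible characters as numeric character references so they remain visible in the source. The markup is illustrative and is not taken from the bug's attachments.

```html
<!-- Sketch: an RLM (U+200F) on each side of the ZWJ (U+200D) pair, so
     both joiners should stay in the same right-to-left run as the letter. -->
<h1>&#x200F;&#x200D;&#x0639;&#x200D;&#x200F;</h1>
```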
(I do think this is a valid issue, and ideally we would find a way to ensure that a joiner at a direction-run boundary can affect shaping appropriately on either side, despite the runs being shaped separately.)
Comment 2 (Reporter)•2 years ago
Adding RLM or dir=rtl does not completely solve the problem in Firefox; see the new attachment zwj2.html.
Comment 3•2 years ago
Yes, I noticed this. The reason is that the presence of the directional control or joining control character is preventing the preceding space (at the beginning of the element's text) from being collapsed; when evaluating whether a space is "trimmable", we check if there are any following characters that would cluster with it, and if so, the space is retained. I think the main reason for this is to support the case where the space is used as the base character for a combining mark; in that case we need to keep the space in order for the mark to have somewhere to render.
It may be that we can exclude the directional controls and joiners from this, so that the leading space here does get stripped as expected. I'll need to run a test to see if that has any unexpected side-effects.
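The situations described in this comment can be illustrated with a small test page. This is a hypothetical sketch, not one of the bug's attachments; it assumes U+0301 as the combining mark and ain (U+0639) as the Arabic letter.

```html
<!-- Leading space used as the base for a combining mark (U+0301):
     the space has to be kept so the mark has somewhere to render. -->
<h1> &#x0301;</h1>

<!-- Leading space followed by RLM + ZWJ: per the comment above, the
     control characters currently keep this space from being collapsed. -->
<h1> &#x200F;&#x200D;&#x0639;&#x200D;&#x200F; </h1>

<!-- Same text with no whitespace inside the element, for comparison. -->
<h1>&#x200F;&#x200D;&#x0639;&#x200D;&#x200F;</h1>
```

Comparing the second and third headings side by side is one way to check whether the uncollapsed leading space accounts for the remaining difference.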
Comment 4 (Reporter)•2 years ago
See also the thread "Zero-Width Joiner U+200D" at
https://corp.unicode.org/pipermail/unicode/2023-February/thread.html