Closed Bug 1584718 Opened 2 months ago Closed 2 months ago

Codepoint U+FE0F (Variation Selector-16) affecting the width of adjacent space characters

Categories

(Core :: Layout: Text and Fonts, defect, P3)

69 Branch
defect

Tracking

()

RESOLVED FIXED
mozilla71
Tracking Status
firefox71 --- fixed

People

(Reporter: polarathene-signup, Assigned: jfkthame)

References

()

Details

Attachments

(4 files)

User Agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/77.0.3865.90 Safari/537.36

Steps to reproduce:

  1. Install the Noto Color Emoji font on a default Linux install (Manjaro KDE in my case).
  2. Visit getemoji.com
  3. Change the font-family for a given section to "Noto Color Emoji" (The "People and Fantasy" section works fine).
  4. Notice that large white-space gaps appear, each the width of an emoji character (e.g. for 👳‍♀️ == 👳 U+1F473 MAN WITH TURBAN + ♀ U+2640 FEMALE SIGN)

Actual results:

Some emoji are rendered with white-space gaps beside them that are as wide as the emoji itself.

With the default "Mozilla Twemoji" emoji font applied in the browser's "about:config", this issue appears more subtle.

I've noticed another rendering discrepancy. If you set the "about:config" emoji font to "Noto Color Emoji", you'll get a similar subtle white-space to what "Mozilla Twemoji" was presenting, so that might be unrelated to the Twemoji font bundled with Firefox. The issue is still present if you set a font-family of Noto Color Emoji in the CSS, despite it also being set as the emoji font in "about:config".

Expected results:

The affected emoji should not render the spaces on the right side (or on both sides in some cases) at the width of the emoji.

Or, as Chrome appears to do, all spaces should be rendered at this width.

Summary: Text layout error when the rendered font-family glyphs differ default font-family → Codepoint FE0H (Variation Selector-16) affecting the width of adjacent space characters

TL;DR: The U+FE0F codepoint appears to cause "space" characters on either side of the emoji to render at the width of the emoji. A ZWJ (U+200D) appears to prevent this.

The actual issue might be related to the ZWJ behaviour, in that Chrome renders these emoji-width space characters for everything (when using Noto Color Emoji as the emoji font).

This image shows what the double gap on each end looks like:

And this one with only single gap to the right of emoji:

More screenshots from another emoji site as well as Chrome browser available at this Github issue.


The problematic emoji are multiple-codepoint glyphs for which the system renders one or more of the glyphs with a different font (e.g. Noto Color Emoji or Mozilla Twemoji, with DejaVu Sans).

A common one is with the gender signs. The emoji with different genders that are unaffected are combining with the man/woman emoji rather than the gender sign glyph/emoji, compare these two emoji:

👳‍♀️Woman Wearing Turban(Female Sign): https://emojipedia.org/woman-wearing-turban/
👩‍🍳 Woman Cook(Woman emoji): https://emojipedia.org/female-cook/

The culprit appears to be the U+FE0F, Variation Selector-16:
https://emojipedia.org/variation-selector-16/

That is often combined with a glyph that renders with a different font (higher priority in fontconfig), which presents the big white-space gaps. A few of these are:

https://emojipedia.org/female-pilot/
https://emojipedia.org/female-judge/
https://emojipedia.org/woman-shrugging/
https://emojipedia.org/woman-shrugging-type-1-2/
https://emojipedia.org/man-shrugging/

Those that do not show the gaps lack the FE0F codepoint:

https://emojipedia.org/female-mechanic/
https://emojipedia.org/female-scientist/
https://emojipedia.org/male-singer/
https://emojipedia.org/thumbs-up-sign/


Disabling the font-family style CSS that applied "Segoe UI Emoji"(which would trigger Firefox to use the internal emoji font in about:config), I can see the DejaVu glyphs for some hand emoji(✌️ ☝️ ✍️), Firefox renders the rest as the emoji font defined in "about:config".

Unicode 1.1 (1993) / Emoji 1.0 (2015) | ✌️ ☝️ ✍️ (2 codepoints, rendered as DejaVu Sans in Firefox, Noto Sans Symbols2 in Chrome)
Unicode 6.0 (2010) / Emoji 1.0 (2015) | 👇 ✋ (1 codepoint, rendered as Noto Sans Symbols2 in Chrome)
Unicode 7.0 (2014) / Emoji 1.0 (2015) | 🖐 👁 🗣 (2 codepoints, rendered as Noto Sans Symbols2 in Chrome), 🖕 (1 codepoint, rendered as Noto Color Emoji in Chrome)
Unicode 9.0 (2016) / Emoji 3.0 (2016) | 🤚🤞(1 codepoint, rendered as Noto Color Emoji in Chrome)

Having looked into the codepoints, the emoji/glyphs with 2 codepoints all append a U+FE0F. Another pattern also emerges: unlike the other emoji, which append a gap/white-space to the emoji, ✌️ ☝️ ✍️ 🖐️ 👁️ 🗣️ (all with 2 codepoints) seem to prepend a gap as well. Perhaps from the lack of a ZWJ (U+200D)? https://emojipedia.org/zero-width-joiner/

These two emoji also do not render a blank gap, but include an FE0F (heart) glyph as the 2nd part of the definition, which is surrounded by a ZWJ codepoint on each side.

https://emojipedia.org/kiss-woman-woman/
https://emojipedia.org/kiss-man-man/

Perhaps if an emoji has the FE0F codepoint, a ZWJ is required to remove the added gap on either end? (Confirmed: what actually happens is that the adjacent "space" (%20) character is given the width of the emoji.)
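
For anyone wanting to check which sequences carry U+FE0F or U+200D, here is a minimal Python sketch that lists the code points of two of the emoji compared above. The sequences are written out as escapes based on the Emojipedia pages linked earlier; the helper name `describe` is just an illustrative choice, not anything from Firefox.

```python
import unicodedata

VS16 = "\uFE0F"  # Variation Selector-16
ZWJ = "\u200D"   # Zero Width Joiner


def describe(sequence: str) -> list[str]:
    """Return one 'U+XXXX NAME' string per code point in the sequence."""
    return [
        f"U+{ord(ch):04X} {unicodedata.name(ch, '<unnamed>')}"
        for ch in sequence
    ]


# Written as escapes so the invisible code points are unambiguous.
woman_wearing_turban = "\U0001F473\u200D\u2640\uFE0F"  # shows the gap in the report
woman_cook = "\U0001F469\u200D\U0001F373"              # does not show the gap

for label, seq in [
    ("woman wearing turban", woman_wearing_turban),
    ("woman cook", woman_cook),
]:
    print(f"{label}: contains ZWJ={ZWJ in seq}, ends with VS16={seq.endswith(VS16)}")
    for line in describe(seq):
        print("   ", line)
```

Both sequences contain a ZWJ, but only the first one ends with U+FE0F, which is the code point that sits next to the following line break on getemoji.com.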

Attachment #9097103 - Attachment description: Screenshot, showing the white-space with single gaps per emoji → Shows gaps on each side of the emoji.
Attachment #9097104 - Attachment description: Shows single gaps(right of emoji) → Shows single gaps(left or right side depending which side the FE0H or ZWJ is)

Compared to a similar emoji site, eosrei.

It's due to how the emoji were formatted in the HTML element. getemoji.com puts each emoji on a new line, whereas eosrei has no new lines. eosrei actually has the emoji separated with spaces, so the Firefox bug is inserting spaces for some reason on getemoji. (If you manually add the spaces into the HTML, the default white-space: normal property in CSS collapses them: it keeps a single space if there are two characters between the emoji on the same line of the HTML text, not of the rendered element.)
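
A reduced test case for this difference could be generated along the following lines. This is only a sketch: the function name `build_testcase` and the choice of three sample emoji are assumptions for illustration, and the font name is the one used in this report.

```python
# Sample sequences, written as escapes: two ZWJ sequences and one
# single emoji followed by VS16 (U+FE0F).
EMOJI = [
    "\U0001F473\u200D\u2640\uFE0F",  # woman wearing turban (ends with VS16)
    "\U0001F469\u200D\U0001F373",    # woman cook (no VS16)
    "\u270C\uFE0F",                  # victory hand + VS16
]


def build_testcase(emoji):
    """Build an HTML page with the same emoji separated three different ways."""
    newline_separated = "\n".join(emoji)  # like getemoji.com
    space_separated = " ".join(emoji)     # like eosrei
    unseparated = "".join(emoji)          # control: no separator at all
    return (
        "<!DOCTYPE html>\n"
        '<meta charset="utf-8">\n'
        '<style>p { font-family: "Noto Color Emoji"; }</style>\n'
        f"<p>{newline_separated}</p>\n"
        f"<p>{space_separated}</p>\n"
        f"<p>{unseparated}</p>\n"
    )


html = build_testcase(EMOJI)
print(html)
```

Saving the output to a file and loading it in the browser should make it easy to compare how the line breaks in the first paragraph are rendered against the explicit spaces in the second.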

Hi @Brennan Kinney, I tested the issue in Ubuntu 18.04 using Noto Color Emoji with the fonts changed => cannot reproduce. On my end, no gaps or white-spaces are displayed. Do you have any other pref set up? Is there any other action you took to reproduce the issue?
Meanwhile, I will set a component; if it isn't the proper one, please feel free to change it.
Regards,
Liviu

Component: Untriaged → Layout: Text and Fonts
Flags: needinfo?(polarathene-signup)
Product: Firefox → Core

What's happening here is related to the Segment Break Transformation Rules described in the CSS Text spec, which require line-breaks in the source to be either transformed to <space> or removed, depending on the context.

(In particular, note that those rules explicitly mention that zero-width space (U+200B) before or after the break causes it to be removed.)

I think the U+FE0F effect is arising out of these rules. Note that most emoji codepoints have EAW=wide, which means that in a sequence of emoji <newline> emoji the second bullet point of the Segment Break Transformation Rules will apply, and the break will be removed (rather than converted to a space). But if the variation selector is inserted, giving emoji VS16 <newline> emoji, the variation selector has EAW=ambiguous, and now the second bullet point does not apply.
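
The East Asian Width values mentioned here can be checked with Python's standard `unicodedata` module. This is just a quick lookup, assuming a Python build whose Unicode database is version 9.0 or later (where these emoji are classified as Wide); the sample code points are taken from the sequences discussed in this bug.

```python
import unicodedata

samples = {
    "U+1F473 (turban emoji)": "\U0001F473",
    "U+1F469 (woman emoji)": "\U0001F469",
    "U+FE0F (VS16)": "\uFE0F",
}

# 'W' is Wide, 'A' is Ambiguous, 'N' is Neutral.
widths = {label: unicodedata.east_asian_width(ch) for label, ch in samples.items()}

for label, eaw in widths.items():
    print(f"{label}: EAW={eaw}")
```

The two emoji report `W` and the variation selector reports `A`, which matches the distinction described above.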

But wait... the spec also says that "For this purpose, Emoji (Unicode property Emoji) with an East Asian Width property of Wide or Neutral are treated as having an East Asian Width property of Ambiguous."

So that means that the second bullet point in the rules should not be applying to the emoji <newline> emoji case. We'll need to check the code, but I suspect that's a bug in Firefox's implementation.
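
To make the difference concrete, here is a rough Python sketch of the rules as discussed in this comment. It is not Firefox's code and not a full implementation of the spec: it assumes the second bullet removes the break when both neighbours have an East Asian Width of Fullwidth, Wide or Halfwidth, it ignores the Hangul exception, and `is_emoji_approx` is a crude stand-in for the Unicode Emoji property (which Python's standard library does not expose).

```python
import unicodedata

ZWSP = "\u200B"  # zero-width space


def is_emoji_approx(ch):
    # ASSUMPTION: rough code point range instead of the real Emoji property.
    return 0x1F300 <= ord(ch) <= 0x1FAFF


def effective_eaw(ch, emoji_as_ambiguous):
    """East Asian Width, optionally with the spec's emoji override applied."""
    eaw = unicodedata.east_asian_width(ch)
    if emoji_as_ambiguous and is_emoji_approx(ch) and eaw in ("W", "N"):
        return "A"
    return eaw


def transform_break(before, after, emoji_as_ambiguous=True):
    """Return what a segment break between two characters turns into."""
    if ZWSP in (before, after):
        return ""  # a zero-width space next to the break removes it
    widths = [effective_eaw(ch, emoji_as_ambiguous) for ch in (before, after)]
    if all(w in ("F", "W", "H") for w in widths):
        return ""  # second bullet point: break is removed
    return " "     # otherwise the break becomes a space


turban, woman, vs16 = "\U0001F473", "\U0001F469", "\uFE0F"

# Without the emoji override (the suspected buggy behaviour):
print(repr(transform_break(turban, woman, emoji_as_ambiguous=False)))  # ''
print(repr(transform_break(vs16, woman, emoji_as_ambiguous=False)))    # ' '

# With the emoji override, as the spec text quoted above requires:
print(repr(transform_break(turban, woman)))  # ' '
print(repr(transform_break(vs16, woman)))    # ' '
```

Under these assumptions, skipping the override reproduces the reported pattern: a space appears only after sequences ending in VS16, while applying the override yields a space between every pair of emoji.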

Status: UNCONFIRMED → NEW
Ever confirmed: true
Priority: -- → P3
Summary: Codepoint FE0H (Variation Selector-16) affecting the width of adjacent space characters → Codepoint U+FE0F (Variation Selector-16) affecting the width of adjacent space characters

(In reply to Liviu Seplecan from comment #4)

Hi @Brennan Kinney, I tested the issue in Ubuntu 18.04 using Noto Color Emoji with the fonts changed => cannot reproduce. On my end, no gaps or white-spaces are displayed. Do you have any other pref set up? Is there any other action you took to reproduce the issue?

Hi @Liviu Seplecan, the fontconfig defaults might differ between Ubuntu and Manjaro? I do not presently have a custom fontconfig.

I have gone into "about:config" and set the emoji font there from "Mozilla Twemoji" to "Noto Color Emoji", although this doesn't make a difference. It will sort of appear as you described with no gaps, but if you look at the glyphs with the spacing differences, it is still there, just more subtle.

I could set up Ubuntu 18.04 on my system to compare if that would help? Applying the font-family Noto Color Emoji in the CSS (replacing Segoe UI Emoji) does have an impact on the rendering/spacing. What appears to happen is that, instead of falling back to the emoji font defined in "about:config", all text content uses that font, so the noted gaps (inserted spaces?) adopt the width of the space character from Noto Color Emoji.

If "Twemoji Mozilla" is assigned in the CSS, it is not as visible. Dev tools indicate that this is because "Twemoji Mozilla" is not substituting the space character like "Noto Color Emoji" does (as CSS font-family, not as emoji font fallback in "about:config"). On my system, the default serif font Bitstream Vera Serif is being reported for the space characters.

Would a screenshot comparing the initial load to the "font-family: Noto Color Emoji" CSS change be helpful?

Flags: needinfo?(polarathene-signup)

(In reply to Jonathan Kew (:jfkthame) from comment #5)

I think the U+FE0F effect is arising out of these rules. Note that most emoji codepoints have EAW=wide, which means that in a sequence of emoji <newline> emoji the second bullet point of the Segment Break Transformation Rules will apply, and the break will be removed (rather than converted to a space). But if the variation selector is inserted, giving emoji VS16 <newline> emoji, the variation selector has EAW=ambiguous, and now the second bullet point does not apply.

So that means that the second bullet point in the rules should not be applying to the emoji <newline> emoji case. We'll need to check the code, but I suspect that's a bug in Firefox's implementation.

Chrome appears to separate each emoji/glyph with a space character. So perhaps the notable white-space (U+0020) gaps caused by the cited emoji with VS16 codepoints are actually correct, and all the remaining glyphs should also be padded with space characters?

I'm not sure if that's what you were hinting at with your interpretation of the Segment Break Transformation rules, I think you were?

Pushed by jkew@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/a89e8dcdb123
Make segment break transformation rules treat emoji characters with EAW=Wide as EAW=Ambiguous, per CSS Text spec. r=m_kato
https://hg.mozilla.org/integration/autoland/rev/613fe1a4095e
Add examples with emoji to the segment break transformation reftest. r=m_kato
Status: NEW → RESOLVED
Closed: 2 months ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla71
Assignee: nobody → jfkthame

(In reply to Brennan Kinney from comment #9)

(In reply to Jonathan Kew (:jfkthame) from comment #5)

I think the U+FE0F effect is arising out of these rules. Note that most emoji codepoints have EAW=wide, which means that in a sequence of emoji <newline> emoji the second bullet point of the Segment Break Transformation Rules will apply, and the break will be removed (rather than converted to a space). But if the variation selector is inserted, giving emoji VS16 <newline> emoji, the variation selector has EAW=ambiguous, and now the second bullet point does not apply.

So that means that the second bullet point in the rules should not be applying to the emoji <newline> emoji case. We'll need to check the code, but I suspect that's a bug in Firefox's implementation.

Chrome appears to separate each emoji/glyph with a space character. So perhaps the notable white-space (U+0020) gaps caused by the cited emoji with VS16 codepoints are actually correct, and all the remaining glyphs should also be padded with space characters?

I'm not sure if that's what you were hinting at with your interpretation of the Segment Break Transformation rules, I think you were?

Yes, there should be a space between each of the emoji; Firefox was incorrectly eliminating those spaces in many cases. This should be fixed beginning from tomorrow's Nightly version.

(There'll likely still be some emoji that don't display correctly, depending on font support, but that's a separate matter.)
