Open Bug 1900708 Opened 9 months ago Updated 9 months ago

Flag emoji such as 🏳️‍🌈 🏳️‍⚧️ 🏴‍☠️ containing zero-width joiners are broken up in subject lines

Categories

(Thunderbird :: Message Reader UI, defect)

Thunderbird 115
defect

Tracking

(Not tracked)

People

(Reporter: duxovni, Unassigned)

References

(Regression)

Details

(Keywords: regression)

Attachments

(3 files)

Steps to reproduce:

Send myself an email with 🏳️‍🌈 (rainbow flag emoji), 🏳️‍⚧️ (trans flag emoji), and/or 🏴‍☠️ (pirate flag emoji) in the subject line.

Actual results:

The email subject is displayed as 🏳 🌈 🏳 ⚧ 🏴 ☠ in the message list, and in the subject line when opening the message.

Expected results:

The flag emoji should be displayed normally in their combined forms as 🏳️‍🌈 🏳️‍⚧️ 🏴‍☠️.

On which exact operating system and version?

Flags: needinfo?(duxovni)

Also, please attach a sample message as .eml

Component: Untriaged → Message Reader UI

NixOS, 115.11.0

Flags: needinfo?(duxovni)

(NixOS unstable 24.11, Thunderbird 115.11.0)

I see the problem here too.

The bug also shows up in the composer.
STR:

  1. start a new e-mail.
  2. copy and paste e.g. the rainbow flag 🏳️‍🌈 into the subject and the body.
  3. save to draft.

In the body of the draft message, the flag is correctly coded with the following 4 code points as:
<u+1F3F3><u+FE0F><u+200D><u+1F308>

www.unicode.org/emoji/charts/full-emoji-list.html#1f3f3_fe0f_200d_1f308
(Please be patient. The page needs some time to load.)

However, the subject only contains 2 code points:
<u+1F3F3> <u+1F308>
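
The difference between the two sequences can be reproduced outside Thunderbird. The following is a minimal sketch, assuming that the sanitisation simply drops ZERO WIDTH JOINER and VARIATION SELECTOR-16; `strip_joiners` is a hypothetical stand-in for illustration, not the actual Thunderbird code.

```python
import unicodedata

# Rainbow flag as stored in the draft body: four code points.
RAINBOW_FLAG = "\U0001F3F3\uFE0F\u200D\U0001F308"


def code_points(text):
    """Return the code points of text in the <u+XXXX> notation used above."""
    return [f"<u+{ord(ch):04X}>" for ch in text]


def names(text):
    """Return the Unicode character name of each code point."""
    return [unicodedata.name(ch, "<unnamed>") for ch in text]


def strip_joiners(text):
    """Hypothetical stand-in: drop ZWJ (U+200D) and VS16 (U+FE0F)."""
    return "".join(ch for ch in text if ch not in "\u200D\uFE0F")


if __name__ == "__main__":
    print("body:   ", "".join(code_points(RAINBOW_FLAG)))
    print("subject:", " ".join(code_points(strip_joiners(RAINBOW_FLAG))))
    for name in names(RAINBOW_FLAG):
        print(" ", name)
```

Without the joiner and the variation selector, a renderer has nothing that tells it to combine the white flag and the rainbow into one glyph, so the two emoji are drawn separately.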

Status: UNCONFIRMED → NEW
Ever confirmed: true
Keywords: regression
Regressed by: 1506587

Example with rainbow flag and several ZERO WIDTH JOINERs as well as several VARIATION SELECTORs in the From header

Attachment #9405777 - Attachment mime type: message/rfc822 → text/plain

Neither the ZERO WIDTH JOINER nor the VARIATION SELECTOR 16 is a space in the true sense of the word. They do not create empty space and are therefore, in my opinion, not relevant for Bug 1506587.
I have therefore removed them from cleanToken().
In the picture you can see the correct rainbow flag without an empty space appearing after the “ZWF:” or the “VS16:”.

Or is this more of a wontfix?

Thanks for finding that!

Looking at those bug reports, it sounds like the goal of removing those characters was to prevent a malicious sender breaking up a series of consecutive spaces like <space><zwj><space><variation selector 16><space> to avoid detection. But if that's the goal, then the regex should only remove those characters when they're preceded by actual space characters. Also, if this is about spoofing senders, then this extra-strict sanitisation should only be applied to the From header, not other headers like Subject.

It seems like the correct sanitisation behavior would be specifically to replace "a space character followed by a run of only spaces and zero-width codepoints" with a single space.
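
A rough sketch of that rule, written in Python rather than in Thunderbird's own code, might look like the following. The set of zero-width code points and the name `collapse_spaces` are assumptions made for illustration; the real list in cleanToken() may differ.

```python
import re

# Assumption: an illustrative set of zero-width code points
# (ZWSP, ZWNJ, ZWJ, VS16), not the list Thunderbird actually uses.
# Matches one real space followed by a run of spaces and zero-width
# code points, so joiners inside an emoji sequence are never touched.
_SPACE_RUN = re.compile(r" [ \u200B\u200C\u200D\uFE0F]+")


def collapse_spaces(text):
    """Replace each space-led run of spaces/zero-width characters with one space."""
    return _SPACE_RUN.sub(" ", text)


if __name__ == "__main__":
    spoof = "a \u200D \uFE0F b"  # <space><zwj><space><vs16><space>
    flag = "\U0001F3F3\uFE0F\u200D\U0001F308"
    print(repr(collapse_spaces(spoof)))
    print(collapse_spaces(flag) == flag)
```

Because the match must start with an actual space, the padding trick is still collapsed, while a flag sequence that contains no spaces passes through unchanged.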
