Closed Bug 1604611 Opened 4 years ago Closed 4 years ago

Errant question marks in place of multiple spaces

Categories

(MailNews Core :: Composition, defect)

defect
Not set
normal

Tracking

(Not tracked)

RESOLVED DUPLICATE of bug 1435903

People

(Reporter: raysatiro, Unassigned)

Details

Attachments

(2 files)

User Agent: Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:73.0) Gecko/20100101 Firefox/73.0

Steps to reproduce:

I sent an e-mail in Thunderbird 68.3.0 that had this text:

  {
    curl_version_info_data *info = curl_version_info(CURLVERSION_NOW);
    printf("HTTP2 is %s\n", ((info->features & CURL_VERSION_HTTP2) ? "supported" : "NOT supported"));
    printf("%s\n", curl_version());
  }

Actual results:

Thunderbird instead sent the e-mail with this text:

 ?? {
 ?????? curl_version_info_data *info = curl_version_info(CURLVERSION_NOW);
 ?????? printf("HTTP2 is %s\n", ((info->features & CURL_VERSION_HTTP2) ? 
"supported" : "NOT supported"));
 ?????? printf("%s\n", curl_version());
 ?? }

Expected results:

The space characters should not have been converted to question marks. This has happened to me once or twice, at random. I couldn't reproduce it by sending the exact same text a second time. I hesitate to say this is bug 1435903 since it's arbitrary and I'm not sending foreign characters, just ASCII.

Config Editor > mail.strictly_mime: False.
Composition > Configure text format behavior > Send messages as plain text if possible: On.
Composition > Configure text format behavior > When sending as HTML if recipient cannot receive HTML: Send as both "plain text and HTML" is selected.

I am using Yahoo SMTP servers.

Looks like I had logging enabled so I've attached an excerpt of the log that includes the SMTP transfer and the IMAP save to outbox. My mozilla log level is set to IMAP:5,SMTP:5,POP3:5,timestamp,sync. The SMTP body is not present in the log (why?) but the IMAP body is. The log shows spaces were appended or replaced by UTF-8 encoded non-breaking spaces.

Example:

  {
    curl_version_info_data *info = curl_version_info(CURLVERSION_NOW);

There are 2 spaces before the brace, in other words the bytes 2020 in hex, but they were saved as 20C2A020. There are 4 spaces before curl, in other words 20202020, but they were saved as 20C2A0C2A0C2A020.
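The hex strings quoted above can be decoded to confirm what they contain. This is a minimal Python sketch that uses only the two byte sequences from this comment; the helper name `describe` is made up for illustration.

```python
import unicodedata


def describe(hex_bytes):
    """Decode a hex string as UTF-8 and return the Unicode name of each character."""
    text = bytes.fromhex(hex_bytes).decode("utf-8")
    return [unicodedata.name(ch) for ch in text]


# Byte sequences as reported in the IMAP log excerpt.
two_spaces_saved = describe("20C2A020")
four_spaces_saved = describe("20C2A0C2A0C2A020")

print(two_spaces_saved)
print(four_spaces_saved)
```

Each C2A0 pair decodes to a single NO-BREAK SPACE (U+00A0), so both sequences are valid UTF-8 and contain no characters other than spaces and non-breaking spaces.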

Also, the e-mail was sent to a mailing list and is archived at https://curl.haxx.se/mail/lib-2019-12/0051.html

Component: Untriaged → Composition
Product: Thunderbird → MailNews Core

Thanks. Why does Thunderbird encode multiple spaces as non-breaking spaces for plaintext? Shouldn't that be for HTML only?

That's a jolly good question. It shouldn't and it doesn't as far as I can tell from a very simple test sending something to my local outbox. Are you sure the message didn't get sent as HTML? Or plaintext+HTML as per the settings described at the end of comment #0?

A very simple check you can do is to view the message source and then switch between Unicode and Western encodings. The UTF-8 bytes C2A0 will show as "Â" followed by an NBSP when displayed as Western (windows-1252), so you can check whether they are there without a trace or hex editor.
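The encoding-switch check can also be reproduced outside the mail client. This is a small Python sketch, assuming Python's `cp1252` codec as a stand-in for the "Western (windows-1252)" view; it is an illustration, not what Thunderbird runs internally.

```python
# UTF-8 encoding of a single non-breaking space (U+00A0).
nbsp_utf8 = "\u00a0".encode("utf-8")

# Interpret those same two bytes as windows-1252 instead of UTF-8.
as_western = nbsp_utf8.decode("cp1252")

print(nbsp_utf8.hex())                     # the raw bytes in hex
print([hex(ord(ch)) for ch in as_western]) # code points seen in the Western view
```

The byte C2 maps to U+00C2 and the byte A0 maps to U+00A0 in windows-1252, which is why a stray visible character in front of an apparent space gives away a UTF-8 encoded NBSP.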

I think you need to research this a bit yourself, maybe using a different outgoing server. All we know is that Yahoo messes up big time as per the article I quoted. And turning valid UTF-8 into ?? isn't so cool either.

In the repro that was attached, yes, the e-mail was a reply sent to a mailing list and was sent as plaintext+HTML. To eliminate possible contamination I tried just now sending an e-mail to myself with the contents " test. foobar." typed out so only plaintext would be used. In the debug log I can see that the IMAP copy of the sent message, which Thunderbird saves to the outbox, has the double space as "20C2A020". The message was sent only as plaintext, and the plaintext headers are the same as those of the e-mail in the original repro:

Content-Transfer-Encoding: 8bit
...
Content-Type: text/plain; charset=utf-8; format=flowed

Since Thunderbird does not record SMTP details such as the body (shouldn't it do that? my debug level is smtp:5), I cannot say for sure what is sent over SMTP. However, the contents received in my inbox are also "20C2A020", and when I open the message it looks normal.

I then did a similar test but appended some HTML so it would be sent as plaintext+HTML. The same thing happened and it looks normal.

Based on what I've experienced we can assume the bad conversion resulting in ?? is arbitrary. Without being able to record what raw SMTP data is sent I cannot be certain the ???? is a Yahoo bug, but I think it's pretty likely.
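The number of question marks in comment #0 is consistent with each non-ASCII byte being replaced individually somewhere after Thunderbird hands the message off. The Python sketch below is only a hypothetical model of that replacement; the function name `replace_non_ascii_bytes` is invented here, and nothing in this bug confirms which server or code path actually does it.

```python
def replace_non_ascii_bytes(data):
    """Hypothetical model: turn every byte >= 0x80 into an ASCII '?'."""
    return bytes(b if b < 0x80 else ord("?") for b in data)


# Bytes as saved by Thunderbird according to the log, followed by the text after them.
saved_brace = bytes.fromhex("20C2A020") + b"{"
saved_curl = bytes.fromhex("20C2A0C2A0C2A020") + b"curl_version_info_data"

mangled_brace = replace_non_ascii_bytes(saved_brace)
mangled_curl = replace_non_ascii_bytes(saved_curl)

print(mangled_brace)
print(mangled_curl)
```

Under this assumption one NBSP (two UTF-8 bytes) becomes two question marks, which matches the " ?? {" and " ?????? curl_version_info_data" lines in the message as it was delivered.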

Also it appears at least from my results that Thunderbird may be changing multiple spaces to nbsp in strictly plaintext UTF-8 messages. Again without being able to see the SMTP raw data I can't say for sure.

Hm, Bugzilla kind of messed that up: it's "<SPACE><SPACE>test. foobar.", not "<SPACE>test. foobar." which is how it's shown above. Do you know if it is a bug in Bugzilla that the HTML was able to collapse the space, or is that expected?

I see the dilemma of not being able to inspect the content that's going over the SMTP wire (unless you use Wireshark or some such). Why don't you send the message to the local outbox and inspect it there: "File > Send Later" or Ctrl+Shift+Enter. I'm 99.99995% sure that whatever is in the outbox will be shipped out 1:1.

As described in my Wiki article, Yahoo's behaviour can change on a daily basis, so if you get ?? one day, you might not get them the next. Typically we've seen ?? for messages that were delivered further by Yahoo's SMTP server, so typically recipients see the ??, not the Yahoo users.

Another thing: You said that you sent two spaces but received three? 20 C2A0 20? Something is really fishy here.

Thanks, I saved the e-mail to the outbox with Send Later, and that shows that the spaces are converted by Thunderbird to NBSP. Regarding the number of NBSPs, I always see space count - 1, so for example:

<SPACE><SPACE>foo ===> <SPACE><NBSP><SPACE>foo
<SPACE><SPACE><SPACE>foo ===> <SPACE><NBSP><NBSP><SPACE>foo
<SPACE><SPACE><SPACE><SPACE>foo ===> <SPACE><NBSP><NBSP><NBSP><SPACE>foo
and so on.
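The pattern listed above can be written down as a small Python model. This only reproduces the reported output (a run of n spaces, n >= 2, becomes a space, n - 1 NBSPs, and a space); it is not taken from Thunderbird's source, and the function name `model_observed_conversion` is hypothetical.

```python
import re

NBSP = "\u00a0"


def model_observed_conversion(text):
    """Model of the reported behaviour: n >= 2 spaces -> space + (n-1) NBSP + space."""
    def expand(match):
        n = len(match.group(0))
        return " " + NBSP * (n - 1) + " "

    return re.sub(r" {2,}", expand, text)


two = model_observed_conversion("  {")
four = model_observed_conversion("    curl")

# Show the UTF-8 bytes of the leading whitespace, as in the log excerpt.
print(two[:-1].encode("utf-8").hex().upper())
print(four[:-4].encode("utf-8").hex().upper())
```

For the two examples from comment #0 this gives 20C2A020 and 20C2A0C2A0C2A020, the same byte sequences seen in the IMAP log, while single spaces are left untouched.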
