Open Bug 534885 Opened 15 years ago Updated 2 years ago

UTF-8 encoding of return address inserts bare LF in the stream, breaking RFC822

Categories

(Thunderbird :: Message Compose Window, defect)

x86_64
Linux
defect

Tracking

(Not tracked)

UNCONFIRMED

People

(Reporter: hggdh2, Unassigned)

References

(Blocks 1 open bug)

Details

Attachments

(1 file)

User-Agent:       Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.1.7pre) Gecko/20091215 Ubuntu/9.10 (karmic) Shiretoko/3.5.7pre
Build Identifier: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.1.7pre) Gecko/20091214 Shredder/3.0.1pre

Original Ubuntu bug: https://bugs.launchpad.net/ubuntu/+source/thunderbird/+bug/233990

Tested on TB 3.0.1pre1 and TB 2.0.0.23.

When an MDN (return receipt) is sent, the UTF-8 encoded display name of the 'To:' address is followed by a bare LF.

This breaks RFC822. Although not all SMTP servers are picky, at least QMail refuses to send the message.

This does not happen if the display name of the email address can be represented in ASCII.

For example, the following is an excerpt of a sniffer trace run on an SMTP session, showing the error:

00b0 6d 3e 0d 0a 53 75 62 6a 65 63 74 3a 20 52 65 74 m>..Subject: Ret
00c0 75 72 6e 20 52 65 63 65 69 70 74 20 28 64 69 73 urn Receipt (dis
00d0 70 6c 61 79 65 64 29 20 2d 20 74 65 73 74 20 4d played) - test M
00e0 44 4e 20 35 20 54 42 33 0d 0a 54 6f 3a 20 3d 3f DN 5 TB3..To: =?
00f0 55 54 46 2d 38 3f 42 3f 55 32 46 75 64 4d 4f 70 UTF-8?B?U2FudMOp
0100 49 47 52 6c 4c 55 46 32 61 57 78 73 5a 58 6f 3d IGRlLUF2aWxsZXo=
0110 3f 3d *0a* 20 3c 68 67 67 64 68 32 40 75 62 75 6e ?=. <hggdh2@ubun
0120 74 75 2e 63 6f 6d 3e 0d 0a 52 65 66 65 72 65 6e tu.com>..Referen

Note offset 0x112: a bare LF (0a) sits there, between the UTF-8 encoded word and the email address.
The encoded word is the UTF-8 for 'Santé...'. The encoding is still inserting a bare LF in the stream.
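
For anyone who wants to check a capture without reading hex by eye, here is a minimal Python sketch. It assumes the bytes of the To: header have been copied out of the trace above; the helper name `bare_lf_offsets` is only illustrative and has nothing to do with Thunderbird's code.

```python
import re
from email.header import decode_header

# To: header bytes as they appear in the trace above
# (bare LF after the encoded word, CRLF at the end of the header).
raw = (b"To: =?UTF-8?B?U2FudMOpIGRlLUF2aWxsZXo=?=\n"
       b" <hggdh2@ubuntu.com>\r\n")


def bare_lf_offsets(data: bytes) -> list[int]:
    """Return the offset of every LF that is not preceded by CR."""
    return [m.start() for m in re.finditer(rb"(?<!\r)\n", data)]


# Offsets are relative to the start of the excerpt, not to the trace.
offsets = bare_lf_offsets(raw)

# Decode the RFC 2047 encoded word to see the display name it carries.
decoded, charset = decode_header("=?UTF-8?B?U2FudMOpIGRlLUF2aWxsZXo=?=")[0]
display_name = decoded.decode(charset)

print(offsets, display_name)
```

The same check can be run over a whole captured message to see whether Subject: is affected as well.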

Reproducible: Always

Steps to Reproduce:
1. Set up your display name with accented characters, say "Liberté Égalité Fraternité", or "Ócio", or similar.
2. Send an email to somebody else, requesting a Return Receipt.
3. Look at/sniff/whatever the actual SMTP stream (unfortunately it seems TB does not save the sent MDN, or at least I cannot find it in my folders).
4. If the recipient of the MDN request is running under QMail, perfect, you will be able to see the QMail error.

Actual Results:
UTF-8 encoding of MDN inserts a bare LF in the stream

Expected Results:  
Either a CRLF is inserted, or no bare LF.
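
The first of the two expected outcomes can be sketched in Python as a normalisation pass over the outgoing bytes. This is only an illustration of the expected wire format, not a proposed patch; `normalize_to_crlf` is a hypothetical helper.

```python
import re


def normalize_to_crlf(data: bytes) -> bytes:
    """Turn every LF that is not already preceded by CR into CRLF."""
    return re.sub(rb"(?<!\r)\n", b"\r\n", data)


broken = b"To: =?UTF-8?B?U2FudMOpIGRlLUF2aWxsZXo=?=\n <hggdh2@ubuntu.com>\r\n"
fixed = normalize_to_crlf(broken)
print(fixed)
```

Existing CRLF pairs are left untouched, so running the pass twice gives the same result.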
Actually this affects other encodings as well. The Ubuntu original bug is encoding to ISO-8859*
Summary: UTF-8 encoding of return address inserst bare LF in the stream, breaking RFC822 → UTF-8 encoding of return address inserts bare LF in the stream, breaking RFC822
On MS Win, the inserted bytes were CRLF (0x0D0A) instead of LF.
> Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.1.5) Gecko/20091129 Shredder/3.0.1pre

Original From: in mail with MDN=on. ({CRLF}==0x0D0A)
> From: =?ISO-2022-JP?B?GyRCT0JFRDh3OTAbKEI=?= <muttley@ops.dti.ne.jp>{CRLF}
Tb3 generated following To: in return receipt.
> To: =?ISO-2022-JP?B?GyRCT0JFRDh3OTAbKEI=?={CRLF}
>  <muttley@ops.dti.ne.jp>{CRLF}
In this case, fortunately, no RFC2822 violation occurs.

The inserted byte(s) look like the native "new line" of the OS.
From the original bug reporter:

I attached to the aforementioned bug 233990 launchpad URL a number of self-explanatory eml files which are captures of the MDN message Thunderbird was trying to send when Qmail complained with bare linefeed.  The URL itself contains grep output showing the bare linefeed.

Notice that the bare linefeed bug does not only affect the To: line but also the Subject: line.
It is a rare event in everyday use, although
- I cannot know how many of the e-mails I sent could not be acknowledged because of this
- I use a special configuration with which sending from my Thunderbird hotmail account to my Thunderbird ulg account can repeatedly be done with RR.  Several of the eml files use that feature to test and demonstrate the problem (esp on the subject line).

André.
WADA,

The original description says that there *may* be a bare linefeed in the Return Receipt without saying where (it was seen in From and in Subject) or why. It provides eml files so that someone can find out why by analyzing them in light of the code.  The ultimate reason will be known after that analysis.

If you want to reproduce the problem without analyzing first and if you don't succeed, you may want to find inspiration in these eml files from what really happened for sure.

I'm not sure what you're showing us (a From: and a To:).
Sending? Received?  Return Receipt?  Normal Mail?
Please provide full files showing everything you did, like mine do.

It wouldn't be surprising that the problem did not exist on Windows, because EOL is CRLF on Windows and LF on Linux.  This is a Linux Thunderbird bug.
Attached file mail folder file
Attached mail folder file:
  mail-1 : Sent mail.
           Generated by Sm2 and Subject: is manually crafted.
           - Sm2 puts each rfc2047 encoded word in single header line.
           - The two header lines were manually merged into a single header line.
  mail-2 : Return receipt by Tb 3 on MS Win.

(In reply to comment #4)
> I'm not sure what you're showing us (a From: and a To:).
> Sending? Received?  Return Receipt?  Normal Mail?
> Please provide full files showing everything you did, like mine do.

You can't understand next?
> Original From: in mail with MDN=on. ({CRLF}==0x0D0A)
> (a) > From: =?ISO-2022-JP?B?GyRCT0JFRDh3OTAbKEI=?= <muttley@ops.dti.ne.jp>{CRLF}
> Tb3 generated following To: in return receipt.
> (b) > To: =?ISO-2022-JP?B?GyRCT0JFRDh3OTAbKEI=?={CRLF}
>     >  <muttley@ops.dti.ne.jp>{CRLF}

1. Someone (Tb, SeaMonkey) sends a mail with the From: shown in (a),
   requesting a return receipt.
2. Tb 3 receives the mail and sends a return receipt in response to the MDN request.
3. The To: in the return receipt sent by Tb 3 was (b).
   {CRLF} was inserted at the same position as the {LF} in comment #0.

> It wouldn't be surprising that the problem did not exist on Windows, because
> EOL is CRLF on Windows and LF on Linux.  This is a Linux Thunderbird bug.

The problem does exist in Tb 3 on MS Win too. The {CRLF} was definitely inserted by Tb 3.
On MS Win the inserted bytes were {CRLF}, so fortunately no RFC2822 violation occurs.
The unneeded folding of To: was observed on MS Win too.
Since the From: (by Tb, SeaMonkey) is the following,
> From: =?ISO-2022-JP?B?GyRCT0JFRDh3OTAbKEI=?= <muttley@ops.dti.ne.jp>
the following is sufficient for the To: of the return receipt (it is shorter than the From:).
> To: =?ISO-2022-JP?B?GyRCT0JFRDh3OTAbKEI=?= <muttley@ops.dti.ne.jp>
Similar folding after each RFC2047 encoded word is also observed for the Subject: header.
See the attached mail folder data.

I guess that the problem is "the OS newline is inserted upon header folding by Tb on any OS" rather than "only Tb on Linux wrongly inserts a bare LF".
I also guess that the header folding is meant to avoid an RFC violation (e.g. an overly long header generated by the original mail sender).
I think the attached mail data indicates:
  Tb folds the mail header after each rfc2047 encoded word.
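
To make that guess concrete, here is a small Python model of it. It is purely hypothetical: `fold_to_header` is not Thunderbird code, it only imitates "fold after the encoded word using whatever newline the platform uses" so that the two observed results can be compared.

```python
import re

CRLF = "\r\n"


def fold_to_header(encoded_name: str, addr: str, fold_newline: str) -> bytes:
    """Hypothetical fold: break after the encoded word with fold_newline."""
    return f"To: {encoded_name}{fold_newline} <{addr}>{CRLF}".encode("ascii")


def has_bare_lf(data: bytes) -> bool:
    """True if the data contains an LF that is not preceded by CR."""
    return re.search(rb"(?<!\r)\n", data) is not None


name = "=?ISO-2022-JP?B?GyRCT0JFRDh3OTAbKEI=?="
linux_like = fold_to_header(name, "muttley@ops.dti.ne.jp", "\n")      # OS newline = LF
windows_like = fold_to_header(name, "muttley@ops.dti.ne.jp", "\r\n")  # OS newline = CRLF

print(has_bare_lf(linux_like), has_bare_lf(windows_like))
```

Under this model the same folding step yields a bare LF on Linux and a valid fold on MS Win, which matches what both traces show.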
Oh, an RFC2822 violation existed in the return receipt produced by Tb3 on MS Win too.
  - a space-only line in Subject: (a space+{CRLF} line; fortunately I had added a space)
  - {CRLF}-only lines in Subject: => the mail header terminates there
    original Subject  : word-1 word-2{CRLF}
    after try to fold : word-1{CRLF} ({CRLF} is inserted)
                        word-2{CRLF} ({CRLF} is inserted)
                        {CRLF}
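
The effect of a {CRLF}-only line can be shown with Python's standard mail parser. The message below is a made-up sample (the address `rcpt@example.org` and the leading space on the continuation line are assumptions), not one of the attached files.

```python
from email import message_from_bytes

# A Subject: fold followed by an empty line; the To: header after it is
# meant to be a header but sits below the empty line.
raw = (b"Subject: word-1\r\n"
       b" word-2\r\n"
       b"\r\n"
       b"To: rcpt@example.org\r\n"
       b"\r\n"
       b"body\r\n")

msg = message_from_bytes(raw)

# The empty line ends the header section, so To: is not seen as a header
# and ends up in the body instead.
to_header = msg["To"]
payload = msg.get_payload()
print(to_header)
print(payload)
```

Every header that follows the stray empty line is lost to the receiving side in the same way.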
I may be too late, but I have added a more doable "Alternative, steps to reproduce" to the Launchpad description.
I have also added 4 msdnmsgN(f+s).eml samples, where f=1 means From failure
and s=2 means Subject failure.

Please note that msdnmsgN2.eml follows the proposed workaround but fails.
Do not overlook the Subject: line case.

> You can't understand next?
I was wondering if the From belonged to the original mail or the RR.
Beware that the To: line of the RR comes from 
Disposition-Notification-To: Wada Mitsuhiro <muttley@ops.dti.ne.jp>
not From: ...
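
A minimal Python sketch of that point, assuming a made-up original message in which From: and Disposition-Notification-To: differ (the sender address `sender@example.org` is invented for the illustration):

```python
from email import message_from_string
from email.utils import parseaddr

original = message_from_string(
    "From: Sender Name <sender@example.org>\r\n"
    "Disposition-Notification-To: Wada Mitsuhiro <muttley@ops.dti.ne.jp>\r\n"
    "Subject: test\r\n"
    "\r\n"
    "body\r\n"
)

# The receipt is addressed to Disposition-Notification-To:, not to From:.
name, addr = parseaddr(original["Disposition-Notification-To"])
print(name, addr)
```

A test that wants to exercise the To: of the RR therefore needs the non-ASCII display name in Disposition-Notification-To:, whatever From: contains.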

I wonder why, being on Qmail and with an accent in my From:,
I haven't met the problem more often.

Thanks Wada, I had told Ubuntu that you would be fast and you are.
André Pirard, sorry for my wrong comment.
My test was based on From: being equal to Disposition-Notification-To:.
I should have presented the following data instead of the From: data.
> Disposition-Notification-To: =?ISO-2022-JP?B?GyRCT0JFRDh3OTAbKEI=?={CRLF}
>  <muttley@ops.dti.ne.jp>
Note: I used the following in the crafted Subject: test (attached mail data).
      Sorry for my confusing/misleading comments/data.
> Disposition-Notification-To: Wada Mitsuhiro <muttley@ops.dti.ne.jp>
No problem. Kawa
Thank you very much.
Severity: normal → S3