Closed
Bug 614930
Opened 15 years ago
Closed 11 years ago
Sending email to addresses where the local part is non ASCII does not work
Categories
(MailNews Core :: Internationalization, defect)
Tracking
(Not tracked)
RESOLVED
INCOMPLETE
People
(Reporter: matt, Unassigned)
References
Details
Attachments
(1 file: 16.90 KB, text/plain)
User-Agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US) AppleWebKit/534.13 (KHTML, like Gecko) Ubuntu/10.04 Chromium/9.0.595.0 Chrome/9.0.595.0 Safari/534.13
Build Identifier: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.1.15) Gecko/20101027 Lightning/1.0b1 Thunderbird/3.0.10
Local parts of the email address (i.e. before the @) that aren't ASCII should be converted to quoted printable.
If I manually set the email address to be quoted printable it works fine, but if you try sending the email normally it isn't quoted when sent to the smtp server.
I.e., sending an email to <帰南真昇@mattkeenan.net> doesn't work (the local part of the email address isn't converted to quoted printable), but sending an email to <=?utf-8?B?5biw5Y2X55yf5piH?=@mattkeenan.net> works fine.
Reproducible: Always
Steps to Reproduce:
1. Compose an email to an address with a non-ASCII local part.
2. Send it. Thunderbird passes the unencoded 8-bit string to the SMTP server in the To: header.
Actual Results:
BAD HEADER SECTION, Non-encoded 8-bit data (char E5 hex): To: \345\270\260\345\215\227\347\234\237\346\230\207@mat[...]
Expected Results:
To: =?utf-8?B?5biw5Y2X55yf5piH?=@mattkeenan.net
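The expected value above can be reproduced with a few lines of Python. This is only a minimal sketch, assuming the receiving server applies an RFC 2047 style "B" encoded-word to the UTF-8 bytes of the local part; the function name `encode_local_part` is hypothetical and is not Thunderbird code.

```python
import base64


def encode_local_part(address: str, charset: str = "utf-8") -> str:
    """Wrap a non-ASCII local part in an RFC 2047 style "B" encoded-word.

    Illustration only: RFC 2047 itself says an encoded-word must not appear
    inside an addr-spec, so this mirrors a server-side convention, not a
    standard.
    """
    local, _, domain = address.rpartition("@")
    if local.isascii():
        return address  # nothing to encode
    b64 = base64.b64encode(local.encode(charset)).decode("ascii")
    return f"=?{charset}?B?{b64}?=@{domain}"


print("To: " + encode_local_part("帰南真昇@mattkeenan.net"))
```

An address whose local part is already ASCII is returned unchanged.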
Comment 1•15 years ago
Tb 3.1.5 generated the following header (with 帰南真昇 always in UTF-8) when 帰南真昇@mattkeenan.net was entered in the To: field of the mail composition window, regardless of the character encoding of the mail.
> To: 帰南真昇@mattkeenan.net
i.e. Tb appears to always use UTF-8 for a non-ASCII local part. So, if your server's local rule for the local part is UTF-8, no problem occurs.
This was checked with "Send Later" and View Source in the Outbox of Local Folders. How the SMTP rules apply, and how Tb behaves during the actual SMTP send, has not been checked yet.
> To: \345\270\260\345\215\227\347\234\237\346\230\207@mat...
Who generated the local part in the first place?
If Tb generated it from the From: header of the original mail when replying, which bytes, in which charset, were in the local part of the original mail?
> Expected Results:
> To: =?utf-8?B?5biw5Y2X55yf5piH?=@mattkeenan.net
RFC 2822 defines (http://tools.ietf.org/html/rfc2822#section-3.4.1):
> addr-spec = local-part "@" domain
> local-part = dot-atom / quoted-string / obs-local-part
> quoted-string = [CFWS]
> DQUOTE *([FWS] qcontent) [FWS] DQUOTE
> [CFWS]
> qcontent = qtext / quoted-pair
> qtext = NO-WS-CTL / ; Non white space controls
> %d33 / ; The rest of the US-ASCII
> %d35-91 / ; characters not including "\"
> %d93-126 ; or the quote character
> %d34=0x22=" %d92=0x5C=\ %d93=0x5D=]
So, anything that is not a plain dot-atom has to be written as a quoted string, "..."@domain, and even qtext covers only 7-bit ASCII. Under this rule, "always use UTF-8" can be called a violation of RFC 2822 whenever the UTF-8 data contains 8-bit bytes.
Interpretation of "<non-ascii chars>" is up to the server. IIRC, there is no standards-track RFC for the interpretation or presentation of a non-ASCII local part. Even for the domain part, IDN is still experimental and limited to some environments. I guess that applying RFC 2047 to the local part of a mail address is a local rule of your server (and perhaps of some other servers).
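The qtext rule quoted above can be checked mechanically. The following is a rough sketch, not Thunderbird code; the helper names `is_rfc2822_qtext` and `non_qtext_bytes` are made up for illustration, and the NO-WS-CTL ranges are taken from RFC 2822.

```python
def is_rfc2822_qtext(code: int) -> bool:
    """Return True if the byte value is allowed as qtext by RFC 2822."""
    no_ws_ctl = 1 <= code <= 8 or code in (11, 12) or 14 <= code <= 31 or code == 127
    # %d33, %d35-91, %d93-126: US-ASCII except '"' (34) and '\' (92)
    printable = code == 33 or 35 <= code <= 91 or 93 <= code <= 126
    return no_ws_ctl or printable


def non_qtext_bytes(local_part: str) -> list:
    """List the UTF-8 bytes of a local part that qtext does not allow."""
    return [b for b in local_part.encode("utf-8") if not is_rfc2822_qtext(b)]


# Every byte of the UTF-8 local part falls outside qtext.
print([hex(b) for b in non_qtext_bytes("帰南真昇")])
```

Every byte of a UTF-8 multi-byte sequence is 0x80 or above, so none of them can pass this check.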
If you need "To: =?utf-8?B?5biw5Y2X55yf5piH?=@mattkeenan.net", do the following:
In the address book, define an entry with:
mail addr : =?utf-8?B?5biw5Y2X55yf5piH?=@mattkeenan.net
display name : 帰南真昇
With Tb 3.1 you then get the To: header you want by typing 帰南真昇 in the To: field.
> Example of mail composed in ISO-2022-JP.
> To: =?ISO-2022-JP?B?GyRCNSJGbj8/PjobKEI=?=
<=?utf-8?B?5biw5Y2X55yf5piH?=@mattkeenan.net>
Comment 2•15 years ago
The UTF8SMTP extension allows sending the local part in UTF-8. RFC 5336 is experimental and has almost no support from SMTP servers yet.
Comment 3•15 years ago
http://tools.ietf.org/html/rfc5336
Updates: 2821, 2822, 4952 Category: Experimental
IDN ccTLDs are already in use; the first ones were announced this time last year and were delegated approx 4 months ago. Egypt, UAE and Saudi Arabia, from memory. If we can confirm that Tb supports RFC 5336, then that handles half of the equation (the other half being that the server supports UTF8SMTP). RFC 5336 suggests that some systems in the past have used <8bit><\h><7bit>... to avoid the issue of 8-bit chars in the headers, but I think this is incredibly broken and shouldn't be supported by Tb. Any other suggestions for a fallback? Could QP be a valid suggestion (it would be passed unaltered by intermediate systems and finally be un-quoted when it arrived at its final destination)?
Updated•15 years ago
Assignee: nobody → smontagu
Blocks: 127399
Component: Message Compose Window → Internationalization
Product: Thunderbird → MailNews Core
QA Contact: message-compose → i18n
Updated•15 years ago
Assignee: smontagu → nobody
Comment 5•15 years ago
(In reply to comment #4)
> IDN ccTLDs are already in use; the first ones were announced this time last
> year and were delegated approx 4 months ago. Egypt, UAE and Saudi Arabia, from memory.
Matt, could you please point us to the RFC number that defines IDN for the domain part of a mail address, in RFC 5336 (which updates RFC 2821, 2822, 4952) and in the current RFC 2821, 2822, 4952?
> but I think this is incredibly broken and shouldn't be supported by Tb.
I think so too, if your Tb 3.0.x or Tb 3.1.x blindly sends the 8-bit part of UTF-8 as ??? of ???@x.y.z or "???"@x.y.z to the SMTP server even though the SMTP server doesn't advertise support for the UTF8SMTP extension to Tb.
Matt, can you attach to this bug (never paste, please) the mail data that Tb generated in the Outbox of Local Folders via "Send Later", and the SMTP log of the send triggered by "Send Unsent Messages"?
As this bug is about the local part of the mail address, please replace sensitive data and remove irrelevant data in the log file before making it public.
Comment 7•15 years ago
(In reply to comment #5)
> (In reply to comment #4)
> > IDN ccTLDs are already in use; the first ones were announced this time last
> > year and were delegated approx 4 months ago. Egypt, UAE and Saudi Arabia, from memory.
>
> Matt, could you please point us to the RFC number that defines IDN for the domain part
> of a mail address, in RFC 5336 (which updates RFC 2821, 2822, 4952) and in the current
> RFC 2821, 2822, 4952?
I think there is no such RFC. At least in the current state, the domain part can be punycoded because that doesn't break anything. Most of the work needed for IDN mail is in the local part. This section may help: http://tools.ietf.org/html/rfc5335#section-4.4
Comment 8•15 years ago
(In reply to comment #6)
> packet dump of SMTP session in Hex / ASCII
(i) UTF8SMTP doesn't exist in EHLO response.
(ii) No local part in RCPT To:.
> 0x0030: 15c7 faad 5243 5054 2054 4f3a 3c40 6d61 ....RCPT.TO:<@ma
> 0x0040: 7474 6b65 656e 616e 2e6e 6574 3e0d 0a ttkeenan.net>..
(iii) local part in UTF-8 is sent as is.
> 0x0120: 6e3a 2031 2e30 0d0a 546f 3a20 e5b8 b0e5 n:.1.0..To:.....
> 0x0130: 8d97 e79c 9fe6 9887 406d 6174 746b 6565 ........@mattkee
> 0x0140: 6e61 6e2e 6e65 740d 0a53 7562 6a65 6374 nan.net..Subject
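To make the dump easier to read, the twelve bytes between "To: " and "@" can be decoded as follows. This is a small sketch that only uses the hex words shown in the dump above; the variable names are arbitrary.

```python
# Hex words copied from the packet dump, from just after "To: " up to "@".
dump_words = "e5b8 b0e5 8d97 e79c 9fe6 9887"

raw = bytes.fromhex(dump_words)   # whitespace between byte groups is ignored
local_part = raw.decode("utf-8")  # the 8-bit data is plain UTF-8

# The same bytes in the octal notation used by the server's error message.
octal = "".join(f"\\{b:03o}" for b in raw)

print(local_part)
print(octal)
```

The octal form matches the escape sequence shown in the "BAD HEADER SECTION" error in the original report.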
"RFC violation of (iii) when (i)" is similar phenomenon to "sent in 8bit even when no 8BITMIME in EHLO" case.
(a) if mail.strictly_mime=false, mail is sent in 8bit even when 8BITMIME is not advertised.
(b) if mail.strictly_mime=true, mail is sent in quoted-printable when 8-bit data exists.
i.e. Tb doesn't implement the RFC for 8BITMIME.
Almost all recent SMTP servers support 8BITMIME, and many recent SMTP servers that don't return 8BITMIME still process 8-bit data very well. Such a server never drops the 0x80 bit even when 8BITMIME is absent from the EHLO response, and it handles Content-Type: charset=UTF-8 with Content-Transfer-Encoding: 8bit properly. Problems due to (a) are already exceptional and very rare.
I think (ii) is the bigger issue. Why is there no local part in RCPT TO:? Does Tb remove the local part because of the 8-bit data? To whom would the mail be sent if the SMTP server didn't reject the 8-bit local part in the To: header?
> 0x0030: 15c7 fa94 3232 3020 6261 7a62 6f78 2045 ....220.bazbox.E
> 0x0040: 534d 5450 2050 6f73 7466 6978 2028 5562 SMTP.Postfix.(Ub
> 0x0050: 756e 7475 290d 0a untu)..
Since you were able to obtain 帰南真昇@mattkeenan.net, the SMTP server of the ISP that owns mattkeenan.net should support non-ASCII local parts correctly. "Correctly" in this context means based on RFC 5336 too, not only on its own local rule.
Does the SMTP server support RFC 5336 correctly?
Or did your ISP start using non-ASCII local parts expecting every mailer or MTA to apply RFC 2047 encoding, or quoted-printable with UTF-8, to the local part?
FYI.
Yahoo! SMTP response for To: 帰南真昇@rocketmail.com and To: 帰南真昇@gmmil.com.
> An error occurred while sending mail. The mail server responded:
> From address not verified - see http ://help.yahoo.com/.../sendfrom-07.html.
> Please verify that your email address is correct in your Mail preferences and try again.
Why "From address"? Both MAIL FROM: and the From: header are my @rocketmail.com address at Yahoo!.
Perhaps in some error paths @rocketmail.com is still treated as a non-Yahoo! address.
Gmail SMTP response for To: 帰南真昇@gmail.com.
> An error occurred while sending mail. The mail server responded:
> 5.1.3 Syntax error in mailbox address "?@gmail.com" (non-printable character).
> Please check the message recipient and try again.
Comment 9•15 years ago
8BITMIME appears only in the following source (which looks like a fake SMTP server in the test suite), and no source contains the string UTF8SMTP.
> http://mxr.mozilla.org/comm-central/source/mailnews/test/fakeserver/smtpd.js#42
If someone had already worked on RFC 5336, I think the string UTF8SMTP would appear somewhere.
It may be that:
(1) The message header enhancement of RFC 5336 (UTF8SMTP, UTF-8 in the local part of a mail address) and/or of RFC 5335 (general extension of message header strings from ASCII to UTF-8) was supported in mail display.
(2) It was wrongly applied to the local part of the mail address in message header generation, even though RFC 5336 is not supported by Tb.
(3) As the SMTP code of Tb is based on RFC 2821 (SMTP), the SMTP code removed the invalid 8-bit local part from RCPT TO:.
If that's right, the real problem is (2), because there is no way to pass an SMTP command or message header with 8-bit data in the local part of a mail address when the SMTP server doesn't support UTF8SMTP. Tb shouldn't generate message headers with 8-bit data in the local part until RFC 5336 and/or RFC 5335 become standards track and are widely supported.
Needless to say, an improvement for users who need to use an already existing non-ASCII local part is desirable, not only for a local part based on RFC 5336 but also for a local part based on a local rule. Please note that "RFC 2047 encoding or quoted-printable of the local part" is a local rule at the server, even if the ISP's servers fully supported RFC 5336 when they started using non-ASCII local parts and accept both an 8-bit local part in MAIL FROM:/RCPT TO: and an 8-bit local part in message headers.
Comment 10•15 years ago
Please consider the following case. (RFC5336 = supported, non-RFC5336 = not supported)
Mailer(RFC5336)->SMTP(RFC5336)->SMTP(Relay,non-RFC5336)->Destination(RFC5336)
What can the SMTP servers do when they know neither RFC 5336 nor the local rule at the Destination?
To support this situation, mailer-side enhancements are mandatory; simply supporting UTF8SMTP is insufficient, and support for the server's local rule is also needed.
Further, the main reason 8BITMIME was not supported was:
Whether 8BITMIME is available is known only after connecting to the SMTP server.
Reconstructing the mail data after the SMTP connection is nearly impossible.
The same applies to UTF8SMTP.
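The timing problem can be seen in a small sketch: the set of extensions only exists once the EHLO reply has been read, which is after the message has normally been built. The function `advertised_extensions` and the sample reply are hypothetical; the sample is not taken from the attached packet dump.

```python
def advertised_extensions(ehlo_response: str) -> set:
    """Collect ESMTP keywords from a multi-line 250 reply to EHLO."""
    extensions = set()
    # The first reply line carries the server greeting, not a keyword.
    for line in ehlo_response.splitlines()[1:]:
        if line.startswith("250") and len(line) > 4:
            words = line[4:].split()
            if words:
                extensions.add(words[0].upper())
    return extensions


# Hypothetical EHLO reply from a server without UTF8SMTP.
SAMPLE_REPLY = "250-bazbox\r\n250-PIPELINING\r\n250-SIZE 10240000\r\n250 8BITMIME\r\n"

exts = advertised_extensions(SAMPLE_REPLY)
can_send_utf8_local_part = "UTF8SMTP" in exts
print(sorted(exts), can_send_utf8_local_part)
```

A mailer that wanted to honour the result would have to build, or rebuild, the headers after this check, which is the difficulty described above.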
Example of enhancements at mailer side.
localpart.rfc5336.support:
true = always use utf-8 for localpart
(RFC5336 support by SMTP code is mandatory)
false = action depends on other options(try to encode)
localpart.local_rule.charset
default=utf-8
localpart.local_rule.encoding:
0 = Reject non-ascii in localpart
1 = rfc2047 encoding of localpart
2 = base64 encoding of localpart
3 = quoted-printable encoding of localpart
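As a rough illustration of how the proposed localpart.local_rule.encoding values might behave, here is a sketch in Python. Everything in it is an assumption made for illustration: the function name `apply_local_rule`, the choice of "B" encoding for option 1, and the simplified quoted-printable in option 3. None of this exists in Thunderbird.

```python
import base64


def apply_local_rule(local_part: str, encoding: int, charset: str = "utf-8") -> str:
    """Encode a local part according to the proposed option values 0-3."""
    if local_part.isascii():
        return local_part  # ASCII local parts need no local rule
    if encoding == 0:
        raise ValueError("non-ASCII local part rejected")
    raw = local_part.encode(charset)
    b64 = base64.b64encode(raw).decode("ascii")
    if encoding == 1:  # RFC 2047 style encoded-word (assumed "B" form)
        return f"=?{charset}?B?{b64}?="
    if encoding == 2:  # bare base64
        return b64
    if encoding == 3:  # simplified quoted-printable: escape '=' and non-printables
        return "".join(
            chr(b) if 33 <= b <= 126 and b != 0x3D else f"={b:02X}" for b in raw
        )
    raise ValueError(f"unknown encoding option: {encoding}")


for option in (1, 2, 3):
    print(option, apply_local_rule("帰南真昇", option))
```

Which option applies would still have to be looked up per destination, which is the open question about where to store these settings.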
"RFC5336 is supported or not" depends on all of SMTP server, Relay servers, and Destination(POP3/IMAP). And, local rule depends on server(@x.y.z part).
To which entity should these options be associated?
Is an entry in the address book sufficient?
If the address book is used, will a generic address entry such as *@x.y.z be needed for this kind of option?
If the address book is used, would a new "encoded mail address" field, filled in automatically, be convenient?
Comment 11•15 years ago
Wada, I think this case is covered in http://tools.ietf.org/html/rfc5504 - Downgrading Mechanism for Email Address Internationalization. But according to it, "... There is no associated up-conversion mechanism, although internationalized email clients might use original internationalized addresses or other data when displaying or replying to downgraded messages." Which brings us to http://tools.ietf.org/html/rfc5825 - Displaying Downgraded Messages for Email Address Internationalization.
Comment 12•13 years ago
can you please retest this in recent TB nightlies (20)?
Maybe this was fixed by bug 127399.
Flags: needinfo?(matt)
Comment 13•11 years ago
(In reply to :aceman from comment #12)
> can you please retest this in recent TB nightlies (20)?
> Maybe this was fixed by bug 127399.
Matt's email address bounces
Status: UNCONFIRMED → RESOLVED
Closed: 11 years ago
Flags: needinfo?(matt)
Resolution: --- → INCOMPLETE