Closed
Bug 229399
Opened 21 years ago
Closed 19 years ago
RFC2047 subject and realname headers [=?charset?...] miscopied if charset differs from compose body charset
Categories
(MailNews Core :: Composition, defect)
Tracking
(Not tracked)
VERIFIED
FIXED
People
(Reporter: thomas.lussnig, Assigned: jshin1987)
References
Details
(Keywords: intl)
Attachments
(1 file)
patch (1.12 KB)
Bienvenu: review+
mscott: superreview+
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.6a) Gecko/20031029
Build Identifier: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.6a) Gecko/20031029

When I try to reply to a message with the headers below, Mozilla uses the wrong email addresses. None of them is the correct one; all are pieces of the decoded descriptive name. This could even be used to make users send mail to unintended recipients. In particular, with signed mail the answer can go to the wrong people if the real name is cleverly chosen.

To: lussnig@smcc.net
Cc: yoshfuji@linux-ipv6.org
Subject: Re: IPv6 Patch
From: YOSHIFUJI Hideaki / =?iso-2022-jp?B?GyRCNUhGIzFRTEAbKEI=?= <yoshfuji@linux-ipv6.org>

Reproducible: Always

Steps to Reproduce:
1. Take a message with these headers:
   To: lussnig@smcc.net
   Cc: yoshfuji@linux-ipv6.org
   From: YOSHIFUJI Hideaki / =?iso-2022-jp?B?GyRCNUhGIzFRTEAbKEI=?= <yoshfuji@linux-ipv6.org>
2. Receive the mail and try to reply.
3. You see the wrong names.

Actual Results:
To: YOSHIFUJI
To: Hideaki
To: /
To: <control characters from Unicode>

Expected Results:
yoshfuji@linux-ipv6.org
"Copy Email Address" returns the right result!
Assignee
Comment 1•21 years ago
Confirming for the moment. (My tree got 'screwed up', so I couldn't check it with the trunk, but I was able to reproduce it with 1.5. I have to check it again.) The From: address is correctly displayed in the mail list pane and the message display area.
Comment 2•21 years ago
> From: YOSHIFUJI Hideaki / =?iso-2022-jp?B?GyRCNUhGIzFRTEAbKEI=?=
> <yoshfuji@linux-ipv6.org>

RFC 2822 defines (see http://www.faqs.org/rfcs/rfc2822.html):
> from = "From:" mailbox-list CRLF
> mailbox-list = (mailbox *("," mailbox)) / obs-mbox-list
> mailbox = name-addr / addr-spec
> name-addr = [display-name] angle-addr
> addr-spec = local-part "@" domain

[display-name] should be quoted with " (double quote) if it contains a control character such as a space.
> display-name = phrase
> phrase = 1*word / obs-phrase
> obs-phrase = word *(word / "." / CFWS)
> word = atom / quoted-string
> atom = [CFWS] 1*atext [CFWS]

Therefore From: should be:
> From: "YOSHIFUJI Hideaki / =?iso-2022-jp?B?GyRCNUhGIzFRTEAbKEI=?="
> <yoshfuji@linux-ipv6.org>

What mailer created the mail?
Reporter
Comment 3•21 years ago
The mailer is X-Mailer: Mew version 4.0.62 on Emacs 21.3.50 / Mule 5.0 (SAKAKI)
1. Even if the mail is not correctly escaped, why does Mozilla use the first three tokens and not the last one, which is correct?
2. Why then does "Copy Email Address" work with the right mouse button?
3. If Mozilla takes it as a name list, why does it skip the last token, which contains the correct recipient?
4. Is there no check for valid email addresses, which should not contain UTF-8 characters above 127?
Assignee
Comment 4•21 years ago
I suspected that, but haven't bothered to check RFC (2)822. (well, I should have known better than that). Changing the severity to 'enhancement' because it's not a bug per se. Various Emacs mail programs are often broken when it comes to MIME and I18N. As for point #4, with IDN(international domain name), it's now possible to have non-ascii characters in the email address. RFC 2822 is not likely to have been updated yet. And, when it's updated, perhaps punycode would be used there for 'machines'. So, your point still stands. Converting between punycode and UTF-8 (or other forms of Unicode) would be mail clients' job. Anyway, it seems like there may be something we can do against this kind of standard violation.
Severity: major → enhancement
OS: Linux → All
Assignee
Comment 5•21 years ago
> Therefore From: should be :
>> From: "YOSHIFUJI Hideaki / =?iso-2022-jp?B?GyRCNUhGIzFRTEAbKEI=?="
>> <yoshfuji@linux-ipv6.org>

Oops. Something must have gotten into my head. An RFC 2047 encoded-word is an atom (it cannot be within a quoted-string). Therefore, the above is not correct.

atom = [CFWS] 1*atext [CFWS]
word = atom / quoted-string
phrase = 1*word / obs-phrase
display-name = phrase

None of the characters in the header at issue here is forbidden in 'atom' (they are all valid 'atext', including the slash).

atext = ALPHA / DIGIT / ; Any character except controls,
        "!" / "#" /     ;  SP, and specials.
        "$" / "%" /     ;  Used for atoms
        "&" / "'" /
        "*" / "+" /
        "-" / "/" /
        "=" / "?" /
        "^" / "_" /
        "`" / "{" /
        "|" / "}" /
        "~"

In conclusion, the address is valid.

From: YOSHIFUJI Hideaki / =?iso-2022-jp?B?GyRCNUhGIzFRTEAbKEI=?= <yoshfuji@linux-ipv6.org>
Severity: enhancement → normal
Assignee
Comment 6•21 years ago
*** Bug 104064 has been marked as a duplicate of this bug. ***
Comment 7•21 years ago
>> Therefore From: should be :
>>> From: "YOSHIFUJI Hideaki / =?iso-2022-jp?B?GyRCNUhGIzFRTEAbKEI=?="
>>> <yoshfuji@linux-ipv6.org>
> RFC 2047-encoded word is an atom (cannot be within a quoted-string).
> Therefore, the above is not correct.

Oh yeah, you are right. I forgot RFC 2047.

> In conclusion, the address is valid.
> From: YOSHIFUJI Hideaki / =?iso-2022-jp?B?GyRCNUhGIzFRTEAbKEI=?=
> <yoshfuji@linux-ipv6.org>

No, you are not correct either. In BNF notation, "/" means OR. See http://www.faqs.org/rfcs/rfc2234.html
> 3.2 Alternatives Rule1 / Rule2
> Elements separated by forward slash ("/") are alternatives.
> Therefore,
> foo / bar
> will accept <foo> or <bar>.

Since atext cannot include a space, display-name cannot include a space unless it is quoted with " (double quote). Therefore, your conclusion is also incorrect. To satisfy both RFC (2)822 and RFC 2047, the whole display-name should be encoded at once, i.e. the valid format in your case is as follows.
> From: =?iso-2022-jp?B?(Encoded-English&Spaces&Japanese)=?=
> <uid@domain.name>

Mozilla 2003122809-trunk/Win-Me generated a recipient address in this format for a display-name that includes Japanese characters.
Assignee
Comment 8•21 years ago
No, you got it wrong. For sure, atext doesn't include a space, but look at how atom is defined: atom is a sequence of atext _enclosed_ by CFWS. If you were right, tens of millions of emails produced per RFC 822 (as shown below) would be invalid under RFC 2822. The authors of RFC 2822 do care about backward compatibility.

From: Jungshik Shin <jshin@example.com>
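
For readers following the grammar argument in comments 5 and 8, here is a minimal Python sketch of the point being made: each whitespace-separated token of the display name consists only of atext, and the spaces between the tokens are the CFWS that the atom rule permits. The names `ATEXT` and `is_atom` are made up for this illustration, and the sketch ignores comments and header folding, so it is not a full RFC 2822 parser.

```python
import string

# atext per RFC 2822 section 3.2.4: ALPHA / DIGIT / these printable symbols.
ATEXT = set(string.ascii_letters + string.digits + "!#$%&'*+-/=?^_`{|}~")


def is_atom(token: str) -> bool:
    """Return True if token is 1*atext (surrounding CFWS already stripped)."""
    return bool(token) and all(ch in ATEXT for ch in token)


display_name = "YOSHIFUJI Hideaki / =?iso-2022-jp?B?GyRCNUhGIzFRTEAbKEI=?="
tokens = display_name.split()  # plain whitespace stands in for CFWS here
all_atoms = all(is_atom(tok) for tok in tokens)
print(tokens, all_atoms)
```

Under these assumptions the unquoted display name is a valid phrase (1*word, each word an atom), which is the reading comment 8 defends.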
Assignee
Comment 9•21 years ago
> whole display-name characters should be encoded at once.
Well, you have to be careful NOT to exceed the encoded word length limit (78?)
in RFC 2047. You have to split somewhere if it gets too long.
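
For reference, RFC 2047 section 2 caps each encoded-word at 75 characters, counting the charset, encoding, encoded text, and delimiters. The following is a rough Python sketch of splitting a long name into several encoded-words, assuming UTF-8 and B encoding; `make_word` and `split_encoded_words` are hypothetical helpers for illustration, not Mozilla code.

```python
import base64


def make_word(chunk: str, charset: str = "utf-8") -> str:
    """Build one B-encoded RFC 2047 encoded-word for chunk."""
    payload = base64.b64encode(chunk.encode(charset)).decode("ascii")
    return f"=?{charset}?B?{payload}?="


def split_encoded_words(text: str, charset: str = "utf-8", limit: int = 75) -> list:
    """Split text on character boundaries so every encoded-word fits the limit."""
    words, chunk = [], ""
    for ch in text:
        candidate = chunk + ch
        if chunk and len(make_word(candidate, charset)) > limit:
            words.append(make_word(chunk, charset))  # flush what fits
            chunk = ch
        else:
            chunk = candidate
    if chunk:
        words.append(make_word(chunk, charset))
    return words


name = "Rytis Umbrasas Žolėdis " * 6
words = split_encoded_words(name)
print("\n ".join(words))  # encoded-words separated by folding whitespace
```

Splitting on whole characters keeps each encoded-word independently decodable; a stateful charset such as ISO-2022-JP would need more care than this sketch takes.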
Comment 10•20 years ago
The following From: headers exhibit the same (or similar) problems:

(bug 231732)
From: Example Name =?iso-2022-jp?B?GyRCPzkyPEJZOSgbKEI=?= <test@example.com>
reply init'd as:
To: Example Name {junk}
Nearly identical to this bug -- ISO-2022-JP, and the reply email address is trashed. I'm duping this one over.

(bug 252240)
From: "=?big5?B?IkhzdSwgSmVubnkgW659pk6sTF0i?=" <yyy@xxx.tw>
reply init'd as:
To: Jenny [{junk}]"" <yyy@xxx.tw>
Similar to this bug, but the email address is preserved; note that this address is incorrectly quoted, and exhibits the problems from bug 156588 and bug 254519.

(bug 258155)
From: =?Windows-1251?B?wOHw4Ozq6O3gINLg8vz/7eA=?= <mail@from.host.com>
reply init'd as:
To: {bunch of junk} <mail@from.host.com>
Again, similar, but the email address is preserved.

Jungshik Shin, do you think the latter two examples are the same problem?
Summary: wrong reply adress on =?utf8?x?xxxx?= realnames → wrong reply address to some RFC2047 realnames [=?charset?...]
Comment 11•20 years ago
*** Bug 231732 has been marked as a duplicate of this bug. ***
Reporter
Comment 12•20 years ago
In version 1.8a3 it looks like this has been fixed.
From: =?GB2312?B?sbG+qdeovNK3rdLrzfhCZWlqaW5n?= Chinese Translation <bjhyw35@eyou.com>
Subject: =?GB2312?B?t63S63RyYW5zbGF0aW9u?=
These work correctly now, so I am setting the status to fixed.
Status: NEW → RESOLVED
Closed: 20 years ago
Resolution: --- → FIXED
Comment 13•20 years ago
No bug / patch specified as the fix. ->WORKSFORME
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Updated•20 years ago
Status: REOPENED → RESOLVED
Closed: 20 years ago → 20 years ago
Resolution: --- → WORKSFORME
Assignee
Comment 14•20 years ago
Mike, do you happen to have 'Always use this default character encodings in replies' checked? There's a bug in my patch to add that feature. I've had a patch for quite a while, but forgot to file a bug and fix it.
Comment 15•20 years ago
The results I posted in comment 10 were based on tests with various 1.8a builds and TB 0.7. I just retested with 1.8a3-0824, Win2K. (Is there a related patch checked in more recently than this?)

With "always use my default charset in replies," the results depend on which charset is specified. If I specify Big5 as the default, then the header posted in comment 12 is handled correctly, as is the one I posted from bug 252240, but the others look wrong. If I uncheck that setting, all the results look like junk, as I noted before; the header from comment 12 is entered as
{junk}Beijing Chinese Translation <bjhyw35@eyou.com>

Reopening this bug.
Status: RESOLVED → REOPENED
Resolution: WORKSFORME → ---
Comment 16•20 years ago
I've gotten a slightly better hold on this problem. I believe the RFC2047 From: header is handled correctly -- that is, the same text appears in the To: field -- if the charset specified in the header is the same as the initial charset used to compose the message. Once the To: header has been initialized, the message encoding can be changed without affecting the header.

This can be seen using either method of selecting the charset for the reply. If "Always use my character set" is selected, then replying to a message where the From: is encoded with another set reliably exhibits the problem. If "Use the original sender's character set" is selected, then replying to a message that is displayed in some charset other than that in the From: header will exhibit the problem. This last point is true whether the character set for the message display has been chosen from the Content-Type header, from the folder's properties, or from an explicit View|Encoding.

Bug 258856's fix has done nothing to address this issue (I'm not sure if it was supposed to).
Comment 17•20 years ago
*** Bug 258155 has been marked as a duplicate of this bug. ***
Comment 18•20 years ago
*** Bug 265423 has been marked as a duplicate of this bug. ***
Updated•20 years ago
Product: MailNews → Core
Comment 19•20 years ago
*** Bug 252592 has been marked as a duplicate of this bug. ***
Comment 20•20 years ago
*** Bug 273381 has been marked as a duplicate of this bug. ***
Comment 21•19 years ago
I really suspect it has something to do with the body charset. Quoting bug 252592 (turn on UTF-8 in case you see borked text):

QUOTE:
If I try to reply to an email whose "From:" header was encoded in a different charset than its body, the "To:" header gets transcoded improperly. An example: I received an email with the following headers:

From: "=?iso-8859-4?Q?Rytis Umbrasas =AEol=ECdis?=" <user@provider.lt>
Subject: =?iso-8859-4?B?UHJhuXltYXMgcGFk7HQgabluYWdyaW7sdCBTY3JpYmUgbGF1a3VzILHo6uznufn+vg==?=
MIME-Version: 1.0
Content-Type: text/plain; charset="windows-1257"
Content-Transfer-Encoding: quoted-printable

In the message list, the sender looks like this:
Rytis Umbrasas Žolėdis

However, when replying, the newly formed "To:" field looks like this:
Rytis Umbrasas ®olģdis

IMHO, that's because Thunderbird uses the body charset to transcode the "From:" header, instead of using the header charset.
/QUOTE

Furthermore, I have recently been receiving e-mails quite often from a few Evolution users for which the reply names and subjects get borked, for example:

From: =?iso-8859-4?Q?K=EAstutis_Bili=FEnas?= <user@domain.lt>
To: "komp_lt@konferencijos.lt" <list@another_domain.lt>
Mime-Version: 1.0
X-Mailer: Evolution 2.0.3
Subject: Re: .po =?iso-8859-4?q?fail=F9?= =?iso-8859-2?q?_ra=B9ybos?= tikrinimas
Sender: list-bounces@another_domain.lt
Errors-To: list-bounces@another_domain.lt

--===============1238572946==
Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="=-lb66fM8zWmjR10clnq6M"

--=-lb66fM8zWmjR10clnq6M
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: quoted-printable

On Pn, 2005-01-21 at 22:52 +0200, Marius Gedminas wrote:
> Radau nuorod=C4=85, gal bus naudinga: skriptukas, leid=C5=BEiantis aspell=
=C4=85 ant

When replying to this email, I get the following junk in the To: and Subject: fields:
To: � <kebil@kaunas.init.lt>
Subject: Re: .po �
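
As an illustration of what a correct decode of the Evolution headers above looks like, here is a short Python sketch using the standard library's `email.header.decode_header`. Each encoded-word is decoded with the charset it names, regardless of the UTF-8 body charset. The helper name `decode_rfc2047` is made up for this example, and treating unencoded stretches as ASCII is an assumption.

```python
from email.header import decode_header


def decode_rfc2047(value: str) -> str:
    """Decode a header value, honouring the charset named in each encoded-word."""
    parts = []
    for chunk, charset in decode_header(value):
        if isinstance(chunk, bytes):
            # Unencoded stretches come back with charset None; assume ASCII.
            chunk = chunk.decode(charset or "ascii", errors="replace")
        parts.append(chunk)
    return "".join(parts)


subject = "Re: .po =?iso-8859-4?q?fail=F9?= =?iso-8859-2?q?_ra=B9ybos?= tikrinimas"
sender = "=?iso-8859-4?Q?K=EAstutis_Bili=FEnas?="
print(decode_rfc2047(subject))
print(decode_rfc2047(sender))
```

Note that the Subject mixes two charsets (ISO-8859-4 and ISO-8859-2) in adjacent encoded-words, so no single charset chosen for the whole reply can decode it correctly.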
Comment 22•19 years ago
(In reply to comment #21)
This one is in UTF-8.
> From: "=?iso-8859-4?Q?Rytis Umbrasas =AEol=ECdis?=" <user@provider.lt>
> Subject:
> =?iso-8859-4?B?UHJhuXltYXMgcGFk7HQgabluYWdyaW7sdCBTY3JpYmUgbGF1a3VzILHo6uznufn+vg==?=
> MIME-Version: 1.0
> Content-Type: text/plain; charset="windows-1257"
> Content-Transfer-Encoding: quoted-printable
>
> In message list, the sender looks like this:
> Rytis Umbrasas Žolėdis
>
> However, when replying, the newly formed "To:" field looks like this:
> Rytis Umbrasas ®olģdis
>
> IMHO, that's because Thunderbird uses the body charset to transcode the "From:"
> header, instead of using the header charset.

I think my assumption is correct:

rq@bliss:~$ echo "Rytis Umbrasas Žolėdis" | iconv -futf8 -tiso8859-4 | iconv -fcp1257 -tutf8
Rytis Umbrasas ®olģdis

This example shows that if you take "Rytis Umbrasas Žolėdis" in ISO-8859-4 (the header charset in this case) and interpret it as Windows-1257 (the body charset in this case), you see "Rytis Umbrasas ®olģdis" instead.
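
The same check as the iconv pipeline can be written as a few lines of Python, for anyone without iconv at hand. This is only a restatement of the experiment above, using Python's built-in `iso8859_4` and `cp1257` codecs.

```python
name = "Rytis Umbrasas Žolėdis"

raw = name.encode("iso8859_4")    # bytes as carried by the From: header
wrong = raw.decode("cp1257")      # misread using the body charset
right = raw.decode("iso8859_4")   # read using the header's own charset

print(wrong)
print(right)
```

The bytes never change; only the charset used to interpret them does, which is why the address part (pure ASCII) survives while the accented letters in the name do not.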
Assignee
Comment 23•19 years ago
Thanks for the detailed analysis. Taking for better tracking. I may or may not have made a patch for this in the past. Even if I did, I may have lost it. I'll take another look.
Assignee: sspitzer → jshin1987
Status: REOPENED → NEW
Comment 24•19 years ago
It's kind of weird tho, that those headers are transcoded properly in the message list, and in the message preview window/pane. Why would a reply window use different transcoding functions at all?
Assignee
Comment 25•19 years ago
(In reply to comment #24)
> It's kind of weird tho, that those headers are transcoded properly in the
> message list, and in the message preview window/pane. Why would a reply window
> use different transcoding functions at all?

They do use the same function but with a different option for 'override'. It used to use a different function, which I changed to use the same function but overlooked the override option. Anyway, I have a fix at hand, but I need to test more extensively.
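
To make the 'override' remark concrete, here is a hypothetical Python sketch; it is not the Mozilla code, and the function name `decode_mime_header` and its parameters are assumptions chosen for illustration. It models a single decoder with a flag that, when set, ignores the charset named in each encoded-word and uses a caller-supplied default instead. The header is the one from comment 21, with underscores in place of the raw spaces so that it is a well-formed encoded-word.

```python
from email.header import decode_header


def decode_mime_header(value: str, default_charset: str, override_charset: bool) -> str:
    """Hypothetical decoder: override_charset=True ignores per-word charsets."""
    out = []
    for chunk, charset in decode_header(value):
        if isinstance(chunk, bytes):
            if override_charset or charset is None:
                charset = default_charset  # caller's charset wins
            chunk = chunk.decode(charset, errors="replace")
        out.append(chunk)
    return "".join(out)


header = "=?iso-8859-4?Q?Rytis_Umbrasas_=AEol=ECdis?="
# Message list / preview: trust the charset in the encoded-word.
display = decode_mime_header(header, "windows-1257", override_charset=False)
# Reply window with the flag set: the compose (body) charset is forced.
reply = decode_mime_header(header, "windows-1257", override_charset=True)
print(display)
print(reply)
```

Under this model, the same function yields the correct name in one caller and the garbled name in the other, which would be consistent with the behaviour reported in this bug.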
Status: NEW → ASSIGNED
Comment 26•19 years ago
(In reply to comment #25)
> Anyway, I have a fix at hand, but I need to test more extensively.

Any news?
Assignee
Comment 27•19 years ago
The fix is very simple (a one-liner), but I haven't yet managed to test it comprehensively.
Comment 28•19 years ago
I don't think it requires lots of testing actually. Don't forget the fact that you're already using exactly the same function in the message list and preview pane.
Assignee
Comment 29•19 years ago
What's so urgent? I do know my code.
Comment 30•19 years ago
I never said you don't know your code. If it looks to you like I said so, I'm sorry. OK, it's not that urgent; however, I wish I could expect it to be fixed in the next release...
Comment 31•19 years ago
*** Bug 285053 has been marked as a duplicate of this bug. ***
Comment 32•19 years ago
Tweaking summary for searchability.
Summary: wrong reply address to some RFC2047 realnames [=?charset?...] → RFC2047 subject and realname headers [=?charset?...] miscopied if charset differs from compose body charset
Comment 33•19 years ago
Hello jshin, have you tested your one-line fix? If you have not, would you please do that or commit it anyways? I don't think anyone wants this small yet important bug to be forgotten for the next 15 months again. Please. Rimas
Assignee
Comment 34•19 years ago
Sorry for the delay. It took quite extensive testing (so as not to break what I fixed in the past while fixing this).
Attachment #179230 - Flags: superreview?(mscott)
Attachment #179230 - Flags: review?(bienvenu)
Updated•19 years ago
Attachment #179230 - Flags: superreview?(mscott) → superreview+
Updated•19 years ago
Attachment #179230 - Flags: review?(bienvenu) → review+
Comment 35•19 years ago
(In reply to comment #34)
> Created an attachment (id=179230)
> patch
>
> Sorry for the delay. It took me quite an extensive test (not to break what I
> fixed in the past while fixing this).

Thank you very much! :) You made my morning good! ;)
Assignee
Comment 36•19 years ago
Thanks for r/sr. Fixed on the trunk.
Status: ASSIGNED → RESOLVED
Closed: 20 years ago → 19 years ago
Resolution: --- → FIXED
Comment 37•19 years ago
Verified fixed with TB 1.0+0406, Win2K. Thank you, Jungshik!
Status: RESOLVED → VERIFIED
Comment 38•19 years ago
jshin, don't listen to them when they hard press you to check in ;-) This patch caused the regression in bug 291320.
Comment 39•19 years ago
*** Bug 298669 has been marked as a duplicate of this bug. ***
Comment 40•19 years ago
*** Bug 274053 has been marked as a duplicate of this bug. ***
Updated•16 years ago
Product: Core → MailNews Core