Closed Bug 193439 Opened 22 years ago Closed 20 years ago

RFC 2231 style encoding should be used for filename parameter of attachment (instead of RFC 2047 style)

Categories

(MailNews Core :: Attachments, defect)

defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: jshin1987, Assigned: jshin1987)

References

(Blocks 1 open bug)

Details

(Keywords: intl)

Attachments

(1 file)

Mozilla-mail uses RFC 2047 style encoding for the filename parameter of attachments. This is a clear violation of RFC 2231 (RFC 2184) and RFC 822 (STD 11). (Note that we cannot use RFC 2047 style encoding for parameters of mail headers while abiding by RFC 822/STD 11, which is why RFC 2231 style encoding was introduced.) Mozilla-mail can decode RFC 2231 style encoding in incoming messages, but for some reason it doesn't use it for outgoing messages. I searched bugzilla for 'rfc 2231', 'rfc 2184', 'rfc 2047' and 'content-disposition', but there's no bug filed on this as far as I can tell. (See also bug 193142 comment #2 and bug 193142 comment #3.)
In the patch for bug 86089, the code for checking the 'mail.strictly_mime.parm_folding' pref. entry was added [1], but the pref was never added to mailnews.js, so it's always set to the default 0. With that set to '2' (in user.js), RFC 2231 encoding is applied to parameter values that are already RFC 2047 encoded, as shown below:

Content-Type: image/gif;
 name*0*=UTF-8'en, ko, hi, en-us'%3D%3FUTF-8%3FB%3FaTE4bmwxMG4uY29tL8OGw4TC
 name*1*=tsO1wrDDuC5naWY%3D%3F%3D
Content-Transfer-Encoding: base64
Content-Disposition: inline;
 filename*0*=UTF-8'en, ko, hi, en-us'%3D%3FUTF-8%3FB%3FaTE4bmwxMG4uY29tL8OG
 filename*1*=w4TCtsO1wrDDuC5naWY%3D%3F%3D

Moreover, its 'lang' field is a verbatim copy of my 'accept-language' setting. If there's no better source (I can think of a couple of sources), 'lang' should just be omitted.

[1] http://lxr.mozilla.org/seamonkey/source/mailnews/compose/src/nsMsgCompUtils.cpp#800
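To make the double encoding in the headers above easier to see, here is a minimal Python sketch that joins the two `name*N*` segments from the report, undoes the RFC 2231 percent-escaping, and inspects what is left. The helper name `join_rfc2231_segments` is made up for this illustration and is not part of Mozilla's code.

```python
import base64
from urllib.parse import unquote

# Continuation segments copied from the Content-Type header in the report.
segments = [
    "UTF-8'en, ko, hi, en-us'%3D%3FUTF-8%3FB%3FaTE4bmwxMG4uY29tL8OGw4TC",
    "tsO1wrDDuC5naWY%3D%3F%3D",
]


def join_rfc2231_segments(parts):
    """Split charset'lang' off the first segment and concatenate the values."""
    charset, lang, first = parts[0].split("'", 2)
    return charset, lang, first + "".join(parts[1:])


charset, lang, encoded = join_rfc2231_segments(segments)

# Undo the RFC 2231 layer (percent-escapes).
value = unquote(encoded, encoding=charset)
print(lang)    # the 'lang' field copied from the accept-language setting
print(value)   # what remains is still an RFC 2047 encoded-word

# Undo the RFC 2047 layer: =?charset?B?base64-payload?=
_, inner_charset, inner_encoding, payload, _ = value.split("?")
raw = base64.b64decode(payload)
print(inner_charset, inner_encoding, raw)
```

The value that comes out of the RFC 2231 layer still begins with `=?UTF-8?B?` and ends with `?=`, which is the symptom described above: the parameter was RFC 2047 encoded first and then percent-escaped on top of that.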
Assignee: mscott → jshin
Product: MailNews → Core
Blocks: 270670
Blocks: 136676
Should another ticket be opened for thunderbird? It has the same problem.
No. TB and Mozilla suite share the code, which is why it's filed under 'Core:MailNews attachment'
I did some testing to check which MUAs use RFC 2231. Here are the results so far:
- pine: ok
- evolution: ok
- kmail: ok
- mutt: ok
- thunderbird: NOK
- outlook express (from win98 up to xp): NOK
- sylpheed: NOK
Thanks for testing. I thought only Pine did it right :-). A more important question to ask (well, we have to live with the fact that a lot of people use MS products) is whether MS Outlook and Outlook Express understand RFC 2231 in incoming emails even though they don't use it for outgoing emails. While you're at it, can you check that? Another test to conduct is how popular web mail services and programs handle it. The results of the two tests will be an important factor in determining our default behavior. Btw, I've just made a patch to do the right thing when the 'parm folding' pref. is set to 2.
Status: NEW → ASSIGNED
Well, outlook doesn't like RFC 2231 encoded attachments. Below is how *outlook* sees such attachments when sent by each MUA listed. In all cases, the attachment was "bláblá.txt":

kmail -> ATT00045.txt
pine -> .txt (*)
evolution -> ATT00063.txt
thunderbird -> bláblá.txt
mutt -> ATT00089.txt
sylpheed -> bláblá.txt

(*): I have to revise the pine results. Here is what it sent:

--699906-769375173-1107351078=:29202
Content-Type: TEXT/plain; charset=US-ASCII; name*="ISO-8859-1''bl%C3%A1bl%C3%A1.txt"
Content-Transfer-Encoding: BASE64
Content-ID: <Pine.LNX.4.61L.0502021131180.29202@starway.conectiva>
Content-Description:
Content-Disposition: attachment; filename*="ISO-8859-1''bl%C3%A1bl%C3%A1.txt"

So, it said it was ISO-8859-1, but the filename is actually using UTF-8. I don't know now if it's a pine issue or some other misconfiguration on that machine which sent it.

And here is what each mailer actually sent:

kmail 1.7.92
============
--Boundary-00=_wLNAC/51m+M0Ch9
Content-Type: text/plain; charset="us-ascii"; name*=iso-8859-1''bl%E1bl%E1%2Etxt
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment; filename*="iso-8859-1''bl%E1bl%E1%2Etxt"

Evolution 2.0.1
===============
--=-aKNblJADXhdiBTYXWkPU
Content-Disposition: attachment; filename*=iso-8859-1''bl%E1bl%E1.txt
Content-Type: text/plain; name*=iso-8859-1''bl%E1bl%E1.txt; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit

Thunderbird 1.0
===============
--------------080408010609090603030907
Content-Type: text/plain; name="=?ISO-8859-1?Q?bl=E1bl=E1=2Etxt?="
Content-Transfer-Encoding: 8bit
Content-Disposition: inline; filename="=?ISO-8859-1?Q?bl=E1bl=E1=2Etxt?="

Mutt 1.5.6i
===========
--45Z9DzgjV8m4Oswq
Content-Type: text/plain; charset=iso-8859-1
Content-Disposition: attachment; filename*=iso-8859-1''bl%E1bl%E1%2Etxt
Content-Transfer-Encoding: 8bit

Sylpheed 0.9.10
===============
--Multipart=_Thu__3_Feb_2005_09_04_14_-0200_7Rd=QdMcWi8qx0gK
Content-Type: application/octet-stream;
 name="=?ISO-8859-1?Q?bl=E1bl=E1.txt?="
Content-Disposition: attachment; filename="=?ISO-8859-1?Q?bl=E1bl=E1.txt?="
Content-Transfer-Encoding: base64

Things look bad for webmails, though:

gmail (barfed completely) (NOK)
=====
------=_Part_1798_30712123.1107448921964
Content-Type: text/plain; name="bl"
Content-Transfer-Encoding: quoted-printable
Content-Disposition: attachment; filename="bl"

yahoo webmail (no encoding at all?) (NOK)
=============
--0-88921971-1107448623=:55346
Content-Type: text/plain; name="bláblá.txt"
Content-Description: bláblá.txt
Content-Disposition: inline; filename="bláblá.txt"

terra webmail (big Brazilian ISP) (NOK)
=============
--_=__=_XaM3_.1107449504.2A.268756.42.4689.52.42.007.1132790721
Content-Type: text/plain; name="=?iso-8859-1?Q?bl=E1bl=E1.txt?="
Content-Transfer-Encoding: quoted-printable
Content-Disposition: attachment; filename="=?iso-8859-1?Q?bl=E1bl=E1.txt?="

iG webmail (another big Brazilian ISP) (NOK)
==========
--Message-Boundary-by-Mail-Sender-1107449853
Content-type: text/plain; name="bláblá.txt"
Content-description: Arquivo anexado pelo iGmail
Content-transfer-encoding: 7BIT
Content-disposition: attachment

imp 3.2.2 (NOK)
=========
---MOQ1107460614fea1fb6b5e249914f2a48aab6361eef8
Content-Type: text/plain; name="bláblá.txt"
Content-Transfer-Encoding: base64
Content-Disposition: attachment; filename="bláblá.txt"
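The Pine observation above (the value is labelled ISO-8859-1 but the escaped bytes are UTF-8) can be reproduced with a few lines of Python. This is only a sketch for single-segment `charset'lang'value` parameters; `decode_ext_value` and its `override_charset` argument are names invented for this example.

```python
from urllib.parse import unquote_to_bytes


def decode_ext_value(ext_value, override_charset=None):
    """Decode a single-segment RFC 2231 value of the form charset'lang'%XX...

    override_charset is a hypothetical knob for this demo: it ignores the
    declared charset and decodes the bytes with another one instead.
    """
    value = ext_value.strip('"')          # Pine and kmail quote the value
    charset, _lang, encoded = value.split("'", 2)
    raw = unquote_to_bytes(encoded)       # undo the percent-escapes
    return raw.decode(override_charset or charset)


# Values copied from the Evolution and Pine messages quoted above.
evolution = "iso-8859-1''bl%E1bl%E1.txt"
pine = "\"ISO-8859-1''bl%C3%A1bl%C3%A1.txt\""

print(decode_ext_value(evolution))       # decoded with the declared charset
print(decode_ext_value(pine))            # declared charset gives mojibake
print(decode_ext_value(pine, "utf-8"))   # the bytes are really UTF-8
```

A receiver that trusts the declared charset would presumably show the mojibake form for the Pine message, which matches the "not-decoded UTF-8" display reported for Kmail later in this bug.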
Thanks for testing. That's what I expected. One possible compromise is to use RFC 2047 for the 'name' parameter in the C-T header while using RFC 2231 for the 'filename' parameter. Re: Pine. Pine's multilingual support (as released by the UW Pine team) leaves a lot to be desired. There's a set of patches made by me and Bernhard Kaindl (of SuSE Linux) at http://www.suse.de/~bk/pine/FAQ.html
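For illustration, the compromise suggested above could look roughly like the following Python sketch: the Content-Type `name` parameter gets an RFC 2047 encoded-word (what non-compliant clients tend to understand), while the Content-Disposition `filename` parameter gets RFC 2231 encoding. The function names are hypothetical, UTF-8 is an assumed charset, and the sketch ignores the length limit on encoded-words and header folding.

```python
import base64
from urllib.parse import quote


def rfc2047_b(text, charset="UTF-8"):
    """Build one RFC 2047 'B' encoded-word (no splitting of long words)."""
    payload = base64.b64encode(text.encode(charset)).decode("ascii")
    return "=?%s?B?%s?=" % (charset, payload)


def rfc2231(text, charset="UTF-8"):
    """Build a single-segment RFC 2231 extended value with an empty lang."""
    return "%s''%s" % (charset, quote(text, safe="", encoding=charset))


def attachment_headers(filename, mime_type="text/plain"):
    """Return the two headers of the compromise as a list of strings."""
    return [
        'Content-Type: %s; name="%s"' % (mime_type, rfc2047_b(filename)),
        "Content-Disposition: attachment; filename*=%s" % rfc2231(filename),
    ]


for header in attachment_headers("bláblá.txt"):
    print(header)
```

Whether a given client prefers `name` or `filename` when both are present is not something this sketch can answer; that is what the testing in this bug is about.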
Evolution showed all emails with the correct attachment name (except the Gmail one, which truncated the name), even those without RFC 2231 compliance. Same for Kmail (except for the Pine email, whose attachment was shown as undecoded UTF-8). mutt correctly displayed only the emails which were either RFC 2231 encoded or which had no encoding at all. What a mess...
Finally, regarding how Thunderbird 1.0 reads/displays all those emails, it behaves like Evolution: shows everything correctly except for the gmail one, where there is nothing to show (truncated name). Ok, I'm done :)
Thanks again for testing. As for web mails, I don't care what they do with outgoing emails. (As you found out, Mozilla can deal with most cases because I made multiple layers of fallbacks.) I don't expect them to do the right thing, given that virtually all of them are very poor in terms of standards compliance in general and I18N support in particular (Google's track record of standards compliance is not so good either). Our concern is how they handle RFC 2231 in *incoming* emails. Perhaps we have to make Mozilla use RFC 2231 by default unless major mail clients (including web mail services) make attachments with RFC 2231 filename params totally unavailable/invisible. Not being able to get the original filename is not critical.
Attached patch: patch
This fixes the problem mentioned in comment #1 and makes RFC 2231 encoding the default. It shouldn't be a problem for ASCII-only, relatively short (< ~70 chars) file names even if mail clients at the other end don't understand RFC 2231, because we don't use RFC 2231 for such names. For non-ASCII filenames, it's a little inconvenient to lose the original file name (the file content is NOT lost), but that's the price to pay for using non-compliant mail clients like MS Outlook and MS OE. Anyway, one can revert to the old behavior by setting 'mail.strictly_mime_paramfolding' to 0 or 1.
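The rule described in this comment (plain quoted parameter for short ASCII names, RFC 2231 otherwise) can be sketched in Python as below. This is not the code from the patch: the function name, the 70 and 60 character limits, and the choice of UTF-8 are assumptions made for the illustration, and the 'lang' field is left empty as suggested in the original report.

```python
from urllib.parse import quote


def filename_param(filename, charset="UTF-8", max_plain_len=70, segment_len=60):
    """Return the filename parameter(s) for a Content-Disposition header.

    Short ASCII names stay as a plain quoted-string. Everything else is
    percent-encoded per RFC 2231 and split into continuations when long.
    """
    if filename.isascii() and len(filename) <= max_plain_len:
        escaped = filename.replace("\\", "\\\\").replace('"', '\\"')
        return ['filename="%s"' % escaped]

    encoded = quote(filename, safe="", encoding=charset)
    if len(encoded) <= segment_len:
        return ["filename*=%s''%s" % (charset, encoded)]

    # Split into segments without cutting a %XX escape in half.
    segments = []
    pos = 0
    while pos < len(encoded):
        end = min(pos + segment_len, len(encoded))
        if end < len(encoded):
            if encoded[end - 1] == "%":
                end -= 1
            elif encoded[end - 2] == "%":
                end -= 2
        segments.append(encoded[pos:end])
        pos = end

    params = []
    for index, segment in enumerate(segments):
        prefix = charset + "''" if index == 0 else ""
        params.append("filename*%d*=%s%s" % (index, prefix, segment))
    return params


print(filename_param("report.txt"))
print(filename_param("bláblá.txt", charset="iso-8859-1"))
print(filename_param("á" * 40 + ".txt"))
```

The returned parameters would be joined with `;` and folded across header lines by the caller. A client that does not understand RFC 2231 sees no usable `filename` in the second and third cases, which is the inconvenience this comment accepts.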
Attachment #173366 - Flags: superreview?(bienvenu)
Attachment #173366 - Flags: review?(mscott)
Comment on attachment 173366 [details] [diff] [review] patch

+ {
+ PR_FREEIF(encodedRealName);
+ encodedRealName = PL_strdup(real_name);

You can just use PR_Free here and a couple of lines later; PR_Free checks for null. I'm not sure if you need moa for the nsEscape change. Darin or bzbarsky, perhaps.
Attachment #173366 - Flags: review?(mscott) → review+
Thanks for r, David. bz and darin, do you have anything to say about adding 'unconditional' to nsEscape (%-escape everything)? We need it for pesky encodings like SJIS and Big5 that 'infringe upon' the ASCII range. Alternatively, we can make it figure out whether an octet in the ASCII range is part of a multibyte character. We might need that anyway (if we do that, it has to be a separate function or an optional parameter).
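To show what 'infringing upon the ASCII range' means in practice, here is a small Python sketch. It is not nsEscape: both functions are hypothetical stand-ins, one for an escaper that leaves printable ASCII bytes alone and one for the proposed escape-everything mode.

```python
def escape_conditional(raw):
    """Hypothetical escaper: leave printable ASCII bytes (except '%') as-is."""
    return "".join(
        chr(b) if 0x20 <= b < 0x7F and b != 0x25 else "%%%02X" % b
        for b in raw
    )


def escape_unconditional(raw):
    """Hypothetical escaper: %XX-escape every byte, whatever its value."""
    return "".join("%%%02X" % b for b in raw)


# In Shift_JIS, the second byte of this character is 0x5C, the ASCII backslash.
raw = "表.txt".encode("shift_jis")

print(raw)                        # the trail byte shows up as a backslash
print(escape_conditional(raw))    # a stray backslash survives in the output
print(escape_unconditional(raw))  # pure %XX, nothing left to misread
```

With the conditional variant, the trail byte of the multibyte character is passed through as a literal backslash, so later processing can mistake it for an ASCII character. Escaping every byte avoids having to know where the multibyte characters begin and end.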
Comment on attachment 173366 [details] [diff] [review] patch r was from David. let's ask bz for sr because it includes a change in 'nsEscape'.
Attachment #173366 - Flags: superreview?(bienvenu) → superreview?(bzbarsky)
Comment on attachment 173366 [details] [diff] [review] patch I really don't know nsEscape well enough to sr this...
Attachment #173366 - Flags: superreview?(bzbarsky) → superreview?(darin)
Comment on attachment 173366 [details] [diff] [review] patch

>Index: xpcom/io/nsEscape.h
>+ unconditional = 0 /**< escape everything */

url_Unconditional or url_All would seem like more consistent names for this. Remember that this guy lives at global scope, so a name like unconditional is much more likely to interfere with someone else's code than one with the url_ prefix. Also, it's very unclear to me what this means. Are you saying that each byte will be %xx escaped unconditionally? Please make that clear in the comment if so.

>Index: mailnews/compose/src/nsMsgCompUtils.cpp
>+ // RFC 2047 style encoding (it's not standard-compliant)
>+ if (parmFolding != 2) {

nit: can you write parmFolding == 0, or do you really mean != 2?

>+ PR_FREEIF(encodedRealName);
>+ encodedRealName = PL_strdup(real_name);

nit: there's a little bit of mixing of allocators here. PR_Free is not necessarily the same as PL_strfree, which is required to free the results of PL_strdup. Of course, I think we mix these everywhere, and in fact PR_Free and PL_strfree are equivalent in mozilla today, but we should try not to code to that assumption. If it's difficult to fix, then don't bother.

sr=darin
Attachment #173366 - Flags: superreview?(darin) → superreview+
In my build:
Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8b2) Gecko/20050329
the pref in prefs.js is:
mail.strictly_mime_parm_folding
not:
mail.strictly_mime.parm_folding
which is checked here:
http://lxr.mozilla.org/seamonkey/source/mailnews/compose/src/nsMsgCompUtils.cpp#800
and this is what I added to my prefs.js. How should it be?
Thanks for catching it. I've just fixed the 'typo'. I'm sorry, I also forgot to mark this as fixed. I addressed the reviewers' concerns except for darin's (the alloc/free mix-up), because that is quite involved.
Status: ASSIGNED → RESOLVED
Closed: 20 years ago
Resolution: --- → FIXED
*** Bug 308839 has been marked as a duplicate of this bug. ***
The fix causes incompatibilities with most e-mail clients, including Outlook and Outlook Express. See: bug 309566, bug 317972, bug 305650, bug 314116, bug 323388, bug 323390. It also seems to be causing international problems: https://bugzilla.mozilla.org/show_bug.cgi?id=309566#c17
Depends on: 309566
Product: Core → MailNews Core