Bug 87653 - Message body contents are not displayed when Content-Type header is folded, doesn't handle whitespace (boundary="abc [CRLF] xyz"[CRLF] is specified, but --abcxyz is used for boundary line in mail)
Status: RESOLVED WONTFIX
Whiteboard: [patchlove][has draft patch]
Keywords: dataloss, helpwanted, testcase
Product: MailNews Core
Classification: Components
Component: MIME
Version: Trunk
Platform: x86 Windows NT
Importance: -- major
Target Milestone: ---
Assigned To: Denis Antrushin
Mentors:
URL: http://www.faqs.org/rfcs/rfc2822.html
Depends on:
Blocks: 234547

Reported: 2001-06-25 11:19 PDT by Navin Gupta
Modified: 2013-01-10 22:04 PST
CC List: 18 users
mozilla: in‑testsuite?


Attachments
Body of an email that shows blank (4.28 KB, text/plain)
2001-09-09 20:41 PDT, Richard Ekle
no flags
remove rfc822's line continuations (504 bytes, patch)
2001-11-02 06:50 PST, Denis Antrushin
mozilla: review-
mozilla: superreview-
stip line continuations (2.10 KB, patch)
2009-09-02 14:26 PDT, Denis Antrushin
no flags

Description Navin Gupta 2001-06-25 11:19:57 PDT
From Bugzilla Helper:
User-Agent: Mozilla/4.7 [en]C-AOLNSCP  (WinNT; U)
BuildID:    2001-06-25-04-trunk

I have a message in my inbox for which the body does not get displayed
in the message pane. If you do View | Message Source you can see the
source. Also, it works fine on 4.x. I can send the message to the person
who will work on this bug. It also happens on 2001062004.

Reproducible: Always
Steps to Reproduce:
1. Select the message

Actual Results:  The contents are not displayed. 

Expected Results:  The contents should be displayed.
Comment 1 Navin Gupta 2001-06-25 11:21:12 PDT
If I forward the message, it gets displayed.
Comment 2 Richard Ekle 2001-09-09 20:41:56 PDT
Created attachment 48818
Body of an email that shows blank
Comment 3 Richard Ekle 2001-09-09 20:48:11 PDT
I can confirm that this bug is still happening in Mozilla 0.9.3.  I added an
attachment containing a complete email that is exhibiting this problem. 
Hopefully that will help fix the bug.
Comment 4 Denis Antrushin 2001-11-02 05:17:21 PST
This is because the Content-Type header of that message is folded (in RFC 822
terms):
Content-Type: multipart/alternative; boundary="=_alternative 
    0011E5AD86256AC0_="

According to RFC 822, the (unfolded) boundary value should be
"=_alternative 0011E5AD86256AC0_=" (the CRLF and all spaces at the beginning of
the next line are replaced with a single space), while in Mozilla it is
"=_alternative     0011E5AD86256AC0_=" (the extra spaces are not removed).

This is a bug in MIME_StripContinuations (mozilla/mailnews/mime/src/mimehdrs.cpp).
But there is yet another problem with that header, and I can't find an answer in
the RFCs yet: the first line has whitespace at the end: ..."=_alternative<SPACE><CRLF>
What should be done with the SPACE? I believe that spaces at the end of a line
should be trimmed, but this is a quoted string, so what should be done with this
trailing space? If we don't remove it, the message still will not be displayed
(we'll have a double space after the word 'alternative', but the separator
actually used has only one space).
If we remove that whitespace unconditionally... is it safe?

Does anyone here have the Lotus Notes mailer? If so, could you please send
me (adu@sparc.spb.su) a small message with an attachment?

P.S.: I think that platform/OS should be All; it's not a Windows-only bug :-)
Comment 5 Denis Antrushin 2001-11-02 06:50:42 PST
Created attachment 56241
remove rfc822's line continuations
Comment 6 Denis Antrushin 2001-11-06 05:00:53 PST
Did I understand rfc2822 right: folding whitespace
([*WSP CRLF] 1*WSP) is semantically equivalent to just whitespace
(even inside of quoted string)? In that case:

Index: mimehdrs.cpp
===================================================================
RCS file: /cvsroot/mozilla/mailnews/mime/src/mimehdrs.cpp,v
retrieving revision 1.52
diff -u -r1.52 mimehdrs.cpp
--- mimehdrs.cpp        2001/09/28 20:07:43     1.52
+++ mimehdrs.cpp        2001/11/06 12:50:49
@@ -807,7 +807,10 @@
                /* p2 runs ahead at (CR and/or LF) */
                if ((p2[0] == nsCRT::CR) || (p2[0] == nsCRT::LF))
                {
-            p2++;
+                       p2++;
+                       while (nsCRT::IsAsciiSpace(*p1)) p1--;
+                       while (*p2 && nsCRT::IsAsciiSpace(*p2)) p2++;
+                       if (*p2) *p1++ = ' ';
                } else {
             *p1++ = *p2++;
         }

Or even just like this (isn't it too risky?):
if (nsCRT::IsAsciiSpace(*p2)) {
    p2++;
    while ( *p2 && nsCRT::IsAsciiSpace(*p2)) p2++;
    if (*p2) *p1++ = ' ';
} else {
    *p1++ = *p2++; 
}
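
For readers without a Mozilla tree, the intent of the two variants above can be sketched in Python. This is only an illustration, not the MIME_StripContinuations code: the name `collapse_folds` is hypothetical, and the sketch assumes the rule under discussion (whitespace, line break, whitespace becomes one space).

```python
def collapse_folds(value: str) -> str:
    """Replace each run of [WSP] CRLF [WSP] with a single space.

    Hypothetical helper mirroring the intent of the patch above;
    it is not the Mozilla implementation.
    """
    out = []          # characters kept so far (the role of p1 in the patch)
    i = 0             # read position (the role of p2 in the patch)
    n = len(value)
    while i < n:
        ch = value[i]
        if ch in "\r\n":
            # Drop whitespace that was already written before the line break.
            while out and out[-1] in " \t":
                out.pop()
            # Skip the line break and any whitespace that follows it.
            while i < n and value[i] in " \t\r\n":
                i += 1
            # Emit one space unless the fold was at the very end.
            if i < n:
                out.append(" ")
        else:
            out.append(ch)
            i += 1
    return "".join(out)


folded = 'multipart/alternative; boundary="=_alternative \r\n    0011E5AD86256AC0_="'
print(collapse_folds(folded))
```

With this rule the trailing space before the CRLF and the four leading spaces after it end up as one space, which is the value the message in this bug needs.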

In any case, breaking a line in the middle of a quoted string is a bad idea
on the part of the Lotus Notes mailer, I think :-)

Comment 7 Dan 2004-04-16 05:23:11 PDT
Bug 226502 may be related.
Comment 8 Wayne Mery (:wsmwk, NI for questions) 2008-06-16 13:49:00 PDT
has patch, needs owner :)

will be challenging to find dupes.
bug 317263?
Comment 9 Wayne Mery (:wsmwk, NI for questions) 2009-03-21 11:49:44 PDT
ran testcase, still fails.
Comment 10 David :Bienvenu 2009-03-24 12:51:38 PDT
this looks like the right thing to do, actually. But we need an hg patch to start with, and one that doesn't have tabs, etc.
Comment 11 David :Bienvenu 2009-03-24 12:52:45 PDT
Comment on attachment 56241
remove rfc822's line continuations

we should also have the while and if clauses on their own lines. I can try this in my own tree and see what happens. We'd also want a test case for this.
Comment 12 David :Bienvenu 2009-03-24 13:22:33 PDT
Comment on attachment 56241
remove rfc822's line continuations

the patch doesn't apply, and if I fix it to apply, and then fix it to compile by using NS_IsAsciiWhitespace, it still doesn't work - in fact, this code doesn't seem to get hit.
Comment 13 David :Bienvenu 2009-03-24 13:28:45 PDT
My suspicion is that you'd want to fix MimeHeaders_get to strip continuations correctly.
Comment 14 Gary Kwong [:gkw] [:nth10sd] 2009-06-19 00:42:00 PDT
Comment on attachment 56241
remove rfc822's line continuations

Obsoleting the patch due to rejected review.
Comment 15 Gary Kwong [:gkw] [:nth10sd] 2009-06-19 00:42:38 PDT
Denis, any chance of an updated patch?
Comment 16 Wayne Mery (:wsmwk, NI for questions) 2009-08-08 14:48:56 PDT
(In reply to comment #15)
> Denis, any chance of an updated patch?

won't be hearing from Denis, his address bounces.
Comment 17 Denis Antrushin 2009-08-17 02:52:00 PDT
I lost the password for my old account, so I could not update it with my new
email address. I'm surprised this bug wasn't fixed in the 8 years since I left
active work on Mozilla. :-)
I cannot promise an updated patch anytime soon; to make it, I will have to learn
how to develop Mozilla again.
Comment 18 Denis Antrushin 2009-09-02 14:26:10 PDT
Created attachment 398222
stip line continuations

MimeHeaders_get is not used to strip continuations for the boundary parameter.
In MimeMultipart_initialize() we have:

  ct = MimeHeaders_get (object->headers, HEADER_CONTENT_TYPE, PR_FALSE, PR_FALSE);
  mult->boundary = (ct
          ? MimeHeaders_get_parameter (ct, HEADER_PARM_BOUNDARY, NULL, NULL)
          : 0);

And in MimeHeaders_get_parameter we have:

  rv = mimehdrpar->GetParameterInternal(header_value, parm_name, charset,
                                        language, getter_Copies(result));

It is nsMIMEHeaderParamImpl::GetParameterInternal that strips continuations
improperly:

   // if the parameter spans across multiple lines we have to strip out the
   //     line continuation -- jht 4/29/98
   nsCAutoString tempStr(valueStart, valueEnd - valueStart);
   tempStr.StripChars("\r\n");
   *aResult = ToNewCString(tempStr);
   NS_ENSURE_TRUE(*aResult, NS_ERROR_OUT_OF_MEMORY);
   return NS_OK;

The attached patch fixes the problem.
Note, however, that I haven't hacked on Mozilla for the last 8 years,
so this patch most likely won't be acceptable as is :-)
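
To see why deleting only CR and LF leaves the body blank for this message, here is a minimal Python sketch. The helper names and the short body are made up for illustration; the delimiter in the body uses a single space, as described in comment 4.

```python
import re


def boundary_after_strip_crlf(content_type: str) -> str:
    # Delete CR and LF only, as the StripChars call quoted above does,
    # then pull out the quoted boundary parameter.
    unfolded = content_type.replace("\r", "").replace("\n", "")
    match = re.search(r'boundary="([^"]*)"', unfolded)
    if match is None:
        raise ValueError("no quoted boundary parameter found")
    return match.group(1)


def count_delimiter_lines(body: str, boundary: str) -> int:
    # A part starts at a line that is exactly "--" + boundary.
    # (Simplified: trailing padding and the closing "--" form are ignored.)
    delimiter = "--" + boundary
    return sum(1 for line in body.split("\r\n") if line == delimiter)


content_type = ('multipart/alternative; boundary="=_alternative \r\n'
                '    0011E5AD86256AC0_="')
# Made-up body; the delimiter has a single space, as reported in comment 4.
body = ("--=_alternative 0011E5AD86256AC0_=\r\n"
        "Content-Type: text/plain\r\n"
        "\r\n"
        "hello\r\n"
        "--=_alternative 0011E5AD86256AC0_=--\r\n")

extracted = boundary_after_strip_crlf(content_type)
print(repr(extracted))
print(count_delimiter_lines(body, extracted))
print(count_delimiter_lines(body, "=_alternative 0011E5AD86256AC0_="))
```

The extracted value keeps all five spaces, so no line in the body matches it and no part is found; the one-space value finds the part.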
Comment 19 Gary Kwong [:gkw] [:nth10sd] 2009-09-02 21:43:58 PDT
Denis, please attach a patch that excludes ogg stuff - I don't think ogg relates to MIME, does it? :)

Also, please do a `hg diff` or `hg export` from the comm-central folder, not the comm-central/mozilla folder.

After that, you should be set to request review from bienvenu.

I could get the patch to work once I worked around these. :)
Comment 20 Gary Kwong [:gkw] [:nth10sd] 2009-09-02 21:45:34 PDT
Also, assigning to Denis.
Comment 21 WADA 2009-09-03 01:04:52 PDT
(In reply to comment #6)
> Did I understand rfc2822 right: folding whitespace
> ([*WSP CRLF] 1*WSP) is semantically equivalent to just whitespace
> (even inside of quoted string)? In that case:

My understanding of folding/unfolding of message header defined by RFC 2822 is as follows.
  - Folding  : Insert a [CRLF] before a (single) WSP
    Unfolding: Remove the [CRLF] before a WSP (the single WSP should be kept)
  - A [CRLF] for folding can be inserted before any WSP in a message header,
    although the RFC recommends folding at higher-level syntactic breaks.
    e.g. Content-Disposition: attachment; filename="abc  xyz.txt"
         => attachment;[CRLF] filename="abc  xyz.txt"
            instead of attachment; filename="abc [CRLF] xyz.txt"
  - Interpretation of a folded message header should be done after unfolding.
So, I think the message header has to be interpreted as follows.
> Content-Type: multipart/alternative; boundary="=_alternative     0011E5AD86256AC0_=".

Denis Antrushin, what do you mean by "folding whitespace"?
Is this bug about making Tb tolerate a message header that was created wrongly because of a bug in an old mailer or old mail server?

The quirk proposed in this bug will cause a real problem with the following VALID header, which has spaces in the boundary delimiter.
> Content-Type: multipart/xxx; boundary="   ...   ...[CRLF]   ..."[CRLF]
> (boundary delimiter line)
> [CRLF]--   ...   ...   ...[CRLF]
Please note that a space is a valid character in a boundary delimiter.
> http://tools.ietf.org/html/rfc2046
> boundary := 0*69<bchars> bcharsnospace
> bchars := bcharsnospace / " "
> bcharsnospace := DIGIT / ALPHA / "'" / "(" / ")" /
>                      "+" / "_" / "," / "-" / "." /
>                      "/" / ":" / "=" / "?"
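
The grammar quoted above can be written as a regular expression, which makes it easy to check that boundaries with embedded spaces are legal while a trailing space is not. A sketch in Python; `is_valid_boundary` is a hypothetical helper name.

```python
import re

# bcharsnospace := DIGIT / ALPHA / "'" / "(" / ")" / "+" / "_" / "," /
#                  "-" / "." / "/" / ":" / "=" / "?"
BCHARS_NOSPACE = r"[0-9A-Za-z'()+_,\-./:=?]"
# bchars := bcharsnospace / " "
BCHARS = r"[0-9A-Za-z'()+_,\-./:=? ]"
# boundary := 0*69<bchars> bcharsnospace
BOUNDARY_RE = re.compile(rf"{BCHARS}{{0,69}}{BCHARS_NOSPACE}")


def is_valid_boundary(value: str) -> bool:
    """True if value matches the RFC 2046 boundary production."""
    return BOUNDARY_RE.fullmatch(value) is not None


for candidate in ["=_alternative 0011E5AD86256AC0_=", "  abc    xyz", "abc "]:
    print(repr(candidate), is_valid_boundary(candidate))
```

The first two candidates are accepted and the last one is rejected, because the final character of a boundary must not be a space.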

AFAIK, the following quirk already exists:
  "remove the WSP used for folding and the spaces that follow it" in name="abc[CRLF]   def.txt"
  in the Content-Type: header. (I don't know about filename in Content-Disposition:)
AFAIR, the reason for the quirk was that such headers were generated by MS software.
So, I'm not opposed to implementing a quirk for this bug's case.
However, care should be taken not to break the VALID header case above, because the quirk proposed in this bug would apparently make Tb violate the RFC for that VALID header.
Note:
The quirk on the name parameter won't cause a real problem, because it only affects a file name.

Is the quirk from this bug still required? Do many mailers still send mail with this kind of header?
Note: The header was generated by a Beta of the first version of Lotus Notes in 2001.
> X-Mailer: Lotus Notes Build M10_08082001 Beta 3 August 08, 2001
Comment 22 Denis Antrushin 2009-09-03 03:08:05 PDT
(In reply to comment #21)
> My understanding of folding/unfolding of message header defined by RFC 2822 is
> as follows.
>   - Folding  : Insert a [CRLF] before a (single) WSP

Section 2.2.3 says:
   The general rule is that wherever this standard allows for folding 
   white space (not simply WSP characters), a CRLF may be inserted 
   before any WSP.

Section 3.2.3 defines folding whitespace (FWS) as

FWS             =       ([*WSP CRLF] 1*WSP) /   ; Folding white space
                        obs-FWS
obs-FWS         =       1*WSP *(CRLF 1*WSP)     ; obsolete FWS

I.e., there may be any whitespaces before and at least one after CRLF.
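
The FWS production quoted above maps onto a regular expression quite directly. The following Python sketch assumes WSP means space or horizontal tab and deliberately leaves out the obs-FWS alternative; `is_fws` is a hypothetical helper name.

```python
import re

# FWS = ([*WSP CRLF] 1*WSP), with WSP = SP / HTAB (RFC 2822, section 3.2.3).
# The obs-FWS alternative is not covered by this sketch.
FWS_RE = re.compile(r"(?:[ \t]*\r\n)?[ \t]+")


def is_fws(text: str) -> bool:
    """True if the whole string is one FWS according to the rule above."""
    return FWS_RE.fullmatch(text) is not None


assert is_fws(" ")             # plain whitespace, no fold at all
assert is_fws(" \r\n    ")     # whitespace before and after the CRLF
assert is_fws("\r\n ")         # fold followed by a single WSP
assert not is_fws(" \r\n")     # nothing after the CRLF
assert not is_fws("\r\n")      # a bare CRLF ends the header field instead
```

Note that the fold in this bug's header, a space, CRLF and four spaces, is a single FWS under this grammar.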

>     Unfolding: Remove the [CRLF] before a WSP (the single WSP should be kept)
>   - A [CRLF] for folding can be inserted before any WSP in a message header,

Section 2.2.3:
      Unfolding is accomplished by simply removing any CRLF
      that is immediately followed by WSP.
but in Section 3.2.3 we read:
      Runs of FWS, comment or CFWS that occur between lexical 
      tokens in a structured field header are semantically 
      interpreted as a single space character.

Does it mean that the _whole_ FWS is interpreted as a single space?

> Denis Antrushin, what do you mean by "folding whitespace"?

This is a term from RFC 2822.

> Is this bug about making Tb tolerate a message header that was created wrongly
> because of a bug in an old mailer or old mail server?

I have no idea.

> Is the quirk from this bug still required? Do many mailers still send mail with
> this kind of header?
> Note: The header was generated by a Beta of the first version of Lotus Notes
> in 2001.
> > X-Mailer: Lotus Notes Build M10_08082001 Beta 3 August 08, 2001

I have no idea, either.
I've been out of Mozilla development for 8 years and was surprised to see
this bug still open and to be asked for an updated patch :-)
Also, the two bugs mentioned in this report (234547 and 317263) seem to be
different issues to me.
Comment 23 WADA 2009-09-05 00:15:55 PDT
Thanks for pointing out the RFC description.

As you say, the issue is whether each of the following, (i) and (ii), is true or false.
  (i)  Spaces in a quoted string are "folding white space".
  (ii) Spaces in a quoted string are "Runs of FWS, comment or CFWS that occur
       between lexical tokens in a structured field header".
In any case, the following (A) should be interpreted after conversion to (B) by unfolding.
> (A) Content-Type: multipart/xxx; boundary="  ...  [CRLF]  ..."[CRLF]
> (B) Content-Type: multipart/xxx; boundary="  ...    ..."[CRLF]
Because I have seen many headers of the form Content-Type: aa/bb; name="xx[CRLF] yy.ext"[CRLF] in bugs, and none of those bugs described them as an RFC violation, I think (i) is true. Because the spaces are inside a quoted string, which is itself a token (semantically the same as a word), I think (ii) is false.

However, if { number of mails with (P) >> number of mails with (Q) } && { number of mails with (Q) is negligible } && { number of mails with (P) is still not so small }, the quirk proposed in this bug is practically acceptable.

                           Boundary in Content-Type:             Used boundary
(P) This bug's case      : boundary="abc [CRLF]   xyz"[CRLF]     --abcxyz
(Q) Sample in Comment #9 : boundary="  abc  [CRLF]  xyz"[CRLF]   --  abc    xyz
(R) Apparently valid one : boundary="  abc    xyz"[CRLF]         --  abc    xyz
Comment 24 Wayne Mery (:wsmwk, NI for questions) 2010-01-01 10:51:04 PST
Denis, will you be following up on the draft patch?
current procedure is https://developer.mozilla.org/En/Developer_Guide/How_to_Submit_a_Patch
Comment 25 Denis Antrushin 2010-01-19 11:12:05 PST
I can update the patch, but what about concerns expressed in comments #21 and 23?
If the root of the problem is a broken Lotus mailer and the fix could break valid
messages, do we want to fix it? The bug is 8 years old, does not seem to have
caused much trouble to anyone except the submitter, and has no votes.

Also note that a fix for this bug won't fix bug 234547.
Comment 26 Gary Kwong [:gkw] [:nth10sd] 2010-01-29 23:32:18 PST
(In reply to comment #25)
> I can update the patch, but what about concerns expressed in comments #21 and

bienvenu / dmose, ping &/or thoughts?
Comment 27 Dan Mosedale (:dmose) 2010-02-08 11:47:57 PST
Comment on attachment 398222
stip line continuations

Adding review flags to get this onto David's radar.
Comment 28 JAmes Coleman 2010-06-01 09:56:24 PDT
I see the same problem.
With the latest Thunderbird 3.0 (and previous 2.x versions), a MIME boundary containing commas (and other non-alpha chars) doesn't get decoded. The MIME message is just shown inline as text, not decoded.

It comes from someone using a Nokia E65 phone.

2 messages okay:
  boundary="EPOC32-z1G82Qp+YQsMY44N0z3DsrBDJ5L2s9mKj+Ykc140ms_jxhkn"
  boundary="EPOC32-Lx73B+hVbJ5wSRpXTlLKyjwD8YWVf1hg8bHk+'Tssd4tJHbT"

5 messages not okay (note the one without a comma in the boundary; is a - at the end of the boundary a problem?):
  boundary="EPOC32-GQ-K-kR,QDS7Gvzz5VbDXdBbM477'73Sc_wWCYc,nmD9fKgf"
  boundary="EPOC32-NnWHRR1Qc18lTP8sP4YxqmvW0zz+RSFktwVyL,NZ3bG2k0L6"
  boundary="EPOC32-0VSK1,9DqPMqGctWX_PyNxx,2+FmXKn5b1DM9_K7yclDZlst"
  boundary="EPOC32-yMNqRYYmV'jRmXlDTj7rtscB8H9mtYrjHfxb0JFc7VyYK47-"
  boundary="EPOC32-WnvGdp2s03kDPHxjvTW,khRc1,4_C-q1BWv_lMZdg6_-CwBs"
Comment 29 JAmes Coleman 2010-06-01 13:54:51 PDT
No.
I'm wrong.
It's not the same problem. Sorry!

The problem I see seems to be because the email headers have a blank line inside the To: header. Email readers (Thunderbird/Evolution/Outlook/mutt) display the message body starting after the blank line in the To: header, and the MIME message type is not detected. The message came via Nokia + Hotmail.
munpack extracts the attachments okay.

 >From muh  Tue May 11 16:00:42 2010
Return-path: <muh@meh.mah>
Envelope-to: muh@meh.mah
Delivery-date: Tue, 11 May 2010 16:00:42 +0100
Received: from meh.hotmail.com ([meh.mah.meh.moo])
    by dspsrv.com with esmtp (Exim 4.71)
    (envelope-from <muh@meh.mah>)
    id 1OBqwr-0008WG-4B
    for muh@meh.mah Tue, 11 May 2010 16:00:42 +0100
Received: from meh ([meh.mah.meh.moo]) by meh.hotmail.com with Microsoft SMTPSVC(meh.mah.meh.moo);
     Tue, 11 May 2010 08:00:39 -0700
X-Originating-IP: [meh.mah.meh.moo]
X-Originating-Email: muh@meh.mah
Message-ID: <muh@meh.mah>
Received: from [meh.mah.meh.moo] ([meh.mah.meh.moo]) by meh.hotmail.com over TLS secured channel with Microsoft SMTPSVC(meh.mah.meh.moo);
     Tue, 11 May 2010 08:00:24 -0700
From: muh@meh.mah
Reply-to: muh@meh.mah
To: <muh@meh.mah>,
 <muh@meh.mah>, <muh@meh.mah>,
X-OriginalArrivalTime: 11 May 2010 15:00:26.0865 (UTC) FILETIME=[B0A6DE10:01CAF11A]
Date: 11 May 2010 08:00:26 -0700

<muh@meh.mah>, <muh@meh.mah>,
 <muh@meh.mah>, <muh@meh.mah>,
  <muh@meh.mah>,<muh@meh.mah>,
  <muh@meh.mah>, <muh@meh.mah>,
  <muh@meh.mah>,
  <muh@meh.mah>
Subject: Does this mean
Date: Tue, 11 May 2010 16:00:19 +0100
Message-ID: <muh@meh.mah>
X-Mailer: EPOC Email Version 2.10
MIME-Version: 1.0
Content-Language: i-default
Content-Type: multipart/mixed;
  boundary="EPOC32-NnWHRR1Qc18lTP8sP4YxqmvW0zz+RSFktwVyL,NZ3bG2k0L6"

This is a MIME Message

--EPOC32-NnWHRR1Qc18lTP8sP4YxqmvW0zz+RSFktwVyL,NZ3bG2k0L6
Content-Type: text/plain; charset=UTF-8
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable

text of message muhed mehed and mahed too
--EPOC32-NnWHRR1Qc18lTP8sP4YxqmvW0zz+RSFktwVyL,NZ3bG2k0L6
Content-Type: image/jpeg
Content-Disposition: attachment;
    filename="11052010.jpg"
Content-Transfer-Encoding: base64

/9j/4RusRXhpZgAASUkqAAgAAAAIAA8BAgAGAAAAbgAAABABAgAEAAAARTYz
ABIBAwABAAAAAQAAABoBBQABAAAAdAAAABsBBQABAAAAfAAAACgBAwABAAAA
AgAAABMCAwABAAAAAQAAAGmHBAABAAAAhAAAAKoBAABOb2tpYQAsAQAAAQAA
.
.
9pceYAFO7jp7VV1NAkxKD7wzg1E04yEmmivbSOjZwRnqPWr28FVPbINataXI
vpY//9k=

--EPOC32-NnWHRR1Qc18lTP8sP4YxqmvW0zz+RSFktwVyL,NZ3bG2k0L6--
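
The effect of the stray blank line can be reproduced with Python's standard `email` package. This is a cut-down, made-up version of the message above (the boundary `EPOC32-example` is a placeholder), shown only to illustrate that the first empty line ends the header block for any parser that follows the message format.

```python
from email import message_from_string

raw = (
    "From: muh@meh.mah\n"
    "To: <muh@meh.mah>,\n"
    " <muh@meh.mah>,\n"
    "\n"                       # stray blank line: the header block ends here
    "<muh@meh.mah>\n"
    "Subject: Does this mean\n"
    "MIME-Version: 1.0\n"
    "Content-Type: multipart/mixed;\n"
    '  boundary="EPOC32-example"\n'
    "\n"
    "This is a MIME Message\n"
)

msg = message_from_string(raw)
print(msg["Content-Type"])       # header never seen by the parser
print(msg.get_content_type())    # falls back to the default type
print(msg.is_multipart())
print(msg.get_payload().splitlines()[0])
```

The Content-Type line ends up inside the body, so the parser reports no Content-Type header, falls back to text/plain and never looks for a boundary.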
Comment 30 David :Bienvenu 2010-06-23 15:05:16 PDT
Comment on attachment 398222
stip line continuations

sorry for the delay - this is core necko code so I can't technically review it.
Comment 31 Ronald J. Yacketta 2010-06-23 16:15:41 PDT
I am noticing something similar in bug #574155 where the Content-Type appears to have a line feed and spaces before the data

IE:

Content-Type:
 application/vnd.openxmlformats-officedocument.wordprocessingml.document;
 name="DocumentName.docx"

This causes TB to display a binary representation of the attachment in the message body; also, notepad.exe is used when opening the attachment.

The message in question was sent from Squirrel Mail 1.4.19; messages sent from TB itself do not have the line feed + space issue.

A bug has been filed with SM as I am not sure where the issue is.

-Ron
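
As far as unfolding goes, a header whose value starts on the continuation line is handled by the same rule as any other fold. A minimal Python sketch, assuming the RFC 2822 section 2.2.3 rule (remove any CRLF that is immediately followed by WSP); `unfold` is a hypothetical helper, not Thunderbird or SquirrelMail code.

```python
import re


def unfold(header: str) -> str:
    # Remove any CRLF that is immediately followed by a space or tab.
    return re.sub(r"\r\n(?=[ \t])", "", header)


raw = (
    "Content-Type:\r\n"
    " application/vnd.openxmlformats-officedocument.wordprocessingml.document;\r\n"
    ' name="DocumentName.docx"'
)

field_name, _, field_body = unfold(raw).partition(":")
media_type = field_body.split(";", 1)[0].strip().lower()
print(field_name)
print(media_type)
```

After unfolding, the media type is recovered intact even though nothing follows the colon on the first line.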
Comment 32 paul 2010-06-23 17:10:15 PDT
I believe Denis was more on track with his reading of the RFC.  I think WADA has an incorrect understanding of what to do with spaces after the CRLF in a fold.  The [CRLF] and any following whitespace should be treated as a single space.  It's more clear (with examples that speak to this issue) in RFC 822, section 3.1.1.  I realize that has been obsoleted by 2822, but I think that section is still relevant and puts this question to rest.

@Ron - this bug is a pretty good indicator that your issue is with Thunderbird
Comment 33 paul 2010-06-24 13:00:58 PDT
On more detailed reading, I think the problem is that the RFCs are simply unclear and contradictory.  

RFC 822 (section 3.1.1) is contradictory wherein its examples show that you can add any number of spaces after the CRLF and it is supposedly identical to a single space, but then it goes on to state that "Unfolding is accomplished by regarding CRLF immediately followed by a LWSP-char as equivalent to the LWSP-char."

RFC 2822 is just as contradictory, as section 2.2.3 states "Unfolding is accomplished by simply removing any CRLF that is immediately followed by WSP."  However, the definition of FWS is: ([*WSP CRLF] 1*WSP), which implies that a fold is any trailing spaces on the first line, the CRLF and any spaces after that (which should all be removed when unfolding).

I tend to think that the implied meaning is that unfolding is accomplished by removing the CRLF and all spaces both before and after it, but I'm not sure how many clients do this.  SquirrelMail does not.  WADA believes Thunderbird should not.  Who knows.
Comment 34 WADA 2010-06-25 02:23:03 PDT
(In reply to comment #33)

FYI.

Mail data attached to comment #0.
> X-Mailer: Lotus Notes Build M10_08082001 Beta 3 August 08, 2001
> Date: Thu, 6 Sep 2001 22:15:29 -0500

Following is Comment #4 by Denis Antrushin on 2001-11-02.
> According to RFC 822, the (unfolded) boundary value should be
> "=_alternative 0011E5AD86256AC0_="
> (the CRLF and all spaces at the beginning of the next line are replaced with a single space),
> while in Mozilla it is
> "=_alternative     0011E5AD86256AC0_="
> (the extra spaces are not removed).

RFC 2822:
> Request for Comments: 2822                         QUALCOMM Incorporated
> Obsoletes: 822                                                April 2001
> Category: Standards Track

"Lotus Notes Build M10_08082001 Beta 3 August 08, 2001" looks to have used RFC822 for folding, with bug of "excess space just before inserted [CRLF] for folding". "RFC822 folding/unfolding or RFC2822 folding/unfolding" was possibly an option of Lotus Notes, because Lotus Notes has an option for "Return-Receipt-To:" or "Disposition-Notification-To:".

Mozilla as of 2001-11-02 apparently applied RFC2822 rather than RFC822 unfolding.

RFC2822 defines the folding pattern produced by RFC822 and refers to the problems with RFC822 folding.
> 4.2. Obsolete folding white space
>   In the obsolete syntax, any amount of folding white space MAY be
>   inserted where the obs-FWS rule is allowed.  This creates the
>   possibility of having two consecutive "folds" in a line, and
>   therefore the possibility that a line which makes up a folded header
>   field could be composed entirely of white space.
>     obs-FWS         =       1*WSP *(CRLF 1*WSP)

paul@squirrelmail.org, your knowledge about header folding/unfolding looks based on RFC822. 
Please note that a boundary line of --=_alternative0011E5AD86256AC0_= is definitely an RFC violation on the mail sender's side, even if RFC822 unfolding is applied,
> RFC822  : boundary="=_alternative 0011E5AD86256AC0_="
> RFC2822 : boundary="=_alternative     0011E5AD86256AC0_="
although apparent bug of old Lotus Notes Beta is "adding a space before [CRLF] for RFC822 folding".

Regarding the quirk proposed in this bug:

If the pattern looks like the following, an automatic quirk of "application of RFC822 unfolding" plus a quirk of "remove space(s) just before [CRLF] for RFC822 folding" is possible.
> Content-Type: xxx/yyy; boundary="abcdefg[SP][CRLF]
> [SP] ... [SP][CRLF]  <== RFC violation, because space only line is invalid.
> [SP] ... [SP][Non-SP-chars]";[SP][CRLF]
> [SP] ... [CRLF]
However, in the following case, it is impossible to know which folding was used, and impossible to know whether the space before [CRLF] is a valid one or garbage produced by a mailer's bug.
> Content-Type: xxx/yyy; boundary="abcdefg[SP][CRLF]
> [SP] ....[SP][Non-SP-chars]"[CRLF]
If "RFC822 unfolding" + "quirk for space(s) just before [CRLF]" is still required, I think a folder option like the following existing one (for a wrong charset) would be better.
> [?] Apply default to all messages in the folder (individual message
>     character encoding settings and auto-detection will be ignored)

My questions are;
- "any number of added spaces after [CRLF] in RFC822 folding" is really
  applicable to RFC822 folding within quoted text as value of boundary
  parameter?
- "application of RFC822 unfolding" is still required for very old mails? 
- quirks of "remove space(s) just before [CRLF] for RFC822 folding" is still
  required?
- If the quirk is still required, can it be "remove any spaces in the boundary
  parameter" after RFC2822 unfolding?
  Folder Properties:
    [?] Remove space(s) in boundary parameter of multipart
        for tolerance of headers folded by RFC822 folding.
  With this kind of quirk, it could be applied to the bug 234547 case too.
  I guess the number of modern mailers that use a space in the boundary line is far
  smaller than the number of buggy mailers that produce problems like bug 234547.
Comment 35 paul 2010-06-25 14:08:24 PDT
Reply to comment #34

> "Lotus Notes Build M10_08082001 Beta 3 August 08, 2001" looks to have used
> RFC822 for folding, with bug of "excess space just before inserted [CRLF] for
> folding".

No, I think that's an incorrect interpretation of that header.  I think they were trying to follow RFC2822.  See below.

> paul@squirrelmail.org, your knowledge about header folding/unfolding looks
> based on RFC822. 

Why would you say that when I in fact quoted both 822 and 2822?  Please read with care.

> > RFC822  : boundary="=_alternative 0011E5AD86256AC0_="
> > RFC2822 : boundary="=_alternative     0011E5AD86256AC0_="

You put these here like this is the unquestioned way to unfold.  My point is that there is NOT a clear definition of how to unfold -- depending on the section of the RFC you are reading (either 822 OR 2822), you can make a case that all white space around a CRLF should be removed or that only one white space after the CRLF should be removed (or replaced with a single space).

So I believe that your claim about how to unfold could be argued to be wrong (in fact I think your interpretation of RFC822 unfolding IS wrong).  As I see it, these are the possible interpretations of how to unfold that header, depending on how you read the RFCs:

RFC822 : "=_alternative     0011E5AD86256AC0_="
("CRLF WSP ==> WSP"; per section 3.1.1, last sentence of 2nd to last paragraph)

RFC822 : "=_alternative  0011E5AD86256AC0_="
("CRLF 1*WSP ==> WSP"; per section 3.1.1 examples)

RFC2822 : "=_alternative     0011E5AD86256AC0_="
("CRLF WSP ==> WSP" ("CRLF is invisible"); per section 2.2.3 and most of section 3.2.3)

RFC2822: "=_alternative 0011E5AD86256AC0_="
("*WSP CRLF 1*WSP ==> WSP"; per last paragraph of section 3.2.3)

However, because the RFCs are both contradictory, I can't say which is right or which is wrong.  
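
The competing readings listed above are easy to compare mechanically. The following Python sketch expresses each one as a regular-expression substitution; the dictionary name and labels are only illustrative, and the two "CRLF WSP ==> WSP" readings (RFC822 and RFC2822) share one entry because they produce the same result.

```python
import re

# Each entry is (pattern, replacement). Labels follow the notation used above.
UNFOLD_RULES = {
    "CRLF WSP ==> WSP": (r"\r\n(?=[ \t])", ""),
    "CRLF 1*WSP ==> WSP": (r"\r\n[ \t]+", " "),
    "*WSP CRLF 1*WSP ==> WSP": (r"[ \t]*\r\n[ \t]+", " "),
}


def unfold_variants(folded: str) -> dict:
    """Apply every candidate rule to the same folded value."""
    return {label: re.sub(pattern, repl, folded)
            for label, (pattern, repl) in UNFOLD_RULES.items()}


folded = '"=_alternative \r\n    0011E5AD86256AC0_="'
for label, value in unfold_variants(folded).items():
    print(f"{label:26} {value}  ({value.count(' ')} space(s))")
```

The three rules leave five, two and one space respectively, matching the four values listed above.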

My FEELING is that the "CRLF is invisible" approach is best in that it allows the recipient to respect what the sender was doing with multiple spaces (as long as the spaces aren't fluff).  Otherwise, intentional spacing near a fold gets munged.  In the specific case of this old Lotus Notes header, the extra spaces are in fact fluff, which is THEIR problem IMO.

In that sense, I would say that this bug is INVALID and Thunderbird's current behavior is RFC-CORRECT.

> although apparent bug of old Lotus Notes Beta is "adding a space before [CRLF]
> for RFC822 folding".

This is NOT a bug in the sense that you suggest!  I believe this was probably INTENTIONAL.  That space is NOT "added."  It is NOT "garbage" as you suggest.  The space at the end of the line before the fold corresponds to the actual space that is in the real boundary string that is used later.  The fold happens after the space.  I think they left it on the end of the line before the fold so that it would not be removed by greedy unfolding (removal of all white space after CRLF).  It is part of the boundary string.  They seem to assume that unfolding would be "remove CRLF 1*WSP", which is, as far as I can tell, a misunderstanding of section 2.2.3, specifically "Unfolding is accomplished by simply removing any CRLF that is immediately followed by WSP," where they assumed removal of the CRLF AND the WSP, even though I believe the RFC is saying only the CRLF gets removed (which is further backed up in section 3.2.3).

HOWEVER, their assumption/misunderstanding of section 2.2.3 ends up being a semi-valid way to read RFC2822, in that it gives you the same result as what is described by the last paragraph of section 3.2.3: "Runs of FWS, comment or CFWS that occur between lexical tokens in a structured field header are semantically interpreted as a single space character."  

Given that the date on the Lotus Notes version in use is after the release of RFC2822, this is a plausible explanation.

> - "any number of added spaces after [CRLF] in RFC822 folding" is really
>   applicable to RFC822 folding within quoted text as value of boundary
>   parameter?

1) RFC822 section 3.1.1 is indeterminate on this point
2) I believe Lotus was following RFC2822, not RFC822

> - "application of RFC822 unfolding" is still required for very old mails?

That seems like a rat's nest; moreover, it's probably not possible to detect the difference, especially considering that some clients might be adding extra spaces on purpose or for fluff.

> - quirks of "remove space(s) just before [CRLF] for RFC822 folding" is still
>   required?

You misunderstand what Lotus was doing.  This is not a "quirk" or a "bug" per se.  They were trying to follow RFC2822.  I think this concern is unfounded and should be dropped.

Keep in mind that although you assume there is only one way to unfold in a RFC-2822-compliant manner, this is not necessarily the case.  So there is still another open question as to what the best way to unfold per RFC2822 is.  My gut feeling is that Thunderbird is already doing the right thing.

(My prior comment @Ron thus has to be taken back; SquirrelMail has a small bug if this is the case.)

It would be interesting to see an additional header added to tell recipients how to unfold:

X-HEADER-UNFOLDING: Remove CRLF
X-HEADER-UNFOLDING: Remove CRLF WSP
X-HEADER-UNFOLDING: Remove CRLF WSP+
X-HEADER-UNFOLDING: Remove WSP+ CRLF WSP
X-HEADER-UNFOLDING: Remove WSP+ CRLF WSP+
Comment 36 Gary Kwong [:gkw] [:nth10sd] 2010-07-31 01:10:31 PDT
Comment on attachment 398222
stip line continuations

Denis, I couldn't get the patch to apply cleanly as-is, were the liboggplay changes for FreeBSD intended to be in this patch as well?

Moreover the nsMIMEHeaderParamImpl.cpp file now seems to be in the mozilla/netwerk/mime/ directory.

Up for a new patch? :)
Comment 37 paul 2010-08-01 12:31:06 PDT
I don't think a Thunderbird patch should be what you want.  

As I said in my last (long) comment #35, "I would say that this bug is INVALID and Thunderbird's current behavior is RFC-CORRECT."

It is my opinion that if you find this "problem" with Thunderbird viewing emails, you need to contact the authors of the email client that was used to compose the offending message.
Comment 38 Kent James (:rkent) 2012-11-02 09:12:26 PDT
This bug ended up in Wayne's TheList, hence I am reviewing it, but looking over the content it seems to me that this really needs a ruling about whether a fix would even be accepted.

:squib and/or :jcranmer, could you review this and make some comments about whether it makes sense to fix this or not?
Comment 39 Joshua Cranmer [:jcranmer] 2012-11-02 10:04:27 PDT
What is the intent of the RFC? Quoting from 5322, 2.2.3:
The general rule is that wherever this specification allows for folding white space (not simply WSP characters), a CRLF may be inserted before any WSP.
[...]
Unfolding is accomplished by simply removing any CRLF that is immediately followed by WSP.

The confusion comes from RFC822's ambiguous definition for folding, the BNF for FWS, and this paragraph in 3.2.3:
   Runs of FWS, comment, or CFWS that occur between lexical tokens in a
   structured header field are semantically interpreted as a single
   space character.

If we concern ourselves solely with the legal definitions in RFC 2822 and 5322 (they are, to my knowledge, equivalent), then the interpretation that comes out is that continuations should be dealt with by simply stripping CRLF (string.replace(/\r|\n/g, ''), basically).

The intention in section 3.2.3 is to remind us that spaces and comments in structured headers are merely separators for the actual tokens (think whitespace and comments in C or C++) and have no semantic meaning whatsoever. FWS can occur within a lexical token, namely quoted strings, but as is mentioned earlier, folding requires that CRLF be stripped.

Given that I see several instances in RFC 5322 which state emphatically that CRLF (within FWS) is semantically invisible, it seems to me that the intent is that you could preprocess headers by deleting all CRLF from the header and there would be no semantic difference.
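
A small Python sketch of this reading: delete every CR and LF first, then tokenise. Whitespace between tokens then makes no difference, while whitespace inside a quoted string is kept exactly. The parser below is a deliberately tiny, hypothetical one (no comments, no backslash escapes, no RFC 2231 continuations), not the Thunderbird code.

```python
import re

PARAM_RE = re.compile(r';\s*([^\s=;]+)\s*=\s*(?:"([^"]*)"|([^\s;]*))')


def parse_params(header_value: str) -> dict:
    """Delete CR/LF, then collect name=value parameters (sketch only)."""
    unfolded = header_value.replace("\r", "").replace("\n", "")
    params = {}
    for m in PARAM_RE.finditer(unfolded):
        quoted, bare = m.group(2), m.group(3)
        params[m.group(1).lower()] = quoted if quoted is not None else bare
    return params


between_tokens = parse_params('multipart/mixed;\r\n   boundary="abc xyz"')
unfolded_form = parse_params('multipart/mixed; boundary="abc xyz"')
inside_quotes = parse_params('multipart/mixed; boundary="abc \r\n xyz"')

print(between_tokens == unfolded_form)
print(repr(inside_quotes["boundary"]))
```

A fold between tokens yields the same boundary as the unfolded header, whereas a fold inside the quoted string keeps both spaces around the deleted CRLF.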

Now, if there were clear evidence that CRLF + WSP* -> SP is the more common assumption and is necessary for compatibility, I would be persuaded to implement it. The lack of duplicates, votes, or complaining comments on this bug suggests to me that its impact is relatively minor, though.
Comment 40 Kent James (:rkent) 2012-11-03 13:49:38 PDT
Wayne: jcranmer says, which I tend to agree with, "The lack of duplicates, votes, or complaining comments on this bug suggests to me that its impact is relatively minor, though."

So why is this in The List?
Comment 41 Wayne Mery (:wsmwk, NI for questions) 2012-11-03 19:20:20 PDT
(In reply to Kent James (:rkent) from comment #40)
> So why is this in The List?

most likely because it had some of the attributes we are looking for - testcase, above average severity, good discussion. Plus draft patch, ... it looked ready to roll.  Beyond that, why I would have chosen it instead of other mime bugs would have been highly subjective - perhaps even random - especially given that it's highly unlikely I read all of the bug.
Comment 42 Joshua Cranmer [:jcranmer] 2013-01-10 22:04:40 PST
It's been two months since I made comment 39, and no evidence has been forthcoming that this bug is worth fixing. In the absence of such information, I am marking this as WONTFIX.
