Closed Bug 384842 Opened 17 years ago Closed 15 years ago

Specially crafted subject lines (certain length, space at the end) generated invalid mail headers

Categories

(Thunderbird :: Message Compose Window, defect)

defect
Not set
critical

Tracking

(Not tracked)

RESOLVED FIXED
Thunderbird 3.0b3

People

(Reporter: cwright, Assigned: mkmelin)

References

(Blocks 1 open bug)

Details

Attachments

(1 file, 5 obsolete files)

User-Agent:       Mozilla/5.0 (Macintosh; U; Intel Mac OS X; en-US; rv:1.8.1.3) Gecko/20070309 Firefox/2.0.0.3
Build Identifier: Thunderbird 1.5.0.10 (Windows/20070221), version 2.0.0.0 (Macintosh/20070326), Thunderbird 3.0a1pre (Macintosh/20070611)

Subject lines of special lengths followed by whitespace can cause invalid mail headers.

Reproducible: Always

Steps to Reproduce:
1. Enter "Lorem ipsum dolor sit amet, consectetuer adipiscing elit volutpat. " (all text between quotes, including the space at the end) on the subject line.
2. Send the mail
3. Frown at the invalid header  :(
Actual Results:  
INVALID HEADER: FOLDED HEADER FIELD MADE UP ENTIRELY OF WHITESPACE                                        
                                                                                                          
  Improper folded header field made up entirely of whitespace (char 20 hex):                              
    Subject: ...met, consectetuer adipiscing elit volutpat.\n \n

Expected Results:  
There should be a validation pass that will remove all-whitespace lines, as per RFC 2822:

The RFC 2822 standard specifies rules for forming internet messages.                                    
  In section '3.2.3. Folding white space and comments' it explicitly                                      
  prohibits folding of header fields in such a way that any line of a                                     
  folded header field is made up entirely of white-space characters                                       
  (control characters SP and HTAB) and nothing else. 

The magic line length looks to be 66 characters, then an additional space.  Shorter subjects with enough padding trailing spaces also manifest this bug.  If the user isn't saving local copies of their mail and the remote receiver throws out undeliverable messages, the email is lost.

A simple fix would be to trim trailing whitespace from a subject line before sending.
This strips trailing whitespace from the subject.  It seems to fix the problem.  Are there side effects for localization?  Is it solving the problem correctly?
Attachment #268838 - Flags: review?
Same thing, strips tabs too.
Attachment #268838 - Attachment is obsolete: true
Attachment #268838 - Flags: review?
WFM on Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9a6pre) Gecko/20070621 Thunderbird/3.0a1pre ID:2007062103

Where exactly do you get the error message? On send or when received?
For me, the error comes from Postfix, our MTA, in the form of a Bounce/Undeliverable Message message.  Maybe other mailers handle this themselves?
Postfix over here to...  ESMTP Postfix (2.3.3)
We're using 2.2.8 I think (I'm not root on that machine.)  Is there a way to get thunderbird to just write the message to a local file to inspect headers manually?
Not that I know of, unless you can find another smtp that will send it, and then look in the sent folder.
Attached file tcpdump captures. (obsolete) —
Here are two test-case emails captured in flight from my machine to the mail server.  The first message does not trigger this bug.  The second one does.  Check the file in a hex editor, between the Subject: and Content-Type: line on the second message, you'll see 0D0A200D0A (newline, space, newline), which violates the RFC.

The only post-processing done on these packets is the removal of garbage packets, and the hostname (I wasn't actually sending mail from example.com).  The messages themselves are otherwise left perfectly intact, just how Thunderbird sent them.
This is the output through another sender.  It sends, but the headers are still technically invalid.  The only reason it bounced under our postfix was due to a strict content filter we have in place that caught this.
Confirming.
Status: UNCONFIRMED → NEW
Ever confirmed: true
Summary: Specially crafted subject lines generated invalid mail headers. → Specially crafted subject lines (certain length, space at the end) generated invalid mail headers
Attachment #268941 - Flags: review?
Attachment #268941 - Flags: superreview?(bienvenu)
Attachment #268941 - Flags: review?(mkmelin+mozilla)
Attachment #268941 - Flags: review?
Comment on attachment 268941 [details] [diff] [review]
Added htab detection (not sure if it happens though)

While i guess this can work around the problem, I don't think this is the right fix - the right one being done in the code that does the folding. 

I think that code needs to check if all that's left for the next line is whitespace, and in case it is, fold the header earlier.
Attachment #268941 - Flags: review?(mkmelin+mozilla) → review-
Good point.  I'll get started on that.

Unfortunately, I've noticed a problem while thinking up test cases:

RFC states that field lengths should be no longer than 78 spaces, and must not be longer than 998 spaces (stated in 2.1.1).  How do we correctly handle cases where there are 80 or 1000 trailing spaces?  There isn't a correct way to fold these without trimming whitespace (unless there's some way to escape whitespace?  I'm not too familiar with e-mail encoding and escaping).

Thoughts?
Comment on attachment 268941 [details] [diff] [review]
Added htab detection (not sure if it happens though)

I wonder if there's some regular expression foo that would enable you to do this with .replace - something like replacing a sequence of ' ' or \t followed by the end of the string ($) with "".
Attachment #268941 - Flags: review- → review?(mkmelin+mozilla)
>+        if(subjectLength != subject.length)
>+        {
>+            msgCompFields.subject = subject.substring(0,subjectLength);
>+            GetMsgSubjectElement().value = subject.substring(0,subjectLength);
>+        }
>+
>         // Check if we have a subject, else ask user for confirmation
>         if (subject == "")
This looks wrong, needs
subject= subject.substring(0,subjectLength)

The regex would be easier...

var newSubject= subject.replace(/\s+$/, '');
if(newSubject.length != subject.length)
{
    subject= newSubject;
    msgCompFields.subject = subject;
    GetMsgSubjectElement().value = subject;
}
Yeah, something like that should do, if we want to change the subject. 

Don't know how it's supposed to be if there are several lines of consecutive whitespace. 
Good idea with regex stuff (I'm obviously a C programmer by default :).  And
you're right, we do need to set subject, otherwise the next block won't grab
the emptied subject (if we decide that trimming is the way to handle this).
Oh yeah, long runs of white space in the middle would be a problem.

RFC2822, 3.2.3: [folding whistespace] MUST NOT be inserted in such a way that any line of a folded header field is made up entirely of [whitespace] characters and nothing else."

This would collapse each whitespace sequence to a single space.  Not sure if that is the desired behavior, but it seems reasonable.  I assume that folding and possible MIME coding happen later.  But maybe the folding code should handle this anyway.

var newSubject= subject.replace(/\s+/g, ' ');
newSubject= newSubject.replace(/ $/, '');
if(newSubject != subject)
{
    subject= newSubject;
    msgCompFields.subject = subject;
    GetMsgSubjectElement().value = subject;
}
Comment on attachment 268941 [details] [diff] [review]
Added htab detection (not sure if it happens though)

I'm going to minus this since we've got enough comments to warrant a new patch.

And I hear ya about regex and c++ :-)
Attachment #268941 - Flags: superreview?(bienvenu) → superreview-
Attachment #268941 - Flags: review?(mkmelin+mozilla)
Ok, what's the Right Way to handle intra-word whitespace blocks?  Most of these blocks (all blocks less than 78 spaces) aren't invalid, so should the regex only take effect on huge space blocks like that?  I would think that this would preserve most of the user's input unmodified, except for failure conditions.
This is based on posted regex replacement code.  It should only kick in on trailing whitespace, and intra-subject whitespace blocks when they're longer than 78 spaces.

The current CVS is broken (crashes when the To field is filled in in the Compose window), so I can't test this, but if someone has an older stable version to try it out on, I'd love to know if it behaves as expected.
Attachment #268941 - Attachment is obsolete: true
Attachment #272489 - Flags: review?
When "A <many many spaces> Z" is entered in Subject field, Tb 2.0 generated following 3 lines as Draft mail. (second line is 72 space characters + CRLF)  
> Subject: A                                                               
>                                                                         
>                                           Z
Is this also a RFC2822 violation?
If yes, what is proper way? Make it atoms by encoding continuous spaces in Base64?
Hmm, didn't try but I assume this isn't a problem form non-ascii subjects. Not really sure about how multi line space should be done, but encoding the subject (spaces included) would be one way.

 =?iso-8859-1?q?this=20is=20some=20=20=20=20=20=20=20text?=

Then again, the latest patch could make for an easy fix...
I still say collapse any whitespace (\w+) to a single space.  That's what Gmail does, on incoming and outgoing mail.  It also allows text which has already been folded to be properly refolded (example, the text is copied, modified with a prefix "fwd:", and refolding would add an extra space at the old line break, depending on the copy/paste implementation).  In other words, it's simpler and canonical.

Does multiple space have any meaning?
> Does multiple space have any meaning?

You never know... and no need to remove all of it if the users says he wants it.
Re comment #23: Gmail does no such thing (at least for display), though if you view it through the web ui you wouln't see it.

Christopher: you have to request review from a specific person. The patch looks good to me, with one nit: add space after // in the comment and a dot at the end of the sentence.
Whiteboard: [needs reviewer]
Magnus writes comment #25:
> Re comment #23: Gmail does no such thing (at least for display), though if you
> view it through the web ui you wouln't see it.
> 
> Christopher: you have to request review from a specific person. The patch looks
> good to me, with one nit: add space after // in the comment and a dot at the
> end of the sentence.

Chris writes "The problem isn't properly solvable (the RFC is too vague, so any fixes will still leave corner cases), as far as I recall."
Keywords: helpwanted
Whiteboard: [needs reviewer] → [patchlove][needs reviewer]
Comment on attachment 272489 [details] [diff] [review]
Fixes excessively long (>=78) space blocks and trailing space.

Patch has unfortunately bitrotted.

$ patch -p0 --dry-run < ~/Desktop/Thunderbird.subject.header.v0.3.patch 
patching file mail/components/compose/content/MsgComposeCommands.js
Hunk #1 FAILED at 1712.
1 out of 1 hunk FAILED -- saving rejects to file mail/components/compose/content/MsgComposeCommands.js.rej
Attachment #272489 - Attachment is obsolete: true
Attachment #272489 - Flags: review?
Whiteboard: [patchlove][needs reviewer] → [patchlove]
(In reply to comment #26)
> Magnus writes comment #25:
> > Re comment #23: Gmail does no such thing (at least for display), though if you
> > view it through the web ui you wouln't see it.
> > 
> > Christopher: you have to request review from a specific person. The patch looks
> > good to me, with one nit: add space after // in the comment and a dot at the
> > end of the sentence.
> 
> Chris writes "The problem isn't properly solvable (the RFC is too vague, so any
> fixes will still leave corner cases), as far as I recall."

cc'ing bienvenu - do we still want the patch then?
I think we might - Magnus, would you be interested in driving this in?
Attached patch proposed fixSplinter Review
The 66 + a space "magic limit" from comment 0 is
Subject: 66chars+space+CRLF = 78 chars, which is the SHOULD limit in rfc2822.

I've fixed up the patch to remove trailing spaces in the subject (so we won't that space over to the next line and get an empty header line), and remove sequences of spaces (>74) from within subject text. Why 74? The folded line should be less than 78 and the folding will add arbitrarily many WSP characters according to the rfc - tb seems to add 2 spaces. 74+2+CRLF=78
Assignee: nobody → mkmelin+mozilla
Attachment #269393 - Attachment is obsolete: true
Attachment #269667 - Attachment is obsolete: true
Status: NEW → ASSIGNED
Attachment #384068 - Flags: review?(bienvenu)
Attachment #384068 - Flags: review?(bienvenu) → review+
Comment on attachment 384068 [details] [diff] [review]
proposed fix

thx, Magnus.
changeset:   2897:7a767870e14e
http://hg.mozilla.org/comm-central/rev/7a767870e14e

->FIXED
Status: ASSIGNED → RESOLVED
Closed: 15 years ago
Keywords: helpwanted
Resolution: --- → FIXED
Target Milestone: --- → Thunderbird 3.0b3
Blocks: TB2SM
Whiteboard: [patchlove]
Version: unspecified → Trunk
No longer blocks: TB2SM
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: