Open Bug 697448 Opened 13 years ago Updated 2 years ago

Problem with txt attachment, equals character add at end of file? // multipart/mixed and quoted-printable

Categories

(MailNews Core :: MIME, defect)

defect

Tracking

(Not tracked)

ASSIGNED

People

(Reporter: luca.preziati, Assigned: sshagarwal)

Details

(Whiteboard: [lang=c++])

Attachments

(2 files)

User Agent: Mozilla/5.0 (Windows NT 6.0) AppleWebKit/535.1 (KHTML, like Gecko) Chrome/14.0.835.202 Safari/535.1

Steps to reproduce:

I have recieved a mail with a txt in attachment


Actual results:

I thunderbird the txt attchement have '=' at the end of the file. 
I have send the same mail to lotus and outlook and nobody see this '='


Expected results:

Don't add to attachement the =
Severity: normal → critical
I can't add as attachment the mail, if you need I can send you the message as confidentionaly.
Group: core-security
if you send it to me make sure top have the bug number somewhere so I know which bug we are talking about.
(In reply to Luca Preziati from comment #1)
> I can't add as attachment the mail, if you need I can send you the message
> as confidentionaly.

Ok I'm going to disclose technical information about the email - nothing about the content.

X-Mailer: Lotus Notes Release 8.5 December 05, 2008
Content-type: multipart/mixed; boundary="=_mixed 00551CB1C125791E_="

--=_mixed 00551CB1C125791E_=
Content-Type: multipart/alternative; boundary="=_alternative 00551CB1C125791E_="

--=_alternative 00551CB1C125791E_=
Content-Transfer-Encoding: 7bit
Content-Type: text/plain; charset="us-ascii"

--=_alternative 00551CB1C125791E_=
Content-Transfer-Encoding: 7bit
Content-Type: text/html; charset="us-ascii"

--=_alternative 00551CB1C125791E_=--
--=_mixed 00551CB1C125791E_=
Content-Type: text/plain; name="Addebiti_BBS.txt"
Content-Disposition: attachment; filename="Addebiti_BBS.txt"
Content-Transfer-Encoding: quoted-printable

--=_mixed 00551CB1C125791E_=--

And just before that line you have an empty line with a single = 

To me it seems the lotus mailer is sending the text file with the extra = and get rids of it when it saves/extracts the file from the email.
Summary: Problem with txt attachment, equals character add at end of file? → Problem with txt attachment, equals character add at end of file? // multipart/mixed and quoted-printable
If I save the attachment with Thunderbird 9.0, a trailing equal "=" is left at the end of the file.

1234567
=

The python quoted-printable decoding result is without the equal:

>>> import quopri
>>> quopri.decodestring('1234567\r\n=')
'1234567\r\n'
Luca (reporter), or zegreb (comment 4), can you add detailed Steps to reproduce (STR) for this bug?

There is an extra = in its own line in test.txt attached to test.eml of attachment 591729 [details], yes - but why should this be a bug of TB? Did TB add the extra = when you attached a txt file with only "1234567" as text content?

This shows the relevant part of the source of tst.eml of attachment 591729 [details]:

--=_mixed 0041E91AC125798E_=
Content-Type: text/plain; name="TEST.TXT"
Content-Disposition: attachment; filename="TEST.TXT"
Content-Transfer-Encoding: quoted-printable

1234567
=
--=_mixed 0041E91AC125798E_=--

This looks like wfm per ludo's comment 3.
I create the file with PSPAD and I write the text.

Then I attach the file to a mail whit the mouse using drag and drop and I send to myself. 

At the end I recievied the file with a new line and '=' added.

I don't know who and why it happen. 

I do with a windows system in italian language.
This is going to get a bit standards-lawyer, but the test message as shown isn't quite legal.



https://tools.ietf.org/html/rfc2046 page 19, the "Note" in the second paragraph, indicates that the example part in Comment 5 ends with the bare '=', because the subsequent CRLF belongs to the multipart boundary.

https://tools.ietf.org/html/rfc2045#section-6.7, and in particular Note 3 on https://tools.ietf.org/html/rfc2045#page-22, states that a bare '=' is illegal as the last byte in a Quoted-Printable (Q-P) object.


Now, the = at the end of a Q-P line means to ignore the line break and join the current line with the next line; a generous interpretation of a bare = at the end of a Q-P object would be to assume it means "this Q-P object doesn't end in CRLF", which looks like the interpretation some of the other decoders are applying. Note that this isn't necessary; the multipart boundary rule cited above already allows for non-CRLF terminated parts by giving the last CRLF to the boundary; when a part ends with CRLF, the multipart representation has two CRLFs at the end - one for the part, one for the boundary.

I'd be happy to mentor a contributor through implementing this change if they really want it, but it's pretty low priority for the core team since it's accommodating somebody else's violation of a standard. Unless TB is also generating bogus Q-P objects...
Severity: critical → minor
Status: UNCONFIRMED → NEW
Component: General → MIME
Ever confirmed: true
Product: Thunderbird → MailNews Core
QA Contact: general → mime
Whiteboard: [mentor=irving][lang=c++]
I'm interested in helping out with TB development. I  came across this bug as being one that might be good for getting introduced.  I can open the attached email sample in TB and see the offending '='.  I'm not able to reproduce it myself though.  Is it Windows only?

Irving, are you up for helping me out with this?
Thanks
Sean, thanks for looking at this. A properly implemented mail client won't create a message that reproduces the problem, so the sample attached to the bug is the best way we have of reproducing.

Next step for you is to get a Thunderbird build environment up and working, if you don't have one already; the conversation about doing that mostly happens in the IRC channel #maildev (see https://wiki.mozilla.org/IRC), but you can start with https://developer.mozilla.org/en-US/docs/Simple_Thunderbird_build.

You can also get in touch with me by email at the address in the CC field of this bug, but you'll get faster turnaround and help from more people if you ask in #maildev.
Attached patch PatchSplinter Review
Assignee: nobody → syshagarwal
Status: NEW → ASSIGNED
Attachment #759903 - Flags: review?(irving)
OS: Windows Vista → All
Hardware: x86 → All
Comment on attachment 759903 [details] [diff] [review]
Patch

Review of attachment 759903 [details] [diff] [review]:
-----------------------------------------------------------------

Thanks for the patch, and my apologies for taking so long to review. I suspect there's one easy-to-fix regression in your code change, and unfortunately it doesn't look like we have unit test coverage for either that case, or the new one you're handling with this patch. A fixed version of this patch would be great, but adding unit tests would be even greater - I don't suppose you'd be interested in writing those tests?

::: mailnews/mime/src/mimeenc.cpp
@@ +743,5 @@
>    /* Flush out the last few buffered characters. */
>    if (!abort_p &&
>      data->token_size > 0 &&
> +    data->token[0] != '=' &&
> +    data->encoding == mime_Base64)

The change is a bit subtle, it took me a little while to figure out why it works. I'd like to update the comment at line 743; one option would be to say something like

/* insert missing '=' characters at the end of a base64 part,
 * discard any other incomplete escape sequence at the end of
 * an encoded part */

@@ -747,4 @@
>    {
> -    if (data->encoding == mime_Base64)
> -    while ((unsigned int)data->token_size < sizeof (data->token))
> -      data->token [data->token_size++] = '=';

I think we still need this while loop to fill in any missing '=' from a truncated base64 part.
Attachment #759903 - Flags: review?(irving) → review-
Okay, thanks for reviewing.
I will fix this soon and I'll definitely try writing the test.

Thanks.
Mentor: irving
Whiteboard: [mentor=irving][lang=c++] → [lang=c++]
Severity: minor → S4
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: