Closed Bug 958393 Opened 10 years ago Closed 7 years ago

Feed item is always attached as base64 leading to display of base64 text if the original message CTE is 8bit (not base64)

Categories

(MailNews Core :: Composition, defect)

x86
Windows 8.1
defect
Not set
normal

Tracking

(thunderbird52 fixed, thunderbird53 fixed, thunderbird54 fixed)

RESOLVED FIXED
Thunderbird 54.0
Tracking Status
thunderbird52 --- fixed
thunderbird53 --- fixed
thunderbird54 --- fixed

People

(Reporter: m_kato, Assigned: jorgk-bmo)

References

(Depends on 1 open bug)

Details

Attachments

(6 files)

Attached file RSS.eml
Step
====
1. Compose new mail
2. attach feed item by RSS to composting mail.
3. send and receive this mail.

Result
======
Each attached feed item is encoded by BASE64, but item isn't decoded to HTML.

Excepted Result
===============
view as HTML or doesn't show BASE64 raw data.
m_kato, can you still reproduce this?  i was able to save a (greek) blog entry as an .eml, compose and attach the file, and it is received with a correctly viewable attachment. there have been a lot of encoding changes subsequently.
Flags: needinfo?(m_kato)
(In reply to alta88 from comment #1)
> m_kato, can you still reproduce this?  i was able to save a (greek) blog
> entry as an .eml, compose and attach the file, and it is received with a
> correctly viewable attachment. there have been a lot of encoding changes
> subsequently.

Yes, I can reproduce this on 47 beta channel.

- Step
1. subscribe https://www.mozilla.jp/blog/feed/
2. new compose mail
3. drag item from 1's blog, then drop composing window
4. send mail

- Result
Preview's text keeps BASE64.  But subject and from are correct text.  When using viewing message source, it is like the following.  HTML doesn't have correct Content-Transfer-Encoding header.

X-Mozilla-Keys:                                                                                 
Received: by localhost; Mon, 8 Aug 2016 11:53:02 +0900
Date: Wed, 13 Apr 2016 17:00:00 +0900
Message-Id: <http://www.mozilla.jp/blog/entry/10542/@localhost.localdomain>
From: Mozilla Japan ブログ
MIME-Version: 1.0
Subject: Mozilla がサポートする Let's Encrypt がベータから正式版へ
Keywords: Announce,セキュリティ
Content-Transfer-Encoding: 8bit
Content-Base: http://www.mozilla.jp/blog/entry/10542/
Content-Type: text/html; charset=UTF-8

PGh0bWw+DQogIDxoZWFkPg0KICAgIDx0aXRsZT5Nb3ppbGxhIOOBjOOCteODneODvOODiOOB
meOCiyBMZXQncyBFbmNyeXB0IOOBjOODmeODvOOCv+OBi+OCieato+W8j+eJiOOBuDwvdGl0
bGU+DQogICAgPGJhc2UgaHJlZj0iaHR0cDovL3d3dy5tb3ppbGxhLmpwL2Jsb2cvZW50cnkv
...
Flags: needinfo?(m_kato)
Using the exact same feed message, and having set Outgoing Mail to utf8, everything works for me when received and viewed in utf8. The eml is attached.
It could be the bug (probably filed) has to do with a message/rfc822 attachment encoding not being respected in favor of the main message encoding.
I attached two messages from this feed http://hg.mozilla.org/mozilla-central/pushlog and the message is garbled.
Flags: needinfo?(alta88)
what's the question?  did you read/test the hypothesis in comment 3?
Flags: needinfo?(alta88)
Hmm, I think what I've done pretty much fits the description given in comment #0. So the questions is whether someone could debug this and find/fix the cause of the problem. How do I say it nicely: A hypothesis doesn't really fix the bug ;-)

As far as I can see, the attachment says base64, yet the message says 8bit, when it's not:

Content-Type: message/rfc822;
 name="Attached Message"
Content-Transfer-Encoding: base64 <===
Content-Disposition: attachment;
 filename="Attached Message"

X-Mozilla-Keys: $label1                                                                         
Received: by localhost; Thu, 26 Jan 2017 07:32:43 +0100
Date: Thu, 26 Jan 2017 00:56:03 GMT
Message-Id: <http://www.selenic.com/mercurial/#changeset-bba5ddb15392baa3c8968179120e50e99b6061fa@localhost.localdomain>
From: kwierso@gmail.com
MIME-Version: 1.0
Subject: Changeset bba5ddb15392baa3c8968179120e50e99b6061fa
Content-Transfer-Encoding: 8bit <===
Content-Base: http://hg.mozilla.org/mozilla-central/rev/bba5ddb15392baa3c8968179120e50e99b6061fa
Content-Type: text/html; charset=UTF-8

PGh0bWw+DQogIDxoZWFkPg0KICAgIDx0aXRsZT5DaGFuZ2VzZXQgYmJhNWRkYjE1MzkyYmFh
M2M4OTY4MTc5MTIwZTUwZTk5YjYwNjFmYTwvdGl0bGU+DQogICAgPGJhc2UgaHJlZj0iaHR0
cDovL2hnLm1vemlsbGEub3JnL21vemlsbGEtY2VudHJhbC9wdXNobG9nIj4NCiAgPC9oZWFk

So maybe something went wrong attaching the feed. The original feed message was 8bit encoded. If I change 8bit to base64 the attachment displays fine.

Typically we an attach messages and the message/rfc822 attachment displays fine. So it would be good if someone could debug this a little. Maybe it not only happens with feed messages.
OK, so the problem is this.

"Normal" messages have CR+LF to show the end of the line. Feed messages, at least the one I looked at from my M-C pushlog feed at <http://hg.mozilla.org/mozilla-central/pushlog>, only have LF.

When attaching such a message, the we are looking at the line length, and of this is longer than 990 characters, we force it to base64, which breaks the attachment.

I'm a bit surprised, since the comment says
https://dxr.mozilla.org/comm-central/rev/1994d82f97322ab5fc42601ad49a859a6844e96b/mailnews/compose/src/nsMsgSend.h#53
that the 990 character limit doesn't apply to type message/rfc822 and that's also what the code says here:
https://dxr.mozilla.org/comm-central/rev/1994d82f97322ab5fc42601ad49a859a6844e96b/mailnews/compose/src/nsMsgAttachmentHandler.cpp#465

I haven't debugged it further.

So Alta88, since you prefer concrete questions:
1) Could you debug this to see what m_type is at the spot I indicated.
2) Explain why feed messages when exported/attached only have LF and are seen by
   that code as one horribly long line that is forced to base64.
Flags: needinfo?(alta88)
Add of course, add "please" to 1) and 2) ;-)
Hmm, I looked at items 1) and 2) myself. This is the debug:
=== m_type message/rfc822 336
where 336 is the longest line, and the LF alone counted as line breaks. So that was all false alarm.

So despite the attachment being type message/rfc822 and not having a long line, it ends up being encoded base64. Strange.
Flags: needinfo?(alta88)
Feed messages seem to have LF for an unknown reason. They trigger the code that forces everything to base64 if it encounters mixed CR and CR+LF. The CR+LF most likely comes from the attachment header.
Assignee: nobody → jorgk
Status: NEW → ASSIGNED
Attachment #8830866 - Flags: review?(mkmelin+mozilla)
Attachment #8830866 - Flags: feedback?(alta88)
Summary: attachment item is broken when attach feed item → Feed item is always attached as base64 leading to display of base64 text if the original message CTE is 8bit (not base64)
Comment on attachment 8830866 [details] [diff] [review]
958393-feed-attachments.patch (v1)

If the spec for storing Berkely mailbox format messages indicates mixed LF and CR/LF is forbidden or undesired, feeds can and should be fixed (it's been this way since the beginning afaict).
https://dxr.mozilla.org/comm-central/rev/c0e5b93e1369ec16a2c0747df816fbcae5084fff/mailnews/extensions/newsblog/content/FeedItem.js

Otherwise, it would be a good practice for the export to file function to do the right thing, to make Tb extra robust in its output.

The compose attachment processor obviously doesn't trust the eml file message's 8bit header. This legacy behavior may even be valid in order to detect bad Content-Transfer-Encoding. If there's no spec about LF, then it seems your solution is correct and in the right code point. Even if feeds are doing the wrong thing, it's still necessary for backward compat.
Attachment #8830866 - Flags: feedback?(alta88) → feedback+
Thanks for the feedback.

I think something needs to be fixed in feeds. Here is a sample message with a line that is 10033 characters long. If my change goes ahead, that will be sent with this long line which is illegal, since the limit is 1000 bytes according to RFC 821. So I'm sure that something will break somewhere, if not in TB, then in some MTA. I don't care what you store in the mailbox, but when the message gets presented for further processing, it needs to be compliant with what the rest of the system assumes.
Looking at the reference you pointed out, I get the impression that feeds always have HTML bodies:

  MESSAGE_TEMPLATE: '\n' +
    '<html>\n' +
    '  <head>\n' +
    '    <title>%TITLE%</title>\n' +
    '    <base href="%BASE%">\n' +
    '  </head>\n' +
    '  <body id="msgFeedSummaryBody" selected="false">\n' +
    '    %CONTENT%\n' +
    '  </body>\n' +
    '</html>\n',

so nothing stops you implementing a very crude line breaking algorithm that inserts a newline somewhere at the end of some tag to keep lines shorter than our magic 990 limit.
Test - before you get the chance to ask for one ;-)
Attachment #8831073 - Flags: review?(mkmelin+mozilla)
(In reply to Jorg K (GMT+1) from comment #12)
> Created attachment 8831051 [details]
> Sample feed message with really long line

this is OT here, but if you feel there an actual bug, please file one.  on linux, i don't see any mixed LF and CR/LF in headers/content when exporting a feed message to .eml (but now  i notice you said the whole LF hypothesis was a canard).

as for line length, what if one were to compose an email that consisted of
<p>two thousand chars before the closing tag</p>?  and what if Tb were to receive such a message?  so i doubt it's a feeds only problem.
Depends on: 1334511
(In reply to alta88 from comment #15)
> as for line length, what if one were to compose an email that consisted of
> <p>two thousand chars before the closing tag</p>?  and what if Tb were to
> receive such a message?  so i doubt it's a feeds only problem.
It's impossible to construct such a message.

Firstly, the DOM gets serialised and that sticks to a line length of 72-80 characters. Surely a <p> can be broken at most spaces in the <p>.

Secondly, if there is a long line that the serialiser can't break, we ship the message as base64 encoded.

No message produced with TB composition will ever exceed 990 characters.

Further reading: Bug 1225904 and its friends (bug 1225864, bug 653342). Trust me, I cleaned up the whole mess, so I know exactly about long lines under all circumstances.

The problem is forwarding invalid messages received from elsewhere, as noted here:
https://dxr.mozilla.org/comm-central/rev/1994d82f97322ab5fc42601ad49a859a6844e96b/mailnews/compose/src/nsMsgAttachmentHandler.cpp#461
(quote: "not ideal", so basically: bad data in, bad data out in this case).

... and feeds, which are easily invalid as we have seen.

Moved the incompliance of feed messages to bug 1334511.

You're right. Feeds are not the only problem. The other problem are invalid messages which have been received. That doesn't mean that feed messages should easily run up and over 10K in a single line.
Comment on attachment 8830866 [details] [diff] [review]
958393-feed-attachments.patch (v1)

Review of attachment 8830866 [details] [diff] [review]:
-----------------------------------------------------------------

::: mailnews/compose/src/nsMsgAttachmentHandler.cpp
@@ +379,5 @@
>      // or several flavors of newlines are present, always use base64
>      // (so that we don't get confused by newline conversions.)
> +    // However, never encode messages this way.
> +    if (!m_type.LowerCaseEqualsLiteral(MESSAGE_RFC822))
> +      needsB64 = true;

Maybe always assigning would be easier to grasp.

// Use base64, unless it's message that we're attaching.
needsB64 = !m_type.LowerCaseEqualsLiteral(MESSAGE_RFC822);
Attachment #8830866 - Flags: review?(mkmelin+mozilla) → review+
Comment on attachment 8831073 [details] [diff] [review]
958393-feed-attachments-test.patch (v1)

Review of attachment 8831073 [details] [diff] [review]:
-----------------------------------------------------------------

Great, r=mkmelin
Attachment #8831073 - Flags: review?(mkmelin+mozilla) → review+
Thanks.

(In reply to Magnus Melin from comment #17)
> > +    if (!m_type.LowerCaseEqualsLiteral(MESSAGE_RFC822))
> > +      needsB64 = true;
> // Use base64, unless it's message that we're attaching.
> needsB64 = !m_type.LowerCaseEqualsLiteral(MESSAGE_RFC822);
I reshuffled it completely to *maximise* the change ;-) You'll see it when it lands.
https://hg.mozilla.org/comm-central/rev/f57f050f3abcf5bdb7fe39ff437089150f3ac814
https://hg.mozilla.org/comm-central/rev/f9d29f77e862738ac7d6bfab1bb12823666c818b
Status: ASSIGNED → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Target Milestone: --- → Thunderbird 54.0
Comment on attachment 8830866 [details] [diff] [review]
958393-feed-attachments.patch (v1)

Let's take this to the branches, low risk one liner ;-)
Attachment #8830866 - Flags: approval-comm-beta+
Attachment #8830866 - Flags: approval-comm-aurora+
Comment on attachment 8831073 [details] [diff] [review]
958393-feed-attachments-test.patch (v1)

... and the test, too.
Attachment #8831073 - Flags: approval-comm-beta+
Attachment #8831073 - Flags: approval-comm-aurora+
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: