Closed Bug 1167982 Opened 4 years ago Closed 3 months ago

Thunderbird misencodes "git format-patch" attachments with UTF-8; should add charset=UTF-8 to Content-Type: text/x-patch

Categories

(Thunderbird :: Message Compose Window, defect)

38 Branch
defect
Not set

Tracking

(Not tracked)

RESOLVED FIXED
Thunderbird 68.0

People

(Reporter: eggert, Assigned: jorgk)

Details

Attachments

(4 files, 1 obsolete file)

User Agent: Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:38.0) Gecko/20100101 Firefox/38.0
Build ID: 20150511103303

Steps to reproduce:

I ran 'git format-patch' on a repository that uses UTF-8 encoding, and attached to my outgoing mail the resulting .patch file, which contained some UTF-8 characters.  (I've attached this .patch file to this bug report.)


Actual results:

Thunderbird creates a .patch attachment with a header that says "Content-Type: text/x-patch;" with no charset attribute.


Expected results:

The header should say "Content-Type: text/x-patch; charset=UTF-8;".

This bug has caused problems for people trying to read my email with non-Thunderbird clients.  For details, please see:

http://lists.gnu.org/archive/html/emacs-devel/2015-05/msg00710.html
Dmitry Gutov reports in the abovementioned thread on emacs-devel that the bug also occurs in Thunderbird 38 (beta).
Version: 31 → 38
Summary: Thunderbird misencodes "git format-patch" attachments with UTF-8 → Thunderbird misencodes "git format-patch" attachments with UTF-8; should add charset=UTF-8 to Content-Type: text/x-patch

I've run into this bug several times since reporting it, and just today ran into it again with Thunderbird 60.6.1. I thought I'd mention this version number here (don't know how to set a version number in Bugzilla.)

We should probably check for "text/" here: https://searchfox.org/comm-central/rev/e2719324dfa7e67b746ee7d65410dd0fb43a9bbd/mailnews/compose/src/nsMsgAttachmentHandler.cpp#525

Paul, maybe you want to send a patch for this?

(In reply to Magnus Melin [:mkmelin] from comment #3)

We should probably check for "text/" here: https://searchfox.org/comm-central/rev/e2719324dfa7e67b746ee7d65410dd0fb43a9bbd/mailnews/compose/src/nsMsgAttachmentHandler.cpp#525

Paul, maybe you want to send a patch for this?

Sorry, I'm unfamiliar with the code base. This partly because my C++ is pretty rusty; it's been 25 years since I last used it seriously and I would rather not repeat the experience.

Benjamin, here's something to get your feet wet.

Assignee: nobody → benjamin
Status: UNCONFIRMED → NEW
Ever confirmed: true

(In reply to Magnus Melin [:mkmelin] from comment #5)

Benjamin, here's something to get your feet wet.

Looking into it now. Lots to absorb the next few days.

Attachment #8609888 - Attachment is obsolete: true
Comment on attachment 8609888 [details]
0045-Prefer-this-to-this-in-doc-strings.patch

This appears to be test data, so a sample patch that when attached in TB causes the problem.
Attachment #8609888 - Attachment is obsolete: false

To clarify the intent I need to write a patch for TB that stops TB from messing around with the attachments that it thinks it should read and in this case cause the patch to be unusable by the recipient? Or at the very least stop it from making the patch attachment unusable?

Flags: needinfo?(mkmelin+mozilla)

(In reply to Benjamin Flanagin from comment #8)

To clarify the intent I need to write a patch for TB that stops TB from messing around with the attachments that it thinks it should read and in this case cause the patch to be unusable by the recipient?

I'm not sure what you mean by "thinks it should read". Does TB read the patch, or does it merely look at the filename of the patch?

Either way, for this particular patch, TB should put "Content-Type: text/x-patch; charset=UTF-8; ..." instead of just ""Content-Type: text/x-patch; ..." in the header for the attachment. It does not need to mess with the attachment itself; allit needs to do is to put better metainfo into the header for the attachment.

To help make the bug report clearer I am attaching a copy of email sent by Thunderbird. Notice that it contains the following lines. The first "Content-Type:" line (generated by Thunderbird) is missing a "charset=UTF-8;" and that is what is causing the problem.

--------------774ADBE11107067D28CE4369
Content-Type: text/x-patch;
name="0001-Fix-decode-time-encode-time-roundtrip-on-macOS.patch"
Content-Transfer-Encoding: 8bit
Content-Disposition: attachment;
filename*0="0001-Fix-decode-time-encode-time-roundtrip-on-macOS.patch"

From 325f51c84d9ad4d9776784bd324b347ffe4fe51b Mon Sep 17 00:00:00 2001
From: Paul Eggert <eggert@cs.ucla.edu>
Date: Tue, 30 Apr 2019 10:45:48 -0700
Subject: [PATCH] Fix decode-time/encode-time roundtrip on macOS
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

  • src/timefns.c (Fencode_time): Ignore DST flag when the zone is
    numeric or is a cons, as the doc string says it’s ignored in that

Ah, now I see. Not nearly the issue I thought it was at first in that it was causing the patch file attachments to fail on the recipients machine.

Thank you for submitting the eml it makes the context a bit more clear.

Yep. All comment 9 explains it, and comment 3 has a pointer to (likely) what needs to change.

Flags: needinfo?(mkmelin+mozilla)

Yes, as per comment 3, try changing
https://searchfox.org/comm-central/rev/e2719324dfa7e67b746ee7d65410dd0fb43a9bbd/mailnews/compose/src/nsMsgAttachmentHandler.cpp#525

  if (!m_charset.IsEmpty() || !m_type.LowerCaseEqualsLiteral(TEXT_PLAIN))
to
  if (!m_charset.IsEmpty() || PL_strncasecmp(m_type.get(), "text/", 5))
    return NS_OK;

So this will only return if the type is not "text/", so for "text/x-patch" we will detect the charset further down.

Building suggested fix now.

It didn't seem to change anything.

This is from the build with the changes suggested in comment 13. As you can see, the changes made...well no change.

--------------2136ED136C74D8F1B646E019
Content-Type: text/x-patch;
name="0001-Fix-decode-time-encode-time-roundtrip-on-macOS.patch"
Content-Transfer-Encoding: 8bit
Content-Disposition: attachment;
filename*0="0001-Fix-decode-time-encode-time-roundtrip-on-macOS.patch"

Attachment #9064232 - Attachment mime type: message/rfc822 → text/plain
Attached patch 1167982-debug.patch (obsolete) — Splinter Review

Here's my debug patch, it doesn't even modify the behaviour yet. This is what I see in the debug window:
=== m_charset: ||, m_type: |text/plain|
=== detected m_charset: |UTF-8|, rv=0

I'm sure that's for the attachment since I tried non-UTF-8 attachments and got a different result. The attachment gets base64 encoded, like so:

Content-Type: text/plain; charset=UTF-8;
 name="0001-Fix-decode-time-encode-time-roundtrip-on-macOS.patch"
Content-Transfer-Encoding: base64
Content-Disposition: attachment;
 filename*0="0001-Fix-decode-time-encode-time-roundtrip-on-macOS.patch"

PkZyb20gMzI1ZjUxYzg0ZDlhZDRkOTc3Njc4NGJkMzI0YjM0N2ZmZTRmZTUxYiBNb24gU2Vw
IDE3IDAwOjAwOjAwIDIwMDENCkZyb206IFBhdWwgRWdnZXJ0IDxlZ2dlcnRAY3MudWNsYS5l
ZHU+DQpEYXRlOiBUdWUsIDMwIEFwciAyMDE5IDEwOjQ1OjQ4IC0wNzAwDQpTdWJqZWN0OiBb

In other words, I can't even reproduce the problem. Is this specific to Linux or Mac, I'm on Windows (as most people would know). I'm not sure who determines the MIME type of text/x-patch that was reported. To Windows, it's simply text. But we can see that the charset detection works, so the change I suggested should fix the issue.

Benjamin, can you apply the patch and show me the resulting debug.

Sort HG lesson: If you have Mercurial set up correctly:
hg qimport bz:1167982
hg qpush

Attachment #8609888 - Attachment is patch: false

Test data, a patch in UTF-8.

I can reproduce on Linux with the attachment "0001-Fix-decode-time-encode-time-roundtrip-on-macOS.patch" saved as a file with .patch attachment.

(In reply to Jorg K (GMT+2) from comment #13)

Yes, as per comment 3, try changing
https://searchfox.org/comm-central/rev/e2719324dfa7e67b746ee7d65410dd0fb43a9bbd/mailnews/compose/src/nsMsgAttachmentHandler.cpp#525

  if (!m_charset.IsEmpty() || !m_type.LowerCaseEqualsLiteral(TEXT_PLAIN))
to
  if (!m_charset.IsEmpty() || PL_strncasecmp(m_type.get(), "text/", 5))
    return NS_OK;

With this change it seems to work:
Debug output:
=== m_charset: ||, m_type: |text/x-patch|
=== detected m_charset: |UTF-8|, rv=0

Headers of the attachment in the created message source:
Content-Type: text/x-patch; charset=UTF-8;
name="attachment9065174 [details].patch"
Content-Transfer-Encoding: 8bit
Content-Disposition: attachment;
filename="attachment9065174 [details].patch"

But please use a nicer function (like StringBeginsWith) if possible, not PL_strncasecmp.

Attached patch 1167982.patchSplinter Review

This works for me.

Attachment #9065173 - Attachment is obsolete: true
Attachment #9065250 - Flags: review?(jorgk)

Thanks. I can't review this, you have me as the author ;-) You put an r+ onto it.

Attachment #9065250 - Flags: review?(jorgk)
Attachment #9065250 - Flags: review+
Attachment #9065250 - Flags: feedback?(benjamin)

So I feel like a dummy. I must have had 60 AND the daily running at the same time. I must not have noticed the version I was in when I was testing. Stupid mistake, should have tested it twice before commenting.

Comment on attachment 9065250 [details] [diff] [review]
1167982.patch

Review of attachment 9065250 [details] [diff] [review]:
-----------------------------------------------------------------

Patch works.I sent and received several patches that were affected before. Issue appears fixed.

You can (In reply to Benjamin Flanagin from comment #22)

So I feel like a dummy. I must have had 60 AND the daily running at the same time. I must not have noticed the version I was in when I was testing. Stupid mistake, should have tested it twice before commenting.

You can avoid that by making the production version look distinct. I have a very subtle theme so this doesn't happen to me. Believe me, it has happened ;-)

Attachment #9065250 - Flags: feedback?(benjamin) → feedback+

Pushed by mozilla@jorgk.com:
https://hg.mozilla.org/comm-central/rev/3b0f430eb654
Do charset detection for all text/* attachments in composition, not just text/plain. r=aceman

Status: NEW → RESOLVED
Closed: 3 months ago
Resolution: --- → FIXED

Sorry I hijacked this, but I had to check comment #15. Besides, strings in Mozilla are not so easy, not even I thought of StringBeginsWith(). The patch was created during an IRC conversation with Aceman.

Assignee: benjamin → jorgk
Target Milestone: --- → Thunderbird 68.0

After this landed we have:

TEST-UNEXPECTED-FAIL | comm/mailnews/compose/test/unit/test_messageHeaders.js | testContentHeaders - [testContentHeaders : 71] "text/html; charset=UTF-8" == "text/html"

I was thinking the same, but is this true?
The push with the change doesn't show the failures:
https://treeherder.mozilla.org/#/jobs?repo=comm-central&revision=6512afbb0b5671acb2c4550b23f492d6cd2f5b72

Oh, damn, that push didn't run and tests :-( - Grrrrr.

Pushed by mozilla@jorgk.com:
https://hg.mozilla.org/comm-central/rev/c7bb9e23894a
Follow-up: Adjust test to charset detection for text/* attachments. r=me
You need to log in before you can comment on or make changes to this bug.