Closed
Bug 1473893
Opened 7 years ago
Closed 7 years ago
Detaching attachments corrupts the main body of the e-mail text
Categories
(MailNews Core :: MIME, defect)
Tracking
(thunderbird_esr52 fixed, thunderbird_esr60 fixed, thunderbird60 fixed, thunderbird61 wontfix, thunderbird62 fixed, thunderbird63 fixed)
RESOLVED
FIXED
Thunderbird 63.0
People
(Reporter: kristen.panfilio, Assigned: jorgk-bmo)
References
()
Details
(Keywords: dataloss, regression)
Attachments
(3 files, 4 obsolete files)
822 bytes,
text/plain
|
Details | |
2.93 KB,
patch
|
jorgk-bmo
:
review+
jorgk-bmo
:
approval-comm-beta+
jorgk-bmo
:
approval-comm-esr60+
|
Details | Diff | Splinter Review |
3.23 KB,
text/plain
|
Details |
User Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_6) AppleWebKit/603.3.8 (KHTML, like Gecko) Version/10.1.2 Safari/603.3.8
Steps to reproduce:
This problem newly arose on 5th July 2018, possibly after the latest Thunderbird upgrade (now on 52.9.0, 64-bit)? When detaching or deleting attachments from messages in either the Inbox or Sent folders, the main body of the e-mail message becomes corrupted in a message-specific way. Variously the entire body of the message loses correct text encoding and becomes a string of nonsense symbols, or formatting is altered such that hidden formatting is rendered in visible characters (e.g., "<span>" or  in place of double spaces or carriage returns), or the message suddenly acquires a random non-white background color.
This is true for the specific message even when I close and reopen it, move it out of the Inbox into an offline (not IMAP) folder, and restart Thunderbird and my entire laptop. If I change the setting to View > Message Body As > Plain text, then I can readily see the contents in a clean way, but plain text isn't very suitable for many incoming e-mails these days, meaning I'm perpetually toggling back and forth. Also, the plain text view is only helpful in Thunderbird -- if I view the same IMAP message through my university's webmail interface, then regardless of the Thunderbird viewing settings, through that interface I can only see the corrupted, illegible version.
This problem persists regardless of the text encoding options chosen (Preferences > Display > Formatting > Advanced..., now changed from the default Western ISO-8859-1 to Unicode UTF-8 for incoming mail). Also, the altered text is carried over when included in a reply message from my account. The "repair folder" function makes no difference (perhaps not surprising, given that message alteration persists even when transferred between folders).
Simply double-clicking to view attachments does not have this effect, but detaching and deleting files is important both for file organization and to keep the Thunderbird mail folder size down.
Comment 2•7 years ago
|
||
(In reply to Wayne Mery (:wsmwk) from comment #1)
> Joe, does this reproduce for you?
I am not able to repro this using win10
Flags: needinfo?(jsabash)
Reporter | ||
Comment 3•7 years ago
|
||
I am on Mac OS Sierra (10.12.6) and have been using Thunderbird on this system/ machine for three years.
Updated•7 years ago
|
Keywords: regression
Assignee | ||
Comment 4•7 years ago
|
||
I can (sadly) confirm this issue using a message from Outlook/Exchange (as stated at https://support.mozilla.org/en-US/questions/1224698). We have an example in attachment 8986821 [details] from bug 1470210. Most likely yet another regression from the security fixes we shipped in TB 52.9 :-(
STR: Import message, delete the two attachments (I didn't try detach).
In the test message I mentioned, changing the text encoding to Unicode almost displays the message correctly.
Alice, can you confirm the regression range on trunk only. Should be the same as bug 1470210 comment #14.
Status: UNCONFIRMED → NEW
Component: Untriaged → MIME
Ever confirmed: true
Flags: needinfo?(alice0775)
Product: Thunderbird → MailNews Core
Version: 52 Branch → 52
Assignee | ||
Comment 5•7 years ago
|
||
Reduced test case: Import message, delete attachment, content garbled.
Assignee | ||
Comment 6•7 years ago
|
||
The test message has the body encoded in windows-1252. After the attachment is removed, the body is re-encoded in UTF-8 but the message header still says:
Content-Type: text/html; charset=windows-1252
Content-Transfer-Encoding: 8bit
Assignee | ||
Comment 7•7 years ago
|
||
I repeated the steps using TB 52.8. There the message body is left untouched.
![]() |
||
Comment 8•7 years ago
|
||
comm-esr52:
https://hg.mozilla.org/releases/comm-esr52/pushloghtml?fromchange=3112e25f905d0423f6f433a385f8c015a254c1e9&tochange=e3e80074c7e10f6dbcd596001b1e995e4f78a492
https://hg.mozilla.org/releases/mozilla-esr52/pushloghtml?fromchange=babe11f6b2bc52b18ae2391b6dca835dc7901505&tochange=babe11f6b2bc52b18ae2391b6dca835dc7901505
comm-beta chanel:
https://hg.mozilla.org/releases/comm-beta/pushloghtml?fromchange=8b06247dc9cc33621458b0f17b79655c409154c9&tochange=b39d98d20e6dd5786e25d85eab6159d04d207e17
comm-central:
https://hg.mozilla.org/comm-central/pushloghtml?fromchange=148c2d27abc3daa72f3cb3c3e2f8404e2f018f48&tochange=d16fff497d408a3ece7f9eb58874e8151bec70c5
https://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=000309d44abb81084276c9fa977c57aee7126053&tochange=77c06979d9e88979ec96263eccdbd750cb9221a4
Flags: needinfo?(alice0775)
Assignee | ||
Comment 10•7 years ago
|
||
Somehow the problem appears to be bigger than some charset/encoding issue. Using the message from attachment 8986821 [details] from bug 1470210, we see that after deleting both attachments, a third attachment appears, which is the embedded image from the HTML part. The HTML part somehow lost the reference to it, so it now appears as attachment.
Due to the intricacies of MIME, this won't be easy to fix at all :-(
Assignee | ||
Comment 11•7 years ago
|
||
This makes the delete for the test.eml message work again. Of course it's totally wrong to hard-code "windows-1252", but it works for the message in windows-1252. The debug above prints UTF-8 twice, so even is this approach were heading in the right direction, it would be hard to get the correct charset at that point.
I'm not sure this is the correct approach.
Interestingly, I just produced a test message in ISO-2022-JP and remove and attachment. Lo and behold, *no* problem, the body remained ISO-2022-JP and wasn't re-encoded into UTF-8.
Assignee | ||
Comment 12•7 years ago
|
||
Korean in EUC-KR is obliterated, as is Russian in KOI8-U, or Japanese in Shift_JIS. ISO-2022-JP always had special treatment, since it's the only stateful encoding supported by Mozilla. It's certainly a different code path; no I idea why those messages withstand attachment removal when all other encodings, apart from UTF-8, are destroyed.
Assignee | ||
Comment 13•7 years ago
|
||
Reverting this change
https://hg.mozilla.org/comm-central/rev/96fab4a2b811#l3.54
- clazz = (MimeObjectClass *)&mimeInlineTextHTMLClass;
+ clazz = (MimeObjectClass *)&mimeInlineTextHTMLParsedClass;
so reverting back to the original HTML inline class, fixes the issue. Of course it also reverts the EFAIL fix :-(
So there's something in that new class that's wrong.
Assignee | ||
Comment 14•7 years ago
|
||
Looking into this a little further with attachment 8986821 [details].
Previously, when removing attachments, the HTML was just copied over to the new message that didn't have the attachment. Now it's run through the HTML parser and serialised again. The effect is that the HTML is changed in the process.
Here a comparison:
Before Now
NBSP character encoded in UTF-8
we’ve we’ve encoded in UTF-8
There's also a case of
wasn̵=
7;t
where the code is broken into two lines and that gets messed up completely and we get: wasn̵7;t, yes some funny character after the n.
I think it's basically very wrong to subject a message to parsing and serialising of its HTML, just to copy that part to a new message where the attachments will be replaced.
So the fix would be to find the code-path that does that and switch that back to the previous HTML class which is still available in the system.
Assignee | ||
Comment 15•7 years ago
|
||
This fixes the issue. Looking at
https://dxr.mozilla.org/comm-central/rev/3bcadb80b56798587d199b6cbd4d80807c0ad619/mailnews/mime/public/nsIMimeStreamConverter.idl#19
I wonder whether other output types should be exempt from using the parsed class. When importing the test message I saw nsMimeMessageFilterSniffer = 12, I saw 2 for display, 4 for reply and 6 for forward.
Looking at
https://dxr.mozilla.org/comm-central/rev/3bcadb80b56798587d199b6cbd4d80807c0ad619/mailnews/mime/src/mimei.cpp#420
convinced myself to add the filter sniffer and others to the list.
Assignee: nobody → jorgk
Attachment #8990519 -
Attachment is obsolete: true
Status: NEW → ASSIGNED
Attachment #8990526 -
Flags: review?(mkmelin+mozilla)
Attachment #8990526 -
Flags: review?(ben.bucksch)
Attachment #8990526 -
Flags: review?(acelists)
Assignee | ||
Comment 16•7 years ago
|
||
Assignee | ||
Comment 17•7 years ago
|
||
Got a tip from FRG to add parenthesis. Maybe more noise here will wake up the reviewers during cucumber time (https://en.wikipedia.org/wiki/Silly_season).
Attachment #8990526 -
Attachment is obsolete: true
Attachment #8990526 -
Flags: review?(mkmelin+mozilla)
Attachment #8990526 -
Flags: review?(ben.bucksch)
Attachment #8990526 -
Flags: review?(acelists)
Attachment #8990545 -
Flags: review?(mkmelin+mozilla)
Attachment #8990545 -
Flags: review?(ben.bucksch)
Attachment #8990545 -
Flags: review?(acelists)
Assignee | ||
Comment 18•7 years ago
|
||
Didn't even compile without the parenthesis, so here again:
https://treeherder.mozilla.org/#/jobs?repo=try-comm-central&revision=eed585bdaeb665dee51e1832dcba75de136fb418
Assignee | ||
Comment 19•7 years ago
|
||
OK, one test failed, I had to revert the first part of
https://hg.mozilla.org/comm-central/rev/a2472fbf9a2c07e95419cfefa1eb5dd3cd396d88
Now it passes.
Note that this bug keeping the entire project on tenterhooks, we had to scrap TB 60 beta 10 and purge TB 52.9 from our websites so it's no longer handed out to users.
Attachment #8990545 -
Attachment is obsolete: true
Attachment #8990545 -
Flags: review?(mkmelin+mozilla)
Attachment #8990545 -
Flags: review?(ben.bucksch)
Attachment #8990545 -
Flags: review?(acelists)
Attachment #8990577 -
Flags: review?(mkmelin+mozilla)
Attachment #8990577 -
Flags: review?(ben.bucksch)
Attachment #8990577 -
Flags: review?(acelists)
Assignee | ||
Comment 20•7 years ago
|
||
Discussed with Aceman via IRC which other types might want to use the previous HTML class.
https://dxr.mozilla.org/comm-central/rev/a75878fe1f574a55f9e03ac27027f11e5fa336e3/mailnews/mime/src/nsStreamConverter.cpp#520
gives a clue. nsMimeOutput::nsMimeMessageSource might apply to HTML processing, but nsMimeOutput::nsMimeMessageRaw might need to be added. BTW, nsMimeOutput::nsMimeUnknown is unused.
Assignee | ||
Comment 22•7 years ago
|
||
I did some more testing: Without the patch, when saving a message as HTML, not the original HTML is saved, but the parsed/serialised version, so that's undesired.
nsMimeMessageRaw(5) happens with nsMimeMessageBodyDisplay(2) on a multipart/related message with HTML+image. So I don't think we want the original behaviour for "raw".
nsMimeMessageDecrypt is used in nsMsgAttachmentHandler::SnarfMsgAttachment, so I think that should have the original HTML returned, same as the filter sniffer.
So I think the patch is right.
![]() |
||
Comment 23•7 years ago
|
||
Comment on attachment 8990577 [details] [diff] [review]
1473893-no-parsing-for-save.patch (v1b) + fixed test
Review of attachment 8990577 [details] [diff] [review]:
-----------------------------------------------------------------
So I can confirm this at least fixes the attachment detaching.
The other modes also sound reasonable to use the original message source, so let's try to add them.
Attachment #8990577 -
Flags: review?(acelists) → review+
Assignee | ||
Comment 24•7 years ago
|
||
Same code, clarified comment a bit.
Attachment #8990577 -
Attachment is obsolete: true
Attachment #8990577 -
Flags: review?(mkmelin+mozilla)
Attachment #8990577 -
Flags: review?(ben.bucksch)
Attachment #8990621 -
Flags: review+
Comment 25•7 years ago
|
||
Pushed by mozilla@jorgk.com:
https://hg.mozilla.org/comm-central/rev/16adae064e80
Don't use parsed HTML MIME class if saving, processing attachments and in other cases. r=aceman
Status: ASSIGNED → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Assignee | ||
Updated•7 years ago
|
Target Milestone: --- → Thunderbird 63.0
Assignee | ||
Updated•7 years ago
|
Attachment #8990621 -
Flags: approval-comm-esr60+
Attachment #8990621 -
Flags: approval-comm-beta+
Assignee | ||
Comment 26•7 years ago
|
||
TB 60 beta 10 (BETA_60_CONTINUATION branch):
https://hg.mozilla.org/releases/comm-beta/rev/78780e0cc6f537bb75866da19a1d764373be99e5
Beta (TB 62):
https://hg.mozilla.org/releases/comm-beta/rev/598a2b0bee38536a38c89d448479b04e4670070f
status-thunderbird60:
--- → fixed
status-thunderbird61:
--- → wontfix
status-thunderbird62:
--- → fixed
status-thunderbird63:
--- → fixed
status-thunderbird_esr60:
--- → affected
Assignee | ||
Comment 27•7 years ago
|
||
status-thunderbird_esr52:
--- → fixed
Comment 28•7 years ago
|
||
question - I'm wondering how this squeaked past unit tests and I haven't followed closely enough - is this a case where delete/detach testcase succeeded in both PRE-bug-1419417 unit tests and bug-1419417 unit tests? I.E. both testcases were flawed?
Flags: needinfo?(jorgk)
Assignee | ||
Comment 29•7 years ago
|
||
There appears to be a unit test
https://dxr.mozilla.org/comm-central/source/mailnews/base/test/unit/test_detachToFile.js
that tests detaching attachments. Briefly glancing at it, I see that it checks for "AttachmentDetached" in the resulting message. It certainly doesn't check the body of the resulting e-mail. So the test is somewhat deficient.
In general, I'm surprised on a regular basis how much of TB's functionality is covered by tests, but there is also a lot of functionality *not* covered by tests. All the security fixes were rushed through review and release and none of them is covered by any test.
Flags: needinfo?(jorgk)
Assignee | ||
Comment 31•7 years ago
|
||
Another test case motivated by bug 1474330.
Assignee | ||
Comment 33•7 years ago
|
||
You need to log in
before you can comment on or make changes to this bug.
Description
•