
utf-7 encoded characters are not interpreted properly in TB 5.0.

RESOLVED FIXED in Thunderbird 12.0

Status

MailNews Core
Internationalization
--
major
RESOLVED FIXED
6 years ago
5 years ago

People

(Reporter: mm-muell, Assigned: smontagu)

Tracking

({dataloss, regression})

Thunderbird 12.0
x86_64
Windows 7
dataloss, regression
Bug Flags:
in-testsuite ?

Thunderbird Tracking Flags

(thunderbird7-, thunderbird8-, thunderbird9-, thunderbird10+ fixed, thunderbird11 fixed)

Details

Attachments

(3 attachments)

(Reporter)

Description

6 years ago
Created attachment 551352 [details]
thunderbirt-utf7-bug.zip

User Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:5.0) Gecko/20100101 Firefox/5.0
Build ID: 20110615151330

Steps to reproduce:

I updated from TB 3.1.11 to 5.0 and tried to read utf-7 encoded mail.


Actual results:

UTF-7 formatted mail such as the attached sample is no longer displayed properly since the update. Screenshots of the same (included) sample mail being displayed are in the attached file:

*Garbled interpretation by TB5
*An OK one by TB 3.1.11 (on second machine).

The original environment was Windows 7-64. I verified the problem with a clean profile and under Linux (Knoppix).


Expected results:

The character interpretation should have stayed the same as before!
The problem seems to sit fairly deep. HTML mail is totally garbled, as "<" becomes "+ADw+", with tags no longer being recognized as such. (The attached example is plain text to keep things simple.)
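
For reference, a minimal sketch of the encoding at issue, written in Python with the standard library's `utf-7` codec (this is not Thunderbird code, and the sample byte strings are invented for illustration). Under RFC 2152 a character that cannot be written directly is emitted as `+`, then modified base64 of its UTF-16 code units, then usually a `-` terminator, so "<" normally appears as `+ADw-`.

```python
# Illustrative only: Python's built-in codec, not Thunderbird's decoder.
encoded_lt = b"+ADw-"           # UTF-7 form of "<" (U+003C)
encoded_tag = b"+ADw-b+AD4-"    # UTF-7 form of "<b>"

decoded_lt = encoded_lt.decode("utf-7")
decoded_tag = encoded_tag.decode("utf-7")

# A reader that ignores the declared charset shows the raw ASCII bytes
# instead, which is the kind of garbling described in this report.
undecoded_tag = encoded_tag.decode("ascii")

print(decoded_lt, decoded_tag, undecoded_tag)
```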
(Reporter)

Updated

6 years ago
Severity: normal → critical
Assignee: nobody → smontagu
Severity: critical → normal
Component: General → Internationalization
Product: Thunderbird → MailNews Core
QA Contact: general → i18n
I think UTF-7 was dropped as a recognized encoding for the HTML parser, though I'm not certain.

Comment 2

6 years ago
bug 414064 dropped support for utf7
but there is bug 587475
(Reporter)

Comment 4

6 years ago
Addendum: If one replies to utf-7 encoded mail, the result is utf-7 encoded itself. Apparently TB5 can write utf-7. It just can't read it, not even in its own "sent" folder. This seems kind of odd / half-done.

The removal of utf-7 support in an automatic update is bad. I am losing quite a lot of mail history. There is no easy way back to 3.1.11 because the Lightning plugin updated itself, too, including the calendar data (which is no longer backward-compatible).
tracking-thunderbird7: --- → ?
Assignee: smontagu → dbienvenu
tracking-thunderbird7: ? → -
tracking-thunderbird8: --- → +

Comment 5

6 years ago
Created attachment 561056 [details]
Another example using utf-7

UTF-7 is still used, at least some mailservers send delivery reports using it.
This is such a notification (edited to exclude confidential info).

Removing support for UTF-7 is fine as long as you don't allow new content to be created with it, but removing the ability to read old messages (and not-too-old ones issued by still-functioning software) isn't acceptable. This raises the old question of one's ability to read old documents saved in proprietary file formats, which is being addressed by new international standards (e.g. ODF). This move does the opposite: it makes a document composed in a (once) standard format unreadable.
And yet another thought: while striving to implement HTML5, you aren't going to drop support for HTML4 and older, are you? Your implementation of HTML5 may be completely UTF-7-clean, but you could leave UTF-7 support in other places.

By the way, this message shows another bug in displaying an attached rfc822 message. As you can see, its attachments are shown as the main message's attachments (this is a separate issue; I'll check whether it is already filed, or file a new bug).
Removing support for reading UTF-7 wasn't intentional and is why this bug exists. Obviously when bug 414064 and bug 587475 landed we managed to break something, which is why we're tracking this for TB 8 and we'll try and get it fixed there.
Blocks: 587475
Status: UNCONFIRMED → NEW
Ever confirmed: true
Unfortunately we've not been able to fix this in time for 8, so we'll shoot for 9 instead.
tracking-thunderbird8: + → -
tracking-thunderbird9: --- → +
(Assignee)

Comment 8

5 years ago
Created attachment 572568 [details] [diff] [review]
Patch

*IF* we are OK with supporting UTF-7 even in HTML messages, this is relatively straightforward: we just have to use GetUnicodeDecoderInternal all the time.

Last time I tried this, view source didn't work with UTF-7 messages, but now it does, presumably because of the change to use the HTML5 parser in view source. The UTF-7 isn't decoded in view source, but that is no worse than the status quo with quoted-printable and base64 encoded messages.
Assignee: dbienvenu → smontagu
Attachment #572568 - Flags: review?(dbienvenu)

Comment 9

5 years ago
Simon, thx for the patch. My understanding is that we don't do utf-7 in html in the browser because of xss exploits. I assume e-mail is not vulnerable because we don't have js turned on. rss feed messages, on the other hand, might be vulnerable, except that I'm not sure how much of the feed content actually goes through libmime. Cc'ing dveditz to see if this scares him.

I think it should be possible to not do utf-7 decoding in html parts, though it is libmime, so nothing is easy. Do you know for sure that with this patch we actually do utf-7 decoding of html?
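
The XSS concern above can be sketched in Python as follows. This is a hypothetical illustration, not code from Gecko or libmime: `naive_has_script_tag` is an invented helper standing in for any filter that inspects raw bytes before charset decoding, and the payload is a made-up example.

```python
def naive_has_script_tag(raw: bytes) -> bool:
    """Hypothetical filter: looks for a literal "<script" in the raw bytes."""
    return b"<script" in raw.lower()

# UTF-7 spelling of "<script>alert(1)</script>", using only ASCII bytes
# and containing no literal "<" or ">".
payload = b"+ADw-script+AD4-alert(1)+ADw-/script+AD4-"

# The filter runs on the undecoded bytes and finds nothing suspicious...
seen_by_filter = naive_has_script_tag(payload)

# ...but a consumer that honours charset=utf-7 ends up with real markup.
after_decoding = payload.decode("utf-7")

print(seen_by_filter, after_decoding)
```

Whether this matters for mail depends, as noted, on whether script can run in the rendered content at all.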
(Assignee)

Comment 10

5 years ago
> I think it should be possible to not do utf-7 decoding in html parts, though
> it is libmime, so nothing is easy. Do you know for sure that with this patch
> we actually do utf-7 decoding of html?

Yes, I've tested it with plain text and html messages (but not rss feeds, I have to admit ;-) 
I think it shouldn't be too hard to exclude HTML messages from UTF-7 decoding, but note that comment 0 does specifically mention HTML messages as part of the problem.

Comment 11

5 years ago
(In reply to Simon Montagu from comment #10)

> I think it shouldn't be too hard to exclude HTML messages from UTF-7
> decoding, but note that comment 0 does specifically mention HTML messages as
> part of the problem.

It does, but the message is actually of type text/plain. I'm curious if there's really text/html mail out there with utf-7. But in any case, I do appreciate the argument that we shouldn't break old mail, and as long as dveditz is OK with this, I'm ok with it.

Comment 12

5 years ago
(In reply to David :Bienvenu from comment #11)

> It does, but the message is actually of type text/plain. I'm curious if
> there's really text/html mail out there with utf-7.

Yes, those text/html emails do exist. We are still receiving quite a few of them:

[...]
X-Mailer: Microsoft Office Outlook 11
X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2900.5931
[...]
------=_NextPart_001_0013_01CC9E01.3F5BABB0
Content-Type: text/html;
	charset="utf-7"
Content-Transfer-Encoding: quoted-printable
[...]
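
A part declared like this can be reproduced for testing with a short Python sketch using the standard `email` package. The message below is an invented single-part sample modeled on the headers above, not a real Outlook message, and the sketch says nothing about how libmime handles the same input.

```python
from email import message_from_string, policy

# Invented sample modeled on the headers quoted above; not a real message.
raw_message = (
    "MIME-Version: 1.0\n"
    'Content-Type: text/html; charset="utf-7"\n'
    "Content-Transfer-Encoding: quoted-printable\n"
    "\n"
    "+ADw-b+AD4-Hi+ADw-/b+AD4-\n"
)

msg = message_from_string(raw_message, policy=policy.default)
charset = msg.get_content_charset()

# get_content() undoes quoted-printable first, then decodes the bytes
# using the declared charset (here utf-7).
html = msg.get_content().strip()

print(charset, html)
```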

Comment 13

5 years ago
Comment on attachment 572568 [details] [diff] [review]
Patch

We could probably have a simple unit test for this, like mailnews/mime/test/unit/test_mimeStreaming.js, except that you'd have to verify the results, which that test doesn't do currently.
Attachment #572568 - Flags: review?(dbienvenu) → review+
Are we waiting for the test before checking this in?

Comment 15

5 years ago
No, we can check this in, I think.
Keywords: checkin-needed
tracking-thunderbird10: --- → +
tracking-thunderbird9: + → -
Severity: normal → major
Status: NEW → ASSIGNED
status-thunderbird10: --- → affected
status-thunderbird11: --- → affected
tracking-thunderbird11: --- → ?
tracking-thunderbird12: --- → ?
Flags: in-testsuite?
Keywords: dataloss, regression
There's one thing I've been asked to test with this, which I'll do in a little while, so please don't land just yet.
status-thunderbird10: affected → ---
status-thunderbird11: affected → ---
tracking-thunderbird11: ? → ---
tracking-thunderbird12: ? → ---
Keywords: checkin-needed
Checked in: http://hg.mozilla.org/comm-central/rev/dc9e0a572606
Status: ASSIGNED → RESOLVED
Last Resolved: 5 years ago
Resolution: --- → FIXED
Target Milestone: --- → Thunderbird 12.0
Attachment #572568 - Flags: approval-comm-beta+
Attachment #572568 - Flags: approval-comm-aurora+
Checked into branches:

http://hg.mozilla.org/releases/comm-aurora/rev/93f5b1f340bb
http://hg.mozilla.org/releases/comm-beta/rev/88b6a044983c
status-thunderbird10: --- → fixed
status-thunderbird11: --- → fixed