Closed Bug 1146099 Opened 5 years ago Closed 3 years ago

Behaviour of malformed rfc2047 encoded Subject message header inconsistent

Categories

(Thunderbird :: Mail Window Front End, defect)

x86_64
Windows 7
defect
Not set

Tracking

(thunderbird47 wontfix, thunderbird48 wontfix, thunderbird49 fixed, thunderbird50 fixed, thunderbird_esr38 wontfix, thunderbird_esr45- wontfix)

RESOLVED FIXED
Thunderbird 50.0
Tracking Status
thunderbird47 --- wontfix
thunderbird48 --- wontfix
thunderbird49 --- fixed
thunderbird50 --- fixed
thunderbird_esr38 --- wontfix
thunderbird_esr45 - wontfix

People

(Reporter: jorgk, Assigned: jorgk)

References

(Blocks 1 open bug)

Details

Attachments

(5 files, 4 obsolete files)

A mal-formed RFC2047 subject header is decoded in the message list, the message summary, the tab and when saving as EML file, but not the message header, see enclosed image.

It's mal-formed since it contains a "closing single quote" (’) (U+2019).
TB displays this in three different ways.
Enclosed two messages with mal-formed subjects.

The second one, the "Schnapp-broken" is from bug 506927. It has an illegal additional "= " in the header.

TB copes with the malformed headers as well as possible, only in the message header display in the message pane they look bad.
Flags: needinfo?(Pidgeot18)
I was surprised to see that there are two RFC2047 decoders at work here. The new JSMime implementation and an old C++ implementation in comm-central/mozilla/netwerk/mime.

This latter one doesn't perform as well as the new implementation, for "bad" headers, it passes the header back, as the following debug print shows:

Bad header:
=== (nsMIMEHeaderParamImpl) DecodeRFC2047Str: Header >>>=?windows-1252?Q?Announcing_India’s_First_Regional_Hospitality_Awards?=<<<
=== (nsMIMEHeaderParamImpl) DecodeRFC2047Str: Result >>>=?windows-1252?Q?Announcing_India’s_First_Regional_Hospitality_Awards?=<<<

Good header (replaced the ’ with a '):
=== (nsMIMEHeaderParamImpl) DecodeRFC2047Str: Header >>>=?windows-1252?Q?Announcing_India's_First_Regional_Hospitality_Awards?=<<<
=== (nsMIMEHeaderParamImpl) DecodeRFC2047Str: Result >>>Announcing India's First Regional Hospitality Awards<<<
OK, I cheated a bit on the debug output. What is returned from DecodeRFC2047Str for the ’ is
xE2, x80, x99. This the UTF-8 encoding for U+2019:
[1110] 0010 - [10]00 0000 - [10]01 1001, so stripping the part in [] away we have:
0010 0000 0001 1001 (decimal: 2019).

The big question is: When will JSMime replace the C++ code in comm-central/mozilla/netwerk/mime?
I noticed that in
https://dxr.mozilla.org/comm-central/source/mail/base/content/msgHdrViewOverlay.js#78
we don't provide an output function for the subject header.
Is this the place to ensure a "decent" display?

I still haven't figured out where <mail-headerfield id="expandedsubjectBox" ... receives its content.
Summary: Behaviour of mal-formed rfc2047 encoded Subject message header inconsistent → Behaviour of malformed rfc2047 encoded Subject message header inconsistent
Blocks: TB38found
Kent: Sorry to trouble you with this one. Maybe you can take a look. Perhaps it's just another omission (like in bug 1166206 and bug 1130248) and an easy fix. Maybe not all places where "glue" to the new JSMIME was necessary were covered. If it's easy, we should consider inclusion in TB38. A quick look at attachment 8581251 [details] demonstrates the problem.
Flags: needinfo?(rkent)
Blocks: RFC2047
See Also: → 1224426, 1204499
Flags: needinfo?(rkent)
Flags: needinfo?(Pidgeot18)
Duplicate of this bug: 1279804
Related/dupe: bug 1023285
Duplicate of this bug: 1285481
See Also: → 1023285
(In reply to Jorg K (GMT+2, PTO during summer, NI me) from comment #11)
> So this bug here continues to be about the different display in tread pane
> (using jsmime) and message header in message pane.
More precisely:
This bug here continues to be about the different display of a malformed *subject* in tread pane (using jsmime) and message header in message pane (apparently not yet using jsmime, see comment #2).

Bug 1023285 fixed the different display of the *from/to* headers using raw UTF-8 which was a different problem. For the subject, raw UTF-8 always worked. Sorry about the confusion in comment #9 (marked obsolete).
OK, since I'm revisiting this bug, here is a stack trace of how we get into the M-C netwerk code:

DecodeRFC2047Str(...) Line 1173	C++
internalDecodeRFC2047Header(...) Line 751	C++
nsMIMEHeaderParamImpl::DecodeRFC2047Header(...) Line 778	C++
MIME_DecodeMimeHeader(...) Line 34	C++  <==== Mailnews.
MimeHeaders_convert_header_value(...) Line 50	C++
MimeHeaders_write_all_headers(...) Line 603	C++
MimeMessage_write_headers_html(...) Line 776	C++
MimeMessage_close_headers(...) Line 433	C++
MimeMessage_parse_line(...) Line 251	C++
convert_and_send_buffer(...) Line 152	C++
mime_LineBuffer(...) Line 238	C++
MimeObject_parse_buffer(...) Line 240	C++
mime_display_stream_write(...) Line 990	C++
nsStreamConverter::OnDataAvailable(...) Line 938	C++
nsDocumentOpenInfo::OnDataAvailable(...) Line 303	C++
nsMailboxProtocol::ReadMessageResponse(...) Line 589	C++
nsMailboxProtocol::ProcessProtocolState(...) Line 675	C++
nsMsgProtocol::OnDataAvailable(...) Line 293	C++

It's called from mailnews/mime/src/comi18n.cpp

I'm not sure why mimehdrpar->DecodeRFC2047Header()
https://dxr.mozilla.org/comm-central/source/mailnews/mime/src/comi18n.cpp#32

isn't replaced with a call to decodeMimeHeader()
https://dxr.mozilla.org/comm-central/source/mailnews/mime/public/nsIMimeConverter.idl#70
https://dxr.mozilla.org/comm-central/source/mailnews/mime/src/mimeJSComponents.js#482
Attached patch Proposed solution (v1). (obsolete) — Splinter Review
Sorry Kent, more work for you.

Inspired by bug 1023285, I have returned to this bug here, my old pet hate, and finally got to the bottom of it.

The fix is to replace the use of the M-C netwerk stuff with jsmime. This has happened in many other locations, but not here.

Since this is used for the display of the message header in the message pane, there is no performance problem. The tread pane already uses jsmime.
Assignee: nobody → mozilla
Status: NEW → ASSIGNED
Attachment #8769612 - Flags: review?(rkent)
Attached patch Proposed solution (v1a). (obsolete) — Splinter Review
Removed unnecessary include.
Attachment #8769612 - Attachment is obsolete: true
Attachment #8769612 - Flags: review?(rkent)
Attachment #8769779 - Flags: review?(rkent)
Attached file Another test
Here's another test. To/From/Subject look bad in message pane, but good with the patch here and from bug 1023285.
A picture says more than 1000 words ;-)
How about these:
1. From: Homer Simpson
2. From: <Homer Simpson>
3. From: Homer Simpson:;

Only #3 is spec compliant, and if the first 2 are issued by an MTA, it's just wrong.  But in fact, #1 and #2 are/were used in feeds..  A separate bug, sure ;)
Comment on attachment 8769779 [details] [diff] [review]
Proposed solution (v1a).

Review of attachment 8769779 [details] [diff] [review]:
-----------------------------------------------------------------

I made one comment about what to me is premature optimization.

But the real issue is that I am very uncomfortable reviewing this, as I don't really know the differences between the two code variants, and what the issues might be in doing this change. You are proposing a change in underlying method in a really commonly used code path. My fear is that you are solving one regression at the risk of introducing many others.

I could review this as a reviewer of last resort, but is there anyone else who might have more knowledge of this?

::: mailnews/mime/src/comi18n.cpp
@@ +23,5 @@
>                             bool override_charset, bool eatContinuations,
>                             nsACString &result)
>  {
> +  // Since we are in C code here, do some basic caching with a static.
> +  static bool triedToGetConverter = false;

We do this sort of getting of XPCOM components all over the place without caching it. The pattern that you are introducing is very rare in other Gecko code, and I am reluctant to introduce a new pattern, with possible unforeseen consequences, unless there is a clear indication that this is a performance bottleneck, and the proposed fix has a clear benefit.
(In reply to Kent James (:rkent) from comment #19)
> But the real issue is that I am very uncomfortable reviewing this, as I
> don't really know the differences between the two code variants, and what
> the issues might be in doing this change. You are proposing a change in
> underlying method in a really commonly used code path. My fear is that you
> are solving one regression at the risk of introducing many others.
Not really true. That jsmime method is used all over the place now:
https://dxr.mozilla.org/comm-central/search?q=mimeConverter-%3EDecodeMimeHeader&redirect=false

If you look at
https://dxr.mozilla.org/comm-central/source/mailnews/db/msgdb/src/nsMsgDatabase.cpp#3644

you can see that there is caching going on, but in a member of the class.

> I could review this as a reviewer of last resort, but is there anyone else
> who might have more knowledge of this?
Me ;-) ?

> We do this sort of getting of XPCOM components all over the place without
> caching it.
I can remove the caching. No problem, although other code does caching, see above.
Attached patch Proposed solution (v2). (obsolete) — Splinter Review
No caching.
Attachment #8769779 - Attachment is obsolete: true
Attachment #8769779 - Flags: review?(rkent)
Attachment #8769903 - Flags: review?(rkent)
(In reply to Jorg K (GMT+2, PTO during summer, NI me) from comment #20)
> (In reply to Kent James (:rkent) from comment #19)
> You are proposing a change in
> > underlying method in a really commonly used code path. My fear is that you
> > are solving one regression at the risk of introducing many others.
> Not really true. That jsmime method is used all over the place now:
> https://dxr.mozilla.org/comm-central/search?q=mimeConverter-
> %3EDecodeMimeHeader&redirect=false
The point is this: There are two decoders, the jsmime one and the M-C::Network one.

The jsmime one is used here:
https://dxr.mozilla.org/comm-central/search?q=mimeConverter-%3EDecodeMimeHeader&redirect=false
(five times).

The M-C:Network one is used here:
https://dxr.mozilla.org/comm-central/search?q=DecodeRFC2047Header&redirect=false
(once).

That means that the entire system has been changed over to jsmime, minus the one call site in
mailnews/mime/src/comi18n.cpp 

The two decoders give different results in some cases and therefore cause inconsistencies in the UI. Additionally, the jsmime decoder works better, that means it is more error tolerant to malformed headers. That's what this bug is about.

I don't know why the last call site wasn't changed over. Maybe an oversight, or maybe is was left out deliberately. The only reason I could think of would be a threading issue. Apparently XPCOM written in JavaScript must run on Gecko's main thread; that was the issue why Outlook import doesn't work any more since jsmime was introduced.

Maybe that's why my try run
https://treeherder.mozilla.org/#/jobs?repo=try-comm-central&revision=561cc6161b81&selectedJob=20106
came put totally red with heaps of identical errors (only quoting a few here):
TEST-UNEXPECTED-FAIL | mailnews/mime/test/unit/test_hidden_attachments.js | xpcshell return code: -11
TEST-UNEXPECTED-FAIL | mailnews/mime/test/unit/test_mimeStreaming.js | xpcshell return code: -11
TEST-UNEXPECTED-FAIL | mailnews/mime/test/unit/test_rfc822_body.js | xpcshell return code: -11
TEST-UNEXPECTED-FAIL | mailnews/mime/test/unit/test_text_attachment.js | xpcshell return code: -11

So maybe "doing the right thing" and using the same decoder is technically not possible.

Magnus, do you know more about this (since Joshua won't answer)?
Flags: needinfo?(mkmelin+mozilla)
Comment on attachment 8769903 [details] [diff] [review]
Proposed solution (v2).

Clearing the r? for now since the solution seems to cause massive Xpcshell failures.
Attachment #8769903 - Flags: review?(rkent)
No idea, but it's possible it wasn't moved over because those test failures would have to be figure out first.
Flags: needinfo?(mkmelin+mozilla)
Since the review was cancelled, I'm adding NI for Kent as a reminder.

Weird, I ran some of the failing tests locally on Windows 10 and they all passed:
mozilla/mach xpcshell-test mailnews/mime/test/unit/test_rfc822_body.js
mozilla/mach xpcshell-test mailnews/mime/test/unit/test_hidden_attachments.js
mozilla/mach xpcshell-test mailnews/base/test/unit/test_bug366491.js
mozilla/mach xpcshell-test mailnews/db/gloda/test/unit/test_query_messages_imap_online_to_offline.js

The one I could get to fail locally was this one, the only one that didn't fail with error -11:
mozilla/mach xpcshell-test mailnews/mime/test/unit/test_message_attachment.js

Looks like I need help ;-(

I'm shipping this to try again, this time on Windows:
https://treeherder.mozilla.org/#/jobs?repo=try-comm-central&revision=7b68a873a081
Flags: needinfo?(rkent)
On Windows, I get exactly one test failure, the one that I can reproduce locally, the one that isn't error -11:
https://treeherder.mozilla.org/#/jobs?repo=try-comm-central&selectedJob=20174

TEST-UNEXPECTED-FAIL | mailnews/mime/test/unit/test_message_attachment.js | gStreamListener.onStopRequest - [gStreamListener.onStopRequest : 81] "testSubje.eml" == "testSubject.eml"

I can look into those, but I have no idea what the error -11 are about.
Sadly the test was wrong.
'testSubject' should be 
'=?UTF-8?B?dGVzdFN1YmplY3Q=?=' and not
'=?UTF-8?B?dGVzdFN1YmplY3Q?='
Try it here: http://dogmamix.com/MimeHeadersDecoder/
Obviously jsmime and M-C::Network decoders treated the faulty input differently.

So let's ship this off to try one more time:
https://treeherder.mozilla.org/#/jobs?repo=try-comm-central&revision=16ee96c41c5d
Attachment #8769903 - Attachment is obsolete: true
Comment on attachment 8770431 [details] [diff] [review]
Proposed solution (v2) + test fixed.

So here we go again with r?.

Surprisingly, or maybe not surprisingly, the Xpcshell test came out green on *all* platforms after fixing the one true issue. No more error -11:

https://treeherder.mozilla.org/#/jobs?repo=try-comm-central&revision=16ee96c41c5d5677d2c53bfab719d5337f4200b9
Flags: needinfo?(rkent)
Attachment #8770431 - Flags: review?(rkent)
Comment on attachment 8770431 [details] [diff] [review]
Proposed solution (v2) + test fixed.

Review of attachment 8770431 [details] [diff] [review]:
-----------------------------------------------------------------

::: mailnews/mime/src/comi18n.cpp
@@ +31,5 @@
>      return;
>    }
> +
> +  nsAutoString res;
> +  mimeConverter->DecodeMimeHeader(header, default_charset, override_charset, 

I'll fix the trailing space later.
Kent, if it's any consolation to you: I've downloaded the try build (which also includes the fix to bug 1023285, the other jsmime issue), and it works just fine. I haven't seen any adverse effects. I'd be happy to let this ride the trains for a while ;-)
Simplified string processing.
Attachment #8770431 - Attachment is obsolete: true
Attachment #8770431 - Flags: review?(rkent)
Attachment #8770579 - Flags: review?(rkent)
Comment on attachment 8770579 [details] [diff] [review]
Proposed solution (v2b) + test fixed.

Review of attachment 8770579 [details] [diff] [review]:
-----------------------------------------------------------------

OK looks good to me.

Since this was the last caller of DecodeRFC2047Header AFAICT, bonus points for filing a bug in core to remove this. Even more bonus points for adding the patch to remove this (I would have required that if the removed methods was in c-c instead of m-c).
Attachment #8770579 - Flags: review?(rkent) → review+
Thanks for the quick review. Mojibake (https://en.wikipedia.org/wiki/Mojibake) is really a pet hate of mine ;-)

https://hg.mozilla.org/comm-central/rev/32170e6fb298 --> FIXED.

I don't think I can earn bonus points here since DecodeRFC2047Header is part of Necko:
https://dxr.mozilla.org/comm-central/source/mozilla/netwerk/mime/nsIMIMEHeaderParam.idl#165

I don't think they'd want that removed but I can ask:

====

Honza, Valentin or Patrick: We eliminated the last call to nsMIMEHeaderParamImpl::DecodeRFC2047Header. Would you like to cull this or hang onto it?
Flags: needinfo?(valentin.gosu)
Flags: needinfo?(mcmanus)
Flags: needinfo?(honzab.moz)
Target Milestone: --- → Thunderbird 50.0
Comment on attachment 8770579 [details] [diff] [review]
Proposed solution (v2b) + test fixed.

[Approval Request Comment]
Regression caused by (bug #): Bug 858337 (jsmime)
User impact if declined: https://en.wikipedia.org/wiki/Mojibake
Testing completed (on c-c, etc.): Yes, in test suite, see comment #28.
Risk to taking this patch (and alternatives if risky):
Some risk, that's why I'm uplifting it to TB 49 so it can get into the next beta early.
Attachment #8770579 - Flags: approval-comm-aurora+
(oops, forgot to mark it fixed in comment #34).
Status: ASSIGNED → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED
Blocks: 858337
if you searched the addons repo for it you can make a patch to remove it.

otherwise I would just let it hang there until the great web-extensions migration

thanks for asking
Flags: needinfo?(mcmanus)
Flags: needinfo?(valentin.gosu)
Flags: needinfo?(honzab.moz)
(In reply to Kent James (:rkent) from comment #33)
> Since this was the last caller of DecodeRFC2047Header AFAICT, bonus points
> for filing a bug in core to remove this.
So I earned bonus points for filing bug 1286792. I might submit a patch there, too.
Duplicate of this bug: 1204499
Comment on attachment 8770579 [details] [diff] [review]
Proposed solution (v2b) + test fixed.

[Approval Request Comment]
This is tracking-tb45 and has been through beta. Any reason not to uplift?
Attachment #8770579 - Flags: approval-comm-esr45?
I didn't want to annoy the ESR45 manager ;-)
if the question is for me - in general it's great to have patches, but I'll simply restate my opinion that we should avoid taking patches for things that lack more than marginal impact to the customer base.  

As to this specific patch, I've skimmed some comments (not read the whole bug) and what I read doesn't seem to indicate high value to 45.x customers. If I am incorrect, free free to uplift
Comment on attachment 8770579 [details] [diff] [review]
Proposed solution (v2b) + test fixed.

OK I've looked this over, and as far as I can tell it is a bug with some risk, whose goal is to have more consistent handling of faulty input. That sounds to me like a risk not worth taking.
Attachment #8770579 - Flags: approval-comm-esr45? → approval-comm-esr45-
I don't want to argue, but bug 1023285, which you took, seems way more risky.
Yes but it also had 6 duplicates, so it was clear that somebody cared about it.

But these are all judgement calls and I cannot claim perfect consistency.
Depends on: 1386585
You need to log in before you can comment on or make changes to this bug.