Note: There are a few cases of duplicates in user autocompletion which are being worked on.

Behaviour of malformed rfc2047 encoded From message header inconsistent

RESOLVED FIXED in Thunderbird 40.0

Status

MailNews Core
MIME
RESOLVED FIXED
2 years ago
2 years ago

People

(Reporter: Jorg K (GMT+2), Assigned: Jorg K (GMT+2))

Tracking

({regression})

Thunderbird 40.0
regression

Thunderbird Tracking Flags

(thunderbird38+ fixed, thunderbird39 fixed, thunderbird40 fixed, thunderbird_esr31 unaffected)

Details

Attachments

(9 attachments, 8 obsolete attachments)

18.57 KB, image/png
Details
25.61 KB, application/zip
Details
18.90 KB, image/png
Details
5.88 KB, application/zip
Details
429.05 KB, application/x-7z-compressed
Details
100.98 KB, image/png
Details
3.62 KB, patch
Details | Diff | Splinter Review
3.85 KB, patch
Details | Diff | Splinter Review
3.86 KB, patch
rkent
: review+
Details | Diff | Splinter Review
(Assignee)

Description

2 years ago
Created attachment 8575126 [details]
Screenshot of 8 imported messages. 4 imported with TB 31, four with TB 38

rfc2047 encoded message headers, which are mal-formed, are treated inconsistently by TB. Enclosed are four messages with this from header:

From: "=?UTF-8?Q?S=C3=A1nchez_Edgar?= edgsanchezg@hotmail.com [Barcelona-Freecycle]" <Barcelona-Freecycle-noreply@yahoogroups.com>

The quotes should not be there.

In the *message list* for one message TB displays the undecoded text, for the others it shows "Sánchez Edgar". This behaviour is consistent for TB 24 to TB 38. I tried 24, 31, 36 und 38.

In the message preview or when opening the message, the different versions do different things: TB 24 and TB 31 show
  From "Sánchez Edgar"
TB 36 and further show the undecoded text.

Import from EML files:
Importing into TB 31, all messages are now displayed good.
Importing into TB 36, newly imported messages display bad, the ones imported before with TB 31 still show good, see screenshot.

This is very confusing and inconsistent behaviour.

It would be good if all messages were always displayed well, despite the mal-formed headers.
(Assignee)

Comment 1

2 years ago
Created attachment 8575127 [details]
Archive containing for messages as shown on the screenshot
(Assignee)

Comment 2

2 years ago
Another observation:
If in TB 31, 36 or 38 you select a set of these messages, so a summary is displayed in the preview pane, the decoded "Sánchez Edgar" is shown for all messages (however, in TB 36 the accented character doesn't show correctly, most likely due to bug 1082896, which is fixed in TB 38).
(Assignee)

Comment 3

2 years ago
There is something really weird going on here. In TB 36 I revisited the messages previously showing correctly, and now they are *all* showing bad.

This is the reproducible case:
Import the four attached messages from the ELM files into TB 31.5 using drag from a file system folder onto the TB window. They should show correctly.
Start TB 36 or 38. These message should show still correctly. Now import the same messages in the same way. They should look bad.
This is what is shown in the attached screenshot.
Flags: needinfo?(Pidgeot18)

Updated

2 years ago
Keywords: regression
(Assignee)

Comment 4

2 years ago
Would be good if TB 38 didn't look worse than TB 31 ;-)
tracking-thunderbird38: --- → ?
(Assignee)

Comment 5

2 years ago
Created attachment 8577630 [details]
Screenshot of the From address shown in two different ways

This is seriously weird. Here the From address shows correctly in the message list, it shows bad in the message pane.

I've had cases where the From header looked bad in the message list and a "Repair Folder" fixed it. I tried again, and the second time, it didn't fix it.

Hard to find a reproducible case. To me it looks like some caching problem, since at times the From header is shown correctly, other times not.
(Assignee)

Comment 6

2 years ago
OK. Things are getting clearer. The information in the message list is displayed from the MSF file. So what matters is with which version the message was received or imported. That explains the observation in comment #3.

Most likely it's just a JSMime regression. Joshua Cranmer pointed me to:
https://dxr.mozilla.org/comm-central/source/mailnews/mime/jsmime/jsmime.js#375

There we decode headers of the form
From: "=?UTF-8?Q?S=C3=A1nchez_Edgar?="
so with quotes, but =? and ?= at the start/end of the string
(if I read it correctly).

The reported case:
From: "=?UTF-8?Q?S=C3=A1nchez_Edgar?= more stuff" and even more stuff
is not handled.
(Assignee)

Updated

2 years ago
Assignee: nobody → mozilla
(Assignee)

Comment 7

2 years ago
Created attachment 8577674 [details] [diff] [review]
Proposal for discussion

This is a simple one line change to be more aggressive with rfc2047 decoding.
Please let me know what you think.
Attachment #8577674 - Flags: feedback?(Pidgeot18)
(Assignee)

Comment 8

2 years ago
Created attachment 8577675 [details]
The message in this archive still doesn't decode

Here's a message, where the decoder worked in 24.x (before jsmime) and now it doesn't. From header is:

From: "=?iso-8859-1?Q?Daniel_Rius_garcer=C3=A1n?= daniriusgarceran@yahoo.es [Barcelona-Freecycle]" <Barcelona-Freecycle-noreply@yahoogroups.com>

OK, this header is wrong, it should be UTF-8, but how come it worked before?
(Assignee)

Comment 9

2 years ago
Created attachment 8577713 [details]
A mailbox with 332 messages all with rfc2047 encoded From

Some more testing: This mailbox contains 332 messages with rfc2047 encoded From field. With the patch in attachment 8577674 [details] [diff] [review] all but one message display fine. The one that doesn't display "correctly" is the one mentioned in comment #8. Maybe we can live with the fact that a "really bad" rfc2047 now doesn't decode correctly any more.
When multiple messages are selected, the "conversations" displayed in the message pane are also displayed correctly.
(Assignee)

Updated

2 years ago
Summary: Behaviour of mal-formed rfc2047 encoded message header inconsistent → Behaviour of mal-formed rfc2047 encoded From message header inconsistent
(Assignee)

Comment 10

2 years ago
Created attachment 8584753 [details] [diff] [review]
Proposed patch
Attachment #8577674 - Attachment is obsolete: true
Attachment #8577674 - Flags: feedback?(Pidgeot18)
Attachment #8584753 - Flags: review?(Pidgeot18)
(Assignee)

Comment 11

2 years ago
Comment on attachment 8584753 [details] [diff] [review]
Proposed patch

This is really a one-line change:
Old:
if (opts.rfc2047 && text.startsWith("=?") && text.endsWith("?="))
New:
if (opts.rfc2047 && text.indexOf("=?") > -1 && text.indexOf("?=") > -1)

In other words: Instead of only doing the decoding when the text starts and ends with the rfc2047 markers =? and ?=, we always (try to) decode when these markers are found anywhere in the text.

Sorry for being impatient, but we need this for TB 38 right about *now* ;-)
Attachment #8584753 - Flags: review?(kent)
Comment on attachment 8584753 [details] [diff] [review]
Proposed patch

Review of attachment 8584753 [details] [diff] [review]:
-----------------------------------------------------------------

Just because we need it now doesn't mean you can skimp on tests.

::: mailnews/mime/jsmime/jsmime.js
@@ +372,5 @@
>          // Quoted strings don't include their delimiters.
>          let text = value.slice(tokenStart + 1, i);
>  
> +        // If RFC 2047 is enabled, always decode the qstring if a both 2047 tokens are found
> +        if (opts.rfc2047 && text.indexOf("=?") > -1 && text.indexOf("?=") > -1)

First off, you're not explaining the rationale of the test.

Secondly, the test is not correct. (It may be simpler to just unconditionally decode the string).

Finally, this needs tests.
Attachment #8584753 - Flags: review?(kent)
Attachment #8584753 - Flags: review?(Pidgeot18)
Attachment #8584753 - Flags: review-
(Assignee)

Comment 13

2 years ago
Can you propose a patch then please.
I'm not at all familiar with (automated?) testing in Thunderbird.
(Assignee)

Updated

2 years ago
Assignee: mozilla → nobody
(Assignee)

Comment 14

2 years ago
Created attachment 8586091 [details] [diff] [review]
Proposed patch, second attempt

Here is a second patch with unconditional decoding.
This *may* be what Joshua Cranmer had in mind, sadly, he didn't give exact instructions.

I don't know how to add an automated test. I tested it manually with the mailbox in attachment 8577713 [details]: Delete the MSF and open the mailbox. All sender addresses look good.

As Joshua confirmed in his comment #12, we need this now, or else a problem will regress into TB 38. I believe we should be pragmatic and if need be, add a test later.

Sadly I can't do anything else, so I unassigned myself from the bug.
(Assignee)

Comment 15

2 years ago
Created attachment 8586095 [details]
User's should not get their mailbox displayed like this
(In reply to Jorg K from comment #13)
> Can you propose a patch then please.
> I'm not at all familiar with (automated?) testing in Thunderbird.

Fortunately, this is easy for this sort of patch to jsmime: just add some extra entries to the array here:
<https://dxr.mozilla.org/comm-central/source/mailnews/mime/jsmime/test/test_header.js#341>.

You can test that the tests pass with |mozilla/mach xpcshell-test mailnews/mime|
Flags: needinfo?(Pidgeot18)
(Assignee)

Comment 17

2 years ago
OK, thanks, but I still don't know where in the file you want the tests to go.
Here are the tests, can you please add them to avoid further delay (I will be travelling for the next few days):

Header, includes the quotes:
From: "=?UTF-8?Q?Jordi_Fern=C3=A1ndez_Nieto?= jordi.fernandez.nieto@gmail.com [Barcelona-Freecycle]"
From: "=?UTF-8?B?TWlyaWFtIEJlcm5hYsOpIFBlcmVsbMOz?= miriambernabe@gmail.com [Barcelona-Freecycle]"
From: "=?iso-8859-1?Q?Juana_Mar=EDa_Furi=F3_Sancho?= shangaguay@yahoo.es [Barcelona-Freecycle]"
From: "=?iso-8859-1?B?U29maWEgQ2FzdGVsbPMgUm9tZXJv?= soniasofia2@hotmail.com [Barcelona-Freecycle]"

Expected:
Jordi Fernández Nieto jordi.fernandez.nieto@gmail.com [Barcelona-Freecycle]
Miriam Bernabé Perelló miriambernabe@gmail.com [Barcelona-Freecycle]
Juana María Furió Sancho shangaguay@yahoo.es [Barcelona-Freecycle]
Sofia Castelló Romero soniasofia2@hotmail.com [Barcelona-Freecycle]
(Assignee)

Comment 18

2 years ago
Created attachment 8586986 [details] [diff] [review]
Proposed patch, third attempt, now with tests

If you don't like the comments, I'll fix them.
Please tell me *exactly* what you want.
Attachment #8586091 - Attachment is obsolete: true
Attachment #8586986 - Flags: review?(Pidgeot18)
(Assignee)

Updated

2 years ago
Component: Mail Window Front End → MIME
Product: Thunderbird → MailNews Core
Target Milestone: --- → Thunderbird 38.0
Version: 31 → 38
Comment on attachment 8586986 [details] [diff] [review]
Proposed patch, third attempt, now with tests

Review of attachment 8586986 [details] [diff] [review]:
-----------------------------------------------------------------

Looks good, but the patch does need some nits fixed before it can land.

::: mailnews/mime/jsmime/jsmime.js
@@ +371,5 @@
>        if (ch == endQuote && ch == '"') {
>          // Quoted strings don't include their delimiters.
>          let text = value.slice(tokenStart + 1, i);
>  
> +        // If RFC 2047 is enabled, always decode the qstring

Nit: period.

::: mailnews/mime/jsmime/test/test_header.js
@@ +349,5 @@
>        ["=?UTF-8?Q?<?= <a@b.c>", [{name: "<", email: "a@b.c"}]],
>        ["Simple =?UTF-8?Q?<?= a@b.c>",
>          [{name: "", email: '"Simple < a"@b.c'}]],
>        ["Tag <=?UTF-8?Q?email?=@b.c>", [{name: "Tag", email: "email@b.c"}]],
> +	  // Bug 1141446: mal-formed From addresses with erroneous quotes

Nit: "malformed"

Also, please replace the tabs with spaces, and keep to an 80-character maximum line length.
Attachment #8586986 - Flags: review?(Pidgeot18) → feedback+
(Assignee)

Comment 20

2 years ago
Created attachment 8594393 [details] [diff] [review]
Proposed patch, after review

Thanks!
Sorry about the tabs.
The longest line is now 85 characters long, I hope that's acceptable as line 104 was already 86 characters long (I thought the days of the teletype (24x80) are over) ;-)
Attachment #8584753 - Attachment is obsolete: true
Attachment #8586986 - Attachment is obsolete: true
Attachment #8594393 - Flags: review?(Pidgeot18)
(Assignee)

Updated

2 years ago
Summary: Behaviour of mal-formed rfc2047 encoded From message header inconsistent → Behaviour of malformed rfc2047 encoded From message header inconsistent
(In reply to Jorg K (at the beach until 22nd April) from comment #20)
> Thanks!
> Sorry about the tabs.
> The longest line is now 85 characters long, I hope that's acceptable as line
> 104 was already 86 characters long (I thought the days of the teletype
> (24x80) are over) ;-)

I do most of my development using vim in tiled 24x80 xterms...
(Assignee)

Comment 22

2 years ago
I used to be a 'vi' man, too, but now I like using Notepad++ on Windows ;-)
I hope you can accept the slightly overlong lines.
(Assignee)

Comment 23

2 years ago
Created attachment 8594429 [details] [diff] [review]
Proposed patch, after review, again, this time with shorter lines
Attachment #8594393 - Attachment is obsolete: true
Attachment #8594393 - Flags: review?(Pidgeot18)
Attachment #8594429 - Flags: review?(Pidgeot18)
Comment on attachment 8594429 [details] [diff] [review]
Proposed patch, after review, again, this time with shorter lines

Review of attachment 8594429 [details] [diff] [review]:
-----------------------------------------------------------------

::: mailnews/mime/jsmime/test/test_header.js
@@ +349,5 @@
>        ["=?UTF-8?Q?<?= <a@b.c>", [{name: "<", email: "a@b.c"}]],
>        ["Simple =?UTF-8?Q?<?= a@b.c>",
>          [{name: "", email: '"Simple < a"@b.c'}]],
>        ["Tag <=?UTF-8?Q?email?=@b.c>", [{name: "Tag", email: "email@b.c"}]],
> +    // Bug 1141446: malformed From addresses with erroneous quotes

This block of text is still improperly formatted, unfortunately.
Attachment #8594429 - Flags: review?(Pidgeot18) → feedback+
(Assignee)

Comment 25

2 years ago
Created attachment 8596351 [details] [diff] [review]
Proposed patch, after review, yet again, guessing formatting rules

Sorry, I don't follow.

The previous review said:
> > +	  // Bug 1141446: mal-formed From addresses with erroneous quotes
> Nit: "malformed"

I fixed that. What else don't you like? Why don't you just say?
Would you like "Malformed" or perhaps a period at the end, it's not a full sentence though?

Most lines have a period, but not all:
// Make sure the input is an array (accept a single entry)
// Unstructured headers (just decode RFC 2047 for the first header value)
// Make the buffer be a typed array for what follows
*   name: The display name of the address
*   email: The address of the object
*   name: The display name of the group
*   group: An array of address object for members in the group.
* @param {String} headerValue The MIME header value to parse.
* @param {Boolean} doRFC2047  If true, decode RFC 2047 encoded-words.
* @param {Boolean} doRFC2231  If true, decode RFC 2231 encoded parameters.
* @return {Map(String -> String)} A map of parameter names to parameter values.

Let me know the rules that you follow and I adhere to them.
Attachment #8594429 - Attachment is obsolete: true
Attachment #8596351 - Flags: review?(Pidgeot18)
(In reply to Jorg K from comment #25)
> Let me know the rules that you follow and I adhere to them.

The code you have looks like this:
.....
      ["Simple =?UTF-8?Q?<?= a@b.c>",
        [{name: "", email: '"Simple < a"@b.c'}]],	
        [{name: "", email: '"Simple < a"@b.c'}]],
      ["Tag <=?UTF-8?Q?email?=@b.c>", [{name: "Tag", email: "email@b.c"}]],	
      ["Tag <=?UTF-8?Q?email?=@b.c>", [{name: "Tag", email: "email@b.c"}]],
    // Bug 1141446: Malformed From addresses with erroneous quotes,
    ["name

Notice how your new lines fail to match up? That's the formatting issues I complained about, now for the third time. The first time you used tabs, then you converted to spaces, but you failed to convert to enough spaces to cause the lines to match up properly.
Attachment #8596351 - Flags: review?(Pidgeot18) → review-
(Assignee)

Comment 27

2 years ago
Created attachment 8596365 [details] [diff] [review]
Proposed patch, after review, yet again, fixed block indent

Sorry about the incorrect indent of the inserted block.
Attachment #8596351 - Attachment is obsolete: true
Attachment #8596365 - Flags: review?(Pidgeot18)

Updated

2 years ago
Assignee: nobody → mozilla
Status: NEW → ASSIGNED
OS: Windows 7 → All
Hardware: x86_64 → All
Target Milestone: Thunderbird 38.0 → ---

Updated

2 years ago
tracking-thunderbird38: ? → +

Comment 28

2 years ago
Created attachment 8602407 [details] [diff] [review]
Unbitrotted

I think I can review the remaining issues, which is formatting. Unfortunately if I am going to add my name, then I am going to add my standards.

Are the names in question like jordi.f.nieto@gmail.com and miriam@gmail.com names of real valid users? If so, I don't think we should be adding them to our codebase. I don't mind Barcelona-Freecycle-noreply@yahoogroups.com as I assume it is a public group. All you have to do is to change gmail.com to example.com and change the user part, so jordi.f.nieto@gmail.com can be aaaaa.b.ccccc@example.com and vary parts of the names, so for example "Jordi Fern" can be something like "Jazzy Plant" or "Aaaaa Bbbbb"
Attachment #8602407 - Flags: review?(rkent)
(Assignee)

Comment 29

2 years ago
Created attachment 8602524 [details] [diff] [review]
Obsctured names and e-mail addresses

Hi Kent, good to see some movement here. Here is an updated patch.
Remarks:
1) I made the changes in a text editor. I am in Spain with the travel laptop.
   I have only Firefox compiled, not Thunderbird, so I can't run the test
   (and I'd like to avoid a 90 minute compilation just for that):
   |mozilla/mach xpcshell-test mailnews/mime|
2) The names are real, the e-mail address had already been changed/shortened,
   to comply with the holy 80 character per line limit.
3) I changed gmail.com and hotmail.com to example.com, of course that results
   in a line that is now 81 characters long.
4) I changed Jordi to Jazzy and Juana to First, I can't change the other two
   since their complete names are ?B? (binary) encoded.
5) There is no Fern, it's part of Fernández.
6) There is no privacy issue here, since the names are visible on Yahoo Groups:
   https://es.groups.yahoo.com/neo/groups/Barcelona-Freecycle/conversations/messages
7) Am I really surprised how stuff was checked in, that clearly violates the
   80 character rule, see bug 1069790 comment #21. I think this rules is antiquated,
   but I don't understand why some and not others must adhere to it :-((
Attachment #8602524 - Flags: review?(rkent)
(Assignee)

Comment 30

2 years ago
Created attachment 8602542 [details] [diff] [review]
Obscured names and e-mail addresses, even more obfuscated
Attachment #8602524 - Attachment is obsolete: true
Attachment #8602524 - Flags: review?(rkent)
(Assignee)

Updated

2 years ago
Attachment #8602542 - Flags: review?(rkent)
(Assignee)

Comment 31

2 years ago
I did the 72 minute compile (after running into bug 777770) and ran the test:
mozilla/mach xpcshell-test mailnews/mime

Summary
=======
Ran 26 tests
Expected results: 26
Unexpected results: 0
OK

So it's ready to land, finally ...

Updated

2 years ago
Attachment #8602542 - Flags: review?(rkent)
Attachment #8602542 - Flags: review+
Attachment #8602542 - Flags: approval-comm-beta?
Attachment #8602542 - Flags: approval-comm-aurora?

Comment 32

2 years ago
Comment on attachment 8602542 [details] [diff] [review]
Obscured names and e-mail addresses, even more obfuscated

https://hg.mozilla.org/releases/comm-beta/rev/47119288c8c5
https://hg.mozilla.org/releases/comm-aurora/rev/6f83b5819850
Attachment #8602542 - Flags: approval-comm-beta?
Attachment #8602542 - Flags: approval-comm-beta+
Attachment #8602542 - Flags: approval-comm-aurora?
Attachment #8602542 - Flags: approval-comm-aurora+

Updated

2 years ago
Status: ASSIGNED → RESOLVED
Last Resolved: 2 years ago
status-thunderbird38: --- → fixed
status-thunderbird39: --- → fixed
status-thunderbird40: --- → fixed
status-thunderbird_esr31: --- → unaffected
Resolution: --- → FIXED
Target Milestone: --- → Thunderbird 40.0

Updated

2 years ago
Attachment #8602407 - Flags: review?(rkent)

Updated

2 years ago
Attachment #8596365 - Flags: review?(Pidgeot18)
(Assignee)

Comment 33

2 years ago
Unless I'm blind, this never landed on trunk (TB40)
Flags: needinfo?(rkent)
(Assignee)

Comment 34

2 years ago
OK, it got landed, I pulled it back to my local tree, but you didn't post the changeset:
https://hg.mozilla.org/comm-central/rev/df51276ef640
Flags: needinfo?(rkent)
You need to log in before you can comment on or make changes to this bug.