Invalid quoted-printable format

RESOLVED FIXED in Thunderbird 60.0

Status

defect
RESOLVED FIXED
a year ago
a year ago

People

(Reporter: azurit, Assigned: infofrommozilla)

Tracking

Thunderbird 60.0

Thunderbird Tracking Flags

(thunderbird_esr52 affected, thunderbird59 fixed, thunderbird60 fixed)

Details

Attachments

(2 attachments, 3 obsolete attachments)

Reporter

Description

a year ago
User Agent: Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:58.0) Gecko/20100101 Firefox/58.0
Build ID: 20180208173149

Steps to reproduce:

In account settings, set your name to something, which contains all:
 - unicode character
 - colon character (:)
 - space

for example:

Test: Ján


Actual results:

Thunderbird will quote it into quoted-printable string with UNQUOTED colon character, the example above will be:
From: =?UTF-8?Q?Test:_J=c3=a1n?= <test@example.com>

Thunderbird 52.6.0 (64-bit), linux.


Expected results:

A colon character MUST be quoted as =3a , see:
https://tools.ietf.org/html/rfc2047#section-2
https://tools.ietf.org/html/rfc5322#section-3.2.3

This is doing major problems in e-mail server and client software.

Comment 1

a year ago
You don't need to configure the account like this, you can just send a message with subject "Test: Ján" and get this header:

Subject: =?UTF-8?Q?Test:_J=c3=a1n?=

The fix looks like simple tweak to jsmime.js#2873:
/// A list of ASCII characters forbidden in RFC 2047 encoded-words
var qpForbidden = "=?_()\",";

I'm just wondering why some others, like "." are also missing there and why , and _ are included.
 especials = "(" / ")" / "<" / ">" / "@" / "," / ";" / ":" / "
               <"> / "/" / "[" / "]" / "?" / "." / "="

Alfred, care to take a look?
Flags: needinfo?(infofrommozilla)
Assignee

Comment 2

a year ago
(In reply to Jorg K (GMT+1) from comment #1)

> The fix looks like simple tweak to jsmime.js#2873:
> /// A list of ASCII characters forbidden in RFC 2047 encoded-words
> var qpForbidden = "=?_()\",";


> I'm just wondering why some others, like "." are also missing there and why
> , and _ are included.

"," is the mailbox and address separator.
"_" is the replacement char of SPACE.

>  especials = "(" / ")" / "<" / ">" / "@" / "," / ";" / ":" / "
>                <"> / "/" / "[" / "]" / "?" / "." / "="

I'm already trying with:

var qpForbidden = "\"#$%&'(),.:;<=>?@[\]^_`{|}~";

which is the complement of:

| RFC 2047               Message Header Extensions           November 1996
| 5. Use of encoded-words in message headers

| (3) As a replacement for a 'word' entity within a 'phrase', for example,
|     one that precedes an address in a From, To, or Cc header.  The ABNF
[...]
|     In this case the set of characters that may be used in a "Q"-encoded
|     'encoded-word' is restricted to: <upper and lower case ASCII
|     letters, decimal digits, "!", "*", "+", "-", "/", "=", and "_"

I still want to see if that destroys a test.
Flags: needinfo?(infofrommozilla)
Assignee

Comment 3

a year ago
(In reply to Alfred Peters from comment #2)
> (In reply to Jorg K (GMT+1) from comment #1)

> >  especials = "(" / ")" / "<" / ">" / "@" / "," / ";" / ":" / "
> >                <"> / "/" / "[" / "]" / "?" / "." / "="

> I'm already trying with:
> var qpForbidden = "\"#$%&'(),.:;<=>?@[\]^_`{|}~";

As expected, two tests from mailnews/mime/jsmime/test/test_header_emitter.js fail.

Before I write the patch, should we decide if we really want to include all characters as above or only those from 'especials'?
Flags: needinfo?(jorgk)

Comment 4

a year ago
Hmm, I'm really bad at reading those RFCs. Let's look at https://tools.ietf.org/html/rfc2047#section-2 together:

2. Syntax of encoded-words

   An 'encoded-word' is defined by the following ABNF grammar.  The
   notation of RFC 822 is used, with the exception that white space
   characters MUST NOT appear between components of an 'encoded-word'.

   encoded-word = "=?" charset "?" encoding "?" encoded-text "?="

   charset = token    ; see section 3

   encoding = token   ; see section 4

   token = 1*<Any CHAR except SPACE, CTLs, and especials>

   especials = "(" / ")" / "<" / ">" / "@" / "," / ";" / ":" / "
               <"> / "/" / "[" / "]" / "?" / "." / "="

   encoded-text = 1*<Any printable ASCII character other than "?"
                     or SPACE>
                  ; (but see "Use of encoded-words in message
                  ; headers", section 5)

'especials' is only referenced in 'token' and that is referenced in 'encoding' and 'charset'.

We're interested in the encoded-text, and there any printable ASCII character is possible apart from ? and SPACE. That brings us to section 5 which you quoted at the end of your comment #2.

I'm really surprised that we currently have
  var qpForbidden = "=?_()\",";
when according to that section 5 a whole heap more shouldn't occur, like you said:
  var qpForbidden = "\"#$%&'(),.:;<=>?@[\]^_`{|}~";

I guess there is no harm done in forbidding the larger set and fix the tests.
Flags: needinfo?(jorgk)
Assignee

Comment 5

a year ago
(In reply to Jorg K (GMT+1) from comment #4)

> I'm really surprised that we currently have

We are not the first to ask this question: Bug 1088975 comment #23
The question remained unanswered.

> said:
>   var qpForbidden = "\"#$%&'(),.:;<=>?@[\]^_`{|}~";

OK, I'll do.

> I guess there is no harm done in forbidding the larger set and fix the tests.

There is nothing wrong with that. In general, it would even be allowed to encode all the characters.
But:
1. The coding should be minimized so that the text remains as readable as possible.
2. The more characters we encode, the longer the result becomes. And that leads to the more frequent use of Base64.
Reporter

Comment 6

a year ago
(In reply to Alfred Peters from comment #5)
> 1. The coding should be minimized so that the text remains as readable as
> possible.
> 2. The more characters we encode, the longer the result becomes. And that
> leads to the more frequent use of Base64.

3. Some spam filters are adding a little spam score for excessive coding of subject and sender name.

Comment 7

a year ago
(In reply to Alfred Peters from comment #5)
> We are not the first to ask this question: Bug 1088975 comment #23
> The question remained unanswered.
Nice research, thank you. That was when I started here in 2015. Reading through the bug, it mentions www.jorgk.com/decode, oops, I wrote that. Trying that, all of "#$%&'(),.:;<=>?@[\]^_`{|}~ gets quoted. The code behind that page is:

<?php
$a1 = mb_encode_mimeheader($_POST['string'], $_POST['charset'], "Q");
$a2 = mb_encode_mimeheader($_POST['string'], $_POST['charset'], "B");
echo $a1 . "<br>" . $a2;
echo $a;
?>

I see no reason to differ from what PHP does.
Assignee

Comment 8

a year ago
(In reply to Alfred Peters from comment #3)

> As expected, two tests from mailnews/mime/jsmime/test/test_header_emitter.js
> fail.

The patch also affects Mime Part headers.
I do not get this test corrected:

<https://dxr.mozilla.org/comm-central/rev/534ffa5e3de52cce520d86fba10757eb6801af82/mailnews/compose/test/unit/test_attachment.js#47>

My attempt:
| RFC2047: 'Content-Type: text/plain; charset=US-ASCII;\r\n'+
|          ' name="=?UTF-8?B?ICEiIyQlJicoKSorLC0uLzAxMjM0NTY3ODk6Ozw9Pj9AQUJDREVGR0hJ?='+
|          '=?UTF-8?B?SktMTU5PUFFSU1RVVldYWVpbXF1eX2BhYmNkZWZnaGlqa2xtbm9wcXJz?='+
|          '=?UTF-8?B?dHV2d3h5ent8fX7CoMKhwqLCo8KkwqXCpsKnwqjCqcKqwqvCrMKtwq4=?='+
|          '=?UTF-8?B?wq/CsMKxwrLCs8K0wrXCtsK3wrjCucK6wrvCvMK9wr7Cv8OAw4HDgsOD?='+
|          '=?UTF-8?B?w4TDhcOGw4fDiMOJw4rDi8OMw43DjsOPw5DDkcOSw5PDlMOVw5bDl8OY?='+
|          '=?UTF-8?B?w5nDmsObw5zDncOew5/DoMOhw6LDo8Okw6XDpsOnw6jDqcOqw6vDrMOt?='+
|          '=?UTF-8?B?w67Dr8Oww7HDssOzw7TDtcO2w7fDuMO5w7rDu8O8w73DvsO/LnR4dA==?="\r\n',

Result:
| Expected PASS, got FAIL
| [testInput0 : 110]
|  "Content-Type: text/plain; charset=US-ASCII;\\r\\n name=\\"=?UTF-8?B?ICEiIyQlJicoKSorLC0uLzAxMjM0NTY3ODk6Ozw9Pj9AQUJDREVGR0hJ?==?UT
|  == 
|  "Content-Type: text/plain; charset=US-ASCII;\\r\\n name=\\"=?UTF-8?B?ICEiIyQlJicoKSorLC0uLzAxMjM0NTY3ODk6Ozw9Pj9AQUJDREVGR0hJ?==?UT

Is there a way to see more of the output?

Comment 9

a year ago
Only if you add dump() statements next to the Assert.equal().

Comment 10

a year ago
In this test, the original QP goes away and is replaced by base64, yes, since we're now encoding more and the threshold to using bas64 is reached, right?
  RFC2047: 'Content-Type: text/plain; charset=US-ASCII;\r\n'+
           ' name="=?UTF-8?Q?_!=22#$%&\'=28=29*+=2c-./0123456789:;<=3d>=3f@ABCDEFGHIJKLM?='+
If you attach your WIP, I can look at it tonight.

Or maybe reduce the amount of "special" characters to maintain the QP?

Comment 11

a year ago
(In reply to Jorg K (GMT+1) from comment #10)
> Or maybe reduce the amount of "special" characters to maintain the QP?
Ignore that, it uses what looks like "all printable" as input.
Assignee

Comment 12

a year ago
(In reply to Alfred Peters from comment #8)

> Is there a way to see more of the output?

Ok, I found I found another way.
Assignee

Comment 13

a year ago
$ mozilla/mach xpcshell-test mailnews
| Ran 400 checks (400 tests)
| Expected results: 381
| Skipped: 19 tests
| OK

Do I have to worry about the skipped tests?

Comment 14

a year ago
No, we disable tests which have major problems. I doubt that any of the disabled ones will break further due to this change.

https://dxr.mozilla.org/comm-central/search?q=skip-if+%3D&redirect=false

Comment 16

a year ago
Comment on attachment 8951967 [details] [diff] [review]
Extension of the characters to be encoded (quoted-printable) according to RFC2047.

Review of attachment 8951967 [details] [diff] [review]:
-----------------------------------------------------------------

Very nice work. Sadly the test has suffered in bug 1088975, so let's restore it to its previous glory.

::: mailnews/mime/jsmime/test/test_header_emitter.js
@@ +142,5 @@
>        ["My house   burned down!", "My house burned down!"],
>  
>        // Which variables need to be encoded in QP encoding?
>        ["! \" # $ % & ' ( ) * + - .",
>         "! \" # $ % & ' ( ) * + - ."],

Something has gone wrong here:
https://hg.mozilla.org/comm-central/rev/e3a09b8bc8e1#l3.34
The original test had all 32 "special" characters
["! \" # $ % & ' ( ) * + , - . / : ; < = > ? @ [ \\ ] ^ _ ` { | } ~ \x7f",
forced, them to QP by adding the \x7f and then looked which ones got converted. The two lines above are meaningless since they repeat a test below, that is, if QP is not required, nothing gets encoded.

So please remove these two nonsense lines and replace with the original idea, so:
["! \" # $ % & ' ( ) * + , - . / : ; < = > ? @ [ \\ ] ^ _ ` { | } ~ \x7f",

@@ +162,5 @@
> +        " =?UTF-8?Q?est_=5e_/_=5f_/_T?=\r\n" +
> +        " =?UTF-8?Q?est_=60_/_=7b=7f?="],
> +      ["Test | / } / Test ~ /\x7f",
> +        "=?UTF-8?Q?Test_=7c_/_=7d_/_T?=\r\n" +
> +        " =?UTF-8?Q?est_=7e_/=7f?="],

I imagine it was quite time-consuming to adjust the tests since the (somewhat unpredictable?) folding into multiple lines gets in the way here.
Please restore the original test, as I suggested above and adjust the expected result accordingly.

@@ +167,1 @@
>        // But non-printable characters don't need it in the first place!

I believe there is a typo here. This should read:
// But the 32 printable "special" characters don't need it in the first place!

Comment 17

a year ago
Posted patch bug1438590.patch (obsolete) — Splinter Review
This is how I'd restore the test to it's original glory. I was lazy, I haven't worked out the "expected result".

Updated

a year ago
Assignee: nobody → infofrommozilla
Status: UNCONFIRMED → ASSIGNED
Component: Untriaged → MIME
Ever confirmed: true
Product: Thunderbird → MailNews Core
Version: 52 Branch → 52
Assignee

Comment 18

a year ago
(In reply to Jorg K (GMT+1) from comment #16)
> Comment on attachment 8951967 [details] [diff] [review]

> ... replace with the original idea, so:
> ["! \" # $ % & ' ( ) * + , - . / : ; < = > ? @ [ \\ ] ^ _ ` { | } ~ \x7f",
> 
> @@ +162,5 @@
> > +        " =?UTF-8?Q?est_=5e_/_=5f_/_T?=\r\n" +
> > +        " =?UTF-8?Q?est_=60_/_=7b=7f?="],
> > +      ["Test | / } / Test ~ /\x7f",
> > +        "=?UTF-8?Q?Test_=7c_/_=7d_/_T?=\r\n" +
> > +        " =?UTF-8?Q?est_=7e_/=7f?="],
> 
> I imagine it was quite time-consuming to adjust the tests since the
> (somewhat unpredictable?) folding into multiple lines gets in the way here.

Indeed. :-)

> Please restore the original test, as I suggested above and adjust the
> expected result accordingly.

That would lead to an Base64 encoded String because that's shorter.
So that wont test quoted printable at all.
do you really want that?

Comment 19

a year ago
(In reply to Alfred Peters from comment #18)
> That would lead to an Base64 encoded String because that's shorter.
> So that wont test quoted printable at all.
> do you really want that?
I think the original test was best, yes, but if it now switches to base64, it's not good.

Can we prepend ABCDEFGHIJ etc. so it still comes to QP?

If not, your tests are fine, but the two lines above, the 
-      ["! \" # $ % & ' ( ) * + - .",
-       "! \" # $ % & ' ( ) * + - ."],
are nonsense, *and* we don't have a test that shows which characters aren't replaced, like + * / and some others.

I also find your 
  Test \" / # / Test $ / % / Test & / ' / Test ( / )\x7f"
a little hard to read. If we need to insert padding, I'd prefer:
  ABCDEFGHIJK \" # $ % & ' ( )\x7f"
since the Test and the / makes it hard to spot the stuff that is converted, / is not.
In this case, add a comment why we're adding the padding.

Also, the entire thing was confusing, and managed to confuse the author in bug 1088975, since there was no comment re. the \x7f.

I have to head out the door, back at 21:00.

Comment 20

a year ago
Thinking about this afk, since we have 32 characters, how about splitting this into 8 tests:
Padding ! \" # $\x7f
Padding % & ' (\x7f
Padding ) * + ,\x7f
Padding - . / :\x7f
Padding ; < = >\x7f
Padding ? @ [ \\\x7f
Padding ] ^ _ `\x7f
Padding { | } ~\x7f
Assignee

Comment 21

a year ago
(In reply to Jorg K (GMT+1) from comment #20)
> Thinking about this afk, since we have 32 characters, how about splitting
> this into 8 tests:

Yes, I had that thought too. But 8 tests are not enough.
Due to the limitation of the line length in the result, a maximum of 2 encoded characters per line can be accommodated.
Attachment #8951967 - Attachment is obsolete: true
Attachment #8951967 - Flags: review?(jorgk)
Attachment #8951999 - Flags: review?(jorgk)

Comment 22

a year ago
Comment on attachment 8951999 [details] [diff] [review]
Version with a more readable Test.

Very nice, but allow me to improve it even further ;-)
Attachment #8951999 - Flags: review?(jorgk) → review+

Comment 23

a year ago
I hope you like it.
Attachment #8951984 - Attachment is obsolete: true
Attachment #8952014 - Flags: feedback?(infofrommozilla)
Assignee

Comment 24

a year ago
Comment on attachment 8952014 [details] [diff] [review]
bug1438590.patch - Test even more readable.

Yes, that's fine for me.
Attachment #8952014 - Flags: feedback?(infofrommozilla) → feedback+

Updated

a year ago
Attachment #8951999 - Attachment is obsolete: true

Comment 25

a year ago
Comment on attachment 8952014 [details] [diff] [review]
bug1438590.patch - Test even more readable.

Thanks, let's go with this then.
Attachment #8952014 - Flags: review+

Updated

a year ago
Keywords: checkin-needed

Comment 26

a year ago
Pushed by mozilla@jorgk.com:
https://hg.mozilla.org/comm-central/rev/1e70ad96280e
Extension of the characters to be encoded (quoted-printable) according to RFC2047. r=jorgk
Status: ASSIGNED → RESOLVED
Last Resolved: a year ago
Keywords: checkin-needed
Resolution: --- → FIXED

Updated

a year ago
Target Milestone: --- → Thunderbird 60.0
Assignee

Comment 27

a year ago
Unfortunately, another broken test on the calendar
<https://treeherder.mozilla.org/#/jobs?repo=comm-central&selectedJob=162944645>
Attachment #8952049 - Flags: review?(jorgk)

Comment 28

a year ago
Comment on attachment 8952049 [details] [diff] [review]
Fix of Calendar test

Thanks for taking care of this. I haven't tested it, but I assume you have ;-)
So they all switch to base64 now.
Attachment #8952049 - Flags: review?(jorgk) → review+

Comment 29

a year ago
Pushed by mozilla@jorgk.com:
https://hg.mozilla.org/comm-central/rev/2d16fee9404f
Follow-up: fix calendar test test_ltninvitationutils.js. r=jorgk
Reporter

Comment 30

a year ago
Thanks everyone! Is this getting into Thunderbird 52?

Comment 31

a year ago
Comment on attachment 8952014 [details] [diff] [review]
bug1438590.patch - Test even more readable.

(In reply to azurit from comment #30)
> Thanks everyone! Is this getting into Thunderbird 52?
It will go into TB 59 beta 2 this week and we can ship it in TB 52.7 although this bug has been there since TB 38 in 2015, so I really don't see the rush.
Attachment #8952014 - Flags: approval-comm-esr52?
Attachment #8952014 - Flags: approval-comm-beta+
Comment on attachment 8952049 [details] [diff] [review]
Fix of Calendar test

Please make sure calendar code gets reviewed by a calendar peer.
Attachment #8952049 - Flags: review+

Comment 34

a year ago
Comment on attachment 8952014 [details] [diff] [review]
bug1438590.patch - Test even more readable.

It's not worth it at TB 52.7 when TB 60 ESR is coming out six weeks later.
Attachment #8952014 - Flags: approval-comm-esr52?
You need to log in before you can comment on or make changes to this bug.