Closed Bug 1301989 Opened 8 years ago Closed 7 years ago

Failed to decode base64 string (at jsmime.js:70) - Make jsmime more tolerant to bad data and/or improve error handling.

Tracking

(thunderbird_esr5255+ fixed, thunderbird55 fixed, thunderbird56 fixed)

Status:

RESOLVED FIXED

Milestone:

Thunderbird 56.0

Tracking Flags:

Tracking

Status

thunderbird_esr52

55+

fixed

thunderbird55

---

fixed

thunderbird56

---

fixed

People

(Reporter: joao.m.santos.silva, Assigned: jorgk-bmo)

References

Details

Attachments

(2 files, 3 obsolete files)

Test message showing the problem. 8 years ago Jorg K (CEST = GMT+2) 701 bytes, text/plain		Details
make jsmime decode_base64() more fail-save (Ignore '=' in the code) 7 years ago Alfred Peters [:infofrommozilla] 1.66 KB, patch		Details \| Diff \| Splinter Review
1301989-base64.patch (v1) 7 years ago Jorg K (CEST = GMT+2) 2.12 KB, patch		Details \| Diff \| Splinter Review
1301989-base64.patch (v1b) 7 years ago Jorg K (CEST = GMT+2) 2.77 KB, patch	infofrommozilla : review-	Details \| Diff \| Splinter Review
1301989-base64.patch (v2). 7 years ago Jorg K (CEST = GMT+2) 4.24 KB, patch	infofrommozilla : review+ jorgk-bmo : approval-comm-beta+ jorgk-bmo : approval-comm-esr52+	Details \| Diff \| Splinter Review

João M. S. Silva

Reporter

Description

•

8 years ago

User Agent: Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:48.0) Gecko/20100101 Firefox/48.0
Build ID: 20160728203720

Steps to reproduce:

I used two add-ons, "Remove Duplicate Messages" and "Remove Duplicate Messages (Alternate)", to find mail duplicates. They worked on other folders, but fail in folders with certain e-mails.


Actual results:

Add-on "crashed" (did not complete).

Error console shows:

Timestamp: 09/11/2016 08:26:44 PM
Error: Error: Failed to decode base64 string!
Source File: resource:///modules/jsmime.jsm -> resource:///modules/jsmime/jsmime.js
Line: 70

Timestamp: 09/11/2016 08:26:44 PM
Error: NS_ERROR_XPC_JAVASCRIPT_ERROR_WITH_DETAILS: [JavaScript Error: "Failed to decode base64 string!" {file: "resource:///modules/jsmime.jsm -> resource:///modules/jsmime/jsmime.js" line: 70}]'[JavaScript Error: "Failed to decode base64 string!" {file: "resource:///modules/jsmime.jsm -> resource:///modules/jsmime/jsmime.js" line: 70}]' when calling method: [nsIMimeConverter::decodeMimeHeader]
Source File: chrome://removeduplicates/content/showduplicatemessages.js
Line: 125


Expected results:

Duplicate mail result list.

João M. S. Silva

Reporter

Comment 1

•

8 years ago

I forgot to say that I've observed what seems to be this bug many years ago, while using the same add-ons. Now that I've used the add-ons again I saw the bug is still there.

Jorg K (CEST = GMT+2)

Assignee

Comment 2

•

8 years ago

Sorry, we don't deal with faulty add-ons here. Please contact the add-on author.

Status: UNCONFIRMED → RESOLVED

Closed: 8 years ago

Resolution: --- → INVALID

Whiteboard: [addon:Remove Duplicate Messages]

João M. S. Silva

Reporter

Comment 3

•

8 years ago

Are you sure this is a bug in the add-on (both add-ons)?

Because if I run a search by recipient in the same folder (no add-on) I get a similar error:

Timestamp: 09/12/2016 02:50:50 AM
Error: Error: Failed to decode base64 string!
Source File: resource:///modules/jsmime.jsm -> resource:///modules/jsmime/jsmime.js
Line: 70

Flags: needinfo?(jorgk)

Jorg K (CEST = GMT+2)

Assignee

Comment 4

•

8 years ago

Well, line 70 of jsmime.js has:
  return [atob(sanitize), buffer];
which is decoding base64.

Most likely you have a corrupt message in your folder. It may not be an add-on problem, but there's nothing we can do.

Can you attach the folder or find the message and attach it?

OK, I can reopen the bug but with the aim to make JSMime more tolerant to bad data.

Status: RESOLVED → REOPENED

Ever confirmed: true

Flags: needinfo?(jorgk)

Resolution: INVALID → ---

Summary: Failed to decode base64 string → Failed to decode base64 string (at jsmime.js:70) - Make jsmime more tolerant to bad data and/or improve error handling.

Whiteboard: [addon:Remove Duplicate Messages]

Jorg K (CEST = GMT+2)

Assignee

Updated

•

8 years ago

Status: REOPENED → NEW

Jorg K (CEST = GMT+2)

Assignee

Comment 5

•

8 years ago

The reporter send me an offending message. TB displays no recipient and the quick search by recipient fails with the mentioned error: Error: Failed to decode base64 string! jsmime.js:70:11

Now, the To: header of the message is:
To: =?UTF-8?B?Sm=2FDo28gU2lsdmE=?= <joao.m.santos.silva@gmail.com>
and that's clearly invalid as you can check here: http://www.jorgk.com/decode/.

So, as I said, we can make jsmime more tolerant.

In the meantime, I suggest you replace the To: headers in the mailbox with a text editor and you change your subscriptions details at the sender's website who obviously cannot deal with your name João M. S. Silva.

Jorg K (CEST = GMT+2)

Assignee

Comment 6

•

8 years ago

Attached file Test message showing the problem. — Details

Don't worry, I stripped all the personal content ;-)

João M. S. Silva

Reporter

Comment 7

•

8 years ago

Thanks!

If you need help testing a fix I found the remaining offending e-mails and now have 45 of them in a separate folder.

João M. S. Silva

Reporter

Comment 8

•

7 years ago

Hi,

This bug is still here.

Jorg K (CEST = GMT+2)

Assignee

Comment 9

•

7 years ago

Clearly so, since there was no movement in this bug.
I also don't know what a solution would look like. The header is invalid, so what should TB display instead of displaying nothing?

Alfred, do you have an opinion here? Could you perhaps suggest a solution and a patch?

Flags: needinfo?(infofrommozilla)

João M. S. Silva

Reporter

Comment 10

•

7 years ago

Is there an easy fix so that the add-on does not crash (and provides duplicate results)?

Jorg K (CEST = GMT+2)

Assignee

Comment 11

•

7 years ago

Have you talked to the add-on author? He can wrap code that can fail into a try/catch block and deal with the failure.

Alfred Peters [:infofrommozilla]

Comment 12

•

7 years ago

(In reply to Jorg K (GMT+2) from comment #9)
> I also don't know what a solution would look like. The header is invalid, so

We should do something, because it breaks my Header Pane.

The error is caused by the '=' in the text.

According to the comment, decode_base64() should be a fail-safe function. But it stumbles over the '='.

> what should TB display instead of displaying nothing?

The Header is broken, so we should display it so.
It will be "Jm����M��ل".

Flags: needinfo?(infofrommozilla)

Alfred Peters [:infofrommozilla]

Comment 13

•

7 years ago

Attached patch make jsmime decode_base64() more fail-save (Ignore '=' in the code) (obsolete) — Details — Splinter Review

This removes any '=' from the mime word and corrects the padding.

The jsmime tests run without complaint.

Attachment #8883422 - Flags: review?(jorgk)

João M. S. Silva

Reporter

Comment 14

•

7 years ago

I have contacted the author of the add-on meanwhile.

If you'd like I can test the fix.

Jorg K (CEST = GMT+2)

Assignee

Comment 15

•

7 years ago

Comment on attachment 8883422 [details] [diff] [review]
make jsmime decode_base64() more fail-save (Ignore '=' in the code)

Review of attachment 8883422 [details] [diff] [review]:
-----------------------------------------------------------------

Can we add a simple example to mailnews/mime/jsmime/test/test_header.js in decodeRFC2047Words to simulate bad input and the respective bad output with replacement characters � (\ufffd).

::: mailnews/mime/jsmime/jsmime.js
@@ +66,5 @@
> +    if (more)
> +      buffer = sanitize.slice(-excess);
> +    else
> +      for (var i = 0; i < excess; i++)
> +        sanitize += '=';

Can you please explain this hunk a little an add a comment. I'm not a specialist on base64 but https://en.wikipedia.org/wiki/Base64#Output_padding suggests that there is padding with = or ==. If excess ==3 you'd add ===. Is this intended?

Jorg K (CEST = GMT+2)

Assignee

Comment 16

•

7 years ago

Just a thought: Perhaps it would be easier to remove = which are not at the end in one line:
let sanitize = buffer.replace(/[^A-Za-z0-9+\/=]/g,'');
sanitize = sanitize.replace(/=/([A-Za-z0-9+\/]/)/g, '$1');
Untested!!

Jorg K (CEST = GMT+2)

Assignee

Comment 17

•

7 years ago

Or rather:
sanitize = sanitize.replace(/[=]+/([A-Za-z0-9+\/]/)/g, '$1');

Alfred Peters [:infofrommozilla]

Comment 18

•

7 years ago

(In reply to Jorg K (GMT+2) from comment #15)

> ::: mailnews/mime/jsmime/jsmime.js
> @@ +66,5 @@
> > +    if (more)
> > +      buffer = sanitize.slice(-excess);
> > +    else
> > +      for (var i = 0; i < excess; i++)
> > +        sanitize += '=';

I only changed the structure to avoid a second "buffer = '';".
New is only the padding when 'excess' and '!more'

> Can you please explain this hunk a little an add a comment. I'm not a
> specialist on base64 but https://en.wikipedia.org/wiki/Base64#Output_padding
> suggests that there is padding with = or ==. If excess ==3 you'd add ===. Is
> this intended?

Somehow. The length of the Base64 string should always be mod(4)=0.
In a regular term, 3*'=' does not occur.

Base64('a')    => 'YQ=='
Base64('ab')   => 'YWI='
Base64('abc')  => 'YWJj'
Base64('abcd') => 'YWJjZA=='

If we want to work with irregular terms, three are also 3*'=' possible.

(In reply to Jorg K (GMT+2) from comment #16)
> Just a thought: Perhaps it would be easier to remove = which are not at the
> end in one line:

We would still have to correct the length. So no benefit. So that does not offer any advantage.

Alfred Peters [:infofrommozilla]

Comment 19

•

7 years ago

(In reply to Alfred Peters from comment #18)

> Base64('abcd') => 'YWJjZA=='
> 
> If we want to work with irregular terms, three are also 3*'=' possible.

Bad news: atob("YWJjZ"); throws an exception, no matter how many '=' I add.

Could we cut the last character when length is mod(4)=1?
Or add an 'A' (=0)?

Alfred Peters [:infofrommozilla]

Comment 20

•

7 years ago

(In reply to Alfred Peters from comment #19)

> Bad news: atob("YWJjZ"); throws an exception, no matter how many '=' I add.
> 
> Could we cut the last character when length is mod(4)=1?
> Or add an 'A' (=0)?

On a first thought, filling up with 0-bits would make sense. But a broken term is a broken term and we only would increase trash data.

So cutting of the last char would, in my opinion, make the most sense.

Jorg K (CEST = GMT+2)

Assignee

Comment 21

•

7 years ago

(In reply to Alfred Peters from comment #20)
> So cutting of the last char would, in my opinion, make the most sense.
Agreed. So the aim is to sanitise the base64 string so much that the atob() won't fail, right? We don't want a try/catch and simply return "bad input"?

OK, so let's do something that doesn't throw and returns something semi-useful, well, more useful than nothing since "Jm����M��ل" doesn't look very useful to me ;-)

Alfred Peters [:infofrommozilla]

Comment 22

•

7 years ago

(In reply to Alfred Peters from comment #20)
> So cutting of the last char would, in my opinion, make the most sense.

This is already be done in the original code:

| sanitize = sanitize.substring(0, sanitize.length - excess);

If !more this throws away all chars above mod(4)=0.

So:
| sanitize = sanitize.replace(/=+([A-Za-z0-9+\/])/g, '$1');

looks really good. :-)

Alfred Peters [:infofrommozilla]

Comment 23

•

7 years ago

(In reply to Jorg K (GMT+2) from comment #21)
> (In reply to Alfred Peters from comment #20)
> > So cutting of the last char would, in my opinion, make the most sense.
> Agreed. So the aim is to sanitise the base64 string so much that the atob()
> won't fail, right? We don't want a try/catch and simply return "bad input"?

That's a design decision. I do not insist on my solution.
What choice of words do you prefer? The text must be translatable, right?

> OK, so let's do something that doesn't throw and returns something
> semi-useful, well, more useful than nothing since "Jm����M��ل" doesn't look
> very useful to me ;-)

This has the advantage that at least the beginning is correct. OK, here it is only the 'J'

Jorg K (CEST = GMT+2)

Assignee

Comment 24

•

7 years ago

OK, how about a hybrid approach: We try to sanitise as much as we can and still add try/catch. Or is this unreasonable, that is, properly sanitised will always decode? Then let's just do that.

You're going to send a new patch, right? I'm a little confused by your comments:
> We would still have to correct the length. So no benefit. So that does not offer any advantage.
contradicting
> looks really good. :-) [referring to the wrong regexp which I corrected later].

So how about:
// Drop all non-base64 characters and also = which can be harmful in the middle.
let sanitize = buffer.replace(/[^A-Za-z0-9+\/]/g,'');
let excess = sanitize.length % 4;
if (excess != 0)
  if (more) {
    // Keep excess for next time.
    buffer = sanitize.slice(-excess);
    sanitize = sanitize.substring(0, sanitize.length - excess);
  } else {
    buffer = '';
    if (excess == 1) {
      // we can't pad with three =, so cut excess
      sanitize = sanitize.substring(0, sanitize.length - excess);
    } else if (excess == 2) {
      sanitize += "==";
    } else if (excess == 3) {
      sanitize += "=";
    }
  }

Actually, that's not right. Because you stripped "=", you create cases where you end up with 4*n-1 and when more is true, you'll return some excess where you shouldn't have. For AAA= you will even cause an endless loop, since you'll always return those three A's.

Am I right? Let's do it my way, patch coming ;-)

Jorg K (CEST = GMT+2)

Assignee

Comment 25

•

7 years ago

Attached patch 1301989-base64.patch (v1) (obsolete) — Details — Splinter Review

How does that look?

I had to put the hunk in a different spot since my test case Y=WJjZA== didn't work. The code right after the hunk removed the last =, then I removed the = in the middle leaving YWJjZA=, so excess of 3 characters was removed and only YWJj was decoded.

Attachment #8884042 - Flags: review?(infofrommozilla)

Jorg K (CEST = GMT+2)

Assignee

Updated

•

7 years ago

Attachment #8883422 - Flags: review?(jorgk)

Jorg K (CEST = GMT+2)

Assignee

Comment 26

•

7 years ago

Attached patch 1301989-base64.patch (v1b) (obsolete) — Details — Splinter Review

Removing trailing spaces.

Attachment #8884042 - Attachment is obsolete: true

Attachment #8884042 - Flags: review?(infofrommozilla)

Attachment #8884043 - Flags: review?(infofrommozilla)

Alfred Peters [:infofrommozilla]

Comment 27

•

7 years ago

Comment on attachment 8884043 [details] [diff] [review]
1301989-base64.patch (v1b)

Review of attachment 8884043 [details] [diff] [review]:
-----------------------------------------------------------------

::: mailnews/mime/jsmime/jsmime.js
@@ +635,5 @@
>        if (/[^A-Za-z0-9+\/=]/.exec(text))
>          return false;
>  
> +      // Remove harmful `=' chars in the middle.
> +      text = text.replace(/[=]+([A-Za-z0-9+\/])/g, '$1');

This should be placed inside decode_base64(), because there are more indirect calls to it.
Search for 'ContentDecoders' and 'this._convertData'.

Why '[=]+' and not just '=+'?

::: mailnews/mime/jsmime/test/test_header.js
@@ +641,5 @@
>          "Byte missing \ufffd"], // Replacement character for invalid input.
> +
> +      // Test for bug 1301989: Tolerate = in the middle of base64.
> +      ["=?us-ascii?B?YWJjZA==?=X=?us-ascii?B?Y=WJjZA==?=X=?us-ascii?B?Y=WJjZ==A==?=",
> +        "abcdXabcdXabcd"],

I have just done some tests myself. Maybe you want to take over some:
| // Tolerate invalid Base64 encoding - Bug 1301989
| //["=?UTF-8?B?YWJjZA==?=", "abcd"], // correct
| ["=?UTF-8?B?YWJjZA=?=", "abc"], // Length is not a multipel of 4
| ["=?UTF-8?B?YWJjZA?=", "abc"],
| ["=?UTF-8?B?YWJjZ?=", "abc"],
| ["=?UTF-8?B?Y=WJ=jZA==?=", "abcd"], // Additional '=' within the word
| ["=?UTF-8?B?Sm=2FDo28=?=", "Jm\ufffd\u000e\ufffd\ufffd"], // Q encoded B string

The test run was successful.

Attachment #8884043 - Flags: review?(infofrommozilla) → review-

Alfred Peters [:infofrommozilla]

Updated

•

7 years ago

Attachment #8883422 - Attachment is obsolete: true

Jorg K (CEST = GMT+2)

Assignee

Comment 28

•

7 years ago

(In reply to Alfred Peters from comment #27)
> This should be placed inside decode_base64(), because there are more
> indirect calls to it.
Agreed. But then testcase Y=WJjZA== doesn't work, have you tried that? See comment #25.
Do we want to add this code twice?

> Why '[=]+' and not just '=+'?
OK.

Alfred Peters [:infofrommozilla]

Comment 29

•

7 years ago

(In reply to Jorg K (GMT+2) from comment #28)
> (In reply to Alfred Peters from comment #27)
> > This should be placed inside decode_base64(), because there are more
> > indirect calls to it.
> Agreed. But then testcase Y=WJjZA== doesn't work, have you tried that? See
> comment #25.

|  function decode_base64(buffer, more) {
|    // Drop all non-base64 characters
|    let sanitize = buffer.replace(/[^A-Za-z0-9+\/=]/g,'');
| +  // Drop all '=' that are not padding
| +  sanitize = sanitize.replace(/=+([A-Za-z0-9+\/])/g, '$1');

So it should work.

Jorg K (CEST = GMT+2)

Assignee

Comment 30

•

7 years ago

Hmm, you're not reading my comment #25. Let's read it together.

Start with Y=WJjZA==

Then we have this code:
      // Base64 strings must be a length of multiple 4, but it seems that some
      // mailers accidentally insert one too many `=' chars. Gracefully handle
      // this case; see bug 227290 for more information.
      if (text.length % 4 == 1 && text.charAt(text.length - 1) == '=')
        text = text.slice(0, -1);

So I get Y=WJjZA=

Now we remove the invalid = and get YWJjZA=

Now we remove ZA= and finally decode YWJj.

Oops, not what we wanted.

That may be justifiable behaviour, but as a test Y=WJjZA== will decode to abc and not abcd.

Alfred Peters [:infofrommozilla]

Comment 31

•

7 years ago

(In reply to Jorg K (GMT+2) from comment #30)
> Hmm, you're not reading my comment #25. Let's read it together.
> 
> Start with Y=WJjZA==
> 
> Then we have this code:
>       // Base64 strings must be a length of multiple 4, but it seems that some
>       // mailers accidentally insert one too many `=' chars. Gracefully handle
>       // this case; see bug 227290 for more information.
>       if (text.length % 4 == 1 && text.charAt(text.length - 1) == '=')
>         text = text.slice(0, -1);

> So I get Y=WJjZA=

Indeed. Isn't this code path covered by the tests?

My test:
| ["=?UTF-8?B?Y=WJ=jZA==?=", "abcd"], // Additional '=' within the word
was successful.

As I wrote in Comment 22, decode_base64() already cuts off chars above mod(4)=0.

So the above code is obsolete, right? Then we could remove it.

Jorg K (CEST = GMT+2)

Assignee

Comment 32

•

7 years ago

["=?UTF-8?B?Y=WJ=jZA==?=", "abcd"] is successful since you added twp "=" and hence the code isn't triggered. But I agree, that extra processing can just go since we cut anyway. I'll update my patch.

Jorg K (CEST = GMT+2)

Assignee

Comment 33

•

7 years ago

Attached patch 1301989-base64.patch (v2). — Details — Splinter Review

OK, this should do it then.

Attachment #8884043 - Attachment is obsolete: true

Attachment #8884095 - Flags: review?(infofrommozilla)

Jorg K (CEST = GMT+2)

Assignee

Comment 34

•

7 years ago

BTW, mailnews/mime/test/unit/test_bug493544.js still passes which has a test for bug 227290.

Alfred Peters [:infofrommozilla]

Comment 35

•

7 years ago

Comment on attachment 8884095 [details] [diff] [review]
1301989-base64.patch (v2).

Review of attachment 8884095 [details] [diff] [review]:
-----------------------------------------------------------------

Great

Attachment #8884095 - Flags: review?(infofrommozilla) → review+

Jorg K (CEST = GMT+2)

Assignee

Comment 36

•

7 years ago

https://hg.mozilla.org/comm-central/rev/23fafbba334ac5e12dd48bca5a8b10680265f6f1

Assignee: nobody → jorgk

Status: NEW → RESOLVED

Closed: 8 years ago → 7 years ago

Component: Search → MIME

Product: Thunderbird → MailNews Core

Resolution: --- → FIXED

Target Milestone: --- → Thunderbird 56.0

Version: 45 Branch → 45

Jorg K (CEST = GMT+2)

Assignee

Comment 37

•

7 years ago

Comment on attachment 8884095 [details] [diff] [review]
1301989-base64.patch (v2).

Let's get this uplifted.

Attachment #8884095 - Flags: approval-comm-esr52?

Attachment #8884095 - Flags: approval-comm-beta+

Jorg K (CEST = GMT+2)

Assignee

Comment 38

•

7 years ago

Beta (TB 55):
https://hg.mozilla.org/releases/comm-beta/rev/40791579fdc6

status-thunderbird55: --- → fixed

status-thunderbird_esr52: --- → affected

Jorg K (CEST = GMT+2)

Assignee

Updated

•

7 years ago

Attachment #8884095 - Flags: approval-comm-esr52? → approval-comm-esr52+

Jorg K (CEST = GMT+2)

Assignee

Comment 39

•

7 years ago

TB 52 ESR
https://hg.mozilla.org/releases/comm-esr52/rev/4314c18674d1590b402c1579052b486862b9adc0

status-thunderbird_esr52: affected → fixed

tracking-thunderbird_esr52: --- → 55+

Jorg K (CEST = GMT+2)

Assignee

Updated

•

7 years ago

status-thunderbird56: --- → fixed

João M. S. Silva

Reporter

Comment 40

•

7 years ago

Thanks for the fix.

Alfred Peters [:infofrommozilla]

Updated

•

1 year ago