Closed
Bug 1301989
Opened 8 years ago
Closed 7 years ago
Failed to decode base64 string (at jsmime.js:70) - Make jsmime more tolerant to bad data and/or improve error handling.
Categories
(MailNews Core :: MIME, defect)
Tracking
(thunderbird_esr5255+ fixed, thunderbird55 fixed, thunderbird56 fixed)
RESOLVED
FIXED
Thunderbird 56.0
People
(Reporter: joao.m.santos.silva, Assigned: jorgk-bmo)
References
Details
Attachments
(2 files, 3 obsolete files)
701 bytes,
text/plain
|
Details | |
4.24 KB,
patch
|
infofrommozilla
:
review+
jorgk-bmo
:
approval-comm-beta+
jorgk-bmo
:
approval-comm-esr52+
|
Details | Diff | Splinter Review |
User Agent: Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:48.0) Gecko/20100101 Firefox/48.0 Build ID: 20160728203720 Steps to reproduce: I used two add-ons, "Remove Duplicate Messages" and "Remove Duplicate Messages (Alternate)", to find mail duplicates. They worked on other folders, but fail in folders with certain e-mails. Actual results: Add-on "crashed" (did not complete). Error console shows: Timestamp: 09/11/2016 08:26:44 PM Error: Error: Failed to decode base64 string! Source File: resource:///modules/jsmime.jsm -> resource:///modules/jsmime/jsmime.js Line: 70 Timestamp: 09/11/2016 08:26:44 PM Error: NS_ERROR_XPC_JAVASCRIPT_ERROR_WITH_DETAILS: [JavaScript Error: "Failed to decode base64 string!" {file: "resource:///modules/jsmime.jsm -> resource:///modules/jsmime/jsmime.js" line: 70}]'[JavaScript Error: "Failed to decode base64 string!" {file: "resource:///modules/jsmime.jsm -> resource:///modules/jsmime/jsmime.js" line: 70}]' when calling method: [nsIMimeConverter::decodeMimeHeader] Source File: chrome://removeduplicates/content/showduplicatemessages.js Line: 125 Expected results: Duplicate mail result list.
Reporter | ||
Comment 1•8 years ago
|
||
I forgot to say that I've observed what seems to be this bug many years ago, while using the same add-ons. Now that I've used the add-ons again I saw the bug is still there.
Assignee | ||
Comment 2•8 years ago
|
||
Sorry, we don't deal with faulty add-ons here. Please contact the add-on author.
Status: UNCONFIRMED → RESOLVED
Closed: 8 years ago
Resolution: --- → INVALID
Whiteboard: [addon:Remove Duplicate Messages]
Reporter | ||
Comment 3•8 years ago
|
||
Are you sure this is a bug in the add-on (both add-ons)? Because if I run a search by recipient in the same folder (no add-on) I get a similar error: Timestamp: 09/12/2016 02:50:50 AM Error: Error: Failed to decode base64 string! Source File: resource:///modules/jsmime.jsm -> resource:///modules/jsmime/jsmime.js Line: 70
Flags: needinfo?(jorgk)
Assignee | ||
Comment 4•8 years ago
|
||
Well, line 70 of jsmime.js has: return [atob(sanitize), buffer]; which is decoding base64. Most likely you have a corrupt message in your folder. It may not be an add-on problem, but there's nothing we can do. Can you attach the folder or find the message and attach it? OK, I can reopen the bug but with the aim to make JSMime more tolerant to bad data.
Status: RESOLVED → REOPENED
Ever confirmed: true
Flags: needinfo?(jorgk)
Resolution: INVALID → ---
Summary: Failed to decode base64 string → Failed to decode base64 string (at jsmime.js:70) - Make jsmime more tolerant to bad data and/or improve error handling.
Whiteboard: [addon:Remove Duplicate Messages]
Assignee | ||
Updated•8 years ago
|
Status: REOPENED → NEW
Assignee | ||
Comment 5•8 years ago
|
||
The reporter send me an offending message. TB displays no recipient and the quick search by recipient fails with the mentioned error: Error: Failed to decode base64 string! jsmime.js:70:11 Now, the To: header of the message is: To: =?UTF-8?B?Sm=2FDo28gU2lsdmE=?= <joao.m.santos.silva@gmail.com> and that's clearly invalid as you can check here: http://www.jorgk.com/decode/. So, as I said, we can make jsmime more tolerant. In the meantime, I suggest you replace the To: headers in the mailbox with a text editor and you change your subscriptions details at the sender's website who obviously cannot deal with your name João M. S. Silva.
Assignee | ||
Comment 6•8 years ago
|
||
Don't worry, I stripped all the personal content ;-)
Reporter | ||
Comment 7•8 years ago
|
||
Thanks! If you need help testing a fix I found the remaining offending e-mails and now have 45 of them in a separate folder.
Reporter | ||
Comment 8•7 years ago
|
||
Hi, This bug is still here.
Assignee | ||
Comment 9•7 years ago
|
||
Clearly so, since there was no movement in this bug. I also don't know what a solution would look like. The header is invalid, so what should TB display instead of displaying nothing? Alfred, do you have an opinion here? Could you perhaps suggest a solution and a patch?
Flags: needinfo?(infofrommozilla)
Reporter | ||
Comment 10•7 years ago
|
||
Is there an easy fix so that the add-on does not crash (and provides duplicate results)?
Assignee | ||
Comment 11•7 years ago
|
||
Have you talked to the add-on author? He can wrap code that can fail into a try/catch block and deal with the failure.
Comment 12•7 years ago
|
||
(In reply to Jorg K (GMT+2) from comment #9) > I also don't know what a solution would look like. The header is invalid, so We should do something, because it breaks my Header Pane. The error is caused by the '=' in the text. According to the comment, decode_base64() should be a fail-safe function. But it stumbles over the '='. > what should TB display instead of displaying nothing? The Header is broken, so we should display it so. It will be "Jm����M��ل".
Flags: needinfo?(infofrommozilla)
Comment 13•7 years ago
|
||
This removes any '=' from the mime word and corrects the padding. The jsmime tests run without complaint.
Attachment #8883422 -
Flags: review?(jorgk)
Reporter | ||
Comment 14•7 years ago
|
||
I have contacted the author of the add-on meanwhile. If you'd like I can test the fix.
Assignee | ||
Comment 15•7 years ago
|
||
Comment on attachment 8883422 [details] [diff] [review] make jsmime decode_base64() more fail-save (Ignore '=' in the code) Review of attachment 8883422 [details] [diff] [review]: ----------------------------------------------------------------- Can we add a simple example to mailnews/mime/jsmime/test/test_header.js in decodeRFC2047Words to simulate bad input and the respective bad output with replacement characters � (\ufffd). ::: mailnews/mime/jsmime/jsmime.js @@ +66,5 @@ > + if (more) > + buffer = sanitize.slice(-excess); > + else > + for (var i = 0; i < excess; i++) > + sanitize += '='; Can you please explain this hunk a little an add a comment. I'm not a specialist on base64 but https://en.wikipedia.org/wiki/Base64#Output_padding suggests that there is padding with = or ==. If excess ==3 you'd add ===. Is this intended?
Assignee | ||
Comment 16•7 years ago
|
||
Just a thought: Perhaps it would be easier to remove = which are not at the end in one line: let sanitize = buffer.replace(/[^A-Za-z0-9+\/=]/g,''); sanitize = sanitize.replace(/=/([A-Za-z0-9+\/]/)/g, '$1'); Untested!!
Assignee | ||
Comment 17•7 years ago
|
||
Or rather: sanitize = sanitize.replace(/[=]+/([A-Za-z0-9+\/]/)/g, '$1');
Comment 18•7 years ago
|
||
(In reply to Jorg K (GMT+2) from comment #15) > ::: mailnews/mime/jsmime/jsmime.js > @@ +66,5 @@ > > + if (more) > > + buffer = sanitize.slice(-excess); > > + else > > + for (var i = 0; i < excess; i++) > > + sanitize += '='; I only changed the structure to avoid a second "buffer = '';". New is only the padding when 'excess' and '!more' > Can you please explain this hunk a little an add a comment. I'm not a > specialist on base64 but https://en.wikipedia.org/wiki/Base64#Output_padding > suggests that there is padding with = or ==. If excess ==3 you'd add ===. Is > this intended? Somehow. The length of the Base64 string should always be mod(4)=0. In a regular term, 3*'=' does not occur. Base64('a') => 'YQ==' Base64('ab') => 'YWI=' Base64('abc') => 'YWJj' Base64('abcd') => 'YWJjZA==' If we want to work with irregular terms, three are also 3*'=' possible. (In reply to Jorg K (GMT+2) from comment #16) > Just a thought: Perhaps it would be easier to remove = which are not at the > end in one line: We would still have to correct the length. So no benefit. So that does not offer any advantage.
Comment 19•7 years ago
|
||
(In reply to Alfred Peters from comment #18) > Base64('abcd') => 'YWJjZA==' > > If we want to work with irregular terms, three are also 3*'=' possible. Bad news: atob("YWJjZ"); throws an exception, no matter how many '=' I add. Could we cut the last character when length is mod(4)=1? Or add an 'A' (=0)?
Comment 20•7 years ago
|
||
(In reply to Alfred Peters from comment #19) > Bad news: atob("YWJjZ"); throws an exception, no matter how many '=' I add. > > Could we cut the last character when length is mod(4)=1? > Or add an 'A' (=0)? On a first thought, filling up with 0-bits would make sense. But a broken term is a broken term and we only would increase trash data. So cutting of the last char would, in my opinion, make the most sense.
Assignee | ||
Comment 21•7 years ago
|
||
(In reply to Alfred Peters from comment #20) > So cutting of the last char would, in my opinion, make the most sense. Agreed. So the aim is to sanitise the base64 string so much that the atob() won't fail, right? We don't want a try/catch and simply return "bad input"? OK, so let's do something that doesn't throw and returns something semi-useful, well, more useful than nothing since "Jm����M��ل" doesn't look very useful to me ;-)
Comment 22•7 years ago
|
||
(In reply to Alfred Peters from comment #20) > So cutting of the last char would, in my opinion, make the most sense. This is already be done in the original code: | sanitize = sanitize.substring(0, sanitize.length - excess); If !more this throws away all chars above mod(4)=0. So: | sanitize = sanitize.replace(/=+([A-Za-z0-9+\/])/g, '$1'); looks really good. :-)
Comment 23•7 years ago
|
||
(In reply to Jorg K (GMT+2) from comment #21) > (In reply to Alfred Peters from comment #20) > > So cutting of the last char would, in my opinion, make the most sense. > Agreed. So the aim is to sanitise the base64 string so much that the atob() > won't fail, right? We don't want a try/catch and simply return "bad input"? That's a design decision. I do not insist on my solution. What choice of words do you prefer? The text must be translatable, right? > OK, so let's do something that doesn't throw and returns something > semi-useful, well, more useful than nothing since "Jm����M��ل" doesn't look > very useful to me ;-) This has the advantage that at least the beginning is correct. OK, here it is only the 'J'
Assignee | ||
Comment 24•7 years ago
|
||
OK, how about a hybrid approach: We try to sanitise as much as we can and still add try/catch. Or is this unreasonable, that is, properly sanitised will always decode? Then let's just do that. You're going to send a new patch, right? I'm a little confused by your comments: > We would still have to correct the length. So no benefit. So that does not offer any advantage. contradicting > looks really good. :-) [referring to the wrong regexp which I corrected later]. So how about: // Drop all non-base64 characters and also = which can be harmful in the middle. let sanitize = buffer.replace(/[^A-Za-z0-9+\/]/g,''); let excess = sanitize.length % 4; if (excess != 0) if (more) { // Keep excess for next time. buffer = sanitize.slice(-excess); sanitize = sanitize.substring(0, sanitize.length - excess); } else { buffer = ''; if (excess == 1) { // we can't pad with three =, so cut excess sanitize = sanitize.substring(0, sanitize.length - excess); } else if (excess == 2) { sanitize += "=="; } else if (excess == 3) { sanitize += "="; } } Actually, that's not right. Because you stripped "=", you create cases where you end up with 4*n-1 and when more is true, you'll return some excess where you shouldn't have. For AAA= you will even cause an endless loop, since you'll always return those three A's. Am I right? Let's do it my way, patch coming ;-)
Assignee | ||
Comment 25•7 years ago
|
||
How does that look? I had to put the hunk in a different spot since my test case Y=WJjZA== didn't work. The code right after the hunk removed the last =, then I removed the = in the middle leaving YWJjZA=, so excess of 3 characters was removed and only YWJj was decoded.
Attachment #8884042 -
Flags: review?(infofrommozilla)
Assignee | ||
Updated•7 years ago
|
Attachment #8883422 -
Flags: review?(jorgk)
Assignee | ||
Comment 26•7 years ago
|
||
Removing trailing spaces.
Attachment #8884042 -
Attachment is obsolete: true
Attachment #8884042 -
Flags: review?(infofrommozilla)
Attachment #8884043 -
Flags: review?(infofrommozilla)
Comment 27•7 years ago
|
||
Comment on attachment 8884043 [details] [diff] [review] 1301989-base64.patch (v1b) Review of attachment 8884043 [details] [diff] [review]: ----------------------------------------------------------------- ::: mailnews/mime/jsmime/jsmime.js @@ +635,5 @@ > if (/[^A-Za-z0-9+\/=]/.exec(text)) > return false; > > + // Remove harmful `=' chars in the middle. > + text = text.replace(/[=]+([A-Za-z0-9+\/])/g, '$1'); This should be placed inside decode_base64(), because there are more indirect calls to it. Search for 'ContentDecoders' and 'this._convertData'. Why '[=]+' and not just '=+'? ::: mailnews/mime/jsmime/test/test_header.js @@ +641,5 @@ > "Byte missing \ufffd"], // Replacement character for invalid input. > + > + // Test for bug 1301989: Tolerate = in the middle of base64. > + ["=?us-ascii?B?YWJjZA==?=X=?us-ascii?B?Y=WJjZA==?=X=?us-ascii?B?Y=WJjZ==A==?=", > + "abcdXabcdXabcd"], I have just done some tests myself. Maybe you want to take over some: | // Tolerate invalid Base64 encoding - Bug 1301989 | //["=?UTF-8?B?YWJjZA==?=", "abcd"], // correct | ["=?UTF-8?B?YWJjZA=?=", "abc"], // Length is not a multipel of 4 | ["=?UTF-8?B?YWJjZA?=", "abc"], | ["=?UTF-8?B?YWJjZ?=", "abc"], | ["=?UTF-8?B?Y=WJ=jZA==?=", "abcd"], // Additional '=' within the word | ["=?UTF-8?B?Sm=2FDo28=?=", "Jm\ufffd\u000e\ufffd\ufffd"], // Q encoded B string The test run was successful.
Attachment #8884043 -
Flags: review?(infofrommozilla) → review-
Updated•7 years ago
|
Attachment #8883422 -
Attachment is obsolete: true
Assignee | ||
Comment 28•7 years ago
|
||
(In reply to Alfred Peters from comment #27) > This should be placed inside decode_base64(), because there are more > indirect calls to it. Agreed. But then testcase Y=WJjZA== doesn't work, have you tried that? See comment #25. Do we want to add this code twice? > Why '[=]+' and not just '=+'? OK.
Comment 29•7 years ago
|
||
(In reply to Jorg K (GMT+2) from comment #28) > (In reply to Alfred Peters from comment #27) > > This should be placed inside decode_base64(), because there are more > > indirect calls to it. > Agreed. But then testcase Y=WJjZA== doesn't work, have you tried that? See > comment #25. | function decode_base64(buffer, more) { | // Drop all non-base64 characters | let sanitize = buffer.replace(/[^A-Za-z0-9+\/=]/g,''); | + // Drop all '=' that are not padding | + sanitize = sanitize.replace(/=+([A-Za-z0-9+\/])/g, '$1'); So it should work.
Assignee | ||
Comment 30•7 years ago
|
||
Hmm, you're not reading my comment #25. Let's read it together. Start with Y=WJjZA== Then we have this code: // Base64 strings must be a length of multiple 4, but it seems that some // mailers accidentally insert one too many `=' chars. Gracefully handle // this case; see bug 227290 for more information. if (text.length % 4 == 1 && text.charAt(text.length - 1) == '=') text = text.slice(0, -1); So I get Y=WJjZA= Now we remove the invalid = and get YWJjZA= Now we remove ZA= and finally decode YWJj. Oops, not what we wanted. That may be justifiable behaviour, but as a test Y=WJjZA== will decode to abc and not abcd.
Comment 31•7 years ago
|
||
(In reply to Jorg K (GMT+2) from comment #30) > Hmm, you're not reading my comment #25. Let's read it together. > > Start with Y=WJjZA== > > Then we have this code: > // Base64 strings must be a length of multiple 4, but it seems that some > // mailers accidentally insert one too many `=' chars. Gracefully handle > // this case; see bug 227290 for more information. > if (text.length % 4 == 1 && text.charAt(text.length - 1) == '=') > text = text.slice(0, -1); > So I get Y=WJjZA= Indeed. Isn't this code path covered by the tests? My test: | ["=?UTF-8?B?Y=WJ=jZA==?=", "abcd"], // Additional '=' within the word was successful. As I wrote in Comment 22, decode_base64() already cuts off chars above mod(4)=0. So the above code is obsolete, right? Then we could remove it.
Assignee | ||
Comment 32•7 years ago
|
||
["=?UTF-8?B?Y=WJ=jZA==?=", "abcd"] is successful since you added twp "=" and hence the code isn't triggered. But I agree, that extra processing can just go since we cut anyway. I'll update my patch.
Assignee | ||
Comment 33•7 years ago
|
||
OK, this should do it then.
Attachment #8884043 -
Attachment is obsolete: true
Attachment #8884095 -
Flags: review?(infofrommozilla)
Assignee | ||
Comment 34•7 years ago
|
||
BTW, mailnews/mime/test/unit/test_bug493544.js still passes which has a test for bug 227290.
Comment 35•7 years ago
|
||
Comment on attachment 8884095 [details] [diff] [review] 1301989-base64.patch (v2). Review of attachment 8884095 [details] [diff] [review]: ----------------------------------------------------------------- Great
Attachment #8884095 -
Flags: review?(infofrommozilla) → review+
Assignee | ||
Comment 36•7 years ago
|
||
https://hg.mozilla.org/comm-central/rev/23fafbba334ac5e12dd48bca5a8b10680265f6f1
Assignee: nobody → jorgk
Status: NEW → RESOLVED
Closed: 8 years ago → 7 years ago
Component: Search → MIME
Product: Thunderbird → MailNews Core
Resolution: --- → FIXED
Target Milestone: --- → Thunderbird 56.0
Version: 45 Branch → 45
Assignee | ||
Comment 37•7 years ago
|
||
Comment on attachment 8884095 [details] [diff] [review] 1301989-base64.patch (v2). Let's get this uplifted.
Attachment #8884095 -
Flags: approval-comm-esr52?
Attachment #8884095 -
Flags: approval-comm-beta+
Assignee | ||
Comment 38•7 years ago
|
||
Beta (TB 55): https://hg.mozilla.org/releases/comm-beta/rev/40791579fdc6
status-thunderbird55:
--- → fixed
status-thunderbird_esr52:
--- → affected
Assignee | ||
Updated•7 years ago
|
Attachment #8884095 -
Flags: approval-comm-esr52? → approval-comm-esr52+
Assignee | ||
Comment 39•7 years ago
|
||
TB 52 ESR https://hg.mozilla.org/releases/comm-esr52/rev/4314c18674d1590b402c1579052b486862b9adc0
Assignee | ||
Updated•7 years ago
|
status-thunderbird56:
--- → fixed
Reporter | ||
Comment 40•7 years ago
|
||
Thanks for the fix.
You need to log in
before you can comment on or make changes to this bug.
Description
•