Closed Bug 883767 Opened 11 years ago Closed 11 years ago

[Email] Unsupported encoding in some email makes the email client explode

Categories

(Firefox OS Graveyard :: Gaia::E-Mail, defect)

ARM
Gonk (Firefox OS)
defect
Not set
normal

Tracking

(blocking-b2g:1.3+, b2g18 wontfix, b2g-v1.1hd wontfix, b2g-v1.2 wontfix)

RESOLVED FIXED
blocking-b2g 1.3+
Tracking Status
b2g18 --- wontfix
b2g-v1.1hd --- wontfix
b2g-v1.2 --- wontfix

People

(Reporter: gerard-majax, Assigned: asuth)

Details

Attachments

(1 file)

Gaia master, when checking my mozilla email:
I/Gecko   ( 3314): WERR: Explosion while processing data EncodingError: The given encoding is not supported.
I/GeckoDump( 3314): ERR: onerror reporting: uncaught exception: EncodingError: The given encoding is not supported. @  : 0
E/GeckoConsole( 3314): [JavaScript Error: "uncaught exception: EncodingError: The given encoding is not supported."]
Does everything break, or just the one message?  I think it should just be the one message, although we lack the ability to just understand a message to be bad, so every time we re-sync that time window, we would get upset...
After adding some try/catch:

I/Gecko   ( 3636): Gaia Email: sourceEnc=iso-2022-cn
I/Gecko   ( 3636): Gaia Email: sourceEnc=iso-2022-cn
I/Gecko   ( 3636): Gaia Email: sourceEnc=iso-2022-cn
I/Gecko   ( 3636): Gaia Email: sourceEnc=iso-2022-cn
I/Gecko   ( 3636): WERR: Explosion while processing data TypeError: module.exports.mimeFunctions.decodeMimeWord(...) is undefined
I/Gecko   ( 3636): WERR: Stack: module.exports.decodeMimeWord@app://email.gaiamobile.org/js/ext/mimelib.js:343
I/Gecko   ( 3636): MailParser.prototype._replaceMimeWords/<@app://email.gaiamobile.org/js/ext/mailparser/mailparser.js:1614
I/Gecko   ( 3636): MailParser.prototype._replaceMimeWords@app://email.gaiamobile.org/js/ext/mailparser/mailparser.js:1615
I/Gecko   ( 3636): MailParser.prototype._processHeaderLine@app://email.gaiamobile.org/js/ext/mailparser/mailparser.js:942
I/Gecko   ( 3636): MailParser.prototype._processStateHeader@app://email.gaiamobile.org/js/ext/mailparser/mailparser.js:772
I/Gecko   ( 3636): MailParser.prototype._process@app://email.gaiamobile.org/js/ext/mailparser/mailparser.js:653
I/Gecko   ( 3636): parseFetch@app://email.gaiamobile.org/js/ext/mailapi/imap/probe.js:2037
I/Gecko   ( 3636): processData@app://email.gaiamobile.org/js/ext/mailapi/imap/probe.js:870
I/Gecko   ( 3636): ImapConnection.prototype.connect/<@app://email.gaiamobile.org/js/ext/mailapi/imap/probe.js:449
I/Gecko   ( 3636): EventEmitter.prototype.emit@app://email.gaiamobile.org/js/ext/mailapi/worker-bootstrap.js:10928
I/Gecko   ( 3636): NetSocket.prototype._ondata@app://email.gaiamobile.org/js/ext/mailapi/imap/probe.js
I/GeckoDump( 3636): ERR: onerror reporting: module.exports.mimeFunctions.decodeMimeWord(...) is undefined @ app://email.gaiamobile.org/js/ext/mimelib.js : 343
(In reply to Andrew Sutherland (:asuth) from comment #1)
> Does everything break, or just the one message?  I think it should just be
> the one message, although we lack the ability to just understand a message
> to be bad, so every time we re-sync that time window, we would get upset...

It breaks the whole thread update. I can update subfolders, and other mailboxes, but this one is fried.
Okay, so we'll definitely need some more guards on bad encodings.  I suppose we should just fall-back to ASCII or utf-8 or something and just expect gibberish to happen.  Probably the least bad option.

In terms of supporting iso-2022-cn, http://encoding.spec.whatwg.org/ explicitly seems to not like the encoding.  It suggests our TextDecoder should recognize the encoding, but should use the replacement decoder which just emits an error and an EOF.

Can you provide the source of the message in question or indicate what the sending mail app was and/or if the message is spam?

When you say gaia master, is this against mozilla-central, or b2g18?
> I/Gecko   ( 3636): Gaia Email: sourceEnc=iso-2022-cn

I'm really surprised that someone uses iso-2022-cn at all because Windows has never been supported this encoding. Could you disclose who sent this message?
Bug 470523 indicates we are unable to correctly decode iso-2022-cn ourselves and most other browsers can't either.  It seems like it has been a source of XSS attacks, etc.  So other than making sure we handle the invocation of the 'replacement' decoder, we should probably treat iso-2022-cn as an unsupported encoding.
(In reply to Andrew Sutherland (:asuth) from comment #4)
> Okay, so we'll definitely need some more guards on bad encodings.  I suppose
> we should just fall-back to ASCII or utf-8 or something and just expect
> gibberish to happen.  Probably the least bad option.
> 
> In terms of supporting iso-2022-cn, http://encoding.spec.whatwg.org/
> explicitly seems to not like the encoding.  It suggests our TextDecoder
> should recognize the encoding, but should use the replacement decoder which
> just emits an error and an EOF.
> 
> Can you provide the source of the message in question or indicate what the
> sending mail app was and/or if the message is spam?
> 
> When you say gaia master, is this against mozilla-central, or b2g18?

gaia master on top of gecko b2g18
(In reply to Andrew Sutherland (:asuth) from comment #4)
> It suggests our TextDecoder
> should recognize the encoding, but should use the replacement decoder which
> just emits an error and an EOF.

We don't support the replacement encoding yet (bug 863728), so our implementation will just throw at the moment.
(In reply to Andrew Sutherland (:asuth) from comment #6)
> Bug 470523 indicates we are unable to correctly decode iso-2022-cn ourselves

TextDecoder explicitly whitelists the usable encodings to comply with the spec, so our buggy decoder for iso-2022-cn will not be used.
(In reply to Masatoshi Kimura [:emk] from comment #5)
> > I/Gecko   ( 3636): Gaia Email: sourceEnc=iso-2022-cn
> 
> I'm really surprised that someone uses iso-2022-cn at all because Windows
> has never been supported this encoding. Could you disclose who sent this
> message?

After dumping a bit more, I can tell that the imputed string is from one of the mails I've got from Steven Yang. And it is in the headers (it's the name of one of the Cc'd people).

Headers show clearly "Content-Type: text/plain; charset=iso-2022-cn", mailer is "X-Mailer: Apple Mail (2.1503)".
This workaround works for me:
diff --git a/apps/email/js/ext/mailapi/worker-bootstrap.js b/apps/email/js/ext/mailapi/worker-bootstrap.js
index 8035167..0bf5984 100644
--- a/apps/email/js/ext/mailapi/worker-bootstrap.js
+++ b/apps/email/js/ext/mailapi/worker-bootstrap.js
@@ -11638,7 +11638,12 @@ exports.convert = function(str, destEnc, sourceEnc, ignoredUseLite) {
 
   // - decoding (Uint8Array => String)
   else if (/^utf-8$/.test(destEnc)) {
-    var decoder = new TextDecoder(sourceEnc, ENCODER_OPTIONS);
+    var decoder;
+    try {
+      decoder = new TextDecoder(sourceEnc, ENCODER_OPTIONS);
+    } catch (e) {
+      decoder = new TextDecoder('utf-8', ENCODER_OPTIONS);
+    }
     if (typeof(str) === 'string')
       str = new Buffer(str, 'binary');
     // XXX strictly speaking, we should be returning a buffer...
(In reply to Alexandre LISSY :gerard-majax from comment #11)
> This workaround works for me:
> diff --git a/apps/email/js/ext/mailapi/worker-bootstrap.js
> b/apps/email/js/ext/mailapi/worker-bootstrap.js
> index 8035167..0bf5984 100644
> --- a/apps/email/js/ext/mailapi/worker-bootstrap.js
> +++ b/apps/email/js/ext/mailapi/worker-bootstrap.js
> @@ -11638,7 +11638,12 @@ exports.convert = function(str, destEnc, sourceEnc,
> ignoredUseLite) {
>  
>    // - decoding (Uint8Array => String)
>    else if (/^utf-8$/.test(destEnc)) {
> -    var decoder = new TextDecoder(sourceEnc, ENCODER_OPTIONS);
> +    var decoder;
> +    try {
> +      decoder = new TextDecoder(sourceEnc, ENCODER_OPTIONS);
> +    } catch (e) {
> +      decoder = new TextDecoder('utf-8', ENCODER_OPTIONS);
> +    }
>      if (typeof(str) === 'string')
>        str = new Buffer(str, 'binary');
>      // XXX strictly speaking, we should be returning a buffer...

The only culprit here is an invalid string display for the part that was really in ISO-2022-CN, as expected.
This looks like a uncommon encoding and not a blocker for 1.1 as we are past feature/enhancement request at this stage..if this is a popular encoding we could consider supporting in our future release.

:asuth, depending on how common this may be can you please set koi:? to get support in 1.2
blocking-b2g: leo? → -
(In reply to bhavana bajaj [:bajaj] from comment #13)
> This looks like a uncommon encoding and not a blocker for 1.1 as we are past
> feature/enhancement request at this stage..if this is a popular encoding we
> could consider supporting in our future release.
> 
> :asuth, depending on how common this may be can you please set koi:? to get
> support in 1.2

That encoding might not be common, but are we sure this bug could not pop under other circumstances ? It basically makes the folder containing the email unusable.
blocking-b2g: - → leo?
Flags: needinfo?(bugmail)
Triage - non-blocking for now. 
Andrew can you make an assessment on the LOE and risk if this needs to be fixed and renom appropriately.

Thanks.
blocking-b2g: leo? → ---
The LOE is very low; Alexandre's patch in comment 11 is sufficient, although a comment on what we're doing and a unit test wouldn't hurt.  The probability of someone having a message with an unsupported encoding is on the low side, but since it breaks message sync on a folder basis, it makes for a very easy denial-of-service griefing attack.  Since we're already taking other things on v1.1 this week, I think we should fix this and take it too.  It doesn't need to be blocking.

I can fix up the patch with a comment and some limited test coverage today.
Assignee: nobody → bugmail
Status: NEW → ASSIGNED
Flags: needinfo?(bugmail)
(In reply to Andrew Sutherland (:asuth) from comment #16)
> The LOE is very low; Alexandre's patch in comment 11 is sufficient, although
> a comment on what we're doing and a unit test wouldn't hurt.  The
> probability of someone having a message with an unsupported encoding is on
> the low side, but since it breaks message sync on a folder basis, it makes
> for a very easy denial-of-service griefing attack.  Since we're already
> taking other things on v1.1 this week, I think we should fix this and take
> it too.  It doesn't need to be blocking.
> 
> I can fix up the patch with a comment and some limited test coverage today.

I've been working on this for the past couple of hours, and I'm stuck. On your gaia-email-libs-and-more repo, hacking to fix this and add unit tests, but I can't reproduce the issue !

With the following change, and a b2g-desktop 18 from today, TextDecoder() does not throw.

diff --git a/test/unit/test_intl_unit.js b/test/unit/test_intl_unit.js
index 506a8b5..c7063cf 100644
--- a/test/unit/test_intl_unit.js
+++ b/test/unit/test_intl_unit.js
@@ -32,4 +32,9 @@ TD.commonSimple('encoding aliases', function(eLazy) {
   }
 });
 
+TD.commonSimple('convert not throwing', function(eLazy) {
+  $encoding.convert('aaa', 'utf-8', 'iso-8859-1');
+  $encoding.convert('bbb', 'utf-8', 'iso-2022-cn');
+});
+
 }); // end define
(In reply to Alexandre LISSY :gerard-majax from comment #17)
> (In reply to Andrew Sutherland (:asuth) from comment #16)
> > The LOE is very low; Alexandre's patch in comment 11 is sufficient, although
> > a comment on what we're doing and a unit test wouldn't hurt.  The
> > probability of someone having a message with an unsupported encoding is on
> > the low side, but since it breaks message sync on a folder basis, it makes
> > for a very easy denial-of-service griefing attack.  Since we're already
> > taking other things on v1.1 this week, I think we should fix this and take
> > it too.  It doesn't need to be blocking.
> > 
> > I can fix up the patch with a comment and some limited test coverage today.
> 
> I've been working on this for the past couple of hours, and I'm stuck. On
> your gaia-email-libs-and-more repo, hacking to fix this and add unit tests,
> but I can't reproduce the issue !
> 
> With the following change, and a b2g-desktop 18 from today, TextDecoder()
> does not throw.
> 
> diff --git a/test/unit/test_intl_unit.js b/test/unit/test_intl_unit.js
> index 506a8b5..c7063cf 100644
> --- a/test/unit/test_intl_unit.js
> +++ b/test/unit/test_intl_unit.js
> @@ -32,4 +32,9 @@ TD.commonSimple('encoding aliases', function(eLazy) {
>    }
>  });
>  
> +TD.commonSimple('convert not throwing', function(eLazy) {
> +  $encoding.convert('aaa', 'utf-8', 'iso-8859-1');
> +  $encoding.convert('bbb', 'utf-8', 'iso-2022-cn');
> +});
> +
>  }); // end define

I haven't been able to reproduce the issue when working on the gaia-email-libs-and-client repo, thus unable to write any test. However, with up to date Gaia master of yesterday, I'm reproducing the issue on another folder of another imap account.

So I'm thinking this issue might not be that spurious than initially thought, and it might deserve a quicker fix.
Okay so this is the third or fourth time now that I'm hitting this issue. Can we still consider this as "very unfrequent" ?

I'm willing to propose a PR but I might not be able to propose the associated test :(
blocking-b2g: --- → 1.3?
Flags: needinfo?(bugmail)
triage: blocking for 1.3
blocking-b2g: 1.3? → 1.3+
Comment on attachment 8335159 [details] [review]
mozilla-b2g:master PR#262

This is basically :gerard-majax's patch with some extra comments and unit test stuff on top.
Attachment #8335159 - Flags: review+
Flags: needinfo?(bugmail)
Whiteboard: checkin-needed
\o/ you did the unit tests :)
Whiteboard: checkin-needed
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: