Closed
Bug 1402813
Opened 7 years ago
Closed 7 years ago
Re-establish UTF-7 support in JS Mime for headers and C++ MIME for bodies. Re-establish UTF-7 tests.
Categories
(MailNews Core :: MIME, enhancement)
MailNews Core
MIME
Tracking
(thunderbird60+ fixed, thunderbird61 fixed)
RESOLVED
FIXED
Thunderbird 61.0
People
(Reporter: jorgk-bmo, Assigned: jorgk-bmo)
Details
Attachments
(2 files, 6 obsolete files)
306 bytes,
text/plain
|
Details | |
17.37 KB,
patch
|
mkmelin
:
review+
jorgk-bmo
:
approval-comm-beta+
|
Details | Diff | Splinter Review |
UTF-7 decoding was removed in bug 1363281 by switching to the new Rust-based encoding_rs. Tests were also removed here:
https://hg.mozilla.org/comm-central/rev/322986aa2dbabda0ba031d76ba2dc959c7cc9f7a#l3.12
var tests = [
- ["=?UTF-7?Q?+AKM-1?=", "\u00A31"],
- ["=?UTF-7?Q?+AK?= =?UTF-7?Q?M-1?=", "\u00A31"],
Three things are needed to re-establish UTF-7 support in JS Mime:
Implement a scriptable encoder/decoder like nsICharsetConverterManager.mutf7ToUnicode() and nsICharsetConverterManager.unicodeToMutf7().
Integrate this into the FakeTextDecoder in jsmime.jsm.
Re-establish the tests.
Assignee | ||
Comment 1•7 years ago
|
||
This does the trick, it's basically a copy/paste job of what we have for MUTF-7.
mozilla/mach xpcshell-test mailnews/mime/test/unit/test_jsmime_charset.js
passes.
I'm not sure how you want to handle the integration in jsmime.jsm. I think we could make the fake decoder exclusively for UTF-7. We could even take the try/catch out and just use the fake decoder if UTF-7 in encountered.
For the time being I fiddled it into the existing code, but since a) the scriptable unicode converter is on the way out (but we might have to port it to C-C) and b) it doesn't do anything else that the "real" text decoder doesn't do, we might as well get rid of it.
Please let me know.
Attachment #8912092 -
Flags: review?(Pidgeot18)
Assignee | ||
Comment 2•7 years ago
|
||
Comment on attachment 8912092 [details] [diff] [review]
1402813-utf-7.patch - (v1)
Review of attachment 8912092 [details] [diff] [review]:
-----------------------------------------------------------------
::: mailnews/mime/src/jsmime.jsm
@@ +43,5 @@
> "@mozilla.org/charset-converter-manager;1"]
> .createInstance(Components.interfaces.nsICharsetConverterManager);
> + this.charset = manager.getCharsetAlias(label);
> + if (this.charset.toLowerCase() != "utf-7")
> + this._encoder.charset = manager.getCharsetAlias(label);
Oops, use this.charset.
Assignee | ||
Comment 3•7 years ago
|
||
And of course we only need UTF-7 to unicode, so I could take the other direction out again.
Assignee | ||
Updated•7 years ago
|
tracking-thunderbird60:
--- → +
Assignee | ||
Comment 4•7 years ago
|
||
Here is the rebased patch. Joshua said on IRC in September 2017(!!) (2017-09-30T22:39:26.000Z)
jorgk: jcranmer: You can have utf-7 back, just do the review ;-)
jcranmer: jorgk: I don't like that patch for other reasons, but I might be able to get to it tonight to fix it
Sadly, more than six months later I haven't heard anything.
Apparently we need UTF-7 support form e-mail from Windows Live Mail which is still supported under Windows 10.
The C++ parts of the patch are copied from MUTF-7 (used for IMAP folder names) so there's nothing not to like. Test just re-establishes what was previously removed, again, nothing not to like.
So the only thing that Joshua might not like is in jsmime.jsm. If anyone can second-guess his intentions, please step forward.
The test in question is:
mach xpcshell-text comm/mailnews/mime/test/unit/test_jsmime_charset.js
(with M-C as top source dir).
Attachment #8912092 -
Attachment is obsolete: true
Attachment #8912092 -
Flags: review?(Pidgeot18)
Attachment #8970372 -
Flags: review?(mkmelin+mozilla)
Attachment #8970372 -
Flags: review?(acelists)
Attachment #8970372 -
Flags: review?(Pidgeot18)
Assignee | ||
Comment 5•7 years ago
|
||
Minor change according to comment #2.
Attachment #8970372 -
Attachment is obsolete: true
Attachment #8970372 -
Flags: review?(mkmelin+mozilla)
Attachment #8970372 -
Flags: review?(acelists)
Attachment #8970372 -
Flags: review?(Pidgeot18)
Attachment #8970375 -
Flags: review?(mkmelin+mozilla)
Attachment #8970375 -
Flags: review?(acelists)
Attachment #8970375 -
Flags: review?(Pidgeot18)
Comment 6•7 years ago
|
||
My sitting on this for months is inexcusable, but there is a reason for it.
Last September, I had lots of time owing to my previous job ending in August and the new one starting in October. When in the new job, I lost the ability to keep one eye on bugmail throughout the working day, with the predictable mess that I garnered a large backlog of email which meant most of my open source time consisted of working through email backlogs. And when it came to emails about this bug... I forgot what I objected to. The first thought that always crossed my mind was "I don't want progressive decoding," but rereading the bug, it doesn't do it so it can't be it.
Actually, in writing this up, I finally remembered what the problem was. One of the original things that prompted UTF-7 support was messages being sent with UTF-7 bodies. This patch, AIUI, doesn't support UTF-7 bodies in messages, only UTF-7 in RFC2047 support.
Comment 7•7 years ago
|
||
Comment on attachment 8970375 [details] [diff] [review]
1402813-utf-7.patch - (v1b) rebased
Review of attachment 8970375 [details] [diff] [review]:
-----------------------------------------------------------------
::: mailnews/intl/nsICharsetConverterManager.idl
@@ +63,5 @@
> /**
> + * Support for UTF-7 used by JS Mime.
> + */
> + AString utf7ToUnicode(in string aMutf7);
> + ACString unicodeToUtf7(in wstring aUnicode);
Using string and wstring in IDL is frowned upon. You can replace string with ACString, as I'm pretty sure we don't have any warnings anymore about high-bits being set in these strings when crossing XPConnect.
(I'd say octet array, since UInt8Array can be passed to an octet array argument, but you need to concatenate the arrays anyways, so the benefit here).
unicodeToUtf7 is unused and should be deleted. Let's not pretend that we want to support encoding anything to UTF-7.
Assignee | ||
Comment 8•7 years ago
|
||
Comment on attachment 8970375 [details] [diff] [review]
1402813-utf-7.patch - (v1b) rebased
OK, I'll address Joshua's comments and do some special casing in nsMsgI18N.cpp to cater for UTF-7 bodies.
Attachment #8970375 -
Flags: review?(mkmelin+mozilla)
Attachment #8970375 -
Flags: review?(acelists)
Attachment #8970375 -
Flags: review?(Pidgeot18)
Assignee | ||
Comment 9•7 years ago
|
||
Removed the UTF-16 to UTF-7 encoding.
Added special case to decode UTF-7 in nsMsgI18NConvertToUnicode() for bodies.
Switched string/wstring uses to nsACString/nsAString, also for pre-existing MUTF-7.
Assignee: nobody → jorgk
Attachment #8970375 -
Attachment is obsolete: true
Status: NEW → ASSIGNED
Attachment #8970840 -
Flags: review?(Pidgeot18)
Assignee | ||
Comment 10•7 years ago
|
||
Comment on attachment 8970840 [details] [diff] [review]
1402813-utf-7.patch - (v2)
I've just noticed that we should enable test_decode_utf-7.js again and also restore the testcase with UTF-7 in test-charset-edit.js by reverting this:
https://hg.mozilla.org/comm-central/rev/c744963d757b#l1.16
Attachment #8970840 -
Flags: review?(Pidgeot18)
Assignee | ||
Comment 11•7 years ago
|
||
UTF-7 bodies now display correctly. The Xpcshell test works again. The Mozmill test almost works, somehow getMsgHeaders(msg, true).get("") returns an empty body since some part of JS Mime still can't handle UTF-7. That's strange since the tricky fake decoder should handle it. Needs more investigation.
Joshua, any hints?
Attachment #8970840 -
Attachment is obsolete: true
Flags: needinfo?(Pidgeot18)
Assignee | ||
Comment 12•7 years ago
|
||
I added some debug and for the Xpcshell test test_jsmime_charset.js fake decoders are created and I can see that the utf-7 string is actually decoded, debug is:
0:01.09 pid:5656 === decoding utf-7 of +AKM-1
0:01.09 pid:5656 == JS stack> resource:///modules/jsmime.jsm (60)
0:01.09 pid:5656 == JS stack> resource:///modules/jsmime.jsm -> resource:///modules/jsmime/jsmime.js (695)
0:01.09 pid:5656 == JS stack> resource:///modules/jsmime.jsm -> resource:///modules/jsmime/jsmime.js (713)
0:01.09 pid:5656 == JS stack> resource:///modules/jsmime.jsm -> resource:///modules/jsmime/jsmime.js (211)
0:01.09 pid:5656 == JS stack> resource:///modules/jsmime.jsm -> resource:///modules/jsmime/jsmime.js (1404)
0:01.09 pid:5656 == JS stack> c:/mozilla-source/comm-central/obj-x86_64-pc-mingw32/_tests/xpcshell/comm/mailnews/mime/test/unit/test_jsmime_charset.js (26)
Not so for the Mozmill test. Again, a fake decoder is created for utf-7, but nothing ever gets decoded. JS stack is:
=== creating fake decoder for utf-7
== JS stack> resource:///modules/jsmime.jsm (93)
== JS stack> resource:///modules/jsmime.jsm -> resource:///modules/jsmime/jsmime.js (2428)
== JS stack> resource:///modules/jsmime.jsm -> resource:///modules/jsmime/jsmime.js (2149)
== JS stack> resource:///modules/jsmime.jsm -> resource:///modules/jsmime/jsmime.js (1979)
== JS stack> resource:///modules/mimeParser.jsm (121)
== JS stack> chrome://mozmill/content/modules/utils.js (437)
== JS stack> chrome://mozmill/content/modules/frame.js -> file:///c:/mozilla-source/comm-central/comm/mail/test/mozmill/composition/test-charset-edit.js (81)
== JS stack> chrome://mozmill/content/modules/frame.js -> file:///c:/mozilla-source/comm-central/comm/mail/test/mozmill/composition/test-charset-edit.js (146)
test-charset-edit.js (146) is
dump("=== getMsgHeaders(msg, true).get() = |"+getMsgHeaders(msg, true).get("")+"\n").
jsmime.js (2428) is:
this._decoder = new TextDecoder(this._charset);
in function Parser_startBody()
So the right decoder is created, but not called on the utf-7 body.
===
Finally I found the problem:
Body decode does this:
if (this._decoder)
return this._decoder.decode(buffer, {stream: more});
And my simple decoder does this:
if (more)
return "";
Flags: needinfo?(Pidgeot18)
Assignee | ||
Comment 13•7 years ago
|
||
Found and fixed the bug in JS Mime. Can't all the encoder with 'true' on the last call.
Attachment #8970935 -
Attachment is obsolete: true
Attachment #8971089 -
Flags: review?(Pidgeot18)
Assignee | ||
Comment 14•7 years ago
|
||
Comment on attachment 8971089 [details] [diff] [review]
1402813-utf-7.patch - (v4)
Green try.
Attachment #8971089 -
Flags: review?(mkmelin+mozilla)
Attachment #8971089 -
Flags: review?(acelists)
Assignee | ||
Updated•7 years ago
|
Attachment #8971089 -
Flags: review?(acelists)
Assignee | ||
Comment 15•7 years ago
|
||
Comment 16•7 years ago
|
||
Comment on attachment 8971089 [details] [diff] [review]
1402813-utf-7.patch - (v4)
Review of attachment 8971089 [details] [diff] [review]:
-----------------------------------------------------------------
::: mailnews/base/util/nsMsgI18N.cpp
@@ +110,5 @@
> {
> #define IMAP_UTF7_BUF_LENGTH 100
> nsUnicodeToMUTF7 converter;
> static char buffer[IMAP_UTF7_BUF_LENGTH];
> + char16_t *in = (char16_t *)aSrc.BeginReading();
I wonder why this needs the cast.
::: mailnews/mime/src/jsmime.jsm
@@ +16,5 @@
>
> var EXPORTED_SYMBOLS = ["jsmime"];
>
> +function bytesToString(buffer) {
> + var string = '';
Double quotes please.
@@ +18,5 @@
>
> +function bytesToString(buffer) {
> + var string = '';
> + for (var i = 0; i < buffer.length; i++)
> + string += String.fromCharCode(buffer[i]);
I think we require {} for loops.
::: mailnews/mime/src/mimemoz2.cpp
@@ +814,5 @@
> + nsAutoString utf16;
> + rv = CopyUTF7toUTF16(nsDependentCString(stringToUse, inLength), utf16);
> + if (NS_FAILED(rv))
> + return -1;
> + CopyUTF16toUTF8(utf16, outString);
Does the Copy* return a success value? Can we check it?
Like return NS_SUCCEEDED(CopyUTF16toUTF8(utf16, outString))? 0 : -1);
Assignee | ||
Comment 17•7 years ago
|
||
(In reply to :aceman from comment #16)
> I wonder why this needs the cast.
I can try and get rid of it.
> Double quotes please.
No, that's not the style in JS Mime.
> I think we require {} for loops.
No, that's not the style in JS Mime.
> > + CopyUTF16toUTF8(utf16, outString);
> Does the Copy* return a success value? Can we check it?
Infallible:
https://searchfox.org/mozilla-central/rev/08df4e6e11284186d477d7e5b0ae48483ecc979c/xpcom/string/nsReadableUtils.cpp#125
How could converting UTF-16 to UTF-8 fail?
So is that an r+ then modulo the cast ;-)
Assignee | ||
Comment 18•7 years ago
|
||
Removed the cast. Sorry about the noise.
Attachment #8971089 -
Attachment is obsolete: true
Attachment #8971089 -
Flags: review?(mkmelin+mozilla)
Attachment #8971089 -
Flags: review?(acelists)
Attachment #8971089 -
Flags: review?(Pidgeot18)
Attachment #8971965 -
Flags: review?(mkmelin+mozilla)
Attachment #8971965 -
Flags: review?(acelists)
Attachment #8971965 -
Flags: review?(Pidgeot18)
Comment 19•7 years ago
|
||
Is JSMime allowed to have its own style? :)
The linked code seems to make a fallible allocation. Just that the whole function returns 'void'.
I was misled by the other CopyUTF7toUTF16 functions that do return a value.
I'd think converting utf-16 to utf-8 can fail on invalid bytes easily, unless we already checked that the utf-16 string is valid. OK, maybe the function would then be called CopyRandombytestreamToUTF8().
Assignee | ||
Comment 20•7 years ago
|
||
Assignee | ||
Comment 21•7 years ago
|
||
Allocation in Mozilla is equally infallible ;-) - That means that control does not return to the caller in case of failure.
Assignee | ||
Updated•7 years ago
|
Summary: Re-establish UTF-7 support in JS Mime → Re-establish UTF-7 support in JS Mime for headers and C++ MIME for bodies. Re-establish UTF-7 tests.
Comment 22•7 years ago
|
||
Comment on attachment 8971965 [details] [diff] [review]
1402813-utf-7.patch - (v4b)
Review of attachment 8971965 [details] [diff] [review]:
-----------------------------------------------------------------
Looks good to me, r=mkmelin
Attachment #8971965 -
Flags: review?(mkmelin+mozilla) → review+
Comment 23•7 years ago
|
||
Pushed by mozilla@jorgk.com:
https://hg.mozilla.org/comm-central/rev/f622e6a7a742
Re-establish UTF-7 support in message headers and bodies. r=mkmelin
Status: ASSIGNED → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Assignee | ||
Updated•7 years ago
|
Target Milestone: --- → Thunderbird 61.0
Assignee | ||
Comment 24•7 years ago
|
||
Comment on attachment 8971965 [details] [diff] [review]
1402813-utf-7.patch - (v4b)
[Triage Comment]
Maybe Joshua still want so look at it.
Attachment #8971965 -
Flags: review?(acelists) → approval-comm-beta+
Assignee | ||
Comment 25•7 years ago
|
||
Beta (TB 60 beta 6):
https://hg.mozilla.org/releases/comm-beta/rev/08f57e39702c2a30fa559e4f3866e20d826f9433
status-thunderbird60:
--- → fixed
status-thunderbird61:
--- → fixed
Assignee | ||
Updated•6 years ago
|
Attachment #8971965 -
Flags: review?(Pidgeot18)
You need to log in
before you can comment on or make changes to this bug.
Description
•