Replace use of escape() and unescape() in hacky UTF-8 <--> UTF-16 conversion
Categories
(MailNews Core :: Feed Reader, task)
Tracking
(Not tracked)
People
(Reporter: jorgk-bmo, Unassigned)
+++ This bug was initially created as a clone of Bug #1349722 +++
https://hg.mozilla.org/comm-central/rev/55b04a77e7610a1907960a9268f5e816869cddc8
looks quite hacky and both escape() and unescape() are deprecated.
The JS Mime way to do the UTF-8 to UTF-16 conversion is this:
https://searchfox.org/comm-central/rev/d86758c2328ae10f2d3f8b0422772e33e858f089/mailnews/mime/jsmime/jsmime.js#577
Basically, it's just using a "normal" TextDecoder() for "UTF-8".
Comment 1•6 years ago (Reporter)
Henri, what do you think of unescape(encodeURIComponent(source)) and decodeURIComponent(escape(url)) as UTF-16 to UTF-8 and UTF-8 to UTF-16 conversions in JS? Hacky? Our mail headers may contain raw UTF-8, and we need to convert from that to JS strings in UTF-16. Is there a better way? Maybe there is some code in M-C that does the same.
As stated in comment #0, JS Mime does the raw UTF-8 to JS string conversion using a byte array and a text decoder.
Comment 2•6 years ago
(In reply to Jorg K (GMT+2) from comment #1)
> Henri, what do you think of unescape(encodeURIComponent(source)) and decodeURIComponent(escape(url)) as UTF-16 to UTF-8 and UTF-8 to UTF-16 conversions in JS? Hacky?
That solution seems hacky and inefficient.
> Our mail headers may contain raw UTF-8 and we need to convert from that to JS strings in UTF-16. Is there a better way? Maybe there is some code in M-C that does the same.
I suggest:
function binaryStringToArrayBuffer(str) {
  // Each code unit of str holds one byte value (0..255).
  let buf = new Uint8Array(str.length);
  for (let i = 0; i < str.length; i++) {
    buf[i] = str.charCodeAt(i);
  }
  return buf;
}

function decodeUtf8BytesInString(bytesAsUtf16LowerHalves) {
  // TextDecoder defaults to UTF-8.
  return new TextDecoder().decode(binaryStringToArrayBuffer(bytesAsUtf16LowerHalves));
}
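A short usage sketch of the same idea, written compactly with Uint8Array.from(). The helper name decodeUtf8BinaryString and the sample inputs are hypothetical; the sketch assumes the input really is a binary string, i.e. every code unit is in the range 0..255.

```javascript
// Hypothetical helper name; mirrors the approach suggested above.
function decodeUtf8BinaryString(binaryString) {
  // Copy each code unit (expected to be 0..255) into a byte array.
  const bytes = Uint8Array.from(binaryString, (ch) => ch.charCodeAt(0));
  // Without the `fatal` option, malformed sequences decode to U+FFFD.
  return new TextDecoder("utf-8").decode(bytes);
}

// "é" is encoded in UTF-8 as the byte pair 0xC3 0xA9.
const decoded = decodeUtf8BinaryString("caf\u00C3\u00A9");

// A lone 0xFF byte is never valid UTF-8.
const replaced = decodeUtf8BinaryString("\u00FF");
```

If malformed header bytes should raise an error instead of being replaced, TextDecoder also accepts `{ fatal: true }` as its second constructor argument; whether that is desirable for mail headers is a design choice the thread does not settle.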
Comment 3•6 years ago
I should have read the earlier comments more carefully. That's exactly what JSMime does (but with different function names).
Comment 4•6 years ago (Reporter)
Thanks, yes, JS Mime does that. And for the way back, UTF-16 to UTF-8?
Comment 5•6 years ago
(In reply to Jorg K (GMT+2) from comment #4)
> Thanks, yes, JS Mime does that. And for the way back, UTF-16 to UTF-8?
After TextEncoder, I don't know if there's a more efficient way to convert a Uint8Array into a string in which each code unit represents a byte value than to call String.fromCharCode() on a per-byte basis and concatenate the results.
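The reverse direction described here could be sketched roughly as follows. The function name encodeToUtf8BinaryString is hypothetical and is not taken from the thread or from JSMime; the per-byte loop is the straightforward variant, not a tuned one.

```javascript
// Hypothetical helper name, not taken from the thread or from JSMime.
function encodeToUtf8BinaryString(str) {
  // TextEncoder always produces UTF-8 bytes in a Uint8Array.
  const bytes = new TextEncoder().encode(str);
  let result = "";
  for (const byte of bytes) {
    // One output code unit per byte value.
    result += String.fromCharCode(byte);
  }
  return result;
}

// "é" becomes the two code units 0xC3 and 0xA9.
const binary = encodeToUtf8BinaryString("café");
```

Feeding the result into a byte-array-plus-TextDecoder conversion like the one in comment 2 should give back the original string.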
Comment 6•6 years ago (Reporter)
Thanks, Henri. I'll get to it. Not our most pressing issue, I just saw this in passing.