957424 - /lib/sdk/io/buffer.js should support ASCII encoding option

Reporter

Description

•

12 years ago

User Agent: Mozilla/5.0 (Windows NT 5.1; rv:29.0) Gecko/20100101 Firefox/29.0 (Beta/Release) Build ID: 20131228030204 Steps to reproduce: http://pastebin.mozilla.org/3962845 // make a random string. var charArray = Array.apply(null, Array(20)).map(() => String.fromCharCode(parseInt(Math.random() * 0xFF)+1)); var string = charArray.join(''); // construct a buffer using the string var buffer = new Buffer(string); // will default to UTF-8 encoding option console.log(buffer.length, buffer.byteLength); // would be 20, 20 if the ASCII encoding option was supported. console.log(Buffer.byteLength(string)); // actual results vary and is unreliable not that it should be consistent with an ASCII only encoding

rodneyd.teal

Reporter

Comment 1

•

12 years ago

I think that StringView should replace TextEncoder and TextDecoder which would create support for the ASCII encoding. StringView: https://developer.mozilla.org/en-US/Add-ons/Code_snippets/StringView I've made the changes here: https://github.com/DecipherCode/FireBit/commit/87aa6dea4590bda0334ff2346232f8a06e9b8381#diff-b11a7397c4f97594328821e7e713ca5d This seems to be the best solution.

Wes Kocher (:KWierso) (Not reading bugmail; email directly if needed)

Updated

•

12 years ago

Priority: -- → P2

rodneyd.teal

Reporter

Comment 2

•

12 years ago

StringView: Its not a WebGL Standard as such its of no concern to Bug 889977. Although perhaps it should be as it depends upon ArrayBuffer and TypedArrays: https://www.khronos.org/registry/webgl/specs/1.0/#5.13 StringView has been proposed as strawman for ES6 on ECMAScript Bugs. Depends: https://bugs.ecmascript.org/show_bug.cgi?id=1557 Its not in the ES6 draft at the moment: http://people.mozilla.org/~jorendorff/es6-draft.html I'm waiting on a reply from it's developer about status and expected time frame. Thoughts?

rodneyd.teal

Reporter

Comment 3

•

12 years ago

To quote Brendan Eich from a recent ES discussion regarding StringView: What can we do to make sure this thing stays dead, if it is dead? That said StringView is not headed towards being available as a native Web API. TextEncoder and TextDecoder serve the Buffer API nicely with the exception regarding interpreting chars as ASCII encoded, bytes(0-255) code points. The problem is when the code point is > 127, you may end up with a two byte representation of that code point. The desired effect when specifying ASCII as the encoding argument is that code points in the range of a single byte(0-255) would be interpreted as a single byte value. i.e. let string = String.fromCharCode(255); let buffer = new Buffer(string, 'ASCII'); // result buffer.byteLength === 1 // true buffer.length === 1 // true I'm no longer certain that StringView is the best solution as far as efficiency goes. A simple check for the encoding argument. -> if (encoding === 'ASCII') .... Then: string.split('').map((char) => String.charCodeAt(char)); Is all that is needed to create an array of bytes to construct the Buffer(Object) with. The "buffer.toString()" method could do a similar check and conversion back to string. Thoughts?

Flags: needinfo?(jsantell)

Jordan Santell [:jsantell] [@jsantell]

Comment 4

•

12 years ago

If that one liner can handle encoding/decoding in ASCII, sounds great! Of note, the tests for buffer were taken from node.js, so the ASCII tests should be reenabled in that case, and maybe pull out an encode/decode function (not exposed) where it uses Text[En|Dec]oder or the ascii specific line. Also should be added to 'isEncoding' if everything works out.

Flags: needinfo?(jsantell)

rodneyd.teal

Reporter

Comment 5

•

12 years ago

(In reply to Jordan Santell [:jsantell] [@jsantell] from comment #4) > If that one liner can handle encoding/decoding in ASCII, sounds great! Of > note, the tests for buffer were taken from node.js, so the ASCII tests > should be reenabled in that case, and maybe pull out an encode/decode > function (not exposed) where it uses Text[En|Dec]oder or the ascii specific > line. Also should be added to 'isEncoding' if everything works out. I think I need to be clear about the "ASCII" of which I'm referring to. ASCII which is a subset of UTF-8 will be handled properly when specifying the encoding as UTF-8. ISO 8859-1(Extended ASCII)(0-255) might be represented as more than one byte when the code point is (p > 127). If that is the proper behavior? I'm suggesting that when specifying ASCII as the encoding that each character should resolve to a single byte value. Why else would one need an ASCII encoding option when they could just use UTF-8? That one liner will correctly convert a JS String to an Array of single byte values which can be passed to the Buffer(Uint8Array) constructor. It will not do the reverse. ;) The """buffer.toString('ascii')""" method could do the reverse using the following line: String.fromCharCode(...buffer); // <- rest parameter The user of the buffer API would need to know that the string contains only ASCII(ISO 8859-1) before specifying ASCII(ISO 8859-1) as the encoding argument to expect consistent results between multiple conversions but that holds true for the other encodings as well, no? I have no idea how node.js handles this or where to find their test code for comparison. I don't however see any other logical way of handling ASCII(ISO 8859-1). I'll send a pull request with the changes given the go ahead. If some tests need to be performed then please inform where they can be found.

Jordan Santell [:jsantell] [@jsantell]

Comment 6

•

12 years ago

Hmm, sounds good -- read up on some encoding issues to brush up. What you said makes sense, but would be worried about edge cases. Our buffer tests are here -- 'ascii' would need to be added to the tests, and the original tests from node are here, which some ascii tests were removed or commented out. https://github.com/mozilla/addon-sdk/blob/master/test/test-buffer.js https://github.com/joyent/node/blob/master/test/simple/test-buffer.js Node also has additional ascii-specific tests that could be useful: https://github.com/joyent/node/blob/master/test/simple/test-buffer-ascii.js

Firefox Bug Husbandry Bot

Comment 7

•

8 years ago

https://bugzilla.mozilla.org/show_bug.cgi?id=1399562

Status: UNCONFIRMED → RESOLVED

Closed: 8 years ago

Resolution: --- → INCOMPLETE

Firefox Bug Husbandry Bot

Comment 8

•

8 years ago

https://bugzilla.mozilla.org/show_bug.cgi?id=1399562

Bugzilla

/lib/sdk/io/buffer.js should support ASCII encoding option

Categories

(Add-on SDK Graveyard :: General, defect, P2)

Tracking

(Not tracked)

People

(Reporter: rodneyd.teal, Unassigned)

References

Details

Crash Data

Security

(public)

User Story

Description

Comment 1

Updated

Comment 2

Comment 3

Comment 4

Comment 5

Comment 6

Comment 7

Comment 8