Closed Bug 745095 Opened 12 years ago Closed 12 years ago

Converting between ArrayBuffer and String

Categories

(Core :: JavaScript Engine, enhancement)

Type: enhancement
Priority: Not set
Severity: normal

RESOLVED DUPLICATE of bug 795544

People

(Reporter: Yoric, Unassigned)

OS.File will provide JS file read/write using ArrayBuffer. I believe that, atm, we have no way of converting between an ArrayBuffer and a string, which is a shame.

Note that this is very related to bug 552551.
There was some discussion of such a thing on the whatwg list, IIRC.
Converting in what sense?  One jschar per byte, via byte inflation?  Or something else?
(In reply to David Rajchenbach Teller [:Yoric] from comment #0)
> OS.File will provide JS file read/write using ArrayBuffer. I believe that,
> atm, we have no way of converting between an ArrayBuffer and a string, which
> is a shame.
> 
> Note that this is very related to bug 552551.

Couldn't one write JS functions to perform various conversions from ArrayBuffer to string?
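For the "one jschar per byte" reading of the question, a pure-JS conversion could look roughly like the sketch below. The helper names `inflateBytes` and `deflateString` are made up for illustration; this is byte inflation only, not a real character-encoding conversion.

```javascript
// Hypothetical helpers: one UTF-16 code unit per byte ("byte inflation").

// ArrayBuffer -> string, each byte becomes the code unit with the same value.
function inflateBytes(buffer) {
  const bytes = new Uint8Array(buffer);
  const CHUNK = 0x8000; // keep the argument list of apply() reasonably small
  let result = "";
  for (let i = 0; i < bytes.length; i += CHUNK) {
    result += String.fromCharCode.apply(null, bytes.subarray(i, i + CHUNK));
  }
  return result;
}

// string -> ArrayBuffer, rejecting code units that do not fit in one byte.
function deflateString(str) {
  const bytes = new Uint8Array(str.length);
  for (let i = 0; i < str.length; i++) {
    const unit = str.charCodeAt(i);
    if (unit > 0xff) {
      throw new RangeError("code unit at index " + i + " does not fit in a byte");
    }
    bytes[i] = unit;
  }
  return bytes.buffer;
}
```

The round trip is lossless only for strings whose code units are all below 256, which is why `deflateString` throws instead of silently truncating.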
new FileReader(new Blob([someArrayBuffer])).readAsWhateverYouWant() will provide various conversions. Is a C++ API really required?
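The one-liner above is shorthand: the FileReader constructor takes no arguments, the Blob is passed to the read method, and the result arrives asynchronously. A minimal sketch of what it expands to, assuming a browser (or chrome) context where `FileReader` and `Blob` exist; the helper name `decodeWithFileReader` is hypothetical:

```javascript
// Sketch: decode an ArrayBuffer to a string through Blob + FileReader.
// decodeWithFileReader is an illustrative name, not an existing API.
function decodeWithFileReader(buffer, encoding) {
  return new Promise((resolve, reject) => {
    const reader = new FileReader();
    reader.onload = () => resolve(reader.result);   // decoded string
    reader.onerror = () => reject(reader.error);
    // The second argument is the encoding label, e.g. "utf-8" or "windows-1252".
    reader.readAsText(new Blob([buffer]), encoding);
  });
}
```

Usage would be along the lines of `decodeWithFileReader(someArrayBuffer, "utf-8").then(text => ...)`.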
(In reply to Masatoshi Kimura [:emk] from comment #4)
> new FileReader(new Blob([someArrayBuffer])).readAsWhateverYouWant() will
> provide various conversions. Is a C++ API really required?

Good point, this lets us convert an ArrayBuffer to various encodings, which solves a big part of the reading issue - although this might be insufficient if one wishes to decode streams in JavaScript, as we also need to be able to handle cases in which only a prefix of the ArrayBuffer is correctly encoded.


Now, how do we do the opposite? Say I have a String and I want to produce an ArrayBuffer with an encoded version of that string.
(In reply to David Rajchenbach Teller [:Yoric] from comment #5)
> Good point, this lets us convert an ArrayBuffer to various encodings, which
> solves a big part of the reading issue - although this might be insufficient
> if one wishes to decode streams in JavaScript, as we also need to be able to
> handle cases in which only a prefix of the ArrayBuffer is correctly encoded.
Use
new Blob([someArrayBuffer]).slice(start, end)
or
new Blob([new Uint8Array(someArrayBuffer, offset, length)]) (Fx15+)
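The two options differ slightly in how the range is expressed: `slice` takes start and end offsets, while the typed-array constructor takes an offset and a length and shares memory with the original buffer. A small sketch, with hypothetical helper names `blobOfRange` and `viewOfRange`:

```javascript
// Sketch: two ways to select a byte range of an ArrayBuffer before decoding.

// Blob covering bytes [start, end) of the buffer.
function blobOfRange(buffer, start, end) {
  return new Blob([buffer]).slice(start, end);
}

// Typed-array view over `length` bytes starting at `offset`.
// No copy is made: the view aliases the original buffer.
function viewOfRange(buffer, offset, length) {
  return new Uint8Array(buffer, offset, length);
}
```

Either result can then be handed to a decoder; the view can also be wrapped in a Blob as in the second option above.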

> Now, how do we do the opposite? Say I have a String and I want to produce an
> ArrayBuffer with an encoded version of that string.
Unfortunately existing APIs are intentionally crippled to promote UTF-8 :(
So some APIs will be needed to encode a string back to an ArrayBuffer with a non-UTF-8 encoding.
(In reply to Masatoshi Kimura [:emk] from comment #6)
> (In reply to David Rajchenbach Teller [:Yoric] from comment #5)
> > Good point, this lets us convert an ArrayBuffer to various encodings, which
> > solves a big part of the reading issue - although this might be insufficient
> > if one wishes to decode streams in JavaScript, as we also need to be able to
> > handle cases in which only a prefix of the ArrayBuffer is correctly encoded.
> Use
> new Blob([someArrayBuffer]).slice(start, end)
> or
> new Blob([new Uint8Array(someArrayBuffer, offset, length)]) (Fx15+)

That's assuming you know how many bytes form a proper well-encoded string. In the case of a stream, you typically receive |n| bytes (with |n| fixed), but only a prefix of unknown length is actually a well-encoded string. In a multi-byte encoding, the final few bytes are often an incomplete char, which needs special handling.
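To make the "incomplete char at the end" problem concrete, here is a rough sketch for UTF-8 only. It inspects the tail of a chunk and reports how many leading bytes end on a character boundary; the function name `completeUtf8PrefixLength` is hypothetical, and it only detects a truncated final sequence, it does not validate the rest of the data.

```javascript
// Sketch (UTF-8 only): length of the prefix that does not end in a
// truncated multi-byte sequence. Earlier bytes are not validated.
function completeUtf8PrefixLength(bytes) {
  const n = bytes.length;
  // Step back over at most 3 trailing continuation bytes (10xxxxxx).
  let i = n - 1;
  let stepped = 0;
  while (i >= 0 && stepped < 3 && (bytes[i] & 0xc0) === 0x80) {
    i--;
    stepped++;
  }
  if (i < 0) return n; // no lead byte in reach; leave it to the decoder

  const lead = bytes[i];
  let needed;
  if (lead < 0x80) needed = 1;                    // ASCII
  else if ((lead & 0xe0) === 0xc0) needed = 2;    // 110xxxxx
  else if ((lead & 0xf0) === 0xe0) needed = 3;    // 1110xxxx
  else if ((lead & 0xf8) === 0xf0) needed = 4;    // 11110xxx
  else return n; // not a lead byte; let the decoder report the error

  const have = n - i; // bytes available for the final sequence
  return have < needed ? i : n;
}
```

The bytes past the returned length would be kept and prepended to the next chunk read from the stream.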

> 
> > Now, how do we do the opposite? Say I have a String and I want to produce an
> > ArrayBuffer with an encoded version of that string.
> Unfortunately existing APIs are intentionally crippled to promote UTF-8 :(
> So some APIs will be needed to encode a string back to an ArrayBuffer with a
> non-UTF-8 encoding.

In that case, I am keeping this bug open :)
The StringEncoding API (bug 764234) would meet the requirements: it supports streaming encoding and decoding. If you need to convert a non-UTF8 ArrayBuffer to a string, I would consider adding a chrome-only option to TextEncoder so that it can encode non-UTF8 bytes.
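As a sketch of the streaming case, this is how chunked decoding looks with `TextDecoder` in the form the API was later standardized (the draft discussed in bug 764234 may have differed in details). The helper name `decodeChunks` is hypothetical.

```javascript
// Sketch: decode a sequence of byte chunks whose boundaries may fall
// in the middle of a multi-byte character.
function decodeChunks(chunks, label) {
  const decoder = new TextDecoder(label || "utf-8");
  let text = "";
  for (const chunk of chunks) {
    // { stream: true } makes the decoder hold back an incomplete trailing
    // sequence and finish it when the next chunk arrives.
    text += decoder.decode(chunk, { stream: true });
  }
  text += decoder.decode(); // flush whatever is still buffered
  return text;
}
```

With this, the caller does not need to know how many bytes of each chunk form a well-encoded prefix.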
Depends on: 764234
> non-UTF8 ArrayBuffer
non-UTF ArrayBuffer. TextEncoder supports utf-8, utf-16, and utf-16be atm.
> convert a non-UTF8 ArrayBuffer to a string
That should have read "convert a string to a non-UTF ArrayBuffer"... TextDecoder (the ArrayBuffer-to-string converter) accepts non-UTF encodings.
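For reference, a minimal sketch of both directions with the API as it was later standardized. One caveat: in today's implementations `TextEncoder` produces UTF-8 only, so the utf-16 options mentioned above do not apply there; `TextDecoder` still accepts legacy labels such as "windows-1252", assuming the runtime ships those tables. The helper names are hypothetical.

```javascript
// Sketch: string -> ArrayBuffer (UTF-8) and ArrayBuffer -> string (any
// encoding label the runtime's TextDecoder supports).

function stringToUtf8Buffer(str) {
  const bytes = new TextEncoder().encode(str); // Uint8Array of UTF-8 bytes
  // Copy out exactly the encoded bytes as a standalone ArrayBuffer.
  return bytes.buffer.slice(bytes.byteOffset, bytes.byteOffset + bytes.byteLength);
}

function bufferToString(buffer, label) {
  return new TextDecoder(label || "utf-8").decode(new Uint8Array(buffer));
}
```

For example, `bufferToString(stringToUtf8Buffer(s))` gives back `s` for any well-formed string, and `bufferToString(buf, "windows-1252")` decodes a legacy single-byte buffer.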
Is there anything left here to do?
I think we're ok.
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → DUPLICATE