Structured clones of typed arrays should handle buffers per spec

RESOLVED FIXED in mozilla23

Status

()

defect
RESOLVED FIXED
7 years ago
4 years ago

People

(Reporter: sfink, Assigned: sfink)

Tracking

(Blocks 1 bug)

unspecified
mozilla23
Points:
---
Dependency tree / graph
Bug Flags:
in-testsuite +

Firefox Tracking Flags

(Not tracked)

Details

(Whiteboard: [js:t])

Attachments

(3 attachments, 3 obsolete attachments)

Currently, a structured clone of a typed array makes a new copy of the data for each view. According to the current spec, multiple views of the same buffer should still share a buffer after cloning.

See section 9.3 "Cloning an ArrayBufferView" of http://www.khronos.org/registry/typedarray/specs/latest/

Similarly, views of an ArrayBuffer that is transferred during structured cloning should be views of the new destination ArrayBuffer, not views of independent data.
Blocks: 789594
Whiteboard: [good first bug][mentor=sfink@mozilla.com][lang=C++]
Whiteboard: [good first bug][mentor=sfink@mozilla.com][lang=C++] → [good first bug][mentor=sfink@mozilla.com][lang=C++][js:t]
Hi,

I would like to work on this bug, could you please guide me on getting started with it ? 

Thanks
(In reply to Abhishek Potnis [:abhishekp] from comment #1)
> I would like to work on this bug, could you please guide me on getting
> started with it ? 

Sure. Where are you in the setup/dev process? Have you built the shell? You can find me ('sfink') on the #jsapi channel of irc.mozilla.org.

But the basic idea is to modify js/src/jsclone.cpp so that when cloning a typed array, it should clone the underlying ArrayBuffer first and then write just the data needed to recreate the view into the output stream. That should pretty much just be the byte offset and length, I think. You'll also want to add tests to check that if you do

  var ab = new ArrayBuffer(32);
  var v1 = new Float32Array(ab, 8, 2);
  var v2 = new Uint8Array(ab, 0, 16);

and then clone [ v1, v2 ], you get back two typed arrays of the correct type where

  v1copy.buffer === v2copy.buffer

You can clone with serialize() and deserialize(), in the JS shell. I think. You probably want to look for existing tests; I'm not completely sure how that works. (grep for serialize, and run help(serialize) and help(deserialize) in the JS shell.)

Also check that v1copy.length, .byteLength, and .offset are correct. And for that matter, if you clone [ v1, v2, v1 ], then both copies of v1 should be the same object.

Let me know if you have any questions. Read the portion of the spec I linked to in comment 0 for the exact semantics. And thanks!
Hey,
Could this bug be assigned to me? Let me ask you a few questions. I read the above link that contains the specs. In the specs there is mentioned a "internal structured cloning algorithm". What is that and when it is used? In the file jsclone.cpp is a function WriteStructuredClone and I should modify that function or not? In the shell I am cloning the the view v1 as v2=serialize(v1); is that ok? Thank you.
(In reply to fesniks from comment #3)
> Hey,
> Could this bug be assigned to me?

Sure! You even pinged me on irc, which is enough evidence for me that you have serious intent.

> Let me ask you a few questions. I read the
> above link that contains the specs. In the specs there is mentioned a
> "internal structured cloning algorithm". What is that and when it is used?

I believe that's the same as what is briefly described at https://developer.mozilla.org/en-US/docs/DOM/The_structured_clone_algorithm and specified at http://www.w3.org/TR/html5/common-dom-interfaces.html#safe-passing-of-structured-data

> In the file jsclone.cpp is a function WriteStructuredClone and I should
> modify that function or not?

Not that exact function, but what it calls, yes. As well as what ReadStructuredClone calls.

On the writing side, you'll need to change at least JSStructuredCloneWriter::writeTypedArray. Possibly more than that.

You will want to be sure to base your work on top of bug 748309 and bug 720083, since things have changed a bit since I filed this bug. Duplicate references to the same objects (whether typed arrays or other objects) are now handled properly, and there are tests for them.

This bug is still valid, though. The current code (post bug 748309) properly handles identical ArrayBuffers or typed arrays, but still doesn't know anything about the relationship between them. Specifically, cloning a typed array (or any ArrayBufferView) should really clone the ArrayBuffer that it is a view of, and then construct a new typed array as a view of the clone.

So what I think that means is that when you write a typed array, you write out its buffer (assuming it hasn't already been seen), and then write out the type, length, and offset of the typed array together with a back reference to the buffer. But when you read, you don't want to return the buffer, you want to return just the typed array. I'm not 100% sure how that'll look, but I think it just means that a typed array tag (see the 1st line of JSStructuredCloneWriter::writeTypedArray) is always followed by an ArrayBuffer in the output stream, in the same way an object is a tag (see JSStructuredCloneWriter::traverseObject) followed by id, value pairs (see JSStructuredCloneWriter::write).

> In the shell I am cloning the the view v1 as
> v2=serialize(v1); is that ok? Thank you.

Cloning would be

  clonedata = serialize(v1);
  v2 = deserialize(clonedata);

The intermediate representation 'clonedata' happens to be a typed array itself, which is confusing. But that's just because serialize() returns the serialized representation as a typed array -- a Uint8Array, to be precise:

js> Object.prototype.toString.call(serialize(new Date))
"[object Uint8Array]"
In the function JSStructuredCloneWriter::startWrite is an 
else if (obj->isTypedArray()). I think that first of all in the else if should be placed code that will clone the underlying Arraybuffer. Should I clone the underlying ArrayBuffer with the function WriteStructuredClone or JSStructuredCloneWriter or w.write or writeArrayBuffer? 
Thank you.
I think that code can stay the same. I think all of the write-side changes could be made in JSStructuredCloneWriter::writeTypedArray. As for exactly how to clone the ArrayBuffer, I think you can just make a Value out of it and call startWrite(). That will check if it has already been serialized and emit a back reference if so.

To be honest though, this is the first time I've actually looked at the clone traversal logic, and I'd have to think about it more before I was confident that this would fit into the JSStructuredCloneWriter::write() traversal. JSStructuredCloneWriter::write() does a depth-first traversal using a |counts| stack and an |ids| stack, and ::startWrite() is what pushes new stuff onto the stack.

I also noticed that currently we aren't serializing ArrayBuffers' other properties, which I suspect is a bug. (A separate bug!)
Blocks: 841904
This patch also preserves arbitrary properties on ArrayBuffers. If we're going to allow them, we probably shouldn't throw them away.
Attachment #717305 - Flags: review?(jorendorff)
Assignee: general → sphink
(In reply to Steve Fink [:sfink] from comment #7)
> 
> This patch also preserves arbitrary properties on ArrayBuffers. If we're
> going to allow them, we probably shouldn't throw them away.

(note that Chrome seems to allow named properties on both ArrayBuffers and typed arrays, but discards all of them during structured cloning)
This fixes a test failure resulting from the other patch, due to a compartment not being entered. Tryserver says life is good after this patch.

https://tbpl.mozilla.org/?tree=Try&rev=664c4ebad057
Attachment #719544 - Flags: review?(jorendorff)
Oops, I accidentally included some wildly unrelated stuff in the patch I posted for review.
Attachment #719557 - Flags: review?(jorendorff)
Attachment #719544 - Attachment is obsolete: true
Attachment #719544 - Flags: review?(jorendorff)
As it turns out, it was not at all a good first bug. Sorry, guys.
Whiteboard: [good first bug][mentor=sfink@mozilla.com][lang=C++][js:t] → [js:t]
Blocks: 842081
Blocks: 852187
> This patch also preserves arbitrary properties on ArrayBuffers. If we're going
> to allow them, we probably shouldn't throw them away.

We don't allow them:

  js> var arr = new Uint8Array(4);
  js> arr.prop = 0;
  0
  js> "prop" in arr
  false

Are there plans to change that at some point?
(In reply to Jason Orendorff [:jorendorff] from comment #12)
> > This patch also preserves arbitrary properties on ArrayBuffers. If we're going
> > to allow them, we probably shouldn't throw them away.
> 
> We don't allow them:
> 
>   js> var arr = new Uint8Array(4);
>   js> arr.prop = 0;
>   0
>   js> "prop" in arr
>   false
> 
> Are there plans to change that at some point?

We don't allow them on typed arrays (ArrayBufferViews). We do allow them on ArrayBuffers:

  js> var b = new ArrayBuffer(30)
  js> b.prop = "hi, mom!"
  "hi, mom!"
  js> "prop" in b
  true
  js> b.prop
  "hi, mom!"

Chrome allows them on both, but discards them during structured cloning. The spec situation is ambiguous, especially since the spec is going to switch from WebIDL to ES.
Comment on attachment 717305 [details] [diff] [review]
Clone typed arrays by cloning their buffers and only saving construction parameters

Review of attachment 717305 [details] [diff] [review]:
-----------------------------------------------------------------

I don't think it's correct to clone expando properties of an ArrayBuffer object. The spec doesn't say anything about cloning properties, and we do not clone properties of Date, Regexp, String, Number, or Boolean objects.

Serialization format versioning: I know there's JS_STRUCTURED_CLONE_VERSION in jsapi.h, and I remember the original intent was for the deserialization routine to support data serialized by an earlier version, because we were going to use this serialization code for IndexedDB.

You'll need to either
- keep some crufty code around for parsing old data; or
- find out who's using this and determine that it's only used within a single Firefox build, not in the profile; or
- deliberately shrug it off (this would break people on upgrade -- but only if there's code already storing TypedArrays in the profile).

Clearing review flag for now.

::: js/src/jsclone.cpp
@@ +5,5 @@
>  
> +/*
> + * This file implements the structured clone algorithm of
> + * http://www.whatwg.org/specs/web-apps/current-work/multipage/common-dom-interfaces.html#safe-passing-of-structured-data
> + *

Thanks for the file comment.

::: js/src/tests/js1_8_5/extensions/clone-typed-array.js
@@ +95,5 @@
> +    a = deserialize(serialize(b));
> +    assertEq(b[0].buffer.prop, "yes");
> +    assertEq(b[1].buffer.prop, "yes");
> +    assertEq(b[0], b[0].buffer.loop);
> +    assertEq(b[0], b[0].buffer.loops[0]);

I think you mean for all these tests to be looking at "a" not "b".

(Sorry for the dumb naming convention in these tests. That's totally my fault. The rationale was, b for before, a for after.)
Attachment #717305 - Flags: review?(jorendorff)
Comment on attachment 719557 [details] [diff] [review]
Add a Value AutoCompartment and enter the compartment before writing a typed array to a clone buffer

Review of attachment 719557 [details] [diff] [review]:
-----------------------------------------------------------------

r=me for the JSAutoCompartment but not the new Value constructor signature.

::: dom/base/nsJSEnvironment.cpp
@@ +3532,5 @@
>    uint32_t height = imageData->Height();
>    JS::Value dataArray = JS::ObjectValue(*imageData->GetDataObject());
>  
>    // Write the internals to the stream.
> +  JSAutoCompartment ac(cx, dataArray);

Yep. See how this happens? UnwrapObject just ignores compartment boundaries. Therefore when you use the resulting C++ object's ->GetDataObject(), you really don't know where that JSObject is, and you must use another JSAutoCompartment.

Inside the engine, we use CallNonGenericMethod for these situations. It's clean, it's easy to write, and it hides the AutoCompartment madness. Unfortunately it only works for JSNatives... :(

::: js/src/jsapi.cpp
@@ +1493,5 @@
> +    AssertHeapIsIdleOrIterating(cx_);
> +    if (target.isObject())
> +        cx_->enterCompartment(target.toObject().compartment());
> +    else if (target.isString())
> +        cx_->enterCompartment(target.toString()->compartment());

Since Zones landed, strings no longer have a compartment. (!)

The JSAutoCompartment::JSAutoCompartment(JSContext *cx, JSString *target) signature is gone.

I'd rather stick with the JSObject * signature and not add a Value signature.

::: js/src/jsclone.cpp
@@ +591,5 @@
>  JS_PUBLIC_API(JSBool)
>  JS_WriteTypedArray(JSStructuredCloneWriter *w, jsval v)
>  {
>      JS_ASSERT(v.isObject());
> +    assertSameCompartment(w->context(), v);

+1.
Attachment #719557 - Flags: review?(jorendorff) → review+
Do not read additional properties of ArrayBuffer.

Also, I couldn't convince myself that there wouldn't be old-format buffers lying around in somebody's profile, especially since it seems like it would be relatively easy for an addon to do this. So I implemented reading of the older structured clone data. (I thought I could just pretend that the stored data was ArrayBuffer contents instead of typed array contents, but there's an endianness conversion that invalidates that approach.)

Finally, when implementing the above backwards-compatible reading, it felt like it made sense to stop encoding the typed array type in the SCTAG_* stuff. So new-format typed arrays all use the same tag, SCTAG_TYPED_ARRAY_OBJECT, and encode the type immediately following that tag. I think it makes more sense, given that typed arrays are now just construction parameters, and it'll probably allow for future elimination of boring switch statements.
Attachment #729796 - Flags: review?(jorendorff)
Attachment #717305 - Attachment is obsolete: true
The indexedDB version apparently wants to be bumped whenever the structured clone format version changes. BenT - this bug changes the structured clone format for typed arrays. It will now write them out in a different format. It can still read clone buffers using either the old or the new format, though; it's fully backwards-compatible when reading. I don't know what you use the idb version for, but it seemed safest to bump it up unless you tell me it's ok not to, since not everything saved using IDB version 13 cannot be read by a browser using IDB version 12.
Attachment #730330 - Flags: review?(bent.mozilla)
Comment on attachment 730330 [details] [diff] [review]
Bump indexedDB version to accommodate structured clone version bump

Review of attachment 730330 [details] [diff] [review]:
-----------------------------------------------------------------

This looks exactly right, thanks! God bless those static asserts...
Attachment #730330 - Flags: review?(bent.mozilla) → review+
JS_ASSERT(arrayType < TypedArray::TYPE_UINT8_CLAMPED);

Doh! Let's pretend the previous version of the patch never existed.
Attachment #730394 - Flags: review?(jorendorff)
Attachment #729796 - Attachment is obsolete: true
Attachment #729796 - Flags: review?(jorendorff)
Comment on attachment 730394 [details] [diff] [review]
Clone typed arrays by cloning their buffers and only saving construction parameters

Review of attachment 730394 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/jsclone.cpp
@@ +667,5 @@
>      if (!objs.append(ObjectValue(*obj)) || !counts.append(count))
>          return false;
>      checkStack();
>  
> +    return true;

We don't need this change anymore, right? Revert it?

@@ +717,5 @@
>          } else if (obj->isArrayBuffer() && obj->asArrayBuffer().hasData()) {
>              return writeArrayBuffer(obj);
>          } else if (obj->isObject() || obj->isArray()) {
> +            return out.writePair(obj->isArray() ? SCTAG_ARRAY_OBJECT : SCTAG_OBJECT_OBJECT, 0) &&
> +                   traverseObject(obj);

And here.

@@ +880,5 @@
> +    } else {
> +        uint64_t arrayType;
> +        if (!r->input().read(&arrayType))
> +            return false;
> +        return r->readTypedArray(arrayType, nelems, vp);

I guess the way things are in here right now, we try to detect all possible errors in serialized data, and throw exceptions rather than assert/crash?

So to keep that going, here the array type would have to be bounds-checked before passing it to readTypedArray. I'm not sure that property of the implementation is useful enough to be worth keeping though.

::: js/src/tests/js1_8_5/extensions/clone-v1-typed-array.js
@@ +122,5 @@
> +    var s = "captured[" + i + "] = ";
> +    if (captured[i] instanceof Error)
> +      print(s + captured[i].toSource() + ";");
> +    else
> +      print(s + "new Uint8Array([ i[1] for (i in Iterator" + captured[i].toSource() + ") ]);");

Nice test. You could have said
    print(s + "new Uint8Array(" + [...captured[i]].toSource() + ")");
which would produce more concise output.
Attachment #730394 - Flags: review?(jorendorff) → review+
Attachment #719557 - Flags: checkin+
Attachment #730330 - Flags: checkin+
> @@ +667,5 @@
> >      if (!objs.append(ObjectValue(*obj)) || !counts.append(count))
> >          return false;
> >      checkStack();
> >  
> > +    return true;
> 
> We don't need this change anymore, right? Revert it?
> 
> @@ +717,5 @@
> >          } else if (obj->isArrayBuffer() && obj->asArrayBuffer().hasData()) {
> >              return writeArrayBuffer(obj);
> >          } else if (obj->isObject() || obj->isArray()) {
> > +            return out.writePair(obj->isArray() ? SCTAG_ARRAY_OBJECT : SCTAG_OBJECT_OBJECT, 0) &&
> > +                   traverseObject(obj);
> 
> And here.


Reverted, with some sadness. I kind of liked having a generic traverseObject, but I guess if nothing else is using it, there's no point.

> @@ +880,5 @@
> > +    } else {
> > +        uint64_t arrayType;
> > +        if (!r->input().read(&arrayType))
> > +            return false;
> > +        return r->readTypedArray(arrayType, nelems, vp);
> 
> I guess the way things are in here right now, we try to detect all possible
> errors in serialized data, and throw exceptions rather than assert/crash?
> 
> So to keep that going, here the array type would have to be bounds-checked
> before passing it to readTypedArray. I'm not sure that property of the
> implementation is useful enough to be worth keeping though.

I dunno, seems like a good property. I put it back.

> ::: js/src/tests/js1_8_5/extensions/clone-v1-typed-array.js
> @@ +122,5 @@
> > +    var s = "captured[" + i + "] = ";
> > +    if (captured[i] instanceof Error)
> > +      print(s + captured[i].toSource() + ";");
> > +    else
> > +      print(s + "new Uint8Array([ i[1] for (i in Iterator" + captured[i].toSource() + ") ]);");
> 
> Nice test. You could have said
>     print(s + "new Uint8Array(" + [...captured[i]].toSource() + ")");
> which would produce more concise output.

Done.
https://hg.mozilla.org/mozilla-central/rev/058c640a3799
Status: NEW → RESOLVED
Closed: 6 years ago
Flags: in-testsuite+
Resolution: --- → FIXED
Target Milestone: --- → mozilla23
Depends on: 868700
Duplicate of this bug: 878461
Blocks: 881922
Depends on: 887420
You need to log in before you can comment on or make changes to this bug.