Closed Bug 861925 Opened 11 years ago Closed 11 years ago

Cloning typed arrays with shared buffers and transferables does not work

Categories

(Core :: JavaScript Engine, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla27
Tracking Status
firefox27 --- fixed
firefox28 --- fixed
firefox29 --- fixed
firefox-esr24 - wontfix

People

(Reporter: sfink, Assigned: sfink)

References

Details

(Whiteboard: [qa-])

Attachments

(6 files, 12 obsolete files)

2.28 KB, patch
luke
: review+
Details | Diff | Splinter Review
10.50 KB, patch
luke
: review+
Details | Diff | Splinter Review
37.91 KB, patch
decoder
: feedback+
Details | Diff | Splinter Review
18.16 KB, patch
sfink
: review+
Details | Diff | Splinter Review
38.59 KB, patch
jorendorff
: review+
Details | Diff | Splinter Review
51.58 KB, patch
jorendorff
: review+
Details | Diff | Splinter Review
Structured clones of typed arrays now work, properly preserving their buffers as separate entities. Passing in a transferable ArrayBuffer now works. Sadly, the combination of the two does not.

The problem is that when a transferable ArrayBuffer is given, its contents are stolen when constructing the transfer map, which has the side effect of neutering all views on that buffer. If such a view is within the set of objects being cloned, it will appear to be zero length during the read.

But the simple solution of writing the transfer map last won't work either, because when reading the ArrayBuffer needs to be constructed first so that the views can later be created using it.

I think the solution is to make a new JSAPI entry that returns the content that will later be stolen. It's a dangerous thing to call, since the memory is still owned by the source ArrayBuffer (and the destination one, post-clone).

I have a patch in progress that does this.
Blocks: 842081
Assignee: general → sphink
Comment on attachment 747655 [details] [diff] [review]
Delay neutering of Transferred ArrayBuffers until after their views have been cloned

Argh, back to the drawing board. Shouldn't be calling releaseAsmJSArrayBuffer.
Attachment #747655 - Flags: review?(luke)
Ok, lemme try this again. Also, see the next comment.
Attachment #747766 - Flags: review?(luke)
Attachment #747766 - Flags: review?(jwalden+bmo)
Attachment #747655 - Attachment is obsolete: true
er, actually the next comment is this updated patch. I realized that I had just introduced the problem I was solving in the same patch, so no need for secrecy.
Attachment #747768 - Flags: review?(luke)
Attachment #747768 - Flags: review?(jwalden+bmo)
Attachment #747766 - Attachment is obsolete: true
Attachment #747766 - Flags: review?(luke)
Attachment #747766 - Flags: review?(jwalden+bmo)
I know we decided yesterday that Waldo should review this patch too, but I'd note that this particular patch hopefully doesn't affect any visible semantics. It just builds on top of existing stuff. I think the things Waldo might be "interested" in are prototype handling (which is currently not where we want it to end up) and asm.js buffers' properties when neutered. Hmm... which suggests to me that I really should have some neuteredness tests that don't just check the byteLength.
Attachment #747652 - Flags: review?(luke) → review+
Comment on attachment 747653 [details] [diff] [review]
Add an optional parameter to the shell serialize() function for specifying Transferables

Review of attachment 747653 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/shell/js.cpp
@@ +3408,5 @@
>  Serialize(JSContext *cx, unsigned argc, jsval *vp)
>  {
> +    CallArgs args = CallArgsFromVp(argc, vp);
> +
> +    jsval v = args.length() > 0 ? args[0] : JSVAL_VOID;

s/JSVAL_VOID/UndefinedValue()/

@@ +3424,5 @@
>      JS_ASSERT((uintptr_t(TypedArray::viewData(array)) & 7) == 0);
>      js_memcpy(TypedArray::viewData(array), datap, nbytes);
> +    JS_free(cx, datap);
> +
> +    JS_SET_RVAL(cx, vp, ObjectValue(*array));

Since you're all CallArgsing it up args.rval().set()
Attachment #747653 - Flags: review?(luke) → review+
Attachment #747654 - Flags: review?(luke) → review+
Comment on attachment 747768 [details] [diff] [review]
Delay neutering of Transferred ArrayBuffers until after their views have been cloned

Review of attachment 747768 [details] [diff] [review]:
-----------------------------------------------------------------

I haven't actually read any of the code in jsclone.cpp other than to fix my breakage of it, so if Waldo has, he'd be the better reviewer.  That said, I had one jsfriendapi.h interface comment.

::: js/src/jsclone.cpp
@@ +147,5 @@
>      return r.read(vp);
>  }
>  
> +static bool
> +discard(const uint64_t *begin, const uint64_t *end, bool ownedOnly)

Discard

::: js/src/jsfriendapi.h
@@ +1265,5 @@
> + * unneutered.
> + *
> + * This call is intended for use by structured clone's Transferables, and
> + * expects that the donor ArrayBuffer will be neutered soon after. Or, if an
> + * error is encountered, any owned pointer should be freed.

This is a pretty subtle thing and not a great interface contract.  It's hard to do better with just a C API, but what if we added a C++ RAII guard:

class ArrayBufferTransferGuard {
  ...
 public:
  ArrayBufferTransferGuard(JSObject *obj);
  void *contents() const;
  void commit();  // neuters 'obj'
};
My usual go-to person for jsclone.cpp is jorendorff, but I see that Waldo has reviewed stuff there too. (Ironically, r=luke is the top reviewer mentioned; you work on a lot of cross-cutting stuff.)

What do you think, Waldo?
 

    ::: js/src/jsfriendapi.h
    @@ +1265,5 @@
    > + * unneutered.
    > + *
    > + * This call is intended for use by structured clone's Transferables, and
    > + * expects that the donor ArrayBuffer will be neutered soon after. Or, if an
    > + * error is encountered, any owned pointer should be freed.

    This is a pretty subtle thing and not a great interface contract.  It's hard to
    do better with just a C API, but what if we added a C++ RAII guard:

    class ArrayBufferTransferGuard {
      ...
     public:
      ArrayBufferTransferGuard(JSObject *obj);
      void *contents() const;
      void commit();  // neuters 'obj'
    };


Sadly, the scope of this isn't even close to being lexical, nor can it be as far as I can tell. I'll see if my jsclone.cpp reviewer has any other ideas.

But for the friend API, I wonder if I shouldn't just delete that second paragraph ("This call is intended..."). It's not such an unusual thing; it's an API call that may or may not transfer ownership of a returned value to the caller. I think I'm just making it sound weirder here.

It would be possible to heap-allocate something like your TransferGuard though. I could keep those as values in a HashMap or something. Would it be worth the extra complexity there?
Comment on attachment 747768 [details] [diff] [review]
Delay neutering of Transferred ArrayBuffers until after their views have been cloned

I talked to Waldo. Sounds like we now want Jason for the structured clone stuff, and Waldo is going to take a look at the transferring/neutering stuff.
Attachment #747768 - Flags: review?(luke) → review?(jorendorff)
Comment on attachment 747768 [details] [diff] [review]
Delay neutering of Transferred ArrayBuffers until after their views have been cloned

Review of attachment 747768 [details] [diff] [review]:
-----------------------------------------------------------------

I looked at the clone bits too, probably not enough to be 100% confident but enough to be okay enough with it.

Agreed that the transferring API is kinda nutso.  Not sure about anything better.  :-\  Lacking any better ideas here, I'm not inclined to complain too much.

::: js/src/jstypedarray.cpp
@@ +270,4 @@
>      struct OldObjectRepresentationHack {
>              uint32_t capacity;
>              uint32_t initializedLength;
>              HeapPtrObject views;

This representation ordering is wrong, flags comes first.  Fix it?

@@ +297,4 @@
>  ArrayBufferObject::changeContents(JSContext *maybecx, ObjectElements *newHeader)
>  {
> +    if (!newHeader)
> +        newHeader = ObjectElements::fromElements(fixedElements());

This doesn't make for a very nice API -- it's something callers should be determining, and applying, if they must.

@@ +331,5 @@
> +{
> +    JS_ASSERT(cx);
> +    if (hasDynamicElements() && !isAsmJSArrayBuffer()) {
> +        ObjectElements *oldHeader = getElementsHeader();
> +        changeContents(cx, NULL);

Per the previous comment (with rewrapping as needed):

    ObjectElements *oldHeader = getElementsHeader();

    // Revert this buffer's contents back to inline storage, fully dissociating |oldHeader|
    // from this object so it can be freed.
    changeContents(cx, ObjectElements::fromElements(fixedElements()));

    FreeOp fop(cx->runtime, false);
    fop.free_(oldHeader);

If you think ObjectElements::fromElements(fixedElements()) is a bit wordy, let's add an ObjectImpl::fixedObjectElements() that encapsulates it.

@@ +337,5 @@
> +        FreeOp fop(cx->runtime, false);
> +        fop.free_(oldHeader);
> +    }
> +
> +    updateElementsHeader(getElementsHeader(), 0 /* byteLength */);

uint32_t byteLength = 0;
updateElementsHeader(getElementsHeader(), byteLength);

?

@@ +356,5 @@
> +ArrayBufferObject::ensureNonInline(JSContext *maybecx)
> +{
> +    if (!hasDynamicElements())
> +        return copyData(maybecx);
> +    return true;

I'd do if (has dynamic) return true; return copyData(...), myself.  Make the easy case be the side path, mentally, and all that.

@@ +364,5 @@
> +// Otherwise, allocate some new contents and copy the data over, but in no case
> +// modify the original ArrayBuffer. (Also, any allocated contents will have no
> +// views linked to in its header.)
> +void *
> +ArrayBufferObject::getTransferableContents(JSContext *maybecx, bool *callerOwns)

Seems to me this internal method should return ObjectElements*.  The JS_* method can/should return void*, but inside the engine I think it's best to be clear about what exactly this data is.  Casts don't seem to do much except add noise here, to me.

@@ +371,5 @@
> +        *callerOwns = false;
> +        return getElementsHeader();
> +    }
> +
> +    ObjectElements *newheader = AllocateArrayBufferContents(maybecx, byteLength(), dataPointer());

uint32_t byteLen = byteLength();

and then use that both places here -- a little clearer, I think.

@@ +488,5 @@
>  #else  /* defined(JS_ION) && defined(JS_CPU_X64) */
>  bool
>  ArrayBufferObject::prepareForAsmJS(JSContext *cx, Handle<ArrayBufferObject*> buffer)
>  {
> +    if (!buffer->copyData(cx, false))

Um, copyData only takes JSContext*.

Add an assert that buffer->hasDynamicElements() for extra documentation, etc.?

@@ +686,4 @@
>      } else {
> +        // header is a pointer to the dynamically allocated data, so re-point
> +        // the ArrayBuffer at its fixed elements and neuter it.
> +        buffer.changeContents(cx, NULL);

buffer.fixedObjectElements()

@@ +686,5 @@
>      } else {
> +        // header is a pointer to the dynamically allocated data, so re-point
> +        // the ArrayBuffer at its fixed elements and neuter it.
> +        buffer.changeContents(cx, NULL);
> +        buffer.neuter(cx);

Since buffer.neuter(cx) is being shared between these, how about something like this:

// Clear out the object's elements.
if (!own) {
    // If the object has dynamically-allocated elements, revert it back to
    // fixed-element storage before neutering it.
    buffer.changeContents(cx, buffer.fixedObjectElements());
}

buffer.neuter(cx);

@@ +3961,5 @@
> +JS_FRIEND_API(bool)
> +JS_NeuterArrayBuffer(JSContext *cx, JSObject *obj)
> +{
> +    if (!obj)
> +        return false;

Don't do this.  Callers should be responsible for passing a non-null pointer.

@@ +3965,5 @@
> +        return false;
> +    ArrayBufferObject &buffer = obj->asArrayBuffer();
> +    buffer.neuterViews(cx);
> +    buffer.neuter(cx);
> +    return true;

If we anticipate this always being infallible, make this return void, and reverse the argument ordering to put the cx last.

::: js/src/jstypedarrayinlines.h
@@ +17,5 @@
>  // show that they have not yet been added to any ArrayBuffer list
>  JSObject * const UNSET_BUFFER_LINK = (JSObject*)0x2;
>  
>  inline void
>  js::ArrayBufferObject::updateElementsHeader(js::ObjectElements *header, uint32_t bytes)

It'd be nice if bytes were renamed to byteLength, or byteLen to avoid shadowing, or something, which has much clearer semantic meaning in context than just "bytes" does.  Would be nice if callers had the same sort of thing applied to the names of their local variables.  Separate patch.

::: js/src/tests/js1_8_5/extensions/clone-transferables.js
@@ +13,5 @@
> +    return ab;
> +}
> +
> +function test() {
> +    for (var size of [0, 8, 4096, -8192]) {

Probably would be good to test some non-power-of-two sizes as well.  Also you should probably add some DataView uses here, too, to test that those get neutered correctly and all.
Attachment #747768 - Flags: review?(jwalden+bmo) → review+
Ok, addressed all of Waldo's comments (including fixing the inevitable bug uncovered by the DataView tests he requested!)

Now it's down to the structured clone changes. Jason, you're of course welcome to review any and all of this patch, but Waldo's looked at everything but the structured clone piece. And it needs some serious looking at; I am hopeful you can suggest something tidier.

Also note that this ownership management stuff is very likely to be rewritten -- or at least, expanded -- soon, to externalize and refcount large strings. (Assuming we come up with a workable way to implement that.)
Attachment #760010 - Flags: review?(jorendorff)
Attachment #747768 - Attachment is obsolete: true
Attachment #747768 - Flags: review?(jorendorff)
Attachment #760067 - Flags: feedback?(gary)
Attachment #760067 - Flags: feedback?(choller)
Depends on: 881100
Comment on attachment 760067 [details] [diff] [review]
rollup of typed array shared buffers + transferables, for fuzzing

Failures (all with m-c revision 855a29c9dd68 + patch):

Build/Run Type: 32 bit debug, run with --ion-eager
Failure: Assertion failure: view, at js/src/jstypedarray.cpp:855
Test:

gczeal(9, 2)
function AsmJSArrayBuffer(size) {
    var ab = new ArrayBuffer(size);
    (new Function('buffer', 'var i32 = new Int32Array(buffer);'))(ab);
    return ab;
}
for (var size of [,]) {
  var buffer_ctor = AsmJSArrayBuffer;
  var constructors = [ Int8Array ];
  for (var ctor of constructors) {
    var buf = buffer_ctor(size);
    var old_arr = ctor(buf);
    var copy_arr = deserialize(serialize(old_arr, [ buf ]));
  }
}



This similar test asserts the same way but also crashes an opt-build [@ js::ArrayBufferObject::resetArrayBufferList]:

gczeal(10, 2)
for (var size of [ 1000, 4096 ]) {
        var buffer_ctor = ArrayBuffer;
        var constructors = [ Int8Array ];
        gc();
        for (var ctor of constructors) {
            var buf = buffer_ctor(size);
            var old_arr = ctor(buf);
            var dv = DataView(buf);
            var copy_arr = deserialize(serialize(old_arr, [ buf ]));
        }
}
Attachment #760067 - Flags: feedback?(choller) → feedback-
Comment on attachment 760067 [details] [diff] [review]
rollup of typed array shared buffers + transferables, for fuzzing

I didn't find anything caused by this patch overnight, so feedback+.
Attachment #760067 - Flags: feedback?(gary) → feedback+
My manually-reduced version of the testcase is:

gczeal(9, 1);

var buf = ArrayBuffer(0);
Int16Array(buf);
var old_arr = Int8Array(buf);
deserialize(serialize(old_arr, [ buf ]));
And the deserialize() wasn't necessary either.

This showed a pretty big and obvious problem with the patch (fortunately easily fixable). Fuzzing is awesome. Thanks!

Can you try this one? The change from the previous is small, but it avoids NULLing a pointer that shouldn't have been NULLed, so it may expose some degenerate cases.
Attachment #760067 - Attachment is obsolete: true
Attachment #761794 - Flags: feedback?(gary)
Attachment #761794 - Flags: feedback?(choller)
Comment on attachment 761794 [details] [diff] [review]
rollup of typed array shared buffers + transferables, for fuzzing, v2

I got a compile error. :(
Attachment #761794 - Flags: feedback?(gary) → feedback-
Yeah, bitrot. (Somebody converted cx->runtime from a field to an accessor method.)

Also, I screwed up when generating that rollup, but I don't think that mattered.

Attempt #3.
Attachment #761794 - Attachment is obsolete: true
Attachment #761826 - Attachment is obsolete: true
Attachment #761794 - Flags: feedback?(choller)
Attachment #761876 - Flags: feedback?(gary)
Attachment #761876 - Flags: feedback?(choller)
And now there's a v4. Forgot to qref before uploading v3. Sorry!
Attachment #761876 - Attachment is obsolete: true
Attachment #761876 - Flags: feedback?(gary)
Attachment #761876 - Flags: feedback?(choller)
Attachment #762175 - Flags: feedback?(gary)
Attachment #762175 - Flags: feedback?(choller)
Comment on attachment 762175 [details] [diff] [review]
rollup of typed array shared buffers + transferables, for fuzzing, v4

Nothing bad found overnight.
Attachment #762175 - Flags: feedback?(gary) → feedback+
Comment on attachment 762175 [details] [diff] [review]
rollup of typed array shared buffers + transferables, for fuzzing, v4

Haven't seen any more failures on first glance.
Attachment #762175 - Flags: feedback?(choller) → feedback+
Here's a followup to fix the broken error handling all over jsclone.cpp. I suspect I'm to blame for most of it, but I didn't introduce it all in the previous patch, so it seems easier to review as a separate patch.
Attachment #767460 - Flags: review?(jorendorff)
Comment on attachment 760010 [details] [diff] [review]
Delay neutering of Transferred ArrayBuffers until after their views have been cloned

Review of attachment 760010 [details] [diff] [review]:
-----------------------------------------------------------------

Clearing the review flag for now. Feel free to let someone else do the final review on this one! Or I can do it, of course. :-)

I did not review the changes in jstypedarray.cpp.



The patch that adds support for transferables to the shell serialize() builtin will make it possible for the fuzzer to produce double frees, just by deserializing the same buffer twice and then allowing the resulting ArrayBuffer objects to be GC'd. Right?

Separately, I think that with that patch, serialize() and deserialize() never call JS_ClearStructuredClone, which is a memory leak. :-P

The notion of serialize() producing "just bits" is dead. I propose making the result of serialize() an opaque StructuredCloneData object; and deserialize() should call JS_ClearStructuredData if the buffer had any transferables. (The destructor should make sure it's not leaking buffers either way.)

::: js/src/jsclone.cpp
@@ +73,1 @@
>      SCTAG_TYPED_ARRAY_V1_UINT16 = SCTAG_TYPED_ARRAY_V1_MIN + TypedArray::TYPE_UINT16,

At the top of jsclone.cpp, there's an #include that has strayed:
    #include "mozilla/Endian.h"

Please move it down, so it's just before the line that says:
    #include "mozilla/FloatingPoint.h"

@@ +86,5 @@
> + * ownership of its Transferable contents has therefore been given to the
> + * reader.)
> + *
> + * During the writing process, ownership may be shared with the donor
> + * Transferable objects up until the point when transferOwnership() is called.

Might want to point out here that this is observable, and in fact required by spec, since the structured cloning algorithm calls getters on plain objects.

That can then re-enter structured cloning, transferring the same buffers again; that might be worth mentioning here; definitely worth some tests!

Another thing to test is that changes to a buffer-being-transferred that we make from a getter should in fact show up on the other side.

Another case to test is that inline and uninlined buffers are not observably different during this time. (I think we actually get this right, but it's worth testing pretty thoroughly.)

@@ +158,5 @@
>          if ((TransferableMapHeader)uint32_t(u) == SCTAG_TM_NOT_MARKED) {
>              while (point != end) {
>                  uint64_t u = LittleEndian::readUint64(point++);
>                  uint32_t tag = uint32_t(u >> 32);
> +                bool own = uint32_t(u);

This code has two variables named "u" and two named "tag" in nested scopes. Please don't do that!

Separately, how about:

    U32Pair pair = ReadPair(point++);
    if (pair.tag == SCTAG_TRANSFER_MAP_HEADER) {
        if (TransferableMapHeader(pair.data) == SCTAG_TM_NOT_MARKED) {
            while (point != end) {
                pair = ReadPair(point++);
                if (pair.tag == SCTAG_TRANSFER_MAP) {
                    if (pair.data || !ownedOnly)
                        ...

SCInput::readPair could use the same ReadPair static inline. Just a thought, it's your call.

@@ +788,5 @@
> +                return false;
> +            }
> +        } else {
> +            JS_NOT_REACHED("invalid transferable found after validation");
> +        }

Instead of this if-else, I think the code should be like:

    JS_ASSERT(JS_IsArrayBufferObject(obj));

    // Cannot steal the contents outright yet, ...

@@ +821,5 @@
> +            void *content;
> +            uint8_t *data;
> +            if (!JS_StealArrayBufferContents(context(), obj, &content, &data))
> +                ok = false;
> +            JS_ASSERT(content == oldContent);

I think at this point we have to modify the map entry to say that we now own the content, that is, if we fail after this, Discard is responsible for freeing it.

@@ +884,3 @@
>  
> +    (void) Discard(out.rawBuffer(), out.rawBuffer() + out.count() * sizeof(uint64_t), true);
> +    return false;

I think if the user calls init(), and writeTransferMap() fails partway through, the transfers already written are leaked.

If JSStructuredCloneReader::readTransferMap() fails partway through (this can only happen with OOM), I think we leak the remaining transfers.

Since we would leak transfer buffers if init() were not followed on success by write(), please merge them into a single method.

I'm not convinced I didn't miss anything.

I think the desired story is like this:

    enter js::WriteStructuredClone
      create JSStructuredCloneWriter writer
        writer.writeTransferMap()
        - creates transfers
        - at this point inline and asm transfers are owned by writer
        - if that fails midway, writer frees those transfers
        writer.doWrite() // doesn't affect ownership of anything
        writer.transferOwnership()
        - neuter all transferables
        - non-inline non-asm buffers are transferred to the writer
        - if that fails midway, writer frees all owned transfers
        - otherwise:
        out.extractBuffer()
        - That transfers ownership of everything to the caller
        - after doing that we must not do anything else fallible
      end JSStructuredCloneWriter lifetime
    leave js::WriteStructuredClone
    - At this point the caller owns the buffer and its embedded transfers
    - they call JS_ClearStructuredClone to clean it up

If the code were more straight-line this would be easier to believe; can you squish it all into a single method?

Deferring actual js_free() calls or other resource cleanup to destructors is fine, if that helps...

::: js/src/jsfriendapi.h
@@ +1407,5 @@
> + * expects that the donor ArrayBuffer will be neutered soon after. Or, if an
> + * error is encountered, any owned pointer should be freed.
> + */
> +extern JS_FRIEND_API(void *)
> +JS_GetTransferableArrayBufferContents(JSObject *obj, bool *callerOwns);

Let's make these internal-only. This bucks the tradition of jsclone.cpp only using public API, but it's time to do that.

Merge this comment with the one on ArrayBuffer::getTransferableContents, and clarify that "has total ownership of and must free" and "should be freed" mean the caller is responsible for calling js_free (or whatever it actually means).
Attachment #760010 - Flags: review?(jorendorff)
Comment on attachment 767460 [details] [diff] [review]
Clean up the error handling in jsclone.cpp

Review of attachment 767460 [details] [diff] [review]:
-----------------------------------------------------------------

As discussed, the Vector and HashMap operations already report OOM.

::: js/src/jsclone.cpp
@@ +447,1 @@
>          return false;

Aaahhh! Yes, I'm awake!

@@ +925,5 @@
>          if (p) {
>              p[len] = jschar(0);
>              return true;
>          }
> +        js_ReportAllocationOverflow(cx);

JSContext::pod_malloc appears to have a bug. It reports an exception if n * sizeof(T) overflows size_t, but it does not report OOM if malloc fails. Feel free to fix that!

@@ +1206,1 @@
>              return false;

I think if you fix pod_malloc, then ensureStable() will be well-behaved, and you don't need to add an error report here.

@@ +1220,5 @@
>          JSObject *obj = (tag == SCTAG_ARRAY_OBJECT)
>                          ? NewDenseEmptyArray(context())
>                          : NewBuiltinClassInstance(context(), &ObjectClass);
> +        if (!obj || !objs.append(ObjectValue(*obj))) {
> +            js_ReportAllocationOverflow(context());

Calling this when !obj would have been a mistake.
Attachment #767460 - Flags: review?(jorendorff)
(In reply to Jason Orendorff [:jorendorff] from comment #27)
> Comment on attachment 760010 [details] [diff] [review]
> Delay neutering of Transferred ArrayBuffers until after their views have
> been cloned
> 
> Review of attachment 760010 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> Clearing the review flag for now. Feel free to let someone else do the final
> review on this one! Or I can do it, of course. :-)

But now you have momentum! Familiarity! Enthusiasm!

> I did not review the changes in jstypedarray.cpp.

Yeah, I should've split those out.

> The patch that adds support for transferables to the shell serialize()
> builtin will make it possible for the fuzzer to produce double frees, just
> by deserializing the same buffer twice and then allowing the resulting
> ArrayBuffer objects to be GC'd. Right?

Ugh, yes. I attempted to handle this:

js> b = ArrayBuffer(1000)
js> a = Int32Array(b)
js> x = serialize(ajs> x = serialize(a, [b])
js> y1 = deserialize(x)
js> y2 = deserialize(x) 
typein:7:0 InternalError: bad serialized structured data (invalid input)

...but the fact that serialize() returns a Uint8Array defeats this, because you could copy x before it gets updated by the first deserialize():

js> b = ArrayBuffer(1000)
js> a = Int32Array(b)
js> x = serialize(ajs> x = serialize(a, [b])
js> x2 = Uint8Array(x)
js> y1 = deserialize(x)
js> y2 = deserialize(x2) 
Assertion failure: *GetViewList(&obj->asArrayBuffer()) == __null, at /home/sfink/src/MI-pending/js/src/jstypedarray.cpp:4014

> Separately, I think that with that patch, serialize() and deserialize()
> never call JS_ClearStructuredClone, which is a memory leak. :-P

A kinda big one, yeah.

> The notion of serialize() producing "just bits" is dead. I propose making
> the result of serialize() an opaque StructuredCloneData object; and
> deserialize() should call JS_ClearStructuredData if the buffer had any
> transferables. (The destructor should make sure it's not leaking buffers
> either way.)

Looks like you're right. It's a total fuzzing hazard otherwise.

> @@ +86,5 @@
> > + * ownership of its Transferable contents has therefore been given to the
> > + * reader.)
> > + *
> > + * During the writing process, ownership may be shared with the donor
> > + * Transferable objects up until the point when transferOwnership() is called.
> 
> Might want to point out here that this is observable, and in fact required
> by spec, since the structured cloning algorithm calls getters on plain
> objects.
> 
> That can then re-enter structured cloning, transferring the same buffers
> again; that might be worth mentioning here; definitely worth some tests!

That is a very true statement.

> Another thing to test is that changes to a buffer-being-transferred that we
> make from a getter should in fact show up on the other side.

Ugh ugh ugh yuck.

So with this patch, if you transfer a buffer that is either inline or AsmJS, it'll make a copy and store that in the clone buffer. So that copy won't see the updates made by the getter. Argh!

Hm, does the spec say when the neutering happens? If we neutered the ArrayBuffer but not its views, then the getter can't modify the ArrayBuffer. We'd just have to be sure to neuter the views on an error.

> Another case to test is that inline and uninlined buffers are not observably
> different during this time. (I think we actually get this right, but it's
> worth testing pretty thoroughly.)

I'm not so sure about that, if you're talking about changes to the buffer being observed.

> @@ +158,5 @@
> >          if ((TransferableMapHeader)uint32_t(u) == SCTAG_TM_NOT_MARKED) {
> >              while (point != end) {
> >                  uint64_t u = LittleEndian::readUint64(point++);
> >                  uint32_t tag = uint32_t(u >> 32);
> > +                bool own = uint32_t(u);
> 
> This code has two variables named "u" and two named "tag" in nested scopes.
> Please don't do that!

Sorry, my fault as reviewer.

> Separately, how about:
> 
>     U32Pair pair = ReadPair(point++);
>     if (pair.tag == SCTAG_TRANSFER_MAP_HEADER) {
>         if (TransferableMapHeader(pair.data) == SCTAG_TM_NOT_MARKED) {
>             while (point != end) {
>                 pair = ReadPair(point++);
>                 if (pair.tag == SCTAG_TRANSFER_MAP) {
>                     if (pair.data || !ownedOnly)
>                         ...
> 
> SCInput::readPair could use the same ReadPair static inline. Just a thought,
> it's your call.

I haven't looked very hard at that code. Seems reasonable, though.

> @@ +788,5 @@
> > +                return false;
> > +            }
> > +        } else {
> > +            JS_NOT_REACHED("invalid transferable found after validation");
> > +        }
> 
> Instead of this if-else, I think the code should be like:
> 
>     JS_ASSERT(JS_IsArrayBufferObject(obj));
> 
>     // Cannot steal the contents outright yet, ...

Yeah, that's one indent level better. I wrote it the way I did because I'm anticipating more types of Transferables. Though come to think of it, they may need to come in through callbacks anyway, so I ought to do this the cleaner way for now.

> @@ +821,5 @@
> > +            void *content;
> > +            uint8_t *data;
> > +            if (!JS_StealArrayBufferContents(context(), obj, &content, &data))
> > +                ok = false;
> > +            JS_ASSERT(content == oldContent);
> 
> I think at this point we have to modify the map entry to say that we now own
> the content, that is, if we fail after this, Discard is responsible for
> freeing it.

Ok

> @@ +884,3 @@
> >  
> > +    (void) Discard(out.rawBuffer(), out.rawBuffer() + out.count() * sizeof(uint64_t), true);
> > +    return false;
> 
> I think if the user calls init(), and writeTransferMap() fails partway
> through, the transfers already written are leaked.
> 
> If JSStructuredCloneReader::readTransferMap() fails partway through (this
> can only happen with OOM), I think we leak the remaining transfers.
> 
> Since we would leak transfer buffers if init() were not followed on success
> by write(), please merge them into a single method.

Ok

> I'm not convinced I didn't miss anything.

At least you found a bunch of stuff the fuzzers didn't. :-)

> I think the desired story is like this:
> 
>     enter js::WriteStructuredClone
>       create JSStructuredCloneWriter writer
>         writer.writeTransferMap()
>         - creates transfers
>         - at this point inline and asm transfers are owned by writer
>         - if that fails midway, writer frees those transfers
>         writer.doWrite() // doesn't affect ownership of anything
>         writer.transferOwnership()
>         - neuter all transferables
>         - non-inline non-asm buffers are transferred to the writer
>         - if that fails midway, writer frees all owned transfers
>         - otherwise:
>         out.extractBuffer()
>         - That transfers ownership of everything to the caller
>         - after doing that we must not do anything else fallible
>       end JSStructuredCloneWriter lifetime
>     leave js::WriteStructuredClone
>     - At this point the caller owns the buffer and its embedded transfers
>     - they call JS_ClearStructuredClone to clean it up
> 
> If the code were more straight-line this would be easier to believe; can you
> squish it all into a single method?

I'll take a stab at it.

> Deferring actual js_free() calls or other resource cleanup to destructors is
> fine, if that helps...
> 
> ::: js/src/jsfriendapi.h
> @@ +1407,5 @@
> > + * expects that the donor ArrayBuffer will be neutered soon after. Or, if an
> > + * error is encountered, any owned pointer should be freed.
> > + */
> > +extern JS_FRIEND_API(void *)
> > +JS_GetTransferableArrayBufferContents(JSObject *obj, bool *callerOwns);
> 
> Let's make these internal-only. This bucks the tradition of jsclone.cpp only
> using public API, but it's time to do that.

Ok, like I said on irc, I think it would be good for external callers to implement their own transferables. But you're still right; this doesn't mean they have to be able to implement transferable ArrayBuffers.

> Merge this comment with the one on ArrayBuffer::getTransferableContents, and
> clarify that "has total ownership of and must free" and "should be freed"
> mean the caller is responsible for calling js_free (or whatever it actually
> means).

Ok.

Thanks much! I know this review was a PITA.
(In reply to Steve Fink [:sfink] from comment #29)
> > Clearing the review flag for now. Feel free to let someone else do the final
> > review on this one! Or I can do it, of course. :-)
> 
> But now you have momentum! Familiarity! Enthusiasm!

Scars! Bruises! :-)

> Hm, does the spec say when the neutering happens?

Yes, it does. The transfers happen last. See:

http://www.whatwg.org/specs/web-apps/current-work/multipage/web-messaging.html#posting-messages

"Obtaining a structured clone" happens in step 5; "transferring" happens in 6.2. Transferring is what neuters the original ArrayBuffer:

http://www.whatwg.org/specs/web-apps/current-work/multipage/common-dom-interfaces.html#transfer-a-transferable-object

> > Another case to test is that inline and uninlined buffers are not observably
> > different during this time. (I think we actually get this right, but it's
> > worth testing pretty thoroughly.)
> 
> I'm not so sure about that, if you're talking about changes to the buffer
> being observed.

Rats. Well, it seems fixable: copy the data into the buffer at the last minute.

> > > +            if (!JS_StealArrayBufferContents(context(), obj, &content, &data))
> > > +                ok = false;
> > > +            JS_ASSERT(content == oldContent);
> > 
> > I think at this point we have to modify the map entry to say that we now own
> > the content, that is, if we fail after this, Discard is responsible for
> > freeing it.
> 
> Ok

Oh -- and when reading the transfer map, I think we have to do something similar: mark each buffer as "not owned" as soon as we attach it to a TypedArray JSObject and the GC becomes responsible for it once again. In case *that* fails partway through.

Maybe we shouldn't put the transfer map in the serialized data. Instead, the serialized data could start with a TRANSFERS tag followed by a pointer to a good old C++ struct:

    struct TransferArray {
        size_t length;
        Transfer transfers[0];
    };

    struct Transfer {
      public:
        Transfer(uint8_t *p, bool owned);
        bool owned() const;
        void setOwned(bool);
        uint8_t *buf() const;
      private:
        uintptr_t data_;  // tagged pointer
    };

Maybe this would simplify some code, maybe not...
So, what's the next step here?
Flags: needinfo?(sphink)
(In reply to Jason Orendorff [:jorendorff] from comment #28)
> @@ +925,5 @@
> >          if (p) {
> >              p[len] = jschar(0);
> >              return true;
> >          }
> > +        js_ReportAllocationOverflow(cx);
> 
> JSContext::pod_malloc appears to have a bug. It reports an exception if n *
> sizeof(T) overflows size_t, but it does not report OOM if malloc fails. Feel
> free to fix that!

I think it does. It calls malloc_, not malloc, and malloc_ calls client->onOutOfMemory, which will report the OOM.

> @@ +1206,1 @@
> >              return false;
> 
> I think if you fix pod_malloc, then ensureStable() will be well-behaved, and
> you don't need to add an error report here.

Yup, this whole patch can and should die.
Comment on attachment 767460 [details] [diff] [review]
Clean up the error handling in jsclone.cpp

Die patch die.
Attachment #767460 - Attachment is obsolete: true
This was already r+ by luke, but I needed to make this into an opaque type to prevent eg fuzzers from deserializing a clone buffer twice. The trick (read: hack) I picked was an '@@clonebuffer' bogosymbol, and clearing that property's value when a transferable-using clone buffer is deserialized. I thought jorendorff might have opinions about that. It will not prevent a fuzzer from tripping over this, but it'll make it far less likely. (The fuzzer would have to grab out the value before deserializing and put it back after. And there are no tests yet that do that. Though there are some that manufacture clone buffers from nothing.)

I could prep for an extra layer once we have symbols by adding a makeSymbol shell function that is the identity function for now?
Attachment #808256 - Flags: review?(jorendorff)
Attachment #747653 - Attachment is obsolete: true
I've separated out the portion of the patch that Waldo already r+'ed, and will carry the r+ forward. The one change is what jorendorff requested -- remove the public JS_GetTransferableArrayBufferContents entry point. (Except rather than calling it internally, I won't be using it at all now. But the internal implementation is still used when stealing.)
Attachment #760010 - Attachment is obsolete: true
Comment on attachment 808258 [details] [diff] [review]
Allow grabbing data from ArrayBuffers and neutering them independently (in addition to Steal, which does both at the same time).

Carrying forward Waldo's r+.
Attachment #808258 - Flags: review+
*Much* simpler fix, which makes me feel like a total idiot for wasting so much time with the previous approach.

The whole difficulty is one of ordering. If you write the transfer map at the beginning, then when you grab the data to put in the transfer map, you have to neuter. But then when you write out the views, they're already neutered. Fail. This was the preexisting state.

If you write the transfer map at the end, then your buffers are still alive when you write out the views, and you're happy. But now you can't read the data in order, because you need the buffers stored at the end of the clone buffer in order to create the views in the middle. Fixable if you add a directory or tail pointer or something, and then seek while reading out the clone buffer. Ugly. (And there's a correctness issue with modifying the buffers during the clone, though it's fixable too.)

My first attempt to fix this was to separate out the contents grabbing from the neutering. I grabbed the contents first without neutering and promised that I would eventually neuter. Then at the end, I scanned through and neutered everything. It works, except (1) error handling is pure hell, because the ownership is extremely messy; and (2) it breaks if you modify transferred objects during the clone, because the data may have already been copied away at the beginning.

The current fix is to seek while writing the clone buffer so as to produce a nice clean linearly readable buffer by the end of the write: When writing out the transfer map, insert placeholder pointers for the data. Fill them in at the end of the write, while neutering.

This dramatically simplifies the ownership situation, because the ArrayBuffers own their data until neutered, and the clone buffer only contains NULL placeholders until then. Then when neutering, each ArrayBuffer's ownership is transferred by writing the data pointer into the buffer. Thus, partial state cleanup is easy. Also, this resolves the modify-during-clone problem, because the ArrayBuffers are only used for their identity until it's neutering time.

This also suggests a pretty clean set of hooks needed to transfer arbitrary objects, which is being requested right now for a couple of different embedding object types. They require that the transferred state be boiled down to a 64-bit value (which could point to the heap or anything else), but otherwise I think they'll be straightforward.
Attachment #808261 - Flags: review?(jorendorff)
Ok, here's the more principled version.

You may note my cowardly |if (nelems > 0)| in copyAndSwapFromLittleEndian. See bug 920384 for the details, but it seemed expedient to avoid the problem here for now.
Attachment #809681 - Flags: review?(jorendorff)
Attachment #808256 - Attachment is obsolete: true
Attachment #808256 - Flags: review?(jorendorff)
Comment on attachment 809681 [details] [diff] [review]
Add an optional parameter to the shell serialize() function for specifying Transferables

Review of attachment 809681 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/shell/js.cpp
@@ +3551,5 @@
> +            return NULL;
> +        obj->setReservedSlot(0, PrivateValue(nullptr));
> +        obj->setReservedSlot(1, Int32Value(0));
> +
> +        JS_DefineProperties(cx, obj, props_);

check return value

@@ +3557,5 @@
> +        return &obj->as<CloneBufferObject>();
> +    }
> +
> +    static CloneBufferObject *Create(JSContext *cx, JSAutoStructuredCloneBuffer *buffer) {
> +        Rooted<CloneBufferObject*> obj(cx, Create(cx));

check return value

@@ +3572,5 @@
> +    }
> +
> +    void setData(uint64_t *aData) {
> +        JS_ASSERT(!data());
> +        setReservedSlot(0, PrivateValue(aData));

`enum {DataSlot, NBytesSlot};` perhaps?

@@ +3584,5 @@
> +        return getReservedSlot(1).toInt32();
> +    }
> +
> +    void setNBytes(size_t nbytes) {
> +        setReservedSlot(1, Int32Value(nbytes));

At least assert that nbytes <= UINT32_MAX.

@@ +3589,5 @@
> +    }
> +
> +    // Release just the buffer allocation itself. Do not free any transferable
> +    // pointers.
> +    void release() {

I don't think this method is ever the right thing to do. Can it be removed?

@@ +3611,5 @@
> +            args.rval().setUndefined();
> +            return true;
> +        }
> +
> +        if (!args.thisv().isObject() || !args.thisv().toObject().is<CloneBufferObject>() || args.length() != 1) {

Also need to check args[0].isString().

But actually, why the change to strings for this? The existing code used Uint8Array, which seems to make more sense, to me.

@@ +3632,5 @@
> +
> +    static bool
> +    getCloneBuffer(JSContext* cx, unsigned argc, JS::Value* vp) {
> +        CallArgs args = CallArgsFromVp(argc, vp);
> +        Rooted<CloneBufferObject*> obj(cx, &args.thisv().toObject().as<CloneBufferObject>());

Need to check the type of thisv first.

Here, and in setCloneBuffer(), I'd prefer to see the CallNonGenericMethod boilerplate, just because it's consistent with what we do everywhere else, and it's safe.

(I wish there was just a flag you could set somewhere: "hey this method is nongeneric".)

@@ +3647,5 @@
> +
> +        if (hasTransferable) {
> +            JS_ReportError(cx, "cannot retrieve structured clone buffer with transferables");
> +            return false;
> +        }

Sweet.

@@ +3744,5 @@
> +    }
> +    args.rval().set(deserialized);
> +
> +    if (hasTransferable)
> +        obj->release();

I think this should be obj->discard(), because we're done with the transfer map thingies, and if we don't call ClearStructuredClone, we'll leak them.
Attachment #809681 - Flags: review?(jorendorff) → review+
(In reply to Jason Orendorff [:jorendorff] from comment #39)
> Comment on attachment 809681 [details] [diff] [review]
> Add an optional parameter to the shell serialize() function for specifying
> Transferables
> 
> Review of attachment 809681 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> ::: js/src/shell/js.cpp
> @@ +3572,5 @@
> > +    }
> > +
> > +    void setData(uint64_t *aData) {
> > +        JS_ASSERT(!data());
> > +        setReservedSlot(0, PrivateValue(aData));
> 
> `enum {DataSlot, NBytesSlot};` perhaps?

Sorry, forgot to do that before posting.

> @@ +3589,5 @@
> > +    }
> > +
> > +    // Release just the buffer allocation itself. Do not free any transferable
> > +    // pointers.
> > +    void release() {
> 
> I don't think this method is ever the right thing to do. Can it be removed?

Ok

> @@ +3611,5 @@
> > +            args.rval().setUndefined();
> > +            return true;
> > +        }
> > +
> > +        if (!args.thisv().isObject() || !args.thisv().toObject().is<CloneBufferObject>() || args.length() != 1) {
> 
> Also need to check args[0].isString().
> 
> But actually, why the change to strings for this? The existing code used
> Uint8Array, which seems to make more sense, to me.

Hm. Yeah, I'm not sure why I did that. Though doing it *did* find a bug, oddly enough. I think my vague thinking was (1) JSStrings are simpler than full objects, and (2) this is usually used for testing structured clones of typed arrays, and it has been some small amount of confusion to have *another* typed array in the mix. I've gotten my breakpoints tangled up and ended up debugging through the wrong typed array.

But neither of those are good arguments. Still, I'm inclined to leave it as is for now, since I'm running short on time and it works. I may switch back later. I care more about the other patch on the bug, for now.

> @@ +3632,5 @@
> > +
> > +    static bool
> > +    getCloneBuffer(JSContext* cx, unsigned argc, JS::Value* vp) {
> > +        CallArgs args = CallArgsFromVp(argc, vp);
> > +        Rooted<CloneBufferObject*> obj(cx, &args.thisv().toObject().as<CloneBufferObject>());
> 
> Need to check the type of thisv first.
> 
> Here, and in setCloneBuffer(), I'd prefer to see the CallNonGenericMethod
> boilerplate, just because it's consistent with what we do everywhere else,
> and it's safe.
> 
> (I wish there was just a flag you could set somewhere: "hey this method is
> nongeneric".)

You inspired me to do bug 921081 (or rather, my disgust at typing in yet another of these triggered it.) Though it's an ugly-for-hairy trade.

> @@ +3647,5 @@
> > +
> > +        if (hasTransferable) {
> > +            JS_ReportError(cx, "cannot retrieve structured clone buffer with transferables");
> > +            return false;
> > +        }
> 
> Sweet.

I'm not entirely sure why I did that. Keeping setCloneBuffer out of the hands of the fuzzers ought to be enough. But I guess this'll avoid problems with differential testing.

> @@ +3744,5 @@
> > +    }
> > +    args.rval().set(deserialized);
> > +
> > +    if (hasTransferable)
> > +        obj->release();
> 
> I think this should be obj->discard(), because we're done with the transfer
> map thingies, and if we don't call ClearStructuredClone, we'll leak them.

Ugh. Why do I keep getting this wrong?

I guess I haven't totally gotten comfortable with the notion that reading a structured clone with transferables has side effects, namely updating the buffer for loss of ownership.
Flags: needinfo?(sphink)
Hey, you’re right, this is simpler. Reviewing now.
Comment on attachment 808261 [details] [diff] [review]
Steal and neuter ArrayBuffers at end of structured clone

Review of attachment 808261 [details] [diff] [review]:
-----------------------------------------------------------------

I'm sorry this is so long. Way too tired to write a shorter one. r=me with comments addressed; the comments about destructor-nature, the `else`, and the last one are the important ones. Again, sorry for the rest.

Reminder to myself: bumping JS_STRUCTURED_CLONE_VERSION is not necessary because the whole use case for that constant is data that's serialized, stored, and then loaded up under possibly a different version of Gecko; which is unpossible in combination with transferables.

::: js/src/jsapi.h
@@ +225,5 @@
>  class AutoVectorRooter : protected AutoGCRooter
>  {
> +  private:
> +    typedef js::Vector<T, 8> VectorImpl;
> +    VectorImpl vector;

Style nit: Please put all the class's data members together (even though the others are just weirdo RAII helpers).

::: js/src/tests/js1_8_5/extensions/clone-transferables.js
@@ +15,5 @@
> +
> +function test() {
> +    // Note: -8 and -200 will trigger asm.js link failures because 8 and 200
> +    // bytes are below the minimum allowed size, and the buffer will not
> +    // actually be converted to an asm.js buffer.

This is copied from somewhere, right? Any chance of commoning up things in extensions/shell.js? It's ok if not.

@@ +21,5 @@
> +        var buffer_ctor = (size < 0) ? AsmJSArrayBuffer : ArrayBuffer;
> +        size = Math.abs(size);
> +
> +        var old = buffer_ctor(size);
> +        var copy = deserialize(serialize(old, [old]));

Hmm. Is there a synchronous way to clone with transferables in the browser? (That is, could this test be made to work both in the shell and in the browser?)

@@ +50,5 @@
> +            assertEq(copy_arr.length, size / old_arr.BYTES_PER_ELEMENT);
> +
> +            buf = null;
> +            old_arr = null;
> +            gc();

If we have to GC here (I'm assuming this is because asm.js buffers take up a ton of VM? right?) then it seems like we should probably also GC above, as soon as we're done with `old` and `copy`.

@@ +60,5 @@
> +            var dv = DataView(buf); // Second view
> +            var copy_arr = deserialize(serialize(old_arr, [ buf ]));
> +            assertEq(buf.byteLength, 0, "donor array buffer should be neutered");
> +            assertEq(old_arr.length, 0, "donor typed array should be neutered");
> +            assertEq(dv.byteLength, 0, "all view of donor array buffer should be neutered");

Grammar nit: "all views of..."

::: js/src/vm/StructuredClone.cpp
@@ +340,5 @@
> +            if (tag == SCTAG_TRANSFER_MAP_ENTRY) {
> +                uint32_t mapEntryDescriptor = uint32_t(u);
> +                if (mapEntryDescriptor != SCTAG_TM_UNOWNED) {
> +                    if (!DiscardEntry(mapEntryDescriptor, point++))
> +                        ok = false;

DiscardEntry and Discard have destructor-nature. I think they should be infallible (return type void).

@@ +345,1 @@
>                  } else {

This else is attached to the wrong if. It's meant to go with the `if (tag == SCTAG_TRANSFER_MAP_ENTRY)`.

(If you can come up with a test that reveals the bug and doesn't seem pointless, great. If not, that's fine.)

Reusing TransferMapRange is another way to fix this.

@@ +345,5 @@
>                  } else {
>                      // The only things in the transfer map should be
> +                    // SCTAG_TRANSFER_MAP_ENTRY tags paired with pointers. If
> +                    // we find any other tag, we've walked off the end of the
> +                    // transfer map.

Pre-existing: this seems a bit silly. Since we know the exact number of transferables, why not store that number in the header? Then the range class could be more straightforward and we wouldn't need this comment.

And, each entry could be two pointers rather than a uint32 and a pointer. Two pointers would be nice when this needs to be extended to multiple types of transferables. The first pointer could point to some kind of vtable, the hooks for re-Object-ifying/discarding transferable data.</sidetrack>

@@ +360,5 @@
> +ClearStructuredClone(const uint64_t *data, size_t nbytes)
> +{
> +    JS_ASSERT(nbytes % 8 == 0);
> +    if (!Discard(data, data + nbytes / 8))
> +        return false;

Yeah, this should definitely be infallible --- letting it fail here would leak the buffer.

@@ +715,5 @@
>              return false;
>          }
>  
> +        // No duplicates allowed
> +        if (std::find(transferableObjects.begin(), transferableObjects.end(), tObj) != transferableObjects.end()) {

Brendan used to say that every O(n^2) algorithm he ever exposed to the web came back to bite, without exception. So this should be a fun experiment!

@@ +716,5 @@
>          }
>  
> +        // No duplicates allowed
> +        if (std::find(transferableObjects.begin(), transferableObjects.end(), tObj) != transferableObjects.end()) {
> +            JS_ReportError(context(), "Duplicate transferable passed");

Please use js.msg; or if there's a reason not to, at least use a lowercase 'd', for consistency.

And please change the "Permission denied to access object" error message just above this code to use JSMSG_UNWRAP_DENIED.

@@ +952,5 @@
>  
> +        // Emit a placeholder pointer. We will steal the data and neuter the
> +        // transferable later. Note that we mark all entries as "owned", which
> +        // is not yet true (the contents have not yet been transferred; the
> +        // writer is the proud owner of a collection of NULL pointers.)

Heh. I do this all the time: this comment has three sentences run together with parens and a semicolon. Consider changing to three ordinary sentences with full stops? The parens don't seem to express anything here.

The semicolon I don't mind so much, and Waldo likes them, so... whatever. :)

@@ +966,5 @@
> +    uint64_t *cur;
> +    uint64_t *end;
> +
> +  public:
> +    TransferMapRange(SCOutput& out)

It seems like this could take begin and end pointers, and then you could reuse it for Discard.

@@ +978,5 @@
> +            advance();
> +    }
> +
> +    bool empty() const { return cur == end; }
> +    uint64_t *front() { return cur; }

OK. So, the first time I read this class I totally misinterpted the meaning of `cur`. When !empty(), cur points to the *second half* of an entry. That seems odd; also it's a bit fiddly how this increments cur in three different places.

If we had the number of entries in the header, this could be so much simpler... I dunno, this isn't bad, it's just tricky. One-line comment on cur?

Whenever I write a range class, I try to MOZ_ASSERT(!empty()) where sensible; YMMV.

@@ +1017,5 @@
> +    if (!tr.empty()) {
> +        // On an error, free everything already transferred up to this point.
> +        TransferMapRange killer(out);
> +        while (killer != mr) {
> +            uint64_t tmp = LittleEndian::readUint64(killer.front());

I'm not sure this is what should happen. See next comment.

@@ +1085,2 @@
>  
> +    (void) Discard(out.rawBuffer(), out.rawBuffer() + out.count() * sizeof(uint64_t));

Definitely remove the `* sizeof(uint64_t)` here either way. (!)

I'm not sure this Discard() call is right, though. There's something very similar (though not an explicit call to Discard()) happening in JSStructuredCloneWriter::transferOwnership() if an error occurs there. Double free?

How about giving SCOutput a destructor that calls Discard()? Here's where the destructor would be called:

    bool
    WriteStructuredClone(. . . . . .)
    {
        SCOutput out(cx);
        JSStructuredCloneWriter w(out, cb, cbClosure, transferable);
        return w.init() && w.write(v) && out.extractBuffer(bufp, nbytesp);
    }

On success, out.extractBuffer() will have happened, so the destructor will have nothing to discard. On error, stuff gets freed if and only if an entry was written to the transfer map.

I dunno, it seems more obviously-correct to me, less ad-hoc cleanup. Maybe something similar could be done in SCInput.
Attachment #808261 - Flags: review?(jorendorff) → review+
Many apologies for the re-review request, but I ended up changing a lot of code in response the your review and didn't feel comfortable landing as-is. The main changes:

1. The clone buffer now stores the number of transferables in the map header.

2. Map entries are followed by a pointer and a 64-bit "userdata" that is not yet used for anything.

3. All iteration over the transfer map is now done via the length. TransferMapRange is no more.

4. SCOutput manages memory for the transferables now, recording ownership state in the buffer. An entry is owned by the buffer if its tag is at least SCTAG_TM_FIRST_OWNED (currently that's only SCTAG_TM_ALLOC_DATA). I stopped using SCTAG_TM_ALLOC_DATA + a NULL pointer to mean "not owned yet", since that would force custom transferables to have a sentinel value too.

Given how many times I've gotten it wrong, #4 is the most important to check.

5. Cosmetic:
  SCTAG_TM_NOT_MARKED -> SCTAG_TM_UNREAD
  SCTAG_TM_MARKED -> SCTAG_TM_TRANSFERRED

6. To move serialize/deserialize to TestingFunctions, I had to define a notion o fuzzing-safety there.

In my very next patch, or maybe baku's, I reserve the right to define a general mechanism for custom transferables and convert ArrayBuffers over to using it.

(In reply to Jason Orendorff [:jorendorff] from comment #42)
> I'm sorry this is so long. Way too tired to write a shorter one. r=me with
> comments addressed; the comments about destructor-nature, the `else`, and
> the last one are the important ones. Again, sorry for the rest.

Urk? What's bad about a long review? Long review good.

And given how much I've rambled on in this bug, I can't really object...

> Reminder to myself: bumping JS_STRUCTURED_CLONE_VERSION is not necessary
> because the whole use case for that constant is data that's serialized,
> stored, and then loaded up under possibly a different version of Gecko;
> which is unpossible in combination with transferables.

Given that I've had to explain this to two different people, and struggled to grasp myself, I think I'd better add a comment. Actually, more than a comment -- we need to carve out a separate portion of the SCTAG_* numberspace to use for nonpersistent stuff, so that we don't run the risk of perturbing the persistent tag numbers when adding nonpersistent stuff. (I had baku do that in bug 720083, but it's probably better to do it in this patch.)

> ::: js/src/tests/js1_8_5/extensions/clone-transferables.js
> @@ +15,5 @@
> > +
> > +function test() {
> > +    // Note: -8 and -200 will trigger asm.js link failures because 8 and 200
> > +    // bytes are below the minimum allowed size, and the buffer will not
> > +    // actually be converted to an asm.js buffer.
> 
> This is copied from somewhere, right? Any chance of commoning up things in
> extensions/shell.js? It's ok if not.

I assume this refers to the AsmJSArrayBuffer function. It is not copied from anywhere else, unless you count luke's brain. It's kind of a shame since more tests ought to have cases with AsmJS buffers. Moved to extensions/shell.js.

> @@ +21,5 @@
> > +        var buffer_ctor = (size < 0) ? AsmJSArrayBuffer : ArrayBuffer;
> > +        size = Math.abs(size);
> > +
> > +        var old = buffer_ctor(size);
> > +        var copy = deserialize(serialize(old, [old]));
> 
> Hmm. Is there a synchronous way to clone with transferables in the browser?
> (That is, could this test be made to work both in the shell and in the
> browser?)

That is a very good point. Moved to TestingFunctions.cpp. Except that it disabled some functionality if --fuzzing-safe was passed to the shell, and TestingFunctions.cpp doesn't have such a notion, so after some discussion with Jesse I've added a MOZ_FUZZING_SAFE environment variable for the purpose.

> 
> @@ +50,5 @@
> > +            assertEq(copy_arr.length, size / old_arr.BYTES_PER_ELEMENT);
> > +
> > +            buf = null;
> > +            old_arr = null;
> > +            gc();
> 
> If we have to GC here (I'm assuming this is because asm.js buffers take up a
> ton of VM? right?) then it seems like we should probably also GC above, as
> soon as we're done with `old` and `copy`.

No, this isn't for space. We have plenty of memory left, and even if we didn't, the gc should fire automatically.

These explicit gcs are for testing particularly problematic paths through the typed array code, where arraybuffers keep sometimes-weak references to their views. (It's a strong ref if there's only one view, a weak ref if there are more than one, and the pointers get filled in during the GC. It's fun.)

Really, the tests for this stuff live in tests/js1_8_5/extensions/typedarray.js, but I brought along the gc calls because it's more likely to trigger problems than not having them. Commented.

> @@ +60,5 @@
> > +            var dv = DataView(buf); // Second view
> > +            var copy_arr = deserialize(serialize(old_arr, [ buf ]));
> > +            assertEq(buf.byteLength, 0, "donor array buffer should be neutered");
> > +            assertEq(old_arr.length, 0, "donor typed array should be neutered");
> > +            assertEq(dv.byteLength, 0, "all view of donor array buffer should be neutered");
> 
> Grammar nit: "all views of..."

Changed to "all our view are no longer belong to donor array buffer".

Ok, not really.

> ::: js/src/vm/StructuredClone.cpp
> @@ +340,5 @@
> > +            if (tag == SCTAG_TRANSFER_MAP_ENTRY) {
> > +                uint32_t mapEntryDescriptor = uint32_t(u);
> > +                if (mapEntryDescriptor != SCTAG_TM_UNOWNED) {
> > +                    if (!DiscardEntry(mapEntryDescriptor, point++))
> > +                        ok = false;
> 
> DiscardEntry and Discard have destructor-nature. I think they should be
> infallible (return type void).

Yeah, ok. You're right that there's really no good way to recover from failures, at least without leaking.

> @@ +345,1 @@
> >                  } else {
> 
> This else is attached to the wrong if. It's meant to go with the `if (tag ==
> SCTAG_TRANSFER_MAP_ENTRY)`.

Yow!

> (If you can come up with a test that reveals the bug and doesn't seem
> pointless, great. If not, that's fine.)

I did your transferable-count thing instead. *Much* nicer.

> Reusing TransferMapRange is another way to fix this.
> 
> @@ +345,5 @@
> >                  } else {
> >                      // The only things in the transfer map should be
> > +                    // SCTAG_TRANSFER_MAP_ENTRY tags paired with pointers. If
> > +                    // we find any other tag, we've walked off the end of the
> > +                    // transfer map.
> 
> Pre-existing: this seems a bit silly. Since we know the exact number of
> transferables, why not store that number in the header? Then the range class
> could be more straightforward and we wouldn't need this comment.

Very good point. As long as we don't give custom transferables access to the buffer (so they could write in multiple transfer map entries), that should be fine. And I don't think we want that access. Way too many ways to mess things up. (And custom ones can store pointers to malloced memory if they need more space.)

> And, each entry could be two pointers rather than a uint32 and a pointer.
> Two pointers would be nice when this needs to be extended to multiple types
> of transferables. The first pointer could point to some kind of vtable, the
> hooks for re-Object-ifying/discarding transferable data.</sidetrack>

Hm, yeah, that does seem nicer.

> @@ +715,5 @@
> >              return false;
> >          }
> >  
> > +        // No duplicates allowed
> > +        if (std::find(transferableObjects.begin(), transferableObjects.end(), tObj) != transferableObjects.end()) {
> 
> Brendan used to say that every O(n^2) algorithm he ever exposed to the web
> came back to bite, without exception. So this should be a fun experiment!

I actually looked into using your OrderedMap stuff for this, but it's not separated out for reuse right now. And I'd have to Root it.

I suppose I could keep an extra copy just for checking, then (lazy way) sort and scan for O(n log(n)). Or do it the right way with a hash map, but then I'd need to make another AutoRooted variant. Bleh. I think I'll try the web experiment.

Oh, wait... no, it doesn't need AutoRooting if you wait until the end to check for duplicates. The immediacy is observable, though.

> @@ +716,5 @@
> >          }
> >  
> > +        // No duplicates allowed
> > +        if (std::find(transferableObjects.begin(), transferableObjects.end(), tObj) != transferableObjects.end()) {
> > +            JS_ReportError(context(), "Duplicate transferable passed");
> 
> Please use js.msg; or if there's a reason not to, at least use a lowercase
> 'd', for consistency.

Done.

> And please change the "Permission denied to access object" error message
> just above this code to use JSMSG_UNWRAP_DENIED.

Me bad programmer. No cookie.

> @@ +952,5 @@
> >  
> > +        // Emit a placeholder pointer. We will steal the data and neuter the
> > +        // transferable later. Note that we mark all entries as "owned", which
> > +        // is not yet true (the contents have not yet been transferred; the
> > +        // writer is the proud owner of a collection of NULL pointers.)
> 
> Heh. I do this all the time: this comment has three sentences run together
> with parens and a semicolon. Consider changing to three ordinary sentences
> with full stops? The parens don't seem to express anything here.
> 
> The semicolon I don't mind so much, and Waldo likes them, so... whatever. :)

I still have vivid memories of my high school writing teacher complaining about all of my parenthetical comments. If I didn't listen to her then, why would I now?

> 
> @@ +966,5 @@
> > +    uint64_t *cur;
> > +    uint64_t *end;
> > +
> > +  public:
> > +    TransferMapRange(SCOutput& out)
> 
> It seems like this could take begin and end pointers, and then you could
> reuse it for Discard.

After switching to length-based stuff, this whole class is unnecessary. Murdered.

> @@ +978,5 @@
> > +            advance();
> > +    }
> > +
> > +    bool empty() const { return cur == end; }
> > +    uint64_t *front() { return cur; }
> 
> OK. So, the first time I read this class I totally misinterpted the meaning
> of `cur`. When !empty(), cur points to the *second half* of an entry. That
> seems odd; also it's a bit fiddly how this increments cur in three different
> places.
> 
> If we had the number of entries in the header, this could be so much
> simpler... I dunno, this isn't bad, it's just tricky. One-line comment on
> cur?
> 
> Whenever I write a range class, I try to MOZ_ASSERT(!empty()) where
> sensible; YMMV.

The badness is all gone now.

> @@ +1017,5 @@
> > +    if (!tr.empty()) {
> > +        // On an error, free everything already transferred up to this point.
> > +        TransferMapRange killer(out);
> > +        while (killer != mr) {
> > +            uint64_t tmp = LittleEndian::readUint64(killer.front());
> 
> I'm not sure this is what should happen. See next comment.

Gone, gone.

> @@ +1085,2 @@
> >  
> > +    (void) Discard(out.rawBuffer(), out.rawBuffer() + out.count() * sizeof(uint64_t));
> 
> Definitely remove the `* sizeof(uint64_t)` here either way. (!)

Yeaauurgh! Why do you guys let me touch code?!

> I'm not sure this Discard() call is right, though. There's something very
> similar (though not an explicit call to Discard()) happening in
> JSStructuredCloneWriter::transferOwnership() if an error occurs there.
> Double free?

It's the only way to be sure they're dead. Shoot them, cut off their heads, burn the bodies.

(Uh, yeah, it was a double free. I could claim that I meant to NULL out the pointers or set them to SCTAG_TM_UNOWNED or something. But it'd probably be a lie.)

> How about giving SCOutput a destructor that calls Discard()? Here's where
> the destructor would be called:
> 
>     bool
>     WriteStructuredClone(. . . . . .)
>     {
>         SCOutput out(cx);
>         JSStructuredCloneWriter w(out, cb, cbClosure, transferable);
>         return w.init() && w.write(v) && out.extractBuffer(bufp, nbytesp);
>     }
> 
> On success, out.extractBuffer() will have happened, so the destructor will
> have nothing to discard. On error, stuff gets freed if and only if an entry
> was written to the transfer map.
> 
> I dunno, it seems more obviously-correct to me, less ad-hoc cleanup. Maybe
> something similar could be done in SCInput.

Yeah, I think you're right. Again. As usual. Seems like I just can't bring myself to trust the bits stored in the clone buffer.

It does mean that custom transferables will get their free-ers called from a destructor, which gives me the willies. But I guess there's nothing fundamentally awful about that.
Attachment #813632 - Flags: review?(jorendorff)
Attachment #808261 - Attachment is obsolete: true
Comment on attachment 813632 [details] [diff] [review]
Steal and neuter ArrayBuffers at end of structured clone

Review of attachment 813632 [details] [diff] [review]:
-----------------------------------------------------------------

Looks good, r=me!

::: js/src/builtin/TestingFunctions.cpp
@@ +1072,5 @@
> +    static const size_t DATA_SLOT   = 0;
> +    static const size_t LENGTH_SLOT = 1;
> +
> +  public:
> +    static const size_t NUM_SLOTS   = 2;

This doesn't have to be public, does it? I think you can put these in an enum.

@@ +1111,5 @@
> +    }
> +
> +    void clearData() {
> +        setReservedSlot(DATA_SLOT, PrivateValue(nullptr));
> +    }

clearData() shouldn't be a separate method, and certainly not public. Inline it into discard()?

@@ +1230,5 @@
> +const JSPropertySpec CloneBufferObject::props_[] = {
> +    {"clonebuffer", 0,
> +     JSPROP_SHARED | JSPROP_NATIVE_ACCESSORS,
> +     { (JSPropertyOp)CloneBufferObject::getCloneBuffer, nullptr },
> +     { (JSStrictPropertyOp)CloneBufferObject::setCloneBuffer, nullptr }

Use JS_PSGS for this?

@@ +1503,5 @@
>  bool
>  js::DefineTestingFunctions(JSContext *cx, HandleObject obj)
>  {
> +    if (getenv("MOZ_FUZZING_SAFE") && getenv("MOZ_FUZZING_SAFE")[0] != '0')
> +        fuzzingSafe = true;

Don't we want this to work even if the caller passed --fuzzing-safe on the command line but didn't set the environment variable?

::: js/src/vm/StructuredClone.cpp
@@ +106,5 @@
> +enum TransferableObjectType {
> +    SCTAG_TM_UNFILLED = 0,
> +    SCTAG_TM_UNOWNED = 1,
> +    SCTAG_TM_FIRST_OWNED = 2,
> +    SCTAG_TM_ALLOC_DATA = 2,

Add a comment explaining what these last two things are and why they have the same value. Just what you already wrote in the bugzilla comment for this patch is plenty.

@@ +244,5 @@
>      JSContext *context() { return out.context(); }
>  
>      bool writeTransferMap();
>  
> +    bool writeInternal(const js::Value &v);

Why writeInternal()? It only seems to be called in one place.

@@ +992,5 @@
> +    point++;
> +    JS_ASSERT(LittleEndian::readUint64(point) == transferableObjects.length());
> +    point++;
> +
> +    while (!tr.empty()) {

I think you can use the usual form:

    for (JS::AutoObjectVector::Range tr = transferableObjects.all();
         !tr.empty(); 
         tr.popFront())
    {
        ...
    }

unless you think it's clearer the way it's written now, which would be fine.

@@ +1506,5 @@
>              return false;
>  
>          JSObject *obj = JS_NewArrayBufferWithContents(context(), content);
>          if (!obj || !allObjs.append(ObjectValue(*obj)))
>              return false;

I guess technically the entry should be marked as UNOWNED after JS_NewArrayBufferWithContents succeeds, even if the subsequent append() call fails with OOM.
Attachment #813632 - Flags: review?(jorendorff) → review+
In one of those very very rare cases that make sheriffs actually smile, it turns out that this needed to come out anyway, because it busted hell out of asm.js/testFloatingPoint.js on 32bit browsers (and, odd bedfellow, OS X opt but not debug) only. The 32bit warning-as-errors shell build was fine, but make check in the Linux32 and Win32 opt and debug and OS X opt (which should be running on the 64bit half) browsers was not.
(In reply to Phil Ringnalda (:philor) from comment #47)
> In one of those very very rare cases that make sheriffs actually smile, it

I'll need photographic proof of that.
Blocks: 841904
Depends on: 939469
Depends on: 939472
No longer depends on: CVE-2014-1488
Flags: in-testsuite?
There are multiple tests for this now. The main one is js/src/tests/js1_8_5/extensions/clone-transferables.js
Flags: in-testsuite? → in-testsuite+
Given in-testsuite coverage this fix will not be manually verified by QA. If you believe this warrants extra QA attention please nominate for testing by removing this whiteboard tag and adding the verifyme keyword. Please also provide any details you have that may inform our testing.
Whiteboard: [qa-]
Can't see how this fits with ESR criteria, not tracking, wontfixing for ESR24
You need to log in before you can comment on or make changes to this bug.