Closed Bug 1287298 Opened 3 years ago Closed 3 years ago

Add an API to give up ownership of ArrayBuffer data

Categories

(Core :: JavaScript Engine, defect)

defect
Not set

Tracking

()

RESOLVED FIXED
mozilla52
Tracking Status
firefox52 --- fixed

People

(Reporter: ehsan, Assigned: ehsan)

References

Details

(Whiteboard: [v8api])

Attachments

(2 files, 2 obsolete files)

This is similar to stealing the buffer, except that the ArrayBuffer
won't be detached.  The caller is still responsible for freeing the
buffer.
Attachment #8771707 - Flags: review?(jorendorff)
Whiteboard: [v8api]
Ping?
Depends on: 1288555
Comment on attachment 8771707 [details] [diff] [review]
Add an API to give up ownership of ArrayBuffer data

Review of attachment 8771707 [details] [diff] [review]:
-----------------------------------------------------------------

Some problems here, but also this should probably depend on bug 1240984 since it changes a bunch of related code. Once this is fixed up, though, I'd also like to run it past Waldo because we're opening up some new possibilities here that I don't think we may be able to handle, and I don't know if my imagination is good enough to figure out how much trouble it could cause.

::: js/src/jsapi-tests/testArrayBuffer.cpp
@@ +173,5 @@
> +{
> +    JS::RootedObject buffer(cx);
> +
> +    buffer = JS_NewArrayBuffer(cx, n);
> +    JS::RootedObject view(cx, JS_NewUint8ArrayWithBuffer(cx, buffer, 0, -1));

You shouldn't skip null checking even in tests, in case we start doing our OOM testing in jsapi-tests. (For JS tests, we run the code multiple times and OOM after the first allocation, then the second, then the third...)

::: js/src/jsapi.h
@@ +3434,5 @@
>  /**
> + * Externalizes the contents of the given array buffer.  The array buffer length is not modified,
> + * and its contents array still points to the returned buffer. The caller takes ownership of the
> + * return value and must free it or transfer ownership via JS_NewArrayBufferWithContents when done
> + * using it.

This doesn't sound quite right.

If the caller is done with it, then it seems very dangerous to just free it without notifying the ArrayBuffer, since the ArrayBuffer still has a pointer to it and could write into it (via JS or whatever).

Perhaps the contract should be for the caller to JS_DetachArrayBuffer on it when done with it? You'd need to check that this actually does the right thing.

Also, JS_NewArrayBufferWithContents makes this kind of scary too. That would result in two ArrayBuffers sharing the same data (one that owns it and one that doesn't). I don't *think* that breaks any invariants within ArrayBufferObject.cpp, but it's not something the code currently expects, and in particular I'm thinking that the JIT may not handle that sort of aliasing. ABO.cpp would treat it the same as inline data, and therefore copy it if you Transfer it or something, which is fine.

::: js/src/vm/ArrayBufferObject.cpp
@@ +1319,5 @@
> +    // require that the buffer is plain. In the future, we could consider
> +    // returning something that handles releasing the memory.
> +    bool hasStealableContents = buffer->hasStealableContents() && buffer->hasMallocedContents();
> +
> +    return ArrayBufferObject::externalizeContents(cx, buffer, hasStealableContents).data();

This seems kind of strange if !hasStealableContents. You ask it to externalize an ArrayBuffer's contents, presumably so you can update those contents through other means. But if !hasStealableContents, this will end up making a copy and handing it back, which seems surprising and useless. I would expect an error to be reported in that case, and for externalizeContents to not have the bool param at all.
Attachment #8771707 - Flags: review?(jorendorff)
(In reply to Steve Fink [:sfink] [:s:] from comment #3)
> Some problems here, but also this should probably depend on bug 1240984
> since it changes a bunch of related code. Once this is fixed up, though, I'd
> also like to run it past Waldo because we're opening up some new
> possibilities here that I don't think we may be able to handle, and I don't
> know if my imagination is good enough to figure out how much trouble it
> could cause.

FWIW after discussing this with Steve on IRC, it seems like the dependency on bug 1240984 doesn't need to exist since this patch doesn't change the behavior of any existing JSAPIs (basically all of the patch except for ArrayBufferObject::externalizeContents and JS_ExternalizeArrayBufferContents is just refactoring existing code.)  But he still wanted Waldo to review this as well.

> ::: js/src/jsapi.h
> @@ +3434,5 @@
> >  /**
> > + * Externalizes the contents of the given array buffer.  The array buffer length is not modified,
> > + * and its contents array still points to the returned buffer. The caller takes ownership of the
> > + * return value and must free it or transfer ownership via JS_NewArrayBufferWithContents when done
> > + * using it.
> 
> This doesn't sound quite right.
> 
> If the caller is done with it, then it seems very dangerous to just free it
> without notifying the ArrayBuffer, since the ArrayBuffer still has a pointer
> to it and could write into it (via JS or whatever).
> 
> Perhaps the contract should be for the caller to JS_DetachArrayBuffer on it
> when done with it? You'd need to check that this actually does the right
> thing.

Yeah, the correct contract seems to be either the caller needs to keep the buffer alive for as long as the AB object isn't GCed, or call JS_DetachArrayBuffer when it's done with it.  I verified that JS_DetachArrayBuffer would do the right thing in that case.

> Also, JS_NewArrayBufferWithContents makes this kind of scary too. That would
> result in two ArrayBuffers sharing the same data (one that owns it and one
> that doesn't). I don't *think* that breaks any invariants within
> ArrayBufferObject.cpp, but it's not something the code currently expects,
> and in particular I'm thinking that the JIT may not handle that sort of
> aliasing. ABO.cpp would treat it the same as inline data, and therefore copy
> it if you Transfer it or something, which is fine.

I can't really think of any concrete problems there, but I guess we can for now document that the returned buffer shouldn't be used with JS_NewArrayBufferWithContents...  I'll let Waldo also weigh in on this.

> ::: js/src/vm/ArrayBufferObject.cpp
> @@ +1319,5 @@
> > +    // require that the buffer is plain. In the future, we could consider
> > +    // returning something that handles releasing the memory.
> > +    bool hasStealableContents = buffer->hasStealableContents() && buffer->hasMallocedContents();
> > +
> > +    return ArrayBufferObject::externalizeContents(cx, buffer, hasStealableContents).data();
> 
> This seems kind of strange if !hasStealableContents. You ask it to
> externalize an ArrayBuffer's contents, presumably so you can update those
> contents through other means. But if !hasStealableContents, this will end up
> making a copy and handing it back, which seems surprising and useless. I
> would expect an error to be reported in that case, and for
> externalizeContents to not have the bool param at all.

No, returning an error here is the wrong thing to do, since small ArrayBuffers which contain their storage inline will hit the !hasStealableContents case.  This is precisely why my test tests both types of sizes.
This is similar to stealing the buffer, except that the ArrayBuffer
won't be detached.  The caller is still responsible for freeing the
buffer.
Attachment #8779790 - Flags: review?(jwalden+bmo)
Attachment #8771707 - Attachment is obsolete: true
Waldo: ping?
Flags: needinfo?(jwalden+bmo)
Comment on attachment 8779790 [details] [diff] [review]
Add an API to give up ownership of ArrayBuffer data

Review of attachment 8779790 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/jsapi-tests/testArrayBuffer.cpp
@@ +162,5 @@
> +BEGIN_TEST(testArrayBuffer_externalize)
> +{
> +    if (!testWithSize(cx, 2))    // inlined storage
> +        return false;
> +    if (!testWithSize(cx, 2000)) // externalized storage

If you're trying to squat "externalize" as a verb, no way should you use it here to mean something different.  Instead use "ArrayBuffer data stored inline in the object." and "ArrayBuffer data stored out-of-line in a separate heap allocation."

@@ +168,5 @@
> +
> +    return true;
> +}
> +
> +bool testWithSize(JSContext* cx, int32_t n)

size_t

@@ +172,5 @@
> +bool testWithSize(JSContext* cx, int32_t n)
> +{
> +    JS::RootedObject buffer(cx);
> +
> +    buffer = JS_NewArrayBuffer(cx, n);

JS::RootedObject buffer(cx, JS_NewArrayBuffer(cx, n));
CHECK(buffer != nullptr);

and then a blank line.

@@ +174,5 @@
> +    JS::RootedObject buffer(cx);
> +
> +    buffer = JS_NewArrayBuffer(cx, n);
> +    JS::RootedObject view(cx, JS_NewUint8ArrayWithBuffer(cx, buffer, 0, -1));
> +    CHECK(view != nullptr);

Blank line after.

@@ +179,5 @@
> +    void* contents = JS_ExternalizeArrayBufferContents(cx, buffer);
> +    CHECK(contents != nullptr);
> +    CHECK(hasExpectedLength(view, n));
> +    CHECK(!JS_IsDetachedArrayBufferObject(buffer));
> +    CHECK(JS_GetArrayBufferByteLength(buffer) == uint32_t(n));

Blank line after this.  (Generally you need more blank lines between discrete operations and their subsequent tests.)

Also, it seems like at this point you should do something like

  uint8_t* uint8Contents = static_cast<uint8_t*>(contents);
  CHECK(*uint8Contents == 0);
  uint8_t randomByte(rand());
  *uint8Contents = randomByte;

  JS::RootedValue v(cx);
  CHECK(JS_GetElement(cx, view, 0, &v));
  CHECK(v.toInt32() == randomByte);

to verify the same memory's used for both.

@@ +182,5 @@
> +    CHECK(!JS_IsDetachedArrayBufferObject(buffer));
> +    CHECK(JS_GetArrayBufferByteLength(buffer) == uint32_t(n));
> +    view = nullptr;
> +    GC(cx);
> +    buffer = nullptr;

This code is following whatever hazy contract you had before, for how to deallocate the pointer returned from JS_ExternalizeArrayBufferContents.  The protocol you now have documented requires that you detach |buffer|, and only then perform the JS_free.  Please follow the protocol.

@@ +198,5 @@
> +}
> +
> +bool hasExpectedLength(JS::HandleObject obj, int32_t n) {
> +    JS::RootedValue v(cx);
> +    return JS_GetProperty(cx, obj, "byteLength", &v) && v.toInt32() == n;

static bool
hasExpectedLength(JSContext* cx, JS::HandleObject obj, uint32_t* len)
{
    JS::RootedValue v(cx);
    if (!JS_GetProperty(cx, obj, "byteLength", &v))
        return false;
    *len = v.toInt32();
    return true;
}

and then

uint32_t actualLength;
CHECK(hasExpectedLength(cx, obj, n, &actualLength));
CHECK(actualLength == n);

at the testing site.  We shouldn't just be pretending error handling doesn't have to happen, even in tests.  CONSTANT VIGILANCE!

::: js/src/jsapi.h
@@ +3443,5 @@
>  
>  /**
> + * Externalizes the contents of the given array buffer.  The array buffer length is not modified,
> + * and its contents array still points to the returned buffer. The caller takes ownership of the
> + * return value and must free it after calling JS_DetachArrayBuffer.

This is generally too terse.

"""
Returns a pointer to the ArrayBuffer |obj|'s data.  |obj| and its views will store and expose the data in the returned pointer: assigning into the returned pointer will affect values exposed by views of |obj| and vice versa.

The caller must ultimately deallocate the returned pointer to avoid leaking.  The memory is *not* garbage-collected with |obj|.  These steps must be followed to deallocate:

1. The ArrayBuffer |obj| must be detached using JS_DetachArrayBuffer.
2. The returned pointer must be freed using JS_free.

To perform step 1, callers *must* hold a reference to |obj| until they finish using the returned pointer.  They *must not* attempt to let |obj| be GC'd, then JS_free the pointer.

If |obj| isn't an ArrayBuffer, this function returns null and reports an error.
"""

And as a general comment: don't document APIs using a vague word in the API name -- "externalize", here.  Unless you're adding a general term to use repeatedly in numerous places (e.g. "detach" for ArrayBuffer), you're just being circular.

::: js/src/vm/ArrayBufferObject.cpp
@@ +746,5 @@
> +    BufferContents newContents = AllocateArrayBufferContents(cx, buffer->byteLength());
> +    if (!newContents)
> +        return BufferContents::createPlain(nullptr);
> +    memcpy(newContents.data(), contents.data(), buffer->byteLength());
> +    buffer->changeContents(cx, newContents, DoesntOwnData);

I don't think this really works, at least not for some flavors of ArrayBuffer.

If the buffer is mmap'd, it won't be |hasStealableContents|.  Then this function will copy its data into new contents and swap those contents into the buffer.  But that fundamentally breaks the mmap-fulness of the buffer.  If it's file-mapped data, writes into the buffer will no longer modify the file.

The problem's even worse for wasm mmap'd buffers: we've written pointers to the original buffer contents into stuff, and changing out from underneath is doubly bad.  But it's worse than that, because wasm'd buffers can't be detached at all!  It's fundamentally impossible to have this API for wasm mmap'd buffers.

But actually, the no-detachment restriction applies to wasm buffers even if they're *not* mmap'd, and even if they're malloc'd.

You need some restrictions on this function, or on the JSAPI function, that prohibit wasm buffers, and mmap'd buffers of any flavor, from ever appearing and being treated by any of this.

@@ +1302,5 @@
> +JS_ExternalizeArrayBufferContents(JSContext* cx, HandleObject objArg)
> +{
> +    JSObject* obj = CheckedUnwrap(objArg);
> +    if (!obj)
> +        return nullptr;

So actually, this is somewhat problematic.

In the presence of hueyfix, holding onto a wrapper isn't enough to keep the underlying object.  So if someone tries to perform this operation on a CCW, it might get back a pointer.  Then, according to the API contract, the user is supposed to JS_DetachArrayBuffer |objArg| before freeing the pointer -- but you can't detach |objArg|, because it's not an ArrayBuffer.  And if you can't detach |objArg|, there's no way to know when it's safe to free the returned pointer.

For consistency with JS_DetachArrayBuffer, then, this function must only handle unwrapped ArrayBufferObject.
Attachment #8779790 - Flags: review?(jwalden+bmo) → review-
Flags: needinfo?(jwalden+bmo)
This is similar to stealing the buffer, except that the ArrayBuffer
won't be detached.  The caller is still responsible for freeing the
buffer.
Attachment #8794405 - Flags: review?(jwalden+bmo)
Attachment #8779790 - Attachment is obsolete: true
Attached patch interdiffSplinter Review
Assignee: nobody → ehsan
Status: NEW → ASSIGNED
ping?
Comment on attachment 8794405 [details] [diff] [review]
Add an API to give up ownership of ArrayBuffer data

Review of attachment 8794405 [details] [diff] [review]:
-----------------------------------------------------------------

*gulp*

::: js/src/jsapi-tests/testArrayBuffer.cpp
@@ +197,5 @@
> +    view = nullptr;
> +    GC(cx);
> +
> +    CHECK(JS_DetachArrayBuffer(cx, buffer));
> +    JS_free(nullptr, contents);

Much prefer s/nullptr/cx/ here.

@@ +199,5 @@
> +
> +    CHECK(JS_DetachArrayBuffer(cx, buffer));
> +    JS_free(nullptr, contents);
> +    buffer = nullptr;
> +    GC(cx);

Out of paranoia, could you

  GC(cx);
  CHECK(*uint8Contents == randomByte);

after the detach-operation, then

  GC(cx);

after the free, then the final GC(cx); here?  Basically exercise every intermediate state's observable setup, as far as is possible.  Too many steps in a row before the GC right now, the way it's written, that need to be exercised harder.

::: js/src/vm/ArrayBufferObject.cpp
@@ +1100,5 @@
>  /* static */ ArrayBufferObject::BufferContents
> +ArrayBufferObject::externalizeContents(JSContext* cx, Handle<ArrayBufferObject*> buffer,
> +                                       bool hasStealableContents)
> +{
> +    MOZ_ASSERT(buffer->isPlain(), "Only support doing this on plain ABOs");

MOZ_ASSERT(!buffer->isDetached(), "must have contents to externalize");

@@ +1687,5 @@
> +        JS_ReportErrorNumber(cx, GetErrorMessage, nullptr, JSMSG_TYPED_ARRAY_BAD_ARGS);
> +        return nullptr;
> +    }
> +
> +    Rooted<ArrayBufferObject*> buffer(cx, &obj->as<ArrayBufferObject>());

I think you can do

  Handle<ArrayBufferObject*> buffer = obj.as<ArrayBufferObject>();

here.  Nothing wrong with the existing way, just this is very mildly more efficient.
Attachment #8794405 - Flags: review?(jwalden+bmo) → review+
Pushed by eakhgari@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/87b7e5eac522
Add an API to give up ownership of ArrayBuffer data; r=Waldo
Fails to build:

js/src/vm/ArrayBufferObject.cpp:1693:9: error: use of undeclared identifier 'JS_ReportErrorNumber'; did you mean 'JS_ReportErrorNumberUC'?
     JS_ReportErrorNumber(cx, GetErrorMessage, nullptr, JSMSG_TYPED_ARRAY_BAD_ARGS);
     ^~~~~~~~~~~~~~~~~~~~
     JS_ReportErrorNumberUC
https://hg.mozilla.org/integration/mozilla-inbound/rev/7a51bf026e2eb3ecfd662d37a9facf7ba428261f
Bug 1287298 followup - Use JS_ReportErrorNumberASCII instead of JS_ReportErrorNumber. r=bustage
Thanks for the fix, Tooru!
https://hg.mozilla.org/mozilla-central/rev/87b7e5eac522
https://hg.mozilla.org/mozilla-central/rev/7a51bf026e2e
Status: ASSIGNED → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla52
You need to log in before you can comment on or make changes to this bug.