Closed Bug 1259021 Opened 4 years ago Closed 4 years ago

Investigate using in-place storage in AutoStableStringChars to avoid allocation for short strings

Categories

(Core :: JavaScript Engine, defect)

defect
Not set

Tracking

()

RESOLVED FIXED
mozilla49
Tracking Status
firefox49 --- fixed

People

(Reporter: jonco, Assigned: jonco)

References

Details

Attachments

(3 files, 2 obsolete files)

As suggested in 1258453.
Attached patch bug1259021-assc-inline (obsolete) — Splinter Review
Here's what I came up with for this.

I originally looked at using a Variant for the vector based on the char type, but that didn't work because there's no way to construct a Variant without passing an instance of the required type.  Also it was pretty complicated.

This patch uses a vector of uint8_t and casts the pointer to the required type.

I had to add a method to Vector to get maybeGiveOwnershipToCaller keep the same semantics.  I'll ask for Waldo's review for this as well.
Assignee: nobody → jcoppeard
Attachment #8734039 - Flags: review?(jdemooij)
Comment on attachment 8734039 [details] [diff] [review]
bug1259021-assc-inline

Review of attachment 8734039 [details] [diff] [review]:
-----------------------------------------------------------------

Nice, thanks!

::: js/src/vm/String.cpp
@@ +933,5 @@
>      return base->isInline();
>  }
>  
> +template <typename T>
> +T* AutoStableStringChars::allocOwnChars(JSContext* cx, size_t count)

Nit: \n after T*
Attachment #8734039 - Flags: review?(jdemooij) → review+
Comment on attachment 8734039 [details] [diff] [review]
bug1259021-assc-inline

Requesting additional review for the Vector changes.

This adds Vector::extractRawBufferWithoutAlloc which returns the heap allocated contents if inline storage is not being used or nullptr otherwise.  

(Possibly this would be better named extractRawBuffer and the existing method renamed to extractRawBufferOrCopy.  Or possibly AutoStableStringChars::maybeGiveOwnershipToCaller should have different semantics such that we can just call the existing Vector::extractRawBuffer.  But this seemed the simplest and sanest thing to do.)
Attachment #8734039 - Flags: review?(jwalden+bmo)
Comment on attachment 8734039 [details] [diff] [review]
bug1259021-assc-inline

Review of attachment 8734039 [details] [diff] [review]:
-----------------------------------------------------------------

Nice!

This requires additions to mfbt/tests/TestVector.cpp, working with a class that counts calls to its ctor/dtor/move-ctors to verify expected behavior.  And then there are the handful of other changes noted in review comments to go.

::: js/src/jsfriendapi.h
@@ +1320,5 @@
>  class MOZ_STACK_CLASS AutoStableStringChars
>  {
> +    /*
> +     * When copying string char, use this many bytes of inline storage.  This is
> +     * chosen to allow the inline string types to be copied without allocating.

Note the location of the static_assert for this, please.

::: js/src/vm/String.cpp
@@ +938,5 @@
> +{
> +    static_assert(
> +        InlineCapacity >= sizeof(JS::Latin1Char) * JSFatInlineString::MAX_LENGTH_LATIN1 + 1 &&
> +        InlineCapacity >= sizeof(char16_t) * JSFatInlineString::MAX_LENGTH_TWO_BYTE + 1,
> +        "InlineCapacity to small to hold fat inline strings");

too small to

::: mfbt/Vector.h
@@ +654,5 @@
> +   /**
> +   * Like extractRawBuffer above but if the elements are stored in-place it
> +   * returns nullptr rather that attempting to allocate a new buffer.
> +   */
> +  MOZ_WARN_UNUSED_RESULT T* extractRawBufferWithoutAlloc();

> Possibly this would be better named extractRawBuffer and the existing
> method renamed to extractRawBufferOrCopy.

Yeah.

Let's (first patch) do a pure rename of extractRawBuffer to extractOrCopyRawBuffer.

Then (second patch) let's have add this new method, extractRawBuffer.  Add it above extractOrCopyRawBuffer, and give it this comment.  (I'd like to not have the two extraction method comments refer to each other -- or at least, if there's to be any reference, I'd like extractOrCopyRawBuffer to refer to extractRawBuffer, because the latter is a "smaller" behavior than the former.)

"""
If elements are stored in-place, return nullptr and leave the vector unmodified.  Otherwise return this vector's elements and clears the vector.
"""

We should probably modify extractOrCopyRawBuffer's comment to indicate that elements are copy-constructed -- and that if any of the vector's elements are uninitialized (as growByUninitialized permits), depending on T, behavior may be undefined.

@@ +1328,4 @@
>  Vector<T, N, AP>::extractRawBuffer()
>  {
> +  T* ret = extractRawBufferWithoutAlloc();
> +  if (ret) {

Mild preference for 

if (T* ret = ...)

and then rename the subsequent |ret| to |copy|, to make clearer that we're returning a copy.

@@ +1338,5 @@
> +    return nullptr;
> +  }
> +
> +  Impl::copyConstruct(ret, beginNoCheck(), endNoCheck());
> +  Impl::destroy(beginNoCheck(), endNoCheck());

Interesting -- this (preexistingly) makes Vector::extractOrCopyRawBuffer unusable with move-only data types.  Change these two lines to

  Impl::moveConstruct(copy, beginNoCheck(), endNoCheck())

and update the comment to refer to move construction.

@@ +1339,5 @@
> +  }
> +
> +  Impl::copyConstruct(ret, beginNoCheck(), endNoCheck());
> +  Impl::destroy(beginNoCheck(), endNoCheck());
> +  /* mBegin, mCapacity are unchanged. */

Forgot one:

#ifdef DEBUG
  mReserved = 0;
#endif
Attachment #8734039 - Flags: review?(jwalden+bmo) → feedback+
Patch to rename Vector::extractRawBuffer to extractOrCopyRawBuffer globally.
Attachment #8743909 - Flags: review?(jwalden+bmo)
Add new Vector::extractRawBuffer method that doesn't copy in-place data.
Attachment #8743910 - Flags: review?(jwalden+bmo)
Attached patch bug1259021-assc-inline v2 (obsolete) — Splinter Review
Updated main patch after splitting off Vector changes.
Attachment #8734039 - Attachment is obsolete: true
Attachment #8743911 - Attachment is obsolete: true
Attachment #8743916 - Flags: review?(jwalden+bmo)
Attachment #8743909 - Flags: review?(jwalden+bmo) → review+
Comment on attachment 8743910 [details] [diff] [review]
bug1259021-add-vector-extract-without-copy

Review of attachment 8743910 [details] [diff] [review]:
-----------------------------------------------------------------

::: mfbt/Vector.h
@@ +645,5 @@
> +   * vector.
> +   *
> +   * If successful, this transfers ownership of the internal buffer used by this
> +   * vector to the caller.  (It's the caller's responsibility to properly
> +   * deallocate this buffer, in accordance with this vector's AllocPolicy.)

Let's wordsmith this a little harder, keeping the if-inline, if-not text in wholly separate paragraphs for readability, and being more precise about what "zero" means:

"""
If elements are stored in-place, return nullptr and leave this vector unmodified.

Otherwise return this vector's elements buffer, and clear this vector as if by clearAndFree(). The caller now owns the buffer and is responsible for deallocating it consistent with this vector's AllocPolicy.
"""

@@ +653,5 @@
> +  MOZ_WARN_UNUSED_RESULT T* extractRawBuffer();
> +
> +  /**
> +   * Return this vector's elements buffer and clear the vector, copying them to
> +   * a new buffer if they are stored in-place.

And, I ended up wordsmithing this harder, too, for consistency with the above (and to deal with the "clear" ambiguity):

"""
If elements are stored in-place, allocate a new buffer, move this vector's elements into it, and return that buffer.

Otherwise return this vector's elements buffer. The caller now owns the buffer and is responsible for deallocating it consistent with this vector's AllocPolicy.

This vector is cleared, as if by clearAndFree(), when this method succeeds.  This method fails and returns nullptr only if new elements buffer allocation fails.

N.B. Only the range [0, length()) of the returned buffer is constructed.  If any of these elements are uninitialized (as growByUninitialized enables), behavior is undefined.
"""

(Note the upgrade to "is undefined" from "may be undefined" is deliberate: it's UB even to *evaluate* an uninitialized object, and moveConstruct would definitely do that.  And I'd prefer to just forbid uninitialized elements at all, than allow them only if using out-of-line storage, because that's an easy API to misuse.)

@@ +1320,5 @@
>  
>  template<typename T, size_t N, class AP>
>  inline T*
> +Vector<T, N, AP>::extractRawBuffer()
> +{

I'm not confident of the exact logic for where to use/not use |MOZ_REENTRANCY_GUARD_ET_AL;|, but I think you can use it here, and in the OrCopy version.

@@ +1350,5 @@
> +  }
> +
> +  Impl::moveConstruct(copy, beginNoCheck(), endNoCheck());
> +  Impl::destroy(beginNoCheck(), endNoCheck());
> +  /* mBegin, mCapacity, mReserved are unchanged. */

Hmm.  I think it's probably better/more predictable for mReserved to be zeroed and mCapacity to be reset to kInlineCapacity here.  (This is also consistent with extractRawBuffer.)  Please do that.

::: mfbt/tests/TestVector.cpp
@@ +246,5 @@
> +
> +  Vector<S, 5> vec;
> +  MOZ_RELEASE_ASSERT(vec.reserve(5));
> +  for (size_t i = 0; i < 5; i++) {
> +    MOZ_RELEASE_ASSERT(vec.emplaceBack(i, i*i));

Spaces around *, and elsewhere, so s/i\*i/i * i/g

Use infallibleEmplaceBack here.

@@ +248,5 @@
> +  MOZ_RELEASE_ASSERT(vec.reserve(5));
> +  for (size_t i = 0; i < 5; i++) {
> +    MOZ_RELEASE_ASSERT(vec.emplaceBack(i, i*i));
> +  }
> +  MOZ_RELEASE_ASSERT(vec.length() == 5);

MOZ_ASSERT(vec.reserved() == 5);

@@ +257,5 @@
> +  S* buf = vec.extractRawBuffer();
> +  MOZ_RELEASE_ASSERT(!buf);
> +  MOZ_RELEASE_ASSERT(S::constructCount == 5);
> +  MOZ_RELEASE_ASSERT(S::moveCount == 0);
> +  MOZ_RELEASE_ASSERT(S::destructCount == 0);

MOZ_RELEASE_ASSERT(vec.length() == 5);
MOZ_ASSERT(vec.reserved() == 5);

@@ +261,5 @@
> +  MOZ_RELEASE_ASSERT(S::destructCount == 0);
> +
> +  MOZ_RELEASE_ASSERT(vec.reserve(10));
> +  for (size_t i = 5; i < 10; i++) {
> +    MOZ_RELEASE_ASSERT(vec.emplaceBack(i, i*i));

infallibleEmplaceBack

@@ +263,5 @@
> +  MOZ_RELEASE_ASSERT(vec.reserve(10));
> +  for (size_t i = 5; i < 10; i++) {
> +    MOZ_RELEASE_ASSERT(vec.emplaceBack(i, i*i));
> +  }
> +  MOZ_RELEASE_ASSERT(vec.length() == 10);

MOZ_ASSERT(vec.reserved() == 10);

@@ +272,5 @@
> +  buf = vec.extractRawBuffer();
> +  MOZ_RELEASE_ASSERT(buf);
> +  MOZ_RELEASE_ASSERT(S::constructCount == 10);
> +  MOZ_RELEASE_ASSERT(S::moveCount == 5);
> +  MOZ_RELEASE_ASSERT(S::destructCount == 5);

MOZ_RELEASE_ASSERT(vec.length() == 0);
MOZ_ASSERT(vec.reserved() == 0);

@@ +290,5 @@
> +
> +  Vector<S, 5> vec;
> +  MOZ_RELEASE_ASSERT(vec.reserve(5));
> +  for (size_t i = 0; i < 5; i++) {
> +    MOZ_RELEASE_ASSERT(vec.emplaceBack(i, i*i));

infallibleEmplaceBack

@@ +292,5 @@
> +  MOZ_RELEASE_ASSERT(vec.reserve(5));
> +  for (size_t i = 0; i < 5; i++) {
> +    MOZ_RELEASE_ASSERT(vec.emplaceBack(i, i*i));
> +  }
> +  MOZ_RELEASE_ASSERT(vec.length() == 5);

MOZ_ASSERT(vec.reserved() == 5);

@@ +301,5 @@
> +  S* buf = vec.extractOrCopyRawBuffer();
> +  MOZ_RELEASE_ASSERT(buf);
> +  MOZ_RELEASE_ASSERT(S::constructCount == 5);
> +  MOZ_RELEASE_ASSERT(S::moveCount == 5);
> +  MOZ_RELEASE_ASSERT(S::destructCount == 5);

MOZ_RELEASE_ASSERT(vec.length() == 0);
MOZ_ASSERT(vec.reserved() == 0);

@@ +314,5 @@
> +  MOZ_RELEASE_ASSERT(vec.reserve(10));
> +  for (size_t i = 0; i < 10; i++) {
> +    MOZ_RELEASE_ASSERT(vec.emplaceBack(i, i*i));
> +  }
> +  MOZ_RELEASE_ASSERT(vec.length() == 10);

MOZ_ASSERT(vec.reserved() == 10);

@@ +323,5 @@
> +  buf = vec.extractOrCopyRawBuffer();
> +  MOZ_RELEASE_ASSERT(buf);
> +  MOZ_RELEASE_ASSERT(S::constructCount == 10);
> +  MOZ_RELEASE_ASSERT(S::moveCount == 0);
> +  MOZ_RELEASE_ASSERT(S::destructCount == 0);

MOZ_RELEASE_ASSERT(vec.length() == 0);
MOZ_ASSERT(vec.reserved() == 0);
Attachment #8743910 - Flags: review?(jwalden+bmo) → review+
Comment on attachment 8743916 [details] [diff] [review]
bug1259021-assc-inline v3

Review of attachment 8743916 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/vm/String.cpp
@@ +938,5 @@
> +AutoStableStringChars::allocOwnChars(JSContext* cx, size_t count)
> +{
> +    static_assert(
> +        InlineCapacity >= sizeof(JS::Latin1Char) * JSFatInlineString::MAX_LENGTH_LATIN1 + 1 &&
> +        InlineCapacity >= sizeof(char16_t) * JSFatInlineString::MAX_LENGTH_TWO_BYTE + 1,

Er, do these want to be

  InlineCapacity >= sizeof(...) * (JSFatInlineString::* + 1)

?  I assume this is trying to account for a trailing '\0', but it's not quite going to in the 16-bit character case.

@@ +943,5 @@
> +        "InlineCapacity too small to hold fat inline strings");
> +
> +    size_t size;
> +    if (!CalculateAllocSize<T>(count, &size))
> +        return nullptr;

This needs to ReportAllocationOverflow(cx).  Please add a jsapi-test to verify that this failure mode properly sets a pending exception, and make sure it fails without this modification, passes with.

@@ +997,5 @@
>      if (!chars)
>          return false;
>  
>      PodCopy(chars, linearString->rawTwoByteChars(), length);
>      chars[length] = 0;

Requiring null termination here is a real downer.  :-(  We should get rid of this requirement at some point to bring a bunch of edge-case strings back into inlinable territory.
Attachment #8743916 - Flags: review?(jwalden+bmo) → review+
(In reply to Jeff Walden [:Waldo] (remove +bmo to email) from comment #10)
> Er, do these want to be
> 
>   InlineCapacity >= sizeof(...) * (JSFatInlineString::* + 1)
> 
> ?  I assume this is trying to account for a trailing '\0', but it's not
> quite going to in the 16-bit character case.

Yes, good catch.

> @@ +943,5 @@
> > +        "InlineCapacity too small to hold fat inline strings");
> > +
> > +    size_t size;
> > +    if (!CalculateAllocSize<T>(count, &size))
> > +        return nullptr;
> 
> This needs to ReportAllocationOverflow(cx).  Please add a jsapi-test to
> verify that this failure mode properly sets a pending exception, and make
> sure it fails without this modification, passes with.

I looked into this some more and this can't actually overflow since the length of strings are limited to 2^28 - 1.  I'm going to replace the check with a static assert instead.
Depends on: 1273432
You need to log in before you can comment on or make changes to this bug.