Closed Bug 1269451 Opened 8 years ago Closed 8 years ago

UncompressedSourceCache should use the runtime's SharedImmutableStringsCache to deduplicate strings

Categories

(Core :: JavaScript Engine, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED

People

(Reporter: fitzgen, Assigned: fitzgen)

References

Details

Attachments

(7 files, 8 obsolete files)

15.03 KB, patch
fitzgen
: review+
fitzgen
: checkin-
Details | Diff | Splinter Review
7.59 KB, patch
fitzgen
: review+
fitzgen
: checkin+
Details | Diff | Splinter Review
2.80 KB, patch
jimb
: review+
fitzgen
: checkin+
Details | Diff | Splinter Review
1.97 KB, patch
jimb
: review+
fitzgen
: checkin+
Details | Diff | Splinter Review
18.41 KB, patch
jimb
: review+
fitzgen
: checkin+
Details | Diff | Splinter Review
5.59 KB, patch
jimb
: review+
fitzgen
: checkin+
Details | Diff | Splinter Review
6.86 KB, patch
fitzgen
: review+
fitzgen
: checkin+
Details | Diff | Splinter Review
Follow up to bug 1211723 -- we should deduplicate the strings in the UncompressedSourceCache as well.
This is going to nicely clean up all the AutoEntryHolder stuff, too -- nice!
Assignee: nobody → nfitzgerald
Status: NEW → ASSIGNED
(In reply to Nick Fitzgerald [:fitzgen] [⏰PDT; UTC-7] from comment #1)
> This is going to nicely clean up all the AutoEntryHolder stuff, too -- nice!

For context: the holder is no longer needed because we let the refcounts of each SharedImmutableTwoByteString manage lifetime of the underlying chars, instead of sweeping them out every GC. We still purge the cache on GC sweep, but because it holds SharedImmutableTwoByteStrings instead of UniquePtrs, it just ends up decrementing refcounts in dtors and doing the Right Thing when that refcount reaches zero. No more special casing!
While the idea sounds reasonable, I'm familiar with the SharedImmutableStringsCache so perhaps the reviewer those patches should review this too?
(In reply to Luke Wagner [:luke] from comment #5)
> While the idea sounds reasonable, I'm familiar with the
> SharedImmutableStringsCache so perhaps the reviewer those patches should
> review this too?

Sure -- was just spreading around the review love and you were the only other person who has touched script sources recently AFAIK. No problem, though!
Attachment #8750434 - Flags: review?(luke) → review?(jimb)
Comment on attachment 8750434 [details] [diff] [review]
Make the UncompressedSourceCache use the shared, immutable strings infrastructure

Review of attachment 8750434 [details] [diff] [review]:
-----------------------------------------------------------------

This looks good so far, but I want to give it one more look. Posting the one comment I have so far.

::: js/src/jsscript.cpp
@@ +1823,4 @@
>  }
>  
>  bool
> +UncompressedSourceCache::put(ScriptSource* ss, SharedImmutableTwoByteString str)

If I understand how this is meant to be used, it seems like this should be an rvalue reference. We only want to permit passing arguments of which we'll take ownership. As it stands, it's hard for me to see why this isn't going to try to invoke the (deleted) copy constructor.
Comment on attachment 8750434 [details] [diff] [review]
Make the UncompressedSourceCache use the shared, immutable strings infrastructure

Review of attachment 8750434 [details] [diff] [review]:
-----------------------------------------------------------------

Okay, looks great. Nice cleanup.
Attachment #8750434 - Flags: review?(jimb) → review+
(In reply to Jim Blandy :jimb from comment #7)
> ::: js/src/jsscript.cpp
> @@ +1823,4 @@
> >  }
> >  
> >  bool
> > +UncompressedSourceCache::put(ScriptSource* ss, SharedImmutableTwoByteString str)
> 
> If I understand how this is meant to be used, it seems like this should be
> an rvalue reference. We only want to permit passing arguments of which we'll
> take ownership. As it stands, it's hard for me to see why this isn't going
> to try to invoke the (deleted) copy constructor.

Yeah, good call. Fixing.
This also caused a pile of performance regressions on talos tp5o:

https://treeherder.mozilla.org/perf.html#/alerts?id=1142

https://wiki.mozilla.org/Buildbot/Talos/Tests#tp5
https://wiki.mozilla.org/Buildbot/Talos/Tests#tp5o_scroll

Could you please re-run the talos tests on try before relanding? It would be good to mitigate the performance hit from this if possible.
(In reply to William Lachance (:wlach) from comment #13)
> Could you please re-run the talos tests on try before relanding? It would be
> good to mitigate the performance hit from this if possible.

This change should have no performance impact at all. There's something unexpected going on.
(In reply to Jim Blandy :jimb from comment #14)
> (In reply to William Lachance (:wlach) from comment #13)
> > Could you please re-run the talos tests on try before relanding? It would be
> > good to mitigate the performance hit from this if possible.
> 
> This change should have no performance impact at all. There's something
> unexpected going on.

My suspicion is that we are doing hashing of large sources on main thread, where we didn't before. Its the only thing I can think of.
If we can confirm that with profiling, could we define the hash function for the UncompressedSourceCache to only hash the first N and last N bytes? Or to take samples spaced further and further apart in the string? ("If N is the size of something, log(N) is a constant.")
Poking around just now, I was pretty surprised to see that SharedImmutableString, which is returned by value by getOrCreate(), has a destructor that takes a lock and does a set lookup that does an O(n) memcmp.  It seems like we'd want to do at most one O(n) op ever on a string (the initial de-dup), and then, ideally, outside of the lock.
(In reply to Luke Wagner [:luke] from comment #17)
> Poking around just now, I was pretty surprised to see that
> SharedImmutableString, which is returned by value by getOrCreate(), has a
> destructor that takes a lock and does a set lookup that does an O(n) memcmp.
> It seems like we'd want to do at most one O(n) op ever on a string (the
> initial de-dup), and then, ideally, outside of the lock.

That's a good point. Hashing is constant time in the size of the table, but linear in the size of the key. I wish I'd considered this angle when reviewing.

At present, the hash table stores StringBoxes directly as its elements. Since the table can be resized, we can't hold pointers to StringBoxes live across insertions or deletions. But if the hash table stored pointers to StringBoxes, then SharedImmutableString could point to a StringBox, and we wouldn't need to hash to find the reference count.

However, when the reference count drops to zero, we must remove the entry from the hash table. If we want to do that in the destructor, there's no way to avoid hashing at that point. However, iterating over the entire table provides another way to remove entries without hashing. So perhaps we could free the text when the reference count drops to zero, and mark the StringBox as "dead" (perhaps simply by nulling out the pointer to the now-freed text); and then let major GCs iterate through the table and remove "dead" StringBoxes.

This would depend on the hash table being okay with having entries that don't match their hash values any more.
(In reply to Luke Wagner [:luke] from comment #17)
> Poking around just now, I was pretty surprised to see that
> SharedImmutableString, which is returned by value by getOrCreate(), has a
> destructor that takes a lock and does a set lookup that does an O(n) memcmp.
> It seems like we'd want to do at most one O(n) op ever on a string (the
> initial de-dup), and then, ideally, outside of the lock.

We already do the hashing without holding the lock, FWIW.

I don't see how we can avoid an O(n) memcmp in the `match` method of the hash policy.

(In reply to Jim Blandy :jimb from comment #16)
> If we can confirm that with profiling, could we define the hash function for
> the UncompressedSourceCache to only hash the first N and last N bytes? Or to
> take samples spaced further and further apart in the string? ("If N is the
> size of something, log(N) is a constant.")

We could certainly look into this, or something like it. Hash every sqrt(length-of-string)'th character of the string? I think this would lead to less hash collisions compared to first and last, but its also pretty much the worst thing we could do for the memory caches...

(In reply to Jim Blandy :jimb from comment #18)
> At present, the hash table stores StringBoxes directly as its elements.
> Since the table can be resized, we can't hold pointers to StringBoxes live
> across insertions or deletions. But if the hash table stored pointers to
> StringBoxes, then SharedImmutableString could point to a StringBox, and we
> wouldn't need to hash to find the reference count.
> 
> However, when the reference count drops to zero, we must remove the entry
> from the hash table. If we want to do that in the destructor, there's no way
> to avoid hashing at that point. However, iterating over the entire table
> provides another way to remove entries without hashing. So perhaps we could
> free the text when the reference count drops to zero, and mark the StringBox
> as "dead" (perhaps simply by nulling out the pointer to the now-freed text);
> and then let major GCs iterate through the table and remove "dead"
> StringBoxes.
> 
> This would depend on the hash table being okay with having entries that
> don't match their hash values any more.

Simpler than that would be not freeing the text eagerly on refcount == 0, and to let the string float in the cache until the next GC, at which point in time we can sweep the table. With that extra indirection around table entries you mention, this would allow us to avoid taking the lock in the destructor, and not having to worry about someone else racing to incref from a freash table lookup while we are freeing the chars. Something like this:

Thread 1                         Thread 2
--------------------------------------------------------------
take table's lock                ~SharedImmutableString()
find boxed table entry           decref boxed table entry to 0
incref boxed table entry to 1    free(entry's chars)
UAF

We have to guard against this situation if the destructor is not going to take the table's lock and wants to free the chars immediately upon refcount == 0. My suggestion is to relax a little and let refcount == 0 entries persist until the next gc sweep.
My suggestion would allow entries to revive from the dead, but if that happens it is probably better than re-inserting that string into the cache, which might potentially require a copy.
(In reply to Nick Fitzgerald [:fitzgen] [⏰PDT; UTC-7] from comment #19)
> (In reply to Luke Wagner [:luke] from comment #17)
> > Poking around just now, I was pretty surprised to see that
> > SharedImmutableString, which is returned by value by getOrCreate(), has a
> > destructor that takes a lock and does a set lookup that does an O(n) memcmp.
> > It seems like we'd want to do at most one O(n) op ever on a string (the
> > initial de-dup), and then, ideally, outside of the lock.
> 
> We already do the hashing without holding the lock, FWIW.
> 
> I don't see how we can avoid an O(n) memcmp in the `match` method of the
> hash policy.

Rather than doing strange things to optimize the hash policy, I think it makes more sense to just avoid lookups. We should be able to get by with only one.

> (In reply to Jim Blandy :jimb from comment #18)
> > At present, the hash table stores StringBoxes directly as its elements.
> > Since the table can be resized, we can't hold pointers to StringBoxes live
> > across insertions or deletions. But if the hash table stored pointers to
> > StringBoxes, then SharedImmutableString could point to a StringBox, and we
> > wouldn't need to hash to find the reference count.
> > 
> > However, when the reference count drops to zero, we must remove the entry
> > from the hash table. If we want to do that in the destructor, there's no way
> > to avoid hashing at that point. However, iterating over the entire table
> > provides another way to remove entries without hashing. So perhaps we could
> > free the text when the reference count drops to zero, and mark the StringBox
> > as "dead" (perhaps simply by nulling out the pointer to the now-freed text);
> > and then let major GCs iterate through the table and remove "dead"
> > StringBoxes.
> 
> Simpler than that would be not freeing the text eagerly on refcount == 0,
> and to let the string float in the cache until the next GC, at which point
> in time we can sweep the table. With that extra indirection around table
> entries you mention, this would allow us to avoid taking the lock in the
> destructor, and not having to worry about someone else racing to incref from
> a fresh table lookup while we are freeing the chars.

I think we should keep the locking regime simple for as long as we can. With the approach I suggested, destroying a SharedImmutableString won't require any hashes; the table's lock will only be held across some reference-counting.

It seems to me freeing the string when the refcount hits zero should be better than waiting for a GC; after all, it's probably a GC that dropped the refcount in the first place, so we'll usually be keeping it around for a full major GC cycle. But that question is less important than keeping the locking simple.
This commit changes SharedImmutableString from holding its cache's entry's chars
and length directly, to holding a pointer to its cache's entry. This allows us
to avoid a table lookup and the full string hashing that entails in the
destructor.
Attachment #8752957 - Flags: review?(jimb)
Rebased on top of the other two new patches. Carrying r+.
Attachment #8752958 - Flags: review+
Attachment #8750434 - Attachment is obsolete: true
Try push:

remote: Follow the progress of your build on Treeherder:
remote:   https://treeherder.mozilla.org/#/jobs?repo=try&revision=19fbe8c00a76
remote: 
remote: It looks like this try push has talos jobs. Compare performance against a baseline revision:
remote:   https://treeherder.mozilla.org/perf.html#/comparechooser?newProject=try&newRevision=19fbe8c00a76
Comment on attachment 8752955 [details] [diff] [review]
Part 0: Add an extra indirection around entries in the SharedImmutableStringsCache

Review of attachment 8752955 [details] [diff] [review]:
-----------------------------------------------------------------

Slow, inaccurate neural-net-based type checker says, "Righto, this looks a-okay!"

::: js/src/vm/SharedImmutableStringsCache.h
@@ +145,2 @@
>          for (auto r = locked->set.all(); !r.empty(); r.popFront())
> +            n += mallocSizeOf(r.front().get()) + mallocSizeOf(r.front()->chars());

The slightly more idiomatic thing would be to give StringBox its own sizeOfIncludingThis method, but this is fine.

https://www.getyarn.io/yarn-clip/8f97db6f-f731-4bd6-aef8-ea02014e0b2d

@@ +223,4 @@
>          StringBox(OwnedChars&& chars, size_t length)
> +            : chars_(mozilla::Move(chars))
> +            , length_(length)
> +            , refcount(0)

Did you mean to change the indentation here?
Attachment #8752955 - Flags: review?(jimb) → review+
Attachment #8752955 - Attachment is obsolete: true
Update to only purge shared immutable strings if the table has been initialized
and if it exists (there are JSAPI tests that create, but do not init the
JSRuntime, and then destroy it, which happened to trigger assertion failures
when we trie to grab the cache).
Attachment #8753083 - Flags: review?(jimb)
Attachment #8752957 - Attachment is obsolete: true
Attachment #8752957 - Flags: review?(jimb)
New try push:

remote: Follow the progress of your build on Treeherder:
remote:   https://treeherder.mozilla.org/#/jobs?repo=try&revision=4240867fea1c
remote: 
remote: It looks like this try push has talos jobs. Compare performance against a baseline revision:
remote:   https://treeherder.mozilla.org/perf.html#/comparechooser?newProject=try&newRevision=4240867fea1c
Note/TODO for myself: ScriptSourceTask should take a handle on the shared immutable strings cache and do the lookup/de-dupe off thread, rather than always doing this on the main thread when we call setCompressedSource.
I'm starting to lose my sense of this code again, so I was looking through it and came across this:

SharedImmutableString
SharedImmutableString::clone() const
{
    auto clone = cache_.getOrCreate(chars(), length(), [&]() {
        MOZ_CRASH("Should not need to create an owned version, as this string is already in "
                  "the cache");
        return nullptr;
    });
    MOZ_ASSERT(clone.isSome());
    return SharedImmutableString(mozilla::Move(*clone));
}

What's the reason for not simply writing:

    return mozilla::Move(*clone);

This compiles (on my machine). The current code seems like a needless move construction.
Comment on attachment 8752957 [details] [diff] [review]
Part 1: SharedImmutableString should hold a pointer to the cache's entry

Review of attachment 8752957 [details] [diff] [review]:
-----------------------------------------------------------------

I still feel like I don't have a grasp on this, so I'll just post what I've got so far...

If SharedImmutableString is going to point directly to StringBoxes, then we need to do something with StringBox's move constructor. At the very least, it should now assert that the reference count is zero, as those are the only circumstances under which it's safe to move them: if there are any SharedImmutableStrings pointing to the source of the move, we're not fixing them up to point to the move's destination, so moving the reference count over is just wrong. But since you're boxing the StringBoxes now (as Juvenal once asked, "Quis boxiet ipsos StringBoxes?" Now we know) perhaps we don't need any move constructor for StringBoxes at all.

::: js/src/threading/ExclusiveData.h
@@ +166,5 @@
>  
>          operator T& () const { return get(); }
>          T* operator->() const { return &get(); }
>  
> +        const ExclusiveData<T>* data() const {

The name `data` is surprising to me: it could refer to either the guard's referent, or to the Exclusive*Data* from which that guard was taken. Would the name `parent` be less ambiguous?

::: js/src/vm/SharedImmutableStringsCache.h
@@ +187,5 @@
>       */
>      SharedImmutableStringsCache clone() {
>          MOZ_ASSERT(inner_);
>          auto locked = inner_->lock();
> +        return Clone(locked);

I'm probably missing something, but this method `SharedImmutableStringsCache::clone` seems unused. Could we just delete it?

@@ +414,5 @@
>       * resulting pointer while this `SharedImmutableString` exists.
>       */
>      const char* chars() const {
> +        MOZ_ASSERT(box_);
> +        MOZ_ASSERT(box_->refcount > 0);

Since the reference count is greater than zero, we should be able to assert that its chars pointer is non-null too, right?

@@ +423,5 @@
>       * Get the length of the underlying string.
>       */
>      size_t length() const {
> +        MOZ_ASSERT(box_);
> +        MOZ_ASSERT(box_->refcount > 0);

similarly here
Attachment #8752957 - Attachment is obsolete: false
(In reply to Jim Blandy :jimb from comment #32)

In retrospect, this would best be written `return clone.value();`.
(In reply to Jim Blandy :jimb from comment #33)
> perhaps we don't
> need any move constructor for StringBoxes at all.

Good call.

> ::: js/src/threading/ExclusiveData.h
> @@ +166,5 @@
> >  
> >          operator T& () const { return get(); }
> >          T* operator->() const { return &get(); }
> >  
> > +        const ExclusiveData<T>* data() const {
> 
> The name `data` is surprising to me: it could refer to either the guard's
> referent, or to the Exclusive*Data* from which that guard was taken. Would
> the name `parent` be less ambiguous?

Sure.

> ::: js/src/vm/SharedImmutableStringsCache.h
> @@ +187,5 @@
> >       */
> >      SharedImmutableStringsCache clone() {
> >          MOZ_ASSERT(inner_);
> >          auto locked = inner_->lock();
> > +        return Clone(locked);
> 
> I'm probably missing something, but this method
> `SharedImmutableStringsCache::clone` seems unused. Could we just delete it?

Huh, I thought it was in use but after some DXR'ing, it seems it is not. Let's kill it!

> @@ +414,5 @@
> >       * resulting pointer while this `SharedImmutableString` exists.
> >       */
> >      const char* chars() const {
> > +        MOZ_ASSERT(box_);
> > +        MOZ_ASSERT(box_->refcount > 0);
> 
> Since the reference count is greater than zero, we should be able to assert
> that its chars pointer is non-null too, right?

Good call.
Addressed review comments.
Attachment #8753401 - Flags: review?(jimb)
Attachment #8752957 - Attachment is obsolete: true
Try push:

remote: Follow the progress of your build on Treeherder:
remote:   https://treeherder.mozilla.org/#/jobs?repo=try&revision=b297786dff0c
remote: 
remote: It looks like this try push has talos jobs. Compare performance against a baseline revision:
remote:   https://treeherder.mozilla.org/perf.html#/comparechooser?newProject=try&newRevision=b297786dff0c
Accidentally ended up with multiples of this version, unsure which is most up to
date, so here is the most up to date one.
Attachment #8753442 - Flags: review?(jimb)
Attachment #8753083 - Attachment is obsolete: true
Attachment #8753401 - Attachment is obsolete: true
Attachment #8753083 - Flags: review?(jimb)
Attachment #8753401 - Flags: review?(jimb)
Fixes an assertion that came up from the try push. It was asserting that an
ExclusiveContext had access to a compartment and zone when we did
`cx->zone()->runtimeFromAnyThread()->sharedImmutableStrings()`. But we don't
really want the zone or the runtime, we just needed a way to get the
SharedImmutableStringsCache from the ExclusiveContext so I just added a getter
next to the getters for the other thread-safe structures on the JSRuntime.
Attachment #8753492 - Flags: review?(jimb)
Attachment #8753438 - Attachment is obsolete: true
Attachment #8753438 - Flags: review?(jimb)
Small fix: only forget() the existing pointer when the realloc succeeded and we
are replacing it with the newly realloc'd one. Old patch would always forget the
existing pointer, so if realloc failed, we would end up with a nullptr.
Attachment #8753440 - Attachment is obsolete: true
Attachment #8753440 - Flags: review?(jimb)
YATP

remote: Follow the progress of your build on Treeherder:
remote:   https://treeherder.mozilla.org/#/jobs?repo=try&revision=cc206c57001e
remote: 
remote: It looks like this try push has talos jobs. Compare performance against a baseline revision:
remote:   https://treeherder.mozilla.org/perf.html#/comparechooser?newProject=try&newRevision=cc206c57001e
Comment on attachment 8753442 [details] [diff] [review]
Part 1: SharedImmutableString should hold a pointer to the cache's entry

Review of attachment 8753442 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/vm/SharedImmutableStringsCache.cpp
@@ +30,5 @@
>  #endif
>  {
>      MOZ_ASSERT(this != &rhs, "self move not allowed");
> +    MOZ_ASSERT(rhs.box_);
> +    MOZ_ASSERT(rhs.box_->refcount > 0);

Since this is a public move constructor, we don't have any reason to think the cache is locked at this point, so this assertion (and I think others like it), are unsynchronized references to shared data. The patch should take the lock, or omit the assertions.
Attachment #8753442 - Flags: review?(jimb) → review+
Comment on attachment 8753084 [details] [diff] [review]
Part 3: Stop doing DEBUG-only hashing to catch incorrect mutations to SharedImmutableString's chars

Review of attachment 8753084 [details] [diff] [review]:
-----------------------------------------------------------------

Sad to see it go. It was a great hack.
Attachment #8753084 - Flags: review?(jimb) → review+
Blocks: 1273693
Comment on attachment 8753492 [details] [diff] [review]
Part 4: Deduplicate the compressed string in the helper thread, not on the main thread

Review of attachment 8753492 [details] [diff] [review]:
-----------------------------------------------------------------

If I'm reading right, you've deleted all uses of the de-duping overload of ScriptSource::setCompressedSource, and that can be deleted.

It seems like the changes to SourceCompressionTask mean that SourceCompressionTask::compressed isn't really needed any more. Filed follow-up bug 1273693.

::: js/src/jsscript.cpp
@@ +1989,5 @@
>      return true;
>  }
>  
>  void
> +ScriptSource::setCompressedSource(SharedImmutableString&& raw, size_t sourceLength)

Both the compressed and uncompressed are "source", per se. If you're going to rename, why not "uncompressedLength"? It's a mouthful, but it's unambiguous.
Attachment #8753492 - Flags: review?(jimb) → review+
Attachment #8753441 - Flags: review?(jimb) → review+
Comment on attachment 8753494 [details] [diff] [review]
Part 5: SourceCompressionTask should use UniquePtr rather than raw pointers

Review of attachment 8753494 [details] [diff] [review]:
-----------------------------------------------------------------

I guess bug 1273693 becomes even easier after this patch...

::: js/src/jsscript.cpp
@@ +2101,4 @@
>  
>      // Shrink the buffer to the size of the compressed data.
> +    if (auto newCompressed = static_cast<char*>(js_realloc(compressed.get(), compressedBytes))) {
> +        mozilla::Unused << compressed.release();

I see why this is required, but it's pretty weird-looking. Could we just have a simple comment, like:

// If the realloc succeeded, compressed is now a freed pointer.

(I'm surprised UniquePtr doesn't have any better way to cope with realloc than this, but I didn't see one.)
Attachment #8753494 - Flags: review+
Attachment #8753492 - Attachment is obsolete: true
Comment on attachment 8752958 [details] [diff] [review]
Part 2: Make the UncompressedSourceCache use the shared, immutable strings infrastructure

Although this patch was a very nice simplification of the uncompressed source cache, it turned out to be too much of a performance regression, and isn't worth pursuing.
Attachment #8752958 - Flags: checkin-
All patches that I intend to land here, have landed.
Status: ASSIGNED → RESOLVED
Closed: 8 years ago
Keywords: leave-open
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: