Closed Bug 1358761 Opened 3 years ago Closed 3 years ago

Consider replacing nsPurpleBuffer with SegmentedVector

Categories

(Core :: XPCOM, enhancement)

enhancement
Not set

Tracking

()

RESOLVED FIXED
mozilla55
Tracking Status
firefox55 --- fixed

People

(Reporter: smaug, Assigned: smaug)

Details

Attachments

(2 files, 6 obsolete files)

Profiler seems to hint that maintaining the free list takes some time. I guess mostly because it causes memory locality issues.

I wonder if we could use SegmentedVector, and perhaps track how many free entries there are in middle and do compaction when needed (basically create a new SegmentedVector and copy all non-empty entries there).
This sounds like a good idea. Also, having less weird data structures would be nice.

In-place compaction could be done cheaply any time we're already scanning the purple buffer: if you find an empty entry while scanning forward, remove entries from the end until you find a non empty one, and put it in the spot.

Compaction to a new vector could be done incrementally.
Attached patch patch (obsolete) — Splinter Review
I had something like this in mind. 
May not improve performance much, but improves code readability, IMO. 

Let's see what tryserver says
https://treeherder.mozilla.org/#/jobs?repo=try&revision=f43ec7490746c27a1233e65896bebf365c7a9bde
Assignee: nobody → bugs
Attached patch v2 (obsolete) — Splinter Review
Compact always, it is after all O(n) and done while iterating the vector anyhow.
One could say that SelectPointers doesn't need to compact, but this just keeps the code simpler.

Compacting always should spread the time spent in compacting to many RemoveSkippable calls.


https://treeherder.mozilla.org/#/jobs?repo=try&revision=2a716e04af308ea9a4f0961bb3b8f6f140fe24e1


Feedback-ish review request.
Overall I think this makes the code quite a bit easier to understand.
Attachment #8862182 - Attachment is obsolete: true
Attachment #8862417 - Flags: review?(continuation)
Attachment #8862417 - Attachment is obsolete: true
Attachment #8862417 - Flags: review?(continuation)
Looks like SegmentedVector has a buggy assert. Removing it...
Comment on attachment 8862417 [details] [diff] [review]
v2

Review of attachment 8862417 [details] [diff] [review]:
-----------------------------------------------------------------

Did you do any measurements to make sure this doesn't take a ton of time somehow?

::: mfbt/SegmentedVector.h
@@ +314,5 @@
>    };
>  
>    IterImpl Iter() { return IterImpl(this); }
>  
> +  class PrevIterImpl

Maybe RevIter instead of PrevIter? This is a reverse iterator, right? You should also get review from froydnj or somebody for this part of the patch.

@@ +323,5 @@
> +    size_t mIndex;
> +
> +    explicit PrevIterImpl(SegmentedVector* aVector)
> +      : mSegment(aVector->mSegments.getLast())
> +      , mIndex(mSegment ? mSegment->Length() - 1 : 0)

This will underflow if the last segment is empty. I don't know if it is okay to assume that the last segment will always be non-empty.

@@ +347,5 @@
> +      MOZ_ASSERT(!Done());
> +      if (mIndex == 0) {
> +        mSegment = mSegment->getPrevious();
> +        if (mSegment) {
> +          mIndex = mSegment->Length() - 1;

I guess we can assume that the length of a segment that isn't at the end is non-zero.

::: xpcom/base/nsCycleCollector.cpp
@@ +978,5 @@
>  struct nsPurpleBufferEntry
>  {
> +  nsPurpleBufferEntry(void* aObject, nsCycleCollectingAutoRefCnt* aRefCnt,
> +                      nsCycleCollectionParticipant* aParticipant)
> +  : mObject(aObject), mRefCnt(aRefCnt), mParticipant(aParticipant) {}

nit: {} on new line. I think the initializer list is supposed to be indented, too.

@@ +999,5 @@
> +  // - On 32-bit platforms sizeof(nsPurpleBufferEntry) is 12, so mEntries
> +  //   is 16,380 bytes.
> +  // - On 64-bit platforms sizeof(nsPurpleBufferEntry) is 24, so mEntries
> +  //   is 32,544 bytes.
> +  static const uint32_t kEntriesPerSegment = 1365;

You should have a static assert for SegmentedVector::SegmentImpl here, like the old code had. This would have caught that your size is wrong: a segment has a field mLength, so you'll overflow into the next bucket size, and waste a lot of space.

Maybe it would be sufficient to do mEntries(kEntriesPerSegment) in the ctor?

@@ +1006,5 @@
> +  SegmentedVector<nsPurpleBufferEntry, kSegmentSize, InfallibleAllocPolicy>
> +    mEntries;
> +
> +public:
> +  nsPurpleBuffer() : mCount(0)

nit: initializer list on new line

@@ -1018,5 @@
> -    // Put all the entries in the block on the free list.
> -    void InitNextPointers()
> -    {
> -      for (uint32_t i = 1; i < ArrayLength(mEntries); ++i) {
> -        mEntries[i - 1].mNextInFreeList =

It is great to get rid of all this weird code!

@@ +1016,4 @@
>    }
>  
>    template<class PurpleVisitor>
>    void VisitEntries(PurpleVisitor& aVisitor)

Please add a comment saying that this compacts while it visits.

@@ +1026,5 @@
> +      if (e.mObject) {
> +        ++newLength;
> +        aVisitor.Visit(*this, &e);
> +      } else {
> +        // Try to find non-empty entry from the end of the vector.

nit: "find a non-empty entry"

@@ +1034,5 @@
> +            break;
> +          }
> +          if (otherEntry.mObject) {
> +            ++newLength;
> +            std::swap(e.mObject, otherEntry.mObject);

I think you should define a copy constructor for nsPurpleBufferEntry, then you could do e = otherEntry. That would be one line, plus you'd save writing the empty entry to otherEntry. ;) Given that the dtor for entries doesn't do anything.

@@ +1036,5 @@
> +          if (otherEntry.mObject) {
> +            ++newLength;
> +            std::swap(e.mObject, otherEntry.mObject);
> +            std::swap(e.mRefCnt, otherEntry.mRefCnt);
> +            std::swap(e.mParticipant, otherEntry.mParticipant);

Maybe reorder these so that "++newLength;" and "aVisitor.Visit(*this, &e);" are last, so that it matches the (e.mObject) case above, to make it clearer that we're just doing the normal visit then.

@@ +1066,5 @@
>    {
>      void
>      Visit(nsPurpleBuffer& aBuffer, nsPurpleBufferEntry* aEntry)
>      {
>        if (aEntry->mRefCnt) {

One thing you should consider doing is defining a copy constructor like I mentioned above, then make a nsPurpleBufferEntry dtor that is
if (mRefCnt) {
  mRefCnt->RemoveFromPurpleBuffer();
}
You'd have to stick with a swap in VisitEntries above (so we don't do anything when we pop) but then you could get rid of UnmarkRemainingPurple entirely. The drawback would be a dtor with a well-predicted branch. I don't know what effect that would have on speed.

@@ +1075,5 @@
>        --aBuffer.mCount;
>      }
>    };
>  
> +  void UnmarkRemainingPurple()

You should inline this method now that there is only a single caller.

@@ +1099,4 @@
>    MOZ_ALWAYS_INLINE void Put(void* aObject, nsCycleCollectionParticipant* aCp,
>                               nsCycleCollectingAutoRefCnt* aRefCnt)
>    {
> +    Unused << mEntries.Append(nsPurpleBufferEntry(aObject, aRefCnt, aCp));

I do worry a little that we'll need to compact more often sometimes. I guess that can only happen if we're creating and destroying a lot of CCed objects, and in that case we're not actually destroying them until we run FreeSnowWhite, which will do compaction, so I guess the purple buffer size won't be the major problem for memory usage.

@@ +1106,5 @@
>    void Remove(nsPurpleBufferEntry* aEntry)
>    {
>      MOZ_ASSERT(mCount != 0, "must have entries");
>  
>      if (aEntry->mRefCnt) {

Maybe put all of this into a Clear() method on nsPurpleBufferEntry? You could also use it above.

@@ +1122,5 @@
>    }
>  
>    size_t SizeOfExcludingThis(MallocSizeOf aMallocSizeOf) const
>    {
> +    return mEntries.SizeOfIncludingThis(aMallocSizeOf);

This should be SizeOfExcludingThis, because |this| of mEntries is stored within |this| of the purple buffer. I think.

@@ +1161,5 @@
>    VisitEntries(visitor);
>  
>    NS_ASSERTION(mCount == 0, "AddPurpleRoot failed");
>    if (mCount == 0) {
>      FreeBlocks();

I do wonder if it would be a good idea to allocate a segment right away. We just freed a bunch of them, so jemalloc can recycle an existing one, and we know we're probably going to need another one right away anyways.
Attachment #8862417 - Attachment is obsolete: false
Attachment #8862417 - Flags: feedback+
Comment on attachment 8862459 [details] [diff] [review]
v3

I guess I'll wait for v4. ;)
Attachment #8862459 - Flags: review?(continuation)
(In reply to Andrew McCreight [:mccr8] from comment #7)
> Comment on attachment 8862417 [details] [diff] [review]
> v2
> 
> Review of attachment 8862417 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> Did you do any measurements to make sure this doesn't take a ton of time
> somehow?
I did profile this. And also I don't see anything here which could take lots of time.


> > +  class PrevIterImpl
> 
> Maybe RevIter instead of PrevIter? This is a reverse iterator, right? You
> should also get review from froydnj or somebody for this part of the patch.
Yeah, I was wondering a good name. 


> > +    size_t mIndex;
> > +
> > +    explicit PrevIterImpl(SegmentedVector* aVector)
> > +      : mSegment(aVector->mSegments.getLast())
> > +      , mIndex(mSegment ? mSegment->Length() - 1 : 0)
> 
> This will underflow if the last segment is empty.
it can't be empty

> I guess we can assume that the length of a segment that isn't at the end is
> non-zero.
yes. All of segmented vector code relies on that


> @@ +999,5 @@
> > +  // - On 32-bit platforms sizeof(nsPurpleBufferEntry) is 12, so mEntries
> > +  //   is 16,380 bytes.
> > +  // - On 64-bit platforms sizeof(nsPurpleBufferEntry) is 24, so mEntries
> > +  //   is 32,544 bytes.
> > +  static const uint32_t kEntriesPerSegment = 1365;
> 
> You should have a static assert for SegmentedVector::SegmentImpl here, like
> the old code had. This would have caught that your size is wrong: a segment
> has a field mLength, so you'll overflow into the next bucket size, and waste
> a lot of space.
SegmentedVector itself tries to round to a good size

> 
> Maybe it would be sufficient to do mEntries(kEntriesPerSegment) in the ctor?
I don't know what you mean with this.
Or perhaps you didn't mean kEntriesPerSegment here, but the actual segment size


> @@ +1034,5 @@
> > +            break;
> > +          }
> > +          if (otherEntry.mObject) {
> > +            ++newLength;
> > +            std::swap(e.mObject, otherEntry.mObject);
> 
> I think you should define a copy constructor for nsPurpleBufferEntry, then
> you could do e = otherEntry. That would be one line, plus you'd save writing
> the empty entry to otherEntry. ;) Given that the dtor for entries doesn't do
> anything.
I had copy-ctor, but oddly enough it actually showed up in profiles. Not here, but when appending entry mEntries.
Profiler might just give wrong information, or implicit copy-ctor is just faster.


> One thing you should consider doing is defining a copy constructor like I
> mentioned above, then make a nsPurpleBufferEntry dtor that is
> if (mRefCnt) {
>   mRefCnt->RemoveFromPurpleBuffer();
> }
> You'd have to stick with a swap in VisitEntries above (so we don't do
> anything when we pop) but then you could get rid of UnmarkRemainingPurple
> entirely. The drawback would be a dtor with a well-predicted branch. I don't
> know what effect that would have on speed.
In general the code which needs to be particularly fast is adding anything to purple buffer.

But I guess I like the idea putting RemoveFromPurpleBuffer call to dtor.


> @@ +1099,4 @@
> >    MOZ_ALWAYS_INLINE void Put(void* aObject, nsCycleCollectionParticipant* aCp,
> >                               nsCycleCollectingAutoRefCnt* aRefCnt)
> >    {
> > +    Unused << mEntries.Append(nsPurpleBufferEntry(aObject, aRefCnt, aCp));
> 
> I do worry a little that we'll need to compact more often sometimes. I guess
> that can only happen if we're creating and destroying a lot of CCed objects,
> and in that case we're not actually destroying them until we run
> FreeSnowWhite, which will do compaction, so I guess the purple buffer size
> won't be the major problem for memory usage.
I tried some very memory heavy cases where we already crash easily because of OOM.
And currently we can easily keep too many PurpleBlocks in memory, because they can be fragmented.
Can't see this stuff in profiles.


> @@ +1122,5 @@
> >    }
> >  
> >    size_t SizeOfExcludingThis(MallocSizeOf aMallocSizeOf) const
> >    {
> > +    return mEntries.SizeOfIncludingThis(aMallocSizeOf);
> 
> This should be SizeOfExcludingThis, because |this| of mEntries is stored
> within |this| of the purple buffer. I think.
Hmm, I guess so

> I do wonder if it would be a good idea to allocate a segment right away. We
> just freed a bunch of them, so jemalloc can recycle an existing one, and we
> know we're probably going to need another one right away anyways.
so we'd push an empty entry to purple buffer or something. Not sure it is worth.

In general a thing which shows up in profiles is that the bucket sizes are too small.
They used to be larger, but then for some reason were shrink a bit.
(In reply to Andrew McCreight [:mccr8] from comment #7)
 
> You should have a static assert for SegmentedVector::SegmentImpl here, like
> the old code had. This would have caught that your size is wrong: a segment
> has a field mLength, so you'll overflow into the next bucket size, and waste
> a lot of space.
No I don't. SegmentedVector takes care of that. Usually you pass something like 4096 there, and it ensure that
the segment size is sane.
> @@ +1106,5 @@
> >    void Remove(nsPurpleBufferEntry* aEntry)
> >    {
> >      MOZ_ASSERT(mCount != 0, "must have entries");
> >  
> >      if (aEntry->mRefCnt) {
> 
> Maybe put all of this into a Clear() method on nsPurpleBufferEntry? You
> could also use it above.
Not sure what you mean with this. I do want to keep the --mCount
(In reply to Olli Pettay [:smaug] from comment #9)
> I had copy-ctor, but oddly enough it actually showed up in profiles. Not
> here, but when appending entry mEntries.
> Profiler might just give wrong information, or implicit copy-ctor is just
> faster.

That seems weird. Anyways, my main point was really just that you should use a copy ctor here rather than manually copying every field.

> But I guess I like the idea putting RemoveFromPurpleBuffer call to dtor.

It would be cleaner. I just hope it doesn't show up in profiles.

> I tried some very memory heavy cases where we already crash easily because
> of OOM.
> And currently we can easily keep too many PurpleBlocks in memory, because
> they can be fragmented.

You can still end up with a lot of PurpleBlocks in memory, but the number of them will never exceed the amount needed to hold the peak number of purple buffer entries, while with your patch it could go even higher. But if you didn't see this when you were stress testing it, then that's fine.
Attached patch v5 (obsolete) — Splinter Review
https://treeherder.mozilla.org/#/jobs?repo=try&revision=7d3e1741d0afc59f8f10b84efc7b13a62a46f5ed
Attachment #8862417 - Attachment is obsolete: true
Attachment #8862459 - Attachment is obsolete: true
Attachment #8862595 - Flags: review?(continuation)
Comment on attachment 8862595 [details] [diff] [review]
v5

Review of attachment 8862595 [details] [diff] [review]:
-----------------------------------------------------------------

Looks good! r=me for the CC changes.

::: xpcom/base/nsCycleCollector.cpp
@@ +1007,5 @@
> +      mRefCnt->RemoveFromPurpleBuffer();
> +    }
> +  }
> +
> +  void* mObject;

Maybe get rid of the spaces between the field declarations? That feels a little weird to me.

@@ +1050,3 @@
>    }
>  
> +  // This methods compacts mEntries.

nit: "This method".

@@ +1118,5 @@
>    void Remove(nsPurpleBufferEntry* aEntry)
>    {
>      MOZ_ASSERT(mCount != 0, "must have entries");
>  
>      if (aEntry->mRefCnt) {

Could you make this into a Clear() method? It is a little ugly to have aEntry-> all over the place.
Comment on attachment 8862595 [details] [diff] [review]
v5

Nathan, could you look at the change to SegmentedVector.h.
(1) I need a reverse iterator, (2) typedef is made public so that other code can assert reasonable size, (3) that one assert is just bogus
Attachment #8862595 - Flags: review?(nfroyd)
Comment on attachment 8862595 [details] [diff] [review]
v5

Oops, I meant to set r+.
Attachment #8862595 - Flags: review?(continuation) → review+
Attached patch v6 (obsolete) — Splinter Review
Attachment #8862595 - Attachment is obsolete: true
Attachment #8862595 - Flags: review?(nfroyd)
Attachment #8862665 - Flags: review?(nfroyd)
Looks like I can't use the static assert easily, the sizeof varies quite a bit between platforms.
Or I need to add some other valid values too.
Comment on attachment 8862665 [details] [diff] [review]
v6

Review of attachment 8862665 [details] [diff] [review]:
-----------------------------------------------------------------

r=me with the question below answered.

::: mfbt/SegmentedVector.h
@@ -259,5 @@
>      // Handle the case where the last segment contains more elements
>      // than we want to pop.
>      MOZ_ASSERT(last);
>      MOZ_ASSERT(last == mSegments.getLast());
> -    MOZ_ASSERT(aNumElements != 0);

Why are you removing this?  If aNumElements == 0, that case should have been handled inside the loop before falling through to here.

Are you calling this method with aNumElements == 0?

@@ +323,5 @@
> +
> +    explicit RevIterImpl(SegmentedVector* aVector)
> +      : mSegment(aVector->mSegments.getLast())
> +      , mIndex(mSegment ? mSegment->Length() - 1 : 0)
> +    {}

Perhaps:

  MOZ_ASSERT_IF(mSegment, mSegment->Length() > 0);

in the body would be a good idea.
Attachment #8862665 - Flags: review?(nfroyd) → review+
(In reply to Nathan Froyd [:froydnj] from comment #19)

> >      MOZ_ASSERT(last == mSegments.getLast());
> > -    MOZ_ASSERT(aNumElements != 0);
> 
> Why are you removing this?  If aNumElements == 0, that case should have been
> handled inside the loop before falling through to here.
How so. If the vector isn't empty but aNumElements == 0 is passed as param, we enter here.

> 
> Are you calling this method with aNumElements == 0?
Yes. I'm just passing (oldLength - newLength), which can be 0.


>   MOZ_ASSERT_IF(mSegment, mSegment->Length() > 0);
> 
> in the body would be a good idea.
ok
Crap, I hadn't taken account that CanSkip may add new entries to the purple buffer.
The patch needs some tweaking.  New patch coming.
Attached patch v7 (obsolete) — Splinter Review
This needs new reviews.
SegmentedVector has now just one iterator which can be iterated back or forward up to beginning or end, and there is just different method to access it in a way which starts from the current last entry.
I need to be able to go forward form the last entry in case new entries are added.


In nsCycleCollector the changes are in VisitEntries. It now deals with the cases when Visit clears the entry, or if Visit adds new entries to mEntries.
I changed NS_ASSERTION to MOZ_ASSERT in nsPurpleBuffer::SelectPointers to catch issues easier.
Attachment #8862665 - Attachment is obsolete: true
Attachment #8863162 - Flags: review?(nfroyd)
Attachment #8863162 - Flags: review?(continuation)
Comment on attachment 8863162 [details] [diff] [review]
v7

Review of attachment 8863162 [details] [diff] [review]:
-----------------------------------------------------------------

Please add some tests for this new iterator functionality.
Attachment #8863162 - Flags: review?(nfroyd) → review-
This is getting pretty complex. How bad would be it be performance-wise to have VisitEntries simply create a new purple buffer from scratch every time? VisitEntries would do a single forward iteration, and copy everything that remains non-empty after a Visit into the new purple buffer. When it was done with the iteration, it would swap out the old one. Then you wouldn't need to iterate backwards at the same time as you went forward, and you wouldn't need to worry about entries created while iterating. I guess it comes down to how dense the purple buffer typically is.
Flags: needinfo?(bugs)
An issue is swapping. How to do that. Easy way would be to have a pointer to SegmentedVector and just swap pointers, but that would mean extra indirect access.
Other option is to change SegmentedVector to somehow support swapping.

Swapping also requires more memory usage, in worst cases it would double memory usage
(and malloc is slow).
Given that the less-memory-usage implementation is now there, I would stick with that.

"typically" is hard to say. Varies a lot currently. With the patch purple buffer becomes rather dense.
Flags: needinfo?(bugs)
Attached patch +testsSplinter Review
Attachment #8863870 - Flags: review?(nfroyd)
Attachment #8863870 - Flags: review?(continuation)
Attachment #8863162 - Attachment is obsolete: true
Attachment #8863162 - Flags: review?(continuation)
Comment on attachment 8863870 [details] [diff] [review]
+tests

Review of attachment 8863870 [details] [diff] [review]:
-----------------------------------------------------------------

r=me for the CC changes.

::: xpcom/base/nsCycleCollector.cpp
@@ +1002,5 @@
> +  }
> +
> +  void Clear()
> +  {
> +    mRefCnt->RemoveFromPurpleBuffer();

Is it really okay to get rid of the null check on mRefCnt?

@@ +1114,5 @@
> +
> +      // While visiting entries, some new ones were possibly added. This can
> +      // happen during CanSkip. Move all such new entries to be after other
> +      // entries. Note, we don't call Visit on newly added entries!
> +      if (&iterFromLastEntry.Get() != &mEntries.GetLast()) {

Should this just be !iterFromLastEntry.Done()?
Attachment #8863870 - Flags: review?(continuation) → review+
Version: 50 Branch → Trunk
(In reply to Andrew McCreight [:mccr8] from comment #27)
> > +  }
> > +
> > +  void Clear()
> > +  {
> > +    mRefCnt->RemoveFromPurpleBuffer();
> 
> Is it really okay to get rid of the null check on mRefCnt?
Yes

> 
> @@ +1114,5 @@
> > +
> > +      // While visiting entries, some new ones were possibly added. This can
> > +      // happen during CanSkip. Move all such new entries to be after other
> > +      // entries. Note, we don't call Visit on newly added entries!
> > +      if (&iterFromLastEntry.Get() != &mEntries.GetLast()) {
> 
> Should this just be !iterFromLastEntry.Done()?
No. iterFromLastEntry isn't initially "Done()". It points to the initial last entry.
Comment on attachment 8863870 [details] [diff] [review]
+tests

Review of attachment 8863870 [details] [diff] [review]:
-----------------------------------------------------------------

Thanks for the tests.

::: mfbt/SegmentedVector.h
@@ +274,5 @@
>    //    f(elem);
>    //  }
>    //
> +  // Note, adding new entries to the SegmentedVector while using iterators
> +  // is supported, but removing is not!

This statement seems like it requires a little qualification, as e.g. an iterator that is Done() won't magically become !Done() if you insert an element at the end.

::: mfbt/tests/TestSegmentedVector.cpp
@@ +276,5 @@
> +
> +  auto iter = v.Iter();
> +  auto iterFromLast = v.IterFromLast();
> +  MOZ_RELEASE_ASSERT(iter.Done());
> +  MOZ_RELEASE_ASSERT(iterFromLast.Done());

Should test somewhere that iterators over IterFromLast will give you the correct values in order, maybe in TestBasics().
Attachment #8863870 - Flags: review?(nfroyd) → review+
I'll add some more tests to the same testing function.
Pushed by opettay@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/f2217556ec39
replace PurpleBlock with SegmentedVector to reduce indirect memory accesses when calling suspect, r=mccr8,nfroyd
Pushed by opettay@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/0f95b12c5994
dummy handling for return values in test,  r=bustage
https://hg.mozilla.org/mozilla-central/rev/f2217556ec39
https://hg.mozilla.org/mozilla-central/rev/0f95b12c5994
Status: NEW → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla55
You need to log in before you can comment on or make changes to this bug.