Closed Bug 1331718 Opened 7 years ago Closed 7 years ago

Make DisplayItemData more readily available

Categories

(Core :: Web Painting, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla55
Performance Impact high
Tracking Status
firefox55 --- fixed

People

(Reporter: bas.schouten, Assigned: bas.schouten)

References

(Blocks 1 open bug)

Details

Attachments

(3 files, 1 obsolete file)

Currently we access DisplayItemData through the FramePropertyTable. We access this a lot and spend a considerable amount of our time doing hashtable lookups and iterating through arrays. Presumably also polluting the cache a lot in the process. We could possibly to better at the expense of some memory usage.
This patch makes DisplayItemData be stored in a pointer on nsIFrame. This does grow nsIFrame somewhat, but it leads to a 10-20% improvement in Layer Tree building times. I'm curious to know what people think.
Attachment #8827653 - Flags: feedback?(matt.woodrow)
David, let me know if there's any objections to growing nsIFrame, I realize this is a very real memory cost we're talking about. (I think 28 bytes per frame)
Flags: needinfo?(dbaron)
It seems like we could reduce the memory increase to 8 bytes per frame by instead storing a pointer to an array of DisplayItemData (stored in an arena). This would not add an additional indirection because AutoTArray already has the indirection to access it's inline storage.
I'm not at all concerned about what the increase in size **of nsIFrame** is -- weren't these objects just stored elsewhere before?  What's the net increase in size?  Are we using more or less memory than before?

(That said, if we're storing this data for every frame, it does feel like a substantial amount of storage.)
Flags: needinfo?(dbaron) → needinfo?(bas)
(In reply to David Baron :dbaron: ⌚️UTC-8 from comment #4)
> I'm not at all concerned about what the increase in size **of nsIFrame** is
> -- weren't these objects just stored elsewhere before?  What's the net
> increase in size?  Are we using more or less memory than before?
> 
> (That said, if we're storing this data for every frame, it does feel like a
> substantial amount of storage.)

Before we were storing them in the FramePropertyTable, so we weren't adding the pointers for frames that had never been displayed yet (or that simply don't get a display item).

(I'm using non-e10s numbers here as e10s numbers seem to be far noisier and harder to reason about)

This patch improves tp5 by about 2-5% (noisy data, hard to say for sure), regresses tp5_privatebytes by about 1MB, improves tp5_responsiveness by about 5%.

Most other talos suites seem somewhat unaffected, take all these numbers with a grain of salt obviously, Talos isn't the most reliable way of judging performance changes.

Locally I see an increase of 5-20% in layer tree building times depending on the exact workload.
Flags: needinfo?(bas)
(In reply to Jeff Muizelaar [:jrmuizel] from comment #3)
> It seems like we could reduce the memory increase to 8 bytes per frame by
> instead storing a pointer to an array of DisplayItemData (stored in an
> arena). This would not add an additional indirection because AutoTArray
> already has the indirection to access it's inline storage.

Wouldn't we be accessing a different part of the arena just to get to the pointer to DisplayItemData? I don't object terribly I'm just not superconvinced of the usefulness. The current patch is a little simpler. It would also require logic to do the arena allocation for the array on the first request of the DisplayItemData and it would require some logic to remove it as well, further fragmenting the Arena as well. (Although there's some stuff in the arena that should make fragmentation not be too bad)
(In reply to Bas Schouten (:bas.schouten) from comment #5)
> Before we were storing them in the FramePropertyTable, so we weren't adding
> the pointers for frames that had never been displayed yet (or that simply
> don't get a display item).

Aren't there a significant number of frames that don't have any display items, such as most frames without backgrounds, borders, etc., that aren't text/image/video/etc. frames?  (Although we'd give them a Background display item when building for event targeting, I don't think we would for painting, no?)  Would that be the primary source of the memory usage increase?

What is the tp5_privatebytes regression as a percentage?
I'm also curious if you have any numbers on what % of frames have display items.  (I'd guess 25% but varying substantially from site-to-site, but I'm really not sure.)

In general, we want to use frame properties for things that most frames don't have (and protect them with a state bit if they're hot), and member variables for things that most frames do have.  I'm not quite sure what the threshold for "most" is here ("most" is probably the wrong word since I'd put the threshold under 50%), but my gut feeling is this may well be over the threshold.  That said, if we're looking at something like 10%-25%, it might be worth putting in a little more effort to save space for the frames that don't have display items, while still avoiding the hashtable lookup, and seeing how that does on the space/size tradeoff.
Flags: needinfo?(bas)
(In reply to David Baron :dbaron: ⌚️UTC-8 from comment #7)
> (In reply to Bas Schouten (:bas.schouten) from comment #5)
> > Before we were storing them in the FramePropertyTable, so we weren't adding
> > the pointers for frames that had never been displayed yet (or that simply
> > don't get a display item).
> 
> Aren't there a significant number of frames that don't have any display
> items, such as most frames without backgrounds, borders, etc., that aren't
> text/image/video/etc. frames?  (Although we'd give them a Background display
> item when building for event targeting, I don't think we would for painting,
> no?)  Would that be the primary source of the memory usage increase?
> 
> What is the tp5_privatebytes regression as a percentage?

%-wise the privatebytes regression is roughly 1-2% as far as I can tell, but the data is quite noisy so it may be up to 3%.

And % wise, I think you're right, 25% probably isn't a horrible guess, then again, if the user loads a really long page and never scrolls (say 20 pages long with each page having very different content), it would never be more than 5%, so it's very content dependent.

I can get some hard numbers for you.
Flags: needinfo?(bas)
Comment on attachment 8827653 [details] [diff] [review]
Make DisplayItemData a pointer on nsIFrame

Review of attachment 8827653 [details] [diff] [review]:
-----------------------------------------------------------------

Code-wise this seems totally reasonable, modulo deciding if we can actually afford to use the extra memory.

Bug 1333918 might be interesting, as it looks like we're running out of memory for layout objects even without this extra increase.
Attachment #8827653 - Flags: feedback?(matt.woodrow)
I've investigated this, depending on the content the amount of frames that get DisplayItemData is between 10-25%. I'm not sure exactly what the cost of the DID when inside the FramePropertyTable is, but it stands to reason it's not 4-10x that of the direct attachment. So my best guess would be that there's some albeit small, but real net increase in memory here.
Blocks: QuantumFlow
One other option would be to put the FrameProperties on nsIFrame* instead of just DisplayItemData.

The memory regression would be the same, the win for DisplayItemData might be slightly less (as it adds another indirection), but we could get wins for other FrameProperty types (like overflow rects).
(In reply to Matt Woodrow (:mattwoodrow) from comment #12)
> One other option would be to put the FrameProperties on nsIFrame* instead of
> just DisplayItemData.

This is my preference.
Bas, this patch doesn't apply cleanly, and it looks like it expects DisplayItemData to be defined outside of FrameLayerBuilder, as mozilla::DisplayItemData. Did you have another patch before this that moves the definition of DisplayItemData? Could you attach it to this bug, please, if you still have it?
Flags: needinfo?(bas)
See bug 1330570.

(In reply to Matt Woodrow (:mattwoodrow) from comment #12)
> One other option would be to put the FrameProperties on nsIFrame* instead of
> just DisplayItemData.
> 
> The memory regression would be the same, the win for DisplayItemData might
> be slightly less (as it adds another indirection), but we could get wins for
> other FrameProperty types (like overflow rects).

It should be noted by far the main cost we're trying to avoid here is just the hash table lookup.
Flags: needinfo?(bas) → needinfo?(mstange)
Whiteboard: [qf]
Whiteboard: [qf] → [qf:p1]
Fwiw, I've done some renewed investigation into this and updated the patches to current central, at least for Matt's perf test case, this, in combination with bug 1330570, makes FLB about 15-20% faster. I think this is worth the cost. Markus, are you okay with putting these up for review and landing this stuff?
(In reply to Bas Schouten (:bas.schouten) from comment #15)
> See bug 1330570.
> 
> (In reply to Matt Woodrow (:mattwoodrow) from comment #12)
> > One other option would be to put the FrameProperties on nsIFrame* instead of
> > just DisplayItemData.
> > 
> > The memory regression would be the same, the win for DisplayItemData might
> > be slightly less (as it adds another indirection), but we could get wins for
> > other FrameProperty types (like overflow rects).
> 
> It should be noted by far the main cost we're trying to avoid here is just
> the hash table lookup.

The hash table lookup is from Frame* to FramePropertyTable. Putting a pointer to the FramePropertyTable avoids that cost. I think it's worth us comparing your current patch to adding a pointer to the frame property table as that will make all property lookups faster.
(In reply to Jeff Muizelaar [:jrmuizel] from comment #17)
> (In reply to Bas Schouten (:bas.schouten) from comment #15)
> > See bug 1330570.
> > 
> > (In reply to Matt Woodrow (:mattwoodrow) from comment #12)
> > > One other option would be to put the FrameProperties on nsIFrame* instead of
> > > just DisplayItemData.
> > > 
> > > The memory regression would be the same, the win for DisplayItemData might
> > > be slightly less (as it adds another indirection), but we could get wins for
> > > other FrameProperty types (like overflow rects).
> > 
> > It should be noted by far the main cost we're trying to avoid here is just
> > the hash table lookup.
> 
> The hash table lookup is from Frame* to FramePropertyTable. Putting a
> pointer to the FramePropertyTable avoids that cost. I think it's worth us
> comparing your current patch to adding a pointer to the frame property table
> as that will make all property lookups faster.

I will test this.
(In reply to Jeff Muizelaar [:jrmuizel] from comment #17)
> (In reply to Bas Schouten (:bas.schouten) from comment #15)
> > See bug 1330570.
> > 
> > (In reply to Matt Woodrow (:mattwoodrow) from comment #12)
> > > One other option would be to put the FrameProperties on nsIFrame* instead of
> > > just DisplayItemData.
> > > 
> > > The memory regression would be the same, the win for DisplayItemData might
> > > be slightly less (as it adds another indirection), but we could get wins for
> > > other FrameProperty types (like overflow rects).
> > 
> > It should be noted by far the main cost we're trying to avoid here is just
> > the hash table lookup.
> 
> The hash table lookup is from Frame* to FramePropertyTable. Putting a
> pointer to the FramePropertyTable avoids that cost. I think it's worth us
> comparing your current patch to adding a pointer to the frame property table
> as that will make all property lookups faster.

So the result here for my machine (Giving FLB time on Matt Woodrow's DisplayItemData patch):

Averages from several runs, take a as a rough indicator:

Base: 30.75ms
FP on Frame: 29.25ms
DID on Frame: 27.75ms

So the benefit of putting the DisplayItemData directly on the frame is roughly twice that of putting the FramePointer directly on the frame. It should be noted the FP patch only made frames 1x sizeof(ptrtype) bigger whereas the DID patch made it 4x sizeof(ptrtype) bigger (because AutoTArray doesn't try very hard to be small), I'm going to make a DID patch that makes it 2x sizeof(ptrtype) bigger to reduce memory increase.
Olli, for doing this with minimal overhead I was looking at storing -usually- a single pointer or nothing more memory efficiently (i.e. best I can come up with is 2x(sizeof(T*)) which avoids allocations when elemCount <= 2), allowing array-type behavior when things grow larger. I heard in #developers you needed something similar? Is this true, and if so do you want me to add a generic class along these lines that you can re-use?
Flags: needinfo?(bugs)
I needed (or still need) similar behavior what SmallVoidArray had (it was removed at some point [1]): memmoveable array with inline storage for one entry. Memmoveability being the key part here.


[1] https://bug819090.bmoattachments.org/attachment.cgi?id=8558931
Flags: needinfo?(bugs)
(In reply to Olli Pettay [:smaug] from comment #21)
> I needed (or still need) similar behavior what SmallVoidArray had (it was
> removed at some point [1]): memmoveable array with inline storage for one
> entry. Memmoveability being the key part here.
> 
> 
> [1] https://bug819090.bmoattachments.org/attachment.cgi?id=8558931

So my plan would be this:

SmallPointerArray<T>,

Inline storage for 2 pointers. But does -not- support nullptr.

Essentially contains two pointers:

When 1 is non-null the array is size 1. When 1 and 2 are non null the array is size 2. When 1 is null and 2 is non-null, ptr 2 points to an nsTArray<T> and the array has an arbitrary size.

I assume that would suit your usecase?
Flags: needinfo?(bugs)
SmallVoidArray setup would be nicer for my case, since by far the most common case the length of the array is 1. That setup isn't too complicated and does support nullptr.
Flags: needinfo?(bugs)
We have nsStyleAutoArray to allow memmoves of the style structs from servo, which seems similar to what you're describing, perhaps it could be reused?
(In reply to Olli Pettay [:smaug] from comment #23)
> SmallVoidArray setup would be nicer for my case, since by far the most
> common case the length of the array is 1. That setup isn't too complicated
> and does support nullptr.

Ah, sadly I need something more size-efficient and more performant so I guess out goals don't overlap sufficiently there. Oh well! I'll probably add something anyway and people can see if they use it, it's almost done anyway.
Attachment #8827653 - Attachment is obsolete: true
Comment on attachment 8863252 [details]
Bug 1331718 - Part 3: Store pointers to DisplayItemData directly on nsIFrame.

https://reviewboard.mozilla.org/r/135032/#review137974

::: layout/painting/FrameLayerBuilder.cpp:1961
(Diff revision 1)
> -  delete aArray;
>    sDestroyedFrame = nullptr;
>  }
>  
>  void
> +FrameLayerBuilder::DestroyRemainingDID(LayerManager* aLM)

What is this for? Seems like nothing, and it's not called.
Comment on attachment 8863252 [details]
Bug 1331718 - Part 3: Store pointers to DisplayItemData directly on nsIFrame.

https://reviewboard.mozilla.org/r/135032/#review137976

Looks good for FLB, but I've requested additional review from dbaron wrt the size change for nsIFrame.
Attachment #8863252 - Flags: review?(matt.woodrow) → review+
Comment on attachment 8863250 [details]
Bug 1331718 - Part 1: Add small pointer array.

Nathan's a more appropriate reviewer than me.
Attachment #8863250 - Flags: review?(jmuizelaar) → review?(nfroyd)
Attachment #8863251 - Flags: review?(jmuizelaar) → review?(nfroyd)
Comment on attachment 8863252 [details]
Bug 1331718 - Part 3: Store pointers to DisplayItemData directly on nsIFrame.

Looks like the dbaron review request never actually happened.
Attachment #8863252 - Flags: review?(dbaron)
Comment on attachment 8863250 [details]
Bug 1331718 - Part 1: Add small pointer array.

https://reviewboard.mozilla.org/r/135034/#review138058

::: mfbt/SmallPointerArray.h:17
(Diff revision 1)
> +#include <vector>
> +
> +namespace mozilla {
> +
> +// Array class specialized to hold a small number (1 or 2 elements) of pointers
> +// in a minimal amount of inline space. Can be grown larger if required.

This could use a better description of how it works and how big it is.
Attachment #8863250 - Flags: review?(jmuizelaar)
Attachment #8863250 - Flags: review?(jmuizelaar)
How does the support for ranged-for handle mutations to the array while iterating?
We had several security critical bugs with nsTArray when it was using pointers as iterator values.
Bug 1299489 changed it to use indexes.

Is the underlying std::vector safe with ranged-for? Or would it be better to use nsTArray?
(In reply to Olli Pettay [:smaug] from comment #34)
> How does the support for ranged-for handle mutations to the array while
> iterating?
> We had several security critical bugs with nsTArray when it was using
> pointers as iterator values.
> Bug 1299489 changed it to use indexes.
> 
> Is the underlying std::vector safe with ranged-for? Or would it be better to
> use nsTArray?

The underlying std::vector is almost certainly not safe with respect to ranged-for and mutation.
It looks like this patch is taking the approach of storing the DisplayItemData object (28 bytes, per comment 2) in nsIFrame.  Did you make any measurements of the alternative approach of storing a DisplayItemData* pointer directly in nsIFrame (probably allocating the object in the pres shell arena)?  It seems like that would get most (maybe nearly all) of the performance benefit with only a fraction of the memory overhead (on the assumption of 5%-25% of frames having display items -- no idea if those guesses above are actually accurate), assuming 8 bytes overhead for all frames plus an additional 28 bytes for those with display items.  (I'm assuming it would be null for frames without display items -- is that correct?)
Flags: needinfo?(jmuizelaar)
(Or is the stuff in the other patches shrinking the size of DisplayItemData?  Clearer commit messages saying what's being changed would be helpful here...)
(In reply to David Baron :dbaron: ⌚️UTC-7 from comment #36)
> It looks like this patch is taking the approach of storing the
> DisplayItemData object (28 bytes, per comment 2) in nsIFrame.  Did you make
> any measurements of the alternative approach of storing a DisplayItemData*
> pointer directly in nsIFrame (probably allocating the object in the pres
> shell arena)?  It seems like that would get most (maybe nearly all) of the
> performance benefit with only a fraction of the memory overhead (on the
> assumption of 5%-25% of frames having display items -- no idea if those
> guesses above are actually accurate), assuming 8 bytes overhead for all
> frames plus an additional 28 bytes for those with display items.  (I'm
> assuming it would be null for frames without display items -- is that
> correct?)

So, you'll notice the comment is from quite a long time ago. I've created a specialized class that only takes 2*sizeof(void*) for the new patch. This causes the increase in the size of nsIFrame to be only 8 bytes. Traditionally when we used FrameProperties we took about 24-32 bytes to store DisplayItemData depending on the platform. Since about 1/3rd of Frames in my measurements get DisplayItemData (on common websites), there should be no -overall- increase in size due to this change.
(In reply to Olli Pettay [:smaug] from comment #34)
> How does the support for ranged-for handle mutations to the array while
> iterating?
> We had several security critical bugs with nsTArray when it was using
> pointers as iterator values.
> Bug 1299489 changed it to use indexes.
> 
> Is the underlying std::vector safe with ranged-for? Or would it be better to
> use nsTArray?

Even if std::vector iterators -were- safe. The way I implement the iterators for this class they are -not- safe for mutation of the array. This is a conscious decision but possibly I should be more explicit about it.

Btw, sorry about that David, Jeff just forwarded review for me, I did this patch.
Flags: needinfo?(jmuizelaar) → needinfo?(dbaron)
(In reply to David Baron :dbaron: ⌚️UTC-7 from comment #36)
> It looks like this patch is taking the approach of storing the
> DisplayItemData object (28 bytes, per comment 2) in nsIFrame.  Did you make
> any measurements of the alternative approach of storing a DisplayItemData*
> pointer directly in nsIFrame (probably allocating the object in the pres
> shell arena)?  It seems like that would get most (maybe nearly all) of the
> performance benefit with only a fraction of the memory overhead (on the
> assumption of 5%-25% of frames having display items -- no idea if those
> guesses above are actually accurate), assuming 8 bytes overhead for all
> frames plus an additional 28 bytes for those with display items.  (I'm
> assuming it would be null for frames without display items -- is that
> correct?)

Let me answer this in a little more detail:

It should be noted the patches in this bug already depend on patches that move DisplayItemData objects into the presshell arena. But just to be clear, this patch does -not- store DisplayItemData directly on the nsIFrame. It just allows an array of pointers onto the nsIFrame. I measured this, most frames only get 0 or 1 DisplayItemData. About 1% of frames get three or more. In the three or more case my patch would increase memory size a little bit, but that is rare. In the 0 case, my patch increases memory usage of the 'frame' by 8-16 (depending on 32 vs 64-bit platforms) bytes. In the 1 case we reduce the size of the FrameProperty table by about 24-32 bytes, and of course we still increase the size of the frame by 8-16. The exact memory impact is dependent on the page and such things, but Tp5 shows no visible memory regression with these patches.

I will add a better and more complete commit message.
(In reply to Bas Schouten (:bas.schouten) from comment #39)
> Even if std::vector iterators -were- safe. The way I implement the iterators
> for this class they are -not- safe for mutation of the array. This is a
> conscious decision but possibly I should be more explicit about it.

Hopefully "be more explicit about it" means "add fatal assertions in DEBUG builds"?
(In reply to David Baron :dbaron: ⌚️UTC-7 from comment #41)
> (In reply to Bas Schouten (:bas.schouten) from comment #39)
> > Even if std::vector iterators -were- safe. The way I implement the iterators
> > for this class they are -not- safe for mutation of the array. This is a
> > conscious decision but possibly I should be more explicit about it.
> 
> Hopefully "be more explicit about it" means "add fatal assertions in DEBUG
> builds"?

That's... difficult.. as an iterator I use pointers (which is a typical way of implementing these things), and somehow 'asserting' the data structure can't be manipulated while there's iterators are 'outstanding' is difficult (although not impossible). The reality is that this is how you often iterate over arrays. C++ is pretty serious about how you cannot just manipulate a data structure and assume its iterators remain valid (there's only a couple of STL data structures for which they do remain valid AFAIK), it specifies this for most of the STL data structures. Assuming you -can- just do that seems like a pretty basic lack of understanding how iterators or even programming works.

Using an index based iterator object is certainly possible, but does have an (albeit small) performance impact. I'm old fashioned in the sense that I assume the person using a class bothers to read the documentation for said class, but that might be naive :-).
(In reply to Bas Schouten (:bas.schouten) from comment #42)
> That's... difficult.. as an iterator I use pointers (which is a typical way
> of implementing these things), and somehow 'asserting' the data structure
> can't be manipulated while there's iterators are 'outstanding' is difficult
> (although not impossible). The reality is that this is how you often iterate
> over arrays. C++ is pretty serious about how you cannot just manipulate a
> data structure and assume its iterators remain valid (there's only a couple
> of STL data structures for which they do remain valid AFAIK), it specifies
> this for most of the STL data structures. Assuming you -can- just do that
> seems like a pretty basic lack of understanding how iterators or even
> programming works.

Please note that it's not merely understanding how iterators or programming works.  If you do:

  for (auto& v : mVector) {
    ...
  }

it's a matter of knowing whether the body of the for-loop, including all the functions and/or behaviors called at runtime (e.g. JavaScript) are going to potentially modify mVector.  And that's a pretty tall order: it's not necessarily obvious when writing the code, and it's definitely not obvious when reviewing/reading the code.
As an example, our most recent 0-day (bug 1321066) was iterator invalidation, via a stack one person familiar with the code didn't think could happen.
Comment on attachment 8863250 [details]
Bug 1331718 - Part 1: Add small pointer array.

https://reviewboard.mozilla.org/r/135034/#review138876

r-'ing because I'd like to see the documentation that Jeff mentioned and discuss the below first.

::: mfbt/SmallPointerArray.h:69
(Diff revision 1)
> +      mInlineElements[1] = aElement;
> +      return;
> +    }
> +
> +    T* tmp = mInlineElements[1];
> +    mArray = new std::vector<T*>(2);

Seems like you would want this to be at least `3`, given that you're putting three elements into the vector...or maybe just `mArray = new std::vector<T*>({ mInlineElements[0], mInlineElements[1], aElement });` or something like that?

::: mfbt/SmallPointerArray.h:88
(Diff revision 1)
> +      // Expectected case.
> +      mInlineElements[0] = mInlineElements[1];
> +      mInlineElements[1] = nullptr;
> +      return;
> +    }
> +    

Nit: trailing whitespace.

::: mfbt/SmallPointerArray.h:141
(Diff revision 1)
> +    if (mInlineElements[0] || !mArray) {
> +      return &mInlineElements[0];
> +    }
> +
> +    if (!mArray->size()) {
> +      return nullptr;
> +    }
> +
> +    return &(*mArray)[0];

Given the lack of documentation, I'm having a hard time deciding whether the lack of symmetry between `begin()` and `end()` is a bug or just an artifact of the implementation.

Maybe we should have a single (private) method that returns the equivalent of `begin()` and `Length()`, and then have `begin`, `end`, and `Length` call that?  That would cut down on code duplication, and because it's private, we could even make it `const`, but returning `T*`, so we can reuse it for both the `const` and non-`const` iterator implementations.  WDYT?

::: mfbt/SmallPointerArray.h:179
(Diff revision 1)
> +
> +    if (!mArray->size()) {
> +      return nullptr;
> +    }
> +
> +    return &((*mArray)[mArray->size() - 1]) + 1;

Maybe just `&(*mArray)[0] + mArray->size()`.
Attachment #8863250 - Flags: review?(nfroyd) → review-
Comment on attachment 8863251 [details]
Bug 1331718 - Part 2: Add unit tests for SmallPointerArray.

https://reviewboard.mozilla.org/r/135036/#review138882

These look basically good, would like to see again to confirm.

::: mfbt/tests/TestSmallPointerArray.cpp:59
(Diff revision 1)
> +  testArray.RemoveElement(PTR1);
> +
> +  MOZ_RELEASE_ASSERT(testArray.Length() == 2);
> +  MOZ_RELEASE_ASSERT(testArray[0] == PTR2);
> +  MOZ_RELEASE_ASSERT(testArray.ElementAt(0) == PTR2);
> +  MOZ_RELEASE_ASSERT(testArray[1] == PTR3);
> +  MOZ_RELEASE_ASSERT(testArray.ElementAt(1) == PTR3);

You should also check that removing all elements from a `std::vector`'d array correctly gives you a length 0 array, optionally checking intermediate states (length, elements remaining in the array are what you expect, etc.) on the way.

::: mfbt/tests/TestSmallPointerArray.cpp:85
(Diff revision 1)
> +  testArray.RemoveElement(PTR2);
> +
> +  MOZ_RELEASE_ASSERT(testArray.Length() == 1);
> +  MOZ_RELEASE_ASSERT(testArray[0] == PTR1);
> +  MOZ_RELEASE_ASSERT(testArray.ElementAt(0) == PTR1);

Should check that removing a non-existent element from the array works as expected.  You may want to consider changing `RemoveElement`'s return type to `bool` to reflect whether we removed an element or not.

::: mfbt/tests/TestSmallPointerArray.cpp:110
(Diff revision 1)
> +  testArray.AppendElement(PTR1);
> +
> +  for (void* test : testArray) {
> +    verification[entries++] = test;
> +  }
> +  

Nit: trailing whitespace.

::: mfbt/tests/TestSmallPointerArray.cpp:122
(Diff revision 1)
> +  MOZ_RELEASE_ASSERT(entries == 2);
> +  MOZ_RELEASE_ASSERT(verification[0] == PTR1);
> +  MOZ_RELEASE_ASSERT(verification[1] == PTR2);

Should probably add tests here that test removal of elements from a 2-element array and ensuring the range-based for loops give the correct answers.
Attachment #8863251 - Flags: review?(nfroyd)
(In reply to Andrew McCreight [:mccr8] from comment #44)
> As an example, our most recent 0-day (bug 1321066) was iterator
> invalidation, via a stack one person familiar with the code didn't think
> could happen.

Right, -personally- I am of the opinion in any class where modification of the data structure leads to iterator invalidation (vector, deque, etc.) you shouldn't be going off into distant parts of the code from the body of your loop. If the 'work per loop iteration' is going to be that high, you probably also don't need to performance benefits of fast iterators.

But yes, I concede there's situations in which it is easy to make mistakes when using iterators. Especially when they're now so convenient to use :).
Comment on attachment 8863251 [details]
Bug 1331718 - Part 2: Add unit tests for SmallPointerArray.

https://reviewboard.mozilla.org/r/135036/#review139278

::: mfbt/tests/TestSmallPointerArray.cpp:13
(Diff revisions 1 - 2)
> +// We explicitly test sizes up to 3 here, as that is when SmallPointerArray<>
> +// switches to the storage method used for larger arrays.

Thanks for this comment.
Attachment #8863251 - Flags: review?(nfroyd) → review+
Comment on attachment 8863250 [details]
Bug 1331718 - Part 1: Add small pointer array.

https://reviewboard.mozilla.org/r/135034/#review139256

r=me with all the changes below.  Feel free to push back if you disagree.

::: mfbt/SmallPointerArray.h:7
(Diff revisions 1 - 2)
>  /* vim: set ts=8 sts=2 et sw=2 tw=80: */
>  /* This Source Code Form is subject to the terms of the Mozilla Public
>  * License, v. 2.0. If a copy of the MPL was not distributed with this
>  * file, You can obtain one at http://mozilla.org/MPL/2.0/. */
>  
>  #ifndef mozilla_SmallPointerArray_h

Please add a single-line comment prior to the `#ifndef` like:

`/* A vector of pointers space-optimized for small numbers of elements. */`

This makes directory displays like DXR/searchfox more pleasant.

::: mfbt/SmallPointerArray.h:16
(Diff revisions 1 - 2)
>  // Array class specialized to hold a small number (1 or 2 elements) of pointers
>  // in a minimal amount of inline space. Can be grown larger if required.

How about:

Array class for situations where a small number of elements (<= 2) is expected, a large number of elements must be accommodated if necessary, and the size of the class must be minimal.  Typical vector implementations will fulfull the first two requirements by simply adding inline storage alongside the rest of their member variables.  While this strategy works, it brings unnecessary storage overhead for vectors with an expected small number of elements.  This class is intended to deal with that problem.

::: mfbt/SmallPointerArray.h:19
(Diff revisions 1 - 2)
>  namespace mozilla {
>  
>  // Array class specialized to hold a small number (1 or 2 elements) of pointers
>  // in a minimal amount of inline space. Can be grown larger if required.
> +//
> +// This class is similar is performance to a vector class. Accessing its

I think you intended "This class is similar *in* performance...", correct?

::: mfbt/SmallPointerArray.h:21
(Diff revisions 1 - 2)
>  // Array class specialized to hold a small number (1 or 2 elements) of pointers
>  // in a minimal amount of inline space. Can be grown larger if required.
> +//
> +// This class is similar is performance to a vector class. Accessing its
> +// elements when it has not grown over a size of 2 does not require an extra
> +// level of indirection and will therefor be faster.

Nit: "therefore"

::: mfbt/SmallPointerArray.h:25
(Diff revisions 1 - 2)
> +// elements when it has not grown over a size of 2 does not require an extra
> +// level of indirection and will therefor be faster.
> +//
> +// The minimum (inline) size is 2 * sizeof(void*).
> +//
> +// Any manipulation of the array invalidates the iterators.

How about "Any modification of the array invalidates any outstanding iterators."

::: mfbt/SmallPointerArray.h:51
(Diff revisions 1 - 2)
>      }
>      mInlineElements[0] = mInlineElements[1] = nullptr;
>    }
>  
>    void AppendElement(T* aElement) {
>      MOZ_ASSERT(aElement != nullptr);

Maybe we should have a comment prior to this, such as: "Storing nullptr as an element is not permitted, but we do check for it below to avoid corruption issues in non-debug builds." ?

We need some way of explaining why we have the assert and the `!aElement` checks as well.

::: mfbt/SmallPointerArray.h:172
(Diff revisions 1 - 2)
>  
> -    return &((*mArray)[mArray->size() - 1]) + 1;
> +    return &(*mArray)[0];
>    }
> -  const_iterator cend() const { return end(); }
>  
> -private:
> +  // mArray and mInlineElements[1] share the same area in memory.

We should have some `static_assert`s in this class to verify the memory layout is what we think it is.

::: mfbt/SmallPointerArray.h:174
(Diff revisions 1 - 2)
>    }
> -  const_iterator cend() const { return end(); }
>  
> -private:
> +  // mArray and mInlineElements[1] share the same area in memory.
> +  //
> +  // When !mInlineElemnts[0] && !mInlineElements[1] the array is empty.

Nit: `!mInlineElements[0]`.
Attachment #8863250 - Flags: review?(nfroyd) → review+
(In reply to Bas Schouten (:bas.schouten) from comment #42)
> (In reply to David Baron :dbaron: ⌚️UTC-7 from comment #41)
> > (In reply to Bas Schouten (:bas.schouten) from comment #39)
> > > Even if std::vector iterators -were- safe. The way I implement the iterators
> > > for this class they are -not- safe for mutation of the array. This is a
> > > conscious decision but possibly I should be more explicit about it.
> > 
> > Hopefully "be more explicit about it" means "add fatal assertions in DEBUG
> > builds"?
> 
> That's... difficult.. as an iterator I use pointers (which is a typical way
> of implementing these things), and somehow 'asserting' the data structure
> can't be manipulated while there's iterators are 'outstanding' is difficult
> (although not impossible). The reality is that this is how you often iterate
> over arrays. C++ is pretty serious about how you cannot just manipulate a
> data structure and assume its iterators remain valid (there's only a couple
> of STL data structures for which they do remain valid AFAIK), it specifies
> this for most of the STL data structures. Assuming you -can- just do that
> seems like a pretty basic lack of understanding how iterators or even
> programming works.
> 
> Using an index based iterator object is certainly possible, but does have an
> (albeit small) performance impact. I'm old fashioned in the sense that I
> assume the person using a class bothers to read the documentation for said
> class, but that might be naive :-).

So I think it's still pretty doable to have assertions -- can't you have iteration objects just contain a pointer in non-DEBUG builds, but a drop more in DEBUG builds, and then the iterator objects "reference count" the object they're iterating (something like mIteratorsCount, not mRefCnt), and then assert on mutations that mIteratorsCount is 0?
Flags: needinfo?(dbaron)
(In reply to David Baron :dbaron: ⌚️UTC-7 from comment #54)
> (In reply to Bas Schouten (:bas.schouten) from comment #42)
> > (In reply to David Baron :dbaron: ⌚️UTC-7 from comment #41)
> > > (In reply to Bas Schouten (:bas.schouten) from comment #39)
> > > > Even if std::vector iterators -were- safe. The way I implement the iterators
> > > > for this class they are -not- safe for mutation of the array. This is a
> > > > conscious decision but possibly I should be more explicit about it.
> > > 
> > > Hopefully "be more explicit about it" means "add fatal assertions in DEBUG
> > > builds"?
> > 
> > That's... difficult.. as an iterator I use pointers (which is a typical way
> > of implementing these things), and somehow 'asserting' the data structure
> > can't be manipulated while there's iterators are 'outstanding' is difficult
> > (although not impossible). The reality is that this is how you often iterate
> > over arrays. C++ is pretty serious about how you cannot just manipulate a
> > data structure and assume its iterators remain valid (there's only a couple
> > of STL data structures for which they do remain valid AFAIK), it specifies
> > this for most of the STL data structures. Assuming you -can- just do that
> > seems like a pretty basic lack of understanding how iterators or even
> > programming works.
> > 
> > Using an index based iterator object is certainly possible, but does have an
> > (albeit small) performance impact. I'm old fashioned in the sense that I
> > assume the person using a class bothers to read the documentation for said
> > class, but that might be naive :-).
> 
> So I think it's still pretty doable to have assertions -- can't you have
> iteration objects just contain a pointer in non-DEBUG builds, but a drop
> more in DEBUG builds, and then the iterator objects "reference count" the
> object they're iterating (something like mIteratorsCount, not mRefCnt), and
> then assert on mutations that mIteratorsCount is 0?

Fwiw, I think this is true. Since Nathan has r+'ed this patch I'll open a separate bug where we can decide if this is worth the code complication of a bunch of #ifdef DEBUG's. If this is the prevailing opinion I will do this work!

Do I have your r+ on the size increase of nsIFrame itself? :)
Flags: needinfo?(bas) → needinfo?(dbaron)
Comment on attachment 8863252 [details]
Bug 1331718 - Part 3: Store pointers to DisplayItemData directly on nsIFrame.

https://reviewboard.mozilla.org/r/135032/#review139436

ok, r=dbaron on the approach here; I'm assuming Matt already reviewed all the details.
Attachment #8863252 - Flags: review?(dbaron) → review+
Pushed by bschouten@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/1b9ecb9b9fe8
Part 1: Add small pointer array. r=froydnj
https://hg.mozilla.org/integration/mozilla-inbound/rev/4527bd35cc4e
Part 2: Add unit tests for SmallPointerArray. r=froydnj
https://hg.mozilla.org/integration/mozilla-inbound/rev/ff6861518bc3
Part 3: Store pointers to DisplayItemData directly on nsIFrame. r=mattwoodrow r=dbaron
Flags: needinfo?(bas)
Pushed by bschouten@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/c4aec03e9740
Part 1: Add small pointer array. r=froydnj
https://hg.mozilla.org/integration/mozilla-inbound/rev/e18a94a1ffda
Part 2: Add unit tests for SmallPointerArray. r=froydnj
https://hg.mozilla.org/integration/mozilla-inbound/rev/a2832968581c
Part 3: Store pointers to DisplayItemData directly on nsIFrame. r=mattwoodrow r=dbaron
https://hg.mozilla.org/integration/mozilla-inbound/rev/fa38c52a84a5
Part 4: Mark some tests on android as passing. r=bustage
Depends on: 1414508
Flags: needinfo?(mstange)
Performance Impact: --- → P1
Whiteboard: [qf:p1]
You need to log in before you can comment on or make changes to this bug.