Implement optionally infallible nsTArray

RESOLVED FIXED

Status

()

enhancement
RESOLVED FIXED
10 years ago
8 years ago

People

(Reporter: cjones, Assigned: cjones)

Tracking

(Blocks 1 bug)

unspecified
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(blocking2.0 final+)

Details

(Whiteboard: [sg:want P2], )

Attachments

(3 attachments, 3 obsolete attachments)

After bug 441324 landed, the question arose (bug 541496 comment 26) about an infallible variant of nsTArray.  That's a very desirable API IMHO.

I'd like to implement it and advance the agenda of STL-izing of mozilla code all in one fell swoop by creating infallible-nsTArray as mozilla::Vector.  Vector can start as an iterator-free subset of the STL std::vector API.  Hopefully Vector can be implemented as a wrapper around a parameterized nsTArray_base that uses moz_xmalloc/xrealloc instead of NS_Alloc/Realloc.
Blocks: 550593
Aw, why no iterators? ;-)

Also, how about bending the rules a little and making it be mozilla::vector, lowercase v?  Then conversion to the *real* STL vector doesn't have to touch every single call site (assuming we manage to clone the API correctly), just change the #include and the using-directive.

We could preserve a fallible/infallible distinction across that hypothetical transition by encoding it as an allocator type:

mozilla::vector<T>  // default is infallible
mozilla::vector<T, mozilla::fallible_allocator<T> >
Hm, I take that back; STL allocators are required to succeed or throw, and the library implementation (at least for gcc) does not cope with an allocator that returns 0.

Let's worry about that if and when we find a vector case that needs fallibility.
(In reply to comment #1)
> Aw, why no iterators? ;-)
> 

I want to whip out a quick, basic Vector impl that wraps nsTArray, so people can start using it immediately and we can keep momentum going.  I'm all for implementing iterators later.

> Also, how about bending the rules a little and making it be mozilla::vector,
> lowercase v?  Then conversion to the *real* STL vector doesn't have to touch
> every single call site (assuming we manage to clone the API correctly), just
> change the #include and the using-directive.
> 

I think we have an important decision here.  The methods that std::vector and mozilla::[vV]ector share, I've named as |standard_method()|.  I've added some non-standard methods (Insert/Erase) to work around the lack of iterators, named as |NonstandardMethod()|.  With those non-standard methods, I prefer mozilla::Vector.  However, we could also just drop the non-standard methods and wait for iterators.  In that case, I agree with mozilla::vector.

So, if we commit to adhere religiously to an STL API, I like vector.  Otherwise, Vector.  IMHO the latter seems more likely.
I think if we're going to do this exercise, we should really commit to it, and make it match the STL API, and use mozilla::vector.
I've learned that the non-standard Insert()/Erase() approach isn't going to fly.  We'll need iterators from the get go.

I think the only two nsTArray methods std::vector doesn't support equivalently are the shortcuts |T* InsertAt(i)| and |T* Append()|.  Since we'll have to keep around a non-standard fallible vector API regardless, I'm cool with a strict-STL mozilla::vector, and keeping those extensions in mozilla::fallible_vector (or whatever) for code that really needs them.
Quick progress update:

  - the majority of mozilla::vector and mozilla::vector::*iterator is working and covered by the new TestVector.cpp.  Many of the checks in TestVector.cpp are A/B tests that ensure |std::vector|s and |mozilla::vector|s are equivalent after being mutated by <algorithm> functions, which is really nice!
  - assigning from "forward iterator" start/end ranges to mozilla::vector needs new nsTArray-impl-level code in mozilla::vector [easy]
  - for maximum efficiency, writing N copies of the same value v to slots (i, j] in mozilla::vector needs new nsTArray-impl-level code in mozilla::vector [easy]
  - nsTArray needs to be taught moz_xmalloc [easy]

A TODO that I'll leave for a followup bug is implementing a useful suite of DEBUG-only checks, f.e. mutation-checking iterators [easy].  (TODO because the most important checks are already done by nsTArray.)

About 1/2 day's work remains, in my estimation.
This is a frustratingly ugly patch.  I wanted to use a traits class to provide the fallible/infallible allocators, but because nsTArray relies on non-templated nsTArray_base for all allocation, this wasn't feasible.  The new mFallible member variable in nsTArray_base is going to increase memory usage and might possibly incur a small runtime penalty for the extra branch before calling malloc/realloc/free.

The only option I can see other than this patch is forking the entire nsTArray_base impl into nsTArrayFallible_base and nsTArrayInfallible_base and adding a template argument to nsTArray controlling from which base to inherit.  This would be duplicating a lot of code though.
Assignee: nobody → jones.chris.g
Attachment #431298 - Flags: review?(benjamin)
Not sure who all should/wants to review this.
Attachment #431299 - Flags: review?(zweinberg)
Attachment #431299 - Flags: review?(benjamin)
Summary: Implement infallible mozilla::Vector → Implement infallible mozilla::vector
Comment on attachment 431299 [details] [diff] [review]
Part 2: Implement (infallible) mozilla::vector offering the STL std::vector API, as a wrapper around nsTArray

Oops, meant to r? luke.
Attachment #431299 - Flags: review?(lw)
Comment on attachment 431299 [details] [diff] [review]
Part 2: Implement (infallible) mozilla::vector offering the STL std::vector API, as a wrapper around nsTArray

Oops, forgot the sr flag.
Attachment #431299 - Flags: review?(benjamin) → superreview?(benjamin)
Attachment #431298 - Attachment is obsolete: true
Attachment #431416 - Flags: superreview?(benjamin)
Attachment #431298 - Flags: review?(benjamin)
Now #error's if mozilla::vector is attempted to be used without infallible ::operator new() (previously would have been a cryptic compile error).
Attachment #431299 - Attachment is obsolete: true
Attachment #431417 - Flags: superreview?(benjamin)
Attachment #431417 - Flags: review?(zweinberg)
Attachment #431417 - Flags: review?(lw)
Attachment #431299 - Flags: superreview?(benjamin)
Attachment #431299 - Flags: review?(zweinberg)
Attachment #431299 - Flags: review?(lw)
Infallible nsTArray is desirable apart from STL-ization, but (having seen how easy it is for map and friends) mozilla::vector should wrap std::vector, IMO, or else we should just use std::vector directly (as I hope we can, per bug 551254).

How much of nsTArray_base would have to move into headers to enable use of a traits class for the fallibility toggle?

Could we get away with just making all use of nsTArray infallible?
(In reply to comment #13)
> Infallible nsTArray is desirable apart from STL-ization, but (having seen how
> easy it is for map and friends) mozilla::vector should wrap std::vector, IMO,
> or else we should just use std::vector directly (as I hope we can, per bug
> 551254).
> 

My only concern is moving from an impl for which the performance characteristics are well-known to one for which they're not (wrt to Gecko), and might be platform dependent.  But TBH, I think the odds of nsTArray being faster than std::vector are slim to none.

> How much of nsTArray_base would have to move into headers to enable use of a
> traits class for the fallibility toggle?
> 

Depends.  There are two conflicting goals.  One is keeping sizeof(nsTArray) small, which the current code goes to great lengths to achieve.  I don't know if it's effective, or we really care anymore.  Another is keeping the amount of generated machine code small.  There are several solutions with various tradeoffs.

To use a "traditional" traits approach, we'd have to move the entire definition of nsTArray_base into the header.  Hopefully we'd only get two instantiations of nsTArray_base (fallible/infallible) and only emit one copy of each would be emitted in libxul.so, but that would be out of our control.  This wouldn't add new members to nsTArray_base.

We could fork the implementation of nsTArray_base for fallible/infallible variants as I described in comment 7.  This is probably optimal in terms of generated code, but is ugly and harder to maintain.

The only other solutions I can think of require adding new members to nsTArray_base.  The one I implemented was adding a |PRPackedBool fallible| flag, in vain hope of keeping memory usage down (probably won't subject to alignment) trading off slower calls to malloc et al.  We could add a member that's a pointer to a traits object, which is cleaner but might potentially increase nsTArray_base size.

> Could we get away with just making all use of nsTArray infallible?

I don't think so, given that we almost certainly have nsTArrays whose lengths are content controlled.  Boris, do you know if that's true?
I don't know offhand, but it wouldn't surprise me at all.  Certainly cellmaps are such, but we cap those sizes already...  Not sure about textruns, image data, and the like.
Comment on attachment 431417 [details] [diff] [review]
Part 2: Implement (infallible) mozilla::vector offering the STL std::vector API, as a wrapper around nsTArray, v2

Nice work; STL users rejoice!

The only substantial problem I see is that it seems nsTArray_base::ShiftData does not do what one would hope it would, i.e., call copy constructors, assignment operators, and destructors appropriately.  Instead, it just memmoves.  So I think this requires replacing all the uses of ShiftData in mozilla::vector with something else.

Other than that, some light comments:

>+  // (NB: non-standard SGI ctor)
>+  explicit vector(size_type size) : INIT_MEMBERS
>+  { }

It seems the constructor is missing the calls to create |size| elements.  Also, this constructor could be made standards-compliant, per 23.2.4.1 (modulo the allocator argument), by merging this constructor with the immediately following one and adding the standard-required default argument:

explicit vector(size_type n, const T& value = T());

>+  assign(size_type numCopies, const T& value) {
>+    size_type oldSize = size();
>+
>+    if (numCopies < oldSize)
>+      TruncateLength(numCopies);
>+
>+    DestructRange(0, oldSize);
>+
>+    if (numCopies > oldSize) {
>+      // explicitly not initializing this memory to avoid calling
>+      // default ctors
>+      EnsureCapacity(numCopies, sizeof(value_type));
>+      IncrementLength(numCopies - oldSize);
>+    }
>+
>+    Fill(begin(), numCopies, value);

For PODs, the destructor loop should be boiled away, but in general, I think it would be faster to use operator= up to min(oldSize, newCopies) before truncating/filling the end.

>+  // Insert |elt| into |pos|
>+  iterator
>+  insert(iterator pos, const T& elt) {
>+    size_type index = pos - begin();
>+    InsertElementAt(index, elt);
>+    Mutate();
>+    return begin() + index;
>+  }

Perhaps, for this and other member functions taking an iterator, you could assert pos.mVector == this?

>+  insert(iterator pos, size_type numCopies, const T& elt) {
>+    index_type index = pos - begin();
>+    size_type oldSize = size();
>+    size_type newSize = oldSize + numCopies;
>+
>+    EnsureCapacity(newSize, sizeof(value_type));
>+    ShiftData(index, 0, newSize - oldSize, sizeof(value_type));

s/newSize - oldSize/numCopies/

+  assign(InputIterator first, InputIterator last) {
+    Splice(begin(), end(), first, last);

It seems the full generality of Splice makes this operation a little slower than it need be.  Ideally, there would be a single, linear pass which calls operator= on the range of constructed objects, and destroys/inserts the remainder.  The other two uses of Splice also seem to be special cases, so, if implementing Splice without ShiftData is annoying, you could just leave it off and implement insert/range-construction directly.

>+    // Table 74—Forward iterator requirements

Some weird characters ^^

Lastly, a general note concerning EnsureCapacity:  It looks like this function is in a non-template, non-inline member function.  While PGO/LTO may inline whatever it chooses, you might be incurring a function call for each check.  Perhaps you could make an inline helper that only calls EnsureCapacity if the requested capacity is greater than the current capacity?  (Or I guess change nsTArray.h to cleave EnsureCapacity into two.  On a separate note, it might also be nice to use a helper like SetCapacity to avoid the manual sizeof(value_type) clutter.
>>+    // Table 74—Forward iterator requirements
>Some weird characters ^^

I believe that is an em-dash in UTF-8.
Comment on attachment 431417 [details] [diff] [review]
Part 2: Implement (infallible) mozilla::vector offering the STL std::vector API, as a wrapper around nsTArray, v2

I'm removing myself from review here because I don't think I understand either nsTArray or the precise details of the specification of std::vector well enough to comment on the code.  What I had to say about high-level design I've pretty much said.

I will observe that nsAutoTArray<T, 12> seems like it should translate to vector<T, mozilla::AutoAllocator<T, 12> > so maybe leaving out allocators is not the right long-term move.

Also, are we confident that <iterator> and <algorithm> are usable as-is from Mozilla code?  How about <utility>?  (std::pair would be quite handy ;-)
Attachment #431417 - Flags: review?(zweinberg)
std::vector might be the better way to go.

> Also, are we confident that <iterator> and <algorithm> are usable as-is from
> Mozilla code?  How about <utility>?  (std::pair would be quite handy ;-)

Nope, but we use at least <algorithm> already.  If we're going to interpose our own exception-to-abort handlers or use existing ones, I think we should mostly be fine; just need to make sure they don't call into libstdc++/crt.dll.
(In reply to comment #18)
> Also, are we confident that <iterator> and <algorithm> are usable as-is from
> Mozilla code?  How about <utility>?  (std::pair would be quite handy ;-)
We are already using std::pair in the tree because it was deemed safe.
Comment on attachment 431417 [details] [diff] [review]
Part 2: Implement (infallible) mozilla::vector offering the STL std::vector API, as a wrapper around nsTArray, v2

Luke, thanks much for the very helpful review!  I think with how bug 551254 is progressing, we can probably get away with using std::vector directly.  I'm going to keep this patch in reserve in case we hit performance problems with std::vector.
Attachment #431417 - Flags: superreview?(benjamin)
Attachment #431417 - Flags: review?(lw)
Summary: Implement infallible mozilla::vector → Implement optionally infallible nsTArray
Comment on attachment 431416 [details] [diff] [review]
Part 1: Make nsTArray optionally infallible, v2

I really want traits here. I think you should be able to trait/specialize nsTArray_base for infallible/fallible allocators.
Attachment #431416 - Flags: superreview?(benjamin) → superreview-
Added allocator traits love.  The patch was mostly mechanical except for nsMaybeWeakPtrArray (!), but those changes were pretty straightforward as well.

For opt/linux/x86-64 builds, here's the before/after libxul

$ ll *libxul*
-rwxr-xr-x 1 cjones cjones 22879944 2010-04-15 21:58 before-libxul-stripped.so
-rwxr-xr-x 1 cjones cjones 22998728 2010-04-15 21:59 after-libxul-stripped.so

If I did my math right, that's a 0.5% increase in codesighs.  I can live with that.  There's more opportunity for eliminating templated code at the cost of some icky hacks in nsTArray.h.
Attachment #431416 - Attachment is obsolete: true
Attachment #439440 - Flags: superreview?(benjamin)
There's a small bug in the latest patch that I found while testing something else.  I fixed it locally, and it's a one-line change so I don't see much point in reposting.
Comment on attachment 439440 [details] [diff] [review]
Make nsTArray optionally infallible, v3

Nit, throughout this patch please place the opening brace of classes and structs on the next line (column 0 for toplevel declarations). I know existing code in the file didn't follow that convention, but since you're touching the declarations let's fix them to match the style guide and common XPCOM practice.

nsTArrayDefaultAllocator makes me a little sad. You had to make it a struct instead of a typedef to forward-declare it somewhere?

>diff --git a/xpcom/glue/nsTArray.h b/xpcom/glue/nsTArray.h

>-    PRBool EnsureCapacity(size_type capacity, size_type elementSize);
>+    PRBool EnsureCapacity(size_type capacity, size_type elemSize) {

Here and elsewhere, you've inlined a lot of the code directly in the declaration. That makes it a lot harder to read the class. Please put these methods at the end of nsTArray.h, or in a separate nsTArray-inl.h file, so that the class declaration is easier to skim. You might be able to `hg cp nsTArray.cpp nsTArray-inl.h` to make it evident where the bodies came from, and easier to see what changes come from this patch.

>+    Header* Hdr() { 
>+      return mHdr;
>+    }
>+
>+    Header* Hdr() const { 
>+      return mHdr;
>+    }

Why do you need both the const and non-const versions? Isn't a single const version of the method sufficient?

sr=me with those changes
Attachment #439440 - Flags: superreview?(benjamin) → superreview+
I wouldn't mind reviewing patches here as I've done some amount of work on nsTArray.

I would really like to make the default nsTArray be infallible. I would imagine that most of our arrays are ultimately content controlled, so if we'll leave all content controlled arrays using fallible allocators we won't make that much of an improvement over what we have now.

However I suppose this is better brought up in the newsgroups.
Blocks: 578914
The HTML5 parser uses nsTArrays all over the place assuming that they'll become infallible by the time we ship a release with the HTML5 parser. (See e.g. bug 578914.) Therefore, it would be *really* nice to actually get infallibility by Firefox 4.0, so I'm nominating this as a blocker.
blocking2.0: --- → ?
blocking2.0: ? → final+
Blocks: 585943
Whiteboard: [sg:want P2]
Is this still going to happen. It concerns me that it's marked as final+ and that there hasn't been any action here for over 6 months. It seems like we want more testing than just RCs for the dependent bugs, such as the HTML5 ones.

Renominating so that we can get an update on if this truly will remain a blocker.
blocking2.0: final+ → ?
I pushed dusted-off patches to try last night.  One test needs to be sorted out.
(In reply to comment #28)
> Is this still going to happen. It concerns me that it's marked as final+ and
> that there hasn't been any action here for over 6 months. It seems like we want
> more testing than just RCs for the dependent bugs, such as the HTML5 ones.
Only one of the bugs that this blocks is marked as blocking, and it's only marked as blocking final.  Since we've already been using the infallible allocator for new for some time now, I don't think we need to worry about that having a lack of testing.  I think using the infallible allocator with this shouldn't be any significant risk.  Since the patches in the bug aren't really massive blocks of code, I think having just RCs on this will be fine.  With that said, if it gets in earlier, I'm not going to complain.
blocking2.0: ? → final+
Mechanical dusting-off of infallible-tarray.
Attachment #487499 - Flags: superreview?(benjamin)
Attachment #487499 - Flags: superreview?(benjamin) → superreview+
http://hg.mozilla.org/mozilla-central/rev/f915a22def59
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
  17.453 +// The Alloc template parameter can be used to choose between
  17.454 +// "fallible" and "infallible" nsTArray (if available), defaulting to
  17.455 +// fallible.  If the *fallible* allocator is used, the return value of
  17.456 +// methods that might allocate doesn't need to be checked; Append() is
  17.457 +// one such method.  These return values don't need to be checked if
  17.458 +// the *in*fallible allocator is chosen.  When in doubt, choose the
  17.459 +// infallible allocator.

Doesn't this comment claim that the return value should never be checked?
Obviously there's a typo in the first part.
Blocks: 610823
No longer blocks: 610823
Blocks: 610823
You need to log in before you can comment on or make changes to this bug.