GC: Bump Allocation Nursery

RESOLVED FIXED in mozilla21

Status

()

defect
RESOLVED FIXED
8 years ago
5 years ago

People

(Reporter: terrence, Assigned: terrence)

Tracking

Trunk
mozilla21
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(Not tracked)

Details

(Whiteboard: [leave-open])

Attachments

(8 attachments, 9 obsolete attachments)

30.11 KB, patch
billm
: review+
terrence
: checkin+
Details | Diff | Splinter Review
3.42 KB, patch
billm
: review+
terrence
: checkin+
Details | Diff | Splinter Review
55.63 KB, patch
billm
: review+
terrence
: checkin+
Details | Diff | Splinter Review
132.82 KB, patch
bhackett
: review+
terrence
: checkin+
Details | Diff | Splinter Review
4.61 KB, patch
Details | Diff | Splinter Review
v2
108.02 KB, patch
billm
: review+
Details | Diff | Splinter Review
44.48 KB, patch
terrence
: feedback+
Details | Diff | Splinter Review
v0
3.73 KB, patch
billm
: review+
Details | Diff | Splinter Review
This patch creates a separated, fixed-size heap, allocates out of it, and does not crash and burn when we GC.  It makes heavy use of existing GC constructs in order to be as minimal as possible, and does not require cross heap pointer tracking (Phase 2) or movable objects, and it does not do bump allocation (Phase 4) or, indeed, any collection of garbage whatsoever (Phase 3).

The purpose of this patch is to allow us to work on Phase 2: we need to generate cross generational pointers to build and test our tracking of cross-generation pointers.  To do this we need at least 2 generations of separated heap.

I'm posting this here now to get early feedback on big-picture concepts and to make sure I didn't miss anything important.  Keep in mind that basically all of this specific code will be going away at some point.  We won't need ArenaLists to manage two heaps when we are doing bump allocation.  The modifications to MarkStackWord will go away when we track cross-generation pointers correctly.  The JM allocator will need to be rewritten to understand our new Nursery.  The Nursery itself will need to get rewritten (in much sleeker form) as a bump allocator.
This looks good to me. I think you are using copyFreeListsToArenas incorrectly. The rules are:
1. Once ArenaLists::purge is called, it's safe to iterate over all the cells in an arena. Nothing else needs to be done.
2. If you're not planning to call purge (because it's not a GC), then you should call copyFreeListsToArenas before you begin iterating and clearFreeListsInArenas when you're done iterating.

I think everything you have is okay, except that the call to you debugging code should probably happen after purge. Or else you need to call copyFreeListsToArenas before it. If you do this, you need to remove an assert in purge(). You don't need to clearFreeListsInArenas, since purge will take care of that for you later.

Also, a few general comment about style:

1. This style:
void function() {
}
is only used for functions defined in the body of a class. Otherwise the open brace goes on the next line.

2. Multi-line C-style comments look like this:
/*
 * Text goes here, with no next on the first and last line of the comment.
 */

3. It's not really correct to start an identifier with an underscore. Those names are reserved for the compiler. Sometimes we do use an underscore at the end for field names.
Underscores are actually a little trickier than that.  _<capital-letter>... is reserved.  _<non-capital>... is reserved only in global scope.  The latter's a little unfortunate, given underscore-prefix is pretty common for JS naming (and somewhat Python), but oh well.
(In reply to Jeff Walden (remove +bmo to email) from comment #2)
> Underscores are actually a little trickier than that.  _<capital-letter>...
> is reserved.  _<non-capital>... is reserved only in global scope.  The
> latter's a little unfortunate, given underscore-prefix is pretty common for
> JS naming (and somewhat Python), but oh well.

I did not know that. I just did a search, and we have one underscore-prefixed name in jstypedarray.cpp. Somehow this doesn't surprise me.
Posted patch v2: Nits scratched (obsolete) — Splinter Review
Attachment #578306 - Attachment is obsolete: true
Attachment #581032 - Attachment is obsolete: true
The current nursery heap is a monstrosity which is able to mirror the required bits of the long-lived heap to support Cell::getAllocKind and Cell::compartment.  Once the Pools work is done, we should be able to rewrite the Nursery heap as a contiguous area for bump allocation of all types.
Summary: GC: Separated Nursery Heap → GC: Bump Allocation Nursery
Assignee: terrence → general
Over the long weekend I took some time to look at how hard it would be to go straight to a bump-allocated nursery and skip needing to mimic the arena layout altogether. It turns out it's not too bad. As a first step, I'm only focusing on JSObject, not JSString -- we'll need to run finalizers for JSString: I think I know how to deal with this, but I'd like to write some code first.

I've split the preparatory work into two patches: this one cleans up some of our rooting API so that we can stop ditching good type information that we just have to re-create anyway. This lets us kill off several calls to getAllocKind() and paves the way for the second patch, which will move |compartment()| off of Cell completely.
Assignee: general → terrence
Status: NEW → ASSIGNED
Attachment #704989 - Flags: review?(wmccloskey)
Posted patch v0: compartment() Diaspora (obsolete) — Splinter Review
This is actually pretty straightforward, except for one thing: there are a bunch of places (mostly in jsgc.cpp) where I've replaced calls to Cell::compartment() with Cell::arenaHeader()->compartment. There are two assumption that I'm making to ensure the safety of these calls.

1) We will do a Nursery GC before every incremental slice.
2) In DEBUG builds the Nursery will add extra code to inject a poisoned NULL pointer where Cell::arenaHeader() would land and not allocate objects there. Cell::arenaHeader() will then assert !IsPoisonedPtr(hdr->compartment).
Attachment #705006 - Flags: review?(wmccloskey)
Comment on attachment 705006 [details] [diff] [review]
v0: compartment() Diaspora

Unflagging review so that I can update JSObject::compartment() to do something that will also work when sweeping.
Attachment #705006 - Flags: review?(wmccloskey)
Posted patch v1: compartment() diaspora (obsolete) — Splinter Review
This patch is basically the same, but preserves the existing behavior so that we can be sure to avoid regressions. We can easily carry the potentially-breaking changes with the nursery patch itself.
Attachment #705006 - Attachment is obsolete: true
Attachment #705453 - Flags: review?(wmccloskey)
Posted patch v1: compartment() diaspora (obsolete) — Splinter Review
This time remembering to qref.
Attachment #705453 - Attachment is obsolete: true
Attachment #705453 - Flags: review?(wmccloskey)
Attachment #705454 - Flags: review?(wmccloskey)
Comment on attachment 705454 [details] [diff] [review]
v1: compartment() diaspora

And then I re-export the wrong patch. Maybe I'd better just give this a Try run first.
Attachment #705454 - Flags: review?(wmccloskey)
Comment on attachment 704989 [details] [diff] [review]
v0: Rooting API cleanups

Review of attachment 704989 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/jsfriendapi.cpp
@@ +902,5 @@
>  
>      gc::Cell *cell = static_cast<gc::Cell *>(ptr);
> +    JSCompartment *c = (kind == JSTRACE_OBJECT)
> +                       ? static_cast<JSObject *>(cell)->compartment()
> +                       : cell->arenaHeader()->compartment;

I'd rather you just use ->compartment() for everything here. Also, please call the compartment |comp| instead of |c|.

::: js/src/jsgc.h
@@ +507,5 @@
> +extern JSBool
> +js_AddObjectRoot(JSContext *cx, JSObject **rp, const char *name);
> +
> +extern JSBool
> +js_AddScriptRoot(JSContext *cx, JSScript **rp, const char *name);

As long as you're here, could you move these to js:: ?
Attachment #704989 - Flags: review?(wmccloskey) → review+
This is another small, atomic change that I'd like to make ahead of time. I think the extra word only decreases our extra Chunk padding for every thing kind, but I didn't verify manually: easier to use AWSY for this.
Attachment #706670 - Flags: review?(wmccloskey)
Can you just printf ArenasPerChunk before and after the change? That's a more direct way to check than relying on AWSY.
You are right, I don't know why I didn't think to do it the simple way. I've just manually confirmed that the arena count is 252 in all builds with and without the runtime pointer and in both 32 and 64 bit.
Comment on attachment 706670 [details] [diff] [review]
v0: runtime on the chunk

Thanks for checking.
Attachment #706670 - Flags: review?(wmccloskey) → review+
Whiteboard: [leave-open]
Since we want a shared nursery for the runtime now, it makes sense to move the store buffer and verifier there as well.
Attachment #707391 - Flags: review?(wmccloskey)
Attachment #706670 - Flags: checkin+
Attachment #704989 - Flags: checkin+
Rebased and I think it's ready to go for real this time. At least the try run is green: https://tbpl.mozilla.org/?tree=Try&rev=a9baed0f1989
Attachment #705454 - Attachment is obsolete: true
Attachment #707921 - Flags: review?(wmccloskey)
This allows the GGC prototype to skip tracing of the type-info graph, saving us time and a bunch of pain. This was a bit complex, so I let it bake for a few days while ironing more bugs out of the minor collection. I haven't hit any more busted singleton pointers or any of the asserts I added, so I think this is ready for review.
Attachment #709296 - Flags: review?(bhackett1024)
Comment on attachment 709296 [details] [diff] [review]
v0: Don't store singletons in the Nursery

Review of attachment 709296 [details] [diff] [review]:
-----------------------------------------------------------------

Oof, thanks for doing this.

::: js/src/gc/Heap.h
@@ +35,5 @@
>  struct Chunk;
>  
> +/*
> + * This flag allows an allocation site to request a specific heap based upon the
> + * estimated lifetime of objects allocated from that site.

This should be 'estimated lifetime or lifetime requirements' I think.

::: js/src/jsinterp.cpp
@@ +2621,5 @@
>  BEGIN_CASE(JSOP_REST)
>  {
>      RootedObject &rest = rootObject0;
> +    NewObjectKind newKind = UseNewTypeForInitializer(cx, script, regs.pc, &ArrayClass);
> +    rest = regs.fp()->createRestParameter(cx, newKind);

Rest parameters can always be generic objects.

::: js/src/jsobj.h
@@ +1178,5 @@
>  
> +enum NewObjectKind {
> +    GenericObject,
> +    SingletonObject,
> +    MaybeSingletonObject

Can you comment these?  The purpose of MaybeSingletonObject isn't obvious.
Attachment #709296 - Flags: review?(bhackett1024) → review+
(In reply to Brian Hackett (:bhackett) from comment #24)
> Oof, thanks for doing this.

And thank you for the fast review! I can't imagine it was fun to reading.
 
> ::: js/src/jsinterp.cpp
> @@ +2621,5 @@
> >  BEGIN_CASE(JSOP_REST)
> >  {
> >      RootedObject &rest = rootObject0;
> > +    NewObjectKind newKind = UseNewTypeForInitializer(cx, script, regs.pc, &ArrayClass);
> > +    rest = regs.fp()->createRestParameter(cx, newKind);
> 
> Rest parameters can always be generic objects.

Yup, I thought it was bit absurd too. Will fix. :-)
Comment on attachment 707391 [details] [diff] [review]
v0: Move store buffer and verifier nursery to the Runtime

Review of attachment 707391 [details] [diff] [review]:
-----------------------------------------------------------------

Thanks, this worked out very well.

::: js/src/gc/Barrier-inl.h
@@ +162,5 @@
>  inline void
>  HeapValue::writeBarrierPost(const Value &value, Value *addr)
>  {
>  #ifdef JSGC_GENERATIONAL
>      if (value.isMarkable()) {

No more braces needed.

@@ +328,4 @@
>  }
>  
>  inline void
> +HeapSlot::setCrossCompartment(JSObject *obj, Kind kind, uint32_t slot, const Value &v, JSRuntime *rt)

I think you can remove the |rt| parameter entirely, since we can get the runtime from |obj|. Also, I don't think a special call for cross-compartment slots is needed anymore. Can you remove setCrossCompartmentSlot entirely?
Attachment #707391 - Flags: review?(wmccloskey) → review+
Comment on attachment 707921 [details] [diff] [review]
v2: compartment() diaspora (now including zones as well)

Review of attachment 707921 [details] [diff] [review]:
-----------------------------------------------------------------

I don't understand what this patch accomplishes. Can you explain more? I can see that calling zone() isn't going to work on nursery-allocated objects and strings. But the patch doesn't seem to fix that, as far as I can tell.
Gladly. In fact, as this patch is indeed quite gross, let me fully outline the problem and the possible solutions we've considered so that posterity will know why we ended up here.

The problem in question is that calling zone() on any subclass of Cell unconditionally looks at the first word of the aligned 4K region: i.e. arenaHeader()->zone. When GCThings are allocated in the Nursery this is not necessarily going to be a Zone* without doing extra work.

There are two ways we can solve this: the first is by manually inserting a Zone* into the Nursery every 4K so that Cell::zone() works unmodified. The cost of this will be mocking up the arena layout in the Nursery. This isn't just an extra branch to detect when we overflow the 4K alignment write the Zone* out: we also need to track which Zone* is where so we can allocate things in the right zone. My previous implementation re-used ArenaLists for this with all the unfortunate complexity that entailed.

A second way we can deal with this is by modifying Cell::zone() to conditionally look somewhere else to get the Zone* when the GCThing is in the arena. The mechanism we would want to use to do this (and if we need to at all) is determined by the type of the GCThing: therefore this patch, which will allow us to specialize the zone() access based on the type.

Naturally, which of these methods we want to end up using is going to depend on who uses zone(), how frequently, and in what contexts. At present, the main consumer of zone() is the check for needsBarrier() in the pre barriers. For IonMonkey, we bake this value directly into the code and completely avoid the need to do the check unless we are actually inside an incremental GC. Thus, my intuition is that the cost of zone() probably doesn't matter as much as the allocation cost for hot code. Secondarily, if we do find hot accesses to this method we are able to simply pass the zone deeper manually to solve the performance issues; with the first approach, our allocator just has unavoidable extra overhead that we cannot spot-fix.

(In reply to Bill McCloskey (:billm) from comment #27)
> I don't understand what this patch accomplishes. Can you explain more? I can
> see that calling zone() isn't going to work on nursery-allocated objects and
> strings. But the patch doesn't seem to fix that, as far as I can tell.

Now as to your actual question: the reason that JSObject::zone() is not currently specialized is that if we just naively implement it as |return shape_->zone();| then we crash during finalization. The reason for this is that we end up in the pre barrier during the finalizer somehow and the shape is in an already released arena. I think, however, that these pre barriers are entirely spurious -- we're inside the GC so needsBarrier is going to be false anyway. We should be able to just rewrite any finalizers that attempt this to simply not trigger the barrier and add an assertion to ensure that it doesn't regress.

For the GGC prototype I just hacked around this in the easiest way possible. You are probably right, however, that it would make more sense to attempt a real solution /before/ adding the ugly patch that does nothing useful.
Attachment #709296 - Flags: checkin+
Try run is at:
https://tbpl.mozilla.org/?tree=Try&rev=84dcbe8dc359
Attachment #707921 - Attachment is obsolete: true
Attachment #707921 - Flags: review?(wmccloskey)
Attachment #711119 - Flags: review?(wmccloskey)
Attachment #707391 - Flags: checkin+
Comment on attachment 711119 [details] [diff] [review]
v3: Actually make the shape_->zone() call.

Reworking based on our IRL conversation.
Attachment #711119 - Flags: review?(wmccloskey)
Depends on: 839673
Target Milestone: --- → mozilla21
Depends on: 839215
Depends on: 841059
Depends on: 841065
I'm a little confused by comments 32 and 34.  Is there a patch up for review?
(In reply to Paul Wright from comment #35)
> I'm a little confused by comments 32 and 34.  Is there a patch up for review?

It moved to the blocking bug 841059.
Depends on: 842424
Depends on: 843907
Depends on: 844932
Depends on: 845573
Depends on: 847093
Depends on: 847579
Depends on: 847698
Depends on: 848199
Depends on: 848595
Depends on: 848599
Depends on: 848601
Depends on: 848608
Depends on: 848612
Depends on: 849068
Depends on: 849453
Attachment #711119 - Attachment is obsolete: true
Posted patch v0 (obsolete) — Splinter Review
This is green on jsapi-tests, jsreftests, and jit-tests alone and with JS_GC_ZEAL=11 and on the first 1000 tests in jit-tests --valgrind-all.

To use it, configure with "--enable-gcgenerational --enable-exact-rooting" and run the shell with "--ggc --no-jm --no-ion". To see basic statistics about Nursery and StoreBuffer use, set the environment variable JS_GC_STATS to contain one or both of the strings "nursery" and "storebuffer".
Attachment #723023 - Flags: review?(wmccloskey)
Depends on: 849536
Comment on attachment 723023 [details] [diff] [review]
v0

Review of attachment 723023 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/gc/Nursery.cpp
@@ +87,5 @@
> +    if (position() + sizeof(NurseryTag) + thingSize > nextArenaStart) {
> +        ArenaHeader *nextArena = reinterpret_cast<ArenaHeader *>(nextArenaStart);
> +        nextArena->setAsNotAllocated();
> +        nextArena->init(reinterpret_cast<Zone *>(Arena::PoisonedZone), AllocKind(254));
> +        position_ = reinterpret_cast<void *>(nextArenaStart + sizeof(ArenaHeader));

To reiterate discussion from IRC, these fake arena headers and PoisonedZone should be removed.  (a) Ion can't do this, and (b) structuring data differently in debug vs non-debug builds is problematic and can easily lead to crashes in one but not the other.  Since Cell::runtime() works on nursery things, Cell::arenaHeader() can just assert that it is not in the nursery rather than look for a poisoned zone pointer.

@@ +96,5 @@
> +
> +    /* Insert a tag to make our nursery walkable and checkable. */
> +    NurseryTag *tag = reinterpret_cast<NurseryTag *>(position_);
> +    *tag = NurseryTag(thingSize, thingKind);
> +    position_ = reinterpret_cast<void *>(tag + 1);

Ditto (b) above.  NurseryTag should be completely removed.  The assertions in Nursery::moveToTenured can use information from the object's shape instead.
Terrence, can you incorporate Brian's changes and post a new patch? Thanks.
Posted patch few fixesSplinter Review
A few more fixes.  fixupAfterMove had a couple bugs where it would try to reuse the fixed elements/slots of the destination inappropriately.  obj->slots should never point to obj->fixedSlots(), it is either NULL or a dynamically allocated pointer (malloc'ed or, in the nursery, a nursery allocated buffer).  obj->elements can point to obj->fixedElements() but only when there is enough space (this fix is pessimistic in that it will never use fixedElements() for objects moved from the nursery; this could be improved).  Also, there was a bug in the store buffer capacity which caused the SlotEdge buffer to overflow in non-JS_GC_ZEAL (i.e. non-DEBUG) builds.  It would be very good if the store buffer size did not depend on JS_GC_ZEAL, for the same reasons as (b) above.  What prevents this?
(In reply to Bill McCloskey (:billm) from comment #39)
> Terrence, can you incorporate Brian's changes and post a new patch? Thanks.

Yup, I'm folding these in with the rebase over all the changes requested in the prerequisite patches and splitting out a few more independent bits for review.

(In reply to Brian Hackett (:bhackett) from comment #40)
> obj->slots should never point to obj->fixedSlots(), it is either NULL or a
> dynamically allocated pointer (malloc'ed or, in the nursery, a nursery
> allocated buffer).  obj->elements can point to obj->fixedElements() but only
> when there is enough space (this fix is pessimistic in that it will never
> use fixedElements() for objects moved from the nursery; this could be
> improved).

Ah, thanks, that's exactly the feedback I wanted. You get bonus points for attaching a patch with it :-).

>  Also, there was a bug in the store buffer capacity which caused
> the SlotEdge buffer to overflow in non-JS_GC_ZEAL (i.e. non-DEBUG) builds.
> It would be very good if the store buffer size did not depend on JS_GC_ZEAL,
> for the same reasons as (b) above.  What prevents this?

Yeah, I kept tweaking these values and must have missed that in one of my edits. The reason for the size dependence on zeal is that the verifier does not handle store buffer overflow elegantly -- it just disables the verification for that cycle. Upping the store buffer size allowed more tests to fail on verification. The repeated tweaking was because the relocatable buffers have O(n^2) compaction: I had to lower their size several times -- as well as moving several RelocatablePtrs into the generic buffer -- to keep too many tests from timing out.
Depends on: 850749
Depends on: 850841
Depends on: 850842
Depends on: 850849
Depends on: 850922
Depends on: 850954
Depends on: 851057
Posted patch v1 (obsolete) — Splinter Review
I've applied Brian's changes and split off a bunch of bits into separate patches that should be easier to review alone.

Also, I had to change --ggc to --no-ggc because we don't have control of the shell args we run with on tinderbox. I figure if you are building with --enable-gcgenerational you probably want ggc anyway.
Attachment #723023 - Attachment is obsolete: true
Attachment #723023 - Flags: review?(wmccloskey)
Attachment #724984 - Flags: review?(wmccloskey)
Depends on: 852802
Comment on attachment 724984 [details] [diff] [review]
v1

Review of attachment 724984 [details] [diff] [review]:
-----------------------------------------------------------------

This looks pretty good. The big changes I'd like: (1) the nursery collection code is pretty spread out, and I'd like it to be in Nursery.cpp, (2) the fallback marking code needs to be fixed (see below).

::: js/public/GCAPI.h
@@ +23,5 @@
>      D(DEBUG_GC)                                 \
>      D(DEBUG_MODE_GC)                            \
>      D(TRANSPLANT)                               \
>      D(RESET)                                    \
> +    D(OUT_OF_NURSERY)                           \

This kinda screws up our existing telemetry numbers, but I guess it's inevitable. Could you add a bunch of dummies here so that we have room to grow?

::: js/src/builtin/ParallelArray.cpp
@@ +1178,5 @@
>  
>      // ParallelArray objects are frozen, so mark it as non-extensible here.
>      Shape *empty = EmptyShape::getInitialShape(cx, &class_,
>                                                 result->getProto(), result->getParent(),
> +                                               gc::GetGCObjectKind(&class_),

This is no longer needed.

::: js/src/gc/Marking.cpp
@@ +106,5 @@
>  #endif
>  
> +/* This is true when marking inside of a minor collection. */
> +static JS_ALWAYS_INLINE bool
> +IsGCMinorTracer(JSTracer *jstrc)

I think IsMinorGCTracer sounds better.

@@ +173,5 @@
>  
> +#ifdef JSGC_GENERATIONAL
> +    Nursery &nursery = thing->runtime()->gcNursery;
> +    if (IsGCMinorTracer(trc)) {
> +        if (nursery.isInside(*thingp) && nursery.getForwardedPointer(thing, thingp))

Can you find a way to do this inside the minor tracer instead? It seems cleaner to me. You can skip over CheckMarkedThing if the thing is in the nursery.

@@ +182,5 @@
> +         * MinorCollectionTracer because of a pre-barrier. The pre-barrier is
> +         * not needed in this case because we perform a minor collection before
> +         * each incremental slice.
> +         */
> +        if (!trc->callback && nursery.isInside(thing))

Fold this check into the trc->callback condition below.

::: js/src/gc/Marking.h
@@ +281,5 @@
>      MarkIonCode(trc, code, name);
>  }
>  
> +inline void
> +Mark(JSTracer *trc, JSObject **o, const char *name)

What's this for? Also, please use obj instead of o.

@@ +287,5 @@
> +    MarkObjectUnbarriered(trc, o, name);
> +}
> +
> +inline void
> +Mark(JSTracer *trc, ScopeObject **o, const char *name)

...and this one?

::: js/src/gc/Nursery-inl.h
@@ +22,5 @@
> +class RelocationOverlay
> +{
> +    friend struct MinorCollectionTracer;
> +
> +    const static uintptr_t RELOCATED = uintptr_t(0x4E10CA7E4E10CA7E);

Could you make this some recognizable value that's odd? Something like 0xbadbeef. Also, the name needn't be in uppercase.

@@ +48,5 @@
> +        JS_ASSERT(isForwarded());
> +        return newLocation_;
> +    }
> +
> +    void forward(void *addr) {

Why not have this take a Cell?

::: js/src/gc/Nursery.cpp
@@ +39,5 @@
> +    JSRuntime **rtp = static_cast<JSRuntime **>(end_);
> +    *rtp = rt;
> +#ifdef DEBUG
> +    memset(start_, FreshNursery, NurserySize - sizeof(JSRuntime*));
> +#endif

See the comment below about Nursery::Layout. Once we have that, you can say
  start_ = position_ = layout->data;
  end_ = &layout->data + 1;
  layout->runtime = rt;
  JS_POISON(layout->data, FreshNursery, sizeof(layout->data));

@@ +63,5 @@
> +    disable();
> +}
> +
> +void *
> +js::gc::Nursery::allocate(AllocKind thingKind, size_t thingSize)

I think there should be a low-level and a high-level version of this function. The low-level one wouldn't take thingKind and wouldn't increment stats.allocs. The high-level one would. allocateSlots should use the low-level one. Otherwise we'll have a buffer overflow in the allocs array, since it's not big enough to hold NurserySlotsKind. Also, you already have slotAllocs, so putting that in allocs seems redundant. Then you should remove NurserySlotsKind completely.

@@ +79,5 @@
> +#endif
> +
> +    void *thing = position_;
> +
> +    position_ = reinterpret_cast<void *>(position() + thingSize);

No need for the cast.

@@ +85,5 @@
> +    return thing;
> +}
> +
> +HeapSlot *
> +js::gc::Nursery::allocateSlots(Allocator *alloc, JSObject *obj, size_t nSlots)

We normally say nslots instead of nSlots.

@@ +107,5 @@
> +    return allocateHugeSlots(alloc, nSlots);
> +}
> +
> +HeapSlot *
> +js::gc::Nursery::reallocateSlots(Allocator *alloc, JSObject *obj, HeapSlot *oldSlots,

This should have a comment that it's used for both slots and elements.

@@ +114,5 @@
> +    if (!isInside(obj)) {
> +        HeapSlot *newSlots = oldSlots;
> +        if (newCount > oldCount) {
> +            newSlots = (HeapSlot *)alloc->realloc_(oldSlots, oldCount * sizeof(HeapSlot),
> +                                                  newCount * sizeof(HeapSlot));

Ident is wrong.

@@ +116,5 @@
> +        if (newCount > oldCount) {
> +            newSlots = (HeapSlot *)alloc->realloc_(oldSlots, oldCount * sizeof(HeapSlot),
> +                                                  newCount * sizeof(HeapSlot));
> +        }
> +        return newSlots;

I think this will break shrinkSlots. I suggest you change MallocProvider<T>::realloc_(p, old, new) so that new < old is allowed. In that case, we should just do what realloc_(p, new) does (i.e., update malloc counter only if p is null). If you do that, then you can unconditionally call realloc_ here and everything should work.

@@ +126,5 @@
> +         * allocated memory, however, we still want to return the unused slots
> +         * to the system.
> +         */
> +        size_t oldSize = oldCount * sizeof(HeapSlot);
> +        size_t newSize = newCount * sizeof(HeapSlot);

It would be simpler to move these two definitions to the top of the function. They're used in almost all the cases.

@@ +129,5 @@
> +        size_t oldSize = oldCount * sizeof(HeapSlot);
> +        size_t newSize = newCount * sizeof(HeapSlot);
> +        void *newBuffer = newCount < oldCount
> +                          ? js_realloc(oldSlots, newSize)
> +                          : alloc->realloc_(oldSlots, oldSize, newSize);

If you make the change above, you don't need this disjunction.

@@ +136,5 @@
> +            JS_ASSERT(hugeSlots.has(oldSlots));
> +            hugeSlots.remove(oldSlots);
> +            hugeSlots.put(newSlots);
> +        }
> +        JS_ASSERT(hugeSlots.has(newSlots));

If hugeSlots.put OOMs, this assert will fail. So don't bother with the assert.

@@ +150,5 @@
> +    return newSlots;
> +}
> +
> +HeapSlot *
> +js::gc::Nursery::allocateHugeSlots(Allocator *alloc, size_t nSlots)

nslots

@@ +154,5 @@
> +js::gc::Nursery::allocateHugeSlots(Allocator *alloc, size_t nSlots)
> +{
> +    METER(++stats.hugeAllocs);
> +    HeapSlot *slots = alloc->pod_malloc<HeapSlot>(nSlots);
> +    hugeSlots.put(slots);

Maybe put a comment here that we ignore failure because it will only cause a leak.

@@ +167,5 @@
> +}
> +
> +void
> +js::gc::Nursery::moveToTenured(MinorCollectionTracer *trc, JSObject *dst, JSObject *src,
> +                                    AllocKind thingKind, size_t thingSize)

Indent.

@@ +190,5 @@
> +void
> +js::gc::Nursery::fixupNewlyTenured(MinorCollectionTracer *trc)
> +{
> +    RelocationOverlay *p = trc->head;
> +    while (p) {

Use a for loop here.

::: js/src/gc/Nursery.h
@@ +16,5 @@
> +
> +struct JSCompartment;
> +
> +namespace js {
> +namespace gc {

I don't think it makes sense for this stuff to go in namespace gc. In the future, I'd like to reserve that for stuff that's internal to the gc and not used by the rest of the engine. Maybe MinorCollectionTracer could go there, but not Nursery.

@@ +52,5 @@
> +{
> +    /* Pointer to the start of the Nursery memory. */
> +    void *start_;
> +    void *position_;
> +    void *end_;

It seems more natural to me for these to have type char*. That way it's clear that all pointer arithmetic is over bytes.

I think it would be nice to declare a struct here sort of like the following:

const static size_t NurseryUsableSize = ChunkSize - sizeof(JSRuntime *);

struct Layout {
    char data[NurseryUsableSize];
    JSRuntime *runtime;
};

and then do a static assert somewhere that sizeof(Layout) == NurserySize.

@@ +67,5 @@
> +    const static uint8_t SweptNursery = 0x2b;
> +    const static uint8_t AllocatedThing = 0x2c;
> +
> +#ifdef DEBUG
> +    mutable struct Stats

I don't really like metering, but if this helps you, we can keep it in for a little while. After that we should fold the useful parts into gc/Statistics and delete the rest.

@@ +157,5 @@
> +
> +    /*
> +     * Frees all non-live nursery-allocated things at the end of a minor
> +     * collection. This operation takes time proportional to the number of
> +     * live things.

The last sentence doesn't seem true. Freeing huge slots takes time proportional to the number of dead things. Maybe just remove this sentence.

@@ +167,5 @@
> +#endif
> +};
> +
> +} /* namespace js */
> +} /* namespace gc */

These are flipped.

::: js/src/gc/StoreBuffer.h
@@ +103,5 @@
> +    void mark(JSTracer *trc) {
> +        Key prior = key;
> +        if (!map->has(key))
> +            return;
> +        ValueType value = map->lookup(key)->value;

It would be quicker to call lookup without calling has and then then check the result.

::: js/src/jscntxt.cpp
@@ +416,5 @@
>          /* Unpin all common names before final GC. */
>          FinishCommonNames(rt);
>  
>          /* Clear debugging state to remove GC roots. */
> +        GCMinor(rt, gcreason::LAST_CONTEXT);

What's this for?

::: js/src/jscntxt.h
@@ +827,5 @@
>      /* Whether the currently running GC can finish in multiple slices. */
>      int                 gcIsIncremental;
>  
> +    /* Whether the currently running GC is a minor collection. */
> +    int                 gcIsMinor;

Why is this an int and not a bool? I don't know why gcIsIncremental is this way. Try changing that one too.

::: js/src/jscompartment.h
@@ +145,5 @@
>      inline void *onOutOfMemory(void *p, size_t nbytes);
>      inline void updateMallocCounter(size_t nbytes);
>      inline void reportAllocationOverflow();
> +
> +    inline js::HeapSlot *allocateSlots(JSObject *obj, size_t nSlots);

nslots

::: js/src/jscompartmentinlines.h
@@ +51,5 @@
>      return arenas.parallelAllocate(zone, thingKind, thingSize);
>  }
>  
> +inline js::HeapSlot *
> +js::Allocator::allocateSlots(JSObject *obj, size_t nSlots)

nslots

@@ +54,5 @@
> +inline js::HeapSlot *
> +js::Allocator::allocateSlots(JSObject *obj, size_t nSlots)
> +{
> +#ifdef JSGC_GENERATIONAL
> +    return zone->rt->gcNursery.allocateSlots(this, obj, nSlots);

nslots

::: js/src/jsgc.cpp
@@ +780,5 @@
>      JSRuntime *rt = zone->rt;
> +#ifdef JSGC_GENERATIONAL
> +    JS_ASSERT_IF(rt->gcMaxBytes != ~size_t(0), rt->gcMaxBytes < ~size_t(0) - Nursery::NurserySize - ArenaSize);
> +#endif
> +    if (JS_UNLIKELY(rt->gcMaxBytes != ~size_t(0) && rt->gcBytes >= rt->gcMaxBytes))

As we discussed, I'd like this to be as before except that we shouldn't return NULL if we're doing a nursery collection.

@@ +1869,5 @@
>      JS_ASSERT(started);
> +    Cell *cell = static_cast<Cell *>(p);
> +    if (!cell->isTenured())
> +        return;
> +    JS_ASSERT(cell->tenuredZone()->isCollecting());

Use JS_ASSERT_IF please.

@@ +4685,5 @@
> +        size_t gcBytes = rt->gcBytes;
> +        rt->gcBytes = 0;
> +        t = ArenaLists::refillFreeListDuringGC(zone, thingKind);
> +        JS_ASSERT(t);
> +        rt->gcBytes += gcBytes;

With the change I suggested above, you can just MOZ_CRASH if this fails (since we won't return null because of gcMaxBytes). I don't think we have any acceptable way to handle an actual OOM at this point. Is that true?

@@ +4700,5 @@
> +{
> +    /*
> +     * This cell iteration may or may not skip any new cells that we allocate
> +     * while iterating, depending on whether the allocator grabs a new arena
> +     * or-reuses a partially empty arena. This is fine because we will be

No dash here.

@@ +4710,5 @@
> +        if (IsAtomsCompartment(comp))
> +            continue;
> +
> +        if (comp->watchpointMap)
> +            comp->watchpointMap->markAll(trc);

What's this for? Won't this already happen in MarkRuntime?

@@ +4716,5 @@
> +        for (size_t i = 0; i < FINALIZE_LIMIT; ++i) {
> +            AllocKind kind = AllocKind(i);
> +
> +            for (CellIterUnderGC cells(comp, kind); !cells.done(); cells.next())
> +                JS_TraceChildren(trc, cells.getCell(), MapAllocToTraceKind(kind));

Unfortunately I don't think this will work. CellIter uses the freelist to iterate over the cells. The freelist can get overwritten by objects if we allocate, and that would totally screw up the iteration.

However, I had an idea for an alternate way to do this fallback marking. We can store a bitmap that has 1 bit for every 8 bytes in the nursery. I think we're guaranteed that nursery objects will always start at 8-byte boundaries. The fallback marking would iterate over the major heap and set the bit for every object it finds. Once that's done, we would iterate over the bitmap and evacuate those objects from the nursery. I think this would be a lot simpler than having to deal with simultaneously allocating and iterating over the heap. For an 8MB nursery the bitmap would be 128KB. We would preallocate it along with the nursery.

@@ +4740,5 @@
> +         * We do not know the number of inline elements that the array was
> +         * initially allocated with. Instead we copy into a new, more optimal size
> +         * class.
> +         */
> +        thingSize = sizeof(ObjectImpl);

If the array has elements stored in the fixed slots, they won't get copied over properly. This isn't a problem right now because of a bug in ObjectImpl::fixupAfterMove, but once that is fixed you'll need to pass in the right size here.

@@ +4742,5 @@
> +         * class.
> +         */
> +        thingSize = sizeof(ObjectImpl);
> +    } else {
> +        thingKind = src->getAllocKind();

Now that this is the only call the getAllocKind, that code can be moved to Nursery.cpp and integrated with the array handling above.

@@ +4764,5 @@
> +#ifdef JSGC_GENERATIONAL
> +    MinorCollectionTracer *trc = static_cast<MinorCollectionTracer *>(jstrc);
> +    Cell *cell = static_cast<Cell *>(*thingp);
> +
> +    if (!cell || trc->nursery->isInside(thingp) || !trc->nursery->isInside(cell))

I don't think we ever trace NULL.

@@ +4775,5 @@
> +#endif
> +}
> +
> +void
> +js::GCMinor(JSRuntime *rt, gcreason::Reason reason)

Is it possible to put all the new minor collection code in gc/Nursery.cpp? If you need any additional definitions from jsgc.cpp, try adding them to GCInternals.h.

@@ +4805,5 @@
> +            priorPoke(rt->gcPoke)
> +        {
> +            runtime->gcIsMinor = true;
> +            rt->gcNumber++;
> +            trc = js_new<MinorCollectionTracer>();

Why does this need to be allocated on the heap?

@@ +4810,5 @@
> +            trc->nursery = &rt->gcNursery;
> +            trc->head = NULL;
> +            trc->tail = &trc->head;
> +            trc->doingFallbackMarking = false;
> +            trc->eagerlyTraceWeakMaps = TraceWeakMapKeysValues;

This stuff should all be part of the constructor for the MinorCollectionTracer.

@@ +4819,5 @@
> +            js_delete(trc);
> +            runtime->gcIsFull = priorIsFull;
> +            runtime->gcIsIncremental = priorIsIncremental;
> +            runtime->gcShouldCleanUpEverything = priorShouldCleanUpEverything;
> +            runtime->gcPoke = priorPoke;

Most of these flags aren't getting set, so why are you restoring them?

@@ +4825,5 @@
> +        }
> +
> +        MinorCollectionTracer *tracer() const { return trc; }
> +    };
> +    AutoMinorGCSession session(rt);

The auto class here seems like overkill since there are no early exits from this function.

@@ +4836,5 @@
> +        trc->doingFallbackMarking = false;
> +    } else {
> +        trc->runtime->gcStoreBuffer.mark(trc);
> +    }
> +    rt->gcNursery.fixupNewlyTenured(trc);

As Brian mentioned, this should have a name that signifies that most of the work takes place here.

@@ +4846,5 @@
> +        c->sweepCrossCompartmentWrappers();
> +    }
> +
> +    for (CompartmentsIter c(rt); !c.done(); c.next())
> +        ArrayBufferObject::sweep(c);

Can you explain why you don't need to sweep other stuff like watchpoints and the debugger?

@@ +4852,5 @@
> +    rt->gcNursery.sweep(rt->defaultFreeOp());
> +    rt->gcStoreBuffer.clear();
> +
> +    /* See comment in AllocateDuringMinorGC. */
> +    if (JS_UNLIKELY(rt->gcMaxBytes != ~size_t(0) && rt->gcBytes >= rt->gcMaxBytes)) {

No need for this ~size_t(0) thing.

::: js/src/jsgc.h
@@ +635,5 @@
>  extern void
>  PrepareForDebugGC(JSRuntime *rt);
>  
> +extern void
> +GCMinor(JSRuntime *rt, gcreason::Reason reason);

I'd prefer the name MinorGC.

::: js/src/jsgcinlines.h
@@ +535,5 @@
> +                JS_ASSERT(t);
> +                return t;
> +            }
> +        }
> +    }

Could you put everything inside the ShouldAllocate test into a separate function, maybe TryNewNurseryGCThing?

::: js/src/jsobj.cpp
@@ +2387,5 @@
>          }
>      }
>  
>      if (!oldCount) {
> +        obj->slots = cx->zone()->allocator.allocateSlots(obj, newCount);

The zone's Allocator won't report OOM while the JSContext will. The only place where these calls don't have a JSContext is JSObject::growElements(js::Allocator*, ...), and that method is only used by the RiverTrail code. Since there's no plan to use generational GC with RiverTrail in the near future, I suggest that you assert in that method that the object is not in the nursery and keep its existing call. For all the others, just pass a JSContext rather than using this allocator business.

@@ +2712,5 @@
>      if (hasDynamicElements()) {
>          uint32_t oldAllocated = oldcap + ObjectElements::VALUES_PER_HEADER;
>          newheader = (ObjectElements *)
> +            alloc->reallocateSlots(this, (HeapSlot *)getElementsHeader(), oldAllocated,
> +                                   newAllocated);

This one...

@@ +2717,4 @@
>          if (!newheader)
>              return false;  /* Leave elements as its old size. */
>      } else {
> +        newheader = (ObjectElements *) alloc->allocateSlots(this, newAllocated);

...and this one being the RiverTrail ones that I mentioned above.

::: js/src/jstypedarrayinlines.h
@@ +192,5 @@
> +    void mark(JSTracer *trc) {
> +        /* Update obj's private to point to the moved buffer's array data. */
> +        MarkObjectUnbarriered(trc, &buffer, "ArrayBufferObject");
> +        MarkObjectUnbarriered(trc, &obj, "TypedArray");
> +        obj->initPrivate(buffer->dataPointer() + byteOffset);

I don't think this is right. Since it was added to the store buffer, the byteOffset could have changed. You need to read it out before calling the mark functions (by subtracting the private pointer from buffer->dataPointer()) and then write it back afterward.

::: js/src/vm/GlobalObject.cpp
@@ +383,5 @@
>      RootedObject intrinsicsHolder(cx);
>      if (cx->runtime->isSelfHostingGlobal(self)) {
>          intrinsicsHolder = self;
>      } else {
> +        intrinsicsHolder = NewObjectWithClassProto(cx, &ObjectClass, NULL, self, MaybeSingletonObject);

This should probably go in a separate patch.

::: js/src/vm/ObjectImpl-inl.h
@@ +405,5 @@
>  
> +#ifdef JSGC_GENERATIONAL
> +/* static */ inline void
> +js::ObjectImpl::fixupAfterMove(JSRuntime *rt, ObjectImpl *src, HeapSlot **ownedSlotsp,
> +                               HeapSlot **ownedElementsp)

I think this should be a function in Nursery.cpp. If you need access to private stuff, add some accessors or use friend or something.

@@ +408,5 @@
> +js::ObjectImpl::fixupAfterMove(JSRuntime *rt, ObjectImpl *src, HeapSlot **ownedSlotsp,
> +                               HeapSlot **ownedElementsp)
> +{
> +    /* Update, allocate, or reclaim from the nursery our elements pointer. */
> +    if (rt->gcNursery.isInside(elements)) {

I might be wrong, but if you have an array that stores its elements in fixed slots, won't this be true? (When an array uses its fixed slots for elements, |array->elements == array->fixedElements()|.) And in that case, it seems like you're needlessly mallocing slots.

@@ +445,5 @@
> +        *ownedSlotsp = slots;
> +    }
> +
> +    /* The shape's list head may point into the old object. */
> +    if (&src->shape_ == shape_->listp) {

No braces.

::: js/src/vm/ScopeObject.cpp
@@ +1672,5 @@
>          js_ReportOutOfMemory(cx);
>          return false;
>      }
>  
> +    HashTableWriteBarrierPost(cx->runtime, scopes->unbarrieredProxyScopes(), &scope);

You moved this to a different bug.
Attachment #724984 - Flags: review?(wmccloskey)
Also, although threading a linked list through the evacuated objects is pretty clever, I'm worried that it's bad for performance. It would be nice if we could iterate over the newly tenured objects sequentially. To do that, we'd need to allocate them in fresh arenas, the way we do during background finalization. This is less space-efficient, but it's probably warranted.

I think it's fine to keep things the way they are now, but we probably should experiment with this after the patch lands.
Depends on: 853939
Depends on: 854029
Depends on: 854051
Depends on: 857345
Depends on: 857706
Depends on: 859512
Posted patch v2Splinter Review
(In reply to Bill McCloskey (:billm) from comment #43)
> Comment on attachment 724984 [details] [diff] [review]
> v1
> 
> Review of attachment 724984 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> This looks pretty good. The big changes I'd like: (1) the nursery collection
> code is pretty spread out, and I'd like it to be in Nursery.cpp, (2) the
> fallback marking code needs to be fixed (see below).

> ::: js/src/gc/Marking.h
> @@ +281,5 @@
> >      MarkIonCode(trc, code, name);
> >  }
> >  
> > +inline void
> > +Mark(JSTracer *trc, JSObject **o, const char *name)
> 
> What's this for? Also, please use obj instead of o.

HashKeyRef (used by HashTablePostBarrier) is templated on the Key, so can't call
one of the specific Markers. This variant is for WeakMap's HashKeyRef
instantiation. I've added a comment to the method.

> @@ +287,5 @@
> > +    MarkObjectUnbarriered(trc, o, name);
> > +}
> > +
> > +inline void
> > +Mark(JSTracer *trc, ScopeObject **o, const char *name)
> 
> ...and this one?

Like above, but for proxiedScopes.

> @@ +85,5 @@
> > +    return thing;
> > +}
> > +
> > +HeapSlot *
> > +js::gc::Nursery::allocateSlots(Allocator *alloc, JSObject *obj, size_t nSlots)
> 
> We normally say nslots instead of nSlots.

My fingers apparently think in CamelCase.

> @@ +52,5 @@
> > +{
> > +    /* Pointer to the start of the Nursery memory. */
> > +    void *start_;
> > +    void *position_;
> > +    void *end_;
> 
> It seems more natural to me for these to have type char*. That way it's
> clear that all pointer arithmetic is over bytes.

I made these void* to ensure that the compiler's alias analysis doesn't bite us.
Brian switched these to uintptr_t; if we're not going to use void*, this is
probably the better type as it gives the same semantics as char* and saves
several casts at the same type.

> @@ +79,5 @@
> > +#endif
> > +
> > +    void *thing = position_;
> > +
> > +    position_ = reinterpret_cast<void *>(position() + thingSize);
> 
> No need for the cast.

position_ was void* and position() is uintptr_t. I've changed these all to
actually be uintptr_t per above.

> @@ +4685,5 @@
> > +        size_t gcBytes = rt->gcBytes;
> > +        rt->gcBytes = 0;
> > +        t = ArenaLists::refillFreeListDuringGC(zone, thingKind);
> > +        JS_ASSERT(t);
> > +        rt->gcBytes += gcBytes;
> 
> With the change I suggested above, you can just MOZ_CRASH if this fails
> (since we won't return null because of gcMaxBytes). I don't think we have
> any acceptable way to handle an actual OOM at this point. Is that true?

No, we do not. I see a number of ways we could mitigate this to some degree, but
none of them fully solve the problem and all of them are quite troublesome to
implement.

> @@ +67,5 @@
> > +    const static uint8_t SweptNursery = 0x2b;
> > +    const static uint8_t AllocatedThing = 0x2c;
> > +
> > +#ifdef DEBUG
> > +    mutable struct Stats
> 
> I don't really like metering, but if this helps you, we can keep it in for a
> little while. After that we should fold the useful parts into gc/Statistics
> and delete the rest.

It was mostly for sanity checking that everything was actually working in
practice like I imagined it all in my head. I think I've verified that to my
satisfaction, so I'll pull it out now.

> ::: js/src/jscntxt.cpp
> @@ +416,5 @@
> >          /* Unpin all common names before final GC. */
> >          FinishCommonNames(rt);
> >  
> >          /* Clear debugging state to remove GC roots. */
> > +        GCMinor(rt, gcreason::LAST_CONTEXT);
> 
> What's this for?

The following loop calls JSCompartment::clearTraps for each compartment.
clearTraps works by manually iterating each compartment's GCThings using a
CellIter, so we need to evict the nursery first. I moved the GC call up above
the loop head to avoid calling MinorGC in a loop. I think a better solution is
probably to specialize MinorGC on an empty nursery to return immediately.

> @@ +157,5 @@
> > +
> > +    /*
> > +     * Frees all non-live nursery-allocated things at the end of a minor
> > +     * collection. This operation takes time proportional to the number of
> > +     * live things.
> 
> The last sentence doesn't seem true. Freeing huge slots takes time
> proportional to the number of dead things. Maybe just remove this sentence.

Doh! I swear I was thinking "dead things" when I wrote that. I wanted to point
it out specially because it's a different complexity constraint than the rest of
GGC.

> ::: js/src/jscntxt.h
> @@ +827,5 @@
> >      /* Whether the currently running GC can finish in multiple slices. */
> >      int                 gcIsIncremental;
> >  
> > +    /* Whether the currently running GC is a minor collection. */
> > +    int                 gcIsMinor;
> 
> Why is this an int and not a bool? I don't know why gcIsIncremental is this
> way. Try changing that one too.

You've accurately divined my reasoning: gcIsIncremental was suspicious looking
enough that it made me intensly uneasy. Since it seems I'm not actually missing
some vitally important reason behind the type, I'll change them both to bool.

> ::: js/src/gc/StoreBuffer.h
> @@ +103,5 @@
> > +    void mark(JSTracer *trc) {
> > +        Key prior = key;
> > +        if (!map->has(key))
> > +            return;
> > +        ValueType value = map->lookup(key)->value;
> 
> It would be quicker to call lookup without calling has and then then check
> the result.

Good catch. I initially had that, but changed it when trying to adapt the same
code to work with MapObject, which doesn't have Ptr. Also, I think at the time
DebuggerWeakMap, which inherits WeakMap privately, wasn't re-exporting Ptr.

> @@ +4710,5 @@
> > +        if (IsAtomsCompartment(comp))
> > +            continue;
> > +
> > +        if (comp->watchpointMap)
> > +            comp->watchpointMap->markAll(trc);
> 
> What's this for? Won't this already happen in MarkRuntime?

Huh, I copied that verbatim from the marking we do for the post-barrier
verifier. I'm not sure what the verifier is doing: it doesn't call MarkRuntime
even. In any case, I don't think manually including watchpoint marking is ever
going to be the right solution, so I'll just remove it from both.

> @@ +4805,5 @@
> > +            priorPoke(rt->gcPoke)
> > +        {
> > +            runtime->gcIsMinor = true;
> > +            rt->gcNumber++;
> > +            trc = js_new<MinorCollectionTracer>();
> 
> Why does this need to be allocated on the heap?

Good point. I used to have this logic in separate prepare/destroy functions, but
didn't update after moving to RAII.

> @@ +4810,5 @@
> > +            trc->nursery = &rt->gcNursery;
> > +            trc->head = NULL;
> > +            trc->tail = &trc->head;
> > +            trc->doingFallbackMarking = false;
> > +            trc->eagerlyTraceWeakMaps = TraceWeakMapKeysValues;
> 
> This stuff should all be part of the constructor for the
> MinorCollectionTracer.

Right, that's a nice simplification.

> @@ +4819,5 @@
> > +            js_delete(trc);
> > +            runtime->gcIsFull = priorIsFull;
> > +            runtime->gcIsIncremental = priorIsIncremental;
> > +            runtime->gcShouldCleanUpEverything = priorShouldCleanUpEverything;
> > +            runtime->gcPoke = priorPoke;
> 
> Most of these flags aren't getting set, so why are you restoring them?

I was thinking we would need to call BeginMarkPhase, etc, but that was totally
spurious.

> @@ +4836,5 @@
> > +        trc->doingFallbackMarking = false;
> > +    } else {
> > +        trc->runtime->gcStoreBuffer.mark(trc);
> > +    }
> > +    rt->gcNursery.fixupNewlyTenured(trc);
> 
> As Brian mentioned, this should have a name that signifies that most of the
> work takes place here.

After rewriting it as a for loop it's only 4 lines; I've inlined it, as that
seems clearer.

> @@ +4716,5 @@
> > +        for (size_t i = 0; i < FINALIZE_LIMIT; ++i) {
> > +            AllocKind kind = AllocKind(i);
> > +
> > +            for (CellIterUnderGC cells(comp, kind); !cells.done(); cells.next())
> > +                JS_TraceChildren(trc, cells.getCell(), MapAllocToTraceKind(kind));
> 
> Unfortunately I don't think this will work. CellIter uses the freelist to
> iterate over the cells. The freelist can get overwritten by objects if we
> allocate, and that would totally screw up the iteration.

Uhg, you are quite right. I was thinking that CellIter worked of the in-header
copy of the free list, but this is not so. I thought I had tested by forcing the
fallback on every MinorGC, but this is failing now, so apparently I did not.

> However, I had an idea for an alternate way to do this fallback marking. We
> can store a bitmap that has 1 bit for every 8 bytes in the nursery. I think
> we're guaranteed that nursery objects will always start at 8-byte
> boundaries. The fallback marking would iterate over the major heap and set
> the bit for every object it finds. Once that's done, we would iterate over
> the bitmap and evacuate those objects from the nursery. I think this would
> be a lot simpler than having to deal with simultaneously allocating and
> iterating over the heap. For an 8MB nursery the bitmap would be 128KB. We
> would preallocate it along with the nursery.

That is a very clever idea; wasn't too hard to implement either. There were a
few trace functions still touching the arena header, but this was
straightforward to fix. A full jit-test run passes now with
StoreBuffer::hasOverflowed returning true unconditionally, where it did not before.

> @@ +107,5 @@
> > +    return allocateHugeSlots(alloc, nSlots);
> > +}
> > +
> > +HeapSlot *
> > +js::gc::Nursery::reallocateSlots(Allocator *alloc, JSObject *obj, HeapSlot *oldSlots,
> 
> This should have a comment that it's used for both slots and elements.

Ah, yes, thanks for noticing that. I removed the comment when I added a
reallocateElements wrapper for it; however, I killed that project because I
didn't want to deal with the parallel stuff. Now that I'm doing this anyway (see
below), I've re-added the type-specialied elements versions.

> @@ +116,5 @@
> > +        if (newCount > oldCount) {
> > +            newSlots = (HeapSlot *)alloc->realloc_(oldSlots, oldCount * sizeof(HeapSlot),
> > +                                                  newCount * sizeof(HeapSlot));
> > +        }
> > +        return newSlots;
> 
> I think this will break shrinkSlots.

I did break shrinkSlots repeatedly and in many different ways when implementing
this, so I wouldn't be surprised to see I missed some case. I don't see what
you're refering to, however. Could you elaborate?

> I suggest you change
> MallocProvider<T>::realloc_(p, old, new) so that new < old is allowed. In
> that case, we should just do what realloc_(p, new) does (i.e., update malloc
> counter only if p is null). If you do that, then you can unconditionally
> call realloc_ here and everything should work.

That is a very nice simplification. However, I'm worried that changing the
malloc accounting so dramatically will have a deleterious effect on our GC
timing. How about we leave it as is and just skip updating the counter for
shrinking realloc?

> @@ +154,5 @@
> > +js::gc::Nursery::allocateHugeSlots(Allocator *alloc, size_t nSlots)
> > +{
> > +    METER(++stats.hugeAllocs);
> > +    HeapSlot *slots = alloc->pod_malloc<HeapSlot>(nSlots);
> > +    hugeSlots.put(slots);
> 
> Maybe put a comment here that we ignore failure because it will only cause a
> leak.

And above too where I removed the assert.

> ::: js/src/jsobj.cpp
> @@ +2387,5 @@
> >          }
> >      }
> >  
> >      if (!oldCount) {
> > +        obj->slots = cx->zone()->allocator.allocateSlots(obj, newCount);
> 
> The zone's Allocator won't report OOM while the JSContext will. The only
> place where these calls don't have a JSContext is
> JSObject::growElements(js::Allocator*, ...), and that method is only used by
> the RiverTrail code. Since there's no plan to use generational GC with
> RiverTrail in the near future, I suggest that you assert in that method that
> the object is not in the nursery and keep its existing call. For all the
> others, just pass a JSContext rather than using this allocator business.

Actually, ensureDenseElements and growElement(JSContext) call
growElements(Allocator), so it is called by non-rivertrail. This is why I
skipped out on doing this before. I've now templatized it and added new methods
for (re)allocateElements to hide all the nasty casting.

> @@ +4846,5 @@
> > +        c->sweepCrossCompartmentWrappers();
> > +    }
> > +
> > +    for (CompartmentsIter c(rt); !c.done(); c.next())
> > +        ArrayBufferObject::sweep(c);
> 
> Can you explain why you don't need to sweep other stuff like watchpoints and
> the debugger?

I was doing this to ensure that all of the array buffer's weak view lists got
visited. I've fixed this by making the view list head a HeapPtr.

The sweepCrossCompartmentWrappers is another "interesting" case. For it, sweep
happens to do what we need, but I think it's clearer to implement a new
markAllCrossCompartmentWrappers which does Mark instead of IsAboutToBeFinalized
to make clearer what we expect to happen.

> ::: js/src/jstypedarrayinlines.h
> @@ +192,5 @@
> > +    void mark(JSTracer *trc) {
> > +        /* Update obj's private to point to the moved buffer's array data. */
> > +        MarkObjectUnbarriered(trc, &buffer, "ArrayBufferObject");
> > +        MarkObjectUnbarriered(trc, &obj, "TypedArray");
> > +        obj->initPrivate(buffer->dataPointer() + byteOffset);
> 
> I don't think this is right. Since it was added to the store buffer, the
> byteOffset could have changed. You need to read it out before calling the
> mark functions (by subtracting the private pointer from
> buffer->dataPointer()) and then write it back afterward.

Yes, looks like I missed that fact. I'm now requerying the correct offset from
the object's reserved slots immediately before updating the value.

> @@ +4740,5 @@
> > +         * We do not know the number of inline elements that the array was
> > +         * initially allocated with. Instead we copy into a new, more optimal size
> > +         * class.
> > +         */
> > +        thingSize = sizeof(ObjectImpl);
> 
> If the array has elements stored in the fixed slots, they won't get copied
> over properly. This isn't a problem right now because of a bug in
> ObjectImpl::fixupAfterMove, but once that is fixed you'll need to pass in
> the right size here.

Right, so the problem here is that there is simply no way at all for an array
object to get its size class without the arena header. I verified with Brian
that this data is not in the shape hierarchy or anywhere else we can get to,
either convieniently or inconveniently. Since we don't know the size of src,
there is no way for us to safely copy out the existing fixed elements. On the
other hand, because this data isn't in the shape there is nothing to keep us
from using a more optimal size class when we tenure.

> @@ +408,5 @@
> > +js::ObjectImpl::fixupAfterMove(JSRuntime *rt, ObjectImpl *src, HeapSlot **ownedSlotsp,
> > +                               HeapSlot **ownedElementsp)
> > +{
> > +    /* Update, allocate, or reclaim from the nursery our elements pointer. */
> > +    if (rt->gcNursery.isInside(elements)) {
> 
> I might be wrong, but if you have an array that stores its elements in fixed
> slots, won't this be true? (When an array uses its fixed slots for elements,
> |array->elements == array->fixedElements()|.) And in that case, it seems
> like you're needlessly mallocing slots.

Yup, looks like I totally botched this when I added Brian's changes. This used
to check for fixed slots explicitly and copy what was possible. Moving this into
Nursery let me break this function up into much clearer sub-units.

> ::: js/src/jsgc.cpp
> @@ +4852,5 @@
> > +    rt->gcNursery.sweep(rt->defaultFreeOp());
> > +    rt->gcStoreBuffer.clear();
> > +
> > +    /* See comment in AllocateDuringMinorGC. */
> > +    if (JS_UNLIKELY(rt->gcMaxBytes != ~size_t(0) && rt->gcBytes >= rt->gcMaxBytes)) {
> 
> No need for this ~size_t(0) thing.

Since gcBytes is size_t now, we actually don't need to do any of the existing
fiddly "Max - Current < ArenaSize" weirdness either: a simple "Current >= Max"
works fine.

(In reply to Bill McCloskey (:billm) from comment #44)
> Also, although threading a linked list through the evacuated objects is
> pretty clever, I'm worried that it's bad for performance. It would be nice
> if we could iterate over the newly tenured objects sequentially. To do that,
> we'd need to allocate them in fresh arenas, the way we do during background
> finalization. This is less space-efficient, but it's probably warranted.

> I think it's fine to keep things the way they are now, but we probably
> should experiment with this after the patch lands.

Indeed, no promises on optimality for version 1.

My thinking is that the list length is strictly less than the number of things
we are tenuring and the tenuring process itself is heavy enough to make the list
overhead tiny in relation. I haven't actually tested that theory though, so we
should definitely investigate once we run out of other low-hanging fruit.
Attachment #724984 - Attachment is obsolete: true
Attachment #737016 - Flags: review?(wmccloskey)
Comment on attachment 737016 [details] [diff] [review]
v2

Review of attachment 737016 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/vm/Debugger.h
@@ +387,5 @@
>       * them and returns true. If not, it returns false.
>       */
>      static void markCrossCompartmentDebuggerObjectReferents(JSTracer *tracer);
>      static bool markAllIteratively(GCMarker *trc);
> +    static void markAll(JSTracer *trc);

I'm not sure how this hunk crept up my patch queue. I merged it into the Debugger patch, where it belonged. Please ignore.
Comment on attachment 737016 [details] [diff] [review]
v2

Review of attachment 737016 [details] [diff] [review]:
-----------------------------------------------------------------

This looks great. I decided it would be easier to do my review in the form of a patch. I'll attach the changes I'd like to see. Provided you're okay with them, this is good to go.
Attachment #737016 - Flags: review?(wmccloskey) → review+
Posted patch review for v2Splinter Review
This is mostly cosmetic stuff:

- I didn't see any reason for getForwardedPointer to take two params, so I just passed the reference.

- I replaced rt->gcIsMinor with a new heapState that distinguishes major and minor collections.

- I got rid of IsMinorCollector, since we can just use the heapState variable instead. That allows the tracing functions to be private to Nursery.

- When an array is tenured, its new capacity is now set to its initializedLength. I think this makes sense, since we can expect the array size to roughly have stabilized by the time we do a collection. I didn't really understand the Min/Max stuff that was there before.

- I didn't really like the templatization of growSlots. There aren't any good options here, but I changed it to take a class that can hold either a JSContext or an Allocator. This at least reduces the code bloat from templates, and it also forces people to pass either a JSContext or an Allocator.

Let me know if you have any questions or problems with these changes. If not, please fold this in with the v2 patch and feel free to land.
Attachment #738751 - Flags: feedback?(terrence)
Comment on attachment 738751 [details] [diff] [review]
review for v2

Review of attachment 738751 [details] [diff] [review]:
-----------------------------------------------------------------

(In reply to Bill McCloskey (:billm) from comment #48)
> Created attachment 738751 [details] [diff] [review]
> review for v2
> 
> This is mostly cosmetic stuff:
> 
> - When an array is tenured, its new capacity is now set to its
> initializedLength. I think this makes sense, since we can expect the array
> size to roughly have stabilized by the time we do a collection. I didn't
> really understand the Min/Max stuff that was there before.

My thinking here was that if the resize logic has decided that the array should have |capacity| slots available, we should honor that request when tenuring. Considering how simple that logic is, however, you're right that it's probably better to just use initializedLength.

> - I didn't really like the templatization of growSlots. There aren't any
> good options here, but I changed it to take a class that can hold either a
> JSContext or an Allocator. This at least reduces the code bloat from
> templates, and it also forces people to pass either a JSContext or an
> Allocator.

Yeah, this was pretty appalling, all around. What you have seems reasonable.
 
> Let me know if you have any questions or problems with these changes. If
> not, please fold this in with the v2 patch and feel free to land.
Attachment #738751 - Flags: feedback?(terrence) → feedback+
Posted patch v0Splinter Review
Sometime since last Friday, nursery objects started showing up in the tagged proto in initialShapes. I don't see any reason why this would be fundamentally wrong, so it seems reasonable to just mark them. Does the attached seem reasonable?
Attachment #738828 - Flags: review?(wmccloskey)
Comment on attachment 738828 [details] [diff] [review]
v0

Review of attachment 738828 [details] [diff] [review]:
-----------------------------------------------------------------

I'm a little concerned that this is going to be expensive, but I guess we can wait and see.
Attachment #738828 - Flags: review?(wmccloskey) → review+
(In reply to Bill McCloskey (:billm) from comment #51)
> I'm a little concerned that this is going to be expensive, but I guess we
> can wait and see.

Good point. We can always add a HashKeyRef to the generic buffer if this is too expensive.

For posterity, I tracked down the introduction of this failure to removal of --enable-gczeal from my build. With --enable-gczeal, the following tests succeeded before this patch where they do not afterwards.

>    /home/terrence/moz/branch/ggc_implement_and_tune/mozilla/js/src/jit-test/tests/bug793385.js
>    /home/terrence/moz/branch/ggc_implement_and_tune/mozilla/js/src/jit-test/tests/auto-regress/bug702915.js
>    /home/terrence/moz/branch/ggc_implement_and_tune/mozilla/js/src/jit-test/tests/auto-regress/bug770713.js
>    /home/terrence/moz/branch/ggc_implement_and_tune/mozilla/js/src/jit-test/tests/baseline/bug847425.js
>    /home/terrence/moz/branch/ggc_implement_and_tune/mozilla/js/src/jit-test/tests/baseline/bug852801.js
>    /home/terrence/moz/branch/ggc_implement_and_tune/mozilla/js/src/jit-test/tests/basic/bug710947.js
>    /home/terrence/moz/branch/ggc_implement_and_tune/mozilla/js/src/jit-test/tests/basic/bug720675.js
>    /home/terrence/moz/branch/ggc_implement_and_tune/mozilla/js/src/jit-test/tests/basic/testBug756919.js
>    /home/terrence/moz/branch/ggc_implement_and_tune/mozilla/js/src/jit-test/tests/basic/testBug840012.js
One failure in jsapi-tests on gcc-4.5 with opt-debug. Looks like a real failure.

When we steal an array buffer's contents we set it to have fixed slots, then NULL the view list. If the buffer did not have fixed slots before this will be 0xda in opt-dbg and NULL in opt-only. Since the patch makes the view pointer a HeapPtr, this write in opt-debug attempts to access the runtime of 0xdadadadadadadada and crashes, correctly, whereas opt-only unluckily succeeds. The solution here is to use init, since this is an initializing write.

https://tbpl.mozilla.org/?tree=Try&rev=85c09292dcf1
Depends on: 864033
I'd like to get the ggc build on tbpl unhidden on try and inbound first.
Depends on: 867784
Depends on: 871036
Blocks: 874151
Depends on: 875435
No longer depends on: 875435
No longer depends on: 847579
Depends on: 831507
Depends on: 878869
Depends on: 879874
How are we looking with regard to comment 57?
Depends on: 883498
No longer blocks: GenerationalGC
And SM(ggc) this has been unhidden for a week without major incident. Finally time to close.
Status: ASSIGNED → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
Depends on: 1128108
You need to log in before you can comment on or make changes to this bug.