Bug 641027 (opened 13 years ago, closed 13 years ago)

GC: Snapshot-at-the-beginning write barrier for incremental GC

Categories: Core :: JavaScript Engine, defect
Priority: Not set
Severity: normal
Status: RESOLVED FIXED
Target Milestone: mozilla11

People

(Reporter: billm, Assigned: billm)


Attachments (12 files, 1 obsolete file):
  239.20 KB, patch (luke: review+)
  64.94 KB, patch (luke: review+)
  7.08 KB, patch (luke: review+)
  3.00 KB, patch (luke: review+)
  10.11 KB, patch (luke: review+)
  10.09 KB, patch (luke: review+)
  12.86 KB, patch (luke: review+)
  3.30 KB, patch (luke: review+)
  26.05 KB, patch (luke: review+)
  17.84 KB, patch (luke: review+)
  513.62 KB, patch
  87.20 KB, patch (bhackett1024: review+)
We need a write barrier, both for generational GC and incremental GC. The barrier needs to trigger for every write to every field that the GC traces through during marking.
Assignee: general → wmccloskey
Have you looked into any sort of write barrier removal analysis?  (For instance, "Write Barrier Removal by Static Analysis" by Zee and Rinard.)  Even something simple, like eliding write barriers for the initialization of objects in a generational collector (because an object that has just been allocated must be in the young generation) can remove a lot of them, though it is probably less awesome in JS than in ML, where mutation is rare.

Are you using a snapshot-at-the-beginning or incremental-update style of barrier?

Just curious.
Yes, I definitely think we should do the optimization you describe for initializing objects. I'm not sure about anything fancier. It's possible that we could do some optimizations in IonMonkey.

The barrier is a snapshot-at-the-beginning barrier, although it would be pretty easy to switch.
I'm going to make this be the bug for incremental write barriers. Generational write barriers will have a separate bug, although most of the work will have been done in this patch.
No longer blocks: GenerationalGC
Summary: GC: Write barrier → GC: Snapshot-at-the-beginning write barrier for incremental GC
The write barrier patch has been relatively stable for a while, in the sense that the only changes I've made have been for rebasing. Now seems like a good time to start the review process.

This patch includes:
  * The write barrier code itself, packaged as a few C++ classes that you wrap things in so that they're automatically barriered. These are HeapPtr<T>, HeapValue, and HeapId (for jsid).
  * Changes to a bunch of fields so that they use HeapPtr<T> instead of T*.
  * An --enable-gcincremental option that turns the write barrier code on or off.
  * Lots of miscellaneous changes throughout the code to deal with the changes above.

The miscellaneous changes fall into a few categories:

* In some cases, |x->field = y| is changed to |x.init(y)|. The init method differs from assignment in that the variable is assumed to have no previous value (either uninitialized or NULL/void). Using init is sometimes necessary because it would be incorrect to execute a barrier on an uninitialized value. It's also faster because it avoids a barrier.

* In some cases, the automatic conversion from HeapPtr<T> to T* won't work. This may be caused by use of the ternary operator, or sometimes casts cause problems. In these cases, I use the get() method to explicitly get a pointer out.

* Using the assignment method on HeapValue is slow. Normally, given a pointer x, we can test whether we need a barrier by checking whether x->compartment()->gcMarkingTracer != NULL. But for HeapValues, it takes a lot more work to get the compartment (test what kind of value it is, test if it's a static string, etc.). Most of the time, we already know the compartment when we're doing the assignment. So I converted many instances of |x->field = value| into |x.set(compartment(), value)|.

* Unfortunately, our byzantine include file structure forced me to move a lot of inline functions into -inline.h files. I don't think there's a way around this.

In addition to this, I had to rewrite a fair amount of jsxml.cpp to be more templatey. This is really unfortunate. If you run this patch through diffstat, jsxml.cpp has more changes than any file except jsgcbarrier.h, which is the new barrier code. If we could remove E4X from Firefox, then embedders could still use it (but without incremental or generational GC, which they probably don't want anyway) and we wouldn't have to spend so much time maintaining code that we don't even use.

Luke, it's probably easiest to start the review with jsgcbarrier.h. It has the big comment. After that, it's just a file-by-file slog. I've split out as many of the cross-file changes as I could think of into separate patches, which will follow. If there's anything else I can do to make this easier, please let me know.
Attachment #549863 - Flags: review?(luke)
This patch makes some changes to how GC marking is done.

First, it stores the JSRuntime directly in every JSTracer--we used to store the context and then fetch the runtime via trc->context->runtime. This is no longer safe, because the JSTracer for an incremental GC may outlast the JSContext that started the GC. I've considered removing the context member entirely, but that's an API change and maybe not necessary.

Second, I changed most of the marking functions to require that you pass them HeapPtr<T>s instead of T*. This hopefully will make it less likely that someone will add a new heap field and forget to barrier it.

Third, I changed a lot of the root marking code to use MarkRoot instead of MarkObject/String/Whatever. In incremental and generational GC, roots don't need barriers, so you're allowed to pass bare JSObject*s and such to these functions.

I also changed js::gc::FINALIZE_OBJECT_LAST to js::gc::FINALIZE_OBJECT_LIMIT, since I ran into some problems with the former. It's just kind of error-prone and weird. The usage in JSObject::getEmptyShape was especially confusing. When you read it, note the subtle difference between FINALIZE_OBJECT_LAST and FINALIZE_FUNCTION_AND_OBJECT_LAST, which is easy to miss.
Attachment #549869 - Flags: review?(luke)
We had been storing upvars as a PrivateData slot pointer to an array of Values. I made a struct for these, and changed them to HeapValues. This isn't really necessary, but it makes the casting a little less verbose.
Attachment #549870 - Flags: review?(luke)
Attached patch args changes, v1Splinter Review
This one changes the way the ArgumentsData is initialized a bit, and converts the fields to HeapValues.
Attachment #549871 - Flags: review?(luke)
This patch changes the API to encapsulate JSIdArray better. This way we don't have to expose barriers to API users (in this case, at least). Only one place in the browser had to be changed.
Attachment #549874 - Flags: review?(luke)
We've been storing the object and nextToGC fields of JSScript in a union. I need to make |object| be a HeapPtr, but C++ doesn't like that. So I did something I'm not proud of. It's pretty isolated, though, so not too gross. Maybe you can think of a better way.
Attachment #549875 - Flags: review?(luke)
This adds some read barrier code for weak pointers. The comment in jsgcbarrier.h should explain why this is necessary.

Note that the reason for all the .get() changes in JSCompartment::sweep is that IsAboutToBeFinalized takes a void* argument, so C++ won't do the conversion automatically. I tried changing the argument to be a gc::Cell, but that caused a lot of trouble. Maybe in the future we could do this, but it should be a separate bug. (There's also the issue that IsAboutToBeFinalized works on static strings, which are not technically Cells.)
Attachment #549877 - Flags: review?(luke)
These tests were copying JSObjects by value, which normally is a no-no. With barriers, it's particularly bad, because the copy invokes the HeapPtr copy constructor, which is intentionally private. I changed the copies to use memcpy instead, which I think makes it more clear what's actually happening.
Attachment #549878 - Flags: review?(luke)
This patch adds a write barrier API for embedders to use. They're not actually required to do this unless they enable incremental or generational GC (once we get those). But XPConnect has to change, and the changes are included.
Attachment #549879 - Flags: review?(luke)
Last patch in the series. This one tries to check if we've missed any barriers. There are a few comments that describe how it works. I've tested jit-tests, jstests, and xpcshell-tests and they're clean. Starting a browser also works. The verifier is quite slow, so it's hard to do much more than this.
Attachment #549880 - Flags: review?(luke)
Comment on attachment 549863 [details] [diff] [review]
base write barrier patch, v1 (based on 98303320cbbe)

Review of attachment 549863 [details] [diff] [review]:
-----------------------------------------------------------------

True to the bug, I'm going to review this patch incrementally, tapping out when MEGO.

Your use of types and helpers has so far really made this a pretty easy read.  The jsgcbarrier.h comment is fantastic.

::: js/src/jsapi.cpp
@@ -2202,5 @@
> -JS_CallTracer(JSTracer *trc, void *thing, uint32 kind)
> -{
> -    JS_ASSERT(thing);
> -    MarkKind(trc, thing, kind);
> -}

From IRL: this was moved to jsgc, but perhaps instead you can make a js_CallTracer that is called here, so that all JS_* definitions stay in jsapi.cpp.

::: js/src/jsarray.cpp
@@ +992,5 @@
>  
>      /* Create a native scope. */
>      JSObject *arrayProto = getProto();
>      js::gc::FinalizeKind kind = js::gc::FinalizeKind(arenaHeader()->getThingKind());
> +    if (!InitScopeForObject<false>(cx, this, &js_SlowArrayClass, arrayProto, kind))

A two-valued enum would aid readability here.  Alternatively, there are only two callers (NewObject and makeDenseArraySlow); perhaps you can pull the setMap out of InitScopeForObject (which actually makes sense, given the name) and put the initMap call in NewObject and the setMap call in makeDenseArraySlow.

::: js/src/jscompartment.h
@@ +394,5 @@
>  
>      js::gc::ArenaList            arenas[js::gc::FINALIZE_LIMIT];
>      js::gc::FreeLists            freeLists;
>  
> +    js::GCMarker                 *gcMarkingTracer;

Could you put "incremental" somewhere in there and drop the "marking" and "tracer" words (as these words are hopelessly ambiguated in our codebase)?

Seeing this on the compartment makes me wonder how incremental works in per-compartment vs. per-runtime collection.  Perhaps a comment here or in jsgcbarrier.h would help?

::: js/src/jsemit.cpp
@@ +105,5 @@
>      bindings.trace(trc);
>  }
>  
> +void
> +JSTreeContext::writeBarrierPre(JSTreeContext *tc)

From IRL: all the parser/compiler stuff is traced stack-y like from jsgc and there is usually nothing to trace, so we might as well not do it incrementally to avoid having to think about it.

::: js/src/jsgcbarrier.h
@@ +16,5 @@
> + * The Original Code is Mozilla Communicator client code, released
> + * March 31, 1998.
> + *
> + * The Initial Developer of the Original Code is
> + * Netscape Communications Corporation.

I think you're supposed to use newer license boilerplate for newer files (viz., not mentioning Netscape).

@@ +115,5 @@
> + * GC. Since this is rare, the cost of the write barrier is usually just an
> + * extra branch.
> + *
> + * In practice, we implement the pre-barrier differently based on the type of
> + * value0. To see an example of one, take a look at JSObject::writeBarrierPre,

Perhaps, less colloquially, s/To see an example of one, take a look at/E.g., see/ ?

@@ +130,5 @@
> + * call, this file contains a bunch of C++ classes and templates that use
> + * operator overloading to take care of barriers automatically. In many cases,
> + * all that's necessary to make some field be barriered is the following:
> + *     Type *field;
> + * ==> HeapPtr<Type> field;

Instead of ==>, perhaps:

... make some field be barriered is to replace
  Type *field;
with
  HeapPtr<Type> field;

@@ +131,5 @@
> + * operator overloading to take care of barriers automatically. In many cases,
> + * all that's necessary to make some field be barriered is the following:
> + *     Type *field;
> + * ==> HeapPtr<Type> field;
> + * There are also special classes HeapValue and HeapId.

Perhaps append "which barrier js::Value and jsid, respectively"

@@ +133,5 @@
> + *     Type *field;
> + * ==> HeapPtr<Type> field;
> + * There are also special classes HeapValue and HeapId.
> + *
> + * One additional note: not all object write need to be barriered. Writes to

s/write/writes/

@@ +166,5 @@
> +{
> +    T *value;
> +
> +  public:
> +    HeapPtr() : value(NULL) {}

(From IRL: we talked about having HeapPtr() leave the value undefined, perhaps with a debug-only poison value.)

@@ +179,5 @@
> +        post();
> +    }
> +
> +    /* Use when a pointer "disappears", but the destructor won't be called. */
> +    void destroy() { pre(); }

This name bugged me a bit; it makes me think the pointee would be destroyed.  Your comment makes me think that perhaps we could delete this method and just expect code that would call destroy() to call the destructor manually.  Really, since HeapPtr isn't a POD, that code should be calling the destructor anyway.

@@ +224,5 @@
> +    template<class U>
> +    operator MarkablePtr<U>() const { return MarkablePtr<U>(value); }
> +
> +    void pre() { T::writeBarrierPre(value); }
> +    void post() { T::writeBarrierPost(value, (void *)&value); }

Perhaps these should be private so external code isn't tempted to call them.

@@ +254,5 @@
> +
> +typedef HeapPtr<JSObject> HeapPtrObject;
> +typedef HeapPtr<JSString> HeapPtrString;
> +typedef HeapPtr<js::Shape> HeapPtrShape;
> +typedef HeapPtr<const js::Shape> HeapPtrConstShape;

Personally, I'd drop the "Const" in the typedef name.

@@ +262,5 @@
> +{
> +    Value value;
> +
> +  public:
> +    explicit HeapValue() : value(UndefinedValue()) {}

Similarly, I'd leave uninitialized (or perhaps Debug_SetValueRangeToCrashOnTouch?).

@@ +333,5 @@
> +
> +static inline const Value *
> +Unbarrierify(const HeapValue *array)
> +{
> +    return (const Value *)array;

You could add a "Valueify" overload since, technically, that's what it's doing.  Second, I'd add a JS_STATIC_ASSERT(sizeof(HeapValue) == sizeof(Value)) directly before this.

::: js/src/jsgcbarrierinlines.h
@@ +75,5 @@
> +HeapValue::writeBarrierPre(const Value &value)
> +{
> +#ifdef JSGC_INCREMENTAL
> +    if (value.isMarkable()) {
> +        js::gc::Cell *cell = (js::gc::Cell *)value.toGCThing();

Perhaps you could change the signature of Value::toGCThing to return a gc::Cell?  I know this might play poorly with inlines, but I like the type-level implications.

@@ +77,5 @@
> +#ifdef JSGC_INCREMENTAL
> +    if (value.isMarkable()) {
> +        js::gc::Cell *cell = (js::gc::Cell *)value.toGCThing();
> +        unsigned kind = value.gcKind();
> +        if (kind == JSTRACE_STRING && ((JSString *)cell)->isStaticAtom())

Perhaps JS_ASSERT(value.isString()).  Perhaps also make a patch that makes static strings markable ;-)

@@ +145,5 @@
> +}
> +
> +inline void
> +HeapValue::set(JSCompartment *comp, const Value &v)
> +{

Can you assert that 'comp' is the right compartment for 'v'?
Comment on attachment 549863 [details] [diff] [review]
base write barrier patch, v1 (based on 98303320cbbe)

Review of attachment 549863 [details] [diff] [review]:
-----------------------------------------------------------------

Another batch; done through jso*.

::: js/src/jsfun.cpp
@@ +1173,5 @@
>          JS_ASSERT(obj->numSlots() >= first + count);
> +
> +        /* Don't do this during a GC, since it hits a write barrier. */
> +        if (!IS_GC_MARKING_TRACER(trc))
> +            obj->setSlotsUndefined(first, count);

From IRL, this can be removed.  Leave the sadness be until bug 596295 fixes it.

@@ +1661,5 @@
>  
>      if (fun != obj) {
> +        /*
> +         * obj is a cloned function object, trace the clone-parent, fun.
> +         * This is safe to leave Unbarriered for incremental GC, but we'll need

Could you add a "because..."?  (And, is it "because fun is immutable"?  If so, perhaps you could add a MarkObject overload that takes const pointers and make 'fun' a const pointer?  Just a suggestion, if it's easy.)

::: js/src/jsgcbarrier.h
@@ +1,1 @@
> +/* -*- Mode: C++; tab-width: 8; indent-tabs-mode: nil; c-basic-offset: 4 -*-

js/src/gc/Barrier{.h,-inl.h,.cpp} ?

@@ +136,5 @@
> + *
> + * One additional note: not all object write need to be barriered. Writes to
> + * newly allocated objects do not need a barrier as long as the GC is not
> + * allowed to run in between the allocation and the write. In these cases, we
> + * use the "obj->field.init(value)" method instead of "obj->field = value".

Could you maybe point out more emphatically (or, using yesterday's word of the day, hortatively) that 'init' is a naming idiom?  It's a good idiom.

::: js/src/jshashtable.h
@@ +1034,5 @@
> +            return false;
> +        const_cast<Key &>(pentry->key) = k;
> +        pentry->value = v;
> +        return true;
> +    }

How about just changing the existing 'add' to be templated for k and v?  It does weaken type checking, but the errors will still be fairly direct.  That avoids the problem that the name 'addInPlace' is kinda confusing and I don't have any great suggestions... addWithLazyTypes, addTemplated, ...

::: js/src/jsiter.cpp
@@ +326,5 @@
>          return false;
>  
>      ida->length = static_cast<jsint>(len);
> +    for (size_t i = 0; i < len; i++)
> +        ida->vector[i].init(props[i]);

Perhaps a reusable InitIdRange() (in jsvalue.h) for here and below (x2)?

@@ +1073,5 @@
> +    /*
> +     * MarkGenerator should only be called when regs is based on the floating frame.
> +     * See calls to RebaseRegsFromTo.
> +     */
> +    JS_ASSERT(gen->regs.sp - fp->slots() <= ptrdiff_t(fp->numSlots()));

It's a stronger assert if you drop the ptrdiff_t on the right and cast to size_t on the left.

@@ +1235,5 @@
> +     * Write barrier is needed since the generator stack can be updated,
> +     * and it's not barriered in any other way. We need to do it before
> +     * gen->state changes, which can cause us to trace the generator
> +     * differently.
> +     */

From IRL: the generator only needs to be marked once per incremental GC.  Right now generators are probably so slow this doesn't matter, but could you put in a note mentioning this as a possible future optimization?

::: js/src/jsobj.cpp
@@ +3488,5 @@
> +#ifdef JSGC_INCREMENTAL
> +    JSCompartment *comp = a->compartment();
> +    if (comp->gcMarkingTracer) {
> +        MarkChildren(comp->gcMarkingTracer, a);
> +        MarkChildren(comp->gcMarkingTracer, b);

Perhaps "if (JSTracer *trc = comp->gcMarkingTracer)" and reuse trc.  Second a comment would be nice explaining why this was being done.  IIUC, its because if one has been marked and the other hasn't, then the unmarked one will be missed.  Is that right?

::: js/src/jsobj.h
@@ +701,5 @@
>      inline const js::Value &nativeGetSlot(uintN slot) const;
>  
> +    inline void setSlot(uintN slot, const js::Value &value);
> +    inline void initSlot(uintN slot, const js::Value &value);
> +    inline void initSlotUnsafe(uintN slot, const js::Value &value);

I see your motivation here for an attention-attracting word, but Unsafe isn't my favorite.  initSlotUnchecked?  initSlotKnowingly ;-)

@@ +744,5 @@
>          return privateData;
>      }
>  
> +    inline void setPrivate(void *data);
> +    inline void setPrivateNoBarrier(void *data);

What about initPrivate() instead of NoBarrier?

::: js/src/jsobjinlines.h
@@ +107,5 @@
> +DestroyValueRange(HeapValue *vec, uintN len)
> +{
> +    for (uintN i = 0; i < len; i++)
> +        vec[i].destroy();
> +}

Can all these inline functions be in jsgcbarrier.h?  It's kinda weird to find 'em here.

@@ +420,5 @@
>  inline const js::Value *
>  JSObject::getDenseArrayElements()
>  {
>      JS_ASSERT(isDenseArray());
> +    return Unbarrierify(getSlots());

Seems like you would want to propagate the HeapValue-ness out through getDenseArrayElements and Unbarrierify closer to the use.

@@ +472,5 @@
> +        markStart = dstStart;
> +        markEnd = js::Min(dstStart + count, srcStart);
> +    }
> +    for (uintN i = markStart; i < markEnd; i++)
> +        js::HeapValue::writeBarrierPre(getSlot(i));

Maybe the compiler's got your back, but it seems like, if you really want perf, you'd want to hoist the "if (gcMarkingTracer)" test in writeBarrierPre up so as to predicate the if/else and loop.  Lastly, I've seen GCC fail to turn indexing into pointer arithmetic on several occasions (and seen this be the source of loops going 25% slower), so I'd try to do that here too.

@@ +1505,5 @@
> +inline void
> +JSObject::setSlotsUndefined(uint32 start, uint32 count)
> +{
> +    for (uint32 i = start; i < start + count; i++)
> +        setSlot(i, js::UndefinedValue());

Same comment as above: if this is hot at all, I'd turn it into pointer arithmetic.

@@ +1595,5 @@
> +    JS_ASSERT(getClass()->flags & JSCLASS_HAS_PRIVATE);
> +
> +    privateWriteBarrierPre(&privateData);
> +    privateData = data;
> +    privateWriteBarrierPost(&privateData);

I think there are a lot of places where we setPrivate() right after init.  I see you caught a few, but it seems like, since this is kinda expensive, we'd want to give the barriered version the noisy name, or at least a different one to indicate it's not "init".  Perhaps updatePrivate and (as suggested above) initPrivate?

@@ +1611,5 @@
> +{
> +#ifdef JSGC_INCREMENTAL
> +    JSCompartment *comp = compartment();
> +    if (comp->gcMarkingTracer)
> +        MarkChildren(comp->gcMarkingTracer, this);

From IRL: it seems like this can just call the trace hook.
Comment on attachment 549863 [details] [diff] [review]
base write barrier patch, v1 (based on 98303320cbbe)

Review of attachment 549863 [details] [diff] [review]:
-----------------------------------------------------------------

r+ with comments addressed (or counterarguments raised).  Again, great work.

::: js/src/jsscan.cpp
@@ +73,5 @@
>  #include "jsstaticcheck.h"
>  #include "jsvector.h"
>  
>  #include "jsscriptinlines.h"
> +#include "jsobjinlines.h"

I'm guessing you are doing this to avoid those annoying "referenced but not defined" situations.  Is there any way you can instead fix the headers?  In theory, you should only have to add a *inlines.h file to a .cpp if you use a definition in that file...

::: js/src/jsweakmap.cpp
@@ +244,5 @@
> +            p->value.set(cx->compartment, value);
> +            return true;
> +        }
> +        if (!map->addInPlace(p, key, value))
> +            goto out_of_memory;

If you templated 'put' like 'add' (suggested in prev comment), you could reuse 'put'.

::: js/src/vm/String.cpp
@@ +190,5 @@
>          JSExtensibleString &left = this->leftChild()->asExtensible();
>          size_t capacity = left.capacity();
>          if (capacity >= wholeLength) {
> +            JSString::writeBarrierPre(d.u1.left);
> +            JSString::writeBarrierPre(d.s.u2.right);

It's kind of sad to do all this branching in flatten()... ideally we'd do a single branch at the head of flatten that would send us to code that statically did or did not do the write barriers (via a template bool parameter).

Another option is to say that, when we mark a string that is a rope, we mark the whole rope.  This isn't purely incremental, but I would think that string marking is so fast it wouldn't matter.  That would remove all branching from flatten (which is a fairly hot path).
Attachment #549863 - Flags: review?(luke) → review+
Attachment #549869 - Flags: review?(luke) → review+
Attachment #549870 - Flags: review?(luke) → review+
Comment on attachment 549871 [details] [diff] [review]
args changes, v1

Review of attachment 549871 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/jsfun.cpp
@@ +209,5 @@
>          return NULL;
> +
> +    data->callee.init(ObjectValue(callee));
> +    for (HeapValue *v = data->slots; v != data->slots + argc; ++v)
> +        v->init(UndefinedValue());

I think I put this in a prev comment, but some InitValueRangeToUndefined would be nice.
Attachment #549871 - Flags: review?(luke) → review+
Comment on attachment 549874 [details] [diff] [review]
id array changes, v1

Review of attachment 549874 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/jsapi.cpp
@@ +4022,5 @@
>          /* Native case: start with the last property in obj. */
> +        PropIterData *data = cx->new_<PropIterData>(obj->lastProperty());
> +        if (!data)
> +            return NULL;
> +        pdata = data;

Is it really necessary to malloc a new PropIterData just to hold a single pointer-sized datum?  What was wrong with the way it was before?
Attachment #549875 - Flags: review?(luke) → review+
Attachment #549877 - Flags: review?(luke) → review+
Attachment #549878 - Flags: review?(luke) → review+
Comment on attachment 549879 [details] [diff] [review]
JSAPI and XPConnect changes, v1

Review of attachment 549879 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/jsapi.h
@@ +1786,5 @@
>  
>  #endif
>  
>  /*
> + * Write barrier API.

From IRL: let's do the minimal API addition possible to get just incremental GC working.  It sounds like for generational that if we want to use types to good effect in the way you've done with incremental, we'll want radical JSAPI changes.

Also, I'd slim down the comment to just mention incremental; perhaps point to your beefy jsgcbarrier comment for general concepts and just explain here what the JSAPI equivalent operations are.
Attachment #549879 - Flags: review?(luke) → review+
Comment on attachment 549880 [details] [diff] [review]
barrier verifier, v1

Review of attachment 549880 [details] [diff] [review]:
-----------------------------------------------------------------

This verifier is way friggin cool.  Perhaps in the jsgcbarrier.h comment, right after you say "The name comes from the idea that we take a theoretical "snapshot" of all reachable objects in step 2; ...", you could say "in fact, the incremental GC verifier actually does take a snapshot...".

::: js/src/jsgc.cpp
@@ +2695,5 @@
> +            if (inVerify) EndVerifyBarriers(cx);
> +        }
> +        ~AutoVerifyBarriers() { if (inVerify) StartVerifyBarriers(cx); }
> +    };
> +    AutoVerifyBarriers av(cx);

You can even do: struct AutoVerifyBarriers { ... } av(cx);

@@ +2915,5 @@
>      }
>  #endif
>  }
>  
> +struct EdgeValue

Do you think perhaps you could hoist this into some gc/Verifier.{h,cpp} ?

@@ +2935,5 @@
> +
> +/*
> + * The verifier data structures are simple. The entire graph is stored in a
> + * single block of memory. At the beginning is a VerifyNode for the root
> + * node. It is followed by a sequence of VerifyEdges--the exact number is given

s/VerifyEdges/EdgeValues/
Attachment #549880 - Flags: review?(luke) → review+
Comment on attachment 549874 [details] [diff] [review]
id array changes, v1

Review of attachment 549874 [details] [diff] [review]:
-----------------------------------------------------------------

Reasonable justification for cramming HeapValue into a void*, and it's cold code.

Done!  Nice work
Attachment #549874 - Flags: review?(luke) → review+
Attached patch write barrier changes for TI (obsolete) — Splinter Review
These are the changes needed for TI.
Attachment #572704 - Flags: review?(bhackett1024)
Brian, in case it's helpful, here's the full patch.
Comment on attachment 572704 [details] [diff] [review]
write barrier changes for TI

Arghh. I made a mistake assuming that getSlot works for dense arrays. I'll post an updated patch tomorrow.
Attachment #572704 - Attachment is obsolete: true
Attachment #572704 - Flags: review?(bhackett1024)
Attached patch TI patch, v2Splinter Review
Here's the corrected TI patch.
Attachment #572925 - Flags: review?(bhackett1024)
Comment on attachment 572925 [details] [diff] [review]
TI patch, v2

Review of attachment 572925 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/jsarray.cpp
@@ +3026,5 @@
>  
> +    /*
> +     * Set the initializedLength first to invalidate the correct number of slots
> +     * for the write barrier.
> +     */

result will initially be empty; you should be able to assert that rather than add the comment.

::: js/src/jsinfer.cpp
@@ +1952,5 @@
> +TypeSet::needsBarrier()
> +{
> +    return unknownObject()
> +        || getObjectCount() > 0
> +        || hasAnyFlag(TYPE_FLAG_NULL | TYPE_FLAG_STRING);

I think this should take a cx and add the freeze constraint itself on paths when it returns false.  This keeps callers from needing to addFreeze() themselves.  Also, why the TYPE_FLAG_NULL?

::: js/src/jsinfer.h
@@ +629,5 @@
>  /* Type information about a property. */
>  struct Property
>  {
>      /* Identifier for this property, JSID_VOID for the aggregate integer index property. */
> +    HeapId id;

Property::id can be a weak reference, which would allow this to be a bare jsid.  This would not be hard to fix, but maybe better done as a followup.

@@ +732,3 @@
>  
>      /* Lazily filled array of empty shapes for each size of objects with this type. */
> +    js::HeapPtr<js::EmptyShape> *emptyShapes;

no js:: necessary here (the old js::EmptyShape was a lazy copy-paste job).

::: js/src/jsinferinlines.h
@@ +1272,5 @@
> +        return;
> +
> +    JSCompartment *comp = type->compartment();
> +    if (comp->needsBarrier)
> +        MarkTypeObjectUnbarriered(comp->barrierTracer(), type, "write barrier");

General question about write barriers: I'd thought that v->compartment() was a relatively expensive operation when done over and over again.  Do you know how much would be saved if the barriers took a cx and used cx->compartment (and did not overload operator=)?  I suppose this doesn't matter as much as it does for stack autorooters, which will be more active than VM heap modifications (and for which I really don't want to access v->compartment() anywhere).

::: js/src/jsobj.cpp
@@ +4464,5 @@
>      }
>  }
>  
>  void
> +JSObject::copySlotRange(size_t start, const Value *vector, size_t length, bool valid)

Ditto for valid parameter to copyDenseArrayElements.

::: js/src/jsobjinlines.h
@@ +613,5 @@
>  {
>      JS_ASSERT(isDenseArray());
>      JS_ASSERT(dstStart + count <= capacity);
> +    if (valid)
> +        invalidateSlotRange(dstStart, dstStart + count);

Can you rename invalidateSlotRange to something like markSlotRangeOverwrite?  It's not obvious what this call is doing, and the ObjShrink stuff adds a DEBUG JSObject::invalidateSlotRange which overwrites the slots with invalid (crash on touch) values.

Also, can you remove the 'valid' parameter and split this into a copyDenseArrayElements and an initDenseArrayElements?

::: js/src/methodjit/Compiler.cpp
@@ +5182,5 @@
> +                                   reg, Registers::ArgReg1);
> +                OOL_STUBCALL(stubs::GCThingWriteBarrier, REJOIN_NONE);
> +                stubcc.rejoin(Changes(0));
> +            } else {
> +                propertyTypes->addFreeze(cx);

This addFreeze will add constraints even if !cx->compartment->needsBarrier.  By putting the addFreeze into the 'return false' path of types->needsBarrier() as suggested above, this call is unnecessary.

@@ +5255,5 @@
>      }
>  
> +#ifdef JSGC_INCREMENTAL_MJ
> +    /* Write barrier. */
> +    bool needBarrier = !types || types->propertyNeedsBarrier(cx, id);

types->propertyNeedsBarrier() and types->needsBarrier() should only be called if cx->compartment->needsBarrier.  Otherwise the calls may add freeze constraints which trigger later recompilation even if the incremental GC is not active.

@@ +5257,5 @@
> +#ifdef JSGC_INCREMENTAL_MJ
> +    /* Write barrier. */
> +    bool needBarrier = !types || types->propertyNeedsBarrier(cx, id);
> +    if (cx->compartment->needsBarrier && needBarrier)
> +        stubcc.linkExit(masm.jump(), Uses(2));

Rather than generate an IC but always jump to the slow path, it would be better to always take a stub, e.g. just call jsop_setprop_slow and return.  I think the way things are structured, IC stubs (possibly many duplicate ones?) will still get generated but never used.

@@ +6246,5 @@
> +                stubcc.masm.move(ImmPtr(value), Registers::ArgReg1);
> +                OOL_STUBCALL(stubs::WriteBarrier, REJOIN_NONE);
> +                stubcc.rejoin(Changes(0));
> +            } else {
> +                types->addFreeze(cx);

Ditto unnecessary addFreeze.

@@ +6275,5 @@
> +
> +#ifdef JSGC_INCREMENTAL_MJ
> +    /* Write barrier. */
> +    if (cx->compartment->needsBarrier)
> +        stubcc.linkExit(masm.jump(), Uses(2));

Ditto use jsop_setgname_slow.

::: js/src/methodjit/FastBuiltins.cpp
@@ +506,5 @@
>  
> +#ifdef JSGC_INCREMENTAL_MJ
> +    /* Write barrier. */
> +    if (cx->compartment->needsBarrier)
> +        stubcc.linkExit(masm.jump(), Uses(2));

return Compile_InlineAbort

::: js/src/methodjit/FastOps.cpp
@@ +1162,1 @@
>          // Rejoin with the inline path.

Remove this comment, it's wrong now.

@@ +1173,5 @@
> +     * because in that case the slot we're overwriting was previously
> +     * undefined.
> +     */
> +    types::TypeSet *types = frame.extra(obj).types;
> +    bool needBarrier = !types || types->propertyNeedsBarrier(cx, JSID_VOID);

Ditto propertyNeedsBarrier() call ordering.

@@ +1507,5 @@
>  
> +#ifdef JSGC_INCREMENTAL_MJ
> +    // Write barrier.
> +    if (cx->compartment->needsBarrier)
> +        stubcc.linkExit(masm.jump(), Uses(3));

Ditto use jsop_setelem_slow.

::: js/src/methodjit/PolyIC.cpp
@@ +3236,5 @@
> +    if (cx->compartment->needsBarrier) {
> +        spew(cx, "ignored", "write barrier");
> +        return false;
> +    }
> +#endif

By using setelem_slow and avoiding generation of the SetElement (and other) ICs when the compartment needs barriers, you should be able to assert !cx->compartment->needsBarrier here.
Attachment #572925 - Flags: review?(bhackett1024) → review+
https://hg.mozilla.org/integration/mozilla-inbound/rev/d4bd0f9bece8
https://hg.mozilla.org/integration/mozilla-inbound/rev/65dcb557b3f6

Note to sheriffs: This is expected to cause a regression of 1-2% on JS-heavy benchmarks. Please don't back out solely for that reason.
Target Milestone: --- → mozilla11
Blocks: 701448
Comment on attachment 549879 [details] [diff] [review]
JSAPI and XPConnect changes, v1

Review of attachment 549879 [details] [diff] [review]:
-----------------------------------------------------------------

This seems to be missing write barriers for a bunch of things: nsWrapperCache::mWrapperPtrBits, nsScriptObjectHolder::mObject, various nsXULPrototype* members, ... Are these just planned in followup patches?

::: js/src/xpconnect/src/xpcwrappednative.cpp
@@ +1227,5 @@
> +    // able to properly cleanup. So, if it fails we null out mFlatJSObject
> +    // to indicate the invalid state of this object and return false.
> +    if(!JS_SetPrivate(ccx, existingJSObject, this))
> +    {
> +        mFlatJSObject = nsnull;

I don't understand this change. How can mFlatJSObject be non-null here?

@@ +1233,5 @@
> +    }
> +
> +    // Morph the existing object.
> +    if(!JS_SetReservedSlot(ccx, existingJSObject, 0, JSVAL_VOID))
> +        return JS_FALSE;

Also, it seems wrong to not null out the reserved slot for existingJSObject if JS_SetPrivate fails.
Peter, you're right about nulling mFlatJSObject. I made a stupid copy-paste mistake. I don't think it does any harm to have it there, but I can take it out.

Regarding nulling out the reserved slot for existingJSObject, the order is now different. In the code now, if JS_SetReservedSlot fails, should we null out the private pointer of existingJSObject?

For nsWrapperCache::mWrapperPtrBits and the rest, I don't think we need write barriers. We only need them for pointers that the GC traces through (via JS_CALL_TRACER and the like). And even then only if the pointers are mutable. I did a big mxr search for all these a while ago and it seemed like only xpconnect needed changing. But I might have missed something. Do you know if any of the pointers you mentioned get traced?

As an aside, we *will* have to deal with these sorts of pointers when we get moving GC. But that's a little further down the road (bug 650161).
No longer blocks: 701448
Depends on: 703474
Depends on: 704258
Blocks: 709750
Sorry for not responding earlier, I've been away for a while.

(In reply to Bill McCloskey (:billm) from comment #31)
> Peter, you're right about nulling mFlatJSObject. I made a stupid copy-paste
> mistake. I don't think it does any harm to have there, but I can take it out.

Yeah, let's take it out.

> Regarding nulling out the reserved slot for existingJSObject, the order is
> now different. In the code now, if JS_SetReservedSlot fails, should we null
> out the private pointer of existingJSObject?

Sure, but we should still null out the reserved slot if JS_SetPrivate fails too.

> For nsWrapperCache::mWrapperPtrBits and the rest, I don't think we need
> write barriers. We only need them for pointers that the GC traces through
> (via JS_CALL_TRACER and the like). And even then only if the pointers are
> mutable. I did a big mxr search for all these a while ago and it seemed like
> only xpconnect needed changing. But I might have missed something. Do you
> know if any of the pointers you mentioned get traced?

These pointers are often traced by the GC (mostly through TraceJSObject in XPCJSRuntime.cpp). It looks like they're not mutable; at least, they shouldn't change from one non-null value to another non-null value, but I'm not entirely sure. Hopefully that's good enough.
(In reply to Peter Van der Beken [:peterv] from comment #32)
> These pointers are often traced by the GC (mostly through TraceJSObject in
> XPCJSRuntime.cpp). It looks like they're not mutable, at least they
> shouldn't change from one non-null value to another non-null value but I'm
> not entirely sure. Hopefully that's good enough.

There's an additional rule I forgot. Anything traced as a root (i.e., anything traced in the first slice of an incremental GC) doesn't need a barrier. This includes anything traced from TraceXPConnectRoots, so I think we're in the clear on these, whether they're mutable or not.
Depends on: 744727