Closed Bug 780309 Opened 7 years ago Closed 7 years ago

GC: Make analysis aware of internal thing pointers

Categories

(Core :: JavaScript Engine, defect)

defect
Not set

Tracking

()

RESOLVED FIXED

People

(Reporter: terrence, Assigned: sfink)

References

Details

(Whiteboard: [leave open][js:t])

Attachments

(2 files, 4 obsolete files)

When we move GC things we will need a mechanism to update all pointers on the stack that point into gcthing memory that is getting moved, not just pointers to the object heads (Roots).  Additionally, our analysis needs a way to avoid poisoning pointers to gcthing internals (wherever they end up [for now]).

For this purpose, we are introducing InternalRoot.  InternalRoot will be constructed from an owning Cell*, a byte offset into the thing, and the location of the pointer to protect.  We may also need an end offset for the analysis, I'll see how it goes.
Attached patch v0 (obsolete) — Splinter Review
This patch removes the SkipRoot that was used to protect the internal pointer 'chars' in AtomizeInline with something more elegant and robust.  This required fairly extensive changes to jsatom.h's interface, which I have used as an excuse to clean up jsatom.h in general.

Bill, could you review the new InternalPointer type and associated changes to jspubtd.h and CheckStackRoots?  Luke, could you review the atoms changes?
Attachment #649903 - Flags: review?(wmccloskey)
Attachment #649903 - Flags: review?(luke)
Just a few preliminary questions. Could you split the jsatom namespace changes into a separate patch? That would make this easier to get through. Also, why does the InternalPointer type need a length? And why is the length being checked during the root analysis?
(In reply to Bill McCloskey (:billm) from comment #2)
> Just a few preliminary questions. Could you split the jsatom namespace
> changes into a separate patch? That would make this easier to get
> through.

I am touching all of the users anyway, so not folding them together is going to double my pain.  I can do this if you really want me to, but imho the namespace change is trivial enough not to warrant it.

> Also, why does the InternalPointer type need a length?

I initially had it as #ifdef DEBUG to assert that the accessors are in range.  I made it always-on because it is such a nice cleanup to all of the string interfaces.  In addition, I was considering adding a check that *w in CheckStackRoot is not inside any of our inside-an-object range if w does not match.  This was somewhat awkward (from a C++ perspective) to implement, so I wanted to run it by you first to see if you think this is even a good idea.

> And why is the length
> being checked during the root analysis?

If the length looks like a GCThing pointer, then the conservative GC test will pass and poison the length if we don't stop it by checking if it is the length pointer here.
(In reply to Terrence Cole [:terrence] from comment #3)
> I am touching all of the users anyway, so not folding them together is going
> to double my pain.  I can do this if you really want me to, but imho the
> namespace change is trivial enough not to warrant it.

Seven of the first eight hunks in this patch just remove a js_ prefix. Please split it up.

> > And why is the length
> > being checked during the root analysis?
> 
> If the length looks like a GCThing pointer, then the conservative GC test
> will pass and poison the length if we don't stop it by checking if it is the
> length pointer here.

Did that happen? It seems like the length would have to be truly enormous to look like a pointer.
(In reply to Bill McCloskey (:billm) from comment #4)
> Seven of the first eight hunks in this patch just remove a js_ prefix.
> Please split it up.

Right, forgot about those.  Will do.
 
> > > And why is the length
> > > being checked during the root analysis?
> > 
> > If the length looks like a GCThing pointer, then the conservative GC test
> > will pass and poison the length if we don't stop it by checking if it is the
> > length pointer here.
> 
> Did that happen? It seems like the length would have to be truly enormous to
> look like a pointer.

I did not observe it, but I know several of our string tests create strings on the order of the size of our address space through recursive concatenation.  I expected that some of these would be problematic.
Attachment #649903 - Flags: review?(wmccloskey)
Attachment #649903 - Flags: review?(luke)
This is the Atom namespace changes split off from the InternalPointer stuff.  Still compiles in the shell.  I didn't see any browser files with the symbols I changed on DXR, but trying anyway to make sure I didn't break anything vital.

https://tbpl.mozilla.org/?tree=Try&rev=b50e55a0cc37
Attachment #649903 - Attachment is obsolete: true
Attachment #650155 - Flags: review?(luke)
I made the length #ifdef DEBUG in the base class and added it to the CharsPointer directly.  This gives us the best of both worlds: bounds checking in debug and no extra overhead for non-CharsPointers in opt builds.
Attachment #650175 - Flags: review?(wmccloskey)
Comment on attachment 650155 [details] [diff] [review]
0: cleanup jsatom.h v0

Review of attachment 650155 [details] [diff] [review]:
-----------------------------------------------------------------

un-js_ all the things!

::: js/src/jsatom.cpp
@@ +432,5 @@
>      *idp = JSID_FROM_BITS((size_t)atom);
>      return true;
>  }
>  
> +} /* namespace js */

Instead of this could you prefix IndexToIdSlow with js:: and remove the namespace js { ?
Attachment #650155 - Flags: review?(luke) → review+
AFAICT, that should work, but g++ was not having it.  The thing it has difficulty with is the forward-declaration of IndexToIdSlow inside of IndexToId.  If I implement IndexToIdSlow as js::IndexToIdSlow then g++ gives me the very unsubtle error:

error: ‘bool js::IndexToIdSlow(JSContext*, uint32_t, jsid*)’ should have been declared inside ‘js’
Attachment #650155 - Flags: checkin+
Comment on attachment 650175 [details] [diff] [review]
1: The InternalPointer, CharsPointer, and CharsHandle classes

Review of attachment 650175 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/gc/Root.h
@@ +431,5 @@
> +    static InternalPointerKind kind() { return INTERNAL_POINTER_STRING; }
> +};
> +
> +/*
> + * Local variable of type ElementType* which points into a Handle<T>.

Instead of "points into a Handle<T>", maybe say "points into a GC thing of type T."

@@ +438,5 @@
> + * specialization, define a InternalPointerBase<ElementType> specialization
> + * containing them.
> + */
> +template <typename T, typename ElementType>
> +class InternalPointer : public InternalPointerBase<ElementType>

Let's not do the InternalPointerBase<ElementType> thing until we actually need it.

@@ +455,5 @@
> +#endif
> +    }
> +
> +  public:
> +    InternalPointer(JSContext *cx, const Handle<T> &owner, ElementType *ptr, size_t length_)

You need a destructor to remove this thing from the cx->internalGCPointers.

@@ +482,5 @@
> +        JS_ASSERT(offset < length_);
> +        return ptr[offset];
> +    }
> +
> +    ElementType *operator+(size_t offset) {

Could you move the length stuff and the operator[] and operator+ methods to CharsPointer? It seems to me like they belong there instead of here.

@@ +487,5 @@
> +        JS_ASSERT(offset < length_);
> +        return ptr + offset;
> +    }
> +
> +    ElementType *pointer() const { return ptr; }

This should be called get().

@@ +489,5 @@
> +    }
> +
> +    ElementType *pointer() const { return ptr; }
> +
> +    void poison() { PoisonPtr(&ptr); }

Where would this be used?

::: js/src/jsgc.cpp
@@ +4568,5 @@
> +                InternalPointer<void *, void *> *internalptr = cx->internalGCPointers[i];
> +                while (internalptr) {
> +                    if (internalptr->addressOfPointer() == static_cast<void*>(w))
> +                        matched = true;
> +                    if (internalptr->addressOfLength() == static_cast<void*>(w))

I still would like you to remove this. I don't see any reason to make a special case for this length field when there are lots of others that will be on the stack. If there's a problem, this isn't going to fix it.

::: js/src/jspubtd.h
@@ +242,5 @@
>      THING_ROOT_TYPE,
>      THING_ROOT_LIMIT
>  };
>  
> +enum InternalPointerKind

It seems like we should be able to re-use ThingRootKind for these.

@@ +253,5 @@
>  template <typename T>
>  struct RootKind;
>  
> +template <typename T>
> +struct PointerKind;

And then we don't need this.

@@ +272,5 @@
> +    static InternalPointerKind pointerKind() { return INTERNAL_POINTER_OBJECT; };
> +};
> +template <> struct PointerKind<JSString *> {
> +    static InternalPointerKind pointerKind() { return INTERNAL_POINTER_STRING; };
> +};

Or these.

@@ +297,5 @@
>      Rooted<void*> *thingGCRooters[THING_ROOT_LIMIT];
> +
> +    /*
> +     * Stack allocated pointers to inside of a GC root which may need to be
> +     * updated if thier owning object is moved during a GC.

Grammar nits:
to inside -> to the inside
which -> that
thier -> their
Attachment #650175 - Flags: review?(wmccloskey)
Depends on: 784808
Depends on: 785551
Steve and I sat down and mulled over a few possible designs and ended up with something fairly nice:

template <class S>
class InternalHandle {
    void **thing;
    size_t offset;

  public:
    InternalHandle(void *&thing, void *pointer)
      : thing(&thing), offset(pointer - thing) {}

    S &operator -> () { return (S)(uintptr_t(*thing) + offset); }

    static InternalHandle<S> fromMarkedLocation(S &*thing) {
        return InternalHandle<S>::InternalHandle(&thing, &thing);
    }
};

Usage example: access to JSScript::bindings during the call to initWithTemporaryStorage, which can GC -- not that we can move scripts in the near future, but because the existing SkipRoot was broken and causing the analysis to fail.

// Typedef for easy usage.
typedef InternalHandle<Bindings *> BindingsHandle;

// An accessor for bindings which automatically creates a bindings handle.
BindingsHandle
JSScript::bindingsHandle() {
    return BindingsHandle(this, &bindings);
}

// Convert initWithTemporyStorage to be static so that we do 
// not attempt to use a Bindings* stored on the stack.
Bindings::initWithTemporaryStorage(script->bindingsHandle());

// Usage example for FunctionBox's bindings, which is not inside of a GC thing.
BindingsHandle bindings = BindingsHandle::fromMarkedLocation(&funbox->bindings);
Bindings::initWithTemporaryStorage(bindings);

//////////////
Advantages to this approach:

1) The BindingsHandle contains no pointers to the GC Heap.  The analysis will not need to know about InternalHandle<T> or any specific internal handle.  More importantly, real moving GC's will also not need to worry about it.

2) We do not encode JSScript in the definition of the Handle, so we only need one interface to initWithTemporaryStorage.  We could store a Handle<T> instead of a void**, but this would not give us more type safety because of funbox->bindings. 

3) We only need to access through a handle in regions which might GC with a pointer on the *stack*.  Other locations, specifically all other access to funbox->bindings or script->bindings, do not need to take any performance hit.
Showing the current WIP. Seems to work. It still needs some cleanup. Posting in case anyone has feedback.
Assignee: terrence → sphink
Attached patch patch (obsolete) — Splinter Review
Brushed it up a little bit, mostly just reshuffling internals slightly. Time for a second opinion.
Attachment #658983 - Flags: review?(terrence)
Attachment #658939 - Attachment is obsolete: true
Comment on attachment 658983 [details] [diff] [review]
patch

Review of attachment 658983 [details] [diff] [review]:
-----------------------------------------------------------------

This is mostly needing cosmetic cleanup, but I do have one serious question: why in the world do you need a static void* NULL that is unused, in a base class?

::: js/src/frontend/Parser.cpp
@@ +295,5 @@
>      }
>  }
>  
>  bool
> +ParseContext::generateFunctionBindings(JSContext *cx, InternalHandle<Bindings*> bindings) const

Wrap this to 100 columns.

@@ +309,5 @@
>  
>      AppendPackedBindings(this, args_, packedBindings);
>      AppendPackedBindings(this, vars_, packedBindings + args_.length());
>  
> +    if (!Bindings::initWithTemporaryStorage(cx, bindings, args_.length(), vars_.length(), packedBindings))

100?

::: js/src/gc/Root.h
@@ +241,5 @@
> + * InternalHandle when you have a pointer to a direct field of a gcthing, or
> + * when you need a parameter type for something that *may* be a pointer to a
> + * direct field of a gcthing.
> + */
> +class InternalHandleBase {

{ on newline.

@@ +243,5 @@
> + * direct field of a gcthing.
> + */
> +class InternalHandleBase {
> +  protected:
> +    static void * const zeroPointer;

1) Where are you using this?
2) Why do we need this at all?

@@ +250,5 @@
> +template <typename T>
> +class InternalHandle { };
> +
> +template <typename T>
> +class InternalHandle<T*> : public InternalHandleBase {

{ on newline.

@@ +261,5 @@
> +     * field in question, and a pointer to the field.
> +     */
> +    template<typename H>
> +    InternalHandle(const Handle<H> &handle, T *field)
> +        : holder((void**)handle.address()), offset(uintptr_t(field) - uintptr_t(handle.get()))

2 space indent.

@@ +269,5 @@
> +    /*
> +     * Create an InternalHandle to a field within a Rooted<>.
> +     */
> +    template<typename R>
> +    InternalHandle(const Rooted<R> &root, T *field)

Will a Rooted not automatically convert to a Handle and use the above constructor?

@@ +270,5 @@
> +     * Create an InternalHandle to a field within a Rooted<>.
> +     */
> +    template<typename R>
> +    InternalHandle(const Rooted<R> &root, T *field)
> +        : holder((void**)root.address()), offset(uintptr_t(field) - uintptr_t(root.get()))

2 spaces.

::: js/src/jsgc.cpp
@@ +102,5 @@
>  using namespace mozilla;
>  using namespace js;
>  using namespace js::gc;
>  
> +

Remove the extra space here.

@@ -2414,5 @@
> -      case BINDINGS: {
> -        Bindings::AutoRooter *rooter = static_cast<Bindings::AutoRooter *>(this);
> -        rooter->trace(trc);
> -        return;
> -      }

\o/

::: js/src/jsscript.cpp
@@ +59,5 @@
>      return bi.frameIndex();
>  }
>  
>  bool
> +Bindings::initWithTemporaryStorage(JSContext *cx, InternalHandle<Bindings*> self, unsigned numArgs, unsigned numVars, Binding *bindingArray)

Wrap to 100 columns.

@@ +143,5 @@
>      return reinterpret_cast<uint8_t *>(newBindingArray + count());
>  }
>  
>  bool
> +Bindings::clone(JSContext *cx, InternalHandle<Bindings*> self, uint8_t *dstScriptData, HandleScript srcScript)

Wrap to 100 columns.

@@ +156,5 @@
>      /*
>       * Since atoms are shareable throughout the runtime, we can simply copy
>       * the source's bindingArray directly.
>       */
> +    if (!initWithTemporaryStorage(cx, self, src.numArgs(), src.numVars(), src.bindingArray()))

Is this <100 columns?

@@ +210,5 @@
>              bindingArray[i] = Binding(name, kind, aliased);
>          }
>  
> +        InternalHandle<Bindings*> bindings(script, &script->bindings);
> +        if (!Bindings::initWithTemporaryStorage(cx, bindings, numArgs, numVars, bindingArray))

100?

@@ +2063,5 @@
>  
>      /* Bindings */
>  
>      Bindings bindings;
> +    InternalHandle<Bindings*> bindingsHandle = InternalHandle<Bindings*>::fromMarkedLocation(&bindings);

100?  Just a guess.  Splinter doesn't actually give us that info.

::: js/src/jsscript.h
@@ +159,5 @@
>       * bindingArray must have length numArgs+numVars. Before the temporary
>       * storage is release, switchToScriptStorage must be called, providing a
>       * pointer into the Binding array stored in script->data.
>       */
> +    static bool initWithTemporaryStorage(JSContext *cx, InternalHandle<Bindings*> self, unsigned numArgs, unsigned numVars, Binding *bindingArray);

This is way more than 100 columns.

@@ +166,5 @@
>      /*
>       * Clone srcScript's bindings (as part of js::CloneScript). dstScriptData
>       * is the pointer to what will eventually be dstScript->data.
>       */
> +    static bool clone(JSContext *cx, InternalHandle<Bindings*> self, uint8_t *dstScriptData, HandleScript srcScript);

Make sure this is 100 columns.

@@ -209,1 @@
>  };

\o/ \o/
Attachment #658983 - Flags: review?(terrence)
Attachment #650175 - Attachment is obsolete: true
(In reply to Terrence Cole [:terrence] from comment #16)
> Comment on attachment 658983 [details] [diff] [review]
> patch
> 
> Review of attachment 658983 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> This is mostly needing cosmetic cleanup, but I do have one serious question:
> why in the world do you need a static void* NULL that is unused, in a base
> class?

It's used in the private constructor used for fromMarkedLocation(). I need a pointer to a NULL so that I can dereference it, add the offset to it, and get an actual pointer. I stuck it in a base class just so I could only have one of them. If I put it in the templatized class, I'd end up with a different pointer per instantiation, which is fine but seemed a waste. I could just use a global variable, but it seemed like I should somehow associate it with the name "InternalHandle". I suppose I could make an InternalHandle namespace to shove it in, but having a namespace and a template name the same seems not good.

I'm open to suggestions, though! I just need something to use for |holder| to make this work without any conditionals:

  reinterpret_cast<T*>(uintptr_t(*holder) + offset)

> ::: js/src/frontend/Parser.cpp
> @@ +295,5 @@
> >      }
> >  }
> >  
> >  bool
> > +ParseContext::generateFunctionBindings(JSContext *cx, InternalHandle<Bindings*> bindings) const
> 
> Wrap this to 100 columns.

Wrapping nazi.

> @@ +243,5 @@
> > + * direct field of a gcthing.
> > + */
> > +class InternalHandleBase {
> > +  protected:
> > +    static void * const zeroPointer;
> 
> 1) Where are you using this?
> 2) Why do we need this at all?

see above

> @@ +269,5 @@
> > +    /*
> > +     * Create an InternalHandle to a field within a Rooted<>.
> > +     */
> > +    template<typename R>
> > +    InternalHandle(const Rooted<R> &root, T *field)
> 
> Will a Rooted not automatically convert to a Handle and use the above
> constructor?

I doubt it. This is template specialization, not overloading. But let me try... nope, doesn't work. You'd be asking it to figure out that it needs a Handle<T> specialization based on seeing a Rooted<T> usage.
Attached patch patchSplinter Review
Btw, we also discussed making owner be a union of size_t when it's an offset and T*. I didn't do that just because using it properly would mean changing get() to do something like

  if (owner)
    return owner + u.offset
  else
    return u.pointer;

which theoretically could be optimized to the current

  return owner + offset

but I wasn't sure the gain was worth it.

I could rename 'offset' to 'offsetOrMarkedPointer'?
Attachment #659322 - Flags: review?(terrence)
Attachment #658983 - Attachment is obsolete: true
Comment on attachment 659322 [details] [diff] [review]
patch

Review of attachment 659322 [details] [diff] [review]:
-----------------------------------------------------------------

Okay, zeroPointer is annoying, but I can't think of a better solution.
Attachment #659322 - Flags: review?(terrence) → review+
Whiteboard: [leave open] → [leave open][js:t]
It looks like bindings is the only important instance of this.
Status: ASSIGNED → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.