Closed Bug 1234862 Opened 8 years ago Closed 8 years ago

Attempt to unify the GC's various policy mechanisms

Categories

(Core :: JavaScript: GC, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla47
Tracking Status
firefox46 --- affected
firefox47 --- fixed

People

(Reporter: terrence, Assigned: terrence)

References

(Blocks 1 open bug)

Details

Attachments

(5 files, 1 obsolete file)

In various places we use GCMethods, InternalGCMethods, and GCPolicy to determine what actions to take on a GC reference. It would be nice to have a single way of doing things that we used everywhere. There are a few cases to consider.

Regarding GC Pointers (e.g. JSObject*)
----
Whether stored in C++ stack or heap memory or in the GC heap directly, when stored in Rooted or Heap we need a GCMethods definition; When stored in HeapPtr (et.al.), we need an InternalGCMethods definition; When stored in a GCHashTable or TraceableVector, we need a GCPolicy.

The difference between InternalGCMethods and GCMethods is inlining. GCMethods for all pointers that are /not/ JSObject or JSFunction only define a non-empty body for |initial|, so are trivial. If we want to merge with InternalGCMethods, I believe that post barriers already need to make a call, so I think moving the existing JSObject/JSFunction code out of line will not hurt much, if at all. Also, unexposed classes will keep the existing post method, so the ool aspect is probably not a problem.

For GCHash/Vector, on the other hand, we need a GCPolicy that tells us how to trace and sweep the references. We do not need this for rooted because of the additional RootKind template magic that puts us in a layout-based list. We'll need to keep something like this, however, there is the opportunity to use the Type<->TraceKind mapping to remove this extra template and unify with PersistentRooted. We do not need this for Heap{Ptr} because tracing is done by the owner. With a GC policy on all references, the owner could do all tracing by doing |GCPolicy<decltype(memberName_)>::trace(trc);|; however, this doesn't really improve over our existing argument-based type dispatch ala JS::TraceEdge. Which leads us to:

Aggregate C++ Types (JSPropertyDescriptor, etc.)
----
Recently, we have added the ability to store aggregate C++ types in Rooted and GCHash/Vector. Rooted stores T::trace locally for tracing non-virtually. GCHash/Vector take the virtual overhead already, so have access to a correct T even though we can store these in a Rooted. Since these datastructures can also store pointers too, they use GCPolicy<T>::trace instead of T::trace directly.

We do not currently use GCPolicy<T>::trace to trace Rooted<T>, because we drop the exact T when putting them in the Rooted lists. We can recover the base T from the list, but at that point, we might as well just call TraceRoot instead of GCPolicy<T>::trace, which would call TraceEdge and lose information.

Regarding Tagged Pointer Types (Value, jsid, TaggedProto, etc)
----
Tagged pointers muddy the waters. Traditionally, these have worked exactly like direct pointers, with the trace functions knowing to manually dispatch on entry. In theory, we could use our new tools for marking aggregate types to mark these tagged pointers too and simplify the GC a bit. On the other hand, this would add overhead and the only tools that are needed currently are DispatchTyped and RewrapTaggedPointer. If you drop in these two functions, tracing "just works".

The design space here is huge and the potential for failure enormous. I think we can do better than what we have right now, however.
This is not quite right. Or at least not clear enough.

At a high level, we have a clear correspondence: containers that contain a variable number of GC references, structs that contain a specific larger-than-one number of GC references, tagged pointers (basically a struct with one GC reference), and direct pointers -- a container of a single GC reference. All of these "containers" support |trace| and |needsSweep|.

This breaks down at the level of barriers. The direct pointers want to be barriered directly and the real containers contain barriered pointers instead of direct pointers. This would be fine except that Value and jsid *also* get their own barriers. So our barrier wrappers support containers, but only certain sorts of containers that obey certain properties.

So, this would suggest a question: why can we not just make Value and jsid into "normal" structs for these purposes. HeapSlot is already handled by a custom barrier so we would not lose significant performance here.

The answer is that the packed internal representation of Value and jsid make this impossible. We /could/ use generic barriers to wrap something like Vector<Value> with appropriate barriers externally, but we'd need something specific to do that that looks very much like HeapValue. Compare this to a Value class implemented with |using Value = Variant<HeapObject, HeapString, HeapScript, int32_t, double, etc>|. In this case, we get construction and destruction of the barriered pointers and those pointers have a complete and correct representation in memory, so we'd get the "Value" barriers for free. With a packed representation this isn't possible.

Similarly, because of the packing, we need to be able to TraceEdge, NeedsSweep, etc on Value and jsid directly because there is no unpacked representation we could wrap. Thus, we have the concept of DispatchTyped. We could move this all into a Value::trace, but then we'd need Value::needsSweep and probably a ton of others, judging by how frequently we use DispatchTyped. Instead, we should formalize the concept of DispatchTyped by moving it into the barrier implementation struct.

So, concretely, what do we want to do here? I think the optimal solution actually looks very much like what we have now, with a couple important changes:

* Rename GCMethods to BarrierMethods and DefaultGCPolicy to GCPolicy.
* Rename TraceableVector to GCVector, as we did for GCHashTable.
* Don't take a GCPolicy in GCHashSet or GCVector -- the underlying type should know what to do in those cases.
* Rename GCHashMap::GCPolicy to GCHashMap::sweepPolicy (note: default will still be GCPolicy<Key> || GCPolicy<Value>).
* Move ::initial from BarrierMethods to GCPolicy. This is not specific to barrierable pointers and is the main reason that non-barriered things get GCMethods right now.
* Move DispatchTyped and RewrapTagged into BarrierMethods.
* Try to merge InternalBarrierMethods and BarrierMethods. This split allows for better inlining, but I'm not sure how important that actually is, and we should be able to move the important checks external anyway, if we haven't already.
Keywords: leave-open
Rename GCMethods to BarrierMethods and move ::initial to the new GCPolicy.
Attachment #8702351 - Flags: review?(sphink)
And move TraceableVector to GCVector.
Attachment #8702352 - Flags: review?(sphink)
Comment on attachment 8702351 [details] [diff] [review]
1_rename_GCMethods_to_BarrierMethods-v0.diff

Review of attachment 8702351 [details] [diff] [review]:
-----------------------------------------------------------------

I'm still chewing on the overall picture, but this change in particular seems totally good.
Attachment #8702351 - Flags: review?(sphink) → review+
Attachment #8702352 - Flags: review?(sphink) → review+
Depends on: 1238711
Comment on attachment 8702352 [details] [diff] [review]
2_move_TraceableVector_to_GCVector-v0.diff

Moved to bug 1238711.
Attachment #8702352 - Attachment is obsolete: true
Now that we've converted everything over to generic sweeping, I feel pretty confident saying that we don't really need (or want) our datastructures taking a custom GCPolicy for individual T's. The only exception is for the table's sweeping policy, which is not a GC policy so much as it is a MapSweepingPolicy. This patch and the next two remove the template arg and uses DefaultGCPolicy directly.
Attachment #8712229 - Flags: review?(sphink)
This renames DefaultGCPolicy to GCPolicy, merging it with the old GCPolicy and moving everything to js/public/GCPolicyAPI.h and js/src/gc/Policy.h. All of the various types that need to trace or sweep themselves get trace or needsSweep methods as appropriate, etc. It's mostly code motion, but there's quite a lot of it.

Now that we have a GCPolicy that we can expect on all GC aware types, we should be able to make Rooted<Traceable> look to GCPolicy<T>::trace instead of T::trace and drop the Traceable trait completely. I'll do that in a different bug though.
Attachment #8712234 - Flags: review?(sphink)
(In reply to Terrence Cole [:terrence] from comment #14)
> Created attachment 8712234 [details] [diff] [review]
> 5_rename_to_GCPolicy-v0.diff
> 
> This renames DefaultGCPolicy to GCPolicy, merging it with the old GCPolicy

\o/
Comment on attachment 8712234 [details] [diff] [review]
5_rename_to_GCPolicy-v0.diff

Review of attachment 8712234 [details] [diff] [review]:
-----------------------------------------------------------------

Nice! This is really coming together now.

::: js/public/GCPolicyAPI.h
@@ +17,5 @@
> +//       - Tells the GC how to construct an empty T.
> +//
> +//   static void trace(JSTracer, T* tp, const char* name)
> +//       - Tells the GC how to traverse the edge itself or its children in the
> +//         case of an aggregate.

Hm. Would it be accurate to say

 // Tells the GC how to traverse the edge. In the case of an aggregate,
 // describe how to trace the children.

or something? The wording about sounds like it's talking about the children of the edge. Or should it be "...how to trace the edge to a GC pointer, or in the case of an aggregate, how to trace through the children"?

@@ +25,5 @@
> +//         aggregates, the meaning is less clear. When the aggregate is used in
> +//         a weak container, the return value will determine the semantcs of
> +//         that container's weakness. This is the primary reason that GC-
> +//         supporting containers can override the policy on a per-container
> +//         basis.

This should mention that it isn't just a pure boolean determination, it may also update moved GC pointers.

*semantics

I'm not sure I understand the use of "aggregates" here. For one thing, "aggregate" and "container" sound like they may refer to the same thing. I think you mean when the aggregate is used as a key into or member of a weak container? (Actually, "weak container" is a bit of an odd expression, since it's just a container, one that holds things weakly. But sure, "weak container" could be shorthand for that.) Then "GC-supporting containers". It sounds like you're distinguishing them from weak containers, when you're really just giving some extra clarification. I think that should be "GC-supporting weak containers", maybe? I mean, you could override the policy on a non-weak container, but you'd never sweep it so needsSweep is irrelevant. Then there's trace() -- I guess you're trying to point out here that it's not useful to override trace() on a per-container basis?

Sorry, I guess I should be suggesting an alternative wording.

Tells the GC how to determine if an edge is about to be finalized, and potentially updates the edge for moving GC if not. For aggregates, it determines the weakness semantics of storing the aggregate inside a weak container of some sort. For example, you might specialize the a weak table's key type GCPolicy to describe when an entry should be kept during sweeping. This is the primary reason that GC-supporting weak containers can override the [sweep?] policy on a per-container basis.

@@ +65,5 @@
> +
> +// The default GC policy attempts to defer to methods on the underlying type.
> +// Most C++ structures that contain a default constructor, a trace function and
> +// a sweep function will work out of the box with Rooted, GCHash{Set,Map}, and
> +// GCVector.

and TraceableFifo, which perhaps should be GCFifo now? (I'd never heard of it before.)

Might want to explicitly list Handle too, since I know that Luke at least was (pleasantly) surprised that Handle just worked. Probably that'll be enough to suggest MutableHandle.

@@ +94,5 @@
> +template <> struct GCPolicy<JSAtom*> : public GCPointerPolicy<JSAtom*> {};
> +template <> struct GCPolicy<JSFunction*> : public GCPointerPolicy<JSFunction*> {};
> +template <> struct GCPolicy<JSObject*> : public GCPointerPolicy<JSObject*> {};
> +template <> struct GCPolicy<JSScript*> : public GCPointerPolicy<JSScript*> {};
> +template <> struct GCPolicy<JSString*> : public GCPointerPolicy<JSString*> {};

I don't suppose there's a way to do something like

template <typename T> struct GCPolicy<T*> {
  static_assert(false, "No GCPolicy specialization defined for this type.");
};

is there? (To get a meaningful message into a compile error if you attempt to use an unknown type)

::: js/src/ds/TraceableFifo.h
@@ +25,2 @@
>  // js/public/TracingAPI.h. Generic helpers can be found in
>  // js/public/TracingAPI.h.

pre-existing rebase gunk? The last sentence is duplicated.

::: js/src/gc/Policy.h
@@ +14,5 @@
> +
> +// Forward declare the types we're defining policies for. This file is
> +// included in all places that define GC things, so the real definitions
> +// will be available when we do template expansion, allowing for use of
> +// static members in the underlying types. We cannot, however, user

s/user/use/
Attachment #8712234 - Flags: review?(sphink) → review+
Comment on attachment 8712231 [details] [diff] [review]
4_force_DefaultGCPolicy_for_TraceableFifo-v0.diff

Review of attachment 8712231 [details] [diff] [review]:
-----------------------------------------------------------------

You don't want to make this GCFifo at the same time?
Attachment #8712231 - Flags: review?(sphink) → review+
Attachment #8712230 - Flags: review?(sphink) → review+
Attachment #8712229 - Flags: review?(sphink) → review+
(In reply to Steve Fink [:sfink, :s:] from comment #17)
> Comment on attachment 8712231 [details] [diff] [review]
> 4_force_DefaultGCPolicy_for_TraceableFifo-v0.diff
> 
> Review of attachment 8712231 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> You don't want to make this GCFifo at the same time?

I was going to ask Nick for that review.
The SM(r) and ASAN failures above are on tip and do not reproduce locally.
Status: ASSIGNED → RESOLVED
Closed: 8 years ago
Keywords: leave-open
Resolution: --- → FIXED
Target Milestone: --- → mozilla47
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: