Closed Bug 1211022 Opened 9 years ago Closed 9 years ago

Add a weak C++ GC thing reference that is automagically swept

Categories

(Core :: JavaScript: GC, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla44
Tracking Status
firefox44 --- fixed

People

(Reporter: terrence, Assigned: terrence)

References

(Blocks 1 open bug)

Details

Attachments

(1 file)

We have a surprising amount of code dedicated to nulling out random, untracked references. We can automate most of this fairly easily. This patch adds WeakRef as an alias for ReadBarriered and allows WeakRef to be traced with TraceWeakEdge. This new trace routine records the pointer and nulls it out immediately before sweeping begins.

The try run seems to be green, or at least as green as they seem to come recently:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=ae8af080f6ac
Attachment #8669197 - Flags: review?(sphink)
Comment on attachment 8669197 [details] [diff] [review]
add_weakref_type-v0.diff

Review of attachment 8669197 [details] [diff] [review]:
-----------------------------------------------------------------

Wow, this came out to be very clean and nice. I like it.

::: js/src/gc/Barrier.h
@@ +595,5 @@
>      }
>  };
>  
> +// A WeakRef pointer does not hold its target live and is automatically nulled
> +// out when the GC discovers that it is not reachable from any other path.

which still begs the question of why it needs to be read barriered, but I suppose for an API comment that's fine.

::: js/src/gc/Marking.cpp
@@ +740,5 @@
> +template <typename T>
> +void
> +NoteWeakEdge(GCMarker* gcmarker, T* thingp)
> +{
> +    MOZ_CRASH("the gc does not support tagged pointers as weak edges");

Hm. That's a good check, but I bet most people who end up seeing this message will see it because they neglected to take the address of the edge, and end up wondering why it's complaining about their JSObject* being a tagged pointer. Maybe "NoteWeakEdge must be given the address of a GC pointer. Tagged pointers are not supoprted."?

@@ +753,5 @@
> +    MOZ_ASSERT((*edge)->isTenured());
> +
> +    // Note: we really want the *source* Zone here. The edge may start in a
> +    // non-gc heap location, however, so we use the fact that cross-zone weak
> +    // references are not allowed and use the *target's* zone as a proxy.

The comment is scary. I think removing the "as a proxy" might be better. It had me thinking, "wait, so if the target and source zones are different, then somehow it's safe to use the target's zone because, uh, maybe they'll end up in the same zone group or something?" It was the word "proxy" that threw me. If not for that, I would have noticed the "cross-zone weak references are not allowed."

@@ +757,5 @@
> +    // references are not allowed and use the *target's* zone as a proxy.
> +    JS::Zone::WeakEdges &weakRefs = (*edge)->asTenured().zone()->gcWeakRefs;
> +    AutoEnterOOMUnsafeRegion oomUnsafe;
> +    if (!weakRefs.append(reinterpret_cast<TenuredCell**>(edge)))
> +        oomUnsafe.crash("Failed to record a weak edge for sweeping.");

Ugh. Something of a worrisome place to put an unhandleable oom, but the options to avoid it seem pretty gross. Re-do all of marking? Bloat out weakrefs into a pair of words so you can thread a linked list through them, possibly only used on OOM?

Bleh. Another one of those "let's wait to see if it's a problem." I wonder if we could have automatic telemetry bucketing for different unhandleable OOM reasons -- except I guess we already get that with crashstats.

@@ +2318,5 @@
>  
> +bool
> +js::gc::IsAboutToBeFinalizedDuringSweep(TenuredCell& tenured)
> +{
> +    MOZ_ASSERT(!IsInsideNursery(&tenured));

Seems a little redundant, but I guess it's not too unlikely that someone might reinterpret_cast<>.

::: js/src/gc/Tracer.h
@@ +77,5 @@
>  TraceManuallyBarrieredEdge(JSTracer* trc, T* thingp, const char* name);
>  
> +// Visits a WeakRef, but does not trace it. If *thingp is not marked at the
> +// end of marking, it is replaced by nullptr. This method records thingp, so
> +// the edge location must not change after this function is called.

Ugh.

A: "I have a function that visits a WeakRef, but does not trace it."
B: "Cool! What do you call it?"
A: "TraceWeakEdge"
B: ...

Maybe "Visits a WeakRef, but does not trace its children."? Or "Visits a WeakRef, but does not trace its referent."? The latter is probably more accurate.

::: js/src/jsgc.cpp
@@ +5004,5 @@
> +            /* Edges may be present multiple times, so may already be nulled. */
> +            if (*edge && IsAboutToBeFinalizedDuringSweep(**edge))
> +                *edge = nullptr;
> +        }
> +        zone->gcWeakRefs.clear();

Oh! Seems like clear() (and popBack() etc.) don't bother to shrink the reserved storage space, so perhaps the lack of OOM handling won't be a problem.
Attachment #8669197 - Flags: review?(sphink) → review+
(In reply to Steve Fink [:sfink, :s:] from comment #1)
> Comment on attachment 8669197 [details] [diff] [review]
> add_weakref_type-v0.diff
> 
> Review of attachment 8669197 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> Wow, this came out to be very clean and nice. I like it.
> 
> ::: js/src/gc/Barrier.h
> @@ +595,5 @@
> >      }
> >  };
> >  
> > +// A WeakRef pointer does not hold its target live and is automatically nulled
> > +// out when the GC discovers that it is not reachable from any other path.
> 
> which still begs the question of why it needs to be read barriered, but I
> suppose for an API comment that's fine.
> 
> ::: js/src/gc/Marking.cpp
> @@ +740,5 @@
> > +template <typename T>
> > +void
> > +NoteWeakEdge(GCMarker* gcmarker, T* thingp)
> > +{
> > +    MOZ_CRASH("the gc does not support tagged pointers as weak edges");
> 
> Hm. That's a good check, but I bet most people who end up seeing this
> message will see it because they neglected to take the address of the edge,
> and end up wondering why it's complaining about their JSObject* being a
> tagged pointer. Maybe "NoteWeakEdge must be given the address of a GC
> pointer. Tagged pointers are not supoprted."?

Eh? This is not exposed in a .h, so the only way to call it is via TraceWeakEdge, which takes a WeakRef<T>*, so I don't think this is actually true. If you mean by internal callers, then absolutely: that's just the risk you take when working near heavy machinery with the safety covers off.

> @@ +753,5 @@
> > +    MOZ_ASSERT((*edge)->isTenured());
> > +
> > +    // Note: we really want the *source* Zone here. The edge may start in a
> > +    // non-gc heap location, however, so we use the fact that cross-zone weak
> > +    // references are not allowed and use the *target's* zone as a proxy.
> 
> The comment is scary. I think removing the "as a proxy" might be better. It
> had me thinking, "wait, so if the target and source zones are different,
> then somehow it's safe to use the target's zone because, uh, maybe they'll
> end up in the same zone group or something?" It was the word "proxy" that
> threw me. If not for that, I would have noticed the "cross-zone weak
> references are not allowed."

Done.

> @@ +757,5 @@
> > +    // references are not allowed and use the *target's* zone as a proxy.
> > +    JS::Zone::WeakEdges &weakRefs = (*edge)->asTenured().zone()->gcWeakRefs;
> > +    AutoEnterOOMUnsafeRegion oomUnsafe;
> > +    if (!weakRefs.append(reinterpret_cast<TenuredCell**>(edge)))
> > +        oomUnsafe.crash("Failed to record a weak edge for sweeping.");
> 
> Ugh. Something of a worrisome place to put an unhandleable oom, but the
> options to avoid it seem pretty gross. Re-do all of marking? Bloat out
> weakrefs into a pair of words so you can thread a linked list through them,
> possibly only used on OOM?

Yes, exactly! It's a catch 22. The way it works currently is to encode information about these edge into sweeping via manual IsAboutToBeFinalized in ::finalize or via a custom one-off callback in beginSweepingZoneGroup. Note that in the first case, given incremental sweeping, you have to deal with sweeping both during access in the mutator and deal with reallocation in the finalizer, AND this only works if you know there is a single target. The second mechanism has to be very precisely organized to avoid stepping all over each other, given the intermingled data dependencies.

Given that choice, I think we should just take the OOM risk.

> Bleh. Another one of those "let's wait to see if it's a problem." I wonder
> if we could have automatic telemetry bucketing for different unhandleable
> OOM reasons -- except I guess we already get that with crashstats.
> 
> @@ +2318,5 @@
> >  
> > +bool
> > +js::gc::IsAboutToBeFinalizedDuringSweep(TenuredCell& tenured)
> > +{
> > +    MOZ_ASSERT(!IsInsideNursery(&tenured));
> 
> Seems a little redundant, but I guess it's not too unlikely that someone
> might reinterpret_cast<>.

Actually, this is making sure that the *source* of the TenuredCell pointer is not in the nursery, so it's less redundant than it looks.

> ::: js/src/gc/Tracer.h
> @@ +77,5 @@
> >  TraceManuallyBarrieredEdge(JSTracer* trc, T* thingp, const char* name);
> >  
> > +// Visits a WeakRef, but does not trace it. If *thingp is not marked at the
> > +// end of marking, it is replaced by nullptr. This method records thingp, so
> > +// the edge location must not change after this function is called.
> 
> Ugh.

I guess we could add a barrier to track movement of these, but double-ugh. For the most part these pointers are going to be in gc things or singletonish C++ allocations. If the reference is in a container, then we're going to need something more complicated anyway.

> 
> A: "I have a function that visits a WeakRef, but does not trace it."
> B: "Cool! What do you call it?"
> A: "TraceWeakEdge"
> B: ...

Heh.

> Maybe "Visits a WeakRef, but does not trace its children."? Or "Visits a
> WeakRef, but does not trace its referent."? The latter is probably more
> accurate.

Done.

> ::: js/src/jsgc.cpp
> @@ +5004,5 @@
> > +            /* Edges may be present multiple times, so may already be nulled. */
> > +            if (*edge && IsAboutToBeFinalizedDuringSweep(**edge))
> > +                *edge = nullptr;
> > +        }
> > +        zone->gcWeakRefs.clear();
> 
> Oh! Seems like clear() (and popBack() etc.) don't bother to shrink the
> reserved storage space, so perhaps the lack of OOM handling won't be a
> problem.

Oh, yeah. I had thought that clear did remove the allocation, but if not maybe that's for the best. We'll have to see what this does in the wild. We may even need a HashSet here if we end up visiting the same edge many times in too many cases.
> > @@ +2318,5 @@
> > >  
> > > +bool
> > > +js::gc::IsAboutToBeFinalizedDuringSweep(TenuredCell& tenured)
> > > +{
> > > +    MOZ_ASSERT(!IsInsideNursery(&tenured));
> > 
> > Seems a little redundant, but I guess it's not too unlikely that someone
> > might reinterpret_cast<>.
> 
> Actually, this is making sure that the *source* of the TenuredCell pointer
> is not in the nursery, so it's less redundant than it looks.

Oh! I totally missed that. Indeed, that makes sense.

> > ::: js/src/gc/Tracer.h
> > @@ +77,5 @@
> > >  TraceManuallyBarrieredEdge(JSTracer* trc, T* thingp, const char* name);
> > >  
> > > +// Visits a WeakRef, but does not trace it. If *thingp is not marked at the
> > > +// end of marking, it is replaced by nullptr. This method records thingp, so
> > > +// the edge location must not change after this function is called.
> > 
> > Ugh.
> 
> I guess we could add a barrier to track movement of these, but double-ugh.
> For the most part these pointers are going to be in gc things or
> singletonish C++ allocations. If the reference is in a container, then we're
> going to need something more complicated anyway.

Oh, sorry. The "Ugh" was only about the name. I like the functionality exactly as it is.
https://hg.mozilla.org/mozilla-central/rev/7740085aa5e7
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla44
You need to log in before you can comment on or make changes to this bug.