Closed Bug 658676 Opened 10 years ago Closed 8 years ago

Static analysis for identifying moving GC hazards

Categories

(Core :: JavaScript Engine, defect)

x86
macOS
defect
Not set
normal

Tracking

()

RESOLVED DUPLICATE of bug 831409

People

(Reporter: bhackett1024, Unassigned)

References

Details

Attachments

(1 file, 1 obsolete file)

Experimental idea, it would be neat to have a static analysis to identify hazards with a moving GC --- GC thing pointers which are live across a call that can possibly trigger a GC.  This is an easier problem than with a non-moving GC, as we don't need to worry about whether a thing pointer is rooted.  We mostly just need an analysis to identify functions that can trigger GC, and a path-based analysis that can track liveness interprocedurally.  Fortunately, sixgill is a great fit for both of these requirements.

Interested in trying this out partly because I don't have a good sense for how hard it would be to introduce smart pointers for GC things and migrate off the conservative collector as we move towards a generational GC.
This sounds great.
Very rough first cut.

http://sixgill.org/gcsafe

This knows which functions can transitively call GC and which types to watch out for.  Only dereferences are checked (copies also need to be checked), and it doesn't yet have any idea how to actually determine an access is safe, so there are tons of reports.

It's easy enough to train the analysis to identify heap fields which are traversed by GC, and are always safe to access (as long as no copy is live across a GC).  Trickier is tracking locals across calls.

The existing AutoObjectRooter only really works for a non-moving GC (unless one wants to call root.object() everywhere), but I'd like to avoid the whole hog V8 Handle<Thing> route for now.  Doing this would require a ton of code changes, and I'd like to minimize the delta.  I think that AutoObjectRooter could be reworked to support this, by rooting a particular stack location rather than a particular object.

class AutoObjectRooter
{
    JSObject *&object;
    AutoObjectRooter(JSContext *cx, JSObject *&object);
};


Then:

void foo()
{
    JSObject *obj = ...;
    AutoObjectRooter root(cx, obj);

    MaybeGC();

    ... = obj->...;
}

If MaybeGC() relocates the original value of obj, the stack value of the local gets updated in the process.  Making a root should be cheap, the object does not need to be accessed indirectly (inhibits compiler optimizations related to the locals, though. oh well), the roots can be #ifdef'ed into nops for checking perf vs. the conservative collector, and little code needs to be changed.

This should work for arguments and locals but not 'this', whose address cannot be taken.  I don't know a way to trick the compiler into giving up &this, and doing black magic to compute the address from the frame pointer won't work with the varying ABIs that C++ code can use (not to mention inlining).  It's probably best to turn GC thing member functions which can trigger GC into wrappers for helper methods.
Attached patch root patch (obsolete) — Splinter Review
Make template classes for GC thing rooters, update existing AutoObjectRooters etc. to use them and add STATIC_GCSAFE annotation.  Class AutoRooter roots the address of a local or argument:

void foo(JSContext *cx, JSObject *obj)
{
    AutoRooter<JSObject> root(cx, &obj);

    JSObject *obj2 = obj;   // safe
    maybeGC();
    ... = obj2->f;          // not safe

    obj = ...;              // ok to reassign, stays rooted
    obj->f = ...;           // safe
}

Class AutoRooterVar acts like a new always-rooted local variable.

void foo(JSContext *cx, JSObject *obj)
{
    AutoRooterVar<JSObject> obj2(cx, obj);

    maybeGC();
    ... = obj->f;           // safe
    ... = obj2->f;          // not safe

    obj2 = ...;             // ok to reassign, stays rooted
    obj2->f = ...;          // safe

    JSObject *obj3 = obj2;  // coerces to JSObject*
}

AutoRooter and AutoRooterVar constructors are similar but have different signatures, for some typo protection.  Neat thing: if you take a pointer by reference ('void foo(const JSObject *&obj)' and all the callers pass rooted locals or heap locations, the callee doesn't have to reroot the object.

The STATIC_GCSAFE annotation indicates heap fields that are traversed by GC.  These are assumed correct, so the analysis will treat accesses to them as always being safe.

struct JSObject {
    STATIC_GCSAFE(lastProp)
    const Shape *lastProp;
    ...
};

void foo()
{
    const Shape *shape = obj->lastProp;   // safe
    ...
    shape->flags = x;    // safe if no possible GC in the '...'
}

The annotation needs to be passed the field name (sixgill annotations apply to the whole type, not just a field), but can be tucked in front of any field or at the start of the structure to avoid cluttering up the rest of the definition.

The analysis needs some more work. A) it knows about rooted fields, but doesn't pick them up from annotations yet. B) it knows about rooted vars, but its checking has some gaps.  Will fix these, and then I'd like to start adding annotations and AutoRooter/AutoRooterVar as necessary to drive down the warnings.  Would be good to get them into the tree relatively easily and avoid merge hell, but to avoid adding pointless overhead in the presence of the conservative collector we may want to nop AutoRooter for now.

In places where we depend on the rooter even with the conservative collector ---GC thing is no longer live, but derived data like slots()/chars() still is --- just fix the code to reload that data after possible GC triggers.
Attachment #535579 - Flags: feedback?(wmccloskey)
(In reply to comment #3)
> void foo(JSContext *cx, JSObject *obj)
> {
>     AutoRooterVar<JSObject> obj2(cx, obj);
> 
>     maybeGC();
>     ... = obj->f;           // safe
>     ... = obj2->f;          // not safe

Doh, reverse the notes here (it's late, also we need a checker for a reason).  obj is not safe because even though it is the same object as obj2, that object may have been relocated by maybeGC(). obj2 is safe because if such a relocation occurred, the stack address of the variable was updated.
Updated reports, at a new URL.

https://sixgill.mozilla.org/

These reflect GC safety annotations for some of the core fields traversed by GC for JSObject and Shape (a couple others too, like JSObjectBox::object).  Along with some other improvements, this cut the number of reports from 2500 to 1300.  These should be complete (modulo analysis bugs) within js/src for JSObject, JSString and Shape.  The exception is functions which we failed to analyze, either because of an analysis timeout (29 functions) or because they used a language feature the tool doesn't model (77 functions).  The latter is almost all uses of pointer-to-member in some hashtable template instantiations (each of which is a different function), but there are also some uses of computed goto, particularly js::Interpret.  These functions need to be examined by hand.  In the case of Interpret, it would be nice to factor out its guts into helper methods which could be analyzed and which the JIT stubs could also use (the duplicated and near-duplicated code has been an unfortunately persistent source of bugs in TI).

The page also includes reports listing all functions in js/src that can GC along with a stack trace along which they could GC.  This is generated using the same machinery as normal sixgill reports so will also exclude the above functions which analysis failed on.
Attached patch patchSplinter Review
Add more roots, reducing the number of warnings in js/src to 800.  This changes the format of the rooters so that they don't take a JSContext, but can determine the compartment from the initial thing and entrain the root info there.  (The roots still have a no-op implementation, though).  This allows rooted variables to be initialized with '=' during construction like the raw pointer variables they replace, and functions can have rooted variables as arguments --- coercion will happen automatically at call sites.  A second constructor for variables still takes a context, to handle cases where the variable starts out NULL but is then assigned a value.
Attachment #535579 - Attachment is obsolete: true
Attachment #535579 - Flags: feedback?(wmccloskey)
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → DUPLICATE
Duplicate of bug: 831409
You need to log in before you can comment on or make changes to this bug.