Last Comment Bug 660039 - SpiderMonkey needs a WeakMap usable from C++
: SpiderMonkey needs a WeakMap usable from C++
Status: RESOLVED FIXED
fixed-in-tracemonkey
:
Product: Core
Classification: Components
Component: JavaScript Engine (show other bugs)
: unspecified
: x86_64 Linux
: -- normal (vote)
: ---
Assigned To: Jim Blandy :jimb
:
Mentors:
Depends on:
Blocks: 636907
  Show dependency treegraph
 
Reported: 2011-05-26 11:41 PDT by Jim Blandy :jimb
Modified: 2011-06-20 17:08 PDT (History)
6 users (show)
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---


Attachments
Rename WeakMap to JSWeakMap; plain WeakMap will be the general-purpose C++ class template. (9.78 KB, patch)
2011-05-26 23:39 PDT, Jim Blandy :jimb
no flags Details | Diff | Review
Minor cleanups to WeakMap. (3.89 KB, patch)
2011-05-26 23:40 PDT, Jim Blandy :jimb
jorendorff: review+
Details | Diff | Review
Provide a WeakMap usable from C++. (12.95 KB, patch)
2011-05-26 23:41 PDT, Jim Blandy :jimb
no flags Details | Diff | Review
Provide a WeakMap usable from C++. (13.69 KB, patch)
2011-05-27 20:08 PDT, Jim Blandy :jimb
jorendorff: review-
Details | Diff | Review
Provide a WeakMap usable from C++. (22.59 KB, patch)
2011-06-07 10:16 PDT, Jim Blandy :jimb
jorendorff: review+
Details | Diff | Review

Description Jim Blandy :jimb 2011-05-26 11:41:36 PDT
Various places in SpiderMonkey would like to have tables that behave like JavaScript WeakMaps --- that have collectable keys and hold their values strongly only when their keys are strongly held --- but can hold arbitrary C++ values. In particular, we need these in the Debug object, which can't use a WeakMap directly because the keys are in a different compartment.

I'd like to add a class template to jsweakmap.h that can be used not only by the WeakMap implementation itself, but by everyone else as well.
Comment 1 Jim Blandy :jimb 2011-05-26 23:39:44 PDT
Created attachment 535563 [details] [diff] [review]
Rename WeakMap to JSWeakMap; plain WeakMap will be the general-purpose C++ class template.
Comment 2 Jim Blandy :jimb 2011-05-26 23:40:24 PDT
Created attachment 535564 [details] [diff] [review]
Minor cleanups to WeakMap.
Comment 3 Jim Blandy :jimb 2011-05-26 23:41:23 PDT
Created attachment 535565 [details] [diff] [review]
Provide a WeakMap usable from C++.

Builds; not tested.

Not ready for review yet, but comments on general approach very welcome.
Comment 4 Jim Blandy :jimb 2011-05-27 20:08:33 PDT
Created attachment 535809 [details] [diff] [review]
Provide a WeakMap usable from C++.

This one's reviewable.
Comment 5 Jason Orendorff [:jorendorff] 2011-06-02 12:16:27 PDT
Comment on attachment 535809 [details] [diff] [review]
Provide a WeakMap usable from C++.

Review of attachment 535809 [details] [diff] [review]:
-----------------------------------------------------------------

r- due to a bug that probably needs a new test. I'd like to re-review in order to look at the test. Otherwise this is exactly what we need.

::: js/src/jsapi.cpp
@@ +643,5 @@
>  #endif
>  
>  JSRuntime::JSRuntime()
> +    : gcWeakMapList(NULL),
> +      gcChunkAllocator(&defaultGCChunkAllocator)

Two spaces before the colon (as it was before your change) is more usual, according to naive grepping.

::: js/src/jsweakmap.cpp
@@ +65,5 @@
> +        if (m->markIteratively(tracer))
> +            markedAny = true;
> +    }
> +    return markedAny;
> +}

This costs a virtual method call per WeakMap during GC (and another one during sweep).

We could do this without any virtual method calls. JSRuntime would have several gcWeakMapLists:

  struct JSRuntime {
      ...
      ObjectWeakMap *gcObjectWeakMapList;
      FrameWeakMap  *gcFrameWeakMapList;
      ...

      // support for the WeakMap template
      void getWeakMapList(ObjectWeakMap ***listpp) { *listpp = &gcObjectWeakMapList; }
      void getWeakMapList(FrameWeakMap ***listpp) { *listpp = &gcFrameWeakMapList; }
      ...
  }

GC would run faster, I think, and WeakMapBase would go away. The downside is, the set of allowed WeakMap template instantiations would basically be hardcoded in several places; each time a C++ user starts using a new instantiation of js::WeakMap, that would require more work than just declaring the type and using it. The hardest parts would still be templatized.

r=me on the prettier slower approach, unless it draws a veto from Andreas or Gregor.

::: js/src/jsweakmap.h
@@ +50,5 @@
>  
> +// A C++ weak table, implemented as a subclass of js::HashMap. Keys and values can be
> +// garbage-collected values, and when a key is collected, the table entry disappears. More
> +// precisely, each entry in this table holds its value strongly only when its key is
> +// strongly held.

Phew. There really is no good way to describe this thing, is there.

Some random thoughts, take them or leave them; I have no idea if they are in any way better or worse than what you've written here:

1. Your audience probably mainly wants an explanation of what this tool does, which is your "Keys and values can..." sentence. You could instead say, which amounts to the same thing:

// This is just like a traditional weak-key map, except GC doesn't cause WeakMap entries
// to vanish randomly, just because they haven't been used for a while. Instead, they
// vanish when -- and only when -- the key is GC'd.

2. What was most helpful to me, I think, was to understand how you'd specify the mark-phase behavior (what your "More precisely" sentence does). It might be even better to visually set this off as something formal, like a definition in a math book.

// More precisely, the rule that explains how GC affects WeakMaps is:
//
//     A WeakMap entry is collected if and only if either the WeakMap or the entry's key
//     is collected. If an entry is not collected, it remains in the WeakMap and it has a
//     strong reference to the value.

@@ +95,5 @@
> +// Common base class for all WeakMap specializations. The collector uses this to call
> +// their markIteratively and sweep methods.
> +class WeakMapBase {
> +  public:
> +    WeakMapBase() : next() { }

That next() seems like unnecessary use of a fairly obscure C++ feature. Splurge and say next(NULL)?

@@ +182,5 @@
> +        for (Enum e(*this); !e.empty(); e.popFront()) {
> +            JS_ASSERT(!t.keyMarked(e.front().key) ==
> +                      !t.valueMarked(e.front().value));
> +            if (!t.valueMarked(e.front().value))
> +                e.removeFront();

The value can be marked even if the key is dead. The existing WeakMap::sweep code checks to see if the key was marked, and it does not contain the assertion. That is correct. The new code could fail to sweep an entry when the key is dead.

(js::Debug::sweepAll currently has this assertion, but that's because in that case we additionally know that the value has a strong reference to the key. That does not hold more generally.)

If this passed tests, I think a new test is necessary!

@@ -52,4 @@
>  
>  class JSWeakMap {
>      ObjectValueMap map;
> -    JSObject *next;

Please go a little further and delete class JSWeakMap. I think only an `extern Class WeakMapClass;` needs to be in the header. All the rest can be file-scope in jsweakmap.cpp.
Comment 6 Jason Orendorff [:jorendorff] 2011-06-02 12:17:36 PDT
Comment on attachment 535563 [details] [diff] [review]
Rename WeakMap to JSWeakMap; plain WeakMap will be the general-purpose C++ class template.

Clearing review flag here; the change is unnecessary if we delete this class. :)
Comment 7 Jason Orendorff [:jorendorff] 2011-06-02 12:18:44 PDT
Comment on attachment 535564 [details] [diff] [review]
Minor cleanups to WeakMap.

OK.
Comment 8 Jim Blandy :jimb 2011-06-02 13:17:16 PDT
Thanks for the careful review!
Comment 9 Jim Blandy :jimb 2011-06-02 18:58:13 PDT
(In reply to comment #5)
> Two spaces before the colon (as it was before your change) is more usual,
> according to naive grepping.

Fixed.

> r=me on the prettier slower approach, unless it draws a veto from Andreas or
> Gregor.

It's a virtual call, but probably a pretty predictable one.
> 1. Your audience probably mainly wants an explanation of what this tool
> does, which is your "Keys and values can..." sentence. You could instead
> say, which amounts to the same thing:
> 
> // This is just like a traditional weak-key map, except GC doesn't cause
> WeakMap entries
> // to vanish randomly, just because they haven't been used for a while.
> Instead, they
> // vanish when -- and only when -- the key is GC'd.

I have never heard of a traditional weak-key map in which entries vanish randomly, just because they haven't been used in a while. This description is really confusing.

> 2. What was most helpful to me, I think, was to understand how you'd specify
> the mark-phase behavior (what your "More precisely" sentence does). It might
> be even better to visually set this off as something formal, like a
> definition in a math book.
> 
> // More precisely, the rule that explains how GC affects WeakMaps is:
> //
> //     A WeakMap entry is collected if and only if either the WeakMap or the
> entry's key
> //     is collected. If an entry is not collected, it remains in the WeakMap
> and it has a
> //     strong reference to the value.

I like this, however. I've changed the comment.

> @@ +95,5 @@
> > +// Common base class for all WeakMap specializations. The collector uses this to call
> > +// their markIteratively and sweep methods.
> > +class WeakMapBase {
> > +  public:
> > +    WeakMapBase() : next() { }
> 
> That next() seems like unnecessary use of a fairly obscure C++ feature.
> Splurge and say next(NULL)?

Obscure to you! :D (I'll add the NULL).

> @@ +182,5 @@
> > +        for (Enum e(*this); !e.empty(); e.popFront()) {
> > +            JS_ASSERT(!t.keyMarked(e.front().key) ==
> > +                      !t.valueMarked(e.front().value));
> > +            if (!t.valueMarked(e.front().value))
> > +                e.removeFront();
> 
> The value can be marked even if the key is dead. The existing WeakMap::sweep
> code checks to see if the key was marked, and it does not contain the
> assertion. That is correct. The new code could fail to sweep an entry when
> the key is dead.

This is a flat-out typo, 'value' for 'key'. Thanks very much for catching it. A test should have caught this.

I carelessly lifted the assertion from the Debug code, where it does hold. A test should have caught this, too.

The code now reads:

    void sweep(JSTracer *tracer) {
        MarkPolicy t(tracer);
        for (Enum e(*this); !e.empty(); e.popFront()) {
            if (!t.keyMarked(e.front().value))
                e.removeFront();
        }
    }

> Please go a little further and delete class JSWeakMap. I think only an
> `extern Class WeakMapClass;` needs to be in the header. All the rest can be
> file-scope in jsweakmap.cpp.

Done.

I'll work on the tests.
Comment 10 Jim Blandy :jimb 2011-06-02 19:00:59 PDT
Err, the code now reads:

    void sweep(JSTracer *tracer) {
        MarkPolicy t(tracer);
        for (Enum e(*this); !e.empty(); e.popFront()) {
            if (!t.keyMarked(e.front().key))
                e.removeFront();
        }
    }
Comment 11 Jim Blandy :jimb 2011-06-07 10:16:15 PDT
Created attachment 537820 [details] [diff] [review]
Provide a WeakMap usable from C++.

[This patch should address all review comments, and is rebased to depend on the patches for Bug 661567, which needs some of the same cleanups and is possibly more urgent.]

Remove WeakMap class; implement the JavaScript object using functions static to jsweakmap.cpp.

Define a new WeakMap class template, parameterized by Key and Value types,
and accepting a MarkPolicy argument saying how to mark them.

Add assertions to check that we check and set the right mark bits, and
tests that trip them in the presence of mistakes in earlier revisions of
this patch.
Comment 12 Jim Blandy :jimb 2011-06-10 11:44:18 PDT
Hi, Andreas. Review ping? This is pretty simple, and I need it on the Debugger branch.
Comment 13 Jason Orendorff [:jorendorff] 2011-06-14 17:58:03 PDT
Comment on attachment 537820 [details] [diff] [review]
Provide a WeakMap usable from C++.

Review of attachment 537820 [details] [diff] [review]:
-----------------------------------------------------------------

r=me.

::: js/src/jsapi.cpp
@@ +643,5 @@
>  #endif
>  
>  JSRuntime::JSRuntime()
> +  : gcWeakMapList(NULL),
> +    gcChunkAllocator(&defaultGCChunkAllocator)

Don't bother. JSRuntime has many fields which must be initialized to NULL. We just memset it somewhere.

I know, disgusting, but that's how it's done.

::: js/src/jsweakmap.cpp
@@ +88,5 @@
>  
> +typedef WeakMap<JSObject *, Value, DefaultHasher<JSObject *>, RuntimeAllocPolicy> ObjectValueMap;
> +
> +static ObjectValueMap *
> +GetObjectMap(JSObject *obj)

These shouldn't be in namespace js, right? Just file-scope seems better. (This is what nameless namespaces are for, but we don't use them...)

@@ +105,5 @@
>      return &vp->toObject();
>  }
>  
> +static JSBool
> +WeakMapHas(JSContext *cx, uintN argc, Value *vp)

I think the prevailing style is to use an underscore in such names: WeakMap_has.

@@ +258,4 @@
>  {
> +    ObjectValueMap *map = GetObjectMap(obj);
> +    if (map)
> +        cx->delete_(map);

Pre-existing nit: JSContext::delete_ allows a non-null pointer, so drop the if.

::: js/src/jsweakmap.h
@@ +134,4 @@
>  
>    protected:
> +    // Instance member functions called by the above. Specializations of WeakMap override
> +    // these with definitions appropriate for their Key and Value types.

Instantiations of WeakMap, not specializations.

@@ +136,5 @@
> +    // Instance member functions called by the above. Specializations of WeakMap override
> +    // these with definitions appropriate for their Key and Value types.
> +    virtual void nonMarkingTrace(JSTracer *tracer) = 0;
> +    virtual bool markIteratively(JSTracer *tracer) = 0;
> +    virtual void sweep(JSTracer *tracer) = 0;

I expressed some concern about GC perf, but I really *really* don't think it's an issue. If we find that it is, we know how to fix it; bring on the follow-up bug.

@@ +148,5 @@
> +template <class Key, class Value,
> +          class HashPolicy = DefaultHasher<Key>,
> +          class AllocPolicy = ContextAllocPolicy,
> +          class MarkPolicy = DefaultMarkPolicy<Key, Value> >
> +class WeakMap : public HashMap<Key, Value, HashPolicy, AllocPolicy>, public WeakMapBase {

Consider getting rid of the AllocPolicy parameter and always using SystemAllocPolicy as the parameter to HashMap. (If you do this, remember to change the comment that says that MarkPolicy is the "fifth template parameter".)
Comment 14 Jim Blandy :jimb 2011-06-14 19:24:38 PDT
http://hg.mozilla.org/tracemonkey/rev/4e6f886b7f2b
Comment 15 Andreas Gal :gal 2011-06-14 19:32:18 PDT
Sorry for the late comment but removing the WeakMap:: class namespace seems a bit gratuitous. We use it elsewhere too, so this seems a step back.
Comment 16 Jason Orendorff [:jorendorff] 2011-06-15 10:27:35 PDT
Let's chat about it on IRC. Jim's first patch in this bug retained class WeakMap, renaming it to JSWeakMap (comment 1), but there were no longer any WeakMap objects being created anywhere, and it was almost empty, plus now we had WeakMapBase, so I asked him to remove it. Jim's on vacation, but let me know what you need and we'll fix it up.
Comment 17 Andrew McCreight (PTO-ish through 6-29) [:mccr8] 2011-06-15 11:17:31 PDT
Let me know what you decide to do, as I'm currently working on getting a WeakMap leak fix tested and landed in Bug 653248.  Thanks.  (I don't have any  opinion about the namespace.)
Comment 18 Andrew McCreight (PTO-ish through 6-29) [:mccr8] 2011-06-15 11:17:47 PDT
(In this bug is okay...)
Comment 19 Chris Leary [:cdleary] (not checking bugmail) 2011-06-20 17:08:18 PDT
cdleary-bot mozilla-central merge info:
http://hg.mozilla.org/mozilla-central/rev/4e6f886b7f2b

Note You need to log in before you can comment on or make changes to this bug.