Open Bug 1056992 Opened 7 years ago Updated 1 year ago

JSClass needs a hook to construct ubi::Nodes specialized for its instances

Categories

(Core :: JavaScript Engine, defect)

defect
Not set
normal

Tracking

()

REOPENED

People

(Reporter: jimb, Unassigned)

References

(Blocks 4 open bugs)

Details

Attachments

(1 file, 7 obsolete files)

JS::ubi::Node is our memory tools' generic thing-in-memory pointer type for reflection; it's supposed to be able to point to embedding-defined types as well as SpiderMonkey types. However, we'll never reach a node for an embedding-defined type from a SpiderMonkey node type if only SpiderMonkey code is allowed to produce outgoing edges.

The JS::ubi::Node constructor, given a T *, actually delegates the construction of the node to a method of JS::ubi::Concrete<T>, so we can customize how we build a Node from a JSObject *. In particular, we can add a new hook to JSClass for constructing a Node specialized for that particular kind of JavaScript object.

For example, here is how we would use such a hook to make DOM structure visible to our memory tools. The JS shadow objects representing DOM nodes could have a JSClass hook that builds a JS::ubi::Node that has an extra outgoing edge to the underlying C++ nsINode object that it represents. Then, we would add a JS::ubi::Node specialization for nsINode that exposes its edges.
Blocks: 1057057
Blocks: 1059139
Per policy at https://wiki.mozilla.org/Bug_Triage/Projects/Bug_Handling/Bug_Husbandry#Inactive_Bugs. If this bug is not an enhancement request or a bug not present in a supported release of Firefox, then it may be reopened.
Status: NEW → RESOLVED
Closed: 3 years ago
Resolution: --- → INACTIVE
Status: RESOLVED → REOPENED
Resolution: INACTIVE → ---
We have a person interested in working on this now!
Kris, can you take a look at this?
Assignee: nobody → kwright
Here's some background for contributors who are solid with C++ and familiar with ubi::Node (read js/public/UbiNode.h) but otherwise new to SpiderMonkey and web development:

The SpiderMonkey JavaScript engine is only concerned with running JavaScript code correctly, safely, and quickly; it doesn't have any browser-specific code in it at all. But SpiderMonkey does provide hooks for customizing the behavior of JavaScript objects so that they act as interfaces to some underlying C++ world. This is how the Web platform is built: everything from a web page's text, to images, to network requests in flight gets represented as JavaScript objects that are "fronting" for Firefox C++ objects, and whose core behaviors are implemented in Firefox's C++ code.

SpiderMonkey's two features that make this possible are:

- You can have JavaScript functions that are implemented in C++. A call to such a JS function results in a call to a given C++ function. These are called "native functions".

- You can have JavaScript objects that hold pointers to C++ objects. These pointers are invisible to the JavaScript world, but C++ code, like the native functions described above, can get them and use them.

So, for example, a JavaScript RegExp object represents a regular expression, with methods for searching and matching strings and so on. A JavaScript RegExp object is represented in C++ as a JSObject that holds a pointer to a C++ js::RegExpShared object. This RegExpShared object actually implements searching, matching, and so on. The JavaScript RegExp methods are implemented by C++ functions that retrieve the js::RegExpShared* from the JSObject, and use that to do their thing. The JSObject's js::RegExpShared pointer is invisible to JavaScript; it's only visible to C++ code looking at the JSObject.

In this example, a JSObject is pointing to a js::RegExpShared object, so all the code involved is within SpiderMonkey. But these customization mechanisms are available to the application in which SpiderMonkey is embedded (referred to as "the embedder"). So in Firefox, you'll have plenty of JSObjects pointing to C++ types in Firefox's own web-specific code.

This bug's concern is that, although JS::ubi::Node can enumerate edges between things that SpiderMonkey knows about --- JavaScript objects, strings, etc. --- at present it cannot see those edges from JSObjects to the embedder's C++ objects. This creates a huge blind spot for tools based on JS::ubi::Node, since that world of the embedder's C++ objects is responsible for plenty of Firefox's memory use.

Naturally, if we expose such an edge, there's got to be another ubi::Node at the far end, so we'd need a JS::ubi::Concrete specialization for that C++ object. And since that C++ object is one of the embedder's types, only the embedder can provide an appropriate Concrete specialization. That can be done, but there are hundreds such types (at least!), and this bug is not about implementing them.

Rather, this bug is about giving the embedder the means to supply its own JS::ubi::Base implementations for its customized JSObjects. A ubi::Node referring to a generic, uncustomized JSObject can continue to use the plain JS::ubi::Concrete<JSObject> subclass of JS::ubi::Base, but a ubi::Node referring to an embedder-customized JSObject should use a correspondingly customized JS::ubi::Base subclass.

Now, at the C++ level, these objects are all simply JSObjects when we encounter them, so making these distinctions will require a trick. We are expecting the JS::ubi::Node::Node(JSObject*) constructor to build different sorts of JS::ubi::Nodes, depending on dynamic information found on the JSObject itself.

Into the details:

We're going to walk a path from "least typed" to "most typed", starting with the ubi::Node constructor:

    template<typename T>
    MOZ_IMPLICIT Node(T* ptr) {
        construct(ptr);
    }

This accepts literally any pointer type, and delegates it to the private Node::construct method:

    template<typename T>
    void construct(T* ptr) {
        ...
        Concrete<T>::construct(base(), ptr);
    }

Here's where the first restriction on T arises: this only works when Concrete<T>::construct exists. The unspecialized Concrete struct has no construct member function, so T must be a type that Concrete has been specialized for.

In our case, we care about calls to ubi::Node::Node(JSObject*), so this will call Concrete<JSObject>::construct, which does exist:

https://searchfox.org/mozilla-central/rev/bf4def01bf8f6ff0d18f02f2d7e9efc73e12c63f/js/public/UbiNode.h#1075-1098

At present, the path of type refinement ends here: the definition simply constructs an instance of Concrete<JSObject> in the Node's storage, and that's the ubi::Base implementation for generic JSObjects. But we would like to extend the path with one more step: we want to look at what sort of JSObject this is, and if it is a customized JSObject, we want to construct a Base instance that knows about those customizations.

Fortunately, JSObject already provides a means for different sorts of objects to customize their behavior. Each JSObject points to a js::Class (available via JSObject::getClass):

https://searchfox.org/mozilla-central/rev/bf4def01bf8f6ff0d18f02f2d7e9efc73e12c63f/js/public/Class.h#867

js::Class holds various metadata, including an indication of how much space the JSObjects should reserve for internal use --- for example, pointers to C++ objects --- but also a bunch of pointers to functions that customize the behavior of the JSObject. These are stored in various structures pointed to by js::Class's cOps, spec, ext, and oOps members. js::Class and all its pieces are usually statically allocated variables with initializers.

For this bug, we want to add a new function pointer that, if present, JS::ubi::Concrete<JSObject>::construct can call to populate a ubi::Node's storage with a JS::ubi::Base subclass appropriate for the JSObjects using that js::Class.

For our purposes, I think the best place to add the function pointer for ubi::Node construction would be js::ObjectOps:

https://searchfox.org/mozilla-central/rev/bf4def01bf8f6ff0d18f02f2d7e9efc73e12c63f/js/public/Class.h#872

A reasonable name might be 'constructUbiNode', and it might as well take the same signature as the Concrete<T>::construct methods, so:

    using ConstructUbiNodeFn = void (*)(void* storage, JSObject*);
    ...
    ConstructUbiNodeFn constructUbiNode;

Once we've added this to ObjectOps (and updated the existing initialized instances), then we can expand Concrete<JSObject>::construct to delegate initializing the Node's storage to the JSObject's class's constructUbiNode function if it is present, but preserve the old behavior if it is nullptr.

So, for example, suppose we wanted to customize ubi::Node's behavior for RegExp JavaScript objects, so that each RegExp object has an edge pointing to its RegExpShared structure.

First, since there needs to be a ubi::Node at the other end of the edge, we need to teach ubi::Nodes to point at RegExpShared structures. We write a JS::ubi::Concrete<RegExpShared> specialization, implementing JS::ubi::Base's methods as appropriate to describe a RegExpShared. This can be as simple or as complicated as we want, depending on how much detail we want to provide about RegExpShared objects.

Next, we need to let RegExp objects report their special edge to their RegExpShared objects. We need a JS::ubi::Base subclass whose edges method returns an EdgeRange that includes everything the generic Concrete<JSObject>'s version does, but also includes a new edge to the RegExpShared. Since JavaScript RegExp objects are represented in C++ by a subclass of JSObject, JS::RegExpObject, we can write a JS::ubi::Concrete<RegExpObject> specialization to serve as our Base.

At this point, we've added the ability to call ubi::Node::Node on pointers to RegExpShared and RegExpObject objects, and get back ubi::Nodes that understand their referents. All that's missing is the ability to call ubi::Node::Node on a JSObject* whose referent happens to be a RegExpObject, and have the constructor build us the more specialized Concrete<RegExpObject> instead of the generic Concrete<JSObject>.

The static member RegExpObject::class_ is the js::Class used for RegExp objects:

https://searchfox.org/mozilla-central/rev/ce86c6c0472d5021ef693cf99abaaa0644c89e55/js/src/vm/RegExpObject.cpp#190-197

This doesn't mention an ObjectOps pointer, so we'll need to add an initializer pointing to a new statically initialized ObjectOps that points to our own constructor function.

Then, when we apply JS::ubi::Node::Node to a JSObject* referring to a RegExp object:

- JS::ubi::Node::Node calls Node::construct;

- Node::construct calls JS::ubi::Concrete<JSObject>::construct;

- Concrete<JSObject>::construct consults the class's constructUbiNode function, sees that it's non-null, and thus calls it;

- This RegExpObject-specific constructUbiNode function builds a Concrete<RegExpObject> instance in the ubi::Node's storage;

and we've got a ubi::Node that reports the additional edge.
WIP - Create UbiNodeUtils.h - contains UbiNode utility files
WIP -  Created a hook in ObjectOps to be implemented by other class instances. As per the example, tested it in RegExpObject and confirmed that it works.
Attachment #8983146 - Flags: feedback?(jimb)
Attachment #8983147 - Flags: feedback?(jimb)
Maybe it's worth considering putting this in ClassOps instead of ObjectOps - there's not a clear distinction, but ObjectOps is more for overriding how basic operations like getProperty work and almost all classes (like RegExpObject here) use the default ObjectOps.
(In reply to Jan de Mooij [:jandem] from comment #7)
> Maybe it's worth considering putting this in ClassOps instead of ObjectOps -
> there's not a clear distinction, but ObjectOps is more for overriding how
> basic operations like getProperty work and almost all classes (like
> RegExpObject here) use the default ObjectOps.

That suits me fine. I wasn't sure which of those structures made the most sense. (Sorry for the bum steer, Kristen.)
Comment on attachment 8983146 [details] [diff] [review]
JSClass needs a hook to construct ubi::Nodes specialized for its instances

Review of attachment 8983146 [details] [diff] [review]:
-----------------------------------------------------------------

This looks fine for a WIP. I'm only marking it feedback- because, after our lunch conversation today, I think we can take a different approach that is simpler and still usable.

::: js/public/UbiNodeUtils.h
@@ +101,5 @@
> +  };
> +
> +// An EdgeRange concrete class that simply holds a vector of Edges,
> +// populated by the init method.
> +class SimpleEdgeRange : public EdgeRange {

We talked about this a bit over lunch today. Conclusions:

The name `SimpleEdgeRange` doesn't reflect the GC-trace-driven behavior of this type, and we'd like to be able to use the vector/index/iteration behavior as a base class for other types that the GC doesn't know how to trace.

I think we can keep this type, but rename `init` to `addTracerEdges`. Both that and `addEdge` should call `settle`; this is a little wasteful, but it ensures the object is always ready to use. Then, a ubi::Node representing something the GC knows about can call `addTracerEdges`, and possibly add other edges it's aware of; and a ubi::Node representing a non-GC'd type can simply call `addEdge` to populate the vector itself.

@@ +127,5 @@
> +
> +    void popFront() override { i++; settle(); }
> +};
> +
> +class CombineEdges : public SimpleEdgeRange {

I think we can delete this class, and just have ubi::Base::edges implementations construct a SimpleEdgeRange, perhaps call `addTracerEdges`, perhaps call `addEdge`, and then return that directly.
Attachment #8983146 - Flags: feedback?(jimb) → feedback-
Comment on attachment 8983147 [details]
JSClass needs a hook to construct ubi::Nodes specialized for its instances

Review of attachment 8983147 [details]:
-----------------------------------------------------------------

::: js/public/Class.h
@@ +715,4 @@
>  
>  struct MOZ_STATIC_CLASS ObjectOps
>  {
> +    using ConstructUbiNodefn = void (*) (void* storage, JSObject* ptr);

As Jan suggested, this probably belongs in ClassOps, rather than ObjectOps. It should be named `...Op`, to match the other members, and the type definition should be placed alongside the others earlier in the file. (I think `using` is preferred to `typedef` these days, so it's fine to deviate from the other `...Op`s in that regard.

::: js/src/vm/RegExpObject.cpp
@@ +1512,5 @@
> +JS::ubi::Concrete<RegExpObject>::size(mozilla::MallocSizeOf mallocSizeOf) const
> +{
> +  // I am unsure what to report here
> +  // Because ptr points to a JSObject I am measuring RegExpObjects right now the
> +  // way they were being measured in JSObject's concrete class

I think copying Concrete<JSObject>::size is the best you can do. Would it help to make Concrete<RegExpObject> inherit from Concrete<JSObject>, instead of TracerConcrete<RegExpObject>? Then you could just inherit the size method.

::: js/src/vm/RegExpObject.h
@@ +26,5 @@
>  
> +namespace JS{
> +namespace ubi{
> +template<>
> +class Concrete <js::RegExpObject> : TracerConcrete <js::RegExpObject> {

I think this should inherit from at least TracerConcreteWithCompartment, since RegExpObjects are JSObjects, and JSObjects have compartments. But we could be even more specific here, and inherit from Concrete<JSObject>; see my later comments.
Attachment #8983147 - Flags: feedback?(jimb)
Created a hook in ClassOps to be implemented by other class instances. Adjusted JSObject specialization construction to delegate construction to the hook, should it exist
WIP - Create UbiNodeUtils.h - contains UbiNode utility files
Attachment #8983146 - Attachment is obsolete: true
Attachment #8983147 - Attachment is obsolete: true
Attachment #8983147 - Attachment is patch: false
Created a hook in ClassOps to be implemented by other class instances. Adjusted JSObject specialization construction to delegate construction to the hook, should it exist
Attachment #8983608 - Attachment is obsolete: true
WIP - Create UbiNodeUtils.h - contains UbiNode utility files
Attachment #8983609 - Attachment is obsolete: true
Depends on: 1466979
Attachment #8983612 - Flags: review?(jimb)
Created a hook in ClassOps to be implemented by other class instances. Adjusted JSObject specialization construction to delegate construction to the hook, should it exist
Attachment #8983612 - Attachment is obsolete: true
Attachment #8983612 - Flags: review?(jimb)
Attachment #8983620 - Attachment is obsolete: true
Created a hook in ClassOps to be implemented by other class instances. Adjusted JSObject specialization construction to delegate construction to the hook, should it exist
Attachment #8986241 - Attachment is obsolete: true
Attachment #8986572 - Flags: review?(jimb)
Comment on attachment 8986572 [details] [diff] [review]
JSClass needs a hook to construct ubi::Nodes specialized for its instances

Please don't review this until bug 1474383 is landed.
Comment on attachment 8986572 [details] [diff] [review]
JSClass needs a hook to construct ubi::Nodes specialized for its instances

Review of attachment 8986572 [details] [diff] [review]:
-----------------------------------------------------------------

You asked me not to review this... is it okay if I just comment?? :)

::: js/public/Class.h
@@ +503,4 @@
>  typedef size_t
>  (* JSObjectMovedOp)(JSObject* obj, JSObject* old);
>  
> +using ConstructUbiNodefn = void (*) (void* storage, JSObject* ptr);

Let's match the ambient style. This should be named JSConstructUbiNodeOp, and let's use a typedef, like the ones above it. And  put it with the other ClassOps typedefs, perhaps following JSHasInstanceOp.

@@ +599,5 @@
> +    JSMayResolveOp     getMayResolve()      const { return cOps ? cOps->mayResolve       : nullptr; } \
> +    JSNative           getCall()            const { return cOps ? cOps->call             : nullptr; } \
> +    JSHasInstanceOp    getHasInstance()     const { return cOps ? cOps->hasInstance      : nullptr; } \
> +    JSNative           getConstruct()       const { return cOps ? cOps->construct        : nullptr; } \
> +    ConstructUbiNodefn getConstructUbiNode()const { return cOps ? cOps->constructUbiNode : nullptr; } \

Renaming the hook may change this, but: space after the parens in "()const"

::: js/src/moz.build
@@ +167,4 @@
>      '../public/UbiNodeDominatorTree.h',
>      '../public/UbiNodePostOrder.h',
>      '../public/UbiNodeShortestPaths.h',
> +    '../public/UbiNodeUtils.h',

It looks like this belongs in your other patch.

::: js/src/vm/UbiNode.cpp
@@ +441,5 @@
> +        if (!src)
> +            new (storage) Concrete(ptr);
> +        else
> +            ptr->getClass()->cOps->constructUbiNode(storage, ptr);
> +    }

`src` is not a suggestive name for this value; be kind to your readers.

I would arrange this `if` nest to check positive conditions, and then let the simplest case be the final fallback:

if (ptr) {
    if (auto construct = ptr->getClass()->getConstructUbiNode()) {
        construct(storage, ptr);
        return;
    }
}

new (storage) Concrete(ptr);
Assignee: kwright → nobody
You need to log in before you can comment on or make changes to this bug.