Support TI tracking of js::Class
Categories
(Core :: JavaScript Engine: JIT, defect, P2)
Tracking
()
People
(Reporter: tcampbell, Assigned: djvj)
References
(Blocks 1 open bug)
Details
Attachments
(1 file, 1 obsolete file)
32.91 KB,
patch
|
Details | Diff | Splinter Review |
Once ObjectKey is cleaned up, we should add support for tracking js::Class in TI. This will let us track JSFunction/ArrayObject/various DOMClass without running out of number of tracked objects.
We'd need Baseline Type IC updates, but they seem minor.
This should ideally let us remove the DOMClass object limit difference in TI.
Comment 1•6 years ago
|
||
(In reply to Ted Campbell [:tcampbell] from comment #0)
This should ideally let us remove the DOMClass object limit difference in TI.
I think for that we need to support AnyDOMClass, because for DOM objects I expect Class and ObjectGroup to be 1:1 mostly (unless same-Class objects have different protos but that seems unlikely).
Assignee | ||
Comment 2•6 years ago
|
||
So I'd agree about the AnyDOMClass thing.
I have a question about this. In TemporaryTypeSet::isDomClass(), here:
https://searchfox.org/mozilla-central/source/js/src/vm/TypeInference.cpp#2480
Part of the constraint checks on the typeSet ensure that all the recorded types have a constraint set on their hasStableClassAndProto
property (marked with the OBJECT_FLAG_UNKNOWN_PROPERTIES flag). If this flag changes dynamically (as is possible from inspection of the code), then it will trigger the constraint and invalidate.
So part of the implicit contract of the isDomClass()
query + constraints is that we're guaranteed the definition is an object with property HasStableClassAndProto.
If we are to replace the TypeSet entry with something like AnyDOMClass, then it seems we'd need to do one of two things for any existing use of the isDOMClass()
query:
- Verify that the use of the constrained definition does not depend on the HasStableClassAndProto property.
- If the use does depend on HasStableClassAndProto, then add a runtime guard against UNKNOWN_PROPERTIES after the corresponding
isDOMClass()
call.
This seems brittle and error-prone, however - as the user has to remember to make sure to do this wherever they call isDOMClass()
. However, there are currently only 3 places in Ion that do this, and unlikely that there will be a ton more, and we can probably wrap that up as a helper method in IonBuilder.
The other thing is that making the UNKNOWN_PROPERTIES into a runtime check isn't a good bargain. That flag almost never gets set (and I have not seen a single invalidation with that as the cause). The constraints related to those are very stable and never fire. It's one of the examples of the right places to take a constraint instead of a runtime guard.
Thinking about it - we should be able to avoid the runtime guards by attaching constraints at type-set modification time (rather than only at compile-time). Consider:
- We have a TypeSet T with contents AnyDOMClassWithStableClassAndProto.
- We add some DOM object D with class C to T, confirming that C is a DOM class, and that D's group has StableClassAndProto
- We attach a constraint on D's group, on the UNKNOWN_PROPERTIES flag, which when triggered generalizes typeset T to AnyObject.
This latter approach has the downside of some extra constraints being created, but a lot of upsides. One big upside is that we don't have to audit every use of isDOMClass() to ensure that guards are added in the appropriate places where StableClassAndProto is also needed - the same IonBuilder construction will suffice.
Reporter | ||
Comment 3•6 years ago
|
||
The TISnapshot design in Bug 1544429 would encapsulate the unknown_properties stuff inside TISnapshot::isDOMClass to avoid this brittleness. The hope is to solve address the 'audit every use' problem because this is far from the only scary case we'd like to improve.
Assignee | ||
Comment 4•6 years ago
|
||
I've started sketching this out. Draft (non-working) patch incoming of some of the bits once I make sure it builds.
For now, I'm starting by adding some extra options to TYPE_FLAGS: DOM_NATIVE, DOM_PROXY, and DOM_UNKNOWN. These are also given representations in TypeSet::Type.
On the IonBuilder end, we need a new MGuardDOMObject, or potentially MGuardStableClassAndProto. At the high-level, the MGuardDOMObject specifies intent better, and MGuardStableClassAndProto specifies the actual operation. This guard will check for the UNKNOWN_PROPERTIES flag on the group and if set, branch into a VMCall to update the relevant type-set. That should be enough to cause the invalidation of the calling script. This flag never gets set as far as I can tell, so this should be very rare.
Additionally, we need to address the following: https://searchfox.org/mozilla-central/source/js/src/jit/IonBuilder.cpp#6039
That code validates that all the classes associated with all the types in the type-set have a particular common structure. We can't handle this directly the check here can vary heavily depending on the JitInfo struct being used. Currently it relies on new types being added always triggering an invalidation as a way to ensure the validity of the compile-time checks across all the classes of the types. We don't get this anymore in any way unless we're prepared to track multiple classes.
So we will likely want to make this check into a runtime thing too. It should be fast since it's simply pulling out the underlying DOM object and checking an interface-descriptor array for a matching entry in a particular index.
The goal right now is just to eliminate the object-limit specific to DOM objects and provide a better generalization path for the type-set when it overflows full of DOM things (now it generalizes to anyobject, after changes it'll generalize to one of DomNative, DomProxy, or DomUnknown).
Once we have that, we can experiment with reducing the object-list lengths down to 1 by plotting a more aggressive generalization strategy. If reducing object-list lengths down to 1 doesn't cause any performance problems, we can then easily reform the TypeSet datatype from (uintptr_t, ObjectKey**)
with up to 64 bytes of heap-mem, to a single uintptr_t
that multiplexes a few pointer kinds.
Assignee | ||
Comment 5•6 years ago
|
||
This builds, and has some initial work in it. Not complete yet.
Assignee | ||
Comment 6•6 years ago
|
||
bz: Question for you. I need to jit a call to this at runtime:
https://searchfox.org/mozilla-central/source/dom/bindings/BindingUtils.cpp#1100
The protoId
will be accessible to the MIR instruction (MGuardDOMClassHasProtoAtDepth
for example). I need to somehow compile this check from the macro-assembler, but I'm not sure what the best way would be to get at the specific offset information, as well as sizeof(prototype::ID) (which I assume is uint32_t, but I can't actually find the definition of in searchfox).
Ideally, I'd want to call into DOMCallbacks
, passing in some abstract code-gen object, and say "jit yourself", here are the appropriate inputs and targets. I don't think that's feasible because it's too much cross-linking between the codebases.
So I was thinking about using a small public JitDOMClassCheckInfo
structure that specified the parameters A
and B
of *(*(ptr + A) + j*B)
, where A
is offsetof(DOMJSClass, mInterfaceChain)
, and B
is sizeof(prototypes::ID)
.
A JitInfo
struct would have a JitDOMClassCheckInfo domClassCheckInfo()
method, which would return this on the fly. We'd then use this to perform the guard in the jit.
Does that seem like a reasonable approach?
![]() |
||
Comment 7•6 years ago
|
||
Sorry for the lag; I was trying to understand the context of the question, and in particular which current-compile-time guards are going to become runtime. Doing the typecheck at runtime is, as you say, not that expensive, but only as long as there are no cache misses during the pointer-chase. If there's a cache miss, or worse yet several, it becomes pretty expensive....
Ideally, I'd want to call into DOMCallbacks, passing in some abstract code-gen object, and say "jit yourself", here are the appropriate inputs and targets.
It would be interesting to have a setup for this in general, by the way; we've talked about doing that for some DOM getters where we could do better than doing a VMCall by reading things directly in the JIT. But I don't know that we should block on it here.
So I was thinking about using a small public JitDOMClassCheckInfo structure that specified the parameters A and B
That seems fine. If we wanted to, we could just stick this into the DOMCallbacks struct, right? The (C++) runtime lookup of what's fundamentally (C++) compile-time-known information is a little annoying...
A JitInfo struct would have a JitDOMClassCheckInfo domClassCheckInfo() method
This part I'm not quite following. Fundamentally, DOM code is where we know the values of A and B (but we know those at Gecko-compile-time). Where would the implementation of domClassCheckInfo()
live and how would it decide what A and B are?
One thing we could do that might be simple is to hardcode the two numbers as constexpr JitInfo class statics, add static asserts in bindings that the numbers are correct, and have the jit use the constexpr things, at which point we avoid the runtime lookup for these numbers... It's a little icky, but guaranteed to be fast.
![]() |
||
Comment 8•6 years ago
|
||
Where would the implementation of domClassCheckInfo() live and how would it decide what A and B are?
That is, DOM code knows what A and B are, but JitInfo is declared in SpiderMonkey, so presumably that's where domClassCheckInfo()
would live....
![]() |
||
Comment 9•6 years ago
|
||
Oh, and prototype::ID
is uint16_t.
Assignee | ||
Comment 10•6 years ago
|
||
(In reply to Boris Zbarsky [:bzbarsky, bz on IRC] from comment #7)
This part I'm not quite following. Fundamentally, DOM code is where we know the values of A and B (but we know those at Gecko-compile-time). Where would the implementation of
domClassCheckInfo()
live and how would it decide what A and B are?One thing we could do that might be simple is to hardcode the two numbers as constexpr JitInfo class statics, add static asserts in bindings that the numbers are correct, and have the jit use the constexpr things, at which point we avoid the runtime lookup for these numbers... It's a little icky, but guaranteed to be fast.
I started drafting this code and realized the first point you were making. I'll post the patch shortly, but DOMCallbacks ended up being a better fit for that API. I'd prefer not to hardcode A and B. It feels enough of a poor man's abstraction to leak the "the guard is a double indirection into an array" bit, putting the constants into the jit feels even more brittle and a bit of a landmine for future unsuspecting DOM code contributors.
This is what it looks like now:
struct JSDOMGuardProtoParams {
size_t protoArrayPtrOffset;
size_t sizeOfElement;
bool shouldGuard() {
return sizeOfElement > 0;
}
};
typedef struct JSDOMGuardProtoParams DOMGuardProtoParams;
typedef void (*DOMGetGuardProtoParams)(uint32_t protoID, uint32_t depth,
DOMGuardProtoParams* params);
struct JSDOMCallbacks {
DOMInstanceClassHasProtoAtDepth instanceClassMatchesProto;
DOMGetGuardProtoParams getGuardProtoParams;
};
typedef struct JSDOMCallbacks DOMCallbacks;
in MIR, the MGuardDOMObject instruction is modified to embed these params, and then emit the guard against them during code-gen.
Doing the typecheck at runtime is, as you say, not that expensive, but only as long as there are no cache misses during the pointer-chase. If there's a cache miss, or worse yet several, it becomes pretty expensive....
I hadn't considered this point before, at least more than in passing.
Thinking about it.. it's exactly two loads off of the JSClass structure. I'm optimistic about the performance hit here on the following reasoning: By in large there are a reasonably small and finite set of DOM JSClass variants. Of these, I'd expect that some reasonably small subset are the most accessed ones (Event types, various Node types, etc.).
So the given sizeof(prototype::ID) == 2
, these proto arrays should be able to fit 32 ids in a cache line - large enough that a single array would likely fit in a single cache line the vast majority of the time.
Thus I'd expect 2 cache lines per grou to be needed for the check, on average (one to get the protoArrayPtr
, one to read the prototype::ID
). For the "hot classes", these should be in cache with high confidence during most guards. If that sounds a bit handwavey to you, that's probably because it is :)
The upshot of this is that it moves us towards the path of eliminating type-invalidations on Ioncode due to trivial changes like adding a new Group with a new type of Event class to a typeset. It also basically decouples the relationships between type-set size and optimizability of DOM operations. It really doesn't matter if 20 different DOM types show up at a location now - as long as the types all support the op the ion-code will stay valid.
It would be interesting to have a setup for this in general, by the way; we've talked about doing that for some DOM getters where we could do better than doing a VMCall by reading things directly in the JIT. But I don't know that we should block on it here.
No plans to block on that at all, just a nod to a "in a perfect world" scenario. Practically, something like this is blocked on some underlying shared high-level IR infrastructure, which is not on any near-term roadmap.
![]() |
||
Comment 11•6 years ago
|
||
putting the constants into the jit feels even more brittle and a bit of a landmine for future unsuspecting DOM code contributors
Even if we have static_asserts (in the DOM code) ensuring that the constants are correct?
> typedef void (*DOMGetGuardProtoParams)(uint32_t protoID, uint32_t depth,
> DOMGuardProtoParams* params);
We could do this, I guess, but this is assuming (or at least acting like it's assuming) that the params might depend on the protoID and depth, whereas they don't: they're just C++-compile-time constants that only depend on DOMJSClass struct layout and the definition of dom::prototypes::ID
. That is, we could just put these as data members in the JSDOMCallbacks struct and modify https://searchfox.org/mozilla-central/rev/b9da45f63cb567244933c77b2c7e827a057d3f9b/xpcom/base/CycleCollectedJSRuntime.cpp#525-526 accordingly instead of adding another callback function to the callbacks... It's "just" JIT-compile time extra work to do it with the callback, so maybe not too terrible, but I don't really see the benefit of the extra flexibility this allows.
I'd expect that some reasonably small subset are the most accessed ones
If "several dozen" counts as "small" for our purposes here, then yes. Also, there's the pointer-chase to get to the JSClass, right? Looks like it reads a pointer out of the JSObject itself, then out of the group. The JSObject itself needs to get loaded into the cache no matter what (because we're going to try to read its reserved slots to get the C++ DOM thing). Not sure whether the group was going to get loaded anyway.
large enough that a single array would likely fit in a single cache line
Sure. The maximum array length at the moment is 8; I don't expect it to ever increase in practice.
The typechecks we end up doing in the DOM code when we can't take the ion fast-path definitely show up if I measure, but I don't know how much of that is getting to the jitinfo, the extra function call overhead, etc.
I agree that probably this is going to be OK, and the more benchmarky the code the more OK it's likely to be...
Assignee | ||
Comment 12•6 years ago
|
||
Even if we have static_asserts (in the DOM code) ensuring that the constants are correct?
It still bothers me, but I'm not married to the idea of forcing the highest level abstraction possible here.
We could do this, I guess, but this is assuming (or at least acting like it's assuming) that the params might depend on the protoID and depth, whereas they don't: they're just C++-compile-time constants that only depend on DOMJSClass struct layout and the definition of dom::prototypes::ID.
Does it feel like premature abstraction to you? You're right of course that it implies an unnecessary dependency between the guard and the protoId/depth being guarded against.
I guess I'm just trying to constrain things as much as possible for the client of the "how to guard DOM things" API. Assume that it may depend on any of the match parameters, get back some object that encapsulates the guard parameters which can be passed in to masm to generate the macroassembler when needed.
The typechecks we end up doing in the DOM code when we can't take the ion fast-path definitely show up if I measure, but I don't know how much of that is getting to the jitinfo, the extra function call overhead, etc.
The call overheads are likely a huge contributor here. I haven't seen exactly how this gets handled without TI, but I'm guessing we have a similar "jagged cliff" situation here as in other places. The optimized TI paths are fast and happy if we land on them, and everything else gets the slow C++ shunt with a full callVM
(since unoptimized case tells us nothing about whether the DOM target can or cannot execute arbitrary JS).
We would also lose all the nice things we get for DOM calls through the Jit info - narrow aliasing, etc.
Anyway, we won't know for sure until we actually test this approach I guess, so I'll just try to get there ASAP.
Appreciate the detailed feedback.
![]() |
||
Comment 13•6 years ago
|
||
Does it feel like premature abstraction to you?
It mostly feels like taking a (probably smallish, and jit-compile-time-only) performance hit that there's no real reason to take. I might be a bit scarred by how much performance improvement in Gecko has come from eliminating many such small performance hits over the years...
I guess I'm just trying to constrain things as much as possible for the client of the "how to guard DOM things" API.
That makes sense, but where we draw the line between hardcoding that the protoid and depth are the only things that influence the guard vs assuming anything in the jitinfo might do that vs assuming that the structure of the guard is not dependent on the protoid+depth is an interesting question.
My feeling is that if we're willing to assume that two integers (protoid and depth) are enough to determine the guard, we might as well assume that they are just indexing into the JSClass at offsets that don't depend on the exact integers. If that ever changes, then we'd probably need to change from using the two-integer setup anyway.
Now that I've talked through all this, an obvious question: why don't we just change the meaning of the "depth" member of JsJitInfo to be "byte offset from the JSClass"? It's not as conceptually clean, possibly, but it would be quite simple to have bindings generate that instead of the current thing, and then on the JS engine side the semantics would be quite simple....
but I'm guessing we have a similar "jagged cliff" situation here as in other places
Yeah, we do.
Assignee | ||
Comment 14•6 years ago
|
||
So all that seems reasonable enough to me. I'll go ahead and pull that extra ornamentation out of the API then. Including the "normalize depth to byte-offset" bit.
Assignee | ||
Comment 15•6 years ago
|
||
More work. This includes the plumbing for TypeSets, Types, and the integration with DOM, as well as the required MIR instructions.
Just for future reference, things I need to nail down still:
- Aliasing and GVN behaviour for the GuardDOMObject IR instruction.
- Confirm Spectre mitigations are being handled properly.
- Clean up the DOM proto-check API as discussed.
Comment 16•4 years ago
|
||
TI was removed.
Description
•