Closed Bug 638680 Opened 10 years ago Closed 8 years ago

JM: Clean up Call ICs


(Core :: JavaScript Engine, defect)

Not set





(Reporter: dvander, Assigned: dvander)


(Blocks 1 open bug)



(3 files, 3 obsolete files)

Call ICs are way too confusing and bloated. Most of the out-of-line code is not necessary and can be delayed in an IC.
This patch removes Call ICs and leaves uncached calls in, so tests pass without changing JIT functionality.

 7 files changed, 13 insertions(+), 1269 deletions(-)
Attached patch new call ICs (obsolete) — Splinter Review
New call ICs. Keeps funcall/apply speculation and arg splatting inline, and moves everything else out of line. Only one IC gets generated (instead of some confusing twisted mess) and it captures, basically, exactly all of the old paths; nothing more or less. I'm still fixing bugs and should have performance numbers soon to make sure nothing regressed, but this is what the final patch will look like.

Still 500 lines in the black...
 23 files changed, 1060 insertions(+), 313 deletions(-)
Note, this patch also gets rid of "uncached calls", so the call path looks the same whether or not we're in debug mode. Instead, the IC checks for debug mode before attaching a stub.
Attached patch part 2: new call ICs (obsolete) — Splinter Review
passes trace-tests, jstests. unfortunately I had to re-introduce the inline path to get performance back.
Attachment #519507 - Attachment is obsolete: true
Perf is at parity now.
Attachment #519585 - Attachment is obsolete: true
Attachment #519591 - Flags: review?(luke)
Comment on attachment 519504 [details] [diff] [review]
part 1: remove Call ICs

r-, this would be a huge SS/V8 regression !!!!1
Attachment #519504 - Flags: review?(luke) → review+
Comment on attachment 519591 [details] [diff] [review]
part 2: new call ICs

Unfortunately, must leave for weekend and did not finish reviewing.  Code is much nicer though.  Comments so far:

After having to read my own dogfood, speculativeArgc is a confusing name.  IRL we agreed on replacing it with numArgsPushedBeforeCall.

>+    // If speculating funApply, splat the apply args in the OOL path, to
>+    // simplify the OOL and generated paths.
>+    if (ic.frameSize.isDynamic()) {

In this then-branch, could lines 2483 through 2506 (using bugzilla diff line numbers) be hoisted up into the else-branch of the preceding:

>+            if (ic.op == JSOP_FUNCALL)
>+                ic.frameSize.initStatic(frame.localSlots(), speculatedArgc - 1);
>+            else
>+                ic.frameSize.initDynamic();

That way, when ic.frameSize.initDynamic() is set, it reflects the correct state at the corresponding point in the jit code.  Furthermore, line 2507-2508 and 2515-2516 can be shared/hoisted above the if-then.

As discussed, perhaps file a followup bug (with njn cc'd) to determine how much the always-generated stub call path costs in terms of space?

>+    // The OOL path, if no JIT address is returned, jumps back to here
>+    // manually. Every other path returns here naturally via fp->ncode.
>+    stubcc.crossJump(stubcc.masm.jump(), ic.doneRejoinOffset);

From IRL: this can be deleted; the stubcc.masm.jump() will never be taken.
Comment on attachment 519591 [details] [diff] [review]
part 2: new call ICs

>+    DataLabelPtr emitStaticFrame(bool isNew, uint32 frameDepth, void *returnAddress) {

Since this is encoding initCallFrameCallerHalf, perhaps this could be reflected in the name and a comment referencing JSStackFrame::initCallFrameCallerHalf?

Second, it seems strange to put this in BaseAssembler.h.  Perhaps it could go in MonoIC, next to and contrasting with emitDynamicFrame?  Same thought on emitCallTail.  Your choice, of course.

I was going to ask if we could group all this code in a single place so that the call/return story was in a single file, but I think that would not be fun times for bhackett.  Maybe in IonMonkey :)

>+        if (from.hasInlinePath) {
>+            to.hasExtendedInlinePath = true;

Perhaps rename CallICInfo::hasInlinePath to hasExtendedInlinePath to match (or is there a difference I'm missing?).

>+    if (ic.frameSize.isStatic())
>+        ic.slowPathCall = OOL_STUBCALL(funptr);
>+    else
>+        ic.slowPathCall = OOL_STUBCALL_LOCAL_SLOTS(funptr, -1);

Pre-existing, but is it worth adding a OOL_STUBCALL_WITH_SP_ALREADY_SET() ?  -1 is (my fault and) a rather cryptic.

>+        uint32 initialFrameDepth;

I think this is unused

>+        DataLabelPtr param2;


>+        Label doneRejoinOffset;


>-    void emitUncachedCall(uint32 argc, bool callingNew);
>+    void emitUncachedCall(RegisterID reg, FrameSize frameSize, Address rval);

I think this is unused.

>+    if (! || (IsNew && !>isInterpreted())) {
>+        if (IsNew)
>+            return InvokeConstructor(, InvokeArgsAlreadyOnTheStack(vp, call.argc));
>+        return Invoke(, InvokeArgsAlreadyOnTheStack(vp, call.argc), 0);
>+    }
>+    if (>isNative())
>+        return CallJSNative(,>u.n.native, call.argc, vp);

From IRL: natives-called-as-constructors need special handling so, for simplicity, take out the latter 'if' and add '>isNative()' as a disjunct to the former 'if'.

>+    Jump emitDeeperCalleeGuard(Jump guard, Label reenter)
>+    {

This function doesn't appear to be called.  I think its handling the case where we have multiple function instances for the same JSFunction* which, right now appears to fall into the generic scripted path.  Did you intend to remove this optimization?  (Does it not measurably matter?)  If you do remove it, I think you can remove the "if (!claspGuard.isSet())" test in assemble().

>+CallIC::disable(VMFrame &f, Repatcher &repatcher)
>+    CallICFun fun = GetCallICFun<false>(op(), frameSize);
>+    repatcher.relink(slowCall(), JSC::FunctionPtr(JS_FUNC_TO_DATA_PTR(void *, fun)));

The 'f' arg doesn't seem necessary here.  Similarly, looking at the callers, CallIC::update() and CallAssembler don't need a VMFrame, but rather just a JSContext* and a JITScript.  It would be nice to just pass these since it makes clear that the current VMFrame isn't being arbitrarily mutated by ic compilation.

>+    // Always generate a stub.
>+    if (!pool) {
>+        CallAssembler ca(f, *this, call);
>+        LookupStatus status = ca.assemble();
>+        if (status != Lookup_Cacheable)
>+            return status != Lookup_Error;
>+        return != Lookup_Error;
>+    }

ca.assemble() always returns Lookup_Cacheable, so perhaps change to return 'void' and remove check?

Second, can return Lookup_Uncacheable when '!linker.verifyRange(f.jit())', which hasn't set 'pool' yet.  If this happens, will the above 'disable(f, repatcher)' call ensure that we don't try to compile an IC every time?  I think so.  In that case, then it seems like could just return a bool error/no-error value.

>+    return Lookup_Cacheable;

CallIC::update() returns a bool, so this should be 'return true'.

>+struct CallComposition {

The name confused me at first; CallDescription ?

>+    // inlineCallee is the funobj being guarded on the inline path,
>+    // if hasExtendedInlinePath is true. Otherwise, it is unused.
>+    mutable JSObject *inlineCallee;

I don't know whether its worth it, but it would save a callic word to compute this from calleePtr()...

>+// Clean up a frame and return.
>+static inline void
>+InlineReturn(VMFrame &f)

Since we are touching this, I noticed we could share this code with js::Interpret's JSOP_RETURN path.  Even better, we can change popInlineFrame to propagate the fp->rval to the down-frame's stack value so that there is no InlineReturn magic utility.  This would also let us kill RemovePartialFrame which, IIUC, would be fine with this modification of popInlineFrame.

r+ unless you add some more code for emitDeeperCalleeGuard() issue and think it needs review.
Attachment #519591 - Flags: review?(luke) → review+
Thanks, your comments were extremely helpful. A few notes:

> Second, it seems strange to put this in BaseAssembler.h.

emitStaticFrame() gets used in both the main compiler and ICs. BaseAssembler is poorly named, but it encapsulates generic VM interactions too.

> Pre-existing, but is it worth adding a OOL_STUBCALL_WITH_SP_ALREADY_SET() ?

Probably not.

> This function doesn't appear to be called.  I think its handling the case
> where we have multiple function instances for the same JSFunction* which,
> right now appears to fall into the generic scripted path.  Did you intend to
> remove this optimization?  (Does it not measurably matter?)  If you do remove
> it, I think you can remove the "if (!claspGuard.isSet())" test in assemble().

Yeah, you're totally right. I kept this in and forgot to remove it in the final patch. It doesn't matter for perf at all so it was just extra complexity.
Blocks: 646267
Fixes a bug where a GC could cause debug-mode scripts to enable ICs, and pushing this to try.
Attachment #521354 - Attachment is obsolete: true
I finally found why this is failing on try... there's a valid bug, that amazingly doesn't seem to actually matter but makes a debugMode assert throw. The IC updates after the frame has been pushed but it uses cx->fp()->jit() as if it were its own. This should be easy to fix, I'll post a delta patch tomorrow.
This seemed green on try, so:
Whiteboard: fixed-in-tracemonkey
This appears to have broken a big swath of mochitest-a11y tests, only on Linux64 opt. Hard to see at first, since this push got a broken build slave for that build, twice, and then the next push broke everything, but I retriggered mochitest-other on the push before this, and it was green again, and retriggered the build on this push, which I'm pretty sure will be orange. Tree's closed for the combination of this and bug 652127, so you can race to see who's going to be the last one keeping it closed.
Retriggered build and test finished, it was this.
Thanks, philor, backed out as
Whiteboard: fixed-in-tracemonkey
Closed: 8 years ago
Resolution: --- → WONTFIX
You need to log in before you can comment on or make changes to this bug.