Last Comment Bug 687901 - IonMonkey: simple inlining
: IonMonkey: simple inlining
Status: RESOLVED FIXED
:
Product: Core
Classification: Components
Component: JavaScript Engine (show other bugs)
: unspecified
: All All
: -- normal (vote)
: ---
Assigned To: Chris Leary [:cdleary] (not checking bugmail)
:
Mentors:
Depends on: 699736
Blocks: 703085 686595
  Show dependency treegraph
 
Reported: 2011-09-20 10:11 PDT by Chris Leary [:cdleary] (not checking bugmail)
Modified: 2011-11-21 14:41 PST (History)
7 users (show)
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---


Attachments
WIP: inlining one level deep. (29.23 KB, patch)
2011-09-20 10:11 PDT, Chris Leary [:cdleary] (not checking bugmail)
no flags Details | Diff | Review
WIP: inlining one level deep. (32.38 KB, patch)
2011-10-20 14:56 PDT, Chris Leary [:cdleary] (not checking bugmail)
no flags Details | Diff | Review
MIR construction with inlining. (202.34 KB, image/png)
2011-11-04 00:04 PDT, Chris Leary [:cdleary] (not checking bugmail)
no flags Details
MIR post-GVN. (78.46 KB, image/png)
2011-11-04 00:04 PDT, Chris Leary [:cdleary] (not checking bugmail)
no flags Details
Microbenchmark. (768 bytes, application/javascript)
2011-11-04 00:07 PDT, Chris Leary [:cdleary] (not checking bugmail)
no flags Details
WIP: inlining one level deep. (76.57 KB, patch)
2011-11-04 00:09 PDT, Chris Leary [:cdleary] (not checking bugmail)
no flags Details | Diff | Review
WIP: inlining one level deep. (77.08 KB, patch)
2011-11-08 17:03 PST, Chris Leary [:cdleary] (not checking bugmail)
no flags Details | Diff | Review
Inlining one level deep. (79.48 KB, patch)
2011-11-09 14:44 PST, Chris Leary [:cdleary] (not checking bugmail)
dvander: review+
Details | Diff | Review

Description Chris Leary [:cdleary] (not checking bugmail) 2011-09-20 10:11:13 PDT
Created attachment 561220 [details] [diff] [review]
WIP: inlining one level deep.

Restricted to one level of nesting for now. Uses the type oracle to determine the inlining target.
Comment 1 Chris Leary [:cdleary] (not checking bugmail) 2011-09-20 10:20:08 PDT
What's left to do:

- Create CheckInlineScript to look at JSScript and make an "are we capable of inlining this" decision -- a bunch of this is in the |TypeOracle::getInliningTarget| right now, which may not be the best place.
- Create phis for inline graphs with multiple exit blocks.
- Extend snapshotting so that it understands nested frame SSA values.
- Test like mad (bailouts for inline frames should be extensively tested).

Questions:

- Do we want the backend to optimize straight-line blocks (with MGotos linking them) into a single basic block? Would make it easier if we don't need to think about merging blocks together opportunistically in the MIR generation.
Comment 2 Chris Leary [:cdleary] (not checking bugmail) 2011-09-20 10:39:21 PDT
Oh yeah, another TODO: the TypeOracle also uses freezeTypes, and I didn't check to see if that works for deoptimization in IonMonkey yet.
Comment 3 David Anderson [:dvander] 2011-09-20 11:06:41 PDT
> - Do we want the backend to optimize straight-line blocks (with MGotos
> linking them) into a single basic block? Would make it easier if we don't
> need to think about merging blocks together opportunistically in the MIR
> generation.

Right now backends will optimize away MGotos that go to the next block.
Comment 4 David Anderson [:dvander] 2011-09-22 14:40:26 PDT
Comment on attachment 561220 [details] [diff] [review]
WIP: inlining one level deep.

Review of attachment 561220 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/ion/TypeOracle.cpp
@@ +143,5 @@
> +        break;
> +      default:
> +        JS_ASSERT(false && "unhandled opcode");
> +        return NULL;
> +    }

Drive-by feedback on this API, which we talked a bit about yesterday. This is the first Oracle function to run into the problem where it's hard to abstract between TI and a theoretical non-TI, IC-based profiling oracle. The former tells us outputs (the inputs are already annotated) and the latter tells us inputs.

As written, this function is designed to be abstracted, but in the non-existent, v8-style oracle case, would not differentiate whether a callee guard is necessary.

types::TypeSet seems like a good abstraction to bridge the gap. Ideally we'd just use the TypeSet attached to individual MIR nodes, but that won't work because of phis, which TI has seen beforehand but we haven't. So instead I'm proposing a signature like:

TypeSet *
TypeOracle::getCallTarget(JSScript *caller, jsbytecode *pc)

And IonBuilder would use your logic below to decide whether to inline. Then, all an Oracle is responsible for is giving us TypeSets.
Comment 5 David Anderson [:dvander] 2011-09-22 23:11:52 PDT
Note, bug 687125 just went by, we should make sure that fix goes into inlining decisions with TI.
Comment 6 Chris Leary [:cdleary] (not checking bugmail) 2011-10-20 14:56:13 PDT
Created attachment 568525 [details] [diff] [review]
WIP: inlining one level deep.

Rebased and made to work on the existing MIRGraph. Fixes an issue where the MControlInstruction return values weren't being unlinked via their operands.
Comment 7 Chris Leary [:cdleary] (not checking bugmail) 2011-11-04 00:04:27 PDT
Created attachment 571892 [details]
MIR construction with inlining.
Comment 8 Chris Leary [:cdleary] (not checking bugmail) 2011-11-04 00:04:59 PDT
Created attachment 571893 [details]
MIR post-GVN.
Comment 9 Chris Leary [:cdleary] (not checking bugmail) 2011-11-04 00:07:38 PDT
Created attachment 571894 [details]
Microbenchmark.

Behold, the power of inlining dumb stuff!

On my i7 box this reduces the runtime for this microbenchmark from 242ms to 17ms.
Comment 10 Chris Leary [:cdleary] (not checking bugmail) 2011-11-04 00:09:06 PDT
Created attachment 571895 [details] [diff] [review]
WIP: inlining one level deep.
Comment 11 Chris Leary [:cdleary] (not checking bugmail) 2011-11-08 17:03:47 PST
Created attachment 573048 [details] [diff] [review]
WIP: inlining one level deep.

Reworked to look backwards from the JSOP_CALL site. Will see how this does on tests.
Comment 12 Chris Leary [:cdleary] (not checking bugmail) 2011-11-09 13:43:12 PST
Passes all shell tests, will post for review after a little bit of cleanup.
Comment 13 Chris Leary [:cdleary] (not checking bugmail) 2011-11-09 14:44:34 PST
Created attachment 573334 [details] [diff] [review]
Inlining one level deep.

Passes shell tests. Follow up bug will address making it multi-level.
Comment 14 David Anderson [:dvander] 2011-11-12 11:21:17 PST
Comment on attachment 573334 [details] [diff] [review]
Inlining one level deep.

Review of attachment 573334 [details] [diff] [review]:
-----------------------------------------------------------------

Awesome, I'm really glad the whole resume point -> snapshot -> bailout thing played out so simply, as did the recursive graph builder for the most part. I have some comments and nits, so r=me with those addressed.

::: js/src/ion/Bailouts.cpp
@@ +191,5 @@
> +    regs.pc = fp->script()->code + pcOff;
> +
> +    IonSpew(IonSpew_Bailouts, " new PC is offset %u within script %p",
> +            pcOff, (void *) fp->script());
> +    JS_ASSERT(exprStackSlots == js_ReconstructStackDepth(cx, fp->script(), regs.pc));

Haha, this is definitely the right thing to do but I have nightmares of this assert failing all the time in LeaveTree(). I'll cross my fingers :)

::: js/src/ion/IonBuilder.cpp
@@ +149,5 @@
> +    }
> +
> +    JSScript *inlineScript = fun->script();
> +
> +    if (!inlineScript->hasAnalysis() || !inlineScript->analysis()->ranInference()) {

Any way to hide this behind the Oracle? Actually, the Oracle::enterInlinedScript suggested below could just be fallible, and that might take care of this.

@@ +1855,5 @@
> +
> +    // |top| jumps into the callee subgraph -- save it for later use.
> +    MBasicBlock *top = current;
> +
> +    MResumePoint *inlineResumePoint = MResumePoint::NewUnwrapArgs(top, argc, pc, callerResumePoint_);

callerResumePoint_ is never set. What is it?

@@ +1868,5 @@
> +        return false;
> +    for (uintN i = 0; i < argc; ++i) {
> +        MPassArg *passArg = top->pop()->toPassArg();
> +        JS_ASSERT(passArg->useCount() == 0);
> +        passArg->block()->remove(passArg);

"remove" will leave passArg's input operand in-tact, making it still part of the actual argument's use chain. You want to use the more destructive variant "nuke" which NULLs out the operands.

(It might have gotten a different name - check with sstangl.)

@@ +1875,5 @@
> +    }
> +
> +    MPassArg *thisArg = top->pop()->toPassArg();
> +    MDefinition *thisDefn = thisArg->getArgument();
> +    thisArg->block()->remove(thisArg);

Iterate to argc + 1? |thisArg| is never used.

@@ +1885,5 @@
> +    MIRGraphExits &exits = inlineBuilder.graph().getExitAccumulator();
> +
> +    // Create a |bottom| block for all the callee subgraph exits to jump to.
> +    JS_ASSERT(*pc == JSOP_CALL);
> +    jsbytecode *postCall = pc + js_CodeSpec[JSOP_CALL].length;

I think we have a GetNextPc() helper in IonBuilder for this :)

@@ +1897,5 @@
> +
> +        if (!retvalDefns.append(exitBlock->lastIns()->toReturn()->getOperand(0)))
> +            return false;
> +
> +        exitBlock->removeLastIns();

s/remove/whatever-the-nuke-function-is/

@@ +1916,5 @@
> +    if (retvalDefns.length() > 1) {
> +        // Need to create a phi to merge the returns together.
> +        MPhi *phi = MPhi::New(bottom->stackDepth());
> +        if (!phi)
> +            return false;

Can remove the error check here, this falls under the ballast.

@@ +1961,5 @@
> +    if (!getInliningTarget(argc, pc, &inlineFunc))
> +        return false;
> +
> +    if (!inlineFunc)
> +        return true;

"shouldInline" is a little confusing here because it's actually returning whether an error occurred. Maybe "makeInliningDecision" or something.

@@ +1974,5 @@
>  {
> +    if (inliningEnabled()) {
> +        InliningData data;
> +        if (!shouldInlineCurrentCall(argc, &data))
> +            return false;

nit: jsop_call is probably going to become a very long function in the future - could this |if|, or everything under it, be split out?

@@ +1997,5 @@
> +            } else {
> +                DummyOracle oracle;
> +                IonBuilder inlineBuilder(cx, temp(), graph(), &oracle, *info, inliningDepth + 1);
> +                return jsop_call_inline(argc, inlineBuilder, &data);
> +            }

Ideally IonBuilder shouldn't have to sniff out the oracle like this. What if instead the TypeOracle was extended with a new API, like:

    oracle->enterInlinedScript(script);
    oracle->leaveInlinedScript([outer]);

I believe TI has to be informed of inlining, so that might work out nicely. Then you could also get rid of the AutoExitAccumulator.

::: js/src/ion/IonBuilder.h
@@ +299,5 @@
>      Vector<CFGState, 8, IonAllocPolicy> cfgStack_;
>      Vector<ControlFlowInfo, 4, IonAllocPolicy> loops_;
>      Vector<ControlFlowInfo, 0, IonAllocPolicy> switches_;
>      TypeOracle *oracle;
> +    uintN inliningDepth;

nit: size_t

::: js/src/ion/MIR.h
@@ +1684,5 @@
>      MDefinition **operands_;
>      uint32 stackDepth_;
>      jsbytecode *pc_;
> +    MResumePoint *parent_;
> +    jsbytecode *parentPC_;

"parent" is fairly overloaded in JS - would "outer" or "prev" be better?

@@ +1738,5 @@
> + * function inlining). Operands are ordered from oldest frame to newest.
> + */
> +class FlattenedMResumePointIter
> +{
> +    Vector<MResumePoint *, 8, IonAllocPolicy> resumePoints;

Use the normal alloc policy here, unless FlattenedMResumePointIters will be stored in the heap.

@@ +1748,5 @@
> +
> +    bool traverse(MResumePoint *resumePoint) {
> +        if (!resumePoint)
> +            return true;
> +        if (!traverse(resumePoint->parent()))

nit: seems like this could be iterative?

::: js/src/ion/shared/CodeGenerator-shared.cpp
@@ +187,5 @@
> +        jsbytecode *pc = mir->pc();
> +        uint32 exprStack = mir->stackDepth() - block->info().ninvoke();
> +        snapshots_.startFrame(fun, script, pc, exprStack);
> +        encodeSlots(snapshot, mir, &startIndex);
> +        snapshots_.endFrame();

Nice.

::: js/src/ion/x86/Lowering-x86.cpp
@@ +179,4 @@
>  
> +    size_t i = 0;
> +    for (MResumePoint **it = iter.begin(), **end = iter.end(); it != end; ++it) {
> +        MResumePoint *mir = *it;

Nice - but do Lowering-arm and Lowering-x64 need these changes as well?

::: js/src/shell/js.cpp
@@ +5410,5 @@
>          ion::js_IonOptions.setEagerCompilation();
>      }
> +
> +    if (op->getBoolOption("ion-inlining"))
> +        ion::js_IonOptions.inlining = true;

Instead, could this be --ion-inlining=on/off, defaulted to on?
Comment 15 Chris Leary [:cdleary] (not checking bugmail) 2011-11-21 14:18:52 PST
https://hg.mozilla.org/projects/ionmonkey/rev/01ebfabf29e2

Addresses all review comments, except the stack-like script interface for the oracles, for which I filed bug 703085, which I'm going to do next. Also, I think |thisArg| will be used in the future (when we have callprop support) and we don't append that into the arguments vector, so I left that one alone for now.
Comment 16 Chris Leary [:cdleary] (not checking bugmail) 2011-11-21 14:41:56 PST
Some files were strangely changed during my mq rebase -- here's the revert changeset:

https://hg.mozilla.org/projects/ionmonkey/rev/e974d2324bb5

Note You need to log in before you can comment on or make changes to this bug.