Closed Bug 861596 Opened 11 years ago Closed 11 years ago

IonMonkey: Optimize arguments-objects support in Ion

Categories

(Core :: JavaScript Engine, defect)

x86
macOS
defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla23

People

(Reporter: djvj, Unassigned)

References

Details

Attachments

(3 files, 1 obsolete file)

Bug 860145 adds argsobj support to Ion in a very crude way, always forcing the creation of an arguments object, and emitting slow ops for |arguments.length| (which becomes an MGetPropertyCache), and for |arguments[i]| (which becomes an MGetElementCache).

However this doesn't deliver the necessary perf gains to recover our scores on dromaeo.

We can still optimize idomatic uses of |arguments| in this case, as described in Dromaeo-performance bug 856625 (comments 14 and the next few).
While doing this, I ran into a couple of snags:
1. Ion doesn't inline functions with loops, but that should be fixed shortly by h4writer who is working on it.
2. Prototype's arguments-to-array conversion method actually does a few extra things to the |arguments| object, even when inlined:

function $A(iterable) {
  if (!iterable) return [];
  if ('toArray' in Object(iterable)) return iterable.toArray();
  var length = iterable.length || 0, results = new Array(length);
  while (length--) results[length] = iterable[length];
  return results;
}

Here, iterable is a passed-in |arguments| object (and since $A is inlined, assuming inlining of functions with loops, this code will be embedded in the MIRGraph of the caller during the caller's compiling, with iterable being substitued with the MCreateArgumentsObject of the caller).

The line:
  if ('toArray' in Object(iterable)) return iterable.toArray();

Needs to be handled.

Object(iterable) is easy: Object(x) is the identity function for any object x.
The |str in ArgumentsObj| expression is likewise easy to handle.  Arguments objects directly define only the property names "length", "callee", and the integer arg indices.

The |ArgsObj.toArray()| is more difficult.  The conditional that leads to that expression will never be taken, but deducing that statically is just too difficult.

One solution to this issue is the following: when building the MIR with IonBuilder, replace any attempt to do something "weird" with a CreateArgumentsObject-derived definition (where CreateArgumentsObject-derived is a Box/Unbox/Phi chain originating at CreateArgumentsObject), with an unconditional bailout.
Depends on: 768288
As mentioned in bug 856625 comment 18 it would be good to remove the arguments optimization analysis done based on the script->analysis() info so that arguments analysis only happens in one place.

I don't know though how that should interact with this bug and bug 804676.  Are you trying to land this optimization work ASAP, or just soon?  Some of the stuff you're dealing with for $A will be much easier to deal with post bug 804676, in particular dealing with that ArgsObj.toArray() call.  IonBuilder will know it is dealing with an arguments object from the caller and can deduce the result of the CALLPROP will be undefined.  Something else could be hacked first but it would be good to avoid the mess.

Similarly, it would be nice to not be in an intermediate state where we analyze arguments usage in two places, though that seems more difficult to avoid.  The arguments optimization in Ion could land, then we could just remove the existing arguments optimization stuff and fold its functionality into the Ion version.
I'm actually trying to avoid having to do this in the short term, so if I can find an alternative way of fixing the dromaeo regressions introduced by baseline, we can leave this issue for a more full-fledged addressal later on.

There's a small amount of ugliness in doing this analysis on top of the Ion MIR.  In the MIR, the flow of the CreateArguments into various uses is obscured by Boxes, Unboxes, and Phis.

The nice thing about having this analysis done on the bytecode prior to IonBuilder::build is that it allows us, when starting to Ion-compile a script, to decide early on whether to actively generate an arguments object or not.  That influences selection of MIR instructions later on.  Without that early hint, we'd have to do pick one assumption during generation (probably assume !needsArgsObj), and go back and patch up the MIR if we discover during code generation that this assumption was broken.  That's easier said than done, especially due to aforementioned issues with Boxing/Unboxing/TypeBarrier decisions taken during MIR construction.  Discovering using the MIR whether a particular operand does actually trace back to CreateArgumentsObject (or alternatively, whether CreateArgumentsObject does in fact feed into a "weird" instruction) is more roundabout than doing it on bytecode.

Moreover, baseline also takes advantage of distinguishing between |needsArgsObj| and |!needsArgsObj| in its codegeneration (reducing jitcode size of the majority of scripts which don't make heavyweight use of arguments).

I think we might end up having to keep the bytecode analysis for this.  The resulting info takes only a couple bits to remember (argumentsHasVarBinding, needsArgsObj), and baseline makes good use of those bits.
Given the observation that we spend about 35% of time in Prototype's update() method, a simpler plan of attack suggests itself.

|update()| doesn't actually construct an arguments object, so a good chunk of that time is probably being spent on slowpath access to the arguments object.  I wanted to get a quick idea of what kind of speedups one could expect from just optimizing accesses to arguments objects.

If those are positive, then a decent plan of might be the following:

Have baseline generate optimized stubs for |arguments.length|, |arguments[i]|, and |str in arguments|. (Stubs for length and getelem done, with results & patch that'll be posted for review today)

Have Ion elide |Object(obj)|, which should be trivial (I have something written up which builds, but not tested).

Change Ion to inspect baseline stub chains and generate inline versions of the optimized arguments checks.


These should get most of the benchmarks on dromaeo better than they were prior to baseline landing (including on OSX), but I don't think it'll fix prototype entirely, and I don't think it'll do anything for jquery.trigger.  I have a suspicion that Prototype and jquery.trigger just spends a lot of time in baseline, but I'll have to profile more to confirm that.  The "easy" arguments work above is almost done and should deliver some significant gains, so I'll look at the further profiling after that goes in.
Adds optimized baseline stubs for arguments.length and arguments[int], for both arguments objects, and magic JS_OPTIMIZED_ARGUMENTS.
Attachment #737529 - Flags: review?(bhackett1024)
With the above patch (which only modifies baseline), OSX scores are looking a lot better.  Prototype scores improve, but are still lagging.  And jquery.trigger and jquery.unbind are still behind.

post-baseline-argslobj-stubs-patch(left) vs. pre-baseline-landing(right): 
http://dromaeo.com/?id=193539,193003
Attachment #737529 - Flags: review?(bhackett1024) → review+
Handles JSOP_IN of an atomic, non-index string against object or arguments, when the string property is not found on the object.
Attachment #738133 - Flags: review?(bhackett1024)
Comment on attachment 738133 [details] [diff] [review]
Optimize "str in NativeObj|Arguments" false case

Review of attachment 738133 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/ion/BaselineIC.cpp
@@ +3203,5 @@
> +        return false;
> +
> +    // If obj is not native, or proto chain cannot be scanned without side effects, give up.
> +    if (!obj->isNative())
> +        return false;

I think this test is subsumed by the hasIdempotentProtoChain() above.

@@ +3216,5 @@
> +    for (JSObject *proto = obj->getProto(); proto; proto = proto->getProto()) {
> +        chainDepth++;
> +        // if prototype is non-native, don't optimize
> +        if (!proto->isNative())
> +            return false;

Ditto.

@@ +4696,5 @@
> +    RootedObject holder(cx);
> +    if (!EffectlesslyLookupProperty(cx, obj, propName, &holder, &shape))
> +        return false;
> +
> +    // Try to add optimized stuf for property-not-found

typo

@@ +4857,5 @@
> +    // Return false value.
> +    masm.moveValue(BooleanValue(false), R0);
> +    EmitReturnFromIC(masm);
> +
> +    // Failure case - fail but first reconstruct R0 and R1 form keyReg and objReg.

typo
Attachment #738133 - Flags: review?(bhackett1024) → review+
Phew, some light at the end of the tunnel.

I finished implementing optimized ICs stubs in Ion for ArgsObj operations - specifically .length and GetElement access on integers.  Numbers turn out a LOT better after that:

Dromaeo scores for post-patch (left) and pre-patch (right):

http://dromaeo.com/?id=193702,192854

All the prototype scores have jumped significantly, and are generally within spitting distance of pre-landing scores.  DOM Traversal (Prototype) is the major laggard now.

That said, we're still getting beat around on jquery.trigger.
Whiteboard: [leave open]
Pushed baseline arguments-object stubs patch:
https://hg.mozilla.org/integration/mozilla-inbound/rev/eb99b3a22c5c
Comment on attachment 739080 [details] [diff] [review]
Add ArgsObj.length and ArgsObj[i] stubs to Ion IC GetProp/GetElem

Review of attachment 739080 [details] [diff] [review]:
-----------------------------------------------------------------

Nice work, nice reported improvements :D.
Since there are some bigger changes, I would like to have a second view.

::: js/src/ion/IonCaches.cpp
@@ +1114,5 @@
> +    masm.branchTest32(Assembler::NonZero, tmpReg, Imm32(ArgumentsObject::LENGTH_OVERRIDDEN_BIT),
> +                      &failures);
> +
> +    masm.rshiftPtr(Imm32(ArgumentsObject::PACKED_BITS_COUNT), tmpReg);
> +    // If output is Int32, result is already in right place.

Add a newline between the rshift and comment

@@ +2245,5 @@
> +
> +    // In tempReg, calculate index of word containing bit: (idx >> logBitsPerWord)
> +    masm.movePtr(indexReg, tmpReg);
> +    masm.rshiftPtr(Imm32(JS_BITS_PER_WORD_LOG2), tmpReg);
> +    masm.loadPtr(BaseIndex(scratchReg, tmpReg, ScaleFromElemWidth(sizeof(size_t))), scratchReg);

If we can load the data into tmpReg instead of scratchReg, I think you can remove the dependency on ScratchReg. I.e. the check if the property was deleted on arguments object can already use the tmpReg...

@@ +2274,5 @@
> +                JS_NOT_REACHED("Invalid MIRType for GetElementCache return!");
> +        }
> +        Assembler::Condition cond = masm.testValueNonDoubleType(Assembler::NotEqual, elemIdx,
> +                                                                valType);
> +        masm.j(cond, &failureRestoreIndex);

Use the suggested masm.branchTestMIRType()

@@ +2344,5 @@
> +            ((uint32_t)idval.toInt32() < obj->asArguments().initialLength()) &&
> +            !cache.index().constant() &&
> +            (cache.index().reg().hasValue() ||
> +             cache.index().reg().type() == MIRType_Int32) &&
> +            (cache.output().hasValue() || !cache.output().typedReg().isFloat()))

This is a thorn in my side. I was already hesitant with the condition statement in ArgumentsLength. Can we somehow factor this out. I.e. a new method/function? Something like AttachArgumentsElementPossible or something. I'm really open for any solution, that would make this more readable...

@@ +2349,5 @@
> +        {
> +            if (!cache.attachArgumentsElement(cx, ion, obj))
> +                return false;
> +            attachedStub = true;
> +        } else if (!attachedStub && obj->isNative() && cache.monitoredResult()) {

Why adding !attachedStub here? I think it is impossible to get a problem here, since those are "else if"'s. I.e. that code only runs when it hasn't taken another branch...

@@ +2357,5 @@
>                  if (!cache.attachGetProp(cx, ion, obj, idval, name))
>                      return false;
>                  attachedStub = true;
>              }
> +        } else if (!attachedStub && !cache.hasDenseStub() && obj->isNative() && idval.isInt32()) {

!attachedStub can go away

@@ +2362,4 @@
>              if (!cache.attachDenseElement(cx, ion, obj, idval))
>                  return false;
>              attachedStub = true;
> +        } else if (!attachedStub && obj->isTypedArray()) {

!attachedStub can go away

::: js/src/ion/arm/MacroAssembler-arm.h
@@ +647,5 @@
>      Condition testInt32(Condition cond, const Address &address);
>      Condition testDouble(Condition cond, const Address &address);
>  
> +    Condition testValueNonDoubleType(Condition cond, const BaseIndex &address,
> +                                     JSValueType type);

As discussed on IRC, it would make more sense to add:
- template <typename T>
  void branchTestMIRType(Condition cond, const T &t, MIRType type, Label *label)
- template <typename T>
  Condition testMIRType(Condition cond, const T &t, MIRType type)

and put the switch/case that do the right branchTest* and test*
Attachment #739080 - Flags: review?(hv1989)
Attachment #739080 - Attachment is obsolete: true
Attachment #739670 - Flags: review?(hv1989)
Comment on attachment 739670 [details] [diff] [review]
Add ArgsObj.length and ArgsObj[i] stubs to Ion IC GetProp/GetElem  [Try 2]

Review of attachment 739670 [details] [diff] [review]:
-----------------------------------------------------------------

Cool, looks good!

::: js/src/ion/IonCaches.cpp
@@ +2240,5 @@
> +    masm.branchTest32(Assembler::NonZero, tmpReg, Imm32(ArgumentsObject::LENGTH_OVERRIDDEN_BIT),
> +                      &failures);
> +
> +    // Decide to what type index the stub should be optimized
> +    GeneralRegisterSet regs(GeneralRegisterSet::All());

We can remove this. It isn't needed anymore.
Attachment #739670 - Flags: review?(hv1989) → review+
Pushed - Add optimized stubs for ArgsObj operations in Ion ICs:
https://hg.mozilla.org/integration/mozilla-inbound/rev/6eb19014d780
Depends on: 864342
503a5fb6d530 broke a grid we have in a webpage here, should i file that as a new bug or should i reduce it and post a testcase here? (i would have to reduce the page and that will take quite a bit of time)
(In reply to Cork from comment #20)
> 503a5fb6d530 broke a grid we have in a webpage here, should i file that as a
> new bug or should i reduce it and post a testcase here? (i would have to
> reduce the page and that will take quite a bit of time)

Post it as a new bug with an attachment or link that reproduces the issue (you don't need to reduce it), and mark it as blocking this bug.
Depends on: 864412
Depends on: 866339
Depends on: 866013
(In reply to Phil Ringnalda (:philor) from comment #19)
> https://hg.mozilla.org/mozilla-central/rev/503a5fb6d530

Kannan, I'm thinking that we should back this out due to the various regressions it caused. What says you?
I did a remote build last night with the argsobj functionality disabled, just need to test it.  If that eliminates this behaviour, how about pushing that patch to temporarily disable the logic instead of backing out the entire changeset?
I can't reproduce the issue on either a reference build or the one with the logic disabled on my linux box.  But I'm building on 32-bit, so maybe 64-bit is different, or it's OSX specific.  Go ahead and back out the patch for now - I can't get to this until Monday.
https://hg.mozilla.org/integration/mozilla-inbound/rev/cbd36f35883c

Note that this also reverts bug 864342. I assume that you'll just roll that fix in when you're able to re-land this.
Re-pushing with super minor one-line fix for top crash.  No other changes aside from merging to latest tip.

https://hg.mozilla.org/integration/mozilla-inbound/rev/c860539f82bb
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Whiteboard: [leave open]
Target Milestone: --- → mozilla23
Depends on: 1263811
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: