Last Comment Bug 706328 - IonMonkey: Handle LSetElement for Out-of-Bounds Index.
: IonMonkey: Handle LSetElement for Out-of-Bounds Index.
Status: RESOLVED FIXED
:
Product: Core
Classification: Components
Component: JavaScript Engine (show other bugs)
: unspecified
: x86_64 Linux
: -- normal (vote)
: ---
Assigned To: Jan de Mooij [:jandem]
:
: Jason Orendorff [:jorendorff]
Mentors:
Depends on: 718982 721280
Blocks: IonSpeed
  Show dependency treegraph
 
Reported: 2011-11-29 17:33 PST by Sean Stangl [:sstangl]
Modified: 2012-02-02 05:04 PST (History)
7 users (show)
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---


Attachments
WIP (41.84 KB, patch)
2011-12-29 08:39 PST, Jan de Mooij [:jandem]
no flags Details | Diff | Splinter Review
WIP 2 (51.34 KB, patch)
2011-12-29 12:44 PST, Jan de Mooij [:jandem]
no flags Details | Diff | Splinter Review
Patch (53.52 KB, patch)
2011-12-30 07:03 PST, Jan de Mooij [:jandem]
no flags Details | Diff | Splinter Review
Patch (52.27 KB, patch)
2012-01-17 02:59 PST, Jan de Mooij [:jandem]
no flags Details | Diff | Splinter Review
Patch (55.53 KB, patch)
2012-01-26 05:29 PST, Jan de Mooij [:jandem]
no flags Details | Diff | Splinter Review
Patch (52.89 KB, patch)
2012-01-27 13:44 PST, Jan de Mooij [:jandem]
dvander: review+
Details | Diff | Splinter Review

Description Sean Stangl [:sstangl] 2011-11-29 17:33:30 PST
Ion performance in SS 1.0 bitops-nsieve-bits is about twice as slow as in the interpreter. The reason for this is that we bail out almost immediately at the first site of LBoundsCheck: the InitializedLength opcode reports a length of 0x27, which is quite an understatement.

The MIR, LIR, and regalloc all look correct.
Comment 1 Sean Stangl [:sstangl] 2011-11-29 18:00:45 PST
Apparently "InitializedLength" means "the current (dense) length of the array", not "the length to which the array was initialized". So the value of 0x27 is actually correct.

This means that nsieve-bits needs a stub call in order to expand the array -- and that *nearly every* LSetElement in this benchmark will take that stub call.

An obvious change to support this behavior would be to make LBoundsCheck an input to LSetElement.
Comment 2 Jan de Mooij [:jandem] 2011-11-30 02:37:15 PST
Yeah we should have an OOL path to handle the index == initializedLength case like JM.
Comment 3 Jan de Mooij [:jandem] 2011-12-12 12:51:47 PST
I looked a bit into this today, and there are a number of different ways to fix this.

For each jsop_setelem, we have analysis info to determine whether this setelem has been used to write holes in the past. I think we should use this info to handle two common use cases:

1) setelem is used to overwrite non-hole values (index is always < initializedLength).

2) setelem is used to initialized an array by overwriting holes, for example:
--
var a = Array(n);
for (var i=0; i<n; i++)
    a[i] = 0;
--
Every iteration we have to bump the initializedLength value, and sometimes we have to reallocate the elements vector.

We currently handle case 1, but we need to handle the second case too. For the first case we use the following MIR instructions:

- MElements
- MInitializedLength
- MBoundsCheck (always bails out if this fails)
- MStoreElement

All but the last instruction can be hoisted (LICM) or eliminated (GVN). I think we should keep these instructions to handle the first case, and do something else for the second case.

To handle the second case, we could add 2 new LIR instructions, LStoreElementHole{T,V} (or something similar). These instructions check the initialized length and have an OOL path to bump the initializedLength or call a StoreElement stub. (note that with this approach, we don't need MInitializedLength and MBoundsCheck for case 2)

I think this is the best approach, but there are two alternatives:

1) Add MBoundsCheckAndBranch (or something) like MBoundsCheck, but instead of bailing out it branches to the OOL path of MStoreElement.

2) Add MUpdateInitializedLength and use it instead of MInitializedLength + MBoundsCheck.

These approaches have the following disadvantages:

- We need to jump from one LIR instruction to another LIR instruction (for instance, LBoundsCheckAndBranch -> LStoreElement OOL path). This is complicated because an LMoveGroup may be inserted between the bounds check and the store element. We can't just jump over the move group.

- The OOL path of LStoreElement{T,V} needs access to the JSObject, not just the elements array, for case 2.
Comment 4 Jan de Mooij [:jandem] 2011-12-29 08:39:45 PST
Created attachment 584761 [details] [diff] [review]
WIP

This seems to work quite well. For the following micro-benchmark:
--
var a = [];
for (var i=0; i<10000000; i++) {
    a[i] = i;
}
--
I get these numbers:

IM+TI (new):  82 ms
JM+TI      : 108 ms
IM+TI (old): 814 ms

TODO: remove duplicate code, cleanup, x64/ARM support.

pushValueArg and related functions were copied from the patch in bug 713526, I will remove them once Brian's patch lands.
Comment 5 Jan de Mooij [:jandem] 2011-12-29 12:44:32 PST
Created attachment 584817 [details] [diff] [review]
WIP 2

Adds x64 support, removes a lot of duplicate code, moves StoreElement{V,T} to CodeGenerator (and have it call platform-dependent methods).
Comment 6 Jan de Mooij [:jandem] 2011-12-30 07:03:15 PST
Created attachment 584963 [details] [diff] [review]
Patch

Basically done, but depends on the patch for bug 713526. When it lands I can cleanup this patch and ask for review.
Comment 7 Jan de Mooij [:jandem] 2012-01-17 02:59:06 PST
Created attachment 589139 [details] [diff] [review]
Patch

Makes access-nsieve 9x faster (1 ms slower than JM+TI, but looking at the generated code, there's still room for improvement).

This patch does not help bitops-nsieve-bits. I suspect that we hit bug 705302 and then block OSR. Once that's fixed, this patch should help there too.
Comment 8 Jan de Mooij [:jandem] 2012-01-18 04:58:26 PST
Please ignore the VM call. If we box the int32 index, we can just call the generic stub I'm going to add in bug 718982. Later we could also add a SetElementDense stub, but the generic function is okay for now.
Comment 9 Jan de Mooij [:jandem] 2012-01-26 05:29:03 PST
Created attachment 591760 [details] [diff] [review]
Patch

Rebased + ARM support.
Comment 10 Jan de Mooij [:jandem] 2012-01-27 10:52:26 PST
Comment on attachment 591760 [details] [diff] [review]
Patch

Probably best if I wait for bug 706328, should be able to simplify StoreElementV a bit with the BaseIndex type added there.
Comment 11 Paul Wright 2012-01-27 10:59:01 PST
(In reply to Jan de Mooij (:jandem) from comment #10)
> Comment on attachment 591760 [details] [diff] [review]
> Patch
> 
> Probably best if I wait for bug 706328, should be able to simplify
> StoreElementV a bit with the BaseIndex type added there.

Bug 706328 is this bug.
Comment 12 Jan de Mooij [:jandem] 2012-01-27 11:05:36 PST
(In reply to Paul Wright from comment #11)
> 
> Bug 706328 is this bug.

Oops right, bug 721280.
Comment 13 Jan de Mooij [:jandem] 2012-01-27 13:44:40 PST
Created attachment 592257 [details] [diff] [review]
Patch

Rebased on top of the patches in bug 721280, use BaseIndex for StoreElementV..
Comment 14 David Anderson [:dvander] 2012-01-27 16:28:35 PST
Comment on attachment 592257 [details] [diff] [review]
Patch

Review of attachment 592257 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/ion/CodeGenerator.cpp
@@ +936,5 @@
> +    const ValueOperand value = ToValue(lir, LStoreElementHoleV::Value);
> +
> +    // OOL path if index >= initializedLength.
> +    Address initLength(elements, ObjectElements::offsetOfInitializedLength());
> +    Int32Key key = index->isConstant() ? Int32Key(ToInt32(index)) : Int32Key(ToRegister(index));

This pattern appears a few times; should we have a ToInt32Key()?

@@ +980,5 @@
> +
> +    // If index == initializedLength, try to bump the initialized length inline.
> +    // If index > initializedLength, call a stub.
> +    Label callStub;
> +    masm.j(Assembler::NotEqual, &callStub);

Comment here that it relies on the condition flags sticking from the incoming branch.

@@ +1004,5 @@
> +    masm.bumpKey(&key, -1);
> +
> +    // The inline path for StoreElementHoleT does not always store the type tag,
> +    // so we have to make sure the new element has a valid type tag.
> +    if (ins->isStoreElementHoleT() && valueType != MIRType_Double)

This seems okay for x86/arm, but on x64 we probably just want to always box the first set, and box any further pointer sets.

@@ +1021,5 @@
> +        pushArg(*index->toConstant());
> +    else
> +        pushArg(TypedOrValueRegister(MIRType_Int32, ToAnyRegister(index)));
> +    pushArg(object);
> +    if (!callVM(Info, ins))

There's no risk about |elements| changing because it is necessarily dead after this instruction, right? (because of the alias set)

::: js/src/ion/Lowering.cpp
@@ +958,5 @@
> +      case MIRType_Double:
> +        lir = new LStoreElementHoleT(useRegister(ins->object()),
> +                                     useRegister(ins->elements()),
> +                                     useRegisterOrConstant(ins->index()),
> +                                     useRegister(ins->value()));

I wonder if we should have useRegisterOrNonDoubleConstant() or something to avoid this duplication.

::: js/src/ion/arm/MacroAssembler-arm.h
@@ +534,5 @@
> +    void branch32(Condition cond, const Address &address, Register reg, Label *label) {
> +        move32(address, ScratchRegister);
> +        branch32(cond, ScratchRegister, reg, label);
> +    }
> +    void branch32(Condition cond, const Address &address, Imm32 imm, Label *label) {

super nit: name parameters for these two lhs/rhs

::: js/src/ion/shared/Assembler-x86-shared.h
@@ +450,5 @@
>            default:
>              JS_NOT_REACHED("unexpected operand kind");
>          }
>      }
> +    void cmpl(const Operand &op, const Register &reg) {

super nit: name these lhs and rhs

::: js/src/ion/x86/MacroAssembler-x86.h
@@ +179,5 @@
>      void pushValue(const Value &val) {
>          jsval_layout jv = JSVAL_TO_IMPL(val);
>          push(Imm32(jv.s.tag));
> +        if (val.isGCThing())
> +            push(ImmGCPtr(reinterpret_cast<gc::Cell *>(val.toGCThing())));

Good catch.
Comment 15 Jan de Mooij [:jandem] 2012-02-02 05:04:24 PST
http://hg.mozilla.org/projects/ionmonkey/rev/550a780f73ae

(In reply to David Anderson [:dvander] from comment #14)
> 
> This seems okay for x86/arm, but on x64 we probably just want to always box
> the first set, and box any further pointer sets.

I changed the OOL path to just call storeElementTyped and rejoin to the end of the inline path, so that we no longer need initElementType.
 
> 
> There's no risk about |elements| changing because it is necessarily dead
> after this instruction, right? (because of the alias set)

Right, the AliasSet::ObjectFields flag prevents this.

> 
> I wonder if we should have useRegisterOrNonDoubleConstant() or something to
> avoid this duplication.

Added this function and removed the duplicate useRegister(...) calls by moving them to the top of the function.

Note You need to log in before you can comment on or make changes to this bug.