Closed Bug 1147430 Opened 10 years ago Closed 9 years ago

Optimize Load{FixedSlot,Slot,Element} followed by an Unbox

Categories

(Core :: JavaScript Engine: JIT, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla45
Tracking Status
firefox45 --- fixed

People

(Reporter: jandem, Assigned: jschulte)

References

Details

(Keywords: perf)

Attachments

(1 file, 4 obsolete files)

I was curious why the "misc-basic-closures" benchmark on AWFY is at least 3x slower on x64 than on x86. One problem is that we do this on 32-bit to load an int32 upvar: [LoadFixedSlotV] movl 0x20(%eax), %ebp movl 0x24(%eax), %esi [Unbox:Int32] cmpl $0xffffff81, %esi jne .Lfrom270 But on 64-bit: [LoadFixedSlotV] movq 0x30(%rax), %rbx [Unbox:Int32] movq %rbx, %r11 shrq $47, %r11 cmpl $0x1fff1, %r11d jne .Lfrom250 movl %ebx, %ebx This is pretty inefficient; at least for int32/bool we should be able to combine these instructions and do something like: cmpl $0x1fff1, 0x34(%rax) jne <bail> movl 0x30(%rax), %rbx Note that this is also shorter than the x86 version, so we'd probably want this optimization there as well.
Related to bug 787735?
Attached patch WIP.patch (obsolete) — Splinter Review
For now I've only implemented int32 in a fixed slot to hear a first feedback from you.
Attachment #8639556 - Flags: feedback?(jdemooij)
Comment on attachment 8639556 [details] [diff] [review] WIP.patch Review of attachment 8639556 [details] [diff] [review]: ----------------------------------------------------------------- Sorry for the delay. Excited to see somebody working on this bug! Bug 1067484 should give us a framework to make it easier to do this kind of optimization, but in the meantime this seems fine. A high-level comment: instead of LoadAndUnboxInt32, can we have a LoadFixedSlotAndUnbox that handles the other unboxed types too? It seems like it should be pretty easy :) We don't need a new field for this, we can just use setResultType(type) and mir->type() for this. I like having "FixedSlot" in the name, so that we can later add MLoadSlotAndUnbox to match MLoadSlot. Eventually it might be nice to get rid of MLoadFixedSlot and fold it into MLoadSlot, but that can wait. ::: js/src/jit/CodeGenerator.cpp @@ +8246,5 @@ > + Operand base(Address(input, NativeObject::getFixedSlotOffset(slot))); > + if (mir->fallible()) { > + Operand type(Register::FromCode(base.base()), base.disp() + 4); > + masm.cmp32(type, Imm32(JSVAL_TYPE_TO_SHIFTED_TAG(JSVAL_TYPE_INT32) >> 32)); > + bailoutIf(Assembler::NotEqual, ins->snapshot()); I think it's a bit simpler here to do something like: Label bail; switch (mir->type()) { case MIRType_Int32: masm.branchTestInt32(Assembler::NotEqual, ..Address.., &bail); break; ... default: MOZ_CRASH(...); } masm.bailoutFrom(&bail, ins->snapshot()); Then you also don't need the Value.h changes :) @@ +8248,5 @@ > + Operand type(Register::FromCode(base.base()), base.disp() + 4); > + masm.cmp32(type, Imm32(JSVAL_TYPE_TO_SHIFTED_TAG(JSVAL_TYPE_INT32) >> 32)); > + bailoutIf(Assembler::NotEqual, ins->snapshot()); > + } > + masm.unboxInt32(base.toAddress(), result); If we handle other types too, this could be masm.loadUnboxedValue ::: js/src/jit/MIR.h @@ +9952,5 @@ > + size_t slot_; > + MUnbox::Mode mode_; > + protected: > + MLoadAndUnboxInt32(MDefinition* obj, size_t slot, MUnbox::Mode mode) > + : MUnaryInstruction(obj), slot_(slot), mode_(mode) Nit: indent this line with 6 instead of 8 spaces @@ +9955,5 @@ > + MLoadAndUnboxInt32(MDefinition* obj, size_t slot, MUnbox::Mode mode) > + : MUnaryInstruction(obj), slot_(slot), mode_(mode) > + { > + setResultType(MIRType_Int32); > + setMovable(); We need something similar to MUnbox to avoid eliminating this instruction: if (mode_ == TypeBarrier || mode_ == Fallible) setGuard();
Attachment #8639556 - Flags: feedback?(jdemooij) → feedback+
Attached patch v1.patch (obsolete) — Splinter Review
Assignee: nobody → j_schulte
Attachment #8639556 - Attachment is obsolete: true
Status: NEW → ASSIGNED
Attachment #8662130 - Flags: review?(jdemooij)
Attached patch v1.patch (obsolete) — Splinter Review
Fixed arm64 build-error.
Attachment #8662130 - Attachment is obsolete: true
Attachment #8662130 - Flags: review?(jdemooij)
Attachment #8664508 - Flags: review?(jdemooij)
Comment on attachment 8664508 [details] [diff] [review] v1.patch Review of attachment 8664508 [details] [diff] [review]: ----------------------------------------------------------------- Beautiful, thanks! ::: js/src/jit/CodeGenerator.cpp @@ +8398,5 @@ > + size_t slot = mir->slot(); > + // Out-of-line path to convert int32 to double or bailout > + // if this instruction is fallible. > + OutOfLineLoadAndUnboxFloatingPointFixedSlot* ool = > + new(alloc()) OutOfLineLoadAndUnboxFloatingPointFixedSlot(ins); Nit: this name is really long and since there's only one OOL path atm, I'd prefer just OutOfLineLoadFixedSlotAndUnbox. @@ +8412,5 @@ > + case MIRType_Boolean: > + masm.branchTestBoolean(Assembler::NotEqual, address, &bail); > + break; > + case MIRType_Double: > + case MIRType_Float32: As far as I know, Float32 can't actually happen here, because we don't track whether slots contain float32 values. Please remove this case and let me know if it fails any tests. @@ +8422,5 @@ > + bailoutFrom(&bail, ins->snapshot()); > + } > + masm.loadUnboxedValue(address, type, result); > + if (type == MIRType_Float32) > + masm.convertDoubleToFloat32(result.fpu(), result.fpu()); Same here. @@ +8442,5 @@ > + masm.branchTestInt32(Assembler::NotEqual, address, &bail); > + bailoutFrom(&bail, ins->snapshot()); > + } > + masm.unboxInt32(address, temp); > + masm.convertInt32ToFloatingPoint(temp, output, ins->mir()->type()); masm.convertInt32ToDouble ::: js/src/jit/MIR.cpp @@ +1551,5 @@ > + if (!input()->isLoadFixedSlot()) > + return this; > + MLoadFixedSlot* load = input()->toLoadFixedSlot(); > + if (load->type() != MIRType_Value || type() != MIRType_Int32 || type() != MIRType_Boolean || > + IsFloatingPointType(type())) The type() checks should use && instead of ||. I'd probably also use two if-statements because it seems a bit more readable than adding parentheses, and we can use IsNumberType instead of IsFloatingPointType to get rid of the Int32 test: if (load->type() != MIRType_Value) return this; if (type() != MIRType_Boolean && !IsNumberType(type())) return this;
Attachment #8664508 - Flags: review?(jdemooij) → review+
Comment on attachment 8664508 [details] [diff] [review] v1.patch Review of attachment 8664508 [details] [diff] [review]: ----------------------------------------------------------------- Changing to feedback+ because it'd be good to get another review after making the changes below. I can review those quickly. Also, would you mind testing how much this patch improves the benchmark below on x64? This is the one I mentioned in comment 0. https://raw.githubusercontent.com/h4writer/arewefastyet/master/benchmarks/misc/tests/assorted/misc-basic-closures.js ::: js/src/jit/CodeGenerator.cpp @@ +8420,5 @@ > + MOZ_CRASH("Given MIRType cannot be unboxed."); > + } > + bailoutFrom(&bail, ins->snapshot()); > + } > + masm.loadUnboxedValue(address, type, result); Hm I just realized loadUnboxedValue with a FloatRegister result also does the int32 vs double check. I think it'd be simplest to use masm.ensureDouble in the MIRType_Double case, and else use the switch + loadUnboxedValue.
Attachment #8664508 - Flags: review+ → feedback+
Attached patch v2.patch (obsolete) — Splinter Review
Yes, definitely nicer this way. x64: Without patch: 300 ms With patch: 90 ms No influence on x86.
Attachment #8664508 - Attachment is obsolete: true
Attachment #8675190 - Flags: review?(jdemooij)
Comment on attachment 8675190 [details] [diff] [review] v2.patch Review of attachment 8675190 [details] [diff] [review]: ----------------------------------------------------------------- Nice patch, but … ::: js/src/jit/MIR.cpp @@ +1558,5 @@ > + MLoadFixedSlotAndUnbox* ins = MLoadFixedSlotAndUnbox::New(alloc, load->object(), load->slot(), > + mode(), type(), bailoutKind()); > + // As GVN runs after the Alias Analysis, we have to set the dependency by hand > + ins->setDependency(load->dependency()); > + return ins; Unfortunately, this might be wrong because the data dependency might no longer hold at the unbox location. The problem here is related to the fact that you did not add the load instruction in any basic block yet, which means that GVN will add it next to the soon-to-be-removed unbox instruction. The unbox location is only restricted by the fact that the load is dominating it, which is not enough to ensure that we can load the value at the location of the unbox instruction. I simple test case to experiment with that would be something similar to: function g(o) { o.a = false; return true; } function f(o) { for (var i = 0; i < 20; i += inIon()) { var x = o.a; if (typeof x == "number") { if (g(o)) assertEq(x + 1, 2); // Expect MUnbox added by the "Value + Int" (but not converted into a MLoadFixedSlotAndUnbox) } else { assertEq(x, false); o.a = 1; } } } f({a:false}) We can probably make simpler test cases … but this is made to highlight 2 issues. The first one is that we don't want to introduce loads at the location of the Unbox instruction, as this does not respect the alias set rules. The second one is that we don't *always* want to add this LoadAndUnbox at the location of the load because we might bailout a lot. ::: js/src/jit/ScalarReplacement.cpp @@ +175,5 @@ > MDefinition* def = consumer->toDefinition(); > switch (def->op()) { > case MDefinition::Op_StoreFixedSlot: > case MDefinition::Op_LoadFixedSlot: > + case MDefinition::Op_LoadFixedSlotAndUnbox: This addition is both unnecessary and incomplete. It is unnecessary as Scalar Replacement is being executed before GVN, which produces these loads. It is incomplete as Scalar Replacement object memory view is now expecting to be able to visit this new instructions. I suggest to just remove this line as it is not necessary.
Comment on attachment 8675190 [details] [diff] [review] v2.patch Review of attachment 8675190 [details] [diff] [review]: ----------------------------------------------------------------- Looks good, thanks! r=me with review comments below and from Nicolas addressed. ::: js/src/jit/MIR.cpp @@ +1557,5 @@ > + > + MLoadFixedSlotAndUnbox* ins = MLoadFixedSlotAndUnbox::New(alloc, load->object(), load->slot(), > + mode(), type(), bailoutKind()); > + // As GVN runs after the Alias Analysis, we have to set the dependency by hand > + ins->setDependency(load->dependency()); To fix the issue Nicolas pointed out, maybe we can do this optimization only if `load` comes immediately before the unbox? Something like: // Only optimize if the load comes immediately before the unbox, so it's // safe to copy the load's dependency field. MInstructionIterator iter(load->block()->begin(load)); ++iter; if (*iter != this) return this; If you do that, does it still work for the microbenchmark? ::: js/src/jit/shared/LIR-shared.h @@ +5411,5 @@ > } > }; > > +class > +LLoadFixedSlotAndUnbox : public LInstructionHelper<1, 1, 0> Nit: this can be on one line.
Attachment #8675190 - Flags: review?(jdemooij) → review+
Attached patch v3.patchSplinter Review
Thanks Nicolas for taking a look! With the proposed changes, we still get the speedup, v2 got us, so I think, we're good to go. https://treeherder.mozilla.org/#/jobs?repo=try&revision=2221b763403b
Attachment #8675190 - Attachment is obsolete: true
Attachment #8688103 - Flags: review+
Keywords: checkin-needed
Status: ASSIGNED → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla45
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: