Closed
Bug 1147430
Opened 10 years ago
Closed 9 years ago
Optimize Load{FixedSlot,Slot,Element} followed by an Unbox
Categories
(Core :: JavaScript Engine: JIT, defect)
Core
JavaScript Engine: JIT
Tracking
()
RESOLVED
FIXED
mozilla45
Tracking | Status | |
---|---|---|
firefox45 | --- | fixed |
People
(Reporter: jandem, Assigned: jschulte)
References
Details
(Keywords: perf)
Attachments
(1 file, 4 obsolete files)
18.57 KB,
patch
|
jschulte
:
review+
|
Details | Diff | Splinter Review |
I was curious why the "misc-basic-closures" benchmark on AWFY is at least 3x slower on x64 than on x86.
One problem is that we do this on 32-bit to load an int32 upvar:
[LoadFixedSlotV]
movl 0x20(%eax), %ebp
movl 0x24(%eax), %esi
[Unbox:Int32]
cmpl $0xffffff81, %esi
jne .Lfrom270
But on 64-bit:
[LoadFixedSlotV]
movq 0x30(%rax), %rbx
[Unbox:Int32]
movq %rbx, %r11
shrq $47, %r11
cmpl $0x1fff1, %r11d
jne .Lfrom250
movl %ebx, %ebx
This is pretty inefficient; at least for int32/bool we should be able to combine these instructions and do something like:
cmpl $0x1fff1, 0x34(%rax)
jne <bail>
movl 0x30(%rax), %rbx
Note that this is also shorter than the x86 version, so we'd probably want this optimization there as well.
![]() |
||
Comment 1•10 years ago
|
||
Related to bug 787735?
Assignee | ||
Comment 2•10 years ago
|
||
For now I've only implemented int32 in a fixed slot to hear a first feedback from you.
Attachment #8639556 -
Flags: feedback?(jdemooij)
Reporter | ||
Comment 3•10 years ago
|
||
Comment on attachment 8639556 [details] [diff] [review]
WIP.patch
Review of attachment 8639556 [details] [diff] [review]:
-----------------------------------------------------------------
Sorry for the delay. Excited to see somebody working on this bug! Bug 1067484 should give us a framework to make it easier to do this kind of optimization, but in the meantime this seems fine.
A high-level comment: instead of LoadAndUnboxInt32, can we have a LoadFixedSlotAndUnbox that handles the other unboxed types too? It seems like it should be pretty easy :) We don't need a new field for this, we can just use setResultType(type) and mir->type() for this.
I like having "FixedSlot" in the name, so that we can later add MLoadSlotAndUnbox to match MLoadSlot. Eventually it might be nice to get rid of MLoadFixedSlot and fold it into MLoadSlot, but that can wait.
::: js/src/jit/CodeGenerator.cpp
@@ +8246,5 @@
> + Operand base(Address(input, NativeObject::getFixedSlotOffset(slot)));
> + if (mir->fallible()) {
> + Operand type(Register::FromCode(base.base()), base.disp() + 4);
> + masm.cmp32(type, Imm32(JSVAL_TYPE_TO_SHIFTED_TAG(JSVAL_TYPE_INT32) >> 32));
> + bailoutIf(Assembler::NotEqual, ins->snapshot());
I think it's a bit simpler here to do something like:
Label bail;
switch (mir->type()) {
case MIRType_Int32:
masm.branchTestInt32(Assembler::NotEqual, ..Address.., &bail);
break;
...
default:
MOZ_CRASH(...);
}
masm.bailoutFrom(&bail, ins->snapshot());
Then you also don't need the Value.h changes :)
@@ +8248,5 @@
> + Operand type(Register::FromCode(base.base()), base.disp() + 4);
> + masm.cmp32(type, Imm32(JSVAL_TYPE_TO_SHIFTED_TAG(JSVAL_TYPE_INT32) >> 32));
> + bailoutIf(Assembler::NotEqual, ins->snapshot());
> + }
> + masm.unboxInt32(base.toAddress(), result);
If we handle other types too, this could be masm.loadUnboxedValue
::: js/src/jit/MIR.h
@@ +9952,5 @@
> + size_t slot_;
> + MUnbox::Mode mode_;
> + protected:
> + MLoadAndUnboxInt32(MDefinition* obj, size_t slot, MUnbox::Mode mode)
> + : MUnaryInstruction(obj), slot_(slot), mode_(mode)
Nit: indent this line with 6 instead of 8 spaces
@@ +9955,5 @@
> + MLoadAndUnboxInt32(MDefinition* obj, size_t slot, MUnbox::Mode mode)
> + : MUnaryInstruction(obj), slot_(slot), mode_(mode)
> + {
> + setResultType(MIRType_Int32);
> + setMovable();
We need something similar to MUnbox to avoid eliminating this instruction:
if (mode_ == TypeBarrier || mode_ == Fallible)
setGuard();
Attachment #8639556 -
Flags: feedback?(jdemooij) → feedback+
Assignee | ||
Comment 4•9 years ago
|
||
Assignee: nobody → j_schulte
Attachment #8639556 -
Attachment is obsolete: true
Status: NEW → ASSIGNED
Attachment #8662130 -
Flags: review?(jdemooij)
Assignee | ||
Comment 5•9 years ago
|
||
Fixed arm64 build-error.
Attachment #8662130 -
Attachment is obsolete: true
Attachment #8662130 -
Flags: review?(jdemooij)
Attachment #8664508 -
Flags: review?(jdemooij)
Reporter | ||
Comment 6•9 years ago
|
||
Comment on attachment 8664508 [details] [diff] [review]
v1.patch
Review of attachment 8664508 [details] [diff] [review]:
-----------------------------------------------------------------
Beautiful, thanks!
::: js/src/jit/CodeGenerator.cpp
@@ +8398,5 @@
> + size_t slot = mir->slot();
> + // Out-of-line path to convert int32 to double or bailout
> + // if this instruction is fallible.
> + OutOfLineLoadAndUnboxFloatingPointFixedSlot* ool =
> + new(alloc()) OutOfLineLoadAndUnboxFloatingPointFixedSlot(ins);
Nit: this name is really long and since there's only one OOL path atm, I'd prefer just OutOfLineLoadFixedSlotAndUnbox.
@@ +8412,5 @@
> + case MIRType_Boolean:
> + masm.branchTestBoolean(Assembler::NotEqual, address, &bail);
> + break;
> + case MIRType_Double:
> + case MIRType_Float32:
As far as I know, Float32 can't actually happen here, because we don't track whether slots contain float32 values. Please remove this case and let me know if it fails any tests.
@@ +8422,5 @@
> + bailoutFrom(&bail, ins->snapshot());
> + }
> + masm.loadUnboxedValue(address, type, result);
> + if (type == MIRType_Float32)
> + masm.convertDoubleToFloat32(result.fpu(), result.fpu());
Same here.
@@ +8442,5 @@
> + masm.branchTestInt32(Assembler::NotEqual, address, &bail);
> + bailoutFrom(&bail, ins->snapshot());
> + }
> + masm.unboxInt32(address, temp);
> + masm.convertInt32ToFloatingPoint(temp, output, ins->mir()->type());
masm.convertInt32ToDouble
::: js/src/jit/MIR.cpp
@@ +1551,5 @@
> + if (!input()->isLoadFixedSlot())
> + return this;
> + MLoadFixedSlot* load = input()->toLoadFixedSlot();
> + if (load->type() != MIRType_Value || type() != MIRType_Int32 || type() != MIRType_Boolean ||
> + IsFloatingPointType(type()))
The type() checks should use && instead of ||. I'd probably also use two if-statements because it seems a bit more readable than adding parentheses, and we can use IsNumberType instead of IsFloatingPointType to get rid of the Int32 test:
if (load->type() != MIRType_Value)
return this;
if (type() != MIRType_Boolean && !IsNumberType(type()))
return this;
Attachment #8664508 -
Flags: review?(jdemooij) → review+
Reporter | ||
Comment 7•9 years ago
|
||
Comment on attachment 8664508 [details] [diff] [review]
v1.patch
Review of attachment 8664508 [details] [diff] [review]:
-----------------------------------------------------------------
Changing to feedback+ because it'd be good to get another review after making the changes below. I can review those quickly.
Also, would you mind testing how much this patch improves the benchmark below on x64? This is the one I mentioned in comment 0.
https://raw.githubusercontent.com/h4writer/arewefastyet/master/benchmarks/misc/tests/assorted/misc-basic-closures.js
::: js/src/jit/CodeGenerator.cpp
@@ +8420,5 @@
> + MOZ_CRASH("Given MIRType cannot be unboxed.");
> + }
> + bailoutFrom(&bail, ins->snapshot());
> + }
> + masm.loadUnboxedValue(address, type, result);
Hm I just realized loadUnboxedValue with a FloatRegister result also does the int32 vs double check. I think it'd be simplest to use masm.ensureDouble in the MIRType_Double case, and else use the switch + loadUnboxedValue.
Attachment #8664508 -
Flags: review+ → feedback+
Assignee | ||
Comment 8•9 years ago
|
||
Yes, definitely nicer this way.
x64:
Without patch: 300 ms
With patch: 90 ms
No influence on x86.
Attachment #8664508 -
Attachment is obsolete: true
Attachment #8675190 -
Flags: review?(jdemooij)
Comment 9•9 years ago
|
||
Comment on attachment 8675190 [details] [diff] [review]
v2.patch
Review of attachment 8675190 [details] [diff] [review]:
-----------------------------------------------------------------
Nice patch, but …
::: js/src/jit/MIR.cpp
@@ +1558,5 @@
> + MLoadFixedSlotAndUnbox* ins = MLoadFixedSlotAndUnbox::New(alloc, load->object(), load->slot(),
> + mode(), type(), bailoutKind());
> + // As GVN runs after the Alias Analysis, we have to set the dependency by hand
> + ins->setDependency(load->dependency());
> + return ins;
Unfortunately, this might be wrong because the data dependency might no longer hold at the unbox location.
The problem here is related to the fact that you did not add the load instruction in any basic block yet, which means that GVN will add it next to the soon-to-be-removed unbox instruction. The unbox location is only restricted by the fact that the load is dominating it, which is not enough to ensure that we can load the value at the location of the unbox instruction.
I simple test case to experiment with that would be something similar to:
function g(o) {
o.a = false;
return true;
}
function f(o) {
for (var i = 0; i < 20; i += inIon()) {
var x = o.a;
if (typeof x == "number") {
if (g(o))
assertEq(x + 1, 2); // Expect MUnbox added by the "Value + Int" (but not converted into a MLoadFixedSlotAndUnbox)
} else {
assertEq(x, false);
o.a = 1;
}
}
}
f({a:false})
We can probably make simpler test cases … but this is made to highlight 2 issues. The first one is that we don't want to introduce loads at the location of the Unbox instruction, as this does not respect the alias set rules. The second one is that we don't *always* want to add this LoadAndUnbox at the location of the load because we might bailout a lot.
::: js/src/jit/ScalarReplacement.cpp
@@ +175,5 @@
> MDefinition* def = consumer->toDefinition();
> switch (def->op()) {
> case MDefinition::Op_StoreFixedSlot:
> case MDefinition::Op_LoadFixedSlot:
> + case MDefinition::Op_LoadFixedSlotAndUnbox:
This addition is both unnecessary and incomplete.
It is unnecessary as Scalar Replacement is being executed before GVN, which produces these loads.
It is incomplete as Scalar Replacement object memory view is now expecting to be able to visit this new instructions.
I suggest to just remove this line as it is not necessary.
Reporter | ||
Comment 10•9 years ago
|
||
Comment on attachment 8675190 [details] [diff] [review]
v2.patch
Review of attachment 8675190 [details] [diff] [review]:
-----------------------------------------------------------------
Looks good, thanks! r=me with review comments below and from Nicolas addressed.
::: js/src/jit/MIR.cpp
@@ +1557,5 @@
> +
> + MLoadFixedSlotAndUnbox* ins = MLoadFixedSlotAndUnbox::New(alloc, load->object(), load->slot(),
> + mode(), type(), bailoutKind());
> + // As GVN runs after the Alias Analysis, we have to set the dependency by hand
> + ins->setDependency(load->dependency());
To fix the issue Nicolas pointed out, maybe we can do this optimization only if `load` comes immediately before the unbox?
Something like:
// Only optimize if the load comes immediately before the unbox, so it's
// safe to copy the load's dependency field.
MInstructionIterator iter(load->block()->begin(load));
++iter;
if (*iter != this)
return this;
If you do that, does it still work for the microbenchmark?
::: js/src/jit/shared/LIR-shared.h
@@ +5411,5 @@
> }
> };
>
> +class
> +LLoadFixedSlotAndUnbox : public LInstructionHelper<1, 1, 0>
Nit: this can be on one line.
Attachment #8675190 -
Flags: review?(jdemooij) → review+
Assignee | ||
Comment 11•9 years ago
|
||
Thanks Nicolas for taking a look!
With the proposed changes, we still get the speedup, v2 got us, so I think, we're good to go.
https://treeherder.mozilla.org/#/jobs?repo=try&revision=2221b763403b
Attachment #8675190 -
Attachment is obsolete: true
Attachment #8688103 -
Flags: review+
Assignee | ||
Updated•9 years ago
|
Keywords: checkin-needed
Comment 12•9 years ago
|
||
Keywords: checkin-needed
Comment 13•9 years ago
|
||
bugherder |
Status: ASSIGNED → RESOLVED
Closed: 9 years ago
status-firefox45:
--- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla45
You need to log in
before you can comment on or make changes to this bug.
Description
•