Closed Bug 1289054 Opened 8 years ago Closed 8 years ago

Implement 64bit integer operations on arm

Tracking

()

Status:

RESOLVED FIXED

Milestone:

mozilla50

Tracking Flags:

Tracking

Status

firefox50

---

fixed

People

(Reporter: h4writer, Assigned: h4writer)

References

Details

Attachments

(25 files, 1 obsolete file)

Part 1: Make ion ready for int64 on arm 8 years ago Hannes Verschore [:h4writer] 9.63 KB, patch	sstangl : review+	Details \| Diff \| Splinter Review
Part 2: implement LAsmJSParameterI64 8 years ago Hannes Verschore [:h4writer] 1.36 KB, patch	sstangl : review+	Details \| Diff \| Splinter Review
Part 3: Implement LWrapInt64ToInt32 8 years ago Hannes Verschore [:h4writer] 1.76 KB, patch	bbouvier : review+	Details \| Diff \| Splinter Review
Part 4: Implement LExtendInt32ToInt64 8 years ago Hannes Verschore [:h4writer] 2.55 KB, patch	bbouvier : review+	Details \| Diff \| Splinter Review
Part 5: Implement LAddI64 and LSubI64 8 years ago Hannes Verschore [:h4writer] 12.25 KB, patch	lth : review+	Details \| Diff \| Splinter Review
Part 6: Implement LMulI64 8 years ago Hannes Verschore [:h4writer] 12.42 KB, patch	nbp : review+	Details \| Diff \| Splinter Review
Part 7: Implement L(U)DivOrModI64 8 years ago Hannes Verschore [:h4writer] 18.19 KB, patch	nbp : review+	Details \| Diff \| Splinter Review
Part 8: Implement LCompareI64 8 years ago Hannes Verschore [:h4writer] 27.47 KB, patch	nbp : review+	Details \| Diff \| Splinter Review
Part 9: Implement LShiftI64 8 years ago Hannes Verschore [:h4writer] 10.25 KB, patch	bbouvier : review+	Details \| Diff \| Splinter Review
Part 10: Implement LBitOpI64 8 years ago Hannes Verschore [:h4writer] 4.24 KB, patch	bbouvier : review+	Details \| Diff \| Splinter Review
Part 11: Implement LRotateI64 8 years ago Hannes Verschore [:h4writer] 11.32 KB, patch	luke : review+	Details \| Diff \| Splinter Review
Part 12: Implement LAsmJSPassStackArgI64 8 years ago Hannes Verschore [:h4writer] 3.24 KB, patch	luke : review+	Details \| Diff \| Splinter Review
Part 13: Implement LAsmSelectI64 8 years ago Hannes Verschore [:h4writer] 3.52 KB, patch	bbouvier : review+	Details \| Diff \| Splinter Review
Part 14: Implement LAsmReinterpret(From\|To)I64 8 years ago Hannes Verschore [:h4writer] 2.54 KB, patch	bbouvier : review+	Details \| Diff \| Splinter Review
Part 15: Implement LPopcntI64 8 years ago Hannes Verschore [:h4writer] 6.29 KB, patch	lth : review+	Details \| Diff \| Splinter Review
Part 16: Implement LClzI64 and LCtzI64 8 years ago Hannes Verschore [:h4writer] 6.34 KB, patch	bbouvier : review+	Details \| Diff \| Splinter Review
Part 17: Implement LNotI64 8 years ago Hannes Verschore [:h4writer] 2.78 KB, patch	bbouvier : review+	Details \| Diff \| Splinter Review
Part 18: Implement LWasmTruncateToInt64 8 years ago Hannes Verschore [:h4writer] 17.55 KB, patch	bbouvier : review+	Details \| Diff \| Splinter Review
Part 19: Implement LInt64ToFloatingPoint 8 years ago Hannes Verschore [:h4writer] 10.86 KB, patch	bbouvier : review+	Details \| Diff \| Splinter Review
Part 20: Implement LTestI64AndBranch 8 years ago Hannes Verschore [:h4writer] 1.89 KB, patch	nbp : review+	Details \| Diff \| Splinter Review
Part 21: Remove some arm code 8 years ago Hannes Verschore [:h4writer] 1.70 KB, patch	bbouvier : review+	Details \| Diff \| Splinter Review
Part 22: Enable 64bit operations on arm 8 years ago Hannes Verschore [:h4writer] 9.96 KB, patch	luke : review+	Details \| Diff \| Splinter Review
Part 23: Implement LWasmLoadGlobalVarI64 and LWasmStoreGlobalVarI64 8 years ago Hannes Verschore [:h4writer] 5.39 KB, patch	bbouvier : review+	Details \| Diff \| Splinter Review
Part 24: Implement LWasmLoadI64 and LWasmStoreI64 8 years ago Hannes Verschore [:h4writer] 9.99 KB, patch	bbouvier : review+	Details \| Diff \| Splinter Review
Part 25: Temporarily don't use defineInt64ReuseInput 8 years ago Hannes Verschore [:h4writer] 41.44 KB, patch	bbouvier : review+	Details \| Diff \| Splinter Review
Part 25: Temporarily don't use defineInt64ReuseInput 8 years ago Hannes Verschore [:h4writer] 40.47 KB, patch	bbouvier : review+	Details \| Diff \| Splinter Review

Hannes Verschore [:h4writer]

Assignee

Description

•

8 years ago

Same as bug 1279248, but now for arm.

Hannes Verschore [:h4writer]

Assignee

Comment 1

•

8 years ago

Attached patch Part 1: Make ion ready for int64 on arm — Details — Splinter Review

Assignee: nobody → hv1989

Hannes Verschore [:h4writer]

Assignee

Comment 2

•

8 years ago

Attached patch Part 2: implement LAsmJSParameterI64 — Details — Splinter Review

Hannes Verschore [:h4writer]

Assignee

Comment 3

•

8 years ago

Attached patch Part 3: Implement LWrapInt64ToInt32 — Details — Splinter Review

Hannes Verschore [:h4writer]

Assignee

Comment 4

•

8 years ago

Attached patch Part 4: Implement LExtendInt32ToInt64 — Details — Splinter Review

Hannes Verschore [:h4writer]

Assignee

Comment 5

•

8 years ago

Attached patch Part 5: Implement LAddI64 and LSubI64 — Details — Splinter Review

Hannes Verschore [:h4writer]

Assignee

Comment 6

•

8 years ago

Attached patch Part 6: Implement LMulI64 — Details — Splinter Review

Hannes Verschore [:h4writer]

Assignee

Comment 7

•

8 years ago

Attached patch Part 7: Implement L(U)DivOrModI64 — Details — Splinter Review

Hannes Verschore [:h4writer]

Assignee

Comment 8

•

8 years ago

Attached patch Part 8: Implement LCompareI64 — Details — Splinter Review

Hannes Verschore [:h4writer]

Assignee

Comment 9

•

8 years ago

Attached patch Part 9: Implement LShiftI64 — Details — Splinter Review

Hannes Verschore [:h4writer]

Assignee

Comment 10

•

8 years ago

Attached patch Part 10: Implement LBitOpI64 — Details — Splinter Review

Hannes Verschore [:h4writer]

Assignee

Comment 11

•

8 years ago

Attached patch Part 11: Implement LRotateI64 — Details — Splinter Review

Hannes Verschore [:h4writer]

Assignee

Comment 12

•

8 years ago

Attached patch Part 12: Implement LAsmJSPassStackArgI64 — Details — Splinter Review

Hannes Verschore [:h4writer]

Assignee

Comment 13

•

8 years ago

Attached patch Part 13: Implement LAsmSelectI64 — Details — Splinter Review

Hannes Verschore [:h4writer]

Assignee

Comment 14

•

8 years ago

Attached patch Part 14: Implement LAsmReinterpret(From|To)I64 — Details — Splinter Review

Hannes Verschore [:h4writer]

Assignee

Comment 15

•

8 years ago

Attached patch Part 15: Implement LPopcntI64 — Details — Splinter Review

Hannes Verschore [:h4writer]

Assignee

Comment 16

•

8 years ago

Attached patch Part 16: Implement LClzI64 and LCtzI64 — Details — Splinter Review

Hannes Verschore [:h4writer]

Assignee

Comment 17

•

8 years ago

Attached patch Part 17: Implement LNotI64 — Details — Splinter Review

Hannes Verschore [:h4writer]

Assignee

Comment 18

•

8 years ago

Attached patch Part 18: Implement LWasmTruncateToInt64 — Details — Splinter Review

Hannes Verschore [:h4writer]

Assignee

Comment 19

•

8 years ago

Attached patch Part 19: Implement LInt64ToFloatingPoint — Details — Splinter Review

Hannes Verschore [:h4writer]

Assignee

Comment 20

•

8 years ago

Attached patch Part 20: Implement LTestI64AndBranch — Details — Splinter Review

Hannes Verschore [:h4writer]

Assignee

Comment 21

•

8 years ago

Attached patch Part 21: Remove some arm code — Details — Splinter Review

Hannes Verschore [:h4writer]

Assignee

Comment 22

•

8 years ago

Attached patch Part 22: Enable 64bit operations on arm — Details — Splinter Review

Hannes Verschore [:h4writer]

Assignee

Updated

•

8 years ago

Attachment #8774307 - Flags: review?(sstangl)

Hannes Verschore [:h4writer]

Assignee

Updated

•

8 years ago

Attachment #8774309 - Flags: review?(sstangl)

Hannes Verschore [:h4writer]

Assignee

Updated

•

8 years ago

Attachment #8774311 - Flags: review?(efaustbmo)

Hannes Verschore [:h4writer]

Assignee

Updated

•

8 years ago

Attachment #8774312 - Flags: review?(efaustbmo)

Hannes Verschore [:h4writer]

Assignee

Updated

•

8 years ago

Attachment #8774313 - Flags: review?(lhansen)

Hannes Verschore [:h4writer]

Assignee

Updated

•

8 years ago

Attachment #8774315 - Flags: review?(nicolas.b.pierron)

Hannes Verschore [:h4writer]

Assignee

Updated

•

8 years ago

Attachment #8774316 - Flags: review?(nicolas.b.pierron)

Hannes Verschore [:h4writer]

Assignee

Updated

•

8 years ago

Attachment #8774317 - Flags: review?(nicolas.b.pierron)

Hannes Verschore [:h4writer]

Assignee

Updated

•

8 years ago

Attachment #8774318 - Flags: review?(bbouvier)

Hannes Verschore [:h4writer]

Assignee

Updated

•

8 years ago

Attachment #8774319 - Flags: review?(bbouvier)

Hannes Verschore [:h4writer]

Assignee

Updated

•

8 years ago

Attachment #8774320 - Flags: review?(luke)

Hannes Verschore [:h4writer]

Assignee

Updated

•

8 years ago

Attachment #8774321 - Flags: review?(luke)

Hannes Verschore [:h4writer]

Assignee

Updated

•

8 years ago

Attachment #8774322 - Flags: review?(efaustbmo)

Hannes Verschore [:h4writer]

Assignee

Updated

•

8 years ago

Attachment #8774323 - Flags: review?(efaustbmo)

Hannes Verschore [:h4writer]

Assignee

Updated

•

8 years ago

Attachment #8774324 - Flags: review?(lhansen)

Hannes Verschore [:h4writer]

Assignee

Updated

•

8 years ago

Attachment #8774325 - Flags: review?(bbouvier)

Hannes Verschore [:h4writer]

Assignee

Updated

•

8 years ago

Attachment #8774326 - Flags: review?(bbouvier)

Hannes Verschore [:h4writer]

Assignee

Updated

•

8 years ago

Attachment #8774327 - Flags: review?(sunfish)

Hannes Verschore [:h4writer]

Assignee

Updated

•

8 years ago

Attachment #8774328 - Flags: review?(sunfish)

Hannes Verschore [:h4writer]

Assignee

Updated

•

8 years ago

Attachment #8774329 - Flags: review?(nicolas.b.pierron)

Hannes Verschore [:h4writer]

Assignee

Updated

•

8 years ago

Attachment #8774331 - Flags: review?(sstangl)

Hannes Verschore [:h4writer]

Assignee

Updated

•

8 years ago

Attachment #8774332 - Flags: review?(luke)

Hannes Verschore [:h4writer]

Assignee

Comment 23

•

8 years ago

As a general remark on the patches. Often the codegen is a direct copy from the shared/x86 variant. I would have loved to put this in the global codegen, but the specific xxx64 are not implemented on all macroassemblers (arm64/mips). As soon as that is done we can unify them again. (Some examples are mul64/add46/sub64/...)

Hannes Verschore [:h4writer]

Assignee

Updated

•

8 years ago

Priority: -- → P1

Hannes Verschore [:h4writer]

Assignee

Updated

•

8 years ago

Blocks: 1246648

Luke Wagner [:luke]

Updated

•

8 years ago

Attachment #8774321 - Flags: review?(luke) → review+

Luke Wagner [:luke]

Comment 24

•

8 years ago

Comment on attachment 8774320 [details] [diff] [review] Part 11: Implement LRotateI64 Review of attachment 8774320 [details] [diff] [review]: ----------------------------------------------------------------- ::: js/src/jit/arm/CodeGenerator-arm.cpp @@ +3343,5 @@ > + > + if (count->isConstant()) { > + int32_t c = int32_t(count->toConstant()->toInt64() & 0x3F); > + if (!c) > + return; Don't you still need to call the masm.rotate* so that the input gets copied to the output? ::: js/src/jit/arm/MacroAssembler-arm-inl.h @@ +617,5 @@ > + ma_mov(dest.high, scratch); > + as_mov(dest.high, lsl(dest.high, amount)); > + as_orr(dest.high, dest.high, lsl(dest.low, amount - 32)); > + as_mov(dest.low, lsl(dest.low, amount)); > + as_orr(dest.low, dest.low, lsl(scratch, amount - 32)); From irc: I think this case has a bug but you could fix it and reuse code by calling rotateRight64(32 - (amount & 0x1f)). Could you also add a test case with an immediate that hits this case? @@ +705,5 @@ > + as_orr(dest.high, dest.high, lsl(dest.low, 32 - amount)); > + as_mov(dest.low, lsr(dest.low, amount)); > + as_orr(dest.low, dest.low, lsl(scratch, 32 - amount)); > + } else { > + ma_mov(dest.high, scratch); Ditto comment above.

Attachment #8774320 - Flags: review?(luke) → review+

Luke Wagner [:luke]

Comment 25

•

8 years ago

Comment on attachment 8774332 [details] [diff] [review] Part 22: Enable 64bit operations on arm Review of attachment 8774332 [details] [diff] [review]: ----------------------------------------------------------------- ::: js/src/asmjs/Wasm.cpp @@ +68,1 @@ > return true; \o/ thanks so much for all this hard work! ::: js/src/asmjs/WasmIonCompile.cpp @@ +805,5 @@ > return true; > > ABIArg arg = args->abi_.next(ToMIRType(type)); > +#ifdef JS_CODEGEN_REGISTER_PAIR > + if (arg.kind() == ABIArg::GPR_PAIR) { Now that there are a few cases, could you express this control flow as a `switch (arg.kind())`? ::: js/src/jit/x64/CodeGenerator-x64.cpp @@ +1022,5 @@ > > + if (lir->mir()->bottomHalf()) > + masm.movl(ToOperand(input), output); > + else > + MOZ_CRASH("Not implemented."); Maybe "Never used."?

Attachment #8774332 - Flags: review?(luke) → review+

Hannes Verschore [:h4writer]

Assignee

Comment 26

•

8 years ago

Lars T Hansen [:lth]

Comment 27

•

8 years ago

Comment on attachment 8774313 [details] [diff] [review] Part 5: Implement LAddI64 and LSubI64 Review of attachment 8774313 [details] [diff] [review]: ----------------------------------------------------------------- I'm assuming the code cut from CodeGenerator-x86 moved to some other patch and that those changes are spurious here?

Attachment #8774313 - Flags: review?(lhansen) → review+

Lars T Hansen [:lth]

Comment 28

•

8 years ago

Comment on attachment 8774324 [details] [diff] [review] Part 15: Implement LPopcntI64 Review of attachment 8774324 [details] [diff] [review]: ----------------------------------------------------------------- ::: js/src/jit/arm/MacroAssembler-arm-inl.h @@ +204,5 @@ > +{ > + MOZ_ASSERT(dest.low != tmp); > + MOZ_ASSERT(dest.high != tmp); > + MOZ_ASSERT(dest.low != dest.high); > + if (dest.low != src.high) { I think there could usefully be a comment here that we can have src and dest overlapping, but partially or in reverse order, to motivate the logic.

Attachment #8774324 - Flags: review?(lhansen) → review+

Hannes Verschore [:h4writer]

Assignee

Comment 29

•

8 years ago

(In reply to Lars T Hansen [:lth] from comment #27) > Comment on attachment 8774313 [details] [diff] [review] > Part 5: Implement LAddI64 and LSubI64 > > Review of attachment 8774313 [details] [diff] [review]: > ----------------------------------------------------------------- > > I'm assuming the code cut from CodeGenerator-x86 moved to some other patch > and that those changes are spurious here? Oh yes. The code removal of Codegen-x86 shouldn't be here. (I first moved it to Codegenerator.cpp which didn't work yet since we still have 2/3 architectures not implemented yet. (arm64 and mips). I forgot to remove the code removal from Codegenerator-x86.cpp)

Nicolas B. Pierron [:nbp | PTO until 11/11]

Comment 30

•

8 years ago

Comment on attachment 8774315 [details] [diff] [review] Part 6: Implement LMulI64 Review of attachment 8774315 [details] [diff] [review]: ----------------------------------------------------------------- Fix the nits and r=me. ::: js/src/jit/arm/Lowering-arm.cpp @@ +209,5 @@ > + int64_t constant = rhs->toConstant()->toInt64(); > + int32_t shift = mozilla::FloorLog2(constant); > + if (constant <= 0 || int64_t(1) << shift != constant) { > + constantNeedTemp = constant != -1 && constant != 0 && > + constant != 1 && constant != 2; nit: 1/ constant == 1, shift = 0, (constant <= 0) == false, (int64_t(1) << shift != constant) == false => constantNeedTemp = true 2/ constant == 2, shift = 1, (constant <= 0) == false, (int64_t(1) << shift != constant) == false => constantNeedTemp = true 3/ constant == 4, shift = 2, (constant <= 0) == false, (int64_t(1) << shift != constant) == false => constantNeedTemp = true I suggest to change this code to: // See special cases in CodeGeneratorARM::visitMulI64 if (-1 <= constant && constant <= 2) constantNeedTemp = false; if (int64_t(1) << shift == constant) constantNeedTemp = false; ::: js/src/jit/arm/MacroAssembler-arm-inl.h @@ +377,5 @@ > + MOZ_ASSERT(dest != src); > + MOZ_ASSERT(dest.low != src.high && dest.high != src.low); > + > + // Compute mul64 > + ma_mul(src.low, dest.high, dest.high); // (2) nit: swap src.low and dest.high first arguments to avoid ARMv6 encoding limitation. @@ +380,5 @@ > + // Compute mul64 > + ma_mul(src.low, dest.high, dest.high); // (2) > + ma_mul(src.high, dest.low, temp); // (3) > + ma_add(dest.high, temp, temp); > + ma_umull(src.low, dest.low, dest.high, dest.low); // (4) + (1) nit: swap src.low and dest.low first arguments to avoid the ARMv6 encoding limitation. ::: js/src/jit/arm/MacroAssembler-arm.cpp @@ +910,5 @@ > + > +void > +MacroAssemblerARM::ma_umull(Register src1, Register src2, Register destHigh, Register destLow) > +{ > + as_umull(destHigh, destLow, src1, src2); nit: as_umul has encoding limitations which are not yet translated as assertions in our code base. Namely: destHigh != pc, destLow != pc, src1 != pc, src2 != pc, destHigh != destLow, destHigh != src2 && destLow != src2 (ARMv6) The same issue also exists with ma_mul.

Attachment #8774315 - Flags: review?(nicolas.b.pierron) → review+

Benjamin Bouvier [:bbouvier] (inactive)

Comment 31

•

8 years ago

Comment on attachment 8774331 [details] [diff] [review] Part 21: Remove some arm code Review of attachment 8774331 [details] [diff] [review]: ----------------------------------------------------------------- Stealing, trivial. Thanks.

Attachment #8774331 - Flags: review?(sstangl) → review+

Benjamin Bouvier [:bbouvier] (inactive)

Comment 32

•

8 years ago

Comment on attachment 8774319 [details] [diff] [review] Part 10: Implement LBitOpI64 Review of attachment 8774319 [details] [diff] [review]: ----------------------------------------------------------------- Nice, cheers. ::: js/src/jit/arm/MacroAssembler-arm-inl.h @@ +81,5 @@ > void > MacroAssembler::and64(Imm64 imm, Register64 dest) > { > + if (imm.low().value) > + and32(imm.low(), dest.low); It seems that if !imm.low().value, then dest.low should be 0. (and right below too) That works for `or`/`xor`, but not for `and`. Can you add a test for it, please?

Attachment #8774319 - Flags: review?(bbouvier) → review+

Nicolas B. Pierron [:nbp | PTO until 11/11]

Comment 33

•

8 years ago

Comment on attachment 8774316 [details] [diff] [review] Part 7: Implement L(U)DivOrModI64 Review of attachment 8774316 [details] [diff] [review]: ----------------------------------------------------------------- ::: js/src/jit/arm/Assembler-arm.cpp @@ +978,5 @@ > return Condition(ConditionInversionBit ^ cond); > } > > +Assembler::Condition > +Assembler::SignedCondition(Condition cond) nit: This function does not seems to be used in this patch, remove it if it is not used in any other patch either. ::: js/src/jit/arm/CodeGenerator-arm.cpp @@ +3148,5 @@ > + regs.take(lhs.low); > + regs.take(lhs.high); > + regs.take(rhs.low); > + regs.take(rhs.high); > + Register temp = regs.takeAny(); I would highly prefer to request a temp() in the lowering than using an AllocatableGeneralRegisterSet here. @@ +3201,5 @@ > + regs.take(lhs.low); > + regs.take(lhs.high); > + regs.take(rhs.low); > + regs.take(rhs.high); > + Register temp = regs.takeAny(); same here. ::: js/src/jit/arm/LIR-arm.h @@ +169,5 @@ > + if (mir_->isMod()) > + return mir_->toMod()->canBeDivideByZero(); > + return mir_->toDiv()->canBeDivideByZero(); > + } > + bool canBeNegativeOverflow() const { nit: Greedy copy&pasta issue? Does it make sense to check for negative-overflow (INT64_MIN / -1) for unsigned operations?

Attachment #8774316 - Flags: review?(nicolas.b.pierron)

Hannes Verschore [:h4writer]

Assignee

Comment 34

•

8 years ago

(In reply to Nicolas B. Pierron [:nbp] from comment #33) > Comment on attachment 8774316 [details] [diff] [review] > Part 7: Implement L(U)DivOrModI64 > > Review of attachment 8774316 [details] [diff] [review]: > ----------------------------------------------------------------- > > ::: js/src/jit/arm/Assembler-arm.cpp > @@ +978,5 @@ > > return Condition(ConditionInversionBit ^ cond); > > } > > > > +Assembler::Condition > > +Assembler::SignedCondition(Condition cond) > > nit: This function does not seems to be used in this patch, remove it if it > is not used in any other patch either. It is used in branch64 > > ::: js/src/jit/arm/CodeGenerator-arm.cpp > @@ +3148,5 @@ > > + regs.take(lhs.low); > > + regs.take(lhs.high); > > + regs.take(rhs.low); > > + regs.take(rhs.high); > > + Register temp = regs.takeAny(); > > I would highly prefer to request a temp() in the lowering than using an > AllocatableGeneralRegisterSet here. Since this is a LCallInstruction all inputs are usedAtStart. Using a temp() through LIR could give us one of the inputs and that will clobber the result. I could use ScratchRegister though if you prefer that. > @@ +3201,5 @@ > > + regs.take(lhs.low); > > + regs.take(lhs.high); > > + regs.take(rhs.low); > > + regs.take(rhs.high); > > + Register temp = regs.takeAny(); > > same here. > > ::: js/src/jit/arm/LIR-arm.h > @@ +169,5 @@ > > + if (mir_->isMod()) > > + return mir_->toMod()->canBeDivideByZero(); > > + return mir_->toDiv()->canBeDivideByZero(); > > + } > > + bool canBeNegativeOverflow() const { > > nit: Greedy copy&pasta issue? Does it make sense to check for > negative-overflow (INT64_MIN / -1) for unsigned operations? Could be I was a bit greedy here. Gonna upload a new version tomorrow with this fixed.

Nicolas B. Pierron [:nbp | PTO until 11/11]

Comment 35

•

8 years ago

Comment on attachment 8774317 [details] [diff] [review] Part 8: Implement LCompareI64 Review of attachment 8774317 [details] [diff] [review]: ----------------------------------------------------------------- Nice!

Attachment #8774317 - Flags: review?(nicolas.b.pierron) → review+

Nicolas B. Pierron [:nbp | PTO until 11/11]

Updated

•

8 years ago

Attachment #8774329 - Flags: review?(nicolas.b.pierron) → review+

Nicolas B. Pierron [:nbp | PTO until 11/11]

Comment 36

•

8 years ago

(In reply to Hannes Verschore [:h4writer] from comment #34) > (In reply to Nicolas B. Pierron [:nbp] from comment #33) > > Comment on attachment 8774316 [details] [diff] [review] > > Part 7: Implement L(U)DivOrModI64 > > > > Review of attachment 8774316 [details] [diff] [review]: > > ----------------------------------------------------------------- > > > > ::: js/src/jit/arm/Assembler-arm.cpp > > @@ +978,5 @@ > > > return Condition(ConditionInversionBit ^ cond); > > > } > > > > > > +Assembler::Condition > > > +Assembler::SignedCondition(Condition cond) > > > > nit: This function does not seems to be used in this patch, remove it if it > > is not used in any other patch either. > > It is used in branch64 UnsignedCondition is used in branch64, but not SignedCondition. > > > > ::: js/src/jit/arm/CodeGenerator-arm.cpp > > @@ +3148,5 @@ > > > + regs.take(lhs.low); > > > + regs.take(lhs.high); > > > + regs.take(rhs.low); > > > + regs.take(rhs.high); > > > + Register temp = regs.takeAny(); > > > > I would highly prefer to request a temp() in the lowering than using an > > AllocatableGeneralRegisterSet here. > > Since this is a LCallInstruction all inputs are usedAtStart. Using a temp() > through LIR could give us one of the inputs and that will clobber the > result. I could use ScratchRegister though if you prefer that. Ok, then mention that in the comment with the AllocatableGeneralRegisterSet, and r=me. The ScratchRegister is supposed to be a magical wand of the MacroAssembler.

Nicolas B. Pierron [:nbp | PTO until 11/11]

Comment 37

•

8 years ago

Attachment #8774316 - Flags: review+

Benjamin Bouvier [:bbouvier] (inactive)

Comment 38

•

8 years ago

Comment on attachment 8774318 [details] [diff] [review] Part 9: Implement LShiftI64 Review of attachment 8774318 [details] [diff] [review]: ----------------------------------------------------------------- Wow, this is tricky. Thanks. ::: js/src/jit/arm/CodeGenerator-arm.cpp @@ +3287,5 @@ > } > } > + > +void > +CodeGeneratorARM::visitShiftI64(LShiftI64* lir) Could this be hoisted to CodeGenerator, with dummy NYI impls for arm64 and others? ::: js/src/jit/arm/MacroAssembler-arm-inl.h @@ +466,5 @@ > + as_orr(dest.high, dest.high, lsr(dest.low, 32 - imm.value)); > + as_mov(dest.low, lsl(dest.low, imm.value)); > + } else { > + ma_lsl(Imm32(imm.value - 32), dest.low, dest.high); > + ma_mov(Imm32(0), dest.low); For consistency with the above (for the same result; ma_lsl ends up doing this), can you use: as_mov(dest.high, lsl(dest.low, imm.value - 32)); as_mov(Imm32(0), dest.low); @@ +473,5 @@ > + > +void > +MacroAssembler::lshift64(Register shift, Register64 dest) > +{ > + ScratchRegisterScope scratch(*this); Can you rename the parameter shift -> unmaskedShift, and scratch -> shift, please? @@ +474,5 @@ > +void > +MacroAssembler::lshift64(Register shift, Register64 dest) > +{ > + ScratchRegisterScope scratch(*this); > + ma_and(Imm32(0x3f), shift, scratch); // TODO: needed? I think we expect it to happen % 64 Yes, quoting https://github.com/WebAssembly/design/blob/master/AstSemantics.md#32-bit-integer-operators Shifts counts are wrapped to be less than the log-base-2 of the number of bits in the value to be shifted, as an unsigned quantity. For example, in a 32-bit shift, only the least 5 significant bits of the count affect the result. In a 64-bit shift, only the least 6 significant bits of the count affect the result. @@ +479,5 @@ > + as_mov(dest.high, lsl(dest.high, scratch)); > + ma_sub(scratch, Imm32(32), scratch); > + as_orr(dest.high, dest.high, lsl(dest.low, scratch)); > + ma_neg(scratch, scratch); > + as_orr(dest.high, dest.high, lsr(dest.low, scratch)); Maybe precise in a comment that one of the two orr has no effect, because one of the lsl or lsr will return 0? @@ +508,5 @@ > + as_mov(dest.low, lsr(dest.low, imm.value)); > + as_orr(dest.low, dest.low, lsl(dest.high, 32 - imm.value)); > + as_mov(dest.high, asr(dest.high, imm.value)); > + } else { > + ma_asr(Imm32(imm.value - 32), dest.high, dest.low); For consistency, can you a as_mov with a asr Operand2 here, too, please? @@ +518,5 @@ > +MacroAssembler::rshift64Arithmetic(Register shift, Register64 dest) > +{ > + Label proceed; > + > + ScratchRegisterScope scratch(*this); Same renaming suggestion as in lshift64. @@ +526,5 @@ > + as_orr(dest.low, dest.low, lsl(dest.high, scratch)); > + ma_neg(scratch, scratch, SetCC); > + ma_b(&proceed, Signed); > + > + as_orr(dest.low, dest.low, asr(dest.high, scratch)); Maybe add a comment above that this happens only if the shift value is > 32, and that dest.low is 0 in this case? @@ +542,5 @@ > + as_mov(dest.low, lsr(dest.low, imm.value)); > + as_orr(dest.low, dest.low, lsl(dest.high, 32 - imm.value)); > + as_mov(dest.high, lsr(dest.high, imm.value)); > + } else { > + ma_lsr(Imm32(imm.value - 32), dest.high, dest.low); Same remark as in the other functions. @@ +550,5 @@ > + > +void > +MacroAssembler::rshift64(Register shift, Register64 dest) > +{ > + ScratchRegisterScope scratch(*this); And same renaming suggestion here too.

Attachment #8774318 - Flags: review?(bbouvier) → review+

Benjamin Bouvier [:bbouvier] (inactive)

Comment 39

•

8 years ago

Comment on attachment 8774325 [details] [diff] [review] Part 16: Implement LClzI64 and LCtzI64 Review of attachment 8774325 [details] [diff] [review]: ----------------------------------------------------------------- Nice, thanks! ::: js/src/jit/arm/CodeGenerator-arm.cpp @@ -1060,5 @@ > Register output = ToRegister(ins->output()); > - ScratchRegisterScope scratch(masm); > - > - masm.ma_rsb(input, Imm32(0), scratch, SetCC); > - masm.ma_and(input, scratch, input); (hmmm, clobbering input...) ::: js/src/jit/arm/MacroAssembler-arm-inl.h @@ +223,5 @@ > + ScratchRegisterScope scratch(*this); > + as_rsb(scratch, src, Imm8(0), SetCC); > + as_and(dest, src, O2Reg(scratch), LeaveCC); > + as_clz(dest, dest); > + as_rsb(dest, dest, Imm8(0x1F), LeaveCC, Assembler::NotEqual); NotEqual means "not zero", here; so I think the condition at the end could be: knownNotZero ? Always : NotEqual but maybe that's premature optimization. @@ +257,5 @@ > + ScratchRegisterScope scratch(*this); > + ma_clz(src.high, scratch); > + ma_cmp(scratch, Imm32(32)); > + ma_mov(scratch, dest, LeaveCC, NotEqual); > + ma_clz(src.low, dest, Equal); ma_clz leaves CC by default too?

Attachment #8774325 - Flags: review?(bbouvier) → review+

Benjamin Bouvier [:bbouvier] (inactive)

Comment 40

•

8 years ago

Comment on attachment 8774326 [details] [diff] [review] Part 17: Implement LNotI64 Review of attachment 8774326 [details] [diff] [review]: ----------------------------------------------------------------- Sweet. ::: js/src/jit/arm/Assembler-arm.cpp @@ +1287,5 @@ > MOZ_ASSERT(1 <= amt && amt <= 32); > return O2RegImmShift(r, ASR, amt); > } > > other pre-existing nit: two blank lines here

Attachment #8774326 - Flags: review?(bbouvier) → review+

Benjamin Bouvier [:bbouvier] (inactive)

Comment 41

•

8 years ago

Comment on attachment 8774311 [details] [diff] [review] Part 3: Implement LWrapInt64ToInt32 Review of attachment 8774311 [details] [diff] [review]: ----------------------------------------------------------------- Trivial, stealing.

Attachment #8774311 - Flags: review?(efaustbmo) → review+

Benjamin Bouvier [:bbouvier] (inactive)

Comment 42

•

8 years ago

Comment on attachment 8774312 [details] [diff] [review] Part 4: Implement LExtendInt32ToInt64 Review of attachment 8774312 [details] [diff] [review]: ----------------------------------------------------------------- Stealing (efaust might be busy at tc39). Looks good, thanks. ::: js/src/jit/arm/CodeGenerator-arm.cpp @@ +3059,5 @@ > + masm.ma_mov(Imm32(0), output.high); > + else > + masm.ma_asr(Imm32(31), output.low, output.high); > +} > + nit: trailing blank line at EOF ::: js/src/jit/arm/Lowering-arm.cpp @@ +850,5 @@ > > void > LIRGeneratorARM::visitExtendInt32ToInt64(MExtendInt32ToInt64* ins) > { > + LExtendInt32ToInt64* lir = Can use `auto* lir =` to fit on a single line. @@ +858,5 @@ > + LDefinition def(LDefinition::GENERAL, LDefinition::MUST_REUSE_INPUT); > + def.setReusedInput(0); > + > + lir->setDef(0, def); > + lir->getDef(0)->setVirtualRegister(ins->virtualRegister()); Can you have def.setVirtualRegister(ins->virtualRegister()); just after def.setReusedInput(0) and then remove this line, instead? This groups all the actions on `def` before using it.

Attachment #8774312 - Flags: review?(efaustbmo) → review+

Benjamin Bouvier [:bbouvier] (inactive)

Comment 43

•

8 years ago

Comment on attachment 8774322 [details] [diff] [review] Part 13: Implement LAsmSelectI64 Review of attachment 8774322 [details] [diff] [review]: ----------------------------------------------------------------- Looks legit, thanks.

Attachment #8774322 - Flags: review?(efaustbmo) → review+

Benjamin Bouvier [:bbouvier] (inactive)

Updated

•

8 years ago

Attachment #8774323 - Flags: review?(efaustbmo) → review+

Sean Stangl [:sstangl]

Comment 44

•

8 years ago

Comment on attachment 8774307 [details] [diff] [review] Part 1: Make ion ready for int64 on arm Review of attachment 8774307 [details] [diff] [review]: ----------------------------------------------------------------- ::: js/src/jit/arm/Assembler-arm.cpp @@ +62,5 @@ > + // when odd. > + intRegIndex_ = (intRegIndex_ + 1) & ~1; > + if (intRegIndex_ == NumIntArgRegs) { > + // Align the stack on 8 bytes. > + static const int align = sizeof(uint64_t) - 1; Prefer "static const uint32_t align" to match type of stackOffset_. @@ +122,5 @@ > + // when odd. > + intRegIndex_ = (intRegIndex_ + 1) & ~1; > + if (intRegIndex_ == NumIntArgRegs) { > + // Align the stack on 8 bytes. > + static const int align = sizeof(uint64_t) - 1; Ditto ::: js/src/jit/arm/Lowering-arm.cpp @@ +146,5 @@ > +void > +LIRGeneratorARM::defineInt64Phi(MPhi* phi, size_t lirIndex) > +{ > + LPhi* low = current->getPhi(lirIndex + INT64LOW_INDEX); > + LPhi* high = current->getPhi(lirIndex + INT64HIGH_INDEX); INT64LOW_INDEX and INT64HIGH_INDEX aren't defined yet.

Attachment #8774307 - Flags: review?(sstangl) → review+

Sean Stangl [:sstangl]

Updated

•

8 years ago

Attachment #8774309 - Flags: review?(sstangl) → review+

Benjamin Bouvier [:bbouvier] (inactive)

Comment 45

•

8 years ago

Comment on attachment 8774327 [details] [diff] [review] Part 18: Implement LWasmTruncateToInt64 Review of attachment 8774327 [details] [diff] [review]: ----------------------------------------------------------------- Stealing review, Dan is busy. Looks good, thanks. ::: js/src/asmjs/WasmTypes.cpp @@ +196,5 @@ > return x % y; > } > > +static int64_t > +WasmTruncateDoubleToInt64(double input) I think we're already in wasm:: namespace, so no need for the prefixing here and below. @@ +204,5 @@ > + if (input >= INT64_MAX * 1.0) > + return 0x8000000000000000; > + if (input < INT64_MIN * 1.0) > + return 0x8000000000000000; > + return input; Make the conversions explicit, please: input >= double(INT64_MAX) input <= double(INT64_MIN) return int64_t(input); @@ +216,5 @@ > + if (input >= UINT64_MAX * 1.0) > + return 0x8000000000000000; > + if (input <= -1.0) > + return 0x8000000000000000; > + return input; ditto ::: js/src/asmjs/WasmTypes.h @@ +725,5 @@ > UDivI64, > ModI64, > UModI64, > + WasmTruncateDoubleToInt64, > + WasmTruncateDoubleToUint64, ditto ::: js/src/jit/IonTypes.h @@ +758,5 @@ > enum { > ArgType_General = 0x1, > ArgType_Double = 0x2, > ArgType_Float32 = 0x3, > + ArgType_Int64 = 0x4, Out of curiosity, why do these values not start with 0x0? ::: js/src/jit/arm/CodeGenerator-arm.cpp @@ +2956,5 @@ > > void > +CodeGeneratorARM::visitWasmTruncateToInt64(LWasmTruncateToInt64* lir) > +{ > + // TODO: reuse output.high as temp. Remove, or add a bug # in the comment for follow-up. @@ +2982,5 @@ > + masm.callWithABI(wasm::SymbolicAddress::WasmTruncateDoubleToUint64); > + else > + masm.callWithABI(wasm::SymbolicAddress::WasmTruncateDoubleToInt64); > + > + masm.Pop(input); As input uses useRegisterAtStart, is there a chance that input === ReturnReg64 and that popping here would clobber the result? If not, could you assert it at the top of this function, please? @@ +2986,5 @@ > + masm.Pop(input); > + > + masm.ma_cmp(output.high, Imm32(0x80000000)); > + masm.ma_cmp(output.low, Imm32(0x00000000), Assembler::Equal); > + masm.ma_b(ool->entry(), Assembler::Equal); As discussed on irc, it's a bit weird that we use x86 error codes here. Wouldn't a real ARM device clamp the results anyway if the input is out of bounds? I think another strategy would be to: - allocate space on the stack for the int64 return value; - make the callout return a bool indicating whether the conversion succeeded or not, and reporting the JS error if it did not - here in the call stub, if the bool is false, just jump to wasm::Throw - otherwise copy the result from stack to register This way we wouldn't need the OOL at all. What do you think? (can be done in a follow-up bug, though) @@ +3018,5 @@ > + // By default test for the following inputs and bail: > + // signed: ] -Inf, INTXX_MIN - 1.0 ] and [ INTXX_MAX + 1.0 : +Inf [ > + // unsigned: ] -Inf, -1.0 ] and [ UINTXX_MAX + 1.0 : +Inf [ > + // Note: we cannot always represent those exact values. As a result > + // this changes the actual comparison a bit. Look at what x86-shared does, it's slightly less complicated (not having different conditions, more specialized if bodies). ::: js/src/jit/arm/LIR-arm.h @@ +559,5 @@ > return mir_->toAsmJSAtomicBinopHeap(); > } > }; > > +class LWasmTruncateToInt64 : public LCallInstructionHelper<INT64_PIECES, 1, 0> Could we name it LWasmTruncateToInt64Callout, to make it clear that it's a call? ::: js/src/jit/arm/MacroAssembler-arm.cpp @@ +4694,5 @@ > void > +MacroAssembler::Pop(FloatRegister reg) > +{ > + VFPRegister r = VFPRegister(reg); > + ma_vpop(VFPRegister(reg)); ma_vpop(r);

Attachment #8774327 - Flags: review?(sunfish) → review+

Benjamin Bouvier [:bbouvier] (inactive)

Comment 46

•

8 years ago

Comment on attachment 8774328 [details] [diff] [review] Part 19: Implement LInt64ToFloatingPoint Review of attachment 8774328 [details] [diff] [review]: ----------------------------------------------------------------- Thanks! ::: js/src/asmjs/WasmTypes.cpp @@ +223,5 @@ > +static double > +Int64ToFloatingPoint(int32_t x_hi, uint32_t x_lo) > +{ > + int64_t x = ((uint64_t)x_hi << 32) + x_lo; > + return x; Too many implicit conversions here. How about: - making x_hi an uint32 too int64_t x = int64_t((uint64_t(x_hi) << 32)) + int64_t(x_hlo); or int64_t x = int64_t((uint64_t(x_hi) << 32) + uint64_t(x_hlo)); And at the end return double(x); (I wonder if technically, it might be undefined behavior to do this conversion...) @@ +230,5 @@ > +static double > +Uint64ToFloatingPoint(int32_t x_hi, uint32_t x_lo) > +{ > + uint64_t x = ((uint64_t)x_hi << 32) + x_lo; > + return x; ditto ::: js/src/jit/arm/LIR-arm.h @@ +574,5 @@ > return mir_->toWasmTruncateToInt64(); > } > }; > > +class LInt64ToFloatingPoint: public LCallInstructionHelper<1, INT64_PIECES, 0> Can you put "Callout" in the name, please?

Attachment #8774328 - Flags: review?(sunfish) → review+

Hannes Verschore [:h4writer]

Assignee

Comment 47

•

8 years ago

Thanks everybody for reviewing so quickly!!!

Hannes Verschore [:h4writer]

Assignee

Comment 48

•

8 years ago

(In reply to Sean Stangl [:sstangl] from comment #44) > Comment on attachment 8774307 [details] [diff] [review] > Part 1: Make ion ready for int64 on arm > ::: js/src/jit/arm/Lowering-arm.cpp > @@ +146,5 @@ > > +void > > +LIRGeneratorARM::defineInt64Phi(MPhi* phi, size_t lirIndex) > > +{ > > + LPhi* low = current->getPhi(lirIndex + INT64LOW_INDEX); > > + LPhi* high = current->getPhi(lirIndex + INT64HIGH_INDEX); > > INT64LOW_INDEX and INT64HIGH_INDEX aren't defined yet. It is defined in bug 1279248, which implements this for x86

Hannes Verschore [:h4writer]

Assignee

Comment 49

•

8 years ago

Attached patch Part 23: Implement LWasmLoadGlobalVarI64 and LWasmStoreGlobalVarI64 — Details — Splinter Review

Attachment #8775591 - Flags: review?(bbouvier)

Benjamin Bouvier [:bbouvier] (inactive)

Comment 50

•

8 years ago

Comment on attachment 8775591 [details] [diff] [review] Part 23: Implement LWasmLoadGlobalVarI64 and LWasmStoreGlobalVarI64 Review of attachment 8775591 [details] [diff] [review]: ----------------------------------------------------------------- Thanks!

Attachment #8775591 - Flags: review?(bbouvier) → review+

Hannes Verschore [:h4writer]

Assignee

Comment 51

•

8 years ago

Attached patch Part 24: Implement LWasmLoadI64 and LWasmStoreI64 — Details — Splinter Review

Attachment #8775612 - Flags: review?(bbouvier)

Benjamin Bouvier [:bbouvier] (inactive)

Comment 52

•

8 years ago

Comment on attachment 8775612 [details] [diff] [review] Part 24: Implement LWasmLoadI64 and LWasmStoreI64 Review of attachment 8775612 [details] [diff] [review]: ----------------------------------------------------------------- Thanks! ::: js/src/jit/arm/CodeGenerator-arm.cpp @@ +2375,5 @@ > + if (type == Scalar::Int64) { > + MOZ_ASSERT(INT64LOW_OFFSET == 0); > + masm.ma_dataTransferN(IsLoad, 32, /* signed = */ false, HeapReg, ptr, output.low); > + masm.ma_add(Imm32(INT64HIGH_OFFSET), ptr); > + masm.ma_dataTransferN(IsLoad, 32, isSigned, HeapReg, ptr, output.high); Note (no change needed): ma_dataTransferN may apparently handle 64 bits loads and stores, but I wonder how this is supposed to work with 32-bits registers... @@ +2387,4 @@ > } else { > + AnyRegister output = ToAnyRegister(lir->output()); > + bool isFloat = output.isFloat(); > + if (isFloat) { pre-existing: you can inline output.isFloat() here @@ +2425,5 @@ > return; > } > > Register ptr = ToRegister(lir->ptr()); > + unsigned byteSize = mir->byteSize(); Can you keep byteSize closer to its uses, please? @@ +2443,5 @@ > + > + Register64 value = ToRegister64(lir->getInt64Operand(lir->ValueIndex)); > + masm.ma_dataTransferN(IsStore, 32 /* bits */, /* signed */ false, HeapReg, ptr, value.low); > + masm.ma_add(Imm32(INT64HIGH_OFFSET), ptr); > + masm.ma_dataTransferN(IsStore, 32 /* bits */, /* signed */ true, HeapReg, ptr, value.high); Couldn't the second one also be an unsigned store? sign is not taken into account in ma_dataTransferN when the size is 32, anyways.

Attachment #8775612 - Flags: review?(bbouvier) → review+

Hannes Verschore [:h4writer]

Assignee

Comment 53

•

8 years ago

https://treeherder.mozilla.org/#/jobs?repo=try&revision=fb1525e83067 It looks like we are getting close, though there is an issue. Seems like the regalloc doesn't know how to handle two "reuseinput" on one definition properly. The issue presented itself only on arm currently, but I think it is also present on x86. It happens when: [MoveGroup] [stack:12 -> r3] [stack:8 -> r2] [22,23 Foo] [def v13<g>:r3] [def v14<g>:r2] [use v4:r r3] [use v5:r r2] we need to switch the registers in the movegroup. E.g. in this case we need to add [r2 -> r3] and [r3 -> r2] to flip the registers. Now the logic isn't smart enough to handle this correctly and makes: [MoveGroup] [stack:8 -> r3] [stack:8 -> r2] out of this. Giving allocation integrity errors and off course wrong results. Given this is regalloc it might be hard to fix this quickly. I'm doubting if for the landing I should quickly replace all reuseInput with a move and open a follow-up bug to fix it. That should solve it quickly with a little bit worse codegen.

Hannes Verschore [:h4writer]

Assignee

Comment 54

•

8 years ago

Attached patch Part 25: Temporarily don't use defineInt64ReuseInput (obsolete) — Details — Splinter Review

Like mentioned in the previous comment: defineInt64ReuseInput isn't playing nicely with the register allocator. Seems it doesn't support more than one reuseInput yet. This fixes it by not using defineInt64ReuseInput for now. I'll open a follow-up bug to enable it again.

Attachment #8775965 - Flags: review?(bbouvier)

Hannes Verschore [:h4writer]

Assignee

Updated

•

8 years ago

Blocks: 1290456

Benjamin Bouvier [:bbouvier] (inactive)

Comment 55

•

8 years ago

Comment on attachment 8775965 [details] [diff] [review] Part 25: Temporarily don't use defineInt64ReuseInput Review of attachment 8775965 [details] [diff] [review]: ----------------------------------------------------------------- It seems okay to land this for unblocking shipping, and take care of it later. Thanks! ::: js/src/jit/arm/CodeGenerator-arm.cpp @@ +298,5 @@ > const LInt64Allocation lhs = lir->getInt64Operand(LAddI64::Lhs); > const LInt64Allocation rhs = lir->getInt64Operand(LAddI64::Rhs); > + Register64 out = ToOutRegister64(lir); > + > + masm.move64(ToRegister64(lhs), out); Can you do this move only if ToReg64(lhs) != out? (and for all other instances in this patch) ::: js/src/jit/arm/Lowering-arm.cpp @@ +195,5 @@ > void > LIRGeneratorARM::lowerForALUInt64(LInstructionHelper<INT64_PIECES, 2 * INT64_PIECES, 0>* ins, > MDefinition* mir, MDefinition* lhs, MDefinition* rhs) > { > + ins->setInt64Operand(0, useInt64Register(lhs)); This could still be useRegisterAtStart? @@ +214,5 @@ > if (int64_t(1) << shift == constant) > constantNeedTemp = false; > } > > + ins->setInt64Operand(0, useInt64Register(lhs)); Same question. @@ +295,5 @@ > { > if (mir->isRotate() && !rhs->isConstant()) > ins->setTemp(0, temp()); > > + ins->setInt64Operand(0, useInt64Register(lhs)); And here too. @@ +508,5 @@ > void > LIRGeneratorARM::visitAsmSelect(MAsmSelect* ins) > { > if (ins->type() == MIRType::Int64) { > + auto* lir = new(alloc()) LAsmSelectI64(useInt64Register(ins->trueExpr()), And here too. ::: js/src/jit/x64/Lowering-x64.cpp @@ +72,5 @@ > +LIRGeneratorX64::lowerForShiftInt64(LInstructionHelper<INT64_PIECES, INT64_PIECES + 1, Temps>* ins, > + MDefinition* mir, MDefinition* lhs, MDefinition* rhs) > +{ > + ins->setInt64Operand(0, useInt64RegisterAtStart(lhs)); > +#if defined(JS_NUNBOX32) I think this can't happen on x64. @@ +92,5 @@ > + use.setVirtualRegister(rhs->virtualRegister()); > + ins->setOperand(INT64_PIECES, use); > + } > + > + defineInt64ReuseInput(ins, mir, 0); I thought defineInt64ReuseInput was removed? (ok to move it back in this file) @@ +114,5 @@ > + auto* lir = new(alloc()) LAsmSelectI64(useInt64RegisterAtStart(ins->trueExpr()), > + useInt64(ins->falseExpr()), > + useRegister(ins->condExpr()) > + ); > + defineInt64ReuseInput(lir, ins, LAsmSelectI64::TrueExprIndex); ditto ::: js/src/jit/x86/CodeGenerator-x86.cpp @@ +1516,4 @@ > Register64 falseExpr = ToRegister64(lir->falseExpr()); > Register64 out = ToOutRegister64(lir); > > + masm.move64(trueExpr, out); Can you do this only if trueExpr != out? ::: js/src/jit/x86/Lowering-x86.cpp @@ +201,5 @@ > void > LIRGeneratorX86::lowerForALUInt64(LInstructionHelper<INT64_PIECES, 2 * INT64_PIECES, 0>* ins, > MDefinition* mir, MDefinition* lhs, MDefinition* rhs) > { > + ins->setInt64Operand(0, useInt64Register(lhs)); Can it stay atStart? @@ +238,5 @@ > +{ > + ins->setInt64Operand(0, useInt64Register(lhs)); > +#if defined(JS_NUNBOX32) > + if (mir->isRotate()) > + ins->setTemp(0, temp()); This is gross, it should go under LIRGeneratorX86::visitRotate, instead. @@ +248,5 @@ > + ins->setOperand(INT64_PIECES, useOrConstant(rhs)); > + } else { > + // The operands are int64, but we only care about the lower 32 bits of > + // the RHS. On 32-bit, the code below will load that part in ecx and > + // will discard the upper half. Can you remove "On 32-bit," in this sentence? @@ +274,5 @@ > + lowerAsmSelect(ins); > + return; > + } > + > + auto* lir = new(alloc()) LAsmSelectI64(useInt64Register(ins->trueExpr()), Can it stay atStart?

Attachment #8775965 - Flags: review?(bbouvier) → review+

Hannes Verschore [:h4writer]

Assignee

Comment 56

•

8 years ago

Attached patch Part 25: Temporarily don't use defineInt64ReuseInput — Details — Splinter Review

Attachment #8775965 - Attachment is obsolete: true

Attachment #8775970 - Flags: review?(bbouvier)

Benjamin Bouvier [:bbouvier] (inactive)

Comment 57

•

8 years ago

Comment on attachment 8775970 [details] [diff] [review] Part 25: Temporarily don't use defineInt64ReuseInput Review of attachment 8775970 [details] [diff] [review]: ----------------------------------------------------------------- Haha, review conflict! Carrying forward r+ with addressed comments from the previous review. (fwiw, interdiff looks good https://bugzilla.mozilla.org/attachment.cgi?oldid=8775965&action=interdiff&newid=8775970&headers=1)

Attachment #8775970 - Flags: review?(bbouvier) → review+

Hannes Verschore [:h4writer]

Assignee

Comment 58

•

8 years ago

(In reply to Benjamin Bouvier [:bbouvier] from comment #55) > Comment on attachment 8775965 [details] [diff] [review] > Part 25: Temporarily don't use defineInt64ReuseInput > > Review of attachment 8775965 [details] [diff] [review]: > ----------------------------------------------------------------- > > It seems okay to land this for unblocking shipping, and take care of it > later. Thanks! > > ::: js/src/jit/arm/CodeGenerator-arm.cpp > @@ +298,5 @@ > > const LInt64Allocation lhs = lir->getInt64Operand(LAddI64::Lhs); > > const LInt64Allocation rhs = lir->getInt64Operand(LAddI64::Rhs); > > + Register64 out = ToOutRegister64(lir); > > + > > + masm.move64(ToRegister64(lhs), out); > > Can you do this move only if ToReg64(lhs) != out? (and for all other > instances in this patch) Not needed, since we don't use AtStart register they cannot overlap. So we have to copy them! > ::: js/src/jit/arm/Lowering-arm.cpp > @@ +195,5 @@ > > void > > LIRGeneratorARM::lowerForALUInt64(LInstructionHelper<INT64_PIECES, 2 * INT64_PIECES, 0>* ins, > > MDefinition* mir, MDefinition* lhs, MDefinition* rhs) > > { > > + ins->setInt64Operand(0, useInt64Register(lhs)); > > This could still be useRegisterAtStart? Here our register allocator is the problem (which is a bug). We cannot have mixed AtStart and NotAtStart definitions. > ::: js/src/jit/x64/Lowering-x64.cpp > @@ +72,5 @@ > > +LIRGeneratorX64::lowerForShiftInt64(LInstructionHelper<INT64_PIECES, INT64_PIECES + 1, Temps>* ins, > > + MDefinition* mir, MDefinition* lhs, MDefinition* rhs) > > +{ > > + ins->setInt64Operand(0, useInt64RegisterAtStart(lhs)); > > +#if defined(JS_NUNBOX32) > > I think this can't happen on x64. correct removing > > @@ +92,5 @@ > > + use.setVirtualRegister(rhs->virtualRegister()); > > + ins->setOperand(INT64_PIECES, use); > > + } > > + > > + defineInt64ReuseInput(ins, mir, 0); > > I thought defineInt64ReuseInput was removed? (ok to move it back in this > file) Yeah I was almost going to move it over to x64 only. But fixed with next patch ;). > @@ +238,5 @@ > > +{ > > + ins->setInt64Operand(0, useInt64Register(lhs)); > > +#if defined(JS_NUNBOX32) > > + if (mir->isRotate()) > > + ins->setTemp(0, temp()); > > This is gross, it should go under LIRGeneratorX86::visitRotate, instead. I removed the JS_NUNBOX32, since we are now on x86 only. But the rest is quite similar, therefore keeping. > @@ +248,5 @@ > > + ins->setOperand(INT64_PIECES, useOrConstant(rhs)); > > + } else { > > + // The operands are int64, but we only care about the lower 32 bits of > > + // the RHS. On 32-bit, the code below will load that part in ecx and > > + // will discard the upper half. > > Can you remove "On 32-bit," in this sentence? Yes

Pulsebot

Comment 59

•

8 years ago

Pushed by hv1989@gmail.com: https://hg.mozilla.org/integration/mozilla-inbound/rev/c98e305712c8 Part 1: Preparations in IonMonkey to support i64 on arm, r=sstangl https://hg.mozilla.org/integration/mozilla-inbound/rev/8b30ccb87d0a Part 2: Implement the 64bit variant of AsmJSParameter on arm, r=sstangl https://hg.mozilla.org/integration/mozilla-inbound/rev/14fb049c6eab Part 3: Implement the 64bit variant of WrapInt64ToInt32 on arm, r=bbouvier https://hg.mozilla.org/integration/mozilla-inbound/rev/38c4a72c8804 Part 4: Implement the 64bit variant of Extend on arm, r=bbouvier https://hg.mozilla.org/integration/mozilla-inbound/rev/e11ffc9f839c Part 5: Implement the 64bit variant of Add and Sub on arm, r=bbouvier https://hg.mozilla.org/integration/mozilla-inbound/rev/ea5752e51caf Part 6: Implement the 64bit variant of Mul on arm, r=nbp https://hg.mozilla.org/integration/mozilla-inbound/rev/48ed814c72c6 Part 7: Implement the 64bit variant of Div and Mod on arm, r=nbp https://hg.mozilla.org/integration/mozilla-inbound/rev/df8f35c18584 Part 8: Implement the 64bit variant of Compare on arm, r=nbp https://hg.mozilla.org/integration/mozilla-inbound/rev/65981e46881a Part 9: Implement the 64bit variant of Shift on arm, r=bbouvier https://hg.mozilla.org/integration/mozilla-inbound/rev/d2e991415189 Part 10: Implement the 64bit variant of BitOp on arm, r=bbouvier https://hg.mozilla.org/integration/mozilla-inbound/rev/9ae4ad38a4b3 Part 11: Implement the 64bit variant of Rotate on arm, r=luke https://hg.mozilla.org/integration/mozilla-inbound/rev/6e959f39e82f Part 12: Implement the 64bit variant of AsmJSPassStackArg on arm, r=luke https://hg.mozilla.org/integration/mozilla-inbound/rev/8f7fa045bfba Part 13: Implement the 64bit variant of AsmSelect on arm, r=bbouvier https://hg.mozilla.org/integration/mozilla-inbound/rev/f5f53c099276 Part 14: Implement the 64bit variant of AsmReinterpret on arm, r=bbouvier https://hg.mozilla.org/integration/mozilla-inbound/rev/4c042c13a73a Part 15: Implement the 64bit variant of PopCnt on arm, r=lth https://hg.mozilla.org/integration/mozilla-inbound/rev/973d29183f5f Part 16: Implement the 64bit variant of Clz and Ctz on arm, r=bbouvier https://hg.mozilla.org/integration/mozilla-inbound/rev/2f7ac340ea3c Part 17: Implement the 64bit variant of Not on arm, r=bbouvier https://hg.mozilla.org/integration/mozilla-inbound/rev/48d273fefe73 Part 18: Implement the 64bit variant of WasmTruncate on arm, r=bbouvier https://hg.mozilla.org/integration/mozilla-inbound/rev/ce45ee774edb Part 19: Implement the 64bit variant of Int64ToFloatingPoint on arm, r=bbouvier https://hg.mozilla.org/integration/mozilla-inbound/rev/7edcd686d163 Part 20: Implement the 64bit variant of Test on arm, r=nbp https://hg.mozilla.org/integration/mozilla-inbound/rev/33dfbd1645d9 Part 21: Unrelated cleanup, r=bbouvier https://hg.mozilla.org/integration/mozilla-inbound/rev/1f97f6942a84 Part 22: Implement the 64bit variant of WasmLoadGlobalVar and WasmStoreGlobalVar on arm, r=bbouvier https://hg.mozilla.org/integration/mozilla-inbound/rev/1449099f1906 Part 23: Implement the 64bit variant of WasmLoad and WasmStore on arm, r=bbouvier https://hg.mozilla.org/integration/mozilla-inbound/rev/0f0fe678a40a Part 24: Make WebAssembly ready to run 64bit instructions on arm, r=luke https://hg.mozilla.org/integration/mozilla-inbound/rev/5b6fd86e965e Part 25: Don't reuse input during lowering for int 64 values on 32 bit platforms, r=bbouvier

Carsten Book [:Tomcat]

Comment 60

•

8 years ago

bugherder

Status: NEW → RESOLVED

Closed: 8 years ago

status-firefox50: --- → fixed

Resolution: --- → FIXED

Target Milestone: --- → mozilla50

Pulsebot

Comment 61

•

8 years ago

Pushed by hv1989@gmail.com: https://hg.mozilla.org/integration/mozilla-inbound/rev/97ae43d193b1 Backout revision 5b6fd86e965e (part 25), r=backout

Wes Kocher (:KWierso) (Not reading bugmail; email directly if needed)

Comment 62

•

8 years ago

bugherder

https://hg.mozilla.org/mozilla-central/rev/97ae43d193b1

Lars T Hansen [:lth]

Updated

•

8 years ago

Blocks: 1277011

You need to log in before you can comment on or make changes to this bug.