If you think a bug might affect users in the 57 release, please set the correct tracking and status flags for Release Management.

Implement 64bit integer operations on arm

RESOLVED FIXED in Firefox 50

Status

()

Core
JavaScript Engine: JIT
P1
normal
RESOLVED FIXED
a year ago
a year ago

People

(Reporter: h4writer, Assigned: h4writer)

Tracking

unspecified
mozilla50
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(firefox50 fixed)

Details

Attachments

(25 attachments, 1 obsolete attachment)

9.63 KB, patch
sstangl
: review+
Details | Diff | Splinter Review
1.36 KB, patch
sstangl
: review+
Details | Diff | Splinter Review
1.76 KB, patch
bbouvier
: review+
Details | Diff | Splinter Review
2.55 KB, patch
bbouvier
: review+
Details | Diff | Splinter Review
12.25 KB, patch
lth
: review+
Details | Diff | Splinter Review
12.42 KB, patch
nbp
: review+
Details | Diff | Splinter Review
18.19 KB, patch
nbp
: review+
Details | Diff | Splinter Review
27.47 KB, patch
nbp
: review+
Details | Diff | Splinter Review
10.25 KB, patch
bbouvier
: review+
Details | Diff | Splinter Review
4.24 KB, patch
bbouvier
: review+
Details | Diff | Splinter Review
11.32 KB, patch
luke
: review+
Details | Diff | Splinter Review
3.24 KB, patch
luke
: review+
Details | Diff | Splinter Review
3.52 KB, patch
bbouvier
: review+
Details | Diff | Splinter Review
2.54 KB, patch
bbouvier
: review+
Details | Diff | Splinter Review
6.29 KB, patch
lth
: review+
Details | Diff | Splinter Review
6.34 KB, patch
bbouvier
: review+
Details | Diff | Splinter Review
2.78 KB, patch
bbouvier
: review+
Details | Diff | Splinter Review
17.55 KB, patch
bbouvier
: review+
Details | Diff | Splinter Review
10.86 KB, patch
bbouvier
: review+
Details | Diff | Splinter Review
1.89 KB, patch
nbp
: review+
Details | Diff | Splinter Review
1.70 KB, patch
bbouvier
: review+
Details | Diff | Splinter Review
9.96 KB, patch
luke
: review+
Details | Diff | Splinter Review
5.39 KB, patch
bbouvier
: review+
Details | Diff | Splinter Review
9.99 KB, patch
bbouvier
: review+
Details | Diff | Splinter Review
40.47 KB, patch
bbouvier
: review+
Details | Diff | Splinter Review
(Assignee)

Description

a year ago
Same as bug 1279248, but now for arm.
(Assignee)

Comment 1

a year ago
Created attachment 8774307 [details] [diff] [review]
Part 1: Make ion ready for int64 on arm
Assignee: nobody → hv1989
(Assignee)

Comment 2

a year ago
Created attachment 8774309 [details] [diff] [review]
Part 2: implement LAsmJSParameterI64
(Assignee)

Comment 3

a year ago
Created attachment 8774311 [details] [diff] [review]
Part 3: Implement LWrapInt64ToInt32
(Assignee)

Comment 4

a year ago
Created attachment 8774312 [details] [diff] [review]
Part 4: Implement LExtendInt32ToInt64
(Assignee)

Comment 5

a year ago
Created attachment 8774313 [details] [diff] [review]
Part 5: Implement LAddI64 and LSubI64
(Assignee)

Comment 6

a year ago
Created attachment 8774315 [details] [diff] [review]
Part 6: Implement LMulI64
(Assignee)

Comment 7

a year ago
Created attachment 8774316 [details] [diff] [review]
Part 7: Implement L(U)DivOrModI64
(Assignee)

Comment 8

a year ago
Created attachment 8774317 [details] [diff] [review]
Part 8: Implement LCompareI64
(Assignee)

Comment 9

a year ago
Created attachment 8774318 [details] [diff] [review]
Part 9: Implement LShiftI64
(Assignee)

Comment 10

a year ago
Created attachment 8774319 [details] [diff] [review]
Part 10: Implement LBitOpI64
(Assignee)

Comment 11

a year ago
Created attachment 8774320 [details] [diff] [review]
Part 11: Implement LRotateI64
(Assignee)

Comment 12

a year ago
Created attachment 8774321 [details] [diff] [review]
Part 12: Implement LAsmJSPassStackArgI64
(Assignee)

Comment 13

a year ago
Created attachment 8774322 [details] [diff] [review]
Part 13: Implement LAsmSelectI64
(Assignee)

Comment 14

a year ago
Created attachment 8774323 [details] [diff] [review]
Part 14: Implement LAsmReinterpret(From|To)I64
(Assignee)

Comment 15

a year ago
Created attachment 8774324 [details] [diff] [review]
Part 15: Implement LPopcntI64
(Assignee)

Comment 16

a year ago
Created attachment 8774325 [details] [diff] [review]
Part 16: Implement LClzI64 and LCtzI64
(Assignee)

Comment 17

a year ago
Created attachment 8774326 [details] [diff] [review]
Part 17: Implement LNotI64
(Assignee)

Comment 18

a year ago
Created attachment 8774327 [details] [diff] [review]
Part 18: Implement LWasmTruncateToInt64
(Assignee)

Comment 19

a year ago
Created attachment 8774328 [details] [diff] [review]
Part 19: Implement LInt64ToFloatingPoint
(Assignee)

Comment 20

a year ago
Created attachment 8774329 [details] [diff] [review]
Part 20: Implement LTestI64AndBranch
(Assignee)

Comment 21

a year ago
Created attachment 8774331 [details] [diff] [review]
Part 21: Remove some arm code
(Assignee)

Comment 22

a year ago
Created attachment 8774332 [details] [diff] [review]
Part 22: Enable 64bit operations on arm
(Assignee)

Updated

a year ago
Attachment #8774307 - Flags: review?(sstangl)
(Assignee)

Updated

a year ago
Attachment #8774309 - Flags: review?(sstangl)
(Assignee)

Updated

a year ago
Attachment #8774311 - Flags: review?(efaustbmo)
(Assignee)

Updated

a year ago
Attachment #8774312 - Flags: review?(efaustbmo)
(Assignee)

Updated

a year ago
Attachment #8774313 - Flags: review?(lhansen)
(Assignee)

Updated

a year ago
Attachment #8774315 - Flags: review?(nicolas.b.pierron)
(Assignee)

Updated

a year ago
Attachment #8774316 - Flags: review?(nicolas.b.pierron)
(Assignee)

Updated

a year ago
Attachment #8774317 - Flags: review?(nicolas.b.pierron)
(Assignee)

Updated

a year ago
Attachment #8774318 - Flags: review?(bbouvier)
(Assignee)

Updated

a year ago
Attachment #8774319 - Flags: review?(bbouvier)
(Assignee)

Updated

a year ago
Attachment #8774320 - Flags: review?(luke)
(Assignee)

Updated

a year ago
Attachment #8774321 - Flags: review?(luke)
(Assignee)

Updated

a year ago
Attachment #8774322 - Flags: review?(efaustbmo)
(Assignee)

Updated

a year ago
Attachment #8774323 - Flags: review?(efaustbmo)
(Assignee)

Updated

a year ago
Attachment #8774324 - Flags: review?(lhansen)
(Assignee)

Updated

a year ago
Attachment #8774325 - Flags: review?(bbouvier)
(Assignee)

Updated

a year ago
Attachment #8774326 - Flags: review?(bbouvier)
(Assignee)

Updated

a year ago
Attachment #8774327 - Flags: review?(sunfish)
(Assignee)

Updated

a year ago
Attachment #8774328 - Flags: review?(sunfish)
(Assignee)

Updated

a year ago
Attachment #8774329 - Flags: review?(nicolas.b.pierron)
(Assignee)

Updated

a year ago
Attachment #8774331 - Flags: review?(sstangl)
(Assignee)

Updated

a year ago
Attachment #8774332 - Flags: review?(luke)
(Assignee)

Comment 23

a year ago
As a general remark on the patches. Often the codegen is a direct copy from the shared/x86 variant. I would have loved to put this in the global codegen, but the specific xxx64 are not implemented on all macroassemblers (arm64/mips). As soon as that is done we can unify them again. (Some examples are mul64/add46/sub64/...)
(Assignee)

Updated

a year ago
Priority: -- → P1
(Assignee)

Updated

a year ago
Blocks: 1246648

Updated

a year ago
Attachment #8774321 - Flags: review?(luke) → review+
Comment on attachment 8774320 [details] [diff] [review]
Part 11: Implement LRotateI64

Review of attachment 8774320 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/jit/arm/CodeGenerator-arm.cpp
@@ +3343,5 @@
> +
> +    if (count->isConstant()) {
> +        int32_t c = int32_t(count->toConstant()->toInt64() & 0x3F);
> +        if (!c)
> +            return;

Don't you still need to call the masm.rotate* so that the input gets copied to the output?

::: js/src/jit/arm/MacroAssembler-arm-inl.h
@@ +617,5 @@
> +        ma_mov(dest.high, scratch);
> +        as_mov(dest.high, lsl(dest.high, amount));
> +        as_orr(dest.high, dest.high, lsl(dest.low, amount - 32));
> +        as_mov(dest.low, lsl(dest.low, amount));
> +        as_orr(dest.low, dest.low, lsl(scratch, amount - 32));

From irc: I think this case has a bug but you could fix it and reuse code by calling rotateRight64(32 - (amount & 0x1f)).

Could you also add a test case with an immediate that hits this case?

@@ +705,5 @@
> +        as_orr(dest.high, dest.high, lsl(dest.low, 32 - amount));
> +        as_mov(dest.low, lsr(dest.low, amount));
> +        as_orr(dest.low, dest.low, lsl(scratch, 32 - amount));
> +    } else {
> +        ma_mov(dest.high, scratch);

Ditto comment above.
Attachment #8774320 - Flags: review?(luke) → review+
Comment on attachment 8774332 [details] [diff] [review]
Part 22: Enable 64bit operations on arm

Review of attachment 8774332 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/asmjs/Wasm.cpp
@@ +68,1 @@
>      return true;

\o/ thanks so much for all this hard work!

::: js/src/asmjs/WasmIonCompile.cpp
@@ +805,5 @@
>              return true;
>  
>          ABIArg arg = args->abi_.next(ToMIRType(type));
> +#ifdef JS_CODEGEN_REGISTER_PAIR
> +        if (arg.kind() == ABIArg::GPR_PAIR) {

Now that there are a few cases, could you express this control flow as a `switch (arg.kind())`?

::: js/src/jit/x64/CodeGenerator-x64.cpp
@@ +1022,5 @@
>  
> +    if (lir->mir()->bottomHalf())
> +        masm.movl(ToOperand(input), output);
> +    else
> +        MOZ_CRASH("Not implemented.");

Maybe "Never used."?
Attachment #8774332 - Flags: review?(luke) → review+
(Assignee)

Comment 26

a year ago
Comment on attachment 8774307 [details] [diff] [review]
Part 1: Make ion ready for int64 on arm

Review of attachment 8774307 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/jit/arm/MacroAssembler-arm.cpp
@@ +707,5 @@
>  
>  void
>  MacroAssemblerARM::ma_rsb(Register src1, Register dest, SBit s, Condition c)
>  {
> +    as_alu(dest, dest, O2Reg(src1), OpRsb, s, c);

I even think this should be:
as_alu(dest, src1, O2Reg(dest), OpRsb, s, c);
Comment on attachment 8774313 [details] [diff] [review]
Part 5: Implement LAddI64 and LSubI64

Review of attachment 8774313 [details] [diff] [review]:
-----------------------------------------------------------------

I'm assuming the code cut from CodeGenerator-x86 moved to some other patch and that those changes are spurious here?
Attachment #8774313 - Flags: review?(lhansen) → review+
Comment on attachment 8774324 [details] [diff] [review]
Part 15: Implement LPopcntI64

Review of attachment 8774324 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/jit/arm/MacroAssembler-arm-inl.h
@@ +204,5 @@
> +{
> +    MOZ_ASSERT(dest.low != tmp);
> +    MOZ_ASSERT(dest.high != tmp);
> +    MOZ_ASSERT(dest.low != dest.high);
> +    if (dest.low != src.high) {

I think there could usefully be a comment here that we can have src and dest overlapping, but partially or in reverse order, to motivate the logic.
Attachment #8774324 - Flags: review?(lhansen) → review+
(Assignee)

Comment 29

a year ago
(In reply to Lars T Hansen [:lth] from comment #27)
> Comment on attachment 8774313 [details] [diff] [review]
> Part 5: Implement LAddI64 and LSubI64
> 
> Review of attachment 8774313 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> I'm assuming the code cut from CodeGenerator-x86 moved to some other patch
> and that those changes are spurious here?

Oh yes. The code removal of Codegen-x86 shouldn't be here.
(I first moved it to Codegenerator.cpp which didn't work yet since we still have 2/3 architectures not implemented yet. (arm64 and mips). I forgot to remove the code removal from Codegenerator-x86.cpp)
Comment on attachment 8774315 [details] [diff] [review]
Part 6: Implement LMulI64

Review of attachment 8774315 [details] [diff] [review]:
-----------------------------------------------------------------

Fix the nits and r=me.

::: js/src/jit/arm/Lowering-arm.cpp
@@ +209,5 @@
> +        int64_t constant = rhs->toConstant()->toInt64();
> +        int32_t shift = mozilla::FloorLog2(constant);
> +        if (constant <= 0 || int64_t(1) << shift != constant) {
> +            constantNeedTemp = constant != -1 && constant != 0 &&
> +                               constant != 1 && constant != 2;

nit:
  1/ constant == 1, shift = 0, (constant <= 0) == false, (int64_t(1) << shift != constant) == false
     => constantNeedTemp = true

  2/ constant == 2, shift = 1, (constant <= 0) == false, (int64_t(1) << shift != constant) == false
     => constantNeedTemp = true

  3/ constant == 4, shift = 2, (constant <= 0) == false, (int64_t(1) << shift != constant) == false
     => constantNeedTemp = true

I suggest to change this code to:

    // See special cases in CodeGeneratorARM::visitMulI64
    if (-1 <= constant && constant <= 2)
        constantNeedTemp = false;
    if (int64_t(1) << shift == constant)
        constantNeedTemp = false;

::: js/src/jit/arm/MacroAssembler-arm-inl.h
@@ +377,5 @@
> +    MOZ_ASSERT(dest != src);
> +    MOZ_ASSERT(dest.low != src.high && dest.high != src.low);
> +
> +    // Compute mul64
> +    ma_mul(src.low, dest.high, dest.high); // (2)

nit: swap src.low and dest.high first arguments to avoid ARMv6 encoding limitation.

@@ +380,5 @@
> +    // Compute mul64
> +    ma_mul(src.low, dest.high, dest.high); // (2)
> +    ma_mul(src.high, dest.low, temp); // (3)
> +    ma_add(dest.high, temp, temp);
> +    ma_umull(src.low, dest.low, dest.high, dest.low); // (4) + (1)

nit: swap src.low and dest.low first arguments to avoid the ARMv6 encoding limitation.

::: js/src/jit/arm/MacroAssembler-arm.cpp
@@ +910,5 @@
> +
> +void
> +MacroAssemblerARM::ma_umull(Register src1, Register src2, Register destHigh, Register destLow)
> +{
> +    as_umull(destHigh, destLow, src1, src2);

nit: as_umul has encoding limitations which are not yet translated as assertions in our code base.
Namely:
  destHigh != pc,
  destLow != pc,
  src1 != pc,
  src2 != pc,
  destHigh != destLow,
  destHigh != src2 && destLow != src2 (ARMv6)

The same issue also exists with ma_mul.
Attachment #8774315 - Flags: review?(nicolas.b.pierron) → review+
Comment on attachment 8774331 [details] [diff] [review]
Part 21: Remove some arm code

Review of attachment 8774331 [details] [diff] [review]:
-----------------------------------------------------------------

Stealing, trivial. Thanks.
Attachment #8774331 - Flags: review?(sstangl) → review+
Comment on attachment 8774319 [details] [diff] [review]
Part 10: Implement LBitOpI64

Review of attachment 8774319 [details] [diff] [review]:
-----------------------------------------------------------------

Nice, cheers.

::: js/src/jit/arm/MacroAssembler-arm-inl.h
@@ +81,5 @@
>  void
>  MacroAssembler::and64(Imm64 imm, Register64 dest)
>  {
> +    if (imm.low().value)
> +        and32(imm.low(), dest.low);

It seems that if !imm.low().value, then dest.low should be 0. (and right below too)
That works for `or`/`xor`, but not for `and`.

Can you add a test for it, please?
Attachment #8774319 - Flags: review?(bbouvier) → review+
Comment on attachment 8774316 [details] [diff] [review]
Part 7: Implement L(U)DivOrModI64

Review of attachment 8774316 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/jit/arm/Assembler-arm.cpp
@@ +978,5 @@
>      return Condition(ConditionInversionBit ^ cond);
>  }
>  
> +Assembler::Condition
> +Assembler::SignedCondition(Condition cond)

nit: This function does not seems to be used in this patch, remove it if it is not used in any other patch either.

::: js/src/jit/arm/CodeGenerator-arm.cpp
@@ +3148,5 @@
> +    regs.take(lhs.low);
> +    regs.take(lhs.high);
> +    regs.take(rhs.low);
> +    regs.take(rhs.high);
> +    Register temp = regs.takeAny();

I would highly prefer to request a temp() in the lowering than using an AllocatableGeneralRegisterSet here.

@@ +3201,5 @@
> +    regs.take(lhs.low);
> +    regs.take(lhs.high);
> +    regs.take(rhs.low);
> +    regs.take(rhs.high);
> +    Register temp = regs.takeAny();

same here.

::: js/src/jit/arm/LIR-arm.h
@@ +169,5 @@
> +        if (mir_->isMod())
> +            return mir_->toMod()->canBeDivideByZero();
> +        return mir_->toDiv()->canBeDivideByZero();
> +    }
> +    bool canBeNegativeOverflow() const {

nit: Greedy copy&pasta issue?  Does it make sense to check for negative-overflow (INT64_MIN / -1) for unsigned operations?
Attachment #8774316 - Flags: review?(nicolas.b.pierron)
(Assignee)

Comment 34

a year ago
(In reply to Nicolas B. Pierron [:nbp] from comment #33)
> Comment on attachment 8774316 [details] [diff] [review]
> Part 7: Implement L(U)DivOrModI64
> 
> Review of attachment 8774316 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> ::: js/src/jit/arm/Assembler-arm.cpp
> @@ +978,5 @@
> >      return Condition(ConditionInversionBit ^ cond);
> >  }
> >  
> > +Assembler::Condition
> > +Assembler::SignedCondition(Condition cond)
> 
> nit: This function does not seems to be used in this patch, remove it if it
> is not used in any other patch either.

It is used in branch64
> 
> ::: js/src/jit/arm/CodeGenerator-arm.cpp
> @@ +3148,5 @@
> > +    regs.take(lhs.low);
> > +    regs.take(lhs.high);
> > +    regs.take(rhs.low);
> > +    regs.take(rhs.high);
> > +    Register temp = regs.takeAny();
> 
> I would highly prefer to request a temp() in the lowering than using an
> AllocatableGeneralRegisterSet here.

Since this is a LCallInstruction all inputs are usedAtStart. Using a temp() through LIR could give us one of the inputs and that will clobber the result. I could use ScratchRegister though if you prefer that.

> @@ +3201,5 @@
> > +    regs.take(lhs.low);
> > +    regs.take(lhs.high);
> > +    regs.take(rhs.low);
> > +    regs.take(rhs.high);
> > +    Register temp = regs.takeAny();
> 
> same here.
> 
> ::: js/src/jit/arm/LIR-arm.h
> @@ +169,5 @@
> > +        if (mir_->isMod())
> > +            return mir_->toMod()->canBeDivideByZero();
> > +        return mir_->toDiv()->canBeDivideByZero();
> > +    }
> > +    bool canBeNegativeOverflow() const {
> 
> nit: Greedy copy&pasta issue?  Does it make sense to check for
> negative-overflow (INT64_MIN / -1) for unsigned operations?

Could be I was a bit greedy here. Gonna upload a new version tomorrow with this fixed.
Comment on attachment 8774317 [details] [diff] [review]
Part 8: Implement LCompareI64

Review of attachment 8774317 [details] [diff] [review]:
-----------------------------------------------------------------

Nice!
Attachment #8774317 - Flags: review?(nicolas.b.pierron) → review+
Attachment #8774329 - Flags: review?(nicolas.b.pierron) → review+
(In reply to Hannes Verschore [:h4writer] from comment #34)
> (In reply to Nicolas B. Pierron [:nbp] from comment #33)
> > Comment on attachment 8774316 [details] [diff] [review]
> > Part 7: Implement L(U)DivOrModI64
> > 
> > Review of attachment 8774316 [details] [diff] [review]:
> > -----------------------------------------------------------------
> > 
> > ::: js/src/jit/arm/Assembler-arm.cpp
> > @@ +978,5 @@
> > >      return Condition(ConditionInversionBit ^ cond);
> > >  }
> > >  
> > > +Assembler::Condition
> > > +Assembler::SignedCondition(Condition cond)
> > 
> > nit: This function does not seems to be used in this patch, remove it if it
> > is not used in any other patch either.
> 
> It is used in branch64

UnsignedCondition is used in branch64, but not SignedCondition.

> > 
> > ::: js/src/jit/arm/CodeGenerator-arm.cpp
> > @@ +3148,5 @@
> > > +    regs.take(lhs.low);
> > > +    regs.take(lhs.high);
> > > +    regs.take(rhs.low);
> > > +    regs.take(rhs.high);
> > > +    Register temp = regs.takeAny();
> > 
> > I would highly prefer to request a temp() in the lowering than using an
> > AllocatableGeneralRegisterSet here.
> 
> Since this is a LCallInstruction all inputs are usedAtStart. Using a temp()
> through LIR could give us one of the inputs and that will clobber the
> result. I could use ScratchRegister though if you prefer that.

Ok, then mention that in the comment with the AllocatableGeneralRegisterSet, and r=me.
The ScratchRegister is supposed to be a magical wand of the MacroAssembler.
Comment on attachment 8774316 [details] [diff] [review]
Part 7: Implement L(U)DivOrModI64

Review of attachment 8774316 [details] [diff] [review]:
-----------------------------------------------------------------

(see comment 36)
Attachment #8774316 - Flags: review+
Comment on attachment 8774318 [details] [diff] [review]
Part 9: Implement LShiftI64

Review of attachment 8774318 [details] [diff] [review]:
-----------------------------------------------------------------

Wow, this is tricky. Thanks.

::: js/src/jit/arm/CodeGenerator-arm.cpp
@@ +3287,5 @@
>      }
>  }
> +
> +void
> +CodeGeneratorARM::visitShiftI64(LShiftI64* lir)

Could this be hoisted to CodeGenerator, with dummy NYI impls for arm64 and others?

::: js/src/jit/arm/MacroAssembler-arm-inl.h
@@ +466,5 @@
> +        as_orr(dest.high, dest.high, lsr(dest.low, 32 - imm.value));
> +        as_mov(dest.low, lsl(dest.low, imm.value));
> +    } else {
> +        ma_lsl(Imm32(imm.value - 32), dest.low, dest.high);
> +        ma_mov(Imm32(0), dest.low);

For consistency with the above (for the same result; ma_lsl ends up doing this), can you use:

as_mov(dest.high, lsl(dest.low, imm.value - 32));
as_mov(Imm32(0), dest.low);

@@ +473,5 @@
> +
> +void
> +MacroAssembler::lshift64(Register shift, Register64 dest)
> +{
> +    ScratchRegisterScope scratch(*this);

Can you rename the parameter shift -> unmaskedShift, and scratch -> shift, please?

@@ +474,5 @@
> +void
> +MacroAssembler::lshift64(Register shift, Register64 dest)
> +{
> +    ScratchRegisterScope scratch(*this);
> +    ma_and(Imm32(0x3f), shift, scratch); // TODO: needed? I think we expect it to happen % 64

Yes, quoting https://github.com/WebAssembly/design/blob/master/AstSemantics.md#32-bit-integer-operators

Shifts counts are wrapped to be less than the log-base-2 of the number of bits in the value to be shifted, as an unsigned quantity. For example, in a 32-bit shift, only the least 5 significant bits of the count affect the result. In a 64-bit shift, only the least 6 significant bits of the count affect the result.

@@ +479,5 @@
> +    as_mov(dest.high, lsl(dest.high, scratch));
> +    ma_sub(scratch, Imm32(32), scratch);
> +    as_orr(dest.high, dest.high, lsl(dest.low, scratch));
> +    ma_neg(scratch, scratch);
> +    as_orr(dest.high, dest.high, lsr(dest.low, scratch));

Maybe precise in a comment that one of the two orr has no effect, because one of the lsl or lsr will return 0?

@@ +508,5 @@
> +        as_mov(dest.low, lsr(dest.low, imm.value));
> +        as_orr(dest.low, dest.low, lsl(dest.high, 32 - imm.value));
> +        as_mov(dest.high, asr(dest.high, imm.value));
> +    } else {
> +        ma_asr(Imm32(imm.value - 32), dest.high, dest.low);

For consistency, can you a as_mov with a asr Operand2 here, too, please?

@@ +518,5 @@
> +MacroAssembler::rshift64Arithmetic(Register shift, Register64 dest)
> +{
> +    Label proceed;
> +
> +    ScratchRegisterScope scratch(*this);

Same renaming suggestion as in lshift64.

@@ +526,5 @@
> +    as_orr(dest.low, dest.low, lsl(dest.high, scratch));
> +    ma_neg(scratch, scratch, SetCC);
> +    ma_b(&proceed, Signed);
> +
> +    as_orr(dest.low, dest.low, asr(dest.high, scratch));

Maybe add a comment above that this happens only if the shift value is > 32, and that dest.low is 0 in this case?

@@ +542,5 @@
> +        as_mov(dest.low, lsr(dest.low, imm.value));
> +        as_orr(dest.low, dest.low, lsl(dest.high, 32 - imm.value));
> +        as_mov(dest.high, lsr(dest.high, imm.value));
> +    } else {
> +        ma_lsr(Imm32(imm.value - 32), dest.high, dest.low);

Same remark as in the other functions.

@@ +550,5 @@
> +
> +void
> +MacroAssembler::rshift64(Register shift, Register64 dest)
> +{
> +    ScratchRegisterScope scratch(*this);

And same renaming suggestion here too.
Attachment #8774318 - Flags: review?(bbouvier) → review+
Comment on attachment 8774325 [details] [diff] [review]
Part 16: Implement LClzI64 and LCtzI64

Review of attachment 8774325 [details] [diff] [review]:
-----------------------------------------------------------------

Nice, thanks!

::: js/src/jit/arm/CodeGenerator-arm.cpp
@@ -1060,5 @@
>      Register output = ToRegister(ins->output());
> -    ScratchRegisterScope scratch(masm);
> -
> -    masm.ma_rsb(input, Imm32(0), scratch, SetCC);
> -    masm.ma_and(input, scratch, input);

(hmmm, clobbering input...)

::: js/src/jit/arm/MacroAssembler-arm-inl.h
@@ +223,5 @@
> +    ScratchRegisterScope scratch(*this);
> +    as_rsb(scratch, src, Imm8(0), SetCC);
> +    as_and(dest, src, O2Reg(scratch), LeaveCC);
> +    as_clz(dest, dest);
> +    as_rsb(dest, dest, Imm8(0x1F), LeaveCC, Assembler::NotEqual);

NotEqual means "not zero", here; so I think the condition at the end could be:

knownNotZero ? Always : NotEqual

but maybe that's premature optimization.

@@ +257,5 @@
> +    ScratchRegisterScope scratch(*this);
> +    ma_clz(src.high, scratch);
> +    ma_cmp(scratch, Imm32(32));
> +    ma_mov(scratch, dest, LeaveCC, NotEqual);
> +    ma_clz(src.low, dest, Equal);

ma_clz leaves CC by default too?
Attachment #8774325 - Flags: review?(bbouvier) → review+
Comment on attachment 8774326 [details] [diff] [review]
Part 17: Implement LNotI64

Review of attachment 8774326 [details] [diff] [review]:
-----------------------------------------------------------------

Sweet.

::: js/src/jit/arm/Assembler-arm.cpp
@@ +1287,5 @@
>      MOZ_ASSERT(1 <= amt && amt <= 32);
>      return O2RegImmShift(r, ASR, amt);
>  }
>  
>  

other pre-existing nit: two blank lines here
Attachment #8774326 - Flags: review?(bbouvier) → review+
Comment on attachment 8774311 [details] [diff] [review]
Part 3: Implement LWrapInt64ToInt32

Review of attachment 8774311 [details] [diff] [review]:
-----------------------------------------------------------------

Trivial, stealing.
Attachment #8774311 - Flags: review?(efaustbmo) → review+
Comment on attachment 8774312 [details] [diff] [review]
Part 4: Implement LExtendInt32ToInt64

Review of attachment 8774312 [details] [diff] [review]:
-----------------------------------------------------------------

Stealing (efaust might be busy at tc39). Looks good, thanks.

::: js/src/jit/arm/CodeGenerator-arm.cpp
@@ +3059,5 @@
> +        masm.ma_mov(Imm32(0), output.high);
> +    else
> +        masm.ma_asr(Imm32(31), output.low, output.high);
> +}
> +

nit: trailing blank line at EOF

::: js/src/jit/arm/Lowering-arm.cpp
@@ +850,5 @@
>  
>  void
>  LIRGeneratorARM::visitExtendInt32ToInt64(MExtendInt32ToInt64* ins)
>  {
> +    LExtendInt32ToInt64* lir =

Can use `auto* lir =` to fit on a single line.

@@ +858,5 @@
> +    LDefinition def(LDefinition::GENERAL, LDefinition::MUST_REUSE_INPUT);
> +    def.setReusedInput(0);
> +
> +    lir->setDef(0, def);
> +    lir->getDef(0)->setVirtualRegister(ins->virtualRegister());

Can you have

def.setVirtualRegister(ins->virtualRegister());

just after

def.setReusedInput(0)

and then remove this line, instead? This groups all the actions on `def` before using it.
Attachment #8774312 - Flags: review?(efaustbmo) → review+
Comment on attachment 8774322 [details] [diff] [review]
Part 13: Implement LAsmSelectI64

Review of attachment 8774322 [details] [diff] [review]:
-----------------------------------------------------------------

Looks legit, thanks.
Attachment #8774322 - Flags: review?(efaustbmo) → review+
Attachment #8774323 - Flags: review?(efaustbmo) → review+
Comment on attachment 8774307 [details] [diff] [review]
Part 1: Make ion ready for int64 on arm

Review of attachment 8774307 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/jit/arm/Assembler-arm.cpp
@@ +62,5 @@
> +        // when odd.
> +        intRegIndex_ = (intRegIndex_ + 1) & ~1;
> +        if (intRegIndex_ == NumIntArgRegs) {
> +            // Align the stack on 8 bytes.
> +            static const int align = sizeof(uint64_t) - 1;

Prefer "static const uint32_t align" to match type of stackOffset_.

@@ +122,5 @@
> +        // when odd.
> +        intRegIndex_ = (intRegIndex_ + 1) & ~1;
> +        if (intRegIndex_ == NumIntArgRegs) {
> +            // Align the stack on 8 bytes.
> +            static const int align = sizeof(uint64_t) - 1;

Ditto

::: js/src/jit/arm/Lowering-arm.cpp
@@ +146,5 @@
> +void
> +LIRGeneratorARM::defineInt64Phi(MPhi* phi, size_t lirIndex)
> +{
> +    LPhi* low = current->getPhi(lirIndex + INT64LOW_INDEX);
> +    LPhi* high = current->getPhi(lirIndex + INT64HIGH_INDEX);

INT64LOW_INDEX and INT64HIGH_INDEX aren't defined yet.
Attachment #8774307 - Flags: review?(sstangl) → review+
Attachment #8774309 - Flags: review?(sstangl) → review+
Comment on attachment 8774327 [details] [diff] [review]
Part 18: Implement LWasmTruncateToInt64

Review of attachment 8774327 [details] [diff] [review]:
-----------------------------------------------------------------

Stealing review, Dan is busy. Looks good, thanks.

::: js/src/asmjs/WasmTypes.cpp
@@ +196,5 @@
>      return x % y;
>  }
>  
> +static int64_t
> +WasmTruncateDoubleToInt64(double input)

I think we're already in wasm:: namespace, so no need for the prefixing here and below.

@@ +204,5 @@
> +    if (input >= INT64_MAX * 1.0)
> +        return 0x8000000000000000;
> +    if (input < INT64_MIN * 1.0)
> +        return 0x8000000000000000;
> +    return input;

Make the conversions explicit, please:

input >= double(INT64_MAX)

input <= double(INT64_MIN)

return int64_t(input);

@@ +216,5 @@
> +    if (input >= UINT64_MAX * 1.0)
> +        return 0x8000000000000000;
> +    if (input <= -1.0)
> +        return 0x8000000000000000;
> +    return input;

ditto

::: js/src/asmjs/WasmTypes.h
@@ +725,5 @@
>      UDivI64,
>      ModI64,
>      UModI64,
> +    WasmTruncateDoubleToInt64,
> +    WasmTruncateDoubleToUint64,

ditto

::: js/src/jit/IonTypes.h
@@ +758,5 @@
>  enum {
>      ArgType_General = 0x1,
>      ArgType_Double  = 0x2,
>      ArgType_Float32 = 0x3,
> +    ArgType_Int64 = 0x4,

Out of curiosity, why do these values not start with 0x0?

::: js/src/jit/arm/CodeGenerator-arm.cpp
@@ +2956,5 @@
>  
>  void
> +CodeGeneratorARM::visitWasmTruncateToInt64(LWasmTruncateToInt64* lir)
> +{
> +    // TODO: reuse output.high as temp.

Remove, or add a bug # in the comment for follow-up.

@@ +2982,5 @@
> +        masm.callWithABI(wasm::SymbolicAddress::WasmTruncateDoubleToUint64);
> +    else
> +        masm.callWithABI(wasm::SymbolicAddress::WasmTruncateDoubleToInt64);
> +
> +    masm.Pop(input);

As input uses useRegisterAtStart, is there a chance that input === ReturnReg64 and that popping here would clobber the result? If not, could you assert it at the top of this function, please?

@@ +2986,5 @@
> +    masm.Pop(input);
> +
> +    masm.ma_cmp(output.high, Imm32(0x80000000));
> +    masm.ma_cmp(output.low, Imm32(0x00000000), Assembler::Equal);
> +    masm.ma_b(ool->entry(), Assembler::Equal);

As discussed on irc, it's a bit weird that we use x86 error codes here. Wouldn't a real ARM device clamp the results anyway if the input is out of bounds?

I think another strategy would be to:
- allocate space on the stack for the int64 return value;
- make the callout return a bool indicating whether the conversion succeeded or not, and reporting the JS error if it did not
- here in the call stub, if the bool is false, just jump to wasm::Throw
- otherwise copy the result from stack to register

This way we wouldn't need the OOL at all. What do you think? (can be done in a follow-up bug, though)

@@ +3018,5 @@
> +    // By default test for the following inputs and bail:
> +    // signed:   ] -Inf, INTXX_MIN - 1.0 ] and [ INTXX_MAX + 1.0 : +Inf [
> +    // unsigned: ] -Inf, -1.0 ] and [ UINTXX_MAX + 1.0 : +Inf [
> +    // Note: we cannot always represent those exact values. As a result
> +    // this changes the actual comparison a bit.

Look at what x86-shared does, it's slightly less complicated (not having different conditions, more specialized if bodies).

::: js/src/jit/arm/LIR-arm.h
@@ +559,5 @@
>          return mir_->toAsmJSAtomicBinopHeap();
>      }
>  };
>  
> +class LWasmTruncateToInt64 : public LCallInstructionHelper<INT64_PIECES, 1, 0>

Could we name it LWasmTruncateToInt64Callout, to make it clear that it's a call?

::: js/src/jit/arm/MacroAssembler-arm.cpp
@@ +4694,5 @@
>  void
> +MacroAssembler::Pop(FloatRegister reg)
> +{
> +    VFPRegister r = VFPRegister(reg);
> +    ma_vpop(VFPRegister(reg));

ma_vpop(r);
Attachment #8774327 - Flags: review?(sunfish) → review+
Comment on attachment 8774328 [details] [diff] [review]
Part 19: Implement LInt64ToFloatingPoint

Review of attachment 8774328 [details] [diff] [review]:
-----------------------------------------------------------------

Thanks!

::: js/src/asmjs/WasmTypes.cpp
@@ +223,5 @@
> +static double
> +Int64ToFloatingPoint(int32_t x_hi, uint32_t x_lo)
> +{
> +    int64_t x = ((uint64_t)x_hi << 32) + x_lo;
> +    return x;

Too many implicit conversions here. How about:

- making x_hi an uint32 too

int64_t x = int64_t((uint64_t(x_hi) << 32)) + int64_t(x_hlo);

or

int64_t x = int64_t((uint64_t(x_hi) << 32) + uint64_t(x_hlo));

And at the end

return double(x);

(I wonder if technically, it might be undefined behavior to do this conversion...)

@@ +230,5 @@
> +static double
> +Uint64ToFloatingPoint(int32_t x_hi, uint32_t x_lo)
> +{
> +    uint64_t x = ((uint64_t)x_hi << 32) + x_lo;
> +    return x;

ditto

::: js/src/jit/arm/LIR-arm.h
@@ +574,5 @@
>          return mir_->toWasmTruncateToInt64();
>      }
>  };
>  
> +class LInt64ToFloatingPoint: public LCallInstructionHelper<1, INT64_PIECES, 0>

Can you put "Callout" in the name, please?
Attachment #8774328 - Flags: review?(sunfish) → review+
(Assignee)

Comment 47

a year ago
Thanks everybody for reviewing so quickly!!!
(Assignee)

Comment 48

a year ago
(In reply to Sean Stangl [:sstangl] from comment #44)
> Comment on attachment 8774307 [details] [diff] [review]
> Part 1: Make ion ready for int64 on arm
> ::: js/src/jit/arm/Lowering-arm.cpp
> @@ +146,5 @@
> > +void
> > +LIRGeneratorARM::defineInt64Phi(MPhi* phi, size_t lirIndex)
> > +{
> > +    LPhi* low = current->getPhi(lirIndex + INT64LOW_INDEX);
> > +    LPhi* high = current->getPhi(lirIndex + INT64HIGH_INDEX);
> 
> INT64LOW_INDEX and INT64HIGH_INDEX aren't defined yet.

It is defined in bug 1279248, which implements this for x86
(Assignee)

Comment 49

a year ago
Created attachment 8775591 [details] [diff] [review]
Part 23: Implement LWasmLoadGlobalVarI64 and LWasmStoreGlobalVarI64
Attachment #8775591 - Flags: review?(bbouvier)
Comment on attachment 8775591 [details] [diff] [review]
Part 23: Implement LWasmLoadGlobalVarI64 and LWasmStoreGlobalVarI64

Review of attachment 8775591 [details] [diff] [review]:
-----------------------------------------------------------------

Thanks!
Attachment #8775591 - Flags: review?(bbouvier) → review+
(Assignee)

Comment 51

a year ago
Created attachment 8775612 [details] [diff] [review]
Part 24: Implement LWasmLoadI64 and LWasmStoreI64
Attachment #8775612 - Flags: review?(bbouvier)
Comment on attachment 8775612 [details] [diff] [review]
Part 24: Implement LWasmLoadI64 and LWasmStoreI64

Review of attachment 8775612 [details] [diff] [review]:
-----------------------------------------------------------------

Thanks!

::: js/src/jit/arm/CodeGenerator-arm.cpp
@@ +2375,5 @@
> +        if (type == Scalar::Int64) {
> +            MOZ_ASSERT(INT64LOW_OFFSET == 0);
> +            masm.ma_dataTransferN(IsLoad, 32, /* signed = */ false, HeapReg, ptr, output.low);
> +            masm.ma_add(Imm32(INT64HIGH_OFFSET), ptr);
> +            masm.ma_dataTransferN(IsLoad, 32, isSigned, HeapReg, ptr, output.high);

Note (no change needed): ma_dataTransferN may apparently handle 64 bits loads and stores, but I wonder how this is supposed to work with 32-bits registers...

@@ +2387,4 @@
>      } else {
> +        AnyRegister output = ToAnyRegister(lir->output());
> +        bool isFloat = output.isFloat();
> +        if (isFloat) {

pre-existing: you can inline output.isFloat() here

@@ +2425,5 @@
>          return;
>      }
>  
>      Register ptr = ToRegister(lir->ptr());
> +    unsigned byteSize = mir->byteSize();

Can you keep byteSize closer to its uses, please?

@@ +2443,5 @@
> +
> +        Register64 value = ToRegister64(lir->getInt64Operand(lir->ValueIndex));
> +        masm.ma_dataTransferN(IsStore, 32 /* bits */, /* signed */ false, HeapReg, ptr, value.low);
> +        masm.ma_add(Imm32(INT64HIGH_OFFSET), ptr);
> +        masm.ma_dataTransferN(IsStore, 32 /* bits */, /* signed */ true, HeapReg, ptr, value.high);

Couldn't the second one also be an unsigned store? sign is not taken into account in ma_dataTransferN when the size is 32, anyways.
Attachment #8775612 - Flags: review?(bbouvier) → review+
(Assignee)

Comment 53

a year ago
https://treeherder.mozilla.org/#/jobs?repo=try&revision=fb1525e83067

It looks like we are getting close, though there is an issue. Seems like the regalloc doesn't know how to handle two "reuseinput" on one definition properly. The issue presented itself only on arm currently, but I think it is also present on x86.

It happens when:

[MoveGroup] [stack:12 -> r3] [stack:8 -> r2]
[22,23 Foo] [def v13<g>:r3] [def v14<g>:r2] [use v4:r r3] [use v5:r r2]

we need to switch the registers in the movegroup. E.g. in this case we need to add [r2 -> r3] and [r3 -> r2] to flip the registers. Now the logic isn't smart enough to handle this correctly and makes:

[MoveGroup] [stack:8 -> r3] [stack:8 -> r2]

out of this. Giving allocation integrity errors and off course wrong results.

Given this is regalloc it might be hard to fix this quickly. I'm doubting if for the landing I should quickly replace all reuseInput with a move and open a follow-up bug to fix it. That should solve it quickly with a little bit worse codegen.
(Assignee)

Comment 54

a year ago
Created attachment 8775965 [details] [diff] [review]
Part 25: Temporarily don't use defineInt64ReuseInput

Like mentioned in the previous comment:
defineInt64ReuseInput isn't playing nicely with the register allocator. Seems it doesn't support more than one reuseInput yet. This fixes it by not using defineInt64ReuseInput for now. I'll open a follow-up bug to enable it again.
Attachment #8775965 - Flags: review?(bbouvier)
(Assignee)

Updated

a year ago
Blocks: 1290456
Comment on attachment 8775965 [details] [diff] [review]
Part 25: Temporarily don't use defineInt64ReuseInput

Review of attachment 8775965 [details] [diff] [review]:
-----------------------------------------------------------------

It seems okay to land this for unblocking shipping, and take care of it later. Thanks!

::: js/src/jit/arm/CodeGenerator-arm.cpp
@@ +298,5 @@
>      const LInt64Allocation lhs = lir->getInt64Operand(LAddI64::Lhs);
>      const LInt64Allocation rhs = lir->getInt64Operand(LAddI64::Rhs);
> +    Register64 out = ToOutRegister64(lir);
> +
> +    masm.move64(ToRegister64(lhs), out);

Can you do this move only if ToReg64(lhs) != out? (and for all other instances in this patch)

::: js/src/jit/arm/Lowering-arm.cpp
@@ +195,5 @@
>  void
>  LIRGeneratorARM::lowerForALUInt64(LInstructionHelper<INT64_PIECES, 2 * INT64_PIECES, 0>* ins,
>                                    MDefinition* mir, MDefinition* lhs, MDefinition* rhs)
>  {
> +    ins->setInt64Operand(0, useInt64Register(lhs));

This could still be useRegisterAtStart?

@@ +214,5 @@
>          if (int64_t(1) << shift == constant)
>              constantNeedTemp = false;
>      }
>  
> +    ins->setInt64Operand(0, useInt64Register(lhs));

Same question.

@@ +295,5 @@
>  {
>      if (mir->isRotate() && !rhs->isConstant())
>          ins->setTemp(0, temp());
>  
> +    ins->setInt64Operand(0, useInt64Register(lhs));

And here too.

@@ +508,5 @@
>  void
>  LIRGeneratorARM::visitAsmSelect(MAsmSelect* ins)
>  {
>      if (ins->type() == MIRType::Int64) {
> +        auto* lir = new(alloc()) LAsmSelectI64(useInt64Register(ins->trueExpr()),

And here too.

::: js/src/jit/x64/Lowering-x64.cpp
@@ +72,5 @@
> +LIRGeneratorX64::lowerForShiftInt64(LInstructionHelper<INT64_PIECES, INT64_PIECES + 1, Temps>* ins,
> +                                    MDefinition* mir, MDefinition* lhs, MDefinition* rhs)
> +{
> +    ins->setInt64Operand(0, useInt64RegisterAtStart(lhs));
> +#if defined(JS_NUNBOX32)

I think this can't happen on x64.

@@ +92,5 @@
> +        use.setVirtualRegister(rhs->virtualRegister());
> +        ins->setOperand(INT64_PIECES, use);
> +    }
> +
> +    defineInt64ReuseInput(ins, mir, 0);

I thought defineInt64ReuseInput was removed? (ok to move it back in this file)

@@ +114,5 @@
> +    auto* lir = new(alloc()) LAsmSelectI64(useInt64RegisterAtStart(ins->trueExpr()),
> +                                           useInt64(ins->falseExpr()),
> +                                           useRegister(ins->condExpr())
> +                                          );
> +    defineInt64ReuseInput(lir, ins, LAsmSelectI64::TrueExprIndex);

ditto

::: js/src/jit/x86/CodeGenerator-x86.cpp
@@ +1516,4 @@
>      Register64 falseExpr = ToRegister64(lir->falseExpr());
>      Register64 out = ToOutRegister64(lir);
>  
> +    masm.move64(trueExpr, out);

Can you do this only if trueExpr != out?

::: js/src/jit/x86/Lowering-x86.cpp
@@ +201,5 @@
>  void
>  LIRGeneratorX86::lowerForALUInt64(LInstructionHelper<INT64_PIECES, 2 * INT64_PIECES, 0>* ins,
>                                    MDefinition* mir, MDefinition* lhs, MDefinition* rhs)
>  {
> +    ins->setInt64Operand(0, useInt64Register(lhs));

Can it stay atStart?

@@ +238,5 @@
> +{
> +    ins->setInt64Operand(0, useInt64Register(lhs));
> +#if defined(JS_NUNBOX32)
> +    if (mir->isRotate())
> +        ins->setTemp(0, temp());

This is gross, it should go under LIRGeneratorX86::visitRotate, instead.

@@ +248,5 @@
> +        ins->setOperand(INT64_PIECES, useOrConstant(rhs));
> +    } else {
> +        // The operands are int64, but we only care about the lower 32 bits of
> +        // the RHS. On 32-bit, the code below will load that part in ecx and
> +        // will discard the upper half.

Can you remove "On 32-bit," in this sentence?

@@ +274,5 @@
> +        lowerAsmSelect(ins);
> +        return;
> +    }
> +
> +    auto* lir = new(alloc()) LAsmSelectI64(useInt64Register(ins->trueExpr()),

Can it stay atStart?
Attachment #8775965 - Flags: review?(bbouvier) → review+
(Assignee)

Comment 56

a year ago
Created attachment 8775970 [details] [diff] [review]
Part 25: Temporarily don't use defineInt64ReuseInput
Attachment #8775965 - Attachment is obsolete: true
Attachment #8775970 - Flags: review?(bbouvier)
Comment on attachment 8775970 [details] [diff] [review]
Part 25: Temporarily don't use defineInt64ReuseInput

Review of attachment 8775970 [details] [diff] [review]:
-----------------------------------------------------------------

Haha, review conflict! Carrying forward r+ with addressed comments from the previous review.

(fwiw, interdiff looks good https://bugzilla.mozilla.org/attachment.cgi?oldid=8775965&action=interdiff&newid=8775970&headers=1)
Attachment #8775970 - Flags: review?(bbouvier) → review+
(Assignee)

Comment 58

a year ago
(In reply to Benjamin Bouvier [:bbouvier] from comment #55)
> Comment on attachment 8775965 [details] [diff] [review]
> Part 25: Temporarily don't use defineInt64ReuseInput
> 
> Review of attachment 8775965 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> It seems okay to land this for unblocking shipping, and take care of it
> later. Thanks!
> 
> ::: js/src/jit/arm/CodeGenerator-arm.cpp
> @@ +298,5 @@
> >      const LInt64Allocation lhs = lir->getInt64Operand(LAddI64::Lhs);
> >      const LInt64Allocation rhs = lir->getInt64Operand(LAddI64::Rhs);
> > +    Register64 out = ToOutRegister64(lir);
> > +
> > +    masm.move64(ToRegister64(lhs), out);
> 
> Can you do this move only if ToReg64(lhs) != out? (and for all other
> instances in this patch)

Not needed, since we don't use AtStart register they cannot overlap. So we have to copy them!

> ::: js/src/jit/arm/Lowering-arm.cpp
> @@ +195,5 @@
> >  void
> >  LIRGeneratorARM::lowerForALUInt64(LInstructionHelper<INT64_PIECES, 2 * INT64_PIECES, 0>* ins,
> >                                    MDefinition* mir, MDefinition* lhs, MDefinition* rhs)
> >  {
> > +    ins->setInt64Operand(0, useInt64Register(lhs));
> 
> This could still be useRegisterAtStart?

Here our register allocator is the problem (which is a bug). We cannot have mixed AtStart and NotAtStart definitions.

> ::: js/src/jit/x64/Lowering-x64.cpp
> @@ +72,5 @@
> > +LIRGeneratorX64::lowerForShiftInt64(LInstructionHelper<INT64_PIECES, INT64_PIECES + 1, Temps>* ins,
> > +                                    MDefinition* mir, MDefinition* lhs, MDefinition* rhs)
> > +{
> > +    ins->setInt64Operand(0, useInt64RegisterAtStart(lhs));
> > +#if defined(JS_NUNBOX32)
> 
> I think this can't happen on x64.

correct removing

> 
> @@ +92,5 @@
> > +        use.setVirtualRegister(rhs->virtualRegister());
> > +        ins->setOperand(INT64_PIECES, use);
> > +    }
> > +
> > +    defineInt64ReuseInput(ins, mir, 0);
> 
> I thought defineInt64ReuseInput was removed? (ok to move it back in this
> file)

Yeah I was almost going to move it over to x64 only. But fixed with next patch ;).
 
> @@ +238,5 @@
> > +{
> > +    ins->setInt64Operand(0, useInt64Register(lhs));
> > +#if defined(JS_NUNBOX32)
> > +    if (mir->isRotate())
> > +        ins->setTemp(0, temp());
> 
> This is gross, it should go under LIRGeneratorX86::visitRotate, instead.

I removed the JS_NUNBOX32, since we are now on x86 only. But the rest is quite similar, therefore keeping.

> @@ +248,5 @@
> > +        ins->setOperand(INT64_PIECES, useOrConstant(rhs));
> > +    } else {
> > +        // The operands are int64, but we only care about the lower 32 bits of
> > +        // the RHS. On 32-bit, the code below will load that part in ecx and
> > +        // will discard the upper half.
> 
> Can you remove "On 32-bit," in this sentence?

Yes

Comment 59

a year ago
Pushed by hv1989@gmail.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/c98e305712c8
Part 1: Preparations in IonMonkey to support i64 on arm, r=sstangl
https://hg.mozilla.org/integration/mozilla-inbound/rev/8b30ccb87d0a
Part 2: Implement the 64bit variant of AsmJSParameter on arm, r=sstangl
https://hg.mozilla.org/integration/mozilla-inbound/rev/14fb049c6eab
Part 3: Implement the 64bit variant of WrapInt64ToInt32 on arm, r=bbouvier
https://hg.mozilla.org/integration/mozilla-inbound/rev/38c4a72c8804
Part 4: Implement the 64bit variant of Extend on arm, r=bbouvier
https://hg.mozilla.org/integration/mozilla-inbound/rev/e11ffc9f839c
Part 5: Implement the 64bit variant of Add and Sub on arm, r=bbouvier
https://hg.mozilla.org/integration/mozilla-inbound/rev/ea5752e51caf
Part 6: Implement the 64bit variant of Mul on arm, r=nbp
https://hg.mozilla.org/integration/mozilla-inbound/rev/48ed814c72c6
Part 7: Implement the 64bit variant of Div and Mod on arm, r=nbp
https://hg.mozilla.org/integration/mozilla-inbound/rev/df8f35c18584
Part 8: Implement the 64bit variant of Compare on arm, r=nbp
https://hg.mozilla.org/integration/mozilla-inbound/rev/65981e46881a
Part 9: Implement the 64bit variant of Shift on arm, r=bbouvier
https://hg.mozilla.org/integration/mozilla-inbound/rev/d2e991415189
Part 10: Implement the 64bit variant of BitOp on arm, r=bbouvier
https://hg.mozilla.org/integration/mozilla-inbound/rev/9ae4ad38a4b3
Part 11: Implement the 64bit variant of Rotate on arm, r=luke
https://hg.mozilla.org/integration/mozilla-inbound/rev/6e959f39e82f
Part 12: Implement the 64bit variant of AsmJSPassStackArg on arm, r=luke
https://hg.mozilla.org/integration/mozilla-inbound/rev/8f7fa045bfba
Part 13: Implement the 64bit variant of AsmSelect on arm, r=bbouvier
https://hg.mozilla.org/integration/mozilla-inbound/rev/f5f53c099276
Part 14: Implement the 64bit variant of AsmReinterpret on arm, r=bbouvier
https://hg.mozilla.org/integration/mozilla-inbound/rev/4c042c13a73a
Part 15: Implement the 64bit variant of PopCnt on arm, r=lth
https://hg.mozilla.org/integration/mozilla-inbound/rev/973d29183f5f
Part 16: Implement the 64bit variant of Clz and Ctz on arm, r=bbouvier
https://hg.mozilla.org/integration/mozilla-inbound/rev/2f7ac340ea3c
Part 17: Implement the 64bit variant of Not on arm, r=bbouvier
https://hg.mozilla.org/integration/mozilla-inbound/rev/48d273fefe73
Part 18: Implement the 64bit variant of WasmTruncate on arm, r=bbouvier
https://hg.mozilla.org/integration/mozilla-inbound/rev/ce45ee774edb
Part 19: Implement the 64bit variant of Int64ToFloatingPoint on arm, r=bbouvier
https://hg.mozilla.org/integration/mozilla-inbound/rev/7edcd686d163
Part 20: Implement the 64bit variant of Test on arm, r=nbp
https://hg.mozilla.org/integration/mozilla-inbound/rev/33dfbd1645d9
Part 21: Unrelated cleanup, r=bbouvier
https://hg.mozilla.org/integration/mozilla-inbound/rev/1f97f6942a84
Part 22: Implement the 64bit variant of WasmLoadGlobalVar and WasmStoreGlobalVar on arm, r=bbouvier
https://hg.mozilla.org/integration/mozilla-inbound/rev/1449099f1906
Part 23: Implement the 64bit variant of WasmLoad and WasmStore on arm, r=bbouvier
https://hg.mozilla.org/integration/mozilla-inbound/rev/0f0fe678a40a
Part 24: Make WebAssembly ready to run 64bit instructions on arm, r=luke
https://hg.mozilla.org/integration/mozilla-inbound/rev/5b6fd86e965e
Part 25: Don't reuse input during lowering for int 64 values on 32 bit platforms, r=bbouvier

Comment 60

a year ago
bugherder
https://hg.mozilla.org/mozilla-central/rev/c98e305712c8
https://hg.mozilla.org/mozilla-central/rev/8b30ccb87d0a
https://hg.mozilla.org/mozilla-central/rev/14fb049c6eab
https://hg.mozilla.org/mozilla-central/rev/38c4a72c8804
https://hg.mozilla.org/mozilla-central/rev/e11ffc9f839c
https://hg.mozilla.org/mozilla-central/rev/ea5752e51caf
https://hg.mozilla.org/mozilla-central/rev/48ed814c72c6
https://hg.mozilla.org/mozilla-central/rev/df8f35c18584
https://hg.mozilla.org/mozilla-central/rev/65981e46881a
https://hg.mozilla.org/mozilla-central/rev/d2e991415189
https://hg.mozilla.org/mozilla-central/rev/9ae4ad38a4b3
https://hg.mozilla.org/mozilla-central/rev/6e959f39e82f
https://hg.mozilla.org/mozilla-central/rev/8f7fa045bfba
https://hg.mozilla.org/mozilla-central/rev/f5f53c099276
https://hg.mozilla.org/mozilla-central/rev/4c042c13a73a
https://hg.mozilla.org/mozilla-central/rev/973d29183f5f
https://hg.mozilla.org/mozilla-central/rev/2f7ac340ea3c
https://hg.mozilla.org/mozilla-central/rev/48d273fefe73
https://hg.mozilla.org/mozilla-central/rev/ce45ee774edb
https://hg.mozilla.org/mozilla-central/rev/7edcd686d163
https://hg.mozilla.org/mozilla-central/rev/33dfbd1645d9
https://hg.mozilla.org/mozilla-central/rev/1f97f6942a84
https://hg.mozilla.org/mozilla-central/rev/1449099f1906
https://hg.mozilla.org/mozilla-central/rev/0f0fe678a40a
https://hg.mozilla.org/mozilla-central/rev/5b6fd86e965e
Status: NEW → RESOLVED
Last Resolved: a year ago
status-firefox50: --- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla50

Comment 61

a year ago
Pushed by hv1989@gmail.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/97ae43d193b1
Backout revision 5b6fd86e965e (part 25), r=backout

Comment 62

a year ago
bugherder
https://hg.mozilla.org/mozilla-central/rev/97ae43d193b1

Updated

a year ago
Blocks: 1277011
You need to log in before you can comment on or make changes to this bug.