Closed Bug 745362 Opened 12 years ago Closed 12 years ago

IonMonkey: Math.floor() inlining incorrectly handles negatives.

Categories

(Core :: JavaScript Engine, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED

People

(Reporter: sstangl, Assigned: sstangl)

References

Details

Attachments

(5 files)

The current implementation of visitRound() converts negative doubles to positive, rounds down, and then negates the integer result.

This differs from JS semantics, where rounding down is always toward the negative direction. For example, floor(-5.5) == -6.
Correct behavior using ROUNDSD. Non-SSE4.1 systems handle negative inputs in a slow way, but still faster than using callVM(). Math.round() is still incorrect, but that can be left to another patch.

ARM support missing. Marty said he would work on it. v8 has a very nice implementation that we should use for reference, closely resembling the x64-with-SSE4.1 code.
Attachment #615967 - Flags: review?(nicolas.b.pierron)
Adding Marty to CC list. This patch should not land without ARM support.
Comment on attachment 615967 [details] [diff] [review]
Implement Math.floor() for x86/x64.

Review of attachment 615967 [details] [diff] [review]:
-----------------------------------------------------------------

I do not understand why you split Floor and Round to have 2 different MIR / LIR and Codegen where round is clearly defined in function of floor as indicated by the comment left in round.  By the way, round can also be express with the SSE4.1 instruction by using the rounding mode.

Please provide ARM parity for floor implementation.
Please fix Math.round at the same time instead of duplicating code of Math.floor and adding more MIR/LIR/… where this is not necessary.

Is there a flag to test without SSE4.1 optimizations ?

To do a comparison test between the interpreter and IonMonkey, you can:

function interp_round(j) {
  try{}catch(x){}
  return Math.round(j)
}
function ion_round(j) {
  assertEq(Math.round(j), interp_round(j));
}

function interp_floor(j) {
  try{}catch(x){}
  return Math.floor(j)
}
function ion_floor(j) {
  assertEq(Math.floor(j), interp_floor(j));
}

for (var i = 0; i < 1000; i++) {
  j = i / 4 - 1000 / 8;
  ion_round(j);
  ion_floor(j);
}
ion_round(-0);
ion_floor(-0);
ion_round(0);
ion_floor(0);

::: js/src/ion/IonBuilder.cpp
@@ +129,5 @@
>  IonBuilder::getSingleCallTarget(uint32 argc, jsbytecode *pc)
>  {
>      types::TypeSet *calleeTypes = oracle->getCallTarget(script, argc, pc);
> +    if (!calleeTypes)
> +        return NULL;

is that case possible?  Shouldn't have we check before that we have type information?

@@ +2503,5 @@
>  
>      jsid protoid = ATOM_TO_JSID(cx->runtime->atomState.classPrototypeAtom);
>      types::TypeSet *protoTypes = target->getType(cx)->getProperty(cx, protoid, false);
> +    if (!protoTypes)
> +        return NULL;

idem.
Attachment #615967 - Flags: review?(nicolas.b.pierron)
No. Math.floor() and Math.round() need separate implementations if we are going to handle negative numbers, as this patch does. JM does not handle negatives; v8 does handle them (at least with SSE4.1), using an equivalent division between floor/round as in this patch, which is much smarter. The existing code was broken, and this patch attempts to fix it -- suggesting that I do the same thing as the broken code will not solve the problem.

Math.round() absolutely cannot be implemented with SSE4.1's ROUNDSD. The closest rounding mode for ROUNDSD is Round-To-Nearest, but JS has different behavior for negative numbers that fall directly on a half. For example, Math.round(-1.5) == -1, whereas ROUNDSD will yield -2. There is a graceful way to handle this already implemented in v8, but it diverges significantly from the Math.floor() implementation. That is why they are separate instructions.

To restate: these functions can only be implemented in terms of each other for positive integers, and we want to handle negatives.

I suggest re-reviewing this patch with regards to the correctness of the generated assembly. (The NULL checks on the TypeSets can happen; I added those checks because they were missing and caused failures.)
Attachment #615967 - Flags: review?(nicolas.b.pierron)
Comment on attachment 615967 [details] [diff] [review]
Implement Math.floor() for x86/x64.

Review of attachment 615967 [details] [diff] [review]:
-----------------------------------------------------------------

Good.
Implement the ARM part before landing.

::: js/src/ion/arm/CodeGenerator-arm.cpp
@@ +898,5 @@
>  bool
> +CodeGeneratorARM::visitFloor(LFloor *lir)
> +{
> +    JS_NOT_REACHED("ARM floor() unimplemented.");
> +    return false;

No ARMing, no landing …

::: js/src/ion/shared/CodeGenerator-x86-shared.cpp
@@ +941,3 @@
>  
> +        // Round toward -Infinity.
> +        masm.roundsd(input, scratch, JSC::X86Assembler::RoundDown);

I am not a big fan of this namespace inside IonMonkey files, except inside Assembler-*.h

@@ +946,5 @@
> +        masm.cmp32(output, Imm32(INT_MIN));
> +        if (!bailoutIf(Assembler::Equal, lir->snapshot()))
> +            return false;
> +    } else {
> +        Label negative, end;

Please document this function in a readable way: (monospace) the text only is not clear enough, ASCII art sounds better ;)

    -2 -1.5 -1 -0.5 +0 +0.5 +1 +1.5 +2
i = ----------------------------------
if i >= 0:
  if i == -0: bailout
  o = truncate d
  o == xxxxxxxxxxxxxxxxxo<-----{o<-----{o
else:
  o = truncate d
  o == o}----->o}-----> xxxxxxxxxxxxxxxx
  if d != double(o):
    o -= 1
    o ==  <-----{ <-----{ xxxxxxxxxxxxxxxx
  else:
    o == o       o        xxxxxxxxxxxxxxxx
  o == o<-----{o<-----{ xxxxxxxxxxxxxxxx
o == o<-----{o<-----{/<-----{o<-----{

@@ +953,5 @@
> +        masm.xorpd(scratch, scratch);
> +        masm.branchDouble(Assembler::DoubleLessThan, input, scratch, &negative);
> +
> +        // Bail on negative-zero.
> +        Assembler::Condition bailCond = masm.testNegativeZero(input, output);

I don't know how frequent positive numbers are compared to negative numbers, may be you should move this case to the other branch of the if statement, making the negative part really slower and improving a bit for positive values.

But it does not matter much because we can hope people are using sse4.1.
Attachment #615967 - Flags: review?(nicolas.b.pierron) → review+
Same deal but for Math.round(). The non-negative paths are disjoint; the negative paths are similar.
Attachment #618858 - Flags: review?(nicolas.b.pierron)
ecma/Date is failing too, confirmed fix by the first patch. What are the plans for an ARM fix?
I just tested your code, and you are compatible with the interpreter, but this case exists:

js> var x = 0.49999999999999997;
js> x
0.49999999999999994
js> Math.round(x);
1
Implementation of floor that does not set the fpcsr, I'll need to try comparing this method with setting and unsetting fpcsr (probably in a micro-benchmark).
Attachment #620671 - Flags: review?(Jacob.Bramley)
Comment on attachment 620671 [details] [diff] [review]
Implement Math.floor() for ARM

Review of attachment 620671 [details] [diff] [review]:
-----------------------------------------------------------------

It looks right to me. (That is, I can't think of any nasty corner cases that this would fail.)

::: js/src/ion/arm/MacroAssembler-arm.cpp
@@ +2665,5 @@
> +{
> +    Label handleZero;
> +    Label handleNeg;
> +    Label fin;
> +    compareDouble(input, InvalidFloatReg);

Unless I'm mistaken, compareDouble doesn't execute the 'vmov' (or 'fmrs' in old speak) required to actually get the comparison result into the APSR (/CPSR) flags.

@@ +2677,5 @@
> +    // then a value that rounds to INT_MAX is explicitly different from an
> +    // argument that clamps to INT_MAX
> +    ma_vcvt_F64_U32(input, ScratchFloatReg);
> +    ma_vxfer(VFPRegister(ScratchFloatReg).uintOverlay(), output);
> +    ma_mov(output, output, SetCond);

A compare-with-zero is more typical. It might be faster, too: A movs has an output register and doesn't set all of the flags.

@@ +2680,5 @@
> +    ma_vxfer(VFPRegister(ScratchFloatReg).uintOverlay(), output);
> +    ma_mov(output, output, SetCond);
> +    ma_b(bail, Signed);
> +    ma_b(&fin);
> +    bind(&handleZero);

Put the blank line on the other side of the bind, so the label is grouped with the code it belongs to.

@@ +2688,5 @@
> +    as_vxfer(output, InvalidReg, input, FloatToCore, Always, 1);
> +    ma_mov(output, output, SetCond);
> +    ma_b(bail, NonZero);
> +    ma_b(&fin);
> +    bind(&handleNeg);

Ditto.

@@ +2703,5 @@
> +    ma_vneg(input, input);
> +    // If the result looks positive, then this value didn't actually fit into
> +    // the int range, and special handling is required.
> +    ma_b(bail, Unsigned);
> +    bind(&fin);

Cunning! I like how it cleverly avoids the boundary condition with the add because MIN_INT < -MAX_INT. (This could do with a comment, though.)
Attachment #620671 - Flags: review?(Jacob.Bramley) → review+
Attached patch nits fixedSplinter Review
sstargl, you can check this patch in along with the x86 bits.
There are a couple of minor changes left over from my first attempt at implementing round(), they should be harmless.
Attachment #621171 - Flags: review?(sstangl)
Comment on attachment 621171 [details] [diff] [review]
Implement Math.round on ARM

Review of attachment 621171 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/ion/arm/MacroAssembler-arm.cpp
@@ +2728,5 @@
> +    // Do a compare based on the original value, then do most other things based on the
> +    // shifted value.
> +    ma_vcmpz(input);
> +    // Adding 0.5 is technically incorrect!
> +    // We want to add 0.5 to negative numbers, and 0.49999999999999999 to positive numbers.

This isn't actually done, is it?
Attachment #621171 - Flags: review?(sstangl) → review+
Comment on attachment 618858 [details] [diff] [review]
Implement Math.round() for x86/x64.

Review of attachment 618858 [details] [diff] [review]:
-----------------------------------------------------------------

Apply suggestions and factor with LFloor.
Add a test case which ensure that all test cases of jaeger/inline/math{Round,Floor}.js are tested with LRound and LFloor.

::: js/src/ion/shared/CodeGenerator-x86-shared.cpp
@@ +1010,1 @@
>  

nit: avoid 2 blank lines.

@@ +1032,5 @@
> +    masm.bind(&negative);
> +
> +    if (AssemblerX86Shared::HasSSE41()) {
> +        // Add 0.5 and round toward -Infinity.
> +        masm.addsd(pointFive, input);

Factor these 2 lines outside the if.

@@ +1068,5 @@
> +            // Input is not integer-valued, so we rounded off-by-one in the
> +            // wrong direction. Correct by subtraction.
> +            masm.subl(Imm32(1), output);
> +            if (!bailoutIf(Assembler::Overflow, lir->snapshot()))
> +                return false;

output is not INT_MIN, so the overflow flag would never be triggered.

@@ +1075,5 @@
> +        }
> +
> +        masm.bind(&testZero);
> +        if (!bailoutIf(Assembler::Zero, lir->snapshot()))
> +            return false;

Factor this condition outside of this if-else statement.

// The last operation set the output to its final result, except that
// for negative numbers near 0, the actual result is -0 but it cannot be
// encoded with an Int.
masm.bind(&negativeBailOnZero);
if (!bailoutIf(Assembler::Zero, lir->snapshot()))
    return false;

By the way, checking the sign bit will work too as we only expect negative numbers to come out of this branch.
Attachment #618858 - Flags: review?(nicolas.b.pierron)
Comment on attachment 618858 [details] [diff] [review]
Implement Math.round() for x86/x64.

Review of attachment 618858 [details] [diff] [review]:
-----------------------------------------------------------------

After a discussion with Sean, we cannot factor the code generation at the LIR level because LRound is using an extra temporary not used by LFloor.  The code on the borders of the if-else statements are kept inside to improve readability of each action, to keep monolithic blocks of code.

So this looks good, after the nit on the removal of the overflow check.

For the test case, you should ensure that LFloor and LRound are executed with all cases, one way of doing is using 2 loops and make sure that the bailout of a test do not cause the next test to be executed outside the IonCode:


var roundDTests = [
  {i: -0, o: -0}
];
var roundITests = [
  {i: 4, o: 4}
  {i: 0, o: 0}
];

// Only compile these functions which are well typed.
function roundDfun(i) { return Math.round(i); }
function roundIfun(i) { return Math.round(i); }

// Do not compile the outer loops to avoid inlining which may cause bailouts of the outer loop and thus skip tests of LFloor and LRound.
try {} catch {}
for (var i = 0; i < 50; i++) {
  for (var j = 0; j < roundDTests.length; j++)
     assertEq(roundDfun(roundDTests[j].i), roundDTests[j].o);
  for (var j = 0; j < roundITests.length; j++)
     assertEq(roundIfun(roundITests[j].i), roundITests[j].o);
}
Attachment #618858 - Flags: review+
Depends on: 995826
You need to log in before you can comment on or make changes to this bug.