Closed Bug 764477 Opened 12 years ago Closed 12 years ago

IonMonkey: Inline Math.max and Math.min

Categories

(Core :: JavaScript Engine, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED

People

(Reporter: jandem, Assigned: evilpie)

References

(Blocks 1 open bug)

Details

(Whiteboard: [ion:t])

Attachments

(1 file, 3 obsolete files)

For parity with JM+TI, we should inline Math.max/Math.min with two int32/double arguments.
Assignee: general → evilpies
I say we should just inline an arbitrary number of arguments, since I've seen bug reports that using more than 2 arguments slows it down noticeably, and it shouldn't be that much extra work.
Attached patch v1 (obsolete) — Splinter Review
I looked at inlining more than two args, but it doesn't look like it's so easy to do that with an arbitrary number of args, at least we don't seem to implement that anywhere. Here is a patch that handles two args for int and double. In the double case we eagerly de-optimize for zero and NaN, but that could probably be partly fixed. This is also missing arm support for doubles ...
Attachment #637184 - Flags: review?(jdemooij)
Comment on attachment 637184 [details] [diff] [review]
v1

Review of attachment 637184 [details] [diff] [review]:
-----------------------------------------------------------------

Looks good. I'd like an OOL path though to handle 0/NaN. Especially 0 is probably common and we have to be careful to avoid repeated bailouts.

::: js/src/ion/CodeGenerator.cpp
@@ +1303,5 @@
> +
> +    if (ins->second()->isConstant())
> +        masm.move32(Imm32(ToInt32(ins->second())), output);
> +    else
> +        masm.movePtr(ToRegister(ins->second()), output);

Nit: masm.mov(..)

::: js/src/ion/IonBuilder.h
@@ +394,5 @@
>      InliningStatus inlineMathAbs(uint32 argc, bool constructing);
>      InliningStatus inlineMathFloor(uint32 argc, bool constructing);
>      InliningStatus inlineMathRound(uint32 argc, bool constructing);
>      InliningStatus inlineMathSqrt(uint32 argc, bool constructing);
> +    InliningStatus inlineMathMinMax(uint32 argc, bool constructing, bool max);

Nit: all other functions pass |constructing| as the last argument, so I'd do that here too.

::: js/src/ion/Lowering.cpp
@@ +626,5 @@
>          return defineReuseInput(lir, ins, 0);
>      } else {
>          JS_ASSERT(num->type() == MIRType_Double);
>          LAbsD *lir = new LAbsD(useRegister(num));
> +        return defineReuseInput(lir, ins, 0);

You don't need this change (and AbsD requires input != output).

::: js/src/ion/MCallOptimize.cpp
@@ +404,5 @@
>  
>  IonBuilder::InliningStatus
> +IonBuilder::inlineMathMinMax(uint32 argc, bool constructing, bool max)
> +{
> +    if (argc != 2 || constructing)

The easiest way to handle 3 or 4 arguments here is to add multiple MathMin/MathMax instructions. Maybe worth doing in a follow-up bug or patch, but with 2 arguments we at least match JM+TI now.

@@ +421,5 @@
> +
> +    if (returnType == MIRType_Int32 && 
> +        (arg1Type != MIRType_Int32 || arg2Type != MIRType_Int32))
> +    {
> +        returnType = MIRType_Double;

I think we should "return InliningStatus_NotInlined;" here instead. If the call has an int32 and double argument, and always returns the int32 argument, we still have to tell TI if we finally get a double.

::: js/src/ion/MIR.h
@@ +2076,5 @@
> +        isMax_(isMax)
> +    {
> +        JS_ASSERT(type == MIRType_Double || type == MIRType_Int32);
> +        setResultType(type);
> +        specialization_ = type;

Call setMovable() here so that we can hoist/GVN MMinMax.

@@ +2096,5 @@
> +    TypePolicy *typePolicy() {
> +        return this;
> +    }
> +    bool congruentTo(MDefinition *const &ins) const {
> +        return congruentIfOperandsEqual(ins);

We should also compare isMax_ here, to avoid optimizing Math.max(x, y) and Math.min(x, y) to the same instruction. Something like this:

if (!ins->isMinMax())
    return false;
if (ins->toMinMax()->isMax() != isMax())
    return false;
return congruentIfOperandsEqual(ins);

::: js/src/ion/shared/CodeGenerator-x86-shared.cpp
@@ +433,5 @@
>      return true;
>  }
>  
>  bool
> +CodeGeneratorX86Shared::visitMinMaxD(LMinMaxD *ins)

Can we move this to CodeGenerator (and use moveDouble etc) so that we don't need an ARM port?

@@ +447,5 @@
> +    masm.xorpd(scratch, scratch);
> +    Assembler::DoubleCondition cond = Assembler::DoubleEqualOrUnordered;
> +    masm.compareDouble(cond, first, scratch);
> +    if (!bailoutIf(Assembler::ConditionFromDoubleCondition(cond), ins->snapshot()))
> +        return false;

Especially 0 may be common so we have to be careful not to invalidate/bailout all the time. Can you add an out-of-line path to call a max/min helper function which takes two doubles, like you did for LTruncateDToInt32?
Attachment #637184 - Flags: review?(jdemooij)
Attached patch v2 WIP (obsolete) — Splinter Review
So checking for -0 is already complicated and sucks on x86/x64, but it's even harder on arm, so I am (or somebody? :) going to implement a pretty lightweight version there.

As it turned out it's not so easy to tell apart -0 and 0 without using a GPR register, but I had this funny idea that seems to work great. When we already know that both parameter are zero, we shift the first one, one bit to the right. If it was -0, it's now going to be 2.0 and thus not anymore equal to 0.
Attachment #637184 - Attachment is obsolete: true
ok, I just tried this out on my ARM board, and it seems to work.  The shifting right idea also requires moving data between a floating point register, and a gpr (most likely back as well), which would be rather slow. As it turns out, distinguishing 0 from -0 is not necessary.  If we only consider the cases where both operands are zero or negative zero (call it zeroey from truthy and falsey), the only bit that we care about manipulating is the sign bit.
so making a truth table
min(T, T) = T
min(T, F) = T
min(F, T) = T
min(F, F) = F
in other words, we want to or together the sign bits.
the common operations available are multiply, negate, subtract and add
These can be thought of as:
addition: 0+0 = 0, 0+(-0)=0, (-0)+0=0, (-0)+(-0)=-0 aka and
multiplication: 0*0=0, (-0)*0=-0, 0*(-0)=-0, (-0)*(-0)=0 (xor)
negation: -(0) = -0, -(-0)=0 (not)
subtraction: 0-0=0, 0-(-0)=0, (-0)-0=-0, (-0)-(-0)=0 (implication)
none of these happen to be or, but using "and", "not" and demorgan's theorem, we can generate OR, namely -(-a+-b), which is also equivalent to -(-a-b).  On ARM the following sequence of instructions implements this on ARM:

        fnegd d2, d0
        fsubd d2, d2, d1
        fnegd d2, d2
I have not yet tried this instruction sequence on x86 nor x64.
I think we should just take my current approach for x86/x64 and yours for ARM and be done with it.
Blocks: 768572
Blocks: 768745
If you need this in the next to weeks, please take it. Same for Bug 725966.
Attached patch v1 (obsolete) — Splinter Review
So I hope this even compiles and maybe works on ARM :). This now handles -0 and NaN properly inline. Marty would you mind checking that out?

When we have two zero values, we always look at the first one.
For Max, when the first one isn't -0, we use that as the result. (aka we don't move the second argument to the result register)
For Min, when the first one is -0 we use that.
Attachment #639141 - Attachment is obsolete: true
Attachment #647740 - Flags: review?(jdemooij)
Attachment #647740 - Flags: feedback?(mrosenberg)
Comment on attachment 647740 [details] [diff] [review]
v1

Review of attachment 647740 [details] [diff] [review]:
-----------------------------------------------------------------

Looks good! r=me on the non-ARM changes with nits addressed. The only thing that worries me is use of the Parity flag to test for NaN outside x86-only code, if that's a problem for ARM we can always move the code to x86-shared and give ARM its own implementation.

You may want to enable the hg color extension, it will show trailing whitespace and is very nice for viewing diffs anyway :)

::: js/src/ion/CodeGenerator.cpp
@@ +1544,5 @@
> +
> +bool
> +CodeGenerator::visitMinMaxD(LMinMaxD *ins)
> +{
> +

Nit: can remove this new line.

@@ +1556,5 @@
> +
> +    masm.zeroDouble(ScratchFloatReg);
> +    masm.compareDouble(Assembler::DoubleEqualOrUnordered, first, ScratchFloatReg);
> +    masm.j(Assembler::Parity, &nan); // If one argument is NaN the result is NaN.
> +    masm.j(Assembler::NotEqual, &secondCheck);

I'm not sure about using the Parity flag here, it seems very x86-specific, please ask mjrosenb what he thinks.

@@ +1579,5 @@
> +
> +    masm.bind(&secondCheck);
> +    // first is not NaN and not zero so we only need to check if second is NaN.
> +    masm.compareDouble(Assembler::DoubleEqualOrUnordered, second, ScratchFloatReg);
> +    masm.j(Assembler::Parity, &nan);

This can be

masm.branchDouble(Assembler::DoubleUnordered, second, second, &nan);

::: js/src/ion/LIR-Common.h
@@ +1232,5 @@
> +class LMinMaxD : public LInstructionHelper<1, 2, 0>
> +{
> +  public:
> +    LIR_HEADER(MinMaxD);
> +    LMinMaxD(const LAllocation &first, const LAllocation &second) 

Nit: trailing whitespace.

::: js/src/ion/Lowering.cpp
@@ +647,5 @@
>          return false;
>      return define(lir, ins);
>  }
>  
> +bool 

Nit: trailing whitespace

::: js/src/ion/MCallOptimize.cpp
@@ +39,5 @@
>          return inlineMathSqrt(argc, constructing);
> +    if (native == js_math_max)
> +        return inlineMathMinMax(true /* max */, argc, constructing);
> +    if (native == js_math_min)
> +        return inlineMathMinMax(false /* max */, argc, constructing);        

Nit: trailing whitespace.

@@ +408,5 @@
> +    MIRType arg2Type = getInlineArgType(argc, 2);
> +    if (arg2Type != MIRType_Double && arg2Type != MIRType_Int32)
> +        return InliningStatus_NotInlined;
> +
> +    if (returnType == MIRType_Int32 && 

Nit: trailing whitespace.

::: js/src/jit-test/tests/ion/mathMinMax.js
@@ +36,5 @@
> +      
> +    }
> +}
> +
> +for (var i = 0; i < 40; i++)

Nit: can you remove this loop and change the other one to 100 iterations or so? That should be enough for --ion-eager and --no-jm. There's also some trailing whitespace in this file.
Attachment #647740 - Flags: review?(jdemooij) → review+
(In reply to Jan de Mooij (:jandem) from comment #9)
> The only thing that worries me is use of the Parity flag to test for NaN outside x86-only
> code, if that's a problem for ARM we can always move the code to x86-shared
> and give ARM its own implementation.

I just discussed this with mjrosenb, there may be some way to abstract it but it seems cleaner to move visitMinMaxD to CodeGeneratorX86Shared and CodeGeneratorARM. Sorry for suggesting otherwise in an earlier comment, I didn't know these double condition flags would complicate things.

And thanks again for taking this bug, would be good to get this landed soon for parity with JM+TI and V8.
Status: NEW → ASSIGNED
Comment on attachment 647740 [details] [diff] [review]
v1

Review of attachment 647740 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/ion/LIR-Common.h
@@ +1208,5 @@
> +class LMinMaxI : public LInstructionHelper<1, 2, 0>
> +{
> +  public:
> +    LIR_HEADER(MinMaxI);
> +    LMinMaxI(const LAllocation &first, const LAllocation &second) 

Trailing whitespace

::: js/src/ion/arm/MacroAssembler-arm.cpp
@@ +1924,5 @@
>      ma_b(label, ConditionFromDoubleCondition(cond));
>  }
>  
> +void
> +MacroAssemblerARMCompat::branchTestNegativeZero(Condition cond, const FloatRegister &input, 

Id.

::: js/src/jit-test/tests/ion/mathMinMax.js
@@ +15,5 @@
> +        assertEq(min(0, negative_zero), negative_zero);
> +        assertEq(min(negative_zero, 0), negative_zero);
> +
> +        assertEq(min(negative_zero, negative_zero), negative_zero);
> +        assertEq(max(negative_zero, negative_zero), negative_zero);                

More

@@ +32,5 @@
> +        assertEq(max(Infinity, nan), nan);
> +
> +        assertEq(max(negative_zero, -5), negative_zero);
> +        assertEq(min(negative_zero, -5), -5);
> +      

More
Attached patch v2Splinter Review
Totally new version based on v8 code. Also should handle ARM properly this time, but try is still going on https://tbpl.mozilla.org/?tree=Try&rev=20fe4cb4cbf2.
Attachment #647740 - Attachment is obsolete: true
Attachment #647740 - Flags: feedback?(mrosenberg)
Attachment #650625 - Flags: review?(jdemooij)
Comment on attachment 650625 [details] [diff] [review]
v2

Review of attachment 650625 [details] [diff] [review]:
-----------------------------------------------------------------

Very nice! r=me on everything but the ARM changes, ask Marty to review these.
Attachment #650625 - Flags: review?(jdemooij) → review+
Attachment #650625 - Flags: review?(mrosenberg)
Comment on attachment 650625 [details] [diff] [review]
v2

Review of attachment 650625 [details] [diff] [review]:
-----------------------------------------------------------------

Any plans for special casing this when one of the arguments is 0?  I suspect anything that makes the register allocator's job easier will be helpful.

::: js/src/ion/arm/CodeGenerator-arm.cpp
@@ +288,5 @@
> +                               ? Assembler::VFP_GreaterThanOrEqual
> +                               : Assembler::VFP_LessThanOrEqual;
> +    Label nan, equal, returnSecond, done;
> +
> +    masm.compareDouble(second, first);

I think you want compareDouble(first, second)
Attachment #650625 - Flags: review?(mrosenberg) → review+
Status: ASSIGNED → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Hmm.  I'm running a script that does this:

      data[targetpixel] = Math.max(0, Math.min(255, sum * factor + bias));

and I see js_math_min showing up in my profiles... any idea why?  Is it a matter of the args being an int and a double?
(In reply to Boris Zbarsky (:bz) from comment #16)
> Hmm.  I'm running a script that does this:
> 
>       data[targetpixel] = Math.max(0, Math.min(255, sum * factor + bias));
> 
> and I see js_math_min showing up in my profiles... any idea why?  Is it a
> matter of the args being an int and a double?

Looking at IonBuilder::inlineMathMinMax() in js/src/ion/MCallOptimize.cpp, the only situation with known int+double arguments we wouldn't inline is when TI is forcing the return type to int32 and there is a double argument.
You need to log in before you can comment on or make changes to this bug.