IonMonkey: moar Float32 specializations

RESOLVED FIXED in mozilla27

Status

()

Core
JavaScript Engine
RESOLVED FIXED
5 years ago
4 years ago

People

(Reporter: bbouvier, Assigned: bbouvier)

Tracking

unspecified
mozilla27
Points:
---
Dependency tree / graph
Bug Flags:
in-testsuite +

Firefox Tracking Flags

(Not tracked)

Details

Attachments

(12 attachments, 25 obsolete attachments)

1.87 KB, patch
Waldo
: review+
Details | Diff | Splinter Review
3.00 KB, patch
sstangl
: review+
Details | Diff | Splinter Review
3.84 KB, patch
Waldo
: review+
Details | Diff | Splinter Review
19.75 KB, patch
h4writer
: review+
Details | Diff | Splinter Review
110.42 KB, patch
decoder
: feedback+
dougc
: feedback+
Details | Diff | Splinter Review
6.26 KB, patch
Details | Diff | Splinter Review
6.50 KB, patch
bbouvier
: review+
Details | Diff | Splinter Review
12.28 KB, patch
bbouvier
: review+
Details | Diff | Splinter Review
9.76 KB, patch
bbouvier
: review+
Details | Diff | Splinter Review
22.08 KB, patch
bbouvier
: review+
Details | Diff | Splinter Review
27.56 KB, patch
bbouvier
: review+
Details | Diff | Splinter Review
100.61 KB, patch
Details | Diff | Splinter Review
(Assignee)

Description

5 years ago
To enhance the Bus Factor [0] related to the Float32 optimizations and propagate knowledge, here are some new Float32 specialization patches to review for everybody :)

These implement some of the operators required by Odin, and thus will need an ARM implementation before they land.

If you need an overview of the algorithm that decides whether or not to specialize Float32, see the first paragraph of the comment in IonAnalysis [1].

[0] https://en.wikipedia.org/wiki/Bus_factor
[1] https://bugzilla.mozilla.org/attachment.cgi?id=799860&action=diff#a/js/src/jit/IonAnalysis.cpp_sec6
(Assignee)

Comment 1

5 years ago
Created attachment 800516 [details] [diff] [review]
TruncateToInt32

This one cleans the workaround in the last patch of bug 888109 and specializes TruncateToInt32 for float32.
Attachment #800516 - Flags: review?(jdemooij)
(Assignee)

Comment 2

5 years ago
Created attachment 800517 [details] [diff] [review]
Compare

This one implements and specializes Comparisons between Float32 operators.
Attachment #800517 - Flags: review?(hv1989)
(Assignee)

Comment 3

5 years ago
Created attachment 800518 [details] [diff] [review]
Sqrt

This one implements and specializes sqrt for Float32.
Attachment #800518 - Flags: review?(nicolas.b.pierron)
(Assignee)

Comment 4

5 years ago
Created attachment 800519 [details] [diff] [review]
Abs

This one implements and specializes Abs for Float32.

Sunfish, is there any NaN trick that we can use with the floats, as the one you provided for doubles?
Attachment #800519 - Flags: review?(sstangl)
(Assignee)

Comment 5

5 years ago
Created attachment 800520 [details] [diff] [review]
ARM stub patch

Marty, Jon or Doug: would you be interested in writing the ARM parts?
(Assignee)

Updated

5 years ago
Flags: needinfo?(mrosenberg)
Flags: needinfo?(jcoppeard)
Flags: needinfo?(dtc-moz)

Comment 6

5 years ago
I assume comment 4 refers to "masm.loadConstantDouble(SpecificNaN(0, DoubleSignificandBits), ScratchFloatReg);"?  It should be pretty trivial adding a SpecificNaNFloat method to FloatingPoint.h to do this, if that's what you mean -- just need to copy the existing method, and have it use a new set of corresponding Float* uint32_t constants.
Comment on attachment 800519 [details] [diff] [review]
Abs

Review of attachment 800519 [details] [diff] [review]:
-----------------------------------------------------------------

I am unsure if this patch is valid in general due to TI, but my knowledge of TI's behavior predates baseline integration. I would check on the current state of affairs with jandem or kannan before r+.

In general, TI does not permit MIR nodes that mimic some JSOp to return types that it has not observed. For example, if the result of abs() has always been an integer, it would be invalid for the respective MAbs instruction to be MIRType_Double -- in case of bailout, a double would be restored to the interpreter stack, and TI would complain vigorously about an unexpected type when that Value is popped.

Baseline likely changed this behavior (or even potentially eliminated it with bail-to-baseline), but it needs checking first. I will be on PTO tomorrow (Friday), back Monday. If a baseline engineer says this retyping is safe, feel free to set the patch as r+ from me and land it :)

::: js/src/jit/Lowering.cpp
@@ +1126,5 @@
>          // needed to handle abs(INT32_MIN)
>          if (ins->fallible() && !assignSnapshot(lir))
>              return false;
>          return defineReuseInput(lir, ins, 0);
> +    } else if (num->type() == MIRType_Float32) {

nit: no else after return.

::: js/src/jit/MCallOptimize.cpp
@@ +532,5 @@
>      callInfo.unwrapArgs();
>  
> +    // If the arg is a Float32, we specialize the op as double, it will be specialized
> +    // as float32 if necessary later.
> +    MIRType absType = (argType == MIRType_Float32) ? MIRType_Double : argType;

See overview comment about TI safety.

::: js/src/jit/shared/CodeGenerator-x86-shared.cpp
@@ +442,5 @@
> +{
> +    FloatRegister input = ToFloatRegister(ins->input());
> +    JS_ASSERT(input == ToFloatRegister(ins->output()));
> +    masm.xorps(ScratchFloatReg, ScratchFloatReg);
> +    masm.subss(input, ScratchFloatReg); // negate the sign bit.

Sunfish recently landed patches that change this code pattern into an explicit load. The pattern is in the function just above this one -- the |loadConstantDouble(SpecificNaN(...))|.
Attachment #800519 - Flags: review?(sstangl)
(In reply to Benjamin Bouvier [:bbouvier] from comment #5)

> Marty, Jon or Doug: would you be interested in writing the ARM parts?

I'd be happy to help out with the ARM parts.  I'm on PTO from today until the 16th however, so if it needs to happen before then it'll have to be someone else.
Flags: needinfo?(jcoppeard)
Comment on attachment 800517 [details] [diff] [review]
Compare

Review of attachment 800517 [details] [diff] [review]:
-----------------------------------------------------------------

Some small nits when reading this for the first time. No need to update patch yet. Wait for my real review.
I'll review this Monday a bit more thorough. I want to familiarize myself with the Float32 optimization first.
(Also we have had to many fallout in MCompare typepolicy. So want to be sure this is correct) 

Though the changes look good ;)

::: js/src/jit/LOpcodes.h
@@ +81,2 @@
>      _(CompareDAndBranch)            \
> +    _(CompareFAndBranch)            \

There isn't much order here, except that CompareX is followed by CompareXAndBranch.
So could you add CompareF and CompareFAndBranch after CompareD and CompareDAndBranch like:
CompareD
CompareDAndBranch
CompareF
CompareFAndBranch

::: js/src/jit/shared/CodeGenerator-x86-shared.h
@@ +108,2 @@
>      virtual bool visitCompareDAndBranch(LCompareDAndBranch *comp);
> +    virtual bool visitCompareFAndBranch(LCompareFAndBranch *comp);

Same here. Order is:
virtual bool visitCompareD(LCompareD *comp);
virtual bool visitCompareDAndBranch(LCompareDAndBranch *comp);
virtual bool visitCompareF(LCompareD *comp);
virtual bool visitCompareFAndBranch(LCompareDAndBranch *comp);
Comment on attachment 800516 [details] [diff] [review]
TruncateToInt32

Review of attachment 800516 [details] [diff] [review]:
-----------------------------------------------------------------

Looks good, except for the x86 issue below.

::: js/src/jit/x86/CodeGenerator-x86.cpp
@@ +849,5 @@
> +    Label fail;
> +
> +    if (Assembler::HasSSE3()) {
> +        // Push double.
> +        masm.subl(Imm32(sizeof(double)), esp);

sizeof(float)?

@@ +858,5 @@
> +        static const uint32_t TOO_BIG_EXPONENT = (DoubleExponentBias + 63) << EXPONENT_SHIFT;
> +
> +        // Check exponent to avoid fp exceptions.
> +        Label failPopDouble;
> +        masm.movl(Operand(esp, 4), output);

movss only stores 32-bits, so this will read garbage. Don't we also want to adjust the EXPONENT_* constants?

@@ +900,5 @@
> +            masm.loadStaticFloat32(&shiftPos, temp);
> +            masm.bind(&skip);
> +        }
> +
> +        masm.addsd(input, temp);

Should this be addss?
Attachment #800516 - Flags: review?(jdemooij)
Comment on attachment 800518 [details] [diff] [review]
Sqrt

Review of attachment 800518 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/assembler/assembler/X86Assembler.h
@@ +2736,5 @@
> +    {
> +        spew("sqrtss     %s, %s",
> +             nameFPReg(src), nameFPReg(dst));
> +        m_formatter.prefix(PRE_SSE_F3);
> +        m_formatter.twoByteOp(OP2_SQRTSD_VsdWsd, (RegisterID)dst, (RegisterID)src);

Whoa, this is confusing.  Can you declare OP2_SQRTSS_VssWss and use it here.

::: js/src/jit/Lowering.cpp
@@ +1142,5 @@
> +        LSqrtD *lir = new LSqrtD(useRegisterAtStart(num));
> +        return defineReuseInput(lir, ins, 0);
> +    } else {
> +        LSqrtF *lir = new LSqrtF(useRegisterAtStart(num));
> +        return defineReuseInput(lir, ins, 0);

Do not use defineReuseInput here, otherwise the register allocator will have to do a copy if the same value is used at multiple times:

  x = x * f32(Math.sqrt(x));

useRegisterAtStart is enough to let the register allocator reuse the inputs if it wants to.

::: js/src/jit/MIR.cpp
@@ +2568,5 @@
>      return call;
>  }
>  
> +void
> +MSqrt::trySpecializeFloat32() {

From where is this function called? I cannot find any reference in dxr.

::: js/src/jit/MIR.h
@@ +3317,5 @@
>  
>  // Inline implementation of Math.sqrt().
>  class MSqrt
>    : public MUnaryInstruction,
> +    public RuntimePolicy<0>

RuntimePolicy ?!  Does that mean we expect a JSRuntime?

I guess not, and I suggest that we should rename it.
Attachment #800518 - Flags: review?(nicolas.b.pierron)
Comment on attachment 800517 [details] [diff] [review]
Compare

Review of attachment 800517 [details] [diff] [review]:
-----------------------------------------------------------------

Don't forget the previous review comments. Looks good to me, if ARM is added too.
Attachment #800517 - Flags: review?(hv1989) → review+
(Assignee)

Comment 13

5 years ago
Created attachment 801791 [details] [diff] [review]
TruncateToInt32 - Part 1: mfbt Float32 constants and static asserts
Attachment #800516 - Attachment is obsolete: true
Attachment #801791 - Flags: review?(jwalden+bmo)
(Assignee)

Comment 14

5 years ago
Created attachment 801795 [details] [diff] [review]
TruncateToInt32 - Part 2

Thanks for the review! Indeed, x86 implementation wasn't appropriate. This one should work better.
Attachment #801795 - Flags: review?(jdemooij)
(Assignee)

Comment 15

5 years ago
Created attachment 801798 [details] [diff] [review]
Compare - fixed nits

Carrying over r+ from h4writer.
Attachment #800517 - Attachment is obsolete: true
Attachment #801798 - Flags: review+
(Assignee)

Comment 16

5 years ago
Created attachment 801805 [details] [diff] [review]
Sqrt

(In reply to Nicolas B. Pierron [:nbp] from comment #11)
> Whoa, this is confusing.  Can you declare OP2_SQRTSS_VssWss and use it here.
Hum. If we do this for sqrt, we'll have to do it for all other operators that have already been implemented. Also, it will just be code duplication as it is the same exact byte sequence (only the prefix changes). What do you think of discussing this in a follow-up bug?


> Do not use defineReuseInput here [...]
Nice catch! Looks like a bad rebase.

> From where is this function called? I cannot find any reference in dxr.
Float32 patch for Ion:
https://bugzilla.mozilla.org/attachment.cgi?id=799860&action=diff#a/js/src/jit/IonAnalysis.cpp_sec6

> RuntimePolicy ?!  Does that mean we expect a JSRuntime?
> 
> I guess not, and I suggest that we should rename it.
This is a type policy that chooses at runtime between MIRType_Double or MIRType_Float32, with respect to the specialization of the operation. How about FloatingPointPolicy?
Attachment #800518 - Attachment is obsolete: true
Attachment #801805 - Flags: review?(nicolas.b.pierron)
(Assignee)

Comment 17

5 years ago
Created attachment 801839 [details] [diff] [review]
Sqrt
Attachment #801805 - Attachment is obsolete: true
Attachment #801805 - Flags: review?(nicolas.b.pierron)
Attachment #801839 - Flags: review?(nicolas.b.pierron)
(Assignee)

Comment 18

5 years ago
Created attachment 801878 [details] [diff] [review]
Abs

That's a good point. I slightly changed the code to make it more obvious that we don't fool TI.

What was doing the code before the Float32 abs patch:
1. if (argType != returnType && returnType != MIRType_Int32), don't inline
2. Create abs with the same type as argType
3. if (argType != returnType), insert conversion to Int32

Part 3 suggests that the condition is equivalent to argType == Double and returnType == Int. Injecting this into part 1, the condition could be rewritten (argType != returnType && !(argType == Double && returnType == Int32)), which is more verbose but way easier to understand.

I adapted the code so that it does almost the same thing, with the difference that in part 2, if the arg is a Float32, we specialize the operation as Double (this way, the specialization decision comes later, in ApplyTypes).
Attachment #800519 - Attachment is obsolete: true
Attachment #801878 - Flags: review?(sstangl)
(Assignee)

Comment 19

5 years ago
Created attachment 801938 [details] [diff] [review]
UnsignedToFloat32

Useful for Odin.
Attachment #801938 - Flags: review?(hv1989)
Comment on attachment 801839 [details] [diff] [review]
Sqrt

Review of attachment 801839 [details] [diff] [review]:
-----------------------------------------------------------------

Please address the question too, and not only the comments.  Ask for review again as soon as you have answer the question.

::: js/src/assembler/assembler/X86Assembler.h
@@ +2793,5 @@
> +    {
> +        spew("sqrtss     %s, %s",
> +             nameFPReg(src), nameFPReg(dst));
> +        m_formatter.prefix(PRE_SSE_F3);
> +        m_formatter.twoByteOp(OP2_SQRTSD_VsdWsd, (RegisterID)dst, (RegisterID)src);

Please address this issue, and make create a follow-up patch/bug to fix the previous one.

::: js/src/jit/Lowering.cpp
@@ +1161,5 @@
> +    JS_ASSERT(IsFloatingPointType(num->type()));
> +    if (num->type() == MIRType_Double) {
> +        LSqrtD *lir = new LSqrtD(useRegisterAtStart(num));
> +        return define(lir, ins);
> +    } else {

nit: As Sean mentioned, no else after a return.

::: js/src/jit/MIR.cpp
@@ +2592,5 @@
>      return call;
>  }
>  
> +void
> +MSqrt::trySpecializeFloat32() {

Can you point me at the patch which is adding this function?  I still cannot find it on dxr.

::: js/src/jit/MIR.h
@@ +3326,5 @@
>  
>  // Inline implementation of Math.sqrt().
>  class MSqrt
>    : public MUnaryInstruction,
> +    public FloatingPointPolicy<0>

Thanks, this is way better than RuntimePolicy :)

::: js/src/jit/shared/CodeGenerator-x86-shared.cpp
@@ +486,5 @@
>  bool
> +CodeGeneratorX86Shared::visitSqrtF(LSqrtF *ins)
> +{
> +    FloatRegister input = ToFloatRegister(ins->input());
> +    JS_ASSERT(input == ToFloatRegister(ins->output()));

Without defineReuseInput, this assertion might fail with an example similar to the one given in the previous review.
Attachment #801839 - Flags: review?(nicolas.b.pierron)

Updated

5 years ago
Attachment #801791 - Flags: review?(jwalden+bmo) → review+
Comment on attachment 801795 [details] [diff] [review]
TruncateToInt32 - Part 2

Review of attachment 801795 [details] [diff] [review]:
-----------------------------------------------------------------

Looks good! r=me with comments below addressed.

::: js/src/jit/shared/CodeGenerator-shared.cpp
@@ +710,5 @@
>  
>      saveVolatile(dest);
>  
> +    if (ool->needFloat32Conversion())
> +        masm.convertFloatToDouble(src, src);

This will clobber src; another instruction may still expect a float32 instead of a double.

On x86 it's not a problem because all float regs are volatile, so the src register will be restored by the restoreVolatile call. Still, this is platform-independent code so we can't rely on that. I think ScratchFloatReg will be used by the move emitter (see callWithABI), so we can't use that either.

If ool->needFloat32Conversion(), we can just push src before converting to double and pop it after the call, like you did with the fast x86 truncation code.

::: js/src/jit/x64/CodeGenerator-x64.cpp
@@ +569,5 @@
> +{
> +    FloatRegister input = ToFloatRegister(ins->input());
> +    Register output = ToRegister(ins->output());
> +
> +    // On x64, branchTruncateDouble uses cvttsd2sq. Unlike the x86

Nit: nit/branchTruncateDouble/branchTruncateFloat32, s/cvttsd2sq/cvttss2sq

::: js/src/jit/x86/CodeGenerator-x86.cpp
@@ +907,5 @@
> +    Register output = ToRegister(ins->output());
> +
> +    Label fail;
> +
> +    if (Assembler::HasSSE3()) {

Before landing this please make HasSSE3 |return false;| temporarily and make sure it passes jit-tests on x86. We should add a shell flag to disable SSE 3+ but we don't have one yet...

@@ +925,5 @@
> +        masm.branch32(Assembler::GreaterThanOrEqual, output, Imm32(TOO_BIG_EXPONENT), &failPopFloat);
> +
> +        // Load float, perform 32-bit truncation.
> +        masm.fld32(Operand(esp, 0));
> +        masm.fisttp(Operand(esp, 0));

Sorry, I know I suggested using sizeof(float) earlier, but fisttp will store a 64-bit value... So can you use sizeof(uint64_t) or sizeof(double) inside this |if| and change the "Push float32." comment to "Push float32. Subtract 64 bits so that fisttp can store a 64-bit integer." or something similar?
Attachment #801795 - Flags: review?(jdemooij) → review+

Updated

5 years ago
Attachment #801878 - Flags: review?(sstangl) → review+
Comment on attachment 801938 [details] [diff] [review]
UnsignedToFloat32

Review of attachment 801938 [details] [diff] [review]:
-----------------------------------------------------------------

- I request two follow-ups. Though they are small enough to just ride this patch too... You can decide yourself.
- Also you might want to check 32bit, since I think it shouldn't have compiled (due to missing functions).
- Where do you add these functions in odinmonkey? I mean all this new functionality isn't used atm...?

::: js/src/jit/MIR.cpp
@@ +2617,5 @@
> +        if (v.isInt32()) {
> +            MConstant *c = MConstant::New(DoubleValue(float(uint32_t(v.toInt32()))));
> +            c->setResultType(MIRType_Float32);
> +            return c;
> +        }

(Value &) v could be typed as a double, but still fit into a float. So I think we want to fold that too?

::: js/src/jit/x64/LOpcodes-x64.h
@@ +17,5 @@
>      _(ModI)                         \
>      _(ModPowTwoI)                   \
>      _(PowHalfD)                     \
>      _(UInt32ToDouble)               \
> +    _(UInt32ToFloat32)              \

Since this is only used in odinmonkey, this should be AsmJSUInt32ToFloat32

Can you file follow-up to adjust UInt32ToDouble into AsmJSUInt32ToDouble. Since that is also only used for odinmonkey.

::: js/src/jit/x64/MacroAssembler-x64.h
@@ +1053,5 @@
>          cvtsq2sd(src, dest);
>      }
>  
> +    void convertUInt32ToFloat32(const Register &src, const FloatRegister &dest) {
> +        cvtsq2ss(src, dest);

Is there a website explaining the internals of this asm instruction? I couldn't find one myself and it has bothered me that I don't have a list with instructions and explanation.

::: js/src/jit/x86/CodeGenerator-x86.cpp
@@ +404,5 @@
> +    if (input != temp)
> +        masm.mov(input, temp);
> +
> +    // Beware: convertUInt32ToFloat32 clobbers input.
> +    masm.convertUInt32ToFloat32(temp, ToFloatRegister(lir->output()));

I think you forgot to add "convertUInt32ToFloat32" in MacroAssembler-x86.h.

Also I think it's better to create convertUInt32ToFloat32(input, output, temp) for the x86 variant. And move input to temp, when the registers aren't the same. That way it doesn't clobber input register anymore, unnoticeable.

Here you can also open a follow-up for adjusting the double variant.

::: js/src/jit/x86/LIR-x86.h
@@ +87,5 @@
> +{
> +  public:
> +    LIR_HEADER(UInt32ToFloat32)
> +
> +    LUInt32ToFloat32(const LAllocation &input, const LDefinition &temp) {

Nit: Indentation of bracket is wrong.
Attachment #801938 - Flags: review?(hv1989)
(Assignee)

Comment 23

5 years ago
Created attachment 804046 [details] [diff] [review]
TruncateToInt32 - Part 2

Thanks Jan!
FWIW, the --no-fpu shell option allows to test for non SSE3 x86 operations :) I checked and the tests pass with this option enabled.

Carrying over r+ from jandem.
Attachment #801795 - Attachment is obsolete: true
Attachment #804046 - Flags: review+
(Assignee)

Comment 24

5 years ago
Created attachment 804059 [details] [diff] [review]
Sqrt

(In reply to Nicolas B. Pierron [:nbp] from comment #20)
> Comment on attachment 801839 [details] [diff] [review]
> 
> Please address this issue, and make create a follow-up patch/bug to fix the
> previous one.
Filed bug 915901

>
> nit: As Sean mentioned, no else after a return.
Doh!

> 
> Can you point me at the patch which is adding this function?  I still cannot
> find it on dxr.
Mmh, my last answer to your review mentioned the patch section in bug 888109. Now that it landed, I can give you that link:
http://mxr.mozilla.org/mozilla-central/source/js/src/jit/IonAnalysis.cpp#855

> Without defineReuseInput, this assertion might fail with an example similar
> to the one given in the previous review.
Double doh, I don't know what happened when writing this code, but I was clearly wrong. Thanks for catching this. I will also add tests that assert that the behaviour is correct.
Attachment #801839 - Attachment is obsolete: true
Attachment #804059 - Flags: review?(nicolas.b.pierron)
(Assignee)

Comment 25

5 years ago
Created attachment 804121 [details] [diff] [review]
UnsignedToFloat32 - Part 1 : factor out MFBT stuff

IsFloat32Representable was present in MIR.cpp and used only there, but it might used somewhere else in a near future and might be useful for other people too.
Attachment #801938 - Attachment is obsolete: true
Attachment #804121 - Flags: review?(jwalden+bmo)
(Assignee)

Comment 26

5 years ago
Created attachment 804125 [details] [diff] [review]
UnsignedToFloat32 - Part 2

(In reply to Hannes Verschore [:h4writer] from comment #22)

Thanks for the quick review!
Diff:
- addresses your comments
- adds MConstant::NewAsmJS(Value, MIRType), which would have been added in the Odin patch otherwise.
- uses mozilla::IsFloat32Representable for folding (we don't want to fold values that can't be exactly represented as Float32).

> Comment on attachment 801938 [details] [diff] [review]
> - I request two follow-ups. Though they are small enough to just ride this
> patch too... You can decide yourself.
Filed bug 915934.

> - Also you might want to check 32bit, since I think it shouldn't have
> compiled (due to missing functions).
Forgot to qref :/

> - Where do you add these functions in odinmonkey? I mean all this new
> functionality isn't used atm...?
You're right, UnsignedToFloat32 will be used by the Odin patch. The other operators can already be specialized before, even if the ARM implementation isn't ready.

> (Value &) v could be typed as a double, but still fit into a float. So I
> think we want to fold that too?
Good catch. Added that in the UnsignedToDouble clean up bug too.

> > +        cvtsq2ss(src, dest);
> 
> Is there a website explaining the internals of this asm instruction? I
> couldn't find one myself and it has bothered me that I don't have a list
> with instructions and explanation.
As explained on IRC, cvtsq2ss is cvtsi2ss for a quadword input (copied on the double model, which names the equivalent operation cvtsq2sd).
Attachment #804125 - Flags: review?(hv1989)
(Assignee)

Comment 27

5 years ago
Created attachment 804130 [details] [diff] [review]
Better check coherency function

Sean, the Float consistent algorithm introduced in bug 915301 works fine, but some Float32 uses might be valid and not validate any of the 3 conditions (I am thinking of MTruncateToInt32 and MCompare). As a matter of fact, we need to internalize the check in the MDefinition. Does it seem reasonable?
Attachment #804130 - Flags: review?(sstangl)
(Assignee)

Comment 28

5 years ago
Created attachment 804180 [details] [diff] [review]
UnsignedToFloat32 - Part 2

This one is more correct.

During folding, the input value can't be a double, as AsmJS uses Int32 for storing UInt32.
Attachment #804125 - Attachment is obsolete: true
Attachment #804125 - Flags: review?(hv1989)
Attachment #804180 - Flags: review?(hv1989)
(In reply to Benjamin Bouvier [:bbouvier] from comment #23)
> FWIW, the --no-fpu shell option allows to test for non SSE3 x86 operations
> :) I checked and the tests pass with this option enabled.

--no-fpu disables SSE2 and Odin/Ion completely though :) (see jitSupportsFloatingPoint in AsmJS.cpp)
(In reply to Benjamin Bouvier [:bbouvier] from comment #28)
> Created attachment 804180 [details] [diff] [review]
> UnsignedToFloat32 - Part 2
> 
> This one is more correct.
> 
> During folding, the input value can't be a double, as AsmJS uses Int32 for
> storing UInt32.

And what does it do for max_uint? Since that doesn't fit in int32...
(Assignee)

Comment 31

5 years ago
Created attachment 804570 [details] [diff] [review]
UnsignedToFloat32 - Part 2

As said on IRC, AsmJS uses int32 to store uint32 (it makes sense as by definition, they're both encoded on 32-bits). However, I forgot a cast to uint32_t after several layers of modification... This is fixed here.
Attachment #804180 - Attachment is obsolete: true
Attachment #804180 - Flags: review?(hv1989)
Attachment #804570 - Flags: review?(hv1989)
Comment on attachment 804121 [details] [diff] [review]
UnsignedToFloat32 - Part 1 : factor out MFBT stuff

Review of attachment 804121 [details] [diff] [review]:
-----------------------------------------------------------------

You forgot to add FloatingPoint.cpp, looks like.

::: mfbt/sources.mk
@@ +8,5 @@
>  endif
>  
>  CPPSRCS += \
>    HashFunctions.cpp \
> +  FloatingPoint.cpp \

This should be alphabetized in the list, so before HashFunctions.cpp.
Attachment #804121 - Flags: review?(jwalden+bmo)
Comment on attachment 804059 [details] [diff] [review]
Sqrt

Review of attachment 804059 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/jit/MIR.h
@@ +3376,5 @@
>      AliasSet getAliasSet() const {
>          return AliasSet::None();
>      }
> +
> +    bool isFloat32Commutative() const { return true; }

Commutative is a confusing name, and really not expected on a MUnaryInstruction.
Attachment #804059 - Flags: review?(nicolas.b.pierron) → review+
(Assignee)

Comment 34

5 years ago
Created attachment 805834 [details] [diff] [review]
UnsignedToFloat32 - Part 1 : factor out MFBT stuff

Doh
Attachment #804121 - Attachment is obsolete: true
Attachment #805834 - Flags: review?(jwalden+bmo)
(Assignee)

Comment 35

5 years ago
Created attachment 805840 [details] [diff] [review]
UnsignedToFloat32 - Part 2

rebased
Attachment #804570 - Attachment is obsolete: true
Attachment #804570 - Flags: review?(hv1989)
Attachment #805840 - Flags: review?(hv1989)
(Assignee)

Updated

5 years ago
Depends on: 917200
Comment on attachment 805834 [details] [diff] [review]
UnsignedToFloat32 - Part 1 : factor out MFBT stuff

Review of attachment 805834 [details] [diff] [review]:
-----------------------------------------------------------------

::: mfbt/FloatingPoint.cpp
@@ +4,5 @@
> + * file, You can obtain one at http://mozilla.org/MPL/2.0/. */
> +
> +/* Implementations of FloatingPoint functions */
> +
> +# include "mozilla/FloatingPoint.h"

No space between # and include.

::: mfbt/FloatingPoint.h
@@ +260,5 @@
>  }
>  
> +/**
> + * Returns true if the given value can be represented as a Float32 without any loss, false otherwise.
> + * Needs to not be inlined to avoid overzealous optimizations by certain compilers (MSVC).

A few tweaks: refer to IEEE-754, not to "Float32" (whose meaning is not necessarily clear).  Break out the second sentence to a separate paragraph (and complete sentence) to not detract from the functional description, and be clear that exactly MSVC is affected, and nothing else (assuming that's the case), so it's clearer when this can eventually be removed.  Note that NaN values are considered representable even though their bit patterns aren't.  Watch out for 80-character limits.  In sum, then (assuming Bugzilla doesn't wrap this on me):

/**
 * Returns true if the given value can be losslessly represented as an IEEE-754
 * single format number, false otherwise.  All NaN values are considered
 * representable (notwithstanding that the exact bit pattern of a double format
 * NaN value can't be exactly represented in single format).
 *
 * This function isn't inlined to avoid buggy optimizations by MSVC.
 */
Attachment #805834 - Flags: review?(jwalden+bmo) → review+
(Assignee)

Comment 37

5 years ago
Created attachment 806348 [details] [diff] [review]
Not

This one showed up in one of my Float32 benchmarks.
Attachment #806348 - Flags: review?
Comment on attachment 805840 [details] [diff] [review]
UnsignedToFloat32 - Part 2

Review of attachment 805840 [details] [diff] [review]:
-----------------------------------------------------------------

I still want those tests for MAX_UINT32, that would have failed due to not casting to uint32 during folding.
(And sorry about the delay.)

::: js/src/jit/MIR.cpp
@@ +2643,5 @@
> +        const Value &v = input()->toConstant()->value();
> +        if (v.isInt32()) {
> +            double dval = double(uint32_t(v.toInt32()));
> +            if (IsFloat32Representable(dval)) {
> +                MConstant *c = MConstant::NewAsmJS(DoubleValue(float(dval)), MIRType_Float32);

DoubleValue(float(dval)) => Float32Value(float(dval))

@@ +2645,5 @@
> +            double dval = double(uint32_t(v.toInt32()));
> +            if (IsFloat32Representable(dval)) {
> +                MConstant *c = MConstant::NewAsmJS(DoubleValue(float(dval)), MIRType_Float32);
> +                return c;
> +            }

return MConstant::...
and lose the inner braces
Attachment #805840 - Flags: review?(hv1989) → review+
Fyi, the last patch doesn't have a reviewer set!
Comment on attachment 804130 [details] [diff] [review]
Better check coherency function

Review of attachment 804130 [details] [diff] [review]:
-----------------------------------------------------------------

Putting the float32 data in the MDefinition seems nicer to me than having it in IonAnalysis.cpp.
Attachment #804130 - Flags: review?(sstangl) → review+
Comment on attachment 806348 [details] [diff] [review]
Not

Review of attachment 806348 [details] [diff] [review]:
-----------------------------------------------------------------

This patch does not provide ARM parity, or should only lower to NotF on x86 platforms.

::: js/src/jit/MIR.cpp
@@ +2423,5 @@
> +{
> +    MDefinition *in = input();
> +    if (!in->canProduceFloat32() && in->type() == MIRType_Float32) {
> +        ConvertDefinitionToDouble<0>(in, this);
> +    }

nit: no braces for one line.

::: js/src/jit/MIR.h
@@ +4947,5 @@
> +
> +    void trySpecializeFloat32();
> +    bool isFloat32Commutative() const { return true; }
> +#ifdef DEBUG
> +    bool isConsistentFloat32Use() const {

Is it used for anything? I cannot spot any use of it.
Attachment #806348 - Flags: review?
(Assignee)

Comment 42

5 years ago
(In reply to Nicolas B. Pierron [:nbp] from comment #41)
> Comment on attachment 806348 [details] [diff] [review]
> This patch does not provide ARM parity, or should only lower to NotF on x86
> platforms.
As all the other patches in this bug :) Bug 917200 is filed for that purpose.
> 
> ::: js/src/jit/MIR.cpp
> @@ +2423,5 @@
> > +{
> > +    MDefinition *in = input();
> > +    if (!in->canProduceFloat32() && in->type() == MIRType_Float32) {
> > +        ConvertDefinitionToDouble<0>(in, this);
> > +    }
> 
> nit: no braces for one line.
Will fix that.
> 
> ::: js/src/jit/MIR.h
> @@ +4947,5 @@
> > +
> > +    void trySpecializeFloat32();
> > +    bool isFloat32Commutative() const { return true; }
> > +#ifdef DEBUG
> > +    bool isConsistentFloat32Use() const {
> 
> Is it used for anything? I cannot spot any use of it.
See "Better check coherency function patch", right there https://bugzilla.mozilla.org/attachment.cgi?id=804130&action=diff#a/js/src/jit/IonAnalysis.cpp_sec2
(Assignee)

Updated

5 years ago
Flags: needinfo?(mrosenberg)
Flags: needinfo?(dtc-moz)
(Assignee)

Comment 43

5 years ago
Shipping the features that have no dependencies and won't break anything.

https://hg.mozilla.org/integration/mozilla-inbound/rev/f0468b6afda3
https://hg.mozilla.org/integration/mozilla-inbound/rev/27b1173a9e8c
Whiteboard: [leave open]
(Assignee)

Comment 44

5 years ago
Created attachment 807915 [details] [diff] [review]
Not

See comments in the previous answer.
Also deleting the ARM stub patch as bug 917200 has been filed.
Attachment #800520 - Attachment is obsolete: true
Attachment #806348 - Attachment is obsolete: true
Attachment #807915 - Flags: review?(nicolas.b.pierron)
(Assignee)

Comment 45

5 years ago
Created attachment 807962 [details] [diff] [review]
Folded patch for fuzzing

Here is a rebased, folded patch that might be useful for implementing ARM and for fuzzing purposes.

Gary, Christian, could you please fuzz it? No need to fuzz it on any ARM platform: while it will compile, it will also crash (almost) every test that uses Float32.
The parent revision is 148123:27b1173a9e8c on mozilla-inbound.
Attachment #807962 - Flags: feedback?(gary)
Attachment #807962 - Flags: feedback?(choller)
Depends on: 919150
Depends on: 919275
(In reply to Benjamin Bouvier [:bbouvier] from comment #45)
> Created attachment 807962 [details] [diff] [review]
> Folded patch for fuzzing
> 
> Here is a rebased, folded patch that might be useful for implementing ARM
> and for fuzzing purposes.

This would appear to be only the ARM stub patch.  Might the wrong patch
have been uploaded?
(Assignee)

Comment 48

5 years ago
Created attachment 808828 [details] [diff] [review]
Real folded patch for fuzzing

Thanks dougc, I must have exported qtip instead of diffed with qparent.

The new parent revision is m-inbound, 148346:4ba8514e5036. Thanks for fuzzing!
Attachment #807962 - Attachment is obsolete: true
Attachment #807962 - Flags: feedback?(gary)
Attachment #807962 - Flags: feedback?(choller)
Attachment #808828 - Flags: feedback?(gary)
Attachment #808828 - Flags: feedback?(dtc-moz)
Attachment #808828 - Flags: feedback?(choller)
Comment on attachment 807915 [details] [diff] [review]
Not

Review of attachment 807915 [details] [diff] [review]:
-----------------------------------------------------------------

This x86/x64 implementation sounds fine, do not land without either the ARM parity or a way to disable the Float conversion on ARM (with the corresponding assertion in the lowering).
Attachment #807915 - Flags: review?(nicolas.b.pierron) → review+
Created attachment 809913 [details] [diff] [review]
ARM support

Quick go at the ARM support.  Passes the new JIT tests.
Needs more testing, and I would like to eye-ball the
generated code.
(Assignee)

Comment 51

5 years ago
Created attachment 810824 [details] [diff] [review]
More tests

Here are stronger tests that not only check that we emit Float32 at some places, but also check we get the same results when using Float32 or Doubles (tests also ToInt32 in bug 919838 - tests should already pass because of the workaround, and some tests are ready in another patch for the Math functions in bug 918163). Plus, it ensures that the commutative property of Float32 / Doubles is correct :)

I am glad to announce that all these tests pass with Douglas' ARM support patch, at least on a cross-compiled shell run with qemu. Well done!
(Assignee)

Comment 52

5 years ago
Created attachment 810835 [details] [diff] [review]
Not - rebased

I will reupload all rebased patches as I won't be around for some time.
Carrying forward r+ from nbp
Attachment #807915 - Attachment is obsolete: true
Attachment #810835 - Flags: review+
(Assignee)

Comment 53

5 years ago
Created attachment 810837 [details] [diff] [review]
Abs - rebased

Carrying forward r+ from sstangl
Attachment #810837 - Flags: review+
(Assignee)

Comment 54

5 years ago
Created attachment 810840 [details] [diff] [review]
Sqrt - rebased

Forgot to check obsolete attachments for the previous ones.

Carrying forward r+ from nbp
Attachment #801878 - Attachment is obsolete: true
Attachment #804046 - Attachment is obsolete: true
Attachment #804059 - Attachment is obsolete: true
Attachment #810840 - Flags: review+
(Assignee)

Comment 55

5 years ago
Created attachment 810841 [details] [diff] [review]
Compare - rebased

Carrying forward r+ from h4writer
Attachment #801798 - Attachment is obsolete: true
Attachment #810841 - Flags: review+
(Assignee)

Comment 56

5 years ago
Created attachment 810842 [details] [diff] [review]
TruncateToInt32 - Part 2 - rebased

Carrying forward r+ from jandem
Attachment #810842 - Flags: review+
Created attachment 811437 [details] [diff] [review]
ARM support, rebased.

This was mostly a cut-and-paste of the existing float32 support.
Attachment #809913 - Attachment is obsolete: true
Attachment #811437 - Flags: review?(jcoppeard)
Comment on attachment 811437 [details] [diff] [review]
ARM support, rebased.

Best move this to bug 917200 as was intended.
Attachment #811437 - Attachment is obsolete: true
Attachment #811437 - Flags: review?(jcoppeard)
Comment on attachment 808828 [details] [diff] [review]
Real folded patch for fuzzing

I haven't seen anything bad popping up here :)
Attachment #808828 - Flags: feedback?(choller) → feedback+
Created attachment 813674 [details] [diff] [review]
Folded patch, rebased.

Rebased folded patch, in case it helps testing.

Note the folded patch appears to have included the patch from bug 918163 and it is retained.   Consider landing the patch in bug 918163 when landing these.
Attachment #808828 - Flags: feedback?(dtc-moz) → feedback+
Created attachment 817761 [details] [diff] [review]
Rebased folded patch.
Attachment #813674 - Attachment is obsolete: true
(Assignee)

Updated

5 years ago
Blocks: 900120
Created attachment 819281 [details] [diff] [review]
Rebased folded patch.

Some of the changes in the prior folded patch have landed, and this appears to be what remains.
Attachment #817761 - Attachment is obsolete: true
(In reply to Douglas Crosher [:dougc] from comment #62)
> Created attachment 819281 [details] [diff] [review]
> Rebased folded patch.
> 
> Some of the changes in the prior folded patch have landed, and this appears
> to be what remains.

Note that some improvement was made to branchTruncateDouble, probably from bug 925088, and it remains to explore if these this is applicable to branchTruncateFloat32.
Created attachment 819519 [details] [diff] [review]
Rebased folded patch.

Rebase.
Attachment #819281 - Attachment is obsolete: true
(Assignee)

Comment 65

5 years ago
(In reply to Douglas Crosher [:dougc] from comment #63)
> (In reply to Douglas Crosher [:dougc] from comment #62)
> > Created attachment 819281 [details] [diff] [review]
> > Rebased folded patch.
> > 
> > Some of the changes in the prior folded patch have landed, and this appears
> > to be what remains.
> 
> Note that some improvement was made to branchTruncateDouble, probably from
> bug 925088, and it remains to explore if these this is applicable to
> branchTruncateFloat32.

I rebased locally and applied this optimization also for Float32, as cvtss2si also returns the same integer constant if there was an error.

These patches and the patch in bug 917200 have circular dependencies. As sstangl requested, I finally merged patches by MDefinition (so each patch contains MDefinition + code generation on all platforms, including ARM). I included dougc in every message commit, so if there are regressions, the two of us get contacted. Douglas, are you fine with that? I tried to find another solution but there is no easy other one.

I'll push everything as soon as I get a decent internet connection.
(Assignee)

Updated

5 years ago
Duplicate of this bug: 917200

Updated

5 years ago
Status: ASSIGNED → RESOLVED
Last Resolved: 5 years ago
Flags: in-testsuite+
Resolution: --- → FIXED
Whiteboard: [leave open]
Target Milestone: --- → mozilla27
Comment on attachment 808828 [details] [diff] [review]
Real folded patch for fuzzing

I didn't get to this latest patch revision in time as I was on PTO then, and it kind of slipped off the radar, but now that this has landed on m-c I don't think I've really seen anything blow up dramatically yet, so setting as feedback+ in any case.
Attachment #808828 - Flags: feedback?(gary) → feedback+
Blocks: 927408
You need to log in before you can comment on or make changes to this bug.