Closed Bug 1343007 Opened 5 years ago Closed 5 years ago

MBinaryArithInstruction::foldsTo should not wrap an Int32 constant in an MTruncateToInt32 node

Categories

(Core :: JavaScript Engine: JIT, defect, P3)

defect

Tracking

()

RESOLVED FIXED

People

(Reporter: lth, Assigned: lth)

References

Details

Attachments

(3 files, 4 obsolete files)

MBinaryArithInstruction::foldsTo always wraps the computed constant value in a MTruncateToInt32 node.  This seems unnecessary, for one thing; for another thing, it inhibits further constant folding.  For example, in this simple wasm function, the generated code adds the constants 64 and 8, but should not have to do that:

      (func (param i32) (result i32)
       (i32.load (i32.add (i32.add (i32.const 0) (i32.const 64)) (i32.const 8))))

The reason it does it is that the simplification of 0+64 yields not 64, but trunc(64).

Plausibly it is enough to insert a couple of guards in foldsTo() to avoid wrapping values that are already int32.  At the moment, the truncate generates no code at run time because it is stripped in lowering.
The optimizer depends on this in some manner.
Priority: P1 → P3
Version: 52 Branch → Trunk
This needs a little discussion.

In MIR.cpp it is simple: We will no longer introduce a MTruncateToInt32 node when the operand of that node would be an Int32 constant.

But that makes jit-tests/tests/asm.js/bug940864.js fail, in asm.js mode.

What happens there is that a signed result, namely "((-1) >>> (0+0)) | 0" becomes interpreted as unsigned, as follows.  First, (0+0) is folded to 0.  Then ((-1) >>> 0) can be folded to 0xFFFFFFFF.  This is new: previously, the folding of (0+0) would be TruncateToInt32(0), so the URSH node would stay unfolded, and would be handled in some other manner, there are several cases throughout the compiler handling URSH specially.

I think that what we have here is really a bug in the asm.js->wasm translator, and that when the semantics call for an unsigned -> signed reinterpretation the translator needs to insert code to ensure that Ion is aware of that.  In this patch I do that by inserting an explicit 'or' with zero.

However the fallout from the latter change is that machine code is generated for that bitwise 'or' since the optimization that removes such useless operations is lumped in with a part of range analysis (truncation) that is disabled for wasm.  So assuming the change is correct and desirable, we may want to do more work to avoid that code being generated.  For example, we could have a private Wasm op whose effect is to insert the desired TruncateToInt32 node; that would get removed eventually.
Attachment #8841903 - Flags: feedback?(luke)
Attachment #8841903 - Flags: feedback?(bbouvier)
In asm.js+Ion, we basically erase signedness and instead imbue signedness into the operations (just like wasm).  In Ion, we use MIRType_Int32 which means that, if we ever bailed-out to JS, all unsigned values would be interpreted incorrectly as signed values, but asm.js never bails out by construction and we forced signedness-imbuing coercions when a sign-agnostic int would flow out to JS via parameter or return.

So it looks like the expression `((-1) >>> (0+0)) | 0` is fed into a % which is a sign-aware operator.  wasm has i32.rem_u which the asm.js->wasm translator should statically picking.  I'm guessing the MMod is also being constant-folded and thus the bug is in MMod::foldTo?  A quick scan of EvaluateConstantOperands() shows MMod::isUnsigned() being checked...
So after a lot of time trying to understand where ion is going wrong, I've found it: range analysis is being overzealous here, and it optimizes the modulo operation as an unsigned modulo because hasUint32s is true there, when it was not before (because of the TruncateToInt32 after the addition): http://searchfox.org/mozilla-central/source/js/src/jit/RangeAnalysis.cpp#1593-1595

Disabling range analysis makes us pass this test with your patch (and the AsmJS bits removed). (disabling GVN also, but that's a side-effect)

So the asm.js emitter would be correct (it is emitting a signed modulo which is what we want because of the signed coercion), but Ion is smarter and realizes it can optimize it to an unsigned modulo, since there are no external indications that the modulo *must* be signed. So we should be explicit and have asm.js emit a MTruncateToInt32 wrapping the MMod operation. Or adding an explicit MIRType::Uint32 to Ion, which we've discussed a lot but implies much more changes...
Comment on attachment 8841903 [details] [diff] [review]
bug1343007-dont-wrap-constants.patch

See comment.
Attachment #8841903 - Flags: feedback?(bbouvier)
Now inserts a signedness-preserving op before MOD's left argument if that argument is signed.  I made this a new asm.specific op instead of reusing something else, to avoid confusing existing code generation.  The op turns into a TruncateToInt32 operation in WasmIonCompile.  This change makes the tricky test case pass.

The assertions in AsmJS.cpp are just my own paranoia, it looks to me like they are not necessary.  Opinions welcome.
Attachment #8841903 - Attachment is obsolete: true
Attachment #8841903 - Flags: feedback?(luke)
Attachment #8842831 - Flags: review?(bbouvier)
(In reply to Benjamin Bouvier [:bbouvier] from comment #4)
> is what we want because of the signed coercion), but Ion is smarter and
> realizes it can optimize it to an unsigned modulo, since there are no
> external indications that the modulo *must* be signed.

But can't it take the MMod::isUnsigned() into account?

Also, I don't see how this is an asm.js-only bug; wouldn't wasm have the same problem?
(In reply to Luke Wagner [:luke] from comment #7)
> (In reply to Benjamin Bouvier [:bbouvier] from comment #4)
> > is what we want because of the signed coercion), but Ion is smarter and
> > realizes it can optimize it to an unsigned modulo, since there are no
> > external indications that the modulo *must* be signed.
> 
> But can't it take the MMod::isUnsigned() into account?

It does: it even sees the MMod is signed at the start of MMod::computeRange, and all the conditions are present to optimize this modulo into an unsigned one, so range analysis does it. We could just disable the optimization when compiling AsmJS, but that would be unfortunate (especially since this relies on pattern-matching the asm.js unsigned typing, see also IsUint32Type()).

> Also, I don't see how this is an asm.js-only bug; wouldn't wasm have the
> same problem?

Well, maybe but I don't think so. It would only do it if we had a sequence like:

(some i32 expr X)
i32.const 0
i32.shr_u
i32.const 42 ;; anything positive, really
i32.rem_s

but that is correctly equivalent to, since X is explicitly marked as an unsigned value:

(some i32 expr X)
i32.const 42
i32.rem_u

Now, in asm.js, the JS code is basically doing:

((x>>>0)|0)%y

The binary-or is lost in type-checking, because x>>>0 is intish: http://searchfox.org/mozilla-central/source/js/src/wasm/AsmJS.cpp#6289,6311-6313

This information, if present, would translate in Ion into a MTruncateToInt32 on the Ursh. Since it is not there in Ion, range analysis just sees:

(x>>>0)%y

and this can be optimized (under certain conditions on y, which are true here) into an unsigned modulo.

Lars, have you considered other places in AsmJS.cpp where we could lose this typing information as well? (validating something unsigned into something signed)
(In reply to Benjamin Bouvier [:bbouvier] from comment #8)
> > But can't it take the MMod::isUnsigned() into account?
> 
> It does: it even sees the MMod is signed at the start of MMod::computeRange,
> and all the conditions are present to optimize this modulo into an unsigned
> one, so range analysis does it.

The isUnsigned bit isn't an optimization, it *changes the semantics of Mod*.  That is, MMod doesn't perform a single mathematical function, it performs one of two: rem_s or rem_u.  So as long as isUnsigned is being set correctly, which it sounds like it is, range analysis should produce the correct ranges without any other context.

So from Ion's POV, MMod(isUnsigned=false,x,y) = x%y (in JS) and MMod(isUnsigned=true,x,y) = (((x>>>0)%(y>>>0))>>>0) (in JS).

> Well, maybe but I don't think so. It would only do it if we had a sequence
> like:
> 
> (some i32 expr X)
> i32.const 0
> i32.shr_u
> i32.const 42 ;; anything positive, really
> i32.rem_s
> 
> but that is correctly equivalent to, since X is explicitly marked as an
> unsigned value:
> 
> (some i32 expr X)
> i32.const 42
> i32.rem_u

But that produces different i32 bits if X is negative.  In fact, when I test locally, I can indeed see different behavior between Ion and Baseline in this specific case, so this is a bug for wasm too.
Duh, I should have checked before saying something stupid. Probably the entire optimization in RangeAnalysis should be removed then; cc'ing range analysis people who r+'d it in the first place.
Comment on attachment 8842831 [details] [diff] [review]
bug1343007-dont-wrap-constants-v2.patch

Review of attachment 8842831 [details] [diff] [review]:
-----------------------------------------------------------------

(clearing review until we understand more the MMod::computeRange situation)
Attachment #8842831 - Flags: review?(bbouvier)
Attachment #8842831 - Attachment is obsolete: true
I threw in some other cases and kept them even though I did not find any bugs with them.  Still thinking about how to add some more.
Attachment #8844491 - Flags: review?(luke)
Attached patch bug1343007-tri-state-mmod.patch (obsolete) — Splinter Review
As discussed, this makes it possible to flag the MMod as definitely-signed, definitely-unsigned, and "whatever JS requires", so as to do minimal harm to existing optimizations and JS semantics.
Attachment #8844492 - Flags: review?(nicolas.b.pierron)
With the former patch in place, we can remove the intermediate truncation nodes that are inserted by the constant folder.
Attachment #8844493 - Flags: review?(nicolas.b.pierron)
Comment on attachment 8844491 [details] [diff] [review]
bug1343007-test-case.patch

Review of attachment 8844491 [details] [diff] [review]:
-----------------------------------------------------------------

Thanks!
Attachment #8844491 - Flags: review?(luke) → review+
Comment on attachment 8844492 [details] [diff] [review]
bug1343007-tri-state-mmod.patch

Review of attachment 8844492 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/jit/MIR.h
@@ +7149,5 @@
>  
>  class MMod : public MBinaryArithInstruction
>  {
> +  public:
> +    enum class Interpretation {

I'd be curious to know why MMod needs this but MDiv doesn't.
(In reply to Luke Wagner [:luke] from comment #16)
> Comment on attachment 8844492 [details] [diff] [review]
> bug1343007-tri-state-mmod.patch
> 
> Review of attachment 8844492 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> ::: js/src/jit/MIR.h
> @@ +7149,5 @@
> >  
> >  class MMod : public MBinaryArithInstruction
> >  {
> > +  public:
> > +    enum class Interpretation {
> 
> I'd be curious to know why MMod needs this but MDiv doesn't.

And so you should be.  Changes proposed in bug 1331346 may create a need for the introduction of similar machinery in MDiv.  It is possible I should introduce it proactively when I land this patch.
Comment on attachment 8844492 [details] [diff] [review]
bug1343007-tri-state-mmod.patch

Review of attachment 8844492 [details] [diff] [review]:
-----------------------------------------------------------------

Note, Range Analysis bite us so many times in the past, that any modification in this file should have at least 2 reviewers.
The reason being that it is easy to miss conditions, context, and boundary cases when manipulating Range Analysis code.

::: js/src/jit/MIR.h
@@ +7149,5 @@
>  
>  class MMod : public MBinaryArithInstruction
>  {
> +  public:
> +    enum class Interpretation {

MMod needs this but not MDiv because I still have to review Bug 1331346.

@@ +7152,5 @@
> +  public:
> +    enum class Interpretation {
> +        JS,                     // According to JS semantics
> +        Signed,                 // Always signed
> +        Unsigned                // Always unsigned

nit: JS semantics/interpretation is: Double, but the specialization already gives us a Int32 type.

This naming still sounds miss-leading to me.  I would expect Range Analysis to convert a signed operation into an unsigned operation if it has the opportunity to do so.

I think this flag is not deciding whether this operation is signed or unsigned but is deciding whether the inputs of this operations are coerced to match the sign of this operation.

    bool unsigned_;            // Type of the MMod operation.
    bool coerceOperands_;      // If set, the unsigned flag is used to coerce the inputs.

Range Analysis can then be short-circuited, simply because the type of the inputs is already provided by MMod operation, and already matches the input type.

I will note that, even with WASM, if the inputs can be proved to remain unsigned even after being coerced, then we could convert a signed-MMod to an unsigned-MMod.

::: js/src/jit/RangeAnalysis.cpp
@@ +1583,5 @@
>  
>      // If both operands are non-negative integers, we can optimize this to an
>      // unsigned mod.
> +    if (specialization() == MIRType::Int32 && rhs.lower() > 0 &&
> +        interpretation_ == Interpretation::JS)

We might want to take this branch with signed MMod, in order to check if the operands are unsigned.

@@ +1593,5 @@
>          // Range::Range(). However, IsUint32Type() will only return true for
>          // nodes that lie in the range [0, UINT32_MAX].
>          bool hasUint32s = IsUint32Type(getOperand(0)) &&
>              getOperand(1)->type() == MIRType::Int32 &&
>              (IsUint32Type(getOperand(1)) || getOperand(1)->isConstant());

But this condition should change if MMod::coerceOperands_ is set, as the following modification should be applied:
   IsUint32Type(getOperand(0))  becomes  lhs->isInt32() && lhs->lower() >= 0
   IsUint32Type(getOperand(1))  becomes  rhs->isInt32() && rhs->lower() > 0
Attachment #8844492 - Flags: review?(nicolas.b.pierron)
Comment on attachment 8844493 [details] [diff] [review]
bug1343007-no-truncate-nodes.patch

Review of attachment 8844493 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/jit/MIR.cpp
@@ +3169,5 @@
>          if (isTruncated()) {
>              if (!folded->block())
>                  block()->insertBefore(this, folded);
> +            if (folded->type() != MIRType::Int32)
> +                return MTruncateToInt32::New(alloc, folded);

This one is fine and I think the MConstant value can be truncated directly, replacing the folded MConstant returned by EvaluateConstantOperands, but …

@@ +3184,5 @@
>  
>      if (IsConstant(rhs, getIdentity())) {
> +        if (isTruncated()) {
> +            if (lhs->type() != MIRType::Int32)
> +                return MTruncateToInt32::New(alloc, lhs);

this one and the 3 others are not fine, because lhs can be any expression and replacing the current instruction by a its operand effectively loose the "truncation" property associated with the instruction.

This actually matter for Math.imul example from Bug 1186271 comment 2.
Attachment #8844493 - Flags: review?(nicolas.b.pierron)
Had a long discussion with nbp about this.  The central issue is that the unsigned_ flag, if *not* set, does *not* mean that the operation is for sure performed signed, but that the operation is subject to further analysis starting with the types of the operands.  Setting unsigned_ to true later is a valid operation and Ion assumes JS semantics when deciding to do so.

MIR works by the principle that the operation does not affect the interpretation of input types generally, so the MIR-est way to fix the incorrect interpretation when the TruncateToInt32 wrapper on constant-folded nodes goes away would be to wrap the operands of Wasm's rem_s operation in ToInt32() nodes, which would prevent in a standard way their being erroneously interpreted as unsigned.  That wrapping might also prevent the desirable optimization wherein signed modulo operations that can be proven to result in the same value when performed unsigned are transformed to (faster) unsigned operations, however.  To do better we would then have to add additional optimizations, or to augment the MMod node further.  Both my patch here and nbp's suggestions on how it should be done augments the MMod node further, but this is probably not how it should be done.

Another fact that came out of this discussion is that in Ion, node->type() == MIRType::Int32 does not mean that the type of the node is in fact Int32, but that it is predicted to be Int32 yet may bail out.  That precludes certain transformations, such as those I proposed on my third patch here, because the TruncateToInt32 nodes I tried to remove in effect act as limiters on their inputs ("input could bail out but since we truncate here we don't want that").  In turn, I might be a little worried that this loose interpretation of MIRType::Int32 precludes certain optimizations that would be desirable for Wasm, but I don't have any actionable evidence for that at this time.  It precludes my constant folding "optimization" but other optimizations may strip the quasi-redundant truncate nodes anyway.
See comment 21 (especially) and earlier for background.

Insert ToInt32 nodes around the operands to Wasm signed Int32 div and mod to avoid the situation where RangeAnalysis turns what should be a signed operation into an unsigned operation, JS semantics allows that conversion and the "unsigned_" flag on div and mod does not mean "signed" when false.  The inserted ToInt32 nodes effectively act as casts.

Only mod needs this change now, because div has a more restrictive analysis.  So I could restrict myself to mod, but it doesn't seem harmful to be on the safe side here.

I have done this only for Int32, as the analysis applies only to Int32 nodes at this time (and I'd have to implement a suitable cast operation for Int64).  I'm in the process of adding Int64 test cases however, so we will catch the problem should the analysis ever include Int64.

This change does not move the needle on any Wasm benchmark.  It would be strange if it did, because the undesirable optimization is extremely narrow in scope (from the point of view of Wasm), although it may be generally useful for JS.
Attachment #8844492 - Attachment is obsolete: true
Attachment #8844493 - Attachment is obsolete: true
Attachment #8847197 - Flags: review?(nicolas.b.pierron)
Attachment #8847197 - Flags: review?(bbouvier)
Comment on attachment 8847197 [details] [diff] [review]
bug1343007-cast-to-signed.patch

Review of attachment 8847197 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/wasm/WasmIonCompile.cpp
@@ +629,5 @@
> +            // the operation being executed unsigned.
> +            //
> +            // Do this for Int32 only since Int64 is not subject to the same
> +            // issues.
> +            auto* lhs2 = MToInt32::New(alloc(), lhs);

note: MToInt32 and MTruncateToInt32 should not make any differences here, as they lower the same way when their input are Int32. The only difference is MTruncateToInt32 hints Range Analysis to attempt truncated-Double math with int32 values.

So, any of MToInt32 and MTruncateToInt32 should be fine here, as long as inputs are explicitly coerced from Double to Int32, as I expect from wasm.
Attachment #8847197 - Flags: review?(nicolas.b.pierron) → review+
Comment on attachment 8847197 [details] [diff] [review]
bug1343007-cast-to-signed.patch

Review of attachment 8847197 [details] [diff] [review]:
-----------------------------------------------------------------

Thanks and sorry for the long review time. Can you also add the test case from comment 8 mentioned to fail in comment 9, please?

::: js/src/wasm/WasmIonCompile.cpp
@@ +629,5 @@
> +            // the operation being executed unsigned.
> +            //
> +            // Do this for Int32 only since Int64 is not subject to the same
> +            // issues.
> +            auto* lhs2 = MToInt32::New(alloc(), lhs);

I'd rather go with MTruncateToInt32, which seems to be more explicit what we want do here. Plus the behavior of MToInt32 seems contrary to what we want when constructing a Range: http://searchfox.org/mozilla-central/source/js/src/jit/RangeAnalysis.cpp#601

@@ +631,5 @@
> +            // Do this for Int32 only since Int64 is not subject to the same
> +            // issues.
> +            auto* lhs2 = MToInt32::New(alloc(), lhs);
> +            curBlock_->add(lhs2);
> +            lhs = lhs2;

Could this just be

lhs = MToInt32:New(alloc(), lhs);
curBlock_->add(lhs);

@@ +654,5 @@
> +            // are not unsigned to Baldr (eg, unsigned right shifts) may lead to
> +            // the operation being executed unsigned.
> +            //
> +            // Do this for Int32 only since Int64 is not subject to the same
> +            // issues.

How about just saying "See comment in div" (and maybe mention there that this comment applies to mod() too)?
Attachment #8847197 - Flags: review?(bbouvier) → review+
(In reply to Benjamin Bouvier [:bbouvier] from comment #24)
> Comment on attachment 8847197 [details] [diff] [review]
> bug1343007-cast-to-signed.patch
> 
> Review of attachment 8847197 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> Thanks and sorry for the long review time. Can you also add the test case
> from comment 8 mentioned to fail in comment 9, please?

Included in the earlier patch containing test cases.

> @@ +631,5 @@
> > +            // Do this for Int32 only since Int64 is not subject to the same
> > +            // issues.
> > +            auto* lhs2 = MToInt32::New(alloc(), lhs);
> > +            curBlock_->add(lhs2);
> > +            lhs = lhs2;
> 
> Could this just be
> 
> lhs = MToInt32:New(alloc(), lhs);
> curBlock_->add(lhs);

Sadly not, because lhs is an MDefinition* and add() does not accept that.  I could perhaps futz with the type of lhs but that does not seem like an improvement.
Keywords: leave-open
This is the more limited case of a patch you've seen before, it avoids inserting a truncation node in constant folding when the operand is a constant that already has type Int32.
Attachment #8849470 - Flags: review?(nicolas.b.pierron)
Attachment #8849470 - Flags: review?(nicolas.b.pierron) → review+
Keywords: leave-open
Status: ASSIGNED → RESOLVED
Closed: 5 years ago
Resolution: --- → FIXED
https://hg.mozilla.org/integration/mozilla-inbound/rev/d13479bfbc633160a96f032d0ae23554910b3d75
Bug 1343007 - Do not insert redundant truncation nodes in constant folding. r=nbp
Oops, this was closed before all the patches landed.

Anyway, landed now.

Last try run for this:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=3e8bbbe07d126c8f3110cce23a81119c553beac2
You need to log in before you can comment on or make changes to this bug.