Closed Bug 736135 Opened 12 years ago Closed 12 years ago

IonMonkey: remove negative zero check when Mul get's followed by Sub or Add

Categories

(Core :: JavaScript Engine, defect)

x86_64
Linux
defect
Not set
normal

Tracking

()

RESOLVED FIXED

People

(Reporter: h4writer, Assigned: h4writer)

References

Details

Attachments

(2 files, 8 obsolete files)

Attached patch Patch (obsolete) — Splinter Review
We have a check to bailout to interpreter whenever Mul produces a negative zero.
Sometimes this isn't needed, because the opcode that uses this value does the same whenever -0 or 0 is provided

E.g:

Add:
-0 + 0 = 0
0 + -0 = 0
So we can remove the check when one operand can't generate a -0 (=all operands)
And in case we have two Mul operands, we can remove one of both checks.

Sub:
0 - 0 = 0
0 - (-0) = 0
So we can remove the check when a Mul is the second operand.
Attachment #606257 - Flags: review?(dvander)
Attachment #606257 - Attachment is patch: true
Assignee: general → hv1989
Comment on attachment 606257 [details] [diff] [review]
Patch

Review of attachment 606257 [details] [diff] [review]:
-----------------------------------------------------------------

Idea looks good, a few questions:
 (1) How did you find this case? Does it appear in a benchmark?
 (2) What sort of performance improvement does it get?
 (3) A few more related ideas have popped up, both from you and Marty. I think we should split this stuff out of GVN into a new range analysis pass that runs after both LICM and GVN. We can give MIR nodes a new virtual function for inferring properties relating to overflow and zero checks.
(1) It appears mostly in V8-crypto and v8-deltablue. 
Now in most cases the check gets removed indeed. Reasons it doesn't get removed (on deltablue) are the following:
- BOX(MUL): Can't be sure -0 and 0 are handled correctly
- MOD(MUL, x) or MOD(x, MUL): I just found out yesterday that x%(-0) == x%0. So this can get added
- BITAND: Should check when -0 == 0 in this case.

Nexto this DIVI should get the same optimizations and possibly add the other BIT operations too.

(2) Should give 1% improvement on deltablue. Didn't test crypto anymore
But I suspect even bigger gains eventually when we have range analysis. 
Because if we can remove the overflow check together with this the opcode isn't fallible anymore!

(3) Would indeed be better. I'll see if I find time to do this.
Status: NEW → ASSIGNED
Attached patch Adds range analysis phase (obsolete) — Splinter Review
This patch adds:
- Range Analysis pass after LICM, before Lowering. (No real range analysis yet! This is another bug)
- Now Div has also this negative zero check removal
- Added checks for bit operations
- Added check for Modulo

Deltablue.js:
- 13x times we try to remove negative zero check
- 11x succeeded
- 2x failed. 
-- Solution goes to "storefixedslot" op.

Crypto.js:
- 49x tried to remove negative zero check
- 21x succeeded
- 28x failed:
-- 14x solution goes to "Box" op
-- 5x solution goes to "Phi" op
-- 9x can't remove (e.g. 0 - x != -0 - x)

Most failures are because of storefixedslot, box and phi. 
- Check can't get removed with Box
- Not sure about Phi. Guess I just have to check the two definitions of Phi? I'll leave this as follow up bug for now
- Check on Storefixedslot can't get removed either, right?

@dvander: After a good review I'll let Kannan review the changes to his code (Div checks)
Attachment #606257 - Attachment is obsolete: true
Attachment #606257 - Flags: review?(dvander)
Attachment #606559 - Flags: review?(dvander)
Attached patch Part 2 - small improvements (obsolete) — Splinter Review
1) Fixes a typo in comment of MDIV
2) Adds two additional methods for removing the negative zero check. (Codegeneration already didn't add a bailout in those cases. So this doesn't improve performance atm. But now the canBeNegativeZero_ value is correct. Can make a different when we can remove the overflow check. Because at that moment fallible is false).
Attachment #606576 - Flags: review?(dvander)
Attached patch Part 1.1 add RangeAnalsys.h|cpp (obsolete) — Splinter Review
Woops, just noticed I forgot to add the .h|.cpp files
Attachment #606579 - Flags: review?(dvander)
Attached patch Part 3 - add Compare (obsolete) — Splinter Review
Noticed this when running performance tests on SS, that we could remove checks when result of DIV/MOD goes to compare. 
Every compare statement gives same result with 0 or -0. 
(0 CMP x equals -0 CMP x for CMP in {==, ===, !=, !==, <, <=, >=, >})

Measurements of V8 and SS tests where analysis gets run:
- v8-deltablue: noise
- v8-crypte: 1.03x - 1.04x increase in score
- ss-3d-cube: 0.9997x
- ss-3d-morph: 1.01x
- ss-crypto-aes: 0.998x
- ss-math-cordic: 1.01x
- ss-string-fasta: 1.001x

Unhandled MIR's atm:
      8x todouble: can't get removed
      3x div: can't get removed
      3x mul: can't get removed
      2x box: can't get removed
      8x boundscheck: ?
      2x storeelement: ?
      1x storeslot: ?
Attachment #606900 - Flags: review?(dvander)
Comment on attachment 606579 [details] [diff] [review]
Part 1.1 add RangeAnalsys.h|cpp

>+ * The Original Code is Mozilla Communicator client code, released
>+ * March 31, 1998.
>+ *
>+ * The Initial Developer of the Original Code is
>+ * Netscape Communications Corporation.
>+ * Portions created by the Initial Developer are Copyright (C) 1998
>+ * the Initial Developer. All Rights Reserved.

No, I don't think so, actually. (Use the MPL2 instead: http://www.mozilla.org/MPL/headers/.)
Comment on attachment 606559 [details] [diff] [review]
Adds range analysis phase

Review of attachment 606559 [details] [diff] [review]:
-----------------------------------------------------------------

I think RangeAnalysis.cpp is missing from this patch?

::: js/src/ion/MIR.h
@@ +320,5 @@
>          return false;
>      }
>      bool congruentIfOperandsEqual(MDefinition * const &ins) const;
>      virtual MDefinition *foldsTo(bool useValueNumbers);
> +    virtual void analyse();

This needs a more descriptive name, like analyzeRange
Comment on attachment 606579 [details] [diff] [review]
Part 1.1 add RangeAnalsys.h|cpp

Review of attachment 606579 [details] [diff] [review]:
-----------------------------------------------------------------

Err nevermind they're right here!
Attachment #606579 - Flags: review?(dvander) → review+
Comment on attachment 606559 [details] [diff] [review]
Adds range analysis phase

Review of attachment 606559 [details] [diff] [review]:
-----------------------------------------------------------------

(Don't forget to include RangeAnalysis.* with this patch)

::: js/src/ion/MIR.cpp
@@ +515,5 @@
>      }
>  }
>  
> +static inline bool
> +TryRemoveNegativeZeroCheck(MDefinition *def)

Can this be renamed to CanRemoveNegativeZeroCheck? Then true/false returns are easier to see.

@@ +534,5 @@
> +            if (operand->isMul()) {
> +                MMul *mul = operand->toMul();
> +                if (!mul->canBeNegativeZero()) {
> +                    return true;
> +                }

Nit: no braces needed here.

@@ +539,5 @@
> +            } else if (operand->isDiv()) {
> +                MDiv *div = operand->toDiv();
> +                if (!div->canBeNegativeZero()) {
> +                    return true;
> +                }

Ditto

::: js/src/ion/MIR.h
@@ +320,5 @@
>          return false;
>      }
>      bool congruentIfOperandsEqual(MDefinition * const &ins) const;
>      virtual MDefinition *foldsTo(bool useValueNumbers);
> +    virtual void analyse();

This needs a more descriptive name, like analyzeRange
Attachment #606559 - Flags: review?(dvander) → review+
Attachment #606576 - Flags: review?(dvander) → review+
Attachment #606900 - Flags: review?(dvander) → review+
Comment on attachment 606559 [details] [diff] [review]
Adds range analysis phase

Review of attachment 606559 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/ion/MIR.cpp
@@ +519,5 @@
> +TryRemoveNegativeZeroCheck(MDefinition *def)
> +{
> +    // Test if all uses have the same symantic for -0 and 0
> +    // Note: returns true if negative zero check is still needed!
> +    for (MUseDefIterator use(def); use; use++) {

Whoops, I forgot: MUseDefIterator will miss resume points. I think you want MUseIterator. Resume points must capture the -0, in case there's a bailout.
Comment on attachment 606559 [details] [diff] [review]
Adds range analysis phase

djvj, can you look at the changes I did to the MDIV? I eventually changed the code not to be computed at request time. This made it possible to remove even more checks... Normally the changes are just copy-pasted. Can you check if I didn't break anything. Just to be sure. (In Part 2 I also adjusted a faulty comment)

(In reply to David Anderson [:dvander] from comment #11)
> Comment on attachment 606559 [details] [diff] [review]
> Adds range analysis phase
> 
> Review of attachment 606559 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> ::: js/src/ion/MIR.cpp
> @@ +519,5 @@
> > +TryRemoveNegativeZeroCheck(MDefinition *def)
> > +{
> > +    // Test if all uses have the same symantic for -0 and 0
> > +    // Note: returns true if negative zero check is still needed!
> > +    for (MUseDefIterator use(def); use; use++) {
> 
> Whoops, I forgot: MUseDefIterator will miss resume points. I think you want
> MUseIterator. Resume points must capture the -0, in case there's a bailout.

I'm not sure about this! At least for this improvement. We remove the checks, because we deduct that the code following this instruction handles -0 or 0 in the same manor. So if we bail to the interpreter. The interpreter will get the wrong value (0 instead of -0), but will actually do the same calculations where there is no difference in 0 or -0 ... So I think it is not needed or am I missing something?

And what should I do with the license. All files in IonMonkey use the header I've used. But Ms2ger requests to use the MPL2 one?
Attachment #606559 - Flags: review?(kvijayan)
New files should use MPL2, yeah.
See Also: → 699883
Comment on attachment 606559 [details] [diff] [review]
Adds range analysis phase

Review of attachment 606559 [details] [diff] [review]:
-----------------------------------------------------------------

Looks good!  I'm seeing a 1% gain in the v8 benches overall, and I think there's potential for bigger wins down the line.

Please consider taking all the disparate patches and organizing them better.  Ideally, there should be one patch for introducing the scaffolding for range analysis, and then another one for adding the MDiv checks, and another one for adding the MMul checks.

::: js/src/ion/MIR.h
@@ +320,5 @@
>          return false;
>      }
>      bool congruentIfOperandsEqual(MDefinition * const &ins) const;
>      virtual MDefinition *foldsTo(bool useValueNumbers);
> +    virtual void analyse();

Ditto to what David said.  Just commenting again for emphasis.  "Analyse" is a really generic name and doesn't convey very much.  "analyzeRange" as David suggests is a better call, if this is going to go into the range analysis phase.
Attachment #606559 - Flags: review?(kvijayan) → review+
Comment on attachment 606559 [details] [diff] [review]
Adds range analysis phase

Review of attachment 606559 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/ion/MIR.cpp
@@ +533,5 @@
> +
> +            if (operand->isMul()) {
> +                MMul *mul = operand->toMul();
> +                if (!mul->canBeNegativeZero()) {
> +                    return true;

I just noticed something.  You're returning true if the other operand to the ADD is a MUL which can't be a negative zero.  This means that your canBeNegativeZero_ assignment for this operand (the DIV you're analyzing in this call) will be set to true.

That seems to be the opposite of what you're going for.  Seems like the negation on the conditional for 'if(!mul->canBeNegativeZero()) { ... }' should be removed.

Can you check this out and confirm and/or tell me I'm wrong (and why)?

@@ +538,5 @@
> +                }
> +            } else if (operand->isDiv()) {
> +                MDiv *div = operand->toDiv();
> +                if (!div->canBeNegativeZero()) {
> +                    return true;

See comment above.  I think the same issue applies here.
Attachment #606559 - Flags: review+ → review?(kvijayan)
Attachment #606559 - Flags: review?(kvijayan)
All nits addressed, carries r+ over:
- MPL 2.0
- analyse -> analyzeRange
- TryRemoveNegativeZeroCheck -> NeedNegativeZeroCheck
@dvander, I think this name captures it even better ?!
- brackets removed

About MUseDefIterator vs MUseIterator see next patch

@djvj:
The logic behind that piece is as following:
x + y equals negative zero ONLY when x == -0 AND y == -0.
So we can remove the check of x OR y, but not both!

Now assume x equals the definition where we want to remove the canBeNegativeZero and y equals the definition where we check the 'if(!mul->canBeNegativeZero()) { ... }' on.
So in this case x->canBeNegativeZero == true.
(Note: we do not know if we visit x or y first !!!!!)

-> if y->canBeNegativeZero == true:
both x and y have the canbeNegativeZero check. So we can remove the check on x.

-> if y->canBeNegativeZero == false:
y doesn't have a negativeZero check. This can happen in two cases.
1) We reasoned y can't produce a negative zero. (In this case we could remove the negative zero check on x, but we don't do it atm!)
2) We removed the check premature because when running analyzeRange on y, x had canBeNegativeZero == true.

Because we can't distinguish between 1) or 2) atm we don't remove the negative zero check in both cases. And that is also the reason why the check has a "!".

It is a bit stupid we can't remove the check in case of 1), but I don't care atm. It would be possible to mark which definitions we visited already. But whenever the real range analysis pass is implemented we will probably be able to check if the definitions have analysis information and get this information for free. Whenever that is possible we know which definitions has been visited and with a small adjustment in the logic above we can remove case 1) too! So I'll do that whenever range analysis is implemented.
Attachment #606559 - Attachment is obsolete: true
Attachment #606576 - Attachment is obsolete: true
Attachment #606579 - Attachment is obsolete: true
Attachment #606900 - Attachment is obsolete: true
Attachment #607989 - Flags: review+
(In reply to David Anderson [:dvander] from comment #11)
> ::: js/src/ion/MIR.cpp
> @@ +519,5 @@
> > +TryRemoveNegativeZeroCheck(MDefinition *def)
> > +{
> > +    // Test if all uses have the same symantic for -0 and 0
> > +    // Note: returns true if negative zero check is still needed!
> > +    for (MUseDefIterator use(def); use; use++) {
> 
> Whoops, I forgot: MUseDefIterator will miss resume points. I think you want
> MUseIterator. Resume points must capture the -0, in case there's a bailout.

I Could indeed reproduce it. I've added a testcase that shows this behavior and also the fix. As extra I added the MAbs opcode to the list of instruction that behave the same on -0 and 0.
Attachment #607990 - Flags: review?(dvander)
> @djvj:
> The logic behind that piece is as following:
> x + y equals negative zero ONLY when x == -0 AND y == -0.
> So we can remove the check of x OR y, but not both!
> 
> Now assume x equals the definition where we want to remove the
> canBeNegativeZero and y equals the definition where we check the
> 'if(!mul->canBeNegativeZero()) { ... }' on.
> So in this case x->canBeNegativeZero == true.
> (Note: we do not know if we visit x or y first !!!!!)
> 
> -> if y->canBeNegativeZero == true:
> both x and y have the canbeNegativeZero check. So we can remove the check on
> x.
> 
> -> if y->canBeNegativeZero == false:
> y doesn't have a negativeZero check. This can happen in two cases.
> 1) We reasoned y can't produce a negative zero. (In this case we could
> remove the negative zero check on x, but we don't do it atm!)
> 2) We removed the check premature because when running analyzeRange on y, x
> had canBeNegativeZero == true.
> 
> Because we can't distinguish between 1) or 2) atm we don't remove the
> negative zero check in both cases. And that is also the reason why the check
> has a "!".
> 
> It is a bit stupid we can't remove the check in case of 1), but I don't care
> atm. It would be possible to mark which definitions we visited already. But
> whenever the real range analysis pass is implemented we will probably be
> able to check if the definitions have analysis information and get this
> information for free. Whenever that is possible we know which definitions
> has been visited and with a small adjustment in the logic above we can
> remove case 1) too! So I'll do that whenever range analysis is implemented.

Awesome.  That explains it perfectly.  I'd suggest adding this reasoning as comment in the code.
(In reply to Kannan Vijayan [:djvj] from comment #18)
> Awesome.  That explains it perfectly.  I'd suggest adding this reasoning as
> comment in the code.

Done ;)
Attachment #607994 - Flags: review?(kvijayan)
Attachment #607990 - Flags: review?(dvander) → review+
Attachment #607994 - Flags: review?(kvijayan) → review+
Combined all pieces into one patch, together with commit msg.
Attachment #607989 - Attachment is obsolete: true
Attachment #607990 - Attachment is obsolete: true
Attachment #607994 - Attachment is obsolete: true
Attachment #608334 - Flags: review+
Keywords: checkin-needed
Whiteboard: checkin on ionmonkey branch
Hannes, have you considered asking for commit access?
http://hg.mozilla.org/projects/ionmonkey/rev/b01dfbbd3608
Status: ASSIGNED → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Keywords: checkin-needed
Whiteboard: checkin on ionmonkey branch
TBPL disapproved the changes:
https://tbpl.mozilla.org/?tree=Ionmonkey&rev=b01dfbbd3608
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
b01dfbbd3608 also caused a 620.9ms AWFY regression for sunspider on armv7
(In reply to Cyko_01 from comment #24)
> b01dfbbd3608 also caused a 620.9ms AWFY regression for sunspider on armv7

ss-cordic to be specific
(In reply to Nicolas B. Pierron [:pierron] from comment #23)
> TBPL disapproved the changes:
> https://tbpl.mozilla.org/?tree=Ionmonkey&rev=b01dfbbd3608

http://hg.mozilla.org/projects/ionmonkey/rev/f7335a8a93ed
This should fix it. The bug failed on the testcase I added here.
"congruentTo" can only be called from the foldsTo function and not from the pass I added. This because the congruentIfOperandsEqual called from MPhi checks if operands are equal by using the valuenumber(). But gvn hasn't run!

The small regression on SS (32bit only) is pretty normal. The extra pass increases the compilation time a bit on these very short functions. The regression on ss-cordic on armv7 was unexpected. Gonna look into that. If needed I can disable the analysis range pass?
http://hg.mozilla.org/projects/ionmonkey/rev/cc2329cebf20
Should fix the ss-cordic regression. This wasn't an arm specific bug. I tried to create a testcase for x86, but didn't succeed yet. I hope it will also fix the broken v8 tests. I can't reproduce the broken tests at home (neither before or after this landing).
Seems like eventually it did fix the broken v8 tests too. So can get closed!
Status: REOPENED → RESOLVED
Closed: 12 years ago12 years ago
Resolution: --- → FIXED
Attached patch Follow up fixSplinter Review
There was still a bug left in the implementation. When the value of a multiply gets used in an addition the negative zero check gets eagerly removed on at least one of the operands. Now there is a problem when the operands of the addition point to the same Mul, because removing it from one operand, will actually remove the check on both operands...
This patch contains testcase + fix.
Attachment #609894 - Flags: review?(dvander)
Reopened until follow-up fix gets in.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Attachment #609894 - Flags: review?(dvander) → review+
https://hg.mozilla.org/projects/ionmonkey/rev/24294fe42c19
Status: REOPENED → RESOLVED
Closed: 12 years ago12 years ago
Resolution: --- → FIXED
Depends on: 831087
No longer depends on: 831087
You need to log in before you can comment on or make changes to this bug.