Last Comment Bug 736135 - IonMonkey: remove negative zero check when Mul get's followed by Sub or Add
: IonMonkey: remove negative zero check when Mul get's followed by Sub or Add
Status: RESOLVED FIXED
:
Product: Core
Classification: Components
Component: JavaScript Engine (show other bugs)
: unspecified
: x86_64 Linux
: -- normal (vote)
: ---
Assigned To: Hannes Verschore [:h4writer]
:
Mentors:
Depends on:
Blocks:
  Show dependency treegraph
 
Reported: 2012-03-15 09:37 PDT by Hannes Verschore [:h4writer]
Modified: 2013-01-16 03:22 PST (History)
6 users (show)
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---


Attachments
Patch (2.13 KB, patch)
2012-03-15 09:37 PDT, Hannes Verschore [:h4writer]
no flags Details | Diff | Review
Adds range analysis phase (13.95 KB, patch)
2012-03-16 06:53 PDT, Hannes Verschore [:h4writer]
dvander: review+
Details | Diff | Review
Part 2 - small improvements (1.88 KB, patch)
2012-03-16 07:52 PDT, Hannes Verschore [:h4writer]
dvander: review+
Details | Diff | Review
Part 1.1 add RangeAnalsys.h|cpp (4.94 KB, patch)
2012-03-16 08:01 PDT, Hannes Verschore [:h4writer]
dvander: review+
Details | Diff | Review
Part 3 - add Compare (694 bytes, patch)
2012-03-17 14:33 PDT, Hannes Verschore [:h4writer]
dvander: review+
Details | Diff | Review
Part 1 - Try to remove negative zero checks (16.31 KB, patch)
2012-03-21 09:25 PDT, Hannes Verschore [:h4writer]
hv1989: review+
Details | Diff | Review
Part 2 - Fixes bug dvander mentioned (2.95 KB, patch)
2012-03-21 09:29 PDT, Hannes Verschore [:h4writer]
dvander: review+
Details | Diff | Review
Part 3 - add some extra explanation (1.68 KB, patch)
2012-03-21 09:56 PDT, Hannes Verschore [:h4writer]
kvijayan: review+
Details | Diff | Review
Remove unneeded negative zero checks (17.54 KB, patch)
2012-03-22 08:11 PDT, Hannes Verschore [:h4writer]
hv1989: review+
Details | Diff | Review
Follow up fix (1.93 KB, patch)
2012-03-27 14:47 PDT, Hannes Verschore [:h4writer]
dvander: review+
Details | Diff | Review

Description Hannes Verschore [:h4writer] 2012-03-15 09:37:26 PDT
Created attachment 606257 [details] [diff] [review]
Patch

We have a check to bailout to interpreter whenever Mul produces a negative zero.
Sometimes this isn't needed, because the opcode that uses this value does the same whenever -0 or 0 is provided

E.g:

Add:
-0 + 0 = 0
0 + -0 = 0
So we can remove the check when one operand can't generate a -0 (=all operands)
And in case we have two Mul operands, we can remove one of both checks.

Sub:
0 - 0 = 0
0 - (-0) = 0
So we can remove the check when a Mul is the second operand.
Comment 1 David Anderson [:dvander] 2012-03-15 17:43:24 PDT
Comment on attachment 606257 [details] [diff] [review]
Patch

Review of attachment 606257 [details] [diff] [review]:
-----------------------------------------------------------------

Idea looks good, a few questions:
 (1) How did you find this case? Does it appear in a benchmark?
 (2) What sort of performance improvement does it get?
 (3) A few more related ideas have popped up, both from you and Marty. I think we should split this stuff out of GVN into a new range analysis pass that runs after both LICM and GVN. We can give MIR nodes a new virtual function for inferring properties relating to overflow and zero checks.
Comment 2 Hannes Verschore [:h4writer] 2012-03-16 02:56:28 PDT
(1) It appears mostly in V8-crypto and v8-deltablue. 
Now in most cases the check gets removed indeed. Reasons it doesn't get removed (on deltablue) are the following:
- BOX(MUL): Can't be sure -0 and 0 are handled correctly
- MOD(MUL, x) or MOD(x, MUL): I just found out yesterday that x%(-0) == x%0. So this can get added
- BITAND: Should check when -0 == 0 in this case.

Nexto this DIVI should get the same optimizations and possibly add the other BIT operations too.

(2) Should give 1% improvement on deltablue. Didn't test crypto anymore
But I suspect even bigger gains eventually when we have range analysis. 
Because if we can remove the overflow check together with this the opcode isn't fallible anymore!

(3) Would indeed be better. I'll see if I find time to do this.
Comment 3 Hannes Verschore [:h4writer] 2012-03-16 06:53:55 PDT
Created attachment 606559 [details] [diff] [review]
Adds range analysis phase

This patch adds:
- Range Analysis pass after LICM, before Lowering. (No real range analysis yet! This is another bug)
- Now Div has also this negative zero check removal
- Added checks for bit operations
- Added check for Modulo

Deltablue.js:
- 13x times we try to remove negative zero check
- 11x succeeded
- 2x failed. 
-- Solution goes to "storefixedslot" op.

Crypto.js:
- 49x tried to remove negative zero check
- 21x succeeded
- 28x failed:
-- 14x solution goes to "Box" op
-- 5x solution goes to "Phi" op
-- 9x can't remove (e.g. 0 - x != -0 - x)

Most failures are because of storefixedslot, box and phi. 
- Check can't get removed with Box
- Not sure about Phi. Guess I just have to check the two definitions of Phi? I'll leave this as follow up bug for now
- Check on Storefixedslot can't get removed either, right?

@dvander: After a good review I'll let Kannan review the changes to his code (Div checks)
Comment 4 Hannes Verschore [:h4writer] 2012-03-16 07:52:52 PDT
Created attachment 606576 [details] [diff] [review]
Part 2 - small improvements

1) Fixes a typo in comment of MDIV
2) Adds two additional methods for removing the negative zero check. (Codegeneration already didn't add a bailout in those cases. So this doesn't improve performance atm. But now the canBeNegativeZero_ value is correct. Can make a different when we can remove the overflow check. Because at that moment fallible is false).
Comment 5 Hannes Verschore [:h4writer] 2012-03-16 08:01:01 PDT
Created attachment 606579 [details] [diff] [review]
Part 1.1 add RangeAnalsys.h|cpp

Woops, just noticed I forgot to add the .h|.cpp files
Comment 6 Hannes Verschore [:h4writer] 2012-03-17 14:33:16 PDT
Created attachment 606900 [details] [diff] [review]
Part 3 - add Compare

Noticed this when running performance tests on SS, that we could remove checks when result of DIV/MOD goes to compare. 
Every compare statement gives same result with 0 or -0. 
(0 CMP x equals -0 CMP x for CMP in {==, ===, !=, !==, <, <=, >=, >})

Measurements of V8 and SS tests where analysis gets run:
- v8-deltablue: noise
- v8-crypte: 1.03x - 1.04x increase in score
- ss-3d-cube: 0.9997x
- ss-3d-morph: 1.01x
- ss-crypto-aes: 0.998x
- ss-math-cordic: 1.01x
- ss-string-fasta: 1.001x

Unhandled MIR's atm:
      8x todouble: can't get removed
      3x div: can't get removed
      3x mul: can't get removed
      2x box: can't get removed
      8x boundscheck: ?
      2x storeelement: ?
      1x storeslot: ?
Comment 7 :Ms2ger 2012-03-17 14:44:02 PDT
Comment on attachment 606579 [details] [diff] [review]
Part 1.1 add RangeAnalsys.h|cpp

>+ * The Original Code is Mozilla Communicator client code, released
>+ * March 31, 1998.
>+ *
>+ * The Initial Developer of the Original Code is
>+ * Netscape Communications Corporation.
>+ * Portions created by the Initial Developer are Copyright (C) 1998
>+ * the Initial Developer. All Rights Reserved.

No, I don't think so, actually. (Use the MPL2 instead: http://www.mozilla.org/MPL/headers/.)
Comment 8 David Anderson [:dvander] 2012-03-19 12:32:13 PDT
Comment on attachment 606559 [details] [diff] [review]
Adds range analysis phase

Review of attachment 606559 [details] [diff] [review]:
-----------------------------------------------------------------

I think RangeAnalysis.cpp is missing from this patch?

::: js/src/ion/MIR.h
@@ +320,5 @@
>          return false;
>      }
>      bool congruentIfOperandsEqual(MDefinition * const &ins) const;
>      virtual MDefinition *foldsTo(bool useValueNumbers);
> +    virtual void analyse();

This needs a more descriptive name, like analyzeRange
Comment 9 David Anderson [:dvander] 2012-03-19 12:32:53 PDT
Comment on attachment 606579 [details] [diff] [review]
Part 1.1 add RangeAnalsys.h|cpp

Review of attachment 606579 [details] [diff] [review]:
-----------------------------------------------------------------

Err nevermind they're right here!
Comment 10 David Anderson [:dvander] 2012-03-19 12:39:23 PDT
Comment on attachment 606559 [details] [diff] [review]
Adds range analysis phase

Review of attachment 606559 [details] [diff] [review]:
-----------------------------------------------------------------

(Don't forget to include RangeAnalysis.* with this patch)

::: js/src/ion/MIR.cpp
@@ +515,5 @@
>      }
>  }
>  
> +static inline bool
> +TryRemoveNegativeZeroCheck(MDefinition *def)

Can this be renamed to CanRemoveNegativeZeroCheck? Then true/false returns are easier to see.

@@ +534,5 @@
> +            if (operand->isMul()) {
> +                MMul *mul = operand->toMul();
> +                if (!mul->canBeNegativeZero()) {
> +                    return true;
> +                }

Nit: no braces needed here.

@@ +539,5 @@
> +            } else if (operand->isDiv()) {
> +                MDiv *div = operand->toDiv();
> +                if (!div->canBeNegativeZero()) {
> +                    return true;
> +                }

Ditto

::: js/src/ion/MIR.h
@@ +320,5 @@
>          return false;
>      }
>      bool congruentIfOperandsEqual(MDefinition * const &ins) const;
>      virtual MDefinition *foldsTo(bool useValueNumbers);
> +    virtual void analyse();

This needs a more descriptive name, like analyzeRange
Comment 11 David Anderson [:dvander] 2012-03-19 12:43:21 PDT
Comment on attachment 606559 [details] [diff] [review]
Adds range analysis phase

Review of attachment 606559 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/ion/MIR.cpp
@@ +519,5 @@
> +TryRemoveNegativeZeroCheck(MDefinition *def)
> +{
> +    // Test if all uses have the same symantic for -0 and 0
> +    // Note: returns true if negative zero check is still needed!
> +    for (MUseDefIterator use(def); use; use++) {

Whoops, I forgot: MUseDefIterator will miss resume points. I think you want MUseIterator. Resume points must capture the -0, in case there's a bailout.
Comment 12 Hannes Verschore [:h4writer] 2012-03-20 01:57:45 PDT
Comment on attachment 606559 [details] [diff] [review]
Adds range analysis phase

djvj, can you look at the changes I did to the MDIV? I eventually changed the code not to be computed at request time. This made it possible to remove even more checks... Normally the changes are just copy-pasted. Can you check if I didn't break anything. Just to be sure. (In Part 2 I also adjusted a faulty comment)

(In reply to David Anderson [:dvander] from comment #11)
> Comment on attachment 606559 [details] [diff] [review]
> Adds range analysis phase
> 
> Review of attachment 606559 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> ::: js/src/ion/MIR.cpp
> @@ +519,5 @@
> > +TryRemoveNegativeZeroCheck(MDefinition *def)
> > +{
> > +    // Test if all uses have the same symantic for -0 and 0
> > +    // Note: returns true if negative zero check is still needed!
> > +    for (MUseDefIterator use(def); use; use++) {
> 
> Whoops, I forgot: MUseDefIterator will miss resume points. I think you want
> MUseIterator. Resume points must capture the -0, in case there's a bailout.

I'm not sure about this! At least for this improvement. We remove the checks, because we deduct that the code following this instruction handles -0 or 0 in the same manor. So if we bail to the interpreter. The interpreter will get the wrong value (0 instead of -0), but will actually do the same calculations where there is no difference in 0 or -0 ... So I think it is not needed or am I missing something?

And what should I do with the license. All files in IonMonkey use the header I've used. But Ms2ger requests to use the MPL2 one?
Comment 13 David Anderson [:dvander] 2012-03-20 02:40:03 PDT
New files should use MPL2, yeah.
Comment 14 Kannan Vijayan [:djvj] 2012-03-20 10:41:45 PDT
Comment on attachment 606559 [details] [diff] [review]
Adds range analysis phase

Review of attachment 606559 [details] [diff] [review]:
-----------------------------------------------------------------

Looks good!  I'm seeing a 1% gain in the v8 benches overall, and I think there's potential for bigger wins down the line.

Please consider taking all the disparate patches and organizing them better.  Ideally, there should be one patch for introducing the scaffolding for range analysis, and then another one for adding the MDiv checks, and another one for adding the MMul checks.

::: js/src/ion/MIR.h
@@ +320,5 @@
>          return false;
>      }
>      bool congruentIfOperandsEqual(MDefinition * const &ins) const;
>      virtual MDefinition *foldsTo(bool useValueNumbers);
> +    virtual void analyse();

Ditto to what David said.  Just commenting again for emphasis.  "Analyse" is a really generic name and doesn't convey very much.  "analyzeRange" as David suggests is a better call, if this is going to go into the range analysis phase.
Comment 15 Kannan Vijayan [:djvj] 2012-03-20 14:44:29 PDT
Comment on attachment 606559 [details] [diff] [review]
Adds range analysis phase

Review of attachment 606559 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/ion/MIR.cpp
@@ +533,5 @@
> +
> +            if (operand->isMul()) {
> +                MMul *mul = operand->toMul();
> +                if (!mul->canBeNegativeZero()) {
> +                    return true;

I just noticed something.  You're returning true if the other operand to the ADD is a MUL which can't be a negative zero.  This means that your canBeNegativeZero_ assignment for this operand (the DIV you're analyzing in this call) will be set to true.

That seems to be the opposite of what you're going for.  Seems like the negation on the conditional for 'if(!mul->canBeNegativeZero()) { ... }' should be removed.

Can you check this out and confirm and/or tell me I'm wrong (and why)?

@@ +538,5 @@
> +                }
> +            } else if (operand->isDiv()) {
> +                MDiv *div = operand->toDiv();
> +                if (!div->canBeNegativeZero()) {
> +                    return true;

See comment above.  I think the same issue applies here.
Comment 16 Hannes Verschore [:h4writer] 2012-03-21 09:25:56 PDT
Created attachment 607989 [details] [diff] [review]
Part 1 - Try to remove negative zero checks

All nits addressed, carries r+ over:
- MPL 2.0
- analyse -> analyzeRange
- TryRemoveNegativeZeroCheck -> NeedNegativeZeroCheck
@dvander, I think this name captures it even better ?!
- brackets removed

About MUseDefIterator vs MUseIterator see next patch

@djvj:
The logic behind that piece is as following:
x + y equals negative zero ONLY when x == -0 AND y == -0.
So we can remove the check of x OR y, but not both!

Now assume x equals the definition where we want to remove the canBeNegativeZero and y equals the definition where we check the 'if(!mul->canBeNegativeZero()) { ... }' on.
So in this case x->canBeNegativeZero == true.
(Note: we do not know if we visit x or y first !!!!!)

-> if y->canBeNegativeZero == true:
both x and y have the canbeNegativeZero check. So we can remove the check on x.

-> if y->canBeNegativeZero == false:
y doesn't have a negativeZero check. This can happen in two cases.
1) We reasoned y can't produce a negative zero. (In this case we could remove the negative zero check on x, but we don't do it atm!)
2) We removed the check premature because when running analyzeRange on y, x had canBeNegativeZero == true.

Because we can't distinguish between 1) or 2) atm we don't remove the negative zero check in both cases. And that is also the reason why the check has a "!".

It is a bit stupid we can't remove the check in case of 1), but I don't care atm. It would be possible to mark which definitions we visited already. But whenever the real range analysis pass is implemented we will probably be able to check if the definitions have analysis information and get this information for free. Whenever that is possible we know which definitions has been visited and with a small adjustment in the logic above we can remove case 1) too! So I'll do that whenever range analysis is implemented.
Comment 17 Hannes Verschore [:h4writer] 2012-03-21 09:29:06 PDT
Created attachment 607990 [details] [diff] [review]
Part 2 - Fixes bug dvander mentioned

(In reply to David Anderson [:dvander] from comment #11)
> ::: js/src/ion/MIR.cpp
> @@ +519,5 @@
> > +TryRemoveNegativeZeroCheck(MDefinition *def)
> > +{
> > +    // Test if all uses have the same symantic for -0 and 0
> > +    // Note: returns true if negative zero check is still needed!
> > +    for (MUseDefIterator use(def); use; use++) {
> 
> Whoops, I forgot: MUseDefIterator will miss resume points. I think you want
> MUseIterator. Resume points must capture the -0, in case there's a bailout.

I Could indeed reproduce it. I've added a testcase that shows this behavior and also the fix. As extra I added the MAbs opcode to the list of instruction that behave the same on -0 and 0.
Comment 18 Kannan Vijayan [:djvj] 2012-03-21 09:37:47 PDT
> @djvj:
> The logic behind that piece is as following:
> x + y equals negative zero ONLY when x == -0 AND y == -0.
> So we can remove the check of x OR y, but not both!
> 
> Now assume x equals the definition where we want to remove the
> canBeNegativeZero and y equals the definition where we check the
> 'if(!mul->canBeNegativeZero()) { ... }' on.
> So in this case x->canBeNegativeZero == true.
> (Note: we do not know if we visit x or y first !!!!!)
> 
> -> if y->canBeNegativeZero == true:
> both x and y have the canbeNegativeZero check. So we can remove the check on
> x.
> 
> -> if y->canBeNegativeZero == false:
> y doesn't have a negativeZero check. This can happen in two cases.
> 1) We reasoned y can't produce a negative zero. (In this case we could
> remove the negative zero check on x, but we don't do it atm!)
> 2) We removed the check premature because when running analyzeRange on y, x
> had canBeNegativeZero == true.
> 
> Because we can't distinguish between 1) or 2) atm we don't remove the
> negative zero check in both cases. And that is also the reason why the check
> has a "!".
> 
> It is a bit stupid we can't remove the check in case of 1), but I don't care
> atm. It would be possible to mark which definitions we visited already. But
> whenever the real range analysis pass is implemented we will probably be
> able to check if the definitions have analysis information and get this
> information for free. Whenever that is possible we know which definitions
> has been visited and with a small adjustment in the logic above we can
> remove case 1) too! So I'll do that whenever range analysis is implemented.

Awesome.  That explains it perfectly.  I'd suggest adding this reasoning as comment in the code.
Comment 19 Hannes Verschore [:h4writer] 2012-03-21 09:56:37 PDT
Created attachment 607994 [details] [diff] [review]
Part 3 - add some extra explanation

(In reply to Kannan Vijayan [:djvj] from comment #18)
> Awesome.  That explains it perfectly.  I'd suggest adding this reasoning as
> comment in the code.

Done ;)
Comment 20 Hannes Verschore [:h4writer] 2012-03-22 08:11:15 PDT
Created attachment 608334 [details] [diff] [review]
Remove unneeded negative zero checks

Combined all pieces into one patch, together with commit msg.
Comment 21 Ryan VanderMeulen [:RyanVM] 2012-03-22 18:26:05 PDT
Hannes, have you considered asking for commit access?
Comment 22 Hannes Verschore [:h4writer] 2012-03-24 09:55:51 PDT
http://hg.mozilla.org/projects/ionmonkey/rev/b01dfbbd3608
Comment 23 Nicolas B. Pierron [:nbp] 2012-03-24 14:36:43 PDT
TBPL disapproved the changes:
https://tbpl.mozilla.org/?tree=Ionmonkey&rev=b01dfbbd3608
Comment 24 Cyko_01 2012-03-24 14:41:12 PDT
b01dfbbd3608 also caused a 620.9ms AWFY regression for sunspider on armv7
Comment 25 Cyko_01 2012-03-24 14:43:42 PDT
(In reply to Cyko_01 from comment #24)
> b01dfbbd3608 also caused a 620.9ms AWFY regression for sunspider on armv7

ss-cordic to be specific
Comment 26 Hannes Verschore [:h4writer] 2012-03-24 17:37:43 PDT
(In reply to Nicolas B. Pierron [:pierron] from comment #23)
> TBPL disapproved the changes:
> https://tbpl.mozilla.org/?tree=Ionmonkey&rev=b01dfbbd3608

http://hg.mozilla.org/projects/ionmonkey/rev/f7335a8a93ed
This should fix it. The bug failed on the testcase I added here.
"congruentTo" can only be called from the foldsTo function and not from the pass I added. This because the congruentIfOperandsEqual called from MPhi checks if operands are equal by using the valuenumber(). But gvn hasn't run!

The small regression on SS (32bit only) is pretty normal. The extra pass increases the compilation time a bit on these very short functions. The regression on ss-cordic on armv7 was unexpected. Gonna look into that. If needed I can disable the analysis range pass?
Comment 27 Hannes Verschore [:h4writer] 2012-03-25 08:01:07 PDT
http://hg.mozilla.org/projects/ionmonkey/rev/cc2329cebf20
Should fix the ss-cordic regression. This wasn't an arm specific bug. I tried to create a testcase for x86, but didn't succeed yet. I hope it will also fix the broken v8 tests. I can't reproduce the broken tests at home (neither before or after this landing).
Comment 28 Hannes Verschore [:h4writer] 2012-03-26 12:27:11 PDT
Seems like eventually it did fix the broken v8 tests too. So can get closed!
Comment 29 Hannes Verschore [:h4writer] 2012-03-27 14:47:02 PDT
Created attachment 609894 [details] [diff] [review]
Follow up fix

There was still a bug left in the implementation. When the value of a multiply gets used in an addition the negative zero check gets eagerly removed on at least one of the operands. Now there is a problem when the operands of the addition point to the same Mul, because removing it from one operand, will actually remove the check on both operands...
This patch contains testcase + fix.
Comment 30 Hannes Verschore [:h4writer] 2012-03-28 05:27:20 PDT
Reopened until follow-up fix gets in.
Comment 31 Hannes Verschore [:h4writer] 2012-03-29 14:22:31 PDT
https://hg.mozilla.org/projects/ionmonkey/rev/24294fe42c19

Note You need to log in before you can comment on or make changes to this bug.