Closed Bug 699883 Opened 13 years ago Closed 12 years ago

IonMonkey: Implement range analysis

Categories

(Core :: JavaScript Engine, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED

People

(Reporter: jandem, Unassigned)

References

(Blocks 2 open bugs)

Details

Attachments

(3 files, 2 obsolete files)

Sunspider's crypto-md5 performance depends on this function:
--
function safe_add(x, y)  {
    var lsw = (x & 0xFFFF) + (y & 0xFFFF);
    var msw = (x >> 16) + (y >> 16) + (lsw >> 16);
    var ret = (msw << 16) | (lsw & 0xFFFF);
    return ret;
}
--
The code IM generates is quite good, but we need range analysis to determine that the add instructions cannot overflow. Another common case where this could help is the i++ in typical for-loops.

Any thoughts?
Note that TI provides information to handle the i++ case in loops.
I seem to remember another Ionmonkey bug which mentions the need for range analysis, but can't find it right now.
Another case where IonMonkey could benefit from the range analysis is (sunspider/string-validate-input.js):

      var l = Math.floor(9*Math.random());
      tmp = tmp.concat(l);

This could benefit from range analysis because «l» is in the range [0..9] and of type Int32.  String.concat is replaced by MConcat which can then check it's inputs and convert the digit to a string by using a simple array access instead of an IntToString function call.
(In reply to Paul Wright from comment #2)
> I seem to remember another Ionmonkey bug which mentions the need for range
> analysis, but can't find it right now.

That other bug is Bug 688078 comment 8
I'm taking this bug because we're going to implement range analysis for a class project in CMU's Optimizing Compilers for Modern Architectures (https://www.cs.cmu.edu/afs/cs.cmu.edu/academic/class/15745-s12/www/)
Status: NEW → ASSIGNED
Attached patch WIP patch for range analysis (obsolete) — Splinter Review
Ryan and I have a work in progress patch that adds an SSA based range analysis pass and will omit bounds checks for adds, multiplies, and subtractions that we can prove won't overflow. This gets some pretty good speedups in contrived microbenchmarks, but we haven't really analyzed it on real programs yet.
Comment on attachment 618857 [details] [diff] [review]
WIP patch for range analysis

Review of attachment 618857 [details] [diff] [review]:
-----------------------------------------------------------------

This suffers from the same OSR ails as gvn.  If you have a for loop:
var arr = new Array(1001);
for (var i = 0; i < 1000; i++) {
  arr[i] = 0;
}
arr[i] = 1
and we OSR into this loop, there is no information about the OSR values, so (I believe) it can't remove all of the range checks. Theoretically, we should know that all of the values that are generated from OSR were values that had been generated by this code initially, but these assumptions aren't really things that we can make.

Since the paper made a bunch of explicit references to bounds checks, I'm assuming you're planning on incorporating the bounds check into the range analysis, as well as eliminating bounds checks

::: ion/Ion.cpp
@@ +714,5 @@
> +            return false;
> +        IonSpewPass("De-Beta");
> +        AssertGraphCoherency(graph);
> +    }
> +

Since range analysis is being run after GVN, you probably want to use some GVN knowledge when computing the range information.

::: ion/IonAnalysis.cpp
@@ +582,5 @@
> +            if (!worklist.append(block->getImmediatelyDominatedBlock(i)))
> +                return false;
> +        }
> +        index++;
> +    }

the comment says "save the index", but as far as I can tell, the index of the block is getting overwritten.

::: ion/Lowering.cpp
@@ +567,5 @@
>          LAddI *lir = new LAddI;
> +        // If the result isn't truncated and we don't have range
> +        // analysis proving that it can't overflow, then we need to
> +        // create a bailout.
> +        if (!ins->isTruncated() && !ins->range()->isFinite()) {

I'd make this a single function call that returns this value, since we theoretically will be adding more optimizations in the future, and having a large sequence of !a && !b && !c ... will become cumbersome.

::: ion/MIR.h
@@ +49,1 @@
>  #include "jscntxt.h"

Is this just being included for min and max? I'm decently sure the mozilla codebase has these functions.

@@ +259,5 @@
>      uint32 id_;                    // Instruction ID, which after block re-ordering
>                                     // is sorted within a basic block.
>      ValueNumberData *valueNumber_; // The instruction's value number (see GVN for details in use)
> +    Range range_;                  // The most specific known range for this def.
> +

Range looked kind of large (12 bytes), do we really want to have that embedded in every MDefinition, even if range analysis isn't turned on?

::: ion/Range.cpp
@@ +44,5 @@
> +    }
> +#endif
> +}
> +
> +// XXX I *think* we just wanted MUseDefIterator (which skips bailout points

)

@@ +82,5 @@
> +
> +        if (left->isConstant() && left->toConstant()->value().isInt32()) {
> +            bound = left->toConstant()->value().toInt32();
> +            val = right;
> +            jsop = analyze::NegateCompareOp(jsop);

I suspect this doesn't do what you want, it leterally negates the operation, so 5 != x gets translated to x == 5

@@ +170,5 @@
> +    setUpper(std::min(upper_, other->upper_));
> +    setLower(std::max(lower_, other->lower_));
> +    // FIXME: This is completely not true: upper_ being less than
> +    // lower_ means that the range is *empty*, not infinite!. How
> +    // should we deal with this?

I'd say we should add some generic way of marking blocks/instructions as unreachable.  sstangl currently has a (bad) way to mark blocks as dead.

@@ +221,5 @@
> +    int32 shift = c & 0x1f;
> +    setUpper(upper_ >> shift);
> +    setLower(lower_ >> shift);
> +}
> +

All of the above functions modify |this|. My personal preference would be to make pure functions with type Range * Range -> Range, so you don't need to explicitly save a copy of any Range object you need to perform an operation with.  This would also allow set to take a Range in and return if the new range is different from the old range, so the last line of recomputeRange would be |return range()->update(newRange);|
I also assume you're planning on implementing ushr, mod, div, floor, ceil, round, and most of the other math operations.

::: ion/Range.h
@@ +22,5 @@
> + * Portions created by the Initial Developer are Copyright (C) 1998
> + * the Initial Developer. All Rights Reserved.
> + *
> + * Contributor(s):
> + *   Ryan Pearl <rpearl@andrew.cmu.edu>

this email address is going to go away somewhat shortly...

@@ +82,5 @@
> +        // direction. When computing the range of an integer
> +        // instruction, the ranges of the operands can be clamped to
> +        // [INT_MIN, INT_MAX], since if they had overflowed they would
> +        // no longer be integers. This is important for optimizations
> +        // and somewhat subtle.

You may want to change this a bit.  Namely, when bit operations are present (or other operations that truncate their inputs), a*b > 2**32 is fine, but a*b > 2**53 is problematic, since (uint64)a*(uint64)b != (double)a * (double) b. Since emscripten uses bitor to convert values that have moved out of the integer range into int32s, knowing that integer multiplication does the right thing is actually useful.

::: shell/js.cpp
@@ +5020,5 @@
>                                 "Loop invariant code motion (default: on, off to disable)")
>          || !op.addStringOption('\0', "ion-range-analysis", "on/off",
>                                 "Range Analysis (default: on, off to disable)")
> +        || !op.addStringOption('\0', "ion-real-range-analysis", "on/off",
> +                               "Real Range Analysis (default: off, on to enable)")

All of our optimizations default to on, probably want to do that here.
Updated our patch to take into account some of Marty's suggestions.
Attachment #618857 - Attachment is obsolete: true
Attachment #619236 - Attachment is obsolete: true
Attachment #633380 - Flags: review?(dvander)
It is probably a good idea to make GVN use this functionality too--it re-implements it.
Attachment #633381 - Flags: review?(dvander)
Attachment #633383 - Flags: review?(mrosenberg)
Attachment #633383 - Flags: review?(dvander)
Follow-ups:
These are genuinely bugs or bad, and should be fixed Real Soon (I probably have time to do this).
Bug 765119 - fix iloop in Range Analysis
Bug 765126 - range analysis needs to allocate lazily
When Michael Sullivan and I wrote this feature, we did so for a class. We also wrote a short term paper summarizing the results of our pass and suggesting future work. The paper is available here: http://www.endofunctor.org/~cmplrz/paper.pdf

The paper suggests a few features based on range analysis as future work. I'm sure more are possible as well.

Follow-up features:
Bug 765128 - use range analysis to eliminate and move bounds checks
Bug 765127 - use range analysis to eliminate dead code
Blocks: 765119
Blocks: 765126
Blocks: 765128
Blocks: 765127
Attachment #633380 - Flags: review?(dvander) → review+
Comment on attachment 633381 [details] [diff] [review]
[2/3] Add a fast way to check if one block dominates another

Review of attachment 633381 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/ion/IonAnalysis.cpp
@@ +585,5 @@
>      // Without OSR, there is only one root block which dominates all.
>      if (!graph.osrBlock())
>          JS_ASSERT(graph.begin()->numDominated() == graph.numBlocks() - 1);
>  #endif
> +    // Now, iterate through the dominator tree and annotate every

nit: newline above comment

@@ +602,5 @@
> +            if (!worklist.append(block))
> +                return false;
> +        }
> +    }
> +    // Starting from each self-dominating block, traverse the CFG in pre-order.

nit: newline above comment
Attachment #633381 - Flags: review?(dvander) → review+
Comment on attachment 633383 [details] [diff] [review]
[3/3] The actual range analysis patch

Review of attachment 633383 [details] [diff] [review]:
-----------------------------------------------------------------

Great work, thanks for doing this! Patch is very readable and the algorithm seems fairly straightforward. Most of my comments are nits, but I have a few meta comments:

(1) Thinking more about the iloop bug, we have plenty of time, we shouldn't rush this just to land it. It might turn out that fixing it requires significant changes (from GVN experience), so we should investigate that first.

(2) Somewhere in RangeAnalysis.* there should be a big comment explaining the algorithm at a high-level. Also we should cite any papers involved.

::: js/src/ion/Lowering.cpp
@@ +1119,5 @@
>  
>  bool
>  LIRGenerator::visitBoundsCheckLower(MBoundsCheckLower *ins)
>  {
> +    if (!ins->fallible()) return true;

nit: return true on newline

::: js/src/ion/MIR.h
@@ +261,5 @@
>      ValueNumberData *valueNumber_; // The instruction's value number (see GVN for details in use)
> +
> +    // Bug 765126: This should be a pointer. The range should only be allocated if range analysis is
> +    // enabled.
> +    Range range_;                  // The most specific known range for this def.

Yeah, I think this should be allocated on-demand, if it doesn't add much complexity. range() could return a |const Range *| (for maybe a sentinel one for infinite ranges), and MInstruction could get an updateRange function instead that allocates if needed.

@@ +1981,5 @@
> +            return false;
> +
> +            int32 c = right->toConstant()->value().toInt32();
> +            Range *other = getOperand(0)->range();
> +            return range()->update(Range::shr(other, c));

nit: unindent these three lines

@@ +2104,5 @@
> +
> +        Range *other = getOperand(0)->range();
> +        Range r(0,
> +                Max(llabs((int64_t)other->lower()),
> +                    llabs((int64_t)other->upper())));

llabs doesn't look available on Windows, where it seems to be _abs64 instead. Why are these casted to int64?

@@ +2352,5 @@
> +        if (specialization() != MIRType_Int32)
> +            return false;
> +        Range *other = getOperand(0)->range();
> +        int64_t a = llabs((int64_t)other->lower());
> +        int64_t b = llabs((int64_t)other->upper());

nits: Same comment about llabs. Also a comment here explaining the new range would be good.

@@ +2528,5 @@
> +        : MUnaryInstruction(val),
> +          comparison_(low, high),
> +          val_(val)
> +    {
> +        //setResultType(MIRType_Value);

nit: Remove this line

::: js/src/ion/RangeAnalysis.cpp
@@ +16,5 @@
> +using namespace js::ion;
> +
> +RangeAnalysis::RangeAnalysis(MIRGraph &graph)
> +  : graph_(graph)
> +{ }

nit: } on newline

@@ +43,5 @@
> +    }
> +#endif
> +}
> +
> +// XXX I *think* we just wanted MUseDefIterator (which skips bailout points)

Answer is yes :) (you can remove this comment anyway)

@@ +67,5 @@
> +        IonSpew(IonSpew_Range, "Looking at block %d", block->id());
> +
> +        BranchDirection branch_dir;
> +        MTest *test = block->immediateDominatorBranch(&branch_dir);
> +        if (!test || !test->getOperand(0)->isCompare()) continue;

nit: continue on newline

@@ +194,5 @@
> +        Min(lhs->upper_, rhs->upper_));
> +
> +    // FIXME: This is completely not true: upper_ being less than
> +    // lower_ means that the range is *empty*, not infinite!. How
> +    // should we deal with this?

What is the significance of this case?

::: js/src/ion/RangeAnalysis.h
@@ +73,5 @@
> +    private:
> +        /* TODO: we should do symbolic range evaluation, where we have
> +         * information of the form v1 < v2 for arbitrary defs v1 and v2, not
> +         * just constants
> +         */

nit: refactor this comment to not be a :TODO: (or, cite a bug), use // instead

@@ +95,5 @@
> +        // 2) upper_infinite == true implies upper_ == JSVAL_INT_MAX
> +        int32 lower_;
> +        bool lower_infinite_;
> +        int32 upper_;
> +        bool upper_infinite_;

You might want to consider either just using int64 for simplicity, or making JSVAL_INT_MIN/MAX mean -/+ infinity.

@@ +112,5 @@
> +        }
> +
> +        void printRange(FILE *fp);
> +        bool update(const Range *other);
> +        bool update(const Range &other) { return update(&other); }

nit: IonMonkey style for inlines is to use two newlines:
        bool update(const Range &other) {
            return update(&other);
        }

@@ +125,5 @@
> +        static Range add(const Range *lhs, const Range *rhs);
> +        static Range sub(const Range *lhs, const Range *rhs);
> +        static Range mul(const Range *lhs, const Range *rhs);
> +
> +        /* TODO: we probably want a function to add by a constant */

nit: remove this comment
Attachment #633383 - Flags: review?(dvander) → review+
(In reply to David Anderson [:dvander] from comment #15)

Working on addressing the nits right now.

> 
> (1) Thinking more about the iloop bug, we have plenty of time, we shouldn't
> rush this just to land it. It might turn out that fixing it requires
> significant changes (from GVN experience), so we should investigate that
> first.

I will look into it but it's not clear I'll have the spare time I currently have in order to finish this bug any time soon. So we won't land right now, but this feature does get reasonable wins--we should try to make sure this patch doesn't bitrot!

> llabs doesn't look available on Windows, where it seems to be _abs64
> instead. Why are these casted to int64?

This is better explained in the paper I linked in Comment #13--I'll add comments explaining why. The gist of it is that infinite ranges are infinite going *into* a def, since we haven't run the actual overflow check. But once the check passes, the range is just [INT_MIN, INT_MAX] going *out* of the def (and feeding into its consumers) since if code is still executing, then the overflow did not occur.

Is there already a platform-agnostic 64-bit abs() available in the codebase somewhere or should I write one? If I write one, is there a better place to put it than somewhere in ion/?

> You might want to consider either just using int64 for simplicity, or making
> JSVAL_INT_MIN/MAX mean -/+ infinity.

See comments above--further, separating out this idea into a couple of boolean flags actually makes a few pieces of code cleaner. I will add more comments to the patch about why this design is chosen.
Blocks: 766592
I experimented a bit with these patches - with two small changes this is a nice v8-crypto win:

1) x & 0x3fff -> here we know the range of the value is [0, 0x3fff]
2) If both operands are guaranteed to be >= 0, LMulI does not need a negative zero check.
Blocks: 768572
With nits addressed,
https://hg.mozilla.org/projects/ionmonkey/rev/4e6b1fda24cc
https://hg.mozilla.org/projects/ionmonkey/rev/fa2242cbcf51
https://hg.mozilla.org/projects/ionmonkey/rev/7233dc7d36c8

Range analysis landed, but defaulting to off until Bug 765119 is addressed in a satisfactory way.
Status: ASSIGNED → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
> I experimented a bit with these patches - with two small changes this is a
> nice v8-crypto win:
> 
> 1) x & 0x3fff -> here we know the range of the value is [0, 0x3fff]
> 2) If both operands are guaranteed to be >= 0, LMulI does not need a
> negative zero check.

The patch I committed addresses the second of these improvements, but not the first, which I didn't get around to implementing yet. We can patch it in the v8-crypto perf bug.
Comment on attachment 633383 [details] [diff] [review]
[3/3] The actual range analysis patch

Already fixed.
Attachment #633383 - Flags: review?(mrosenberg)
Depends on: 911845
You need to log in before you can comment on or make changes to this bug.