Closed Bug 766592 Opened 8 years ago Closed 7 years ago

IonMonkey: implement symbolic range analysis

Categories

(Core :: JavaScript Engine, defect)

defect
Not set

Tracking

()

RESOLVED FIXED
mozilla20

People

(Reporter: rpearl, Assigned: bhackett)

References

(Blocks 1 open bug)

Details

(Whiteboard: [ion:t])

Attachments

(1 file)

Follow-up to Bug 699883.

Right now, range analysis acts on *constant* operands of type int32 only. With more work, the same algorithm could operate on arbitrary (numeric) defs. This will be useful for Bug 765128.
Blocks: 774922
Add limited symbolic range analysis to Ion.  This is exclusively directed towards finding symbolic ranges for loop counter variables; it could be improved in the future to handle more cases, though such future work really needs to be directed towards specific benchmarks / tests, as it's really easy to spin off and do expensive and pointless analysis when dealing with symbolic ranges.  For now this patch handles (a) everything that LICM used to do (that code is removed) and (b) linear equalities that show up in e.g. v8-crypto's am3.  This improves my v8-crypto score from bimodal 10400/11200 to bimodal 10800/11800 (lots of bailouts taken for shape guards on this test? odd...).  It is also positioning things to be able to cope with loops like:

do {
  x[i++];
} while (i < n);

Which I've seen in emscripten-translated code (fannkuch) and is difficult for standard symbolic range analyses to handle (in my experience).  Can't do this yet but shouldn't take much new code.

The analysis is split into a few phases:

1. Find an upper bound on the number of times the loop backedge will execute.
2. Use this upper bound to compute bounds on counter variables in the loop.
3. Use bounds on counter variables to hoist bounds checks using those counters as indexes.

This is built onto the existing range analysis though for now there isn't a lot in common between how absolute and symbolic ranges are computed; that should change in the future by using beta nodes to compute and propagate symbolic ranges.

To do this I needed to make some large changes to the range analysis code, mainly cleanup (make Range a temp object, move a bunch of range analysis code from MIR.h to RangeAnalysis.cpp).  There is a significant functional change though, which is that the range analysis is changed from a dataflow sort of analysis to a single SSA pass, similar to most other backend passes.  Symbolic range analysis doesn't want to see loop headers over and over again, there didn't seem to be a straightforward argument for termination of the analysis, and I can't think of any realistic cases where trying to fixpoint the absolute range of a mutable phi would add meaningful information.  Simple loops like 'for i = 0; i < 100; i++) {}' are handled by the beta nodes.  Does anyone know of important/real cases where this fixpointing helps?

Apologies for the large patch but it is mostly straightforward.  All the meat of the symbolic analysis is in the later section of RangeAnalysis.cpp.
Assignee: general → bhackett1024
Attachment #676027 - Flags: review?(dvander)
Comment on attachment 676027 [details] [diff] [review]
patch (4380bab9db04)

Review of attachment 676027 [details] [diff] [review]:
-----------------------------------------------------------------

I'd like to do a high-level review of this, but forwarding the real review to Marty since he shepherded the original RA into the tree.
Attachment #676027 - Flags: review?(mrosenberg)
Comment on attachment 676027 [details] [diff] [review]
patch (4380bab9db04)

Review of attachment 676027 [details] [diff] [review]:
-----------------------------------------------------------------

I am not really sure that symbolic range analysis and numeric range analysis belong joined together in this manner. It seems like the only thing they share is the "range" part of their names.
As for the fixed-pointing, a simple for loop like
for (i = 0; i < 100; i++) {}
sadly ends up with bounds INT_MIN+1 <= i <= 100 after the loop
Any uses of i after the loop are likely subject to underflow checks due to lacking sane lower bounds.  It looks like the framework that you have here can get better bounds without looping to a fixed point, but they also feel incredibly special cased.
It also looks like OSR blocks + the loop preheader aren't handled at all.  For the most part, just ensuring that you don't do anything incorrect is good, but every once in a while, we find a benchmark where not optimizing across the osrpoint actually negatively impacts our performance.

It looks sort of like there is some code to optimize induction variables, which is almost certainly a good direction to take symbolic range analysis.

::: js/src/ion/RangeAnalysis.cpp
@@ +369,5 @@
>  {
> +    if (!lhs)
> +        lhs = new Range();
> +    if (!rhs)
> +        rhs = new Range();

I realize we shouldn't do this too frequently, but it looks like we never modify these temps, would it be possible to make it into a static so there is less allocation.

@@ +474,5 @@
> +            IonSpew(IonSpew_Range, "Ignoring unreachable input %d", getOperand(i)->id());
> +            continue;
> +        }
> +
> +        if (!isOSRLikeValue(getOperand(i))) {

Should probably not negate this and just |continue;| the loop in this case.

@@ +558,5 @@
> +    if (!other)
> +        other = new Range();
> +    Range *range = new Range(0,
> +                             Max(Range::abs64((int64_t)other->lower()),
> +                                 Range::abs64((int64_t)other->upper())));

I'm sure I've noticed this before, but the range on abs(5) is actually 0-5

@@ +601,5 @@
> +    if (specialization() != MIRType_Int32)
> +        return;
> +    Range *rhs = getOperand(1)->range();
> +    if (!rhs)
> +        return;

Here there is an explicit check (and bail) that the rhs exists, but other computeRange functions just synthesize a range, is there a reason for that?

@@ +776,5 @@
> +        lhsWrite = lhsWrite->getOperand(0);
> +    if (!lhsWrite->isAdd() && !lhsWrite->isSub())
> +        return NULL;
> +    if (!lhsWrite->block()->isMarked())
> +        return NULL;

Why do we bail out here, because the write is outside of the loop?

@@ +796,5 @@
> +        return NULL;
> +
> +    LinearSum bound;
> +
> +    if (lhsModified.constant == 1 && !lessEqual) {

It looks like we only handle increments and decrements by 1, rather than any constant amount.  Is there a reason for this?

@@ +843,5 @@
> +
> +    if (phi->numOperands() != 2)
> +        return;
> +
> +    MDefinition *initial = phi->getOperand(0);

Do we have any assurance that operand 0 is the initial value?

@@ +905,5 @@
> +    if (ins->block() == header)
> +        return false;
> +    MBasicBlock *bb = ins->block()->immediateDominator();
> +    for (; bb != header && bb != bound->loop->test->block(); bb = bb->immediateDominator()) {
> +    }

Comment would be appreciated.  Empty for loops always make me nervous.

@@ +917,1 @@
>  {

Does this function assume that we won't be hoisting a use of a term above its definition? I can't find where that is ensured.

@@ +938,5 @@
> +                block->insertBefore(block->lastIns(), def->toInstruction());
> +            } else {
> +                MConstant *factor = MConstant::New(Int32Value(term.scale));
> +                block->insertBefore(block->lastIns(), factor);
> +                MMul *mul = MMul::New(term.term, factor);

These instructions that are being inserted won't have range information on them.  We probably want to do that to avoid overflow checks on these computations.
Attachment #676027 - Flags: review?(mrosenberg) → review+
Comment on attachment 676027 [details] [diff] [review]
patch (4380bab9db04)

Review of attachment 676027 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/ion/Ion.cpp
@@ +850,5 @@
>          if (mir->shouldCancel("GVN"))
>              return NULL;
>      }
>  
> +    if (js_IonOptions.licm) {

Out of curiosity, why was LICM moved before range analysis?

::: js/src/ion/RangeAnalysis.cpp
@@ +905,5 @@
> +    if (ins->block() == header)
> +        return false;
> +    MBasicBlock *bb = ins->block()->immediateDominator();
> +    for (; bb != header && bb != bound->loop->test->block(); bb = bb->immediateDominator()) {
> +    }

Could this instead be:
while (bb != header && bb != bound->loop->test->block())
    bb = bb->immediateDominator();
Attachment #676027 - Flags: review?(dvander) → review+
Thanks for the comments.

(In reply to Marty Rosenberg [:mjrosenb] from comment #3)
> Comment on attachment 676027 [details] [diff] [review]
> patch (4380bab9db04)
> 
> Review of attachment 676027 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> I am not really sure that symbolic range analysis and numeric range analysis
> belong joined together in this manner. It seems like the only thing they
> share is the "range" part of their names.

Right now, yeah, I agree.  I'd like to eventually use the beta nodes to compute symbolic ranges too, though, which would need to happen during numeric range analysis (i.e. when the beta nodes exist).  It seems better to place it now where it will eventually want to be.

> As for the fixed-pointing, a simple for loop like
> for (i = 0; i < 100; i++) {}
> sadly ends up with bounds INT_MIN+1 <= i <= 100 after the loop

I'm not sure what you mean here.  The code that's currently in the tree generates a range of [-inf,inf] for i after this loop.  That may be a bug, but the only way it will do better is if it can determine that 'i' after the loop comes from the beta node introduced by 'i >= 100', and doing this shouldn't require fixpointing.

> ::: js/src/ion/RangeAnalysis.cpp
> @@ +369,5 @@
> >  {
> > +    if (!lhs)
> > +        lhs = new Range();
> > +    if (!rhs)
> > +        rhs = new Range();
> 
> I realize we shouldn't do this too frequently, but it looks like we never
> modify these temps, would it be possible to make it into a static so there
> is less allocation.

Not sure what you mean here.  Making these local statics won't work for a couple reasons.  First, they are allocated from pools which will be wiped out after compilation finishes.  Second, using local statics at all won't work with off thread compilation --- there shouldn't be any truly global state at all in the compiler.

It would work though to have an emptyRange or somesuch in the range analysis itself, and reuse that one all over the place.  I'll try that.

> @@ +601,5 @@
> > +    if (specialization() != MIRType_Int32)
> > +        return;
> > +    Range *rhs = getOperand(1)->range();
> > +    if (!rhs)
> > +        return;
> 
> Here there is an explicit check (and bail) that the rhs exists, but other
> computeRange functions just synthesize a range, is there a reason for that?

Not really, I'll change things to be more consistent.

> @@ +776,5 @@
> > +        lhsWrite = lhsWrite->getOperand(0);
> > +    if (!lhsWrite->isAdd() && !lhsWrite->isSub())
> > +        return NULL;
> > +    if (!lhsWrite->block()->isMarked())
> > +        return NULL;
> 
> Why do we bail out here, because the write is outside of the loop?

Yeah, we need an induction variable that is written in each loop iteration, not solely outside the loop.  If the variable is written in the loop then writes which happened outside the loop will only appear inside a phi for the loop header (structured control flow...)

> @@ +796,5 @@
> > +        return NULL;
> > +
> > +    LinearSum bound;
> > +
> > +    if (lhsModified.constant == 1 && !lessEqual) {
> 
> It looks like we only handle increments and decrements by 1, rather than any
> constant amount.  Is there a reason for this?

Simplicity.  It'd be good to relax this but it's really easy to get fencepost errors dealing with induction variables that change by more than one, and I'd like to get some real world benchmarks which depend on this before improving the situation.

> @@ +917,1 @@
> >  {
> 
> Does this function assume that we won't be hoisting a use of a term above
> its definition? I can't find where that is ensured.

Well, all the instructions we generate here are at the end of the loop preheader, and the previous analysis ensures that all the terms used in hoisted conditions are loop invariant, i.e. they are in blocks outside the loop and must execute before the end of the preheader.
(In reply to David Anderson [:dvander] from comment #4)
> Comment on attachment 676027 [details] [diff] [review]
> patch (4380bab9db04)
> 
> Review of attachment 676027 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> ::: js/src/ion/Ion.cpp
> @@ +850,5 @@
> >          if (mir->shouldCancel("GVN"))
> >              return NULL;
> >      }
> >  
> > +    if (js_IonOptions.licm) {
> 
> Out of curiosity, why was LICM moved before range analysis?

The symbolic range analysis is focused on analyzing loop induction variables and will benefit heavily from knowing that more terms are loop invariant.  The reverse relationship doesn't hold, i.e. the results of range analysis shouldn't affect LICM at all.
Some MConstants were finding their way into terms in linear sums, which caused arithmetic ops to be created with constant operations that terminally confused later phases in the compiler.

https://hg.mozilla.org/integration/mozilla-inbound/rev/89e5db8cf62f
https://hg.mozilla.org/mozilla-central/rev/89e5db8cf62f
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla20
Duplicate of this bug: 774922
Duplicate of this bug: 810802
Duplicate of this bug: 801156
Depends on: 819794
Depends on: 822941
Duplicate of this bug: 769518
Blocks: 830063
Depends on: 916635
You need to log in before you can comment on or make changes to this bug.