Closed Bug 913935 Opened 11 years ago Closed 10 years ago

Suboptimal code for short-circuit && and ||

Categories

(Core :: JavaScript Engine, enhancement)

enhancement
Not set
normal

Tracking

()

RESOLVED DUPLICATE of bug 1028580

People

(Reporter: sunfish, Unassigned)

Details

Attachments

(1 file)

Short-circuit operators at the top level of an if seem to be lowered to an inefficient form. For this JavaScript:

  function f(a, b) {
    if (a < 10 && b < 20) {
        print('hi');
    }
  }

  f(4, 400);
  f(400, 4);
  f(2, 9);
  f(5, 500);
  f(300, 3);

js --ion-eager emits MIR which conceptually does this:

    t0 = a < 10;
    if (t0) goto next; else goto retest;
  next:
    t1 = b < 20;
  retest:
    t2 = phi(t0, t1);
    if (t2) goto true; else goto end;
  true:

And it emits, for example, x64 code like this:

    cmpl       $0xa, %ecx
    setl       %al
    movzbl     %al, %eax
    testl      %eax, %eax
    je         retest
    cmpl       $0x14, %ebx
    setl       %al
    movzbl     %al, %eax
  retest:
    testl      %eax, %eax
    je         end

This is what optimal code might look look like:

    cmpl $0xa, %ecx
    jge  end
    cmpl $0x14, %ebx
    jge  end

A common way to do this is to special-case the lowering of short-circuit && and || at the top level of a controlling expression.
Also, this form foils Range Analysis' Beta node analysis. It puts the beta node for the lhs in the basic block which computes the rhs, where it doesn't dominate the body of the if. And the rhs doesn't get a beta node at all.
Thanks for filing this bug. This has been a problem from the start and affects several tight loops in benchmarks like Kraken imaging-gaussian-blur.

Are you interested in fixing this? Else I can take a look this week.
(In reply to Jan de Mooij [:jandem] from comment #2)
> Are you interested in fixing this? Else I can take a look this week.

I'm not planning to work on this right now, so feel free.
I've been thinking about this a bit. Due to the way IonBuilder works, it's not trivial to do this. We could fix it at the bytecode level but I think that will also complicate IonBuilder (it parses the bytecode) and probably also the expression decompiler etc.

What we want to do I think is have IonBuilder follow the JSOP_AND branch and when it flows into a JSOP_IFEQ it can push a CFGState entry with the false block. Then when we get to the IFEQ, we can reuse that block. Not pretty but something like this should be doable.
Assignee: general → jdemooij
Attached patch WIP v0Splinter Review
Here's a very hackish patch that adds a new pass to rewrite these branches after building the initial graph. I think this is simpler than handling it in IonBuilder.

I also had to remove unnecessary entry resume points before Lowering to make sure resume points for blocks created by edge splitting don't hold one of the operands of && alive (e.g. the MTest should be the only use of the MCompare), this avoids the weird cmpl/setl/movzbl pattern in comment 0.

With this patch we generate the optimal code mentioned in comment 0 for this script:

function f(a, b) {
    if (a < 10 && b < 20)
        return 1;
    return 0;
}

0x245c78c:	cmp    $0xa,%eax
0x245c78f:	jge    0x245c7ad
0x245c795:	cmp    $0x14,%ecx
0x245c798:	jge    0x245c7ad
0x245c79e:	mov    $0xffffff81,%ecx
0x245c7a3:	mov    $0x1,%edx
0x245c7a8:	jmp    0x245c7b7
0x245c7ad:	mov    $0xffffff81,%ecx
0x245c7b2:	mov    $0x0,%edx

And a while loop:

function f(a, b) {
    while (a < 10 && b < 20)
	a++;
    return a;
}

0x245c84f:	cmp    $0xa,%edx
0x245c852:	jge    0x245c869
0x245c858:	cmp    $0x14,%ecx
0x245c85b:	jge    0x245c869
0x245c861:	add    $0x1,%edx
0x245c864:	jmp    0x245c84f

Before:

0x245c84d:	cmp    $0x14,%ecx
0x245c850:	setl   %dl
0x245c853:	movzbl %dl,%edx
0x245c856:	cmp    $0xa,%eax
0x245c859:	setl   %bl
0x245c85c:	movzbl %bl,%ebx
0x245c85f:	test   %ebx,%ebx
0x245c861:	je     0x245c869
0x245c867:	mov    %edx,%ebx
0x245c869:	test   %ebx,%ebx
0x245c86b:	je     0x245c87f
0x245c871:	add    $0x1,%eax
0x245c874:	jo     0x245c8ce
0x245c87a:	jmp    0x245c856

Note that we're able to eliminate the overflow check for a++ now.

This needs a lot of cleanup and fixes but this approach should work I think..
This was fixed in bug 1028580.
Assignee: jdemooij → nobody
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → DUPLICATE
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: