Last Comment Bug 868708 - ARM optimize signed integer divisions by constant powers of two
: ARM optimize signed integer divisions by constant powers of two
Status: RESOLVED FIXED
:
Product: Core
Classification: Components
Component: JavaScript Engine (show other bugs)
: Trunk
: ARM All
: -- enhancement (vote)
: mozilla23
Assigned To: Douglas Crosher [:dougc]
:
Mentors:
Depends on:
Blocks:
  Show dependency treegraph
 
Reported: 2013-05-03 21:57 PDT by Douglas Crosher [:dougc]
Modified: 2013-05-05 17:18 PDT (History)
3 users (show)
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---


Attachments
Adapt the x86 support for the ARM. (8.91 KB, patch)
2013-05-04 06:13 PDT, Douglas Crosher [:dougc]
nicolas.b.pierron: review+
Details | Diff | Review
Revised patch address the case of division by one. (8.97 KB, patch)
2013-05-04 22:05 PDT, Douglas Crosher [:dougc]
no flags Details | Diff | Review
Follow up patch to correct the case of division by one, and to add a test case. (2.12 KB, patch)
2013-05-04 23:22 PDT, Douglas Crosher [:dougc]
no flags Details | Diff | Review

Description Douglas Crosher [:dougc] 2013-05-03 21:57:49 PDT
The optimization in bug 868535 should also help the ARM arch which uses a foreign call for integer division.
Comment 1 Douglas Crosher [:dougc] 2013-05-04 06:13:05 PDT
Created attachment 745532 [details] [diff] [review]
Adapt the x86 support for the ARM.

Relatively straight forward adaptation of the x86 support to the ARM.  Plus some tests.
Comment 2 Nicolas B. Pierron [:nbp] 2013-05-04 13:39:49 PDT
Comment on attachment 745532 [details] [diff] [review]
Adapt the x86 support for the ARM.

Review of attachment 745532 [details] [diff] [review]:
-----------------------------------------------------------------

Thanks for doing the same on ARM :)

::: js/src/ion/arm/CodeGenerator-arm.cpp
@@ +593,5 @@
> +        // Note that we wouldn't need to do this adjustment if we could use
> +        // Range Analysis to find cases when the value is never negative.
> +        if (shift > 1) {
> +            masm.as_mov(ScratchRegister, asr(lhs, 31));
> +            masm.as_add(ScratchRegister, lhs, lsr(ScratchRegister, 32 - shift));

Really nice! I think I like the ARM assembly and Marty implementation of the Assembler generator for that.

Usually I try to avoid using the ScratchRegister out-side the Macro Assembler, but in such case no address are manipulated, and you are only using assembler functions.  So this is ok.
Comment 3 Nicolas B. Pierron [:nbp] 2013-05-04 13:45:05 PDT
https://hg.mozilla.org/integration/mozilla-inbound/rev/f3734569c303
Comment 4 Dan Gohman [:sunfish] 2013-05-04 19:07:50 PDT
Comment on attachment 745532 [details] [diff] [review]
Adapt the x86 support for the ARM.

Review of attachment 745532 [details] [diff] [review]:
-----------------------------------------------------------------

> Register output = ToRegister(ins->output());
> int32_t shift = ins->shift();
>
> if (shift != 0) {
>    [...]
> }
>	
> return true;

If shift is 0 here, this code never defines output. The x86 patch uses defineReuseInput, but this ARM patch does not, so it seems possible that lhs could be a different register from output here. Of course, a shift of 0 means means a division by 1, which probably would be folded by this point, but it seems prudent to either handle it or assert that it doesn't happen.
Comment 5 Nicolas B. Pierron [:nbp] 2013-05-04 19:39:59 PDT
(In reply to Dan Gohman from comment #4)
> Comment on attachment 745532 [details] [diff] [review]
> Adapt the x86 support for the ARM.
> 
> Review of attachment 745532 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> > Register output = ToRegister(ins->output());
> > int32_t shift = ins->shift();
> >
> > if (shift != 0) {
> >    [...]
> > }
> >	
> > return true;
> 
> If shift is 0 here, this code never defines output. The x86 patch uses
> defineReuseInput, but this ARM patch does not, so it seems possible that lhs
> could be a different register from output here. Of course, a shift of 0
> means means a division by 1, which probably would be folded by this point,
> but it seems prudent to either handle it or assert that it doesn't happen.

Good catch!
And apparently we don't fold the case where we divide by 1.
Comment 6 Douglas Crosher [:dougc] 2013-05-04 22:05:50 PDT
Created attachment 745614 [details] [diff] [review]
Revised patch address the case of division by one.

This patch moves the input to the output register if necessary for the case of division by one.  Dan, thank you for checking and reporting this.
Comment 7 Nicolas B. Pierron [:nbp] 2013-05-04 22:32:47 PDT
(In reply to Douglas Crosher [:dougc] from comment #6)
> Created attachment 745614 [details] [diff] [review]
> Revised patch address the case of division by one.
> 
> This patch moves the input to the output register if necessary for the case
> of division by one.  Dan, thank you for checking and reporting this.

I will land a follow-up patch that I got reviewed by Marty.  I am trying to make it fails before to make sure we add a test case.
Comment 8 Douglas Crosher [:dougc] 2013-05-04 23:22:39 PDT
Created attachment 745618 [details] [diff] [review]
Follow up patch to correct the case of division by one, and to add a test case.
Comment 9 Nicolas B. Pierron [:nbp] 2013-05-04 23:58:16 PDT
I added a test case and landed a follow-up to the previous patch.

Douglas: There is no need to re-do the patch unless it got backout.

05:09:23 <mjrosenb> huh, the scratch register shouldn't even be visible outside the assembler/macro assembler.
05:10:14 <nbp> mjrosenb: I agree, but in this particular case this is fine, as all instructions are as_*
05:11:02 <nbp> mjrosenb: this is like avoiding to create a special macro assembler for one use case in the codegen.
05:11:41 <mjrosenb> I still don't like it.
05:12:10 <mjrosenb> anyhow, r=me
05:12:27 <mjrosenb> although, we should test it first.

Done:
./jit-test/tests/asm.js/testExpressions.js:320:4 Error: Assertion failed: got 1086365961, expected 2

https://hg.mozilla.org/integration/mozilla-inbound/rev/c2de391c032f
Comment 10 Nicolas B. Pierron [:nbp] 2013-05-05 00:09:12 PDT
Comment on attachment 745618 [details] [diff] [review]
Follow up patch to correct the case of division by one, and to add a test case.

Review of attachment 745618 [details] [diff] [review]:
-----------------------------------------------------------------

Thanks for the patch, I am sorry for not pushing yours I made the previous one and got it review via IRC half an hour after Dan's comment.
Sorry, I should have attached it here, to avoid the confusion.

Cancel the review as a similar patch has landed.
Comment 11 Nicolas B. Pierron [:nbp] 2013-05-05 00:10:20 PDT
Comment on attachment 745532 [details] [diff] [review]
Adapt the x86 support for the ARM.

This patch is not obsolete, because it has landed and it is not back out.

Note You need to log in before you can comment on or make changes to this bug.