Closed Bug 868708 Opened 11 years ago Closed 11 years ago

ARM optimize signed integer divisions by constant powers of two

Categories

(Core :: JavaScript Engine, enhancement)

ARM
All
enhancement
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla23

People

(Reporter: dougc, Assigned: dougc)

Details

Attachments

(2 files, 1 obsolete file)

The optimization in bug 868535 should also help the ARM arch which uses a foreign call for integer division.
Relatively straight forward adaptation of the x86 support to the ARM.  Plus some tests.
Attachment #745532 - Flags: review?(nicolas.b.pierron)
Comment on attachment 745532 [details] [diff] [review]
Adapt the x86 support for the ARM.

Review of attachment 745532 [details] [diff] [review]:
-----------------------------------------------------------------

Thanks for doing the same on ARM :)

::: js/src/ion/arm/CodeGenerator-arm.cpp
@@ +593,5 @@
> +        // Note that we wouldn't need to do this adjustment if we could use
> +        // Range Analysis to find cases when the value is never negative.
> +        if (shift > 1) {
> +            masm.as_mov(ScratchRegister, asr(lhs, 31));
> +            masm.as_add(ScratchRegister, lhs, lsr(ScratchRegister, 32 - shift));

Really nice! I think I like the ARM assembly and Marty implementation of the Assembler generator for that.

Usually I try to avoid using the ScratchRegister out-side the Macro Assembler, but in such case no address are manipulated, and you are only using assembler functions.  So this is ok.
Attachment #745532 - Flags: review?(nicolas.b.pierron) → review+
Comment on attachment 745532 [details] [diff] [review]
Adapt the x86 support for the ARM.

Review of attachment 745532 [details] [diff] [review]:
-----------------------------------------------------------------

> Register output = ToRegister(ins->output());
> int32_t shift = ins->shift();
>
> if (shift != 0) {
>    [...]
> }
>	
> return true;

If shift is 0 here, this code never defines output. The x86 patch uses defineReuseInput, but this ARM patch does not, so it seems possible that lhs could be a different register from output here. Of course, a shift of 0 means means a division by 1, which probably would be folded by this point, but it seems prudent to either handle it or assert that it doesn't happen.
(In reply to Dan Gohman from comment #4)
> Comment on attachment 745532 [details] [diff] [review]
> Adapt the x86 support for the ARM.
> 
> Review of attachment 745532 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> > Register output = ToRegister(ins->output());
> > int32_t shift = ins->shift();
> >
> > if (shift != 0) {
> >    [...]
> > }
> >	
> > return true;
> 
> If shift is 0 here, this code never defines output. The x86 patch uses
> defineReuseInput, but this ARM patch does not, so it seems possible that lhs
> could be a different register from output here. Of course, a shift of 0
> means means a division by 1, which probably would be folded by this point,
> but it seems prudent to either handle it or assert that it doesn't happen.

Good catch!
And apparently we don't fold the case where we divide by 1.
This patch moves the input to the output register if necessary for the case of division by one.  Dan, thank you for checking and reporting this.
Attachment #745532 - Attachment is obsolete: true
Attachment #745614 - Flags: review?(nicolas.b.pierron)
(In reply to Douglas Crosher [:dougc] from comment #6)
> Created attachment 745614 [details] [diff] [review]
> Revised patch address the case of division by one.
> 
> This patch moves the input to the output register if necessary for the case
> of division by one.  Dan, thank you for checking and reporting this.

I will land a follow-up patch that I got reviewed by Marty.  I am trying to make it fails before to make sure we add a test case.
Attachment #745614 - Attachment is obsolete: true
Attachment #745614 - Flags: review?(nicolas.b.pierron)
Attachment #745618 - Flags: review?(nicolas.b.pierron)
I added a test case and landed a follow-up to the previous patch.

Douglas: There is no need to re-do the patch unless it got backout.

05:09:23 <mjrosenb> huh, the scratch register shouldn't even be visible outside the assembler/macro assembler.
05:10:14 <nbp> mjrosenb: I agree, but in this particular case this is fine, as all instructions are as_*
05:11:02 <nbp> mjrosenb: this is like avoiding to create a special macro assembler for one use case in the codegen.
05:11:41 <mjrosenb> I still don't like it.
05:12:10 <mjrosenb> anyhow, r=me
05:12:27 <mjrosenb> although, we should test it first.

Done:
./jit-test/tests/asm.js/testExpressions.js:320:4 Error: Assertion failed: got 1086365961, expected 2

https://hg.mozilla.org/integration/mozilla-inbound/rev/c2de391c032f
Comment on attachment 745618 [details] [diff] [review]
Follow up patch to correct the case of division by one, and to add a test case.

Review of attachment 745618 [details] [diff] [review]:
-----------------------------------------------------------------

Thanks for the patch, I am sorry for not pushing yours I made the previous one and got it review via IRC half an hour after Dan's comment.
Sorry, I should have attached it here, to avoid the confusion.

Cancel the review as a similar patch has landed.
Attachment #745618 - Flags: review?(nicolas.b.pierron)
Comment on attachment 745532 [details] [diff] [review]
Adapt the x86 support for the ARM.

This patch is not obsolete, because it has landed and it is not back out.
Attachment #745532 - Attachment is obsolete: false
https://hg.mozilla.org/mozilla-central/rev/f3734569c303
https://hg.mozilla.org/mozilla-central/rev/c2de391c032f
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla23
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: