Closed Bug 849489 Opened 11 years ago Closed 11 years ago

IonMonkey: Optimise use of vstm/vldm

Categories

(Core :: JavaScript Engine, defect)

ARM
All
defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla22

People

(Reporter: nrc, Assigned: nrc)

Details

Attachments

(1 file, 1 obsolete file)

We currently do multiple operations (addition/subtraction) on the stack pointer when we don't really need to. Lets make it better!
The tricky thing here is that when we break the register set into runs, the order (i.e., ascending or descending) of the runs should match the order of the registers within the runs as output by vstm/vldm. To do that we have to choose either a forwards or backwards iteration of the register set. But, the code in the arm assembler for vstm/vldm always assumes we will get a forward traversal and throws a hissy-fit.
Attached patch patch (obsolete) — Splinter Review
As usual my confidence level is low. I've tested it as best I can (and fixed it a bunch of times along the way, so the testing is doing something), but not particularly scientifically. In particular I don't know how to view the output machine code, so I'm not 100% sure we are doing better than before.
Attachment #723051 - Flags: review?(mrosenberg)
(In reply to Nick Cameron [:nrc] from comment #2)
> As usual my confidence level is low. I've tested it as best I can (and fixed
> it a bunch of times along the way, so the testing is doing something), but
> not particularly scientifically. In particular I don't know how to view the
> output machine code, so I'm not 100% sure we are doing better than before.

If you want I can give you access to one of the tegra that I have (ask me on irc, nbp), and I usually used gdb to print the assembly produced.

You can find some tips here https://developer.mozilla.org/en-US/docs/SpiderMonkey/Hacking_Tips

And for dumping the assembly, the best it to find the JSScript pointer, and dig into the IonScript & IonCode, and use the disas to disassemble the code between the code_ and the code_ + insnSize (or something similar).  Feel free to update the wiki page ;)
Summary: Optimise use of vstm/vldm → IonMonkey: Optimise use of vstm/vldm
Comment on attachment 723051 [details] [diff] [review]
patch

Review of attachment 723051 [details] [diff] [review]:
-----------------------------------------------------------------

It looks reasonable.  I assume it passes tests, and whatnot?

::: js/src/ion/arm/Assembler-arm.h
@@ +1624,4 @@
>          } else {
>              JS_ASSERT(dtmLastReg >= 0);
> +            JS_ASSERT(rn.code() == unsigned(dtmLastReg) + 1 ||
> +                      rn.code() == unsigned(dtmLastReg) - 1);

I would probably store the delta in a variable just to make sure we always get the same delta (e.g. d3, d2, d3 should be invalid)

@@ +1632,5 @@
>          JS_ASSERT(dtmActive);
>          dtmActive = false;
>          JS_ASSERT(dtmLastReg != -1);
>          // fencepost problem.
> +        int len = mozilla::Abs(dtmLastReg - vdtmFirstReg) + 1;

Abs could be replaced with *delta
Attachment #723051 - Flags: review?(mrosenberg) → review+
I finally managed to get dumps of the generated code. It looks good, for the float registers we only emit the vldm/vstm instructions.

mjrosenb, nbp - thanks for your help with this!

jit-test/tests/v8-v5/check-raytrace.js

   0x766a0c30:	sub	sp, sp, #20
   0x766a0c34:	stm	sp, {r0, r1, r2, r3, r4}
   0x766a0c38:	vpush	{d5}
   0x766a0c3c:	vpush	{d2-d3}
   0x766a0c40:	vpush	{d0}
=> 0x766a0c44:	bkpt	0x000b

   0x766a0c7c:	vpop	{d0}
   0x766a0c80:	vpop	{d2-d3}
   0x766a0c84:	vpop	{d5}
   0x766a0c88:	ldm	sp, {r0, r1, r2, r3, r4}
   0x766a0c8c:	add	sp, sp, #20
=> 0x766a0c90:	bkpt	0x000b


jit-test/tests/ion/inlining/inline-callarg-bailout-phi.js

   0x76971514:	sub	sp, sp, #12
   0x76971518:	stm	sp, {r1, r2, r3}
   0x7697151c:	vpush	{d2-d15}
   0x76971520:	vpush	{d0}
=> 0x76971524:	bkpt	0x0006

   0x76971550:	vpop	{d0}
   0x76971554:	vpop	{d2-d15}
   0x76971558:	ldm	sp, {r1, r2, r3}
   0x7697155c:	add	sp, sp, #12
=> 0x76971560:	bkpt	0x0006
Attached patch patchSplinter Review
with requested changes, carrying r=mjrosenb
Attachment #723051 - Attachment is obsolete: true
Attachment #727598 - Flags: review+
https://hg.mozilla.org/mozilla-central/rev/eb1e425dbf0f
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla22
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: