Closed
Bug 520905
Opened 15 years ago
Closed 15 years ago
TM: collapse callee saved register spills/reloads into LDMIA/STMIA instructions
Categories
(Core :: JavaScript Engine, defect)
Tracking
()
RESOLVED
FIXED
People
(Reporter: gal, Assigned: jseward)
References
Details
(Whiteboard: fixed-in-nanojit, fixed-in-tracemonkey)
Attachments
(1 file)
6.43 KB,
patch
|
vlad
:
review+
|
Details | Diff | Splinter Review |
## compiling trunk 615C2CA0 r4 = iparam 0 r4 0061bdaa24 str r4, [fp, #-8] r4(r4) <= spill r4 r5 = iparam 1 r5 0061bdaa28 str r5, [fp, #-12] r5(r5) <= spill r5 r6 = iparam 2 r6 0061bdaa2c str r6, [fp, #-16] r6(r6) <= spill r6 r7 = iparam 3 r7 0061bdaa30 str r7, [fp, #-20] r7(r7) <= spill r7 r8 = iparam 4 r8 0061bdaa34 str r8, [fp, #-24] r8(r8) <= spill r8 r9 = iparam 5 r9 0061bdaa38 str r9, [fp, #-28] r9(r9) <= spill r9 r10 = iparam 6 r10 0061bdaa3c str r10, [fp, #-32] r10(r10) <= spill r10 state = iparam 0 r0 ... ## merging registers (intersect) with existing edge 0061bd1464 restore r10 0061bd1464 ldr r10, [fp, #-32] 0061bd1468 restore r9 r10(r10) 0061bd1468 ldr r9, [fp, #-28] r10(r10) 0061bd146c restore r8 r9(r9) r10(r10) 0061bd146c ldr r8, [fp, #-24] r9(r9) r10(r10) 0061bd1470 restore r7 r8(r8) r9(r9) r10(r10) 0061bd1470 ldr r7, [fp, #-20] r8(r8) r9(r9) r10(r10) 0061bd1474 restore r6 r7(r7) r8(r8) r9(r9) r10(r10) 0061bd1474 ldr r6, [fp, #-16] r7(r7) r8(r8) r9(r9) r10(r10) 0061bd1478 restore r5 r6(r6) r7(r7) r8(r8) r9(r9) r10(r10) 0061bd1478 ldr r5, [fp, #-12] r6(r6) r7(r7) r8(r8) r9(r9) r10(r10) 0061bd147c restore r4 r5(r5) r6(r6) r7(r7) r8(r8) r9(r9) r10(r10) 0061bd147c ldr r4, [fp, #-8] r5(r5) r6(r6) r7(r7) r8(r8) r9(r9) r10(r10) The backend should recognize LIR_ld/LIR_sti sequences that can be collapsed into load/store multiple instructions.
Comment 1•15 years ago
|
||
The "platform" field should be ARM, right?
Assignee | ||
Comment 3•15 years ago
|
||
(In reply to comment #0) > The backend should recognize LIR_ld/LIR_sti sequences that can be collapsed > into load/store multiple instructions. Yes. However, this isn't as simple as it sounds, because the sequences as shown are the "wrong" way round. LDM/STM require the lowest numbered register to be at the lowest address, and what we've got currently has the highest numbered register at the lowest address. That's easily enough fixed by changing the direction of the loop in Assembler::intersectRegisterState.
Assignee: general → jseward
Status: NEW → ASSIGNED
Comment 4•15 years ago
|
||
It's worth noting that on Cortex-A9, you'll do better to have a few small LDMs than one big one. The practical upshot of this is that you get the data earlier than you would do with a single LDM. Of course, it's a trade-off between code size and data latency, and as there are multiple factors it's hard to give general guidance, but it's usually beneficial to stick to around 4 registers in a single LDM. On Cortex-A8, you're _probably_ better off with a single LDMIA, but when compared to two 4-register LDMs, the difference is minimal. None of this is too critical; you'll only save a couple of cycles anyway (except in the probably rare case where you get a cache line miss on the first few registers but a hit in later ones), but it's worth noting, and if we can easily limit each LDM to four registers, it's worth doing.
Assignee | ||
Comment 5•15 years ago
|
||
Assignee | ||
Comment 6•15 years ago
|
||
> patch to generate LDMDB/STMDB insns
Gives a 1.1% improvement on SunSpider default set and 2.1% on the v8
set. Although SunSpider is even noisier on ARM linux than it is on
x86 linux, for whatever reason.
It generates LDMDB for the exit blocks and STMDB for the entries.
On the exits, (which obviously are the most significant target,
because there are a lot of them) it merges 8 LDR instructions into
a single LDMDB; on the entries, it merges 7 stores into one. Code
size numbers obtained by TMFLAGS=fragprofile indicate roughly a 1/3
reduction in the amount of exit code in total.
Assignee | ||
Updated•15 years ago
|
Attachment #411752 -
Flags: review?(vladimir)
Comment on attachment 411752 [details] [diff] [review] patch to generate LDMDB/STMDB insns Looks good! Thanks for the comments as well, they made reviewing much easier.
Attachment #411752 -
Flags: review?(vladimir) → review+
Assignee | ||
Updated•15 years ago
|
Keywords: checkin-needed
Comment 8•15 years ago
|
||
http://hg.mozilla.org/tracemonkey/rev/92ea7f43feb6
Keywords: checkin-needed
Whiteboard: fixed-in-nanojit, fixed-in-tracemonkey
Comment 9•15 years ago
|
||
http://hg.mozilla.org/mozilla-central/rev/92ea7f43feb6
Status: ASSIGNED → RESOLVED
Closed: 15 years ago
Resolution: --- → FIXED
You need to log in
before you can comment on or make changes to this bug.
Description
•