Closed Bug 502539 Opened 15 years ago Closed 15 years ago

nanojit: move displacements into load instructions

Categories

(Core :: JavaScript Engine, defect)

x86
macOS
defect
Not set
normal

Tracking

()

RESOLVED FIXED

People

(Reporter: n.nethercote, Assigned: n.nethercote)

Details

(Whiteboard: fixed-in-tracemonkey)

Attachments

(1 file, 1 obsolete file)

Attached patch patch (obsolete) — Splinter Review
The second operand of all load instructions is a displacement.  Currently this
displacement is represented via an LIns.  But every such LIns is actually a
LIR_imm, so this representation is inefficient.

This patch changes load instructions so that the displacement is an
field within the load instruction rather than a separate LIns.  This makes loads like stores, which already hold the displacement as an immediate field.  Some measurements:

- The number of LIR instructions generated for trace-test.js drops by 11%.
  This means fewer LIns writes and reads, and a lot less stress on the
  CseFilter due to fewer repeated LIR_imm instructions.

- The amount of space used for LIR drops by 5.5%.  This is smaller than the
  11% instruction count drop because 32-bit immediates are only 2 words and
  thus smaller than the average instruction (and don't forget that the LIR
  buffers currently include skip payloads and call args which really blow out
  the average LIns size).

- The net result is that SunSpider is up to 10ms faster on my Mac (975ms ->
  965ms) -- that's the best result I've got, it's been smaller (eg. 3ms) on
  other --runs=100 attempts.

A small amount of extra code had to be added in LIR.cpp.  A new LInsLd class
was introduced, and various cases where loads were handled by the generic
two-operand case had to be handled separately.

I also removed the CseFilter::insLoad() method;  isCseOpcode() is never true
for load opcodes, so this method was never doing anything.

The load opcodes should arguably have an 'i' in their name, as LIR_sti and
LIR_stqi do.  Or maybe the 'i' should be dropped from the store opcodes.
But I didn't do that, to minimise changes.  The displacement fields should
also arguably be intptr_t instead of int32_t but I opted to match what's
currently in the store instructions.
Attachment #386939 - Flags: review?(gal)
Attachment #386939 - Flags: review?(gal) → review+
Comment on attachment 386939 [details] [diff] [review]
patch

I suggest a follow-up patch and renaming LIR_sti to LIR_st.

Aren't constant loads CSE-able? (LIR_ldcs or example)

Why not just disp() instead of load/storeDisp that do the same thing? We are unlikely to support different displacements there.

The CSE issue is a bug imo, the rest are nits to be addressed or not at your discretion :)
Comment on attachment 386939 [details] [diff] [review]
patch

I am pretty sure Ed has no objections. Lets get this in.
Attachment #386939 - Flags: review?(edwsmith)
(In reply to comment #1)
> (From update of attachment 386939 [details] [diff] [review])
> I suggest a follow-up patch and renaming LIR_sti to LIR_st.
> 
> Aren't constant loads CSE-able? (LIR_ldcs or example)

Ah, yes.  They are, but they are rarely/never used which is why when I counted how many loads were being CSE'd I got zero.  I'll add CseFilter::insLoad() back in.

> Why not just disp() instead of load/storeDisp that do the same thing? We are
> unlikely to support different displacements there.

For that to work the 'disp' field must be stored in the same relative position in loads and stores, but it's not.  And it can't be moved because that would require moving the oprnd_1 and oprnd_2 fields and then oprnd1() and oprnd2() couldn't be shared among loads/stores/op1s/op2s, and oprnd1() and oprnd2() are used more often.

> The CSE issue is a bug imo, the rest are nits to be addressed or not at your
> discretion :)

Thanks for the quick review!
Attachment #386939 - Attachment is obsolete: true
Attachment #386942 - Flags: review?(edwsmith)
Attachment #386939 - Flags: review?(edwsmith)
(In reply to comment #3)
> (In reply to comment #1)
> > Why not just disp() instead of load/storeDisp that do the same thing? We are
> > unlikely to support different displacements there.
> 
> For that to work the 'disp' field must be stored in the same relative position
> in loads and stores, but it's not.  And it can't be moved because that would
> require moving the oprnd_1 and oprnd_2 fields and then oprnd1() and oprnd2()
> couldn't be shared among loads/stores/op1s/op2s, and oprnd1() and oprnd2() are
> used more often.
> 

we can still use a disp() function even if it must test the opcode.  Keeps the API simple and the branch will predict well if the function is small and inlineable.

Also a warning: for RISC cpu's, the load/store instructions don't support full 32bit offsets.  the bit size depends on the cpu.  backend code gen is simpler if the LIR generator never exceeds this limit; for a large displacement, a temp register is needed, so if the displacement itself is a separate LIns, then the backend can generate the proper addressing code (or pattern-match the proper addressing mode) without having to use scratch registers.

nothing to do about it in this bug, but its a design pressure on load/store instructions and calls as well.  

my opinion:  its okay for LIR to be affected by knowing what the target backend is.  during frontend LIR generation or during a lowering pass it should be okay for backends to put tighter limits on the value range of these disp fields by generating explicit address generation LIns's and not use the implicit address generation of LIR_ld/st
Attachment #386942 - Flags: review?(edwsmith) → review+
see bug 444682 for a longer discussion on this
http://hg.mozilla.org/tracemonkey/rev/82986a4e5223
Whiteboard: fixed-in-tracemonkey
http://hg.mozilla.org/tracemonkey/rev/39882ccf1a24 fixes breakage on optimised builds.
http://hg.mozilla.org/mozilla-central/rev/82986a4e5223
Status: ASSIGNED → RESOLVED
Closed: 15 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: