IonMonkey: Resolve memory to memory moves

RESOLVED FIXED

Status

()

RESOLVED FIXED
7 years ago
7 years ago

People

(Reporter: dvander, Assigned: dvander)

Tracking

(Blocks: 1 bug)

Firefox Tracking Flags

(Not tracked)

Details

Attachments

(1 attachment, 1 obsolete attachment)

Right now LMoves can have memory to memory traffic. I think I'm going to switch the roles of LMove and MoveGroup and resolve cycles in the code generator, which has more knowledge about the ISA.
Created attachment 545829 [details] [diff] [review]
v0

Backstory: Andrew and I both wanted this code to be as platform generic as possible, and wanted to try and not use recursion to perform cycle detection. The latter is just annoying from a memory management perspective, but this patch manages to do it (I hope) cleanly.

The former is kind of annoying. There are many ways to implement move cycles and perform memory-to-memory moves and the options available, and their costs, are different on every platform. So I chose a middle ground. The new resolver is split into two pieces:

MoveGroupResolver: This is a cycle detector which outputs a sorted list of moves with the cycles present.

MoveResolver($ARCH): A platform-specific helper class which takes the sorted move list and breaks cycles, also handling memory-to-memory moves which may need to spill to the stack.

It's not yet clear how valuable this extra design complexity is versus the KLoC win. The x86/64 resolution algorithm present in this patch, with a little more abstraction, could just be used outright on every platform. Maybe once we have ARM, and are performance tuning both, it will become apparent whether these two classes can be folded back into one.

Design critique welcome!

The x86/64 algorithm is pretty simple:
 * If there are cycles, and no free regs, reserve some stack space.
 * Start of cycle (A -> B -> A). Save B in cycle slot/reg.
 * Mem to Mem move? Get a temp register, spill if necessary.
 * Reg to Mem move? If Reg was spilled, restore it.
 * End of cycle (A -> B -> A). Restore B from cycle slot/reg to A.
 * Restore any regs that are still spilled.

We don't use xchg, Andrew measured it as being slower than stack movement.
Attachment #545829 - Flags: review?(adrake)
Created attachment 545831 [details] [diff] [review]
v1

Oh, we should probably build on x64.
Attachment #545829 - Attachment is obsolete: true
Attachment #545831 - Flags: review?(adrake)
Attachment #545829 - Flags: review?(adrake)
Comment on attachment 545831 [details] [diff] [review]
v1

Review of attachment 545831 [details] [diff] [review]:
-----------------------------------------------------------------

Looks good here. While I don't see any really nice abstraction for the architecture specific stuff in the move group resolver that would give us as much flexibility as this split, it looks like it might be possible to share the move resolver itself between x86 and ARM. It'd still be a small code size increase, but it wouldn't embed architectural details in the cycle detector.
Attachment #545831 - Flags: review?(adrake) → review+
Yeah - I'd actually be glad if we could use resolver on ARM as well.

http://hg.mozilla.org/users/danderson_mozilla.com/ionmonkey/rev/3c4d3129a38a
Status: ASSIGNED → RESOLVED
Last Resolved: 7 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.