Closed Bug 661736 Opened 13 years ago Closed 13 years ago

IonMonkey: Implement lowering

Categories

(Core :: JavaScript Engine, defect)

x86_64
Linux
defect
Not set
normal

Tracking

()

RESOLVED FIXED

People

(Reporter: dvander, Assigned: dvander)

References

Details

Attachments

(6 files, 10 obsolete files)

103.82 KB, patch
adrake
: review+
Details | Diff | Splinter Review
22.79 KB, patch
adrake
: review+
Details | Diff | Splinter Review
8.87 KB, patch
adrake
: review+
Details | Diff | Splinter Review
34.08 KB, patch
adrake
: review+
Details | Diff | Splinter Review
7.69 KB, patch
adrake
: review+
Details | Diff | Splinter Review
6.20 KB, patch
adrake
: review+
Details | Diff | Splinter Review
Attached patch sketch of what LIR looks like (obsolete) — Splinter Review
The patches I'm about to attach are sketches for adrake to critique. The first has the base components: LInstruction attaches to MInstuction, and tells the register allocator what sort of outputs and temporaries it has, which are platform specific. LInstruction is not a great name, but I chose it since one of the designs conceptually gives us a second IR.

LAllocations tells the register allocator what sort of storage is required. For virtual registers, the allocator is responsible for using the available policy to change the LAllocation to something physical (like a register or stack slot).
Attached patch design 1: operands in LIR (obsolete) — Splinter Review
Here, the operands are explicitly attached to the LIR. The advantage of this is it makes the LIR more like its own IR, and the transition to a disjoint IR would be very easy.

If we had something untyped like:
 0: getprop
 1: getprop
 2: add(0, 1)

The LIR would look like:
 0: getprop ; defines vregs 1, 2
 1: getprop ; defines vregs 2, 3
 2: add_ic(vreg1, vreg2, vreg3, vreg4) 

Note that the actual opcode and its semantics still come from MInstruction, not the LInstruction. There are no "LIR opcodes".
Err, that should be:

 0: getprop ; defines vregs 1, 2
 1: getprop ; defines vregs 3, 4
 2: add_ic(vreg1, vreg2, vreg3, vreg4) ; defines vregs 5, 6
Attached patch design 2: operands in MIR (obsolete) — Splinter Review
In this design, the LIR looks exactly like the MIR so it's more like having one IR.
Andrew and Dave both agreed #1 is the way to go, the added flexibility is hard to ignore. So the next question: is it possible to have a LIR that has extra opcodes on top of MIR, while still re-using MIR ops?

I'm not convinced yet that there's an easy way. We could imagine having two derivations of LInstruction: one that relies on having backing MIR, and others that are true LIR opcodes. My gut feeling is it would end up really squirrelly in the code generation phase.

A good example is MAdd and MDiv. If their inputs are integers, they transform into LAddI and LDivI which require very different inputs on x86 versus ARM. But if their inputs are doubles, or values, they can be coalesced into one generic op. So a totally separate LIR would have: LAddI, LDivI (x86), LDivI (ARM), LBinaryD, LBinaryIC. But if all you have is LAdd and LDiv, you will end up with really gross branching all over code generation.

On the other hand, it looks like about half of V8's MIR and HIR ops are simple and shared, which is a point in favor of MIR-backed LIR.
Attached patch sketch 2 (obsolete) — Splinter Review
For now, I'm going to proceed with LIR being an entirely separate IR. If duplication becomes a serious problem, and I don't think it will, we have a few options:

(1) Many operations, like shape, clasp, and type checks, can be broken down into generic loads and compares. It adds compilation overhead though, and for bigger ops, introducing new virtual registers could be tricky.

(2) LIR ops don't necessarily need their own classes. For example, LCheckShape and LCheckClasp could be combined into LCheckPointer or something. Similarly if two operations have identical def and use counts, they can be combined into one LInstruction that stores the actual LOpcode.
Attachment #537055 - Attachment is obsolete: true
Attachment #537056 - Attachment is obsolete: true
Attachment #537057 - Attachment is obsolete: true
Another important aspect of this patch is that I'm starting to abstract away the Nitro assembler. We'll keep using it as a backend, but this way we can improve the exposed data structures and API. For example, we'll get rid of implicit constructors and bogus coercions (RegisterID -> int), and make it easier to emit various addressing schemes without branching. For example, we want to avoid this:

  if (lhs->isRegister() && rhs->isRegister())
     masm.add(lhs->toReg(), rhs->toReg())
  else if (lhs->isRegister() && rhs->isStackSlot())
     masm.add(lhs->toReg(), StackSlot(rhs->stackSlot());
  ... blah blah 100 combinations ...

In favor of:
  masm.add(ToOperand(lhs), ToOperand(rhs))

Long term, we may want to start breaking these abstractions where we could win over platform-generic code. On ARM, ALU instructions can take separate RHS and destination registers, but the MacroAssembler hides this behind the two-register API.
If we're switching to an independent completely low-level IR, does it make sense to make boxing and unboxing completely explicit now? We could let the register allocator and GVN have access to the boxing and unboxing computations proper where previously they would sort of be copy-pasted in in codegen. This would allow, say, a register to die in the middle of a box computation so it could be freed up for use as a temporary.
Attached patch not quite there yet (obsolete) — Splinter Review
The data structures and file layout are in place, the actual lowering process is next.
Attachment #537113 - Attachment is obsolete: true
Attached patch needs review (obsolete) — Splinter Review
This patch implements the proposed LIR data structures and file layout. It needs a high-level review, especially making sure that it will fit into LSRA. The code around lowering MParameter is gross but that will not be normal, the nunbox/punbox deviation will be limited to very few functions, and could be separated into derived classes if it gets problematic.

Brief overview: LBlocks (attached to MBasicBlocks) have a list of LInstructions. Each LInstruction has:
  * 0-2 LDefinitions, which represent outputs.
  * 0-N LAllocations, which represent inputs.
  * 0-N LAllocations needed for temporaries.

An LAllocation is either:
  * Some kind of fixed allocation (stack slot, register),
  * A constant, or
  * A use of a virtual register, with an allocation policy.

An LDefinition contains:
  * An allocation policy.
  * A virtual register for building live ranges.
  * A type, so regalloc can compute register classes and spill sizes,
    among other things.
  * An LAllocation, that may be pre-filled if the definition has a fixed
    policy, otherwise, it is filled during regalloc.
Attachment #537415 - Attachment is obsolete: true
Attachment #537662 - Flags: review?(adrake)
Attached patch rebased (obsolete) — Splinter Review
Attachment #537662 - Attachment is obsolete: true
Attachment #537662 - Flags: review?(adrake)
Attachment #537667 - Flags: review?(adrake)
Comment on attachment 537667 [details] [diff] [review]
rebased

Review of attachment 537667 [details] [diff] [review]:
-----------------------------------------------------------------

Any chance a dummy LIR-x64.h could make it in with a useful #error or some such if it's not going to be implemented until later? Currently it just throws an ugly "no include file" error.

::: js/src/ion/IncludeLIR.h
@@ +45,5 @@
> +// This file includes everything necessary to use LIR data structures.
> +// Unfortunately, it's needed to defeat limitations of C++ headers, since
> +// there are platform-specific opcodes and LIR derivations which must be
> +// defined before IonLIR-inl.h.
> +

I don't really see the issue with just having all of this in IonLIR.h. The comment is a bit confusing given that IonLIR-inl.h is never included anywhere.

::: js/src/ion/IonAllocPolicy.h
@@ +107,5 @@
>  };
>  
> +template <typename T, size_t Arity>
> +class FixedArityList
> +{

This seems like kind of an odd place for a generic data structure -- do we perhaps want to have a header file for ion-specific data structures?

::: js/src/ion/IonLIR-inl.h
@@ +40,5 @@
> + * ***** END LICENSE BLOCK ***** */
> +
> +#ifndef jsion_lir_inl_h__
> +#define jsion_lir_inl_h__
> +

It doesn't look like this file gets included or referenced anywhere but IncludeLIR.h.

::: js/src/ion/MOpcodes.h
@@ +45,5 @@
> +
> +namespace js {
> +namespace ion {
> +
> +#define MIR_OPCODE_LIST(_)                                                  \

While we're here, might we make this into MOpcodes.tbl? I think then the rest of this can be moved into MIR.h.

::: js/src/ion/MachineIncludes.h
@@ +42,5 @@
> +#ifndef jsion_machine_includes_h__
> +#define jsion_machine_includes_h__
> +
> +// This file includes everything necessary to use macro assembler
> +// functionality.

Why not have this in IonAssembler.h then?

@@ +44,5 @@
> +
> +// This file includes everything necessary to use macro assembler
> +// functionality.
> +
> +#include "IonAssembler.h"

I can see there being a use for arch-specific stuff that doesn't depend on the assembler (like how many registers there are), but pulling this in kind of breaks that abstraction.

@@ +46,5 @@
> +// functionality.
> +
> +#include "IonAssembler.h"
> +#if defined(JS_CPU_X86)
> +# include "x86/Assembler-x86.h"

It seems like IonAssembler.h would be responsible for pulling these in?

::: js/src/ion/x64/Architecture-x64.h
@@ +107,5 @@
> +
> +    static const uint32 AllocatableMask = AllMask & ~NonAllocatableMask;
> +};
> +
> +class FloatRegisterCodes {

SSERegisterCodes would probably be a better name, wouldn't want to confuse these with the x87 stack.

It'd be cool if there was a way to make 'alternative' register sets natural in the register allocator, since ARM doesn't have SSE and other architectures may have things that are not SSE or floating point specific but are still useful. We could get away with this currently by templating the register allocator on the FooRegisterCodes, but that's quite a big template and it feels hacky.

::: js/src/ion/x86/Architecture-x86.h
@@ +88,5 @@
> +
> +    static const uint32 AllocatableMask = AllMask & ~NonAllocatableMask;
> +};
> +
> +class FloatRegisterCodes {

Same comment as other FloatRegisterCodes.
Attachment #537667 - Flags: review?(adrake)
> While we're here, might we make this into MOpcodes.tbl? I think then the rest
> of this can be moved into MIR.h.

I kept it separate to minimize include dependence, but it's not really necessary. For example IonLowering.h doesn't include MIR.h, just the bare necessities. I can remove it if you want.

> SSERegisterCodes would probably be a better name, wouldn't want to confuse these with the x87 stack.

Yeah, this is tough. As you noted it'd be great to have some magic way of integrating random register classes, but nothing comes to mind. Probably there's some internal logic in the allocator you want to template on, at least.

I don't like "SSERegisterCodes" because then we have to come up with a separate name for each platform. We're not using x87 "registers" anyway (if you can even call them that), so I say we can steal the moniker.
Attached patch fixed up (obsolete) — Splinter Review
Fixes (I hope) all of the include travesties in the last patch. Also started adding comments for LIR at Nick's suggestion.

Lastly, there is one CPU-specific op now (LBox) to demonstrate how it might be done.
Attachment #537667 - Attachment is obsolete: true
Attachment #537738 - Flags: review?(adrake)
Attached patch emit-at-uses functionality (obsolete) — Splinter Review
Attachment #537738 - Attachment is obsolete: true
Attachment #537738 - Flags: review?(adrake)
Attachment #537801 - Flags: review?(adrake)
This version implements lowering of MReturn, which required attaching fixed registers to live ranges.
Attachment #537801 - Attachment is obsolete: true
Attachment #537801 - Flags: review?(adrake)
Attachment #537827 - Flags: review?(adrake)
Comment on attachment 537827 [details] [diff] [review]
fixed regs for live ranges

Review of attachment 537827 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/ion/IonLIR-inl.h
@@ +39,5 @@
> + *
> + * ***** END LICENSE BLOCK ***** */
> +
> +#ifndef jsion_lir_inl_h__
> +#define jsion_lir_inl_h__

This file isn't included from anywhere.

::: js/src/ion/IonLIR.h
@@ +185,5 @@
> +    static const uint32 FLAG_SHIFT = POLICY_SHIFT + POLICY_BITS;
> +    static const uint32 FLAG_MASK = (1 << FLAG_BITS) - 1;
> +    static const uint32 REG_BITS = 5;
> +    static const uint32 REG_SHIFT = FLAG_SHIFT + FLAG_BITS;
> +    static const uint32 REG_MASK = (1 << REG_BITS) - 1;

Any chance we could use a bitfield for this to avoid doing the nasty bit-ops ourselves?

@@ +340,5 @@
> +    static const uint32 POLICY_MASK = (1 << POLICY_BITS) - 1;
> +
> +    static const uint32 VREG_BITS = (sizeof(bits_) * 8) - (POLICY_BITS + TYPE_BITS);
> +    static const uint32 VREG_SHIFT = POLICY_SHIFT + POLICY_BITS;
> +    static const uint32 VREG_MASK = (1 << VREG_BITS) - 1;

Same question about bitfields.

::: js/src/ion/IonLowering-inl.h
@@ +64,5 @@
> +    return true;
> +}
> +
> +template <size_t X, size_t Y> bool
> +LIRGenerator::define(LInstructionHelper<1, X, Y> *lir, MInstruction *mir)

Nit: X and Y probably aren't the most descriptive letters for template parameters.

::: js/src/ion/x64/Assembler-x64.h
@@ +46,5 @@
> +
> +namespace js {
> +namespace ion {
> +
> +static const Register rcx = { RegisterCodes::RCX };

As a proof-of-concept this is fine, but this should probably get fleshed out before going into the tree.
This applies on top of the original patch, I'll get to review comments shortly. I tried lowering MBox and MUnbox. For the most part they weren't bad, though x86's MBox is truly gross. I'm not sure if this is avoidable or if any other ops will look as bad (I suspect not).

The way boxes work on x86 is that there are two outputs. When defining a box, two sequential virtual registers are reserved, and the first is assigned to the MIR definition. When using an untyped MIR's type tag, the virtual register is |mir->id()|. When using its payload, it's |mir->id() + 1|.

For the most part this will be hidden deep in a few, specific places, and most code will not need to worry about how boxed defs/uses are implemented. Most code will be using happy functions like, use() and define(), and useBox() and defineBox().

This patch also introduces an "emit at uses" concept, so constants don't generate any LIR by default. If an instruction requests an operand and doesn't specify an "allows constants" policy, only then is the instruction is actually emitted, and can be emitted multiple times.
Attachment #537933 - Flags: review?(adrake)
Comment on attachment 537933 [details] [diff] [review]
lower box, unbox operations

Review of attachment 537933 [details] [diff] [review]:
-----------------------------------------------------------------
Attachment #537933 - Flags: review?(adrake) → review+
Attachment #537827 - Flags: review?(adrake) → review+
Okay, I've pushed these and then a third patch to implement spew, which is sort of cryptic but there's a lot of data and not a lot of space to print it.
Status: NEW → ASSIGNED
This patch lets a predecessor know its position in a successor's phi's operand list. It relies on the fact that:
 (1) For every block with phis, its predecessors have at most one successor that has phis.
 (2) A phi has one input for every predecessor.
Attachment #538454 - Flags: review?(adrake)
Attached patch lower phis (obsolete) — Splinter Review
This lowers phis. For x64 it's straightforward, for x86 we create two LIR phis per MPhi, one to capture types and the other to capture data. One tricky part about phis is that we haven't necessarily see their inputs at the start of a basic block, so while they are immediately assigned virtual registers, actually creating them is deferred until all other instructions have been lowered.

This patch also cleans up the emit-at-uses functionality. Emit-at-uses poses a problem for lowering phis, because at the time we lower them, the inputs must have been generated and given virtual registers in their containing predecessor path. To get around this, at the end of lowering a block, we find any successor phis and forcibly generate instructions for constant inputs.
Attachment #538455 - Flags: review?(adrake)
Attachment #538454 - Flags: review?(adrake) → review+
Comment on attachment 538455 [details] [diff] [review]
lower phis

Review of attachment 538455 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/ion/IonLIR.h
@@ +543,5 @@
>  
>  class LBlock : public TempObject
>  {
>      MBasicBlock *block_;
> +    InlineList<LInstruction> phis_;

Does it make things too nasty if these are explicitly LPhis? Without doing that, we'd have to JS_ASSERT(foo->isPhi()) whenever iterating, and we could get it statically.

::: js/src/ion/IonLowering.cpp
@@ +206,5 @@
>      return false;
>  }
>  
>  bool
> +LIRGenerator::doPhi(MPhi *ins)

We have preparePhi, doPhi, and visitPhi. Perhaps doPhi -> lowerPhi would improve clarity a little bit?

@@ +338,5 @@
> +    for (size_t i = 0; i < graph.numBlocks(); i++) {
> +        MBasicBlock *block = graph.getBlock(i);
> +        current = block->lir();
> +        for (size_t j = 0; j < block->numPhis(); j++) {
> +            if (!visitPhi(block->getPhi(j)))

block->getPhi(j)->accept(this) for consistency with the visitor pattern?

::: js/src/ion/IonLowering.h
@@ +130,3 @@
>  
>      template <typename T>
> +    bool annotate(T *ins) {

Do you actually want duck typing here, or LInstruction *?

@@ +140,5 @@
>          return true;
>      }
>  
> +    template <typename T>
> +    bool add(T *ins) {

Same here.
> Does it make things too nasty if these are explicitly LPhis? Without doing
> that, we'd have to JS_ASSERT(foo->isPhi()) whenever iterating, and we could
> get it statically.

Hrm. I just tried this and couldn't get it to work, because LInstruction derives from InlineListNode, InlineList<LPhi>::insert() sees an LPhi as an InlineListNode<LInstruction>, and gives a type error.

What if we put phis in a js::Vector?

> We have preparePhi, doPhi, and visitPhi. Perhaps doPhi -> lowerPhi would
> improve clarity a little bit?

Good idea, thanks.

> block->getPhi(j)->accept(this) for consistency with the visitor pattern?

Done.

> Do you actually want duck typing here, or LInstruction *?

I'm trying to find ways to reduce the large number of virtual calls we're gaining. For add(), it also always statically eliminates one of the code paths. Anyway, it's easy enough to change either way.
Attached patch lower phisSplinter Review
Attachment #538455 - Attachment is obsolete: true
Attachment #538455 - Flags: review?(adrake)
Attachment #538628 - Flags: review?(adrake)
Attachment #538628 - Flags: review?(adrake) → review+
Comment on attachment 538629 [details] [diff] [review]
lower MTest

Looks good, I'm not quite sure what's going to become of fillBoxUses when we have instructions that take two boxes.
Attachment #538629 - Flags: review?(adrake) → review+
Attached patch lower MAddSplinter Review
Last lowering patch, MAdd.
Attachment #538636 - Flags: review?(adrake)
Comment on attachment 538636 [details] [diff] [review]
lower MAdd

Review of attachment 538636 [details] [diff] [review]:
-----------------------------------------------------------------
Attachment #538636 - Flags: review?(adrake) → review+
All of these are pushed. Some of the ops will assert on certain type inputs, but we've got enough functionality now to start assigning registers for mildly interesting code.
Status: ASSIGNED → RESOLVED
Closed: 13 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.