Closed
Bug 812945
Opened 13 years ago
Closed 13 years ago
IonMonkey: pave the way for non-LSRA register allocators
Categories
(Core :: JavaScript Engine, defect)
Tracking
RESOLVED
FIXED
mozilla20
People
(Reporter: bhackett1024, Assigned: bhackett1024)
References
Details
Attachments
(1 file, 3 obsolete files)
78.38 KB, patch | jandem: review+
Right now IonMonkey compilation is fairly well intertwined with its only allocator, LSRA. Adding a new register allocator is a somewhat daunting proposition, as a good fraction of the logic in LSRA needs to be duplicated and the interface with the remainder of compilation (i.e. CodeGenerator) is not well asserted or codified. The attached patch improves the situation in three ways:
- Factors logic generally useful to any register allocator out of LSRA and into a new GenericAllocator class from which LSRA inherits.
- Adds a structure AllocationIntegrityState for checking the validity of a register allocation. This class checks that the register allocation actually preserves the semantics of the original LIR, i.e. that every use of a vreg will be using the physical value that was written by that vreg's defining instruction. This means that regalloc bugs should show up as assertion failures with associated spew rather than as incorrect generated code.
- Adds a new register allocator, StupidAllocator, that is able to keep values in registers within basic blocks. Similar to the allocator used by stock JM, except that more spills are needed because it does not know when stack values are dead. Selectable with --ion-regalloc=stupid
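The integrity-check idea above can be sketched as a forward walk that tracks which allocation currently holds each vreg and verifies every use against it. This is a hypothetical, heavily simplified model (the real AllocationIntegrityState also handles moves, phis, and safepoints), with made-up types:

```cpp
#include <cassert>
#include <map>
#include <utility>
#include <vector>

// Hypothetical simplified model: an allocation is a (kind, index) pair,
// e.g. a physical register or a stack slot.
struct Alloc {
    int kind;
    int index;
    bool operator==(const Alloc &o) const { return kind == o.kind && index == o.index; }
};

// An instruction defines at most one vreg into an allocation, and reads
// some (vreg, allocation) pairs.
struct Ins {
    int defVreg;  // -1 if no definition
    Alloc defAlloc;
    std::vector<std::pair<int, Alloc>> uses;
};

// Walk the instruction stream, tracking which allocation currently holds
// each vreg, and check that every use reads the location its vreg was
// defined into. Returns false on an integrity violation.
bool checkIntegrity(const std::vector<Ins> &code) {
    std::map<int, Alloc> location;
    for (const Ins &ins : code) {
        for (const auto &use : ins.uses) {
            auto it = location.find(use.first);
            if (it == location.end() || !(it->second == use.second))
                return false;  // use does not see the defining write
        }
        if (ins.defVreg >= 0)
            location[ins.defVreg] = ins.defAlloc;
    }
    return true;
}
```

With this kind of check, a regalloc bug surfaces as a failed check with spew attached to a specific use, rather than as silently wrong generated code.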
Mostly done, but still some lingering bugs with the new regalloc to sort out. The new regalloc is pretty simple and clean except for keeping track of GC thing pointers on the native stack, which is kind of a trainwreck without liveness information. Would be nice to separate this from the register allocator somehow but nothing obvious comes to mind.
Comment 1 • 13 years ago
Are there particular limitations of LSRA that are holding Ion back in terms of robustness and performance? Are there more robust, better performing RAs? If so, which one(s)?
Comment 2 (Assignee) • 13 years ago
I don't think anyone has exactly quantified the problem, but I've seen on benchmarks a lot of spill code being generated, which generally indicates an issue with the regalloc. There is a belief floating around that for LSRA to perform well it really needs a lot of tuning, which hasn't really been done for IonMonkey's allocator, and rather than go through that process it'd be nice to investigate other allocators and see how they compare to the existing LSRA.
The main one of interest is the register allocator used in LLVM, see this post:
http://blog.llvm.org/2011/09/greedy-register-allocation-in-llvm-30.html
I'm also interested in trying out classical graph coloring allocation, at least for hot loops. The former is more interesting though, and I'd like to pursue it first.
The main point of this bug is to make isolating the regalloc from the rest of the compiler an easier proposition, which will make developing new regallocs much smoother. After this I'm going to do another bug to try to use the PC count stuff to isolate the performance cost of the spill code executed, after which things should be in a good place for developing and comparing different allocators in a mildly empirical fashion.
Comment 3 • 13 years ago
Nice, +1 to factoring pieces of LSRA out. We used to have something like the StupidAllocator (it was ambiguously called "Greedy"). It was really useful for catching bugs until LSRA was stable.
Two suggestions: Could you give it its own file? And just "RegisterAllocator" probably works for "GenericAllocator", since the derived ones will get descriptive names.
Comment 4 (Assignee) • 13 years ago
Updated WIP, with improvements to simplify safepoint populating when prototyping regallocs. Instead of requiring different mechanisms in each regalloc to populate safepoints (denote live registers, and live locations holding Values and GC thing pointers for stack scanning), this tweaks the allocation integrity checker so that it can be used to do this population automatically. Still some jit-test failures to sort through.
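The check-vs-populate idea can be illustrated with a tiny sketch (hypothetical names): the same traversal over live registers either records a register into the safepoint or verifies that the allocator under test already recorded it, depending on a flag.

```cpp
#include <cassert>
#include <set>

// Hypothetical simplified safepoint: just a set of live registers.
struct Safepoint {
    std::set<int> liveRegs;
};

// One traversal serves both purposes: with populate=true it fills in the
// safepoint automatically; with populate=false it checks that the register
// was already recorded, and returns false on a mismatch.
bool visitLiveRegister(Safepoint &sp, int reg, bool populate) {
    if (populate) {
        sp.liveRegs.insert(reg);
        return true;
    }
    return sp.liveRegs.count(reg) != 0;
}
```

A prototype allocator can then run the integrity pass in populate mode and skip writing safepoint-filling code entirely, while a production allocator runs it in check mode.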
Attachment #682958 - Attachment is obsolete: true
Comment 5 • 13 years ago
(In reply to Brian Hackett (:bhackett) from comment #2)
> I don't think anyone has exactly quantified the problem, but I've seen on
> benchmarks a lot of spill code being generated, which generally indicates an
> issue with the regalloc. There is a belief floating around that for LSRA to
> perform well it really needs a lot of tuning, which hasn't really been done
> for IonMonkey's allocator, and rather than go through that process it'd be
> nice to investigate other allocators and see how they compare to the
> existing LSRA.
>
> The main one of interest is the register allocator used in LLVM, see this
> post:
>
> http://blog.llvm.org/2011/09/greedy-register-allocation-in-llvm-30.html
>
> I'm also interested in trying out classical graph coloring allocation, at
> least for hot loops. The former is more interesting though, and I'd like to
> pursue it first.
>
I think it's great that you're investigating other RAs and paving the way to implement when/if the time comes. The obvious question, though, is why these pros/cons were not discussed in more detail during the design phase(s) of Ion. Or maybe they were, but didn't have enough real-world data to make a more informed choice?
Maybe, for the purposes of this bug, it didn't matter. Is your intention to basically do enough refactoring to enable an almost "plug-n-play" situation with RAs?
Comment 6 (Assignee) • 13 years ago
(In reply to Paul Wright from comment #5)
> The obvious question, though, is why
> these pros/cons were not discussed in more detail during the design phase(s)
> of Ion. Or maybe they were, but didn't have enough real-world data to make
> a more informed choice?
I think that other regallocs were discussed; I've talked (from the sidelines) several times with people over the last year about the LLVM allocator and other regalloc approaches. LSRA is a proven algorithm, though: it has seen a lot of use in e.g. Java, while experience with other allocators in JITs is more limited. Ultimately, you need to implement multiple strategies to get a feel for what will work best in a given environment, and that is not an easy luxury when the main priority is getting a performant compiler out the door.
Comment 7 • 13 years ago
(In reply to Paul Wright from comment #5)
> The obvious question, though, is why these pros/cons were not discussed
> in more detail during the design phase(s) of Ion. Or maybe they were,
> but didn't have enough real-world data to make a more informed choice?
LSRA is the same allocator used in v8. We decided not to delve into allocators that are unproven in the JIT context when a known-good algorithm exists and is expected to match the competition's performance. Now that Ion has stabilized, it is reasonable to experiment with unproven allocators, such as LLVM's Greedy.
Comment 8 • 13 years ago
(In reply to Sean Stangl from comment #7)
>
> LSRA is the same allocator used in v8. We decided not to delve into
> allocators that are unproven in the JIT context when a known-good algorithm
> exists and is expected to match the competition's performance. Now that Ion
> has stabilized, it is reasonable to experiment with unproven allocators,
> such as LLVM's Greedy.
Makes sense, thanks.
Comment 9 (Assignee) • 13 years ago
This patch seems to work on jit-tests --ion, modulo codegen issues due to the dependent bugs.
Assignee: general → bhackett1024
Attachment #684192 - Attachment is obsolete: true
Attachment #684416 - Flags: review?(jdemooij)
Comment 10 • 13 years ago
Comment on attachment 684416 [details] [diff] [review]
patch (a8832e8df0c8)
Review of attachment 684416 [details] [diff] [review]:
-----------------------------------------------------------------
This is really nice. I like the integrity checks.
I think TBPL should run jit-tests at least once with the stupid allocator. If we don't, it's going to bit rot very soon.
::: js/src/ion/LIR.cpp
@@ +318,5 @@
> +LMoveGroup::add(LAllocation *from, LAllocation *to)
> +{
> + JS_ASSERT(*from != *to);
> + for (size_t i = 0; i < moves_.length(); i++)
> + JS_ASSERT(*to != *moves_[i].to());
This loop should be in an #ifdef DEBUG, to not depend on the compiler's DCE.
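A minimal illustration of the nit, with a simplified stand-in for LMoveGroup::add: JS_ASSERT expands to nothing in optimized builds, but the loop driving it would still be compiled unless the whole scan is guarded, so it should not rely on the compiler's dead-code elimination.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Simplified stand-in for LMoveGroup::add, over (from, to) move pairs.
// The duplicate-destination scan exists only to feed assertions, so it is
// compiled out entirely in non-debug builds.
void addMove(std::vector<std::pair<int, int>> &moves, int from, int to) {
    assert(from != to);
#ifdef DEBUG
    // Debug-only: no two moves in the group may write the same destination.
    for (const auto &m : moves)
        assert(to != m.second);
#endif
    moves.push_back({from, to});
}
```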
::: js/src/ion/LIR.h
@@ +960,4 @@
> { }
> void addLiveRegister(AnyRegister reg) {
> + if (!liveRegs_.has(reg))
> + liveRegs_.add(reg);
Nit: liveRegs_.addUnchecked(reg);
@@ +966,5 @@
> return liveRegs_;
> }
> void addGcRegister(Register reg) {
> + if (!gcRegs_.has(reg))
> + gcRegs_.add(reg);
Nit: addUnchecked
@@ +995,5 @@
> + return true;
> + }
> + return false;
> + }
> + return true;
Is this for the LArgument case? Can we JS_ASSERT(alloc.isArgument())?
@@ +1109,5 @@
> + return true;
> + }
> + return addValueSlot(slot);
> + }
> + return true;
JS_ASSERT(alloc.isArgument())?
@@ +1117,5 @@
> + if (alloc.isRegister())
> + return valueRegs().has(alloc.toRegister().gpr());
> + if (alloc.isStackSlot())
> + return hasValueSlot(alloc.toStackSlot()->slot());
> + return true;
Same here.
::: js/src/ion/RegisterAllocator.cpp
@@ +16,5 @@
> + // Ignore repeated record() calls.
> + if (!instructions.empty())
> + return;
> +
> + instructions.reserve(graph.numInstructions());
The reserve/append calls in this method should propagate OOM.
@@ +68,5 @@
> +AllocationIntegrityState::check(bool populateSafepoints)
> +{
> + JS_ASSERT(!instructions.empty());
> +
> + for (size_t blockIndex = 0; blockIndex < graph.numBlocks(); blockIndex++) {
This loop should be in an #ifdef DEBUG, since this function can be called in opt builds.
@@ +129,5 @@
> + }
> + }
> +
> + if (IonSpewEnabled(IonSpew_RegAlloc))
> + dump();
Nit: indentation.
@@ +208,5 @@
> + break;
> +#ifdef JS_NUNBOX32
> + case LDefinition::TYPE:
> + if (populateSafepoints)
> + safepoint->addNunboxType(vreg, alloc);
Why is there no "check(safepoint->hasNunboxType(..), ..);" here?
@@ +212,5 @@
> + safepoint->addNunboxType(vreg, alloc);
> + break;
> + case LDefinition::PAYLOAD:
> + if (populateSafepoints)
> + safepoint->addNunboxPayload(vreg, alloc);
Same here.
@@ +269,5 @@
> + if (p)
> + return;
> + seen.add(p, item);
> +
> + worklist.append(item);
Nit: handle OOM?
@@ +280,5 @@
> + if (IonSpewEnabled(IonSpew_RegAlloc))
> + dump();
> + printf("%s\n", msg);
> + JS_NOT_REACHED("Regalloc integrity failure");
> + }
Nit: indentation. Also, should this be debug-only?
@@ +284,5 @@
> + }
> +}
> +
> +void
> +AllocationIntegrityState::dump()
This is a pretty big function; we should make it debug-only.
::: js/src/ion/StupidAllocator.cpp
@@ +275,5 @@
> + // different, as the phi could be for the value of the input in a previous
> + // loop iteration.
> +
> + for (size_t i = 0; i < registerCount; i++)
> + syncRegister(ins, i);
We sync all registers here, but according to the comment in go() we can keep values in registers across basic blocks? Do we have to update the comment?
@@ +342,5 @@
> + LMoveGroup *input = getInputMoveGroup(ins->id());
> + LAllocation *source = stackLocation(vreg);
> + LAllocation *dest = new LAllocation(reg);
> + input->addAfter(source, dest);
> + registers[index].set(vreg, ins);
The last 5 lines are very similar to the last part of ensureHasRegister. Can you add a new function and call it in both places?
::: js/src/ion/StupidAllocator.h
@@ +26,5 @@
> + // Virtual register this physical reg backs, or MISSING_ALLOCATION.
> + uint32 vreg;
> +
> + uint32 age;
> + bool dirty;
"age" and "dirty" should also have a (short) comment.
@@ +43,5 @@
> + // Type indicating an index into registers.
> + typedef uint32 RegisterIndex;
> +
> + // Information about each virtual register.
> + Vector<LDefinition*, 100, SystemAllocPolicy> virtualRegisters;
Since init calls virtualRegisters.reserve(..), this can just be 0, right?
Attachment #684416 - Flags: review?(jdemooij)
Comment 11 (Assignee) • 13 years ago
Revised for comments.
I originally had checks for hasNunboxType and hasNunboxPayload, but they fail with LSRA. Not a correctness bug, but will need to be addressed when we have a moving GC. Added a comment.
Attachment #684416 - Attachment is obsolete: true
Attachment #684769 - Flags: review?(jdemooij)
Comment 12 • 13 years ago
Comment on attachment 684769 [details] [diff] [review]
patch
Review of attachment 684769 [details] [diff] [review]:
-----------------------------------------------------------------
Nice. Do you think we want to run some tests with --ion-regalloc=stupid on TBPL?
Attachment #684769 - Flags: review?(jdemooij) → review+
Comment 13 (Assignee) • 13 years ago
(In reply to Jan de Mooij [:jandem] from comment #12)
> Comment on attachment 684769 [details] [diff] [review]
> patch
>
> Review of attachment 684769 [details] [diff] [review]:
> -----------------------------------------------------------------
>
> Nice. Do you think we want to run some tests with --ion-regalloc=stupid on
> TBPL?
I'm not sure yet (hopefully the integrity checks will keep non-standard regallocs from bitrotting too easily), but right now the stupid allocator fails some jit-tests and is waiting on the dependent bugs filed to be fixed.
Comment 14 (Assignee) • 13 years ago
Comment 15 • 13 years ago
Status: NEW → RESOLVED
Closed: 13 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla20