Closed
Bug 812945
Opened 13 years ago
Closed 13 years ago
IonMonkey: pave the way for non-LSRA register allocators
Categories
(Core :: JavaScript Engine, defect)
Tracking
RESOLVED
FIXED
mozilla20
People
(Reporter: bhackett1024, Assigned: bhackett1024)
References
Details
Attachments
(1 file, 3 obsolete files)
78.38 KB, patch | jandem: review+
Right now IonMonkey compilation is fairly well intertwined with its only allocator, LSRA. Adding a new register allocator is a somewhat daunting proposition, as a good fraction of the logic in LSRA needs to be duplicated and the interface with the remainder of compilation (i.e. CodeGenerator) is not well asserted or codified. The attached patch improves the situation in three ways:
- Factors logic generally useful to any register allocator out of LSRA and into a new GenericAllocator class from which LSRA inherits.
- Adds a structure AllocationIntegrityState for checking the validity of a register allocation. This class checks that the register allocation actually preserves the semantics of the original LIR, i.e. that every use of a vreg will be using the physical value that was written by that vreg's defining instruction. This means that regalloc bugs should show up as assertion failures with associated spew rather than as incorrect generated code.
- Adds a new register allocator, StupidAllocator, that is able to keep values in registers within basic blocks. Similar to the allocator used by stock JM, except that more spills are needed because it does not know when stack values are dead. Selectable with --ion-regalloc=stupid
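The integrity-check idea above can be sketched as a forward walk that tracks which allocation currently holds each vreg and verifies every use against it. This is a hypothetical, heavily simplified model (the real AllocationIntegrityState also handles moves, phis, and safepoints), with made-up types:

```cpp
#include <cassert>
#include <map>
#include <utility>
#include <vector>

// Hypothetical simplified model: an allocation is a (kind, index) pair,
// e.g. a physical register or a stack slot.
struct Alloc {
    int kind;
    int index;
    bool operator==(const Alloc &o) const { return kind == o.kind && index == o.index; }
};

// An instruction defines at most one vreg into an allocation, and reads
// some (vreg, allocation) pairs.
struct Ins {
    int defVreg;  // -1 if no definition
    Alloc defAlloc;
    std::vector<std::pair<int, Alloc>> uses;
};

// Walk the instruction stream, tracking which allocation currently holds
// each vreg, and check that every use reads the location its vreg was
// defined into. Returns false on an integrity violation.
bool checkIntegrity(const std::vector<Ins> &code) {
    std::map<int, Alloc> location;
    for (const Ins &ins : code) {
        for (const auto &use : ins.uses) {
            auto it = location.find(use.first);
            if (it == location.end() || !(it->second == use.second))
                return false;  // use does not see the defining write
        }
        if (ins.defVreg >= 0)
            location[ins.defVreg] = ins.defAlloc;
    }
    return true;
}
```

With this kind of check, a regalloc bug surfaces as a failed check with spew attached to a specific use, rather than as silently wrong generated code.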
Mostly done, but still some lingering bugs with the new regalloc to sort out. The new regalloc is pretty simple and clean except for keeping track of GC thing pointers on the native stack, which is kind of a trainwreck without liveness information. Would be nice to separate this from the register allocator somehow but nothing obvious comes to mind.
Comment 1 • 13 years ago
Are there particular limitations of LSRA that are holding Ion back in terms of robustness and performance? Are there more robust, better performing RAs? If so, which one(s)?
Comment 2 (Assignee) • 13 years ago
I don't think anyone has exactly quantified the problem, but I've seen on benchmarks a lot of spill code being generated, which generally indicates an issue with the regalloc. There is a belief floating around that for LSRA to perform well it really needs a lot of tuning, which hasn't really been done for IonMonkey's allocator, and rather than go through that process it'd be nice to investigate other allocators and see how they compare to the existing LSRA.
The main one of interest is the register allocator used in LLVM, see this post:
http://blog.llvm.org/2011/09/greedy-register-allocation-in-llvm-30.html
I'm also interested in trying out classical graph coloring allocation, at least for hot loops. The former is more interesting though, and I'd like to pursue it first.
The main point of this bug is to make isolating the regalloc from the rest of the compiler an easier proposition, which will make developing new regallocs much smoother. After this I'm going to do another bug to try to use the PC count stuff to isolate the performance cost of the spill code executed, after which things should be in a good place for developing and comparing different allocators in a mildly empirical fashion.
Comment 3 • 13 years ago
Nice, +1 to factoring pieces of LSRA out. We used to have something like the StupidAllocator (it was ambiguously called "Greedy"). It was really useful for catching bugs until LSRA was stable.
Two suggestions: Could you give it its own file? And just "RegisterAllocator" probably works for "GenericAllocator", since the derived ones will get descriptive names.
Comment 4 (Assignee) • 13 years ago
Updated WIP, with improvements to simplify safepoint populating when prototyping regallocs. Instead of requiring different mechanisms in each regalloc to populate safepoints (denote live registers, and live locations holding Values and GC thing pointers for stack scanning), this tweaks the allocation integrity checker so that it can be used to do this population automatically. Still some jit-test failures to sort through.
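The check-vs-populate idea can be illustrated with a tiny sketch (hypothetical names): the same traversal over live registers either records a register into the safepoint or verifies that the allocator under test already recorded it, depending on a flag.

```cpp
#include <cassert>
#include <set>

// Hypothetical simplified safepoint: just a set of live registers.
struct Safepoint {
    std::set<int> liveRegs;
};

// One traversal serves both purposes: with populate=true it fills in the
// safepoint automatically; with populate=false it checks that the register
// was already recorded, and returns false on a mismatch.
bool visitLiveRegister(Safepoint &sp, int reg, bool populate) {
    if (populate) {
        sp.liveRegs.insert(reg);
        return true;
    }
    return sp.liveRegs.count(reg) != 0;
}
```

A prototype allocator can then run the integrity pass in populate mode and skip writing safepoint-filling code entirely, while a production allocator runs it in check mode.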
Attachment #682958 - Attachment is obsolete: true
Comment 5 • 13 years ago
(In reply to Brian Hackett (:bhackett) from comment #2)
> I don't think anyone has exactly quantified the problem, but I've seen on
> benchmarks a lot of spill code being generated, which generally indicates an
> issue with the regalloc. There is a belief floating around that for LSRA to
> perform well it really needs a lot of tuning, which hasn't really been done
> for IonMonkey's allocator, and rather than go through that process it'd be
> nice to investigate other allocators and see how they compare to the
> existing LSRA.
>
> The main one of interest is the register allocator used in LLVM, see this
> post:
>
> http://blog.llvm.org/2011/09/greedy-register-allocation-in-llvm-30.html
>
> I'm also interested in trying out classical graph coloring allocation, at
> least for hot loops. The former is more interesting though, and I'd like to
> pursue it first.
>
I think it's great that you're investigating other RAs and paving the way to implement when/if the time comes. The obvious question, though, is why these pros/cons were not discussed in more detail during the design phase(s) of Ion. Or maybe they were, but didn't have enough real-world data to make a more informed choice?
Maybe, for the purposes of this bug, it didn't matter. Is your intention to basically do enough refactoring to enable an almost "plug-n-play" situation with RAs?
Comment 6 (Assignee) • 13 years ago
(In reply to Paul Wright from comment #5)
> The obvious question, though, is why
> these pros/cons were not discussed in more detail during the design phase(s)
> of Ion. Or maybe they were, but didn't have enough real-world data to make
> a more informed choice?
I think that other regallocs were discussed; I've talked (from the sidelines) several times with people over the last year about the LLVM allocator and other regalloc approaches. LSRA is a proven algorithm, though: it has seen a lot of use in e.g. Java, while experience with other allocators in JITs is more limited. Ultimately, you need to implement multiple strategies to get a feel for what will work best in a given environment, and that is not an easy luxury when the main priority is getting a performant compiler out the door.
Comment 7 • 13 years ago
(In reply to Paul Wright from comment #5)
> The obvious question, though, is why these pros/cons were not discussed
> in more detail during the design phase(s) of Ion. Or maybe they were,
> but didn't have enough real-world data to make a more informed choice?
LSRA is the same allocator used in v8. We decided not to delve into allocators that are unproven in the JIT context when a known-good algorithm exists and is expected to match the competition's performance. Now that Ion has stabilized, it is reasonable to experiment with unproven allocators, such as LLVM's Greedy.
Comment 8 • 13 years ago
(In reply to Sean Stangl from comment #7)
>
> LSRA is the same allocator used in v8. We decided not to delve into
> allocators that are unproven in the JIT context when a known-good algorithm
> exists and is expected to match the competition's performance. Now that Ion
> has stabilized, it is reasonable to experiment with unproven allocators,
> such as LLVM's Greedy.
Makes sense, thanks.
Comment 9 (Assignee) • 13 years ago
This patch seems to work on jit-tests --ion, modulo codegen issues due to the dependent bugs.
Assignee: general → bhackett1024
Attachment #684192 - Attachment is obsolete: true
Attachment #684416 - Flags: review?(jdemooij)
Comment 10 • 13 years ago
Comment on attachment 684416 [details] [diff] [review]
patch (a8832e8df0c8)
Review of attachment 684416 [details] [diff] [review]:
-----------------------------------------------------------------
This is really nice. I like the integrity checks.
I think TBPL should run jit-tests at least once with the stupid allocator. If we don't, it's going to bit rot very soon.
::: js/src/ion/LIR.cpp
@@ +318,5 @@
> +LMoveGroup::add(LAllocation *from, LAllocation *to)
> +{
> + JS_ASSERT(*from != *to);
> + for (size_t i = 0; i < moves_.length(); i++)
> + JS_ASSERT(*to != *moves_[i].to());
This loop should be in an #ifdef DEBUG, to not depend on the compiler's DCE.
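A minimal illustration of the nit, with a simplified stand-in for LMoveGroup::add: JS_ASSERT expands to nothing in optimized builds, but the loop driving it would still be compiled unless the whole scan is guarded, so it should not rely on the compiler's dead-code elimination.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Simplified stand-in for LMoveGroup::add, over (from, to) move pairs.
// The duplicate-destination scan exists only to feed assertions, so it is
// compiled out entirely in non-debug builds.
void addMove(std::vector<std::pair<int, int>> &moves, int from, int to) {
    assert(from != to);
#ifdef DEBUG
    // Debug-only: no two moves in the group may write the same destination.
    for (const auto &m : moves)
        assert(to != m.second);
#endif
    moves.push_back({from, to});
}
```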
::: js/src/ion/LIR.h
@@ +960,4 @@
> { }
> void addLiveRegister(AnyRegister reg) {
> + if (!liveRegs_.has(reg))
> + liveRegs_.add(reg);
Nit: liveRegs_.addUnchecked(reg);
@@ +966,5 @@
> return liveRegs_;
> }
> void addGcRegister(Register reg) {
> + if (!gcRegs_.has(reg))
> + gcRegs_.add(reg);
Nit: addUnchecked
@@ +995,5 @@
> + return true;
> + }
> + return false;
> + }
> + return true;
Is this for the LArgument case? Can we JS_ASSERT(alloc.isArgument())?
@@ +1109,5 @@
> + return true;
> + }
> + return addValueSlot(slot);
> + }
> + return true;
JS_ASSERT(alloc.isArgument())?
@@ +1117,5 @@
> + if (alloc.isRegister())
> + return valueRegs().has(alloc.toRegister().gpr());
> + if (alloc.isStackSlot())
> + return hasValueSlot(alloc.toStackSlot()->slot());
> + return true;
Same here.
::: js/src/ion/RegisterAllocator.cpp
@@ +16,5 @@
> + // Ignore repeated record() calls.
> + if (!instructions.empty())
> + return;
> +
> + instructions.reserve(graph.numInstructions());
The reserve/append calls in this method should propagate OOM.
@@ +68,5 @@
> +AllocationIntegrityState::check(bool populateSafepoints)
> +{
> + JS_ASSERT(!instructions.empty());
> +
> + for (size_t blockIndex = 0; blockIndex < graph.numBlocks(); blockIndex++) {
This loop should be in an #ifdef DEBUG, since this function can be called in opt builds.
@@ +129,5 @@
> + }
> + }
> +
> + if (IonSpewEnabled(IonSpew_RegAlloc))
> + dump();
Nit: indentation.
@@ +208,5 @@
> + break;
> +#ifdef JS_NUNBOX32
> + case LDefinition::TYPE:
> + if (populateSafepoints)
> + safepoint->addNunboxType(vreg, alloc);
Why is there no "check(safepoint->hasNunboxType(..), ..);" here?
@@ +212,5 @@
> + safepoint->addNunboxType(vreg, alloc);
> + break;
> + case LDefinition::PAYLOAD:
> + if (populateSafepoints)
> + safepoint->addNunboxPayload(vreg, alloc);
Same here.
@@ +269,5 @@
> + if (p)
> + return;
> + seen.add(p, item);
> +
> + worklist.append(item);
Nit: handle OOM?
@@ +280,5 @@
> + if (IonSpewEnabled(IonSpew_RegAlloc))
> + dump();
> + printf("%s\n", msg);
> + JS_NOT_REACHED("Regalloc integrity failure");
> + }
Nit: indentation. Also, should this be debug-only?
@@ +284,5 @@
> + }
> +}
> +
> +void
> +AllocationIntegrityState::dump()
This is a pretty big function; we should make it debug-only.
::: js/src/ion/StupidAllocator.cpp
@@ +275,5 @@
> + // different, as the phi could be for the value of the input in a previous
> + // loop iteration.
> +
> + for (size_t i = 0; i < registerCount; i++)
> + syncRegister(ins, i);
We sync all registers here, but according to the comment in go() we can keep values in registers across basic blocks? Do we have to update the comment?
@@ +342,5 @@
> + LMoveGroup *input = getInputMoveGroup(ins->id());
> + LAllocation *source = stackLocation(vreg);
> + LAllocation *dest = new LAllocation(reg);
> + input->addAfter(source, dest);
> + registers[index].set(vreg, ins);
The last 5 lines are very similar to the last part of ensureHasRegister. Can you add a new function and call it in both places?
::: js/src/ion/StupidAllocator.h
@@ +26,5 @@
> + // Virtual register this physical reg backs, or MISSING_ALLOCATION.
> + uint32 vreg;
> +
> + uint32 age;
> + bool dirty;
"age" and "dirty" should also have a (short) comment.
@@ +43,5 @@
> + // Type indicating an index into registers.
> + typedef uint32 RegisterIndex;
> +
> + // Information about each virtual register.
> + Vector<LDefinition*, 100, SystemAllocPolicy> virtualRegisters;
Since init calls virtualRegisters.reserve(..), this can just be 0, right?
Attachment #684416 - Flags: review?(jdemooij)
Comment 11 (Assignee) • 13 years ago
Revised for comments.
I originally had checks for hasNunboxType and hasNunboxPayload, but they fail with LSRA. Not a correctness bug, but will need to be addressed when we have a moving GC. Added a comment.
Attachment #684416 - Attachment is obsolete: true
Attachment #684769 - Flags: review?(jdemooij)
Comment 12 • 13 years ago
Comment on attachment 684769 [details] [diff] [review]
patch
Review of attachment 684769 [details] [diff] [review]:
-----------------------------------------------------------------
Nice. Do you think we want to run some tests with --ion-regalloc=stupid on TBPL?
Attachment #684769 - Flags: review?(jdemooij) → review+
Comment 13 (Assignee) • 13 years ago
(In reply to Jan de Mooij [:jandem] from comment #12)
> Comment on attachment 684769 [details] [diff] [review]
> patch
>
> Review of attachment 684769 [details] [diff] [review]:
> -----------------------------------------------------------------
>
> Nice. Do you think we want to run some tests with --ion-regalloc=stupid on
> TBPL?
I'm not sure yet (hopefully the integrity checks will keep non-standard regallocs from bitrotting too easily), but right now the stupid allocator fails some jit-tests and is waiting on the dependent bugs filed to be fixed.
Comment 14 (Assignee) • 13 years ago
Comment 15 • 13 years ago
Status: NEW → RESOLVED
Closed: 13 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla20