Closed
Bug 733030
Opened 12 years ago
Closed 9 years ago
IonMonkey: Fix some LSRA performance problems
Categories
(Core :: JavaScript Engine, defect)
Core
JavaScript Engine
Tracking
()
RESOLVED
WONTFIX
People
(Reporter: jandem, Assigned: jandem)
References
(Blocks 1 open bug)
Details
(Whiteboard: [ion:t])
Attachments
(4 files, 1 obsolete file)
8.22 KB,
patch
|
sstangl
:
review+
|
Details | Diff | Splinter Review |
6.72 KB,
patch
|
sstangl
:
review+
|
Details | Diff | Splinter Review |
3.69 KB,
patch
|
dvander
:
review+
|
Details | Diff | Splinter Review |
15.24 KB,
patch
|
sstangl
:
review+
|
Details | Diff | Splinter Review |
With the patch in bug 668305, Kraken audio-beat-detection still spends 76 ms (~20%) in the register allocator. The most serious problem is the large number of virtual registers (> 3000) - the bitsets to track live vregs are large. I hope we can avoid iterating over these bitsets in some cases, but eventually we'll need chunked compilation (JM+TI uses at least 22 chunks for this function). In the meantime, however, there are still a number of known regalloc problems we should fix.
Assignee | ||
Comment 1•12 years ago
|
||
If we assign a memory location to an interval, we don't want to add the interval to the active/inactive lists, since these lists are only used for registers.
Attachment #603205 -
Flags: review?(sstangl)
Comment 2•12 years ago
|
||
Comment on attachment 603205 [details] [diff] [review] Part 1: Only add intervals with registers to active/inactive Review of attachment 603205 [details] [diff] [review]: ----------------------------------------------------------------- Great.
Attachment #603205 -
Flags: review?(sstangl) → review+
Comment 3•12 years ago
|
||
Oh, also -- I think finishInterval() can now have "if (!interval->reg()) return;" replaced with a JS_ASSERT().
Assignee | ||
Comment 4•12 years ago
|
||
While addressing comment 3, I noticed a few problems with stack slots: 1) We still have to call finishInterval to free the stack slot, this patch adds an activeSlots list for that. It's just like the "active" list, but we don't have to search it in findBestFreeRegister etc. 2) If the interval is the last interval, finishInterval should free the canonical spill slot (if there is one), not the interval's allocation. If the last interval uses a register, and another interval uses a stack slot, we'll now correctly free the stack slot. 3) The JS_ASSERT(interval->reg()) is no longer necessary since it's immediately dereferenced.
Attachment #603693 -
Flags: review?(sstangl)
Assignee | ||
Comment 5•12 years ago
|
||
Correct version this time.
Attachment #603693 -
Attachment is obsolete: true
Attachment #603700 -
Flags: review?(sstangl)
Attachment #603693 -
Flags: review?(sstangl)
Assignee | ||
Comment 6•12 years ago
|
||
Some changes to setIntervalRequirement: 1) Split the loop in two loops: first search for requirements, then search for hints. This works since the list of uses is guaranteed to be sorted. 2) Don't search for hints if there's a canonical spill location. In this case we should just spill the interval (if there are register uses the interval will be split). This case is pretty common, and the patch is a reasonable crypto-md5 / audio-beat-detection speedup.
Attachment #603708 -
Flags: review?(dvander)
![]() |
||
Updated•12 years ago
|
Attachment #603708 -
Flags: review?(dvander) → review+
Assignee | ||
Comment 7•12 years ago
|
||
This adds a new function addRangeAtHead (better name welcome), which we can use instead of addRange in most cases, because we know the range will be the first one (instructions are visited backwards). validateVirtualRegister is used to verify this.
Attachment #604036 -
Flags: review?(sstangl)
Comment 8•12 years ago
|
||
Just a few questions before review, because I didn't understand some changes. (In reply to Jan de Mooij (:jandem) from comment #4) > 1) We still have to call finishInterval to free the stack slot, this patch > adds an activeSlots list for that. It's just like the "active" list, but we > don't have to search it in findBestFreeRegister etc. Historically, before the optimization in Part 1A, finishInterval() ignored stack slots -- the first thing it did was "if (!interval->reg()) return;". Then it would go and, for that register, free its canonical spill location. If we then have an interval that is a stack slot, does it make sense for it to have a spill location at all? Isn't it already spilled? Shouldn't there be nothing left to free?
Assignee | ||
Comment 9•12 years ago
|
||
(In reply to Sean Stangl from comment #8) > > Historically, before the optimization in Part 1A, finishInterval() ignored > stack slots -- the first thing it did was "if (!interval->reg()) return;". Every virtual register has a list of intervals. Every interval has a pointer to its "parent" virtual register. interval->reg() returns this pointer. interval->reg() can/could be NULL in two cases: 1) "Bogus" intervals. These were removed a while ago (I thought this was what you meant in comment 3). 2) Fixed intervals. These should never end up in the active/inactive/unhandled sets or finishInterval - their allocation is always a single physical register and they are only used to determine if an interval intersects a fixed interval. The "fixedIntervalsUnion" is the union of all fixed intervals, so that we can quickly test whether an interval intersects any fixed interval (to avoid always looping over all fixed intervals). So JS_ASSERT(interval->reg()); just checks if the interval belongs to a virtual register. > Then it would go and, for that register, free its canonical spill location. > > If we then have an interval that is a stack slot, does it make sense for it > to have a spill location at all? Isn't it already spilled? Shouldn't there > be nothing left to free? A virtual register can have a single canonical spill location. If one of its intervals is assigned a stack slot, it must be equal to this canonical spill location. For example: Virtual register 3 (caonical spill: stack slot 5): - interval 0: (allocation: register eax) - interval 1: (allocation: stack slot 5) - interval 2: (allocation: register ecx) When finishInterval is called for interval 2, we notice that it's the last interval so we have to free the canonical spill slot. The bug was that even if we assign a register (ecx) to interval 2, there is still a canonical spill slot to free (interval 1's slot is not freed earlier, a later interval may want to use the same canonical spill location).
Comment 10•12 years ago
|
||
Comment on attachment 603700 [details] [diff] [review] Part 1 follow-up Review of attachment 603700 [details] [diff] [review]: ----------------------------------------------------------------- Thanks for the explanation! Looks good. ::: js/src/ion/LinearScan.h @@ +617,2 @@ > InlineList<LiveInterval> handled; > +#endif Can this be DebugOnly<InlineList<LiveInterval> >?
Attachment #603700 -
Flags: review?(sstangl) → review+
Comment 11•12 years ago
|
||
Comment on attachment 604036 [details] [diff] [review] Part 3: Optimize addRange Review of attachment 604036 [details] [diff] [review]: ----------------------------------------------------------------- ::: js/src/ion/LinearScan.cpp @@ +546,5 @@ > // the interval to the point of definition. > for (BitSet::Iterator liveRegId(*live); liveRegId; liveRegId++) { > + if (!vregs[*liveRegId].getInterval(0)->addRangeAtHead(inputOf(block->firstId()), > + outputOf(block->lastId()).next())) > + return false; Style guidelines require if()s with multi-line conditions to have {} braces, with the opening brace on its own line. @@ +654,5 @@ > // This is a dead phi, so add a dummy range over all phis. This > // can go away if we have an earlier dead code elimination pass. > + if (!vregs[def].getInterval(0)->addRangeAtHead(inputOf(block->firstId()), > + outputOf(block->firstId()))) > + return false; Same {} nit @@ +673,5 @@ > // Add an interval for this entire loop block > for (BitSet::Iterator liveRegId(*live); liveRegId; liveRegId++) { > + if (!vregs[*liveRegId].getInterval(0)->addRange(inputOf(loopBlock->lir()->firstId()), > + outputOf(loopBlock->lir()->lastId()).next())) > + return false; Same {} nit @@ +720,5 @@ > // rest of the allocator can assume it has at least one range. > if (fixedIntervalsUnion->numRanges() == 0) { > + if (!fixedIntervalsUnion->addRangeAtHead(CodePosition(0, CodePosition::INPUT), > + CodePosition(0, CodePosition::OUTPUT))) > + return false; Same {} nit @@ +1954,5 @@ > +LiveInterval::validateRanges() > +{ > + Range *prev = NULL; > + > + size_t i = ranges_.length() - 1; This could be added to the for(){} without breaking the column limit. @@ +1991,1 @@ > #endif Prefer "#endif // DEBUG" due to length of block.
Attachment #604036 -
Flags: review?(sstangl) → review+
Assignee | ||
Comment 12•12 years ago
|
||
I landed these patches, but backed them out due to regressing Kraken-desaturate on x86. What's strange though is that I can't reproduce this regression locally, and diff'ing ion.cfg shows that the only difference with these patches is that we are now able to reuse a stack slot.
Comment 13•12 years ago
|
||
(In reply to Jan de Mooij (:jandem) from comment #12) > I landed these patches, but backed them out due to regressing > Kraken-desaturate on x86. What's strange though is that I can't reproduce > this regression locally, and diff'ing ion.cfg shows that the only difference > with these patches is that we are now able to reuse a stack slot. Ever get this sorted out?
Assignee | ||
Comment 14•12 years ago
|
||
(In reply to Paul Wright from comment #13) > > Ever get this sorted out? I just relanded part 2 and 3: http://hg.mozilla.org/projects/ionmonkey/rev/bd4524c5b011 http://hg.mozilla.org/projects/ionmonkey/rev/5d02acbb390c Keeping this bug open for part 1.
![]() |
||
Updated•11 years ago
|
Whiteboard: [ion:t]
Comment 15•11 years ago
|
||
Is part 1 a "wontfix"?
Assignee | ||
Comment 16•9 years ago
|
||
LSRA is now disabled by default (and will be removed soon) so we can close this.
Status: ASSIGNED → RESOLVED
Closed: 9 years ago
Resolution: --- → INVALID
Assignee | ||
Updated•9 years ago
|
Resolution: INVALID → WONTFIX
You need to log in
before you can comment on or make changes to this bug.
Description
•