Closed
Bug 535681
Opened 15 years ago
Closed 15 years ago
nanojit: make iterating over all active registers faster
Categories
(Core Graveyard :: Nanojit, defect)
Tracking
(Not tracked)
RESOLVED
WONTFIX
People
(Reporter: n.nethercote, Assigned: n.nethercote)
References
Details
Attachments
(1 file)
19.04 KB,
patch
|
Details | Diff | Splinter Review |
Nanojit frequently iterates over all active registers. This accounts for a non-negligible fraction of compile time. (And see bug 524890 for a surprising consequence of this.) The code always looks something like this: for (Register r = FirstReg; r <= LastReg; r = nextreg(r)) { if (LIns* ins = _allocator.getActive(r)) { ... } } getActive(r) is just an array lookup, ie. _allocator.active[r]. A lot of the time only a small fraction of the registers are active, and we don't need to do random accesses in active[] very much. This points towards changing active[] into a list of (Register, LIns*) pairs. I have started experimenting with this but it turns out to be trickier than I expected because often when we are iterating over the registers we are also doing things like evicting them, ie. we are effectively modifying the list as we iterate over it. Also, intersectRegisterState() and unionRegisterState() don't follow the simple pattern above. I need to do some more work on it.
Assignee | ||
Comment 1•15 years ago
|
||
Comment 2•15 years ago
|
||
If we have a bitmask of the registers we want to iterate over, we could use a variation on nRegisterAllocFromSet() which uses CPU's bit-first-set instruction. Currently, RegAlloc.free is the only mask we maintain, but if we have a "managed" mask, then we can always get active = (~free)&managed, or vice-versa: free = (~active)&managed. (do we need "active" or "free" more often?) call site iteration set --------- ------------- ReleaseRegisters() active evictScratchRegs() active & GpRegs evictAllActiveRegs() active evictSomeActiveRegs() active & regs intersectRegState() cur.active | saved.active (iterate hi->lo) unionRegisterState() cur.active | saved.active (iterate hi->lo) assignSaved() saved.active & ~skip findVictim() allow & active if we are careful, modifying the list as we iterate should be a matter of zeroing out the 1 bits in the mask as we visit them. lo->hi hi->lo ------ ------ intel bsf bsr ppc cntzlw arm clz sparc ? if the cpu doesn't have a reversed bitscan instruction we're sort of stuck. We could use a 256-entry lookup table with 4 loads and stores to build the reversed mask or just do the slow scan. even skipping ahead one register at a time via an index and a bitmask should be marginally better than loading from the active[] array each time.
Comment 3•15 years ago
|
||
(In reply to comment #2) > if the cpu doesn't have a reversed bitscan instruction we're sort of stuck. see http://graphics.stanford.edu/~seander/bithacks.html for various approaches.
Assignee | ||
Comment 4•15 years ago
|
||
'free' is a bitmask but 'active' is an array of LIns*. So I think bit scanning is a red herring. I've made some progress on changing 'active' into a hybrid array/list structure. I've augmented each entry with a 'next' pointer so a list can be threaded through the active entries -- addActive() and removeActive() are then responsible for updating the pointers appropriately and so are marginally slower. It's still a register-indexed array so random accesses are still O(1), but you can traverse the active registers in O(n) time where n is the number of active registers. I should make more progress with it on Monday.
Comment 5•15 years ago
|
||
I didn't say it in the bit-scanning proposal but the gist of the idea is to have RegAlloc have a new bitmask, so that we indeed can do bit scanning on (masks of) the active array.
Assignee | ||
Comment 6•15 years ago
|
||
I just marked bug 524890 as WONTFIX. The likelihood of this bug making a noticeable difference seems smaller now, and what hacking I've done with the hybrid array/list structure has made things a lot more complicated. It doesn't seem worth the effort. I'm not planning to do any more here, so I'm marking this as WONTFIX as well.
Status: ASSIGNED → RESOLVED
Closed: 15 years ago
Resolution: --- → FIXED
Assignee | ||
Updated•15 years ago
|
Resolution: FIXED → WONTFIX
Updated•10 years ago
|
Product: Core → Core Graveyard
You need to log in
before you can comment on or make changes to this bug.
Description
•