Closed
Bug 695389
Opened 13 years ago
Closed 6 years ago
[nanojit] Refactorization of RegAlloc, enable overlapping registers for ARM
Categories
(Tamarin Graveyard :: Virtual Machine, defect, P3)
Tamarin Graveyard
Virtual Machine
Tracking
(Not tracked)
RESOLVED
WONTFIX
Q2 12 - Cyril
People
(Reporter: virgilp, Assigned: wmaddox)
References
Details
Attachments
(3 files)
56.62 KB,
patch
|
wmaddox
:
review-
edwsmith
:
review-
|
Details | Diff | Splinter Review |
725 bytes,
patch
|
wmaddox
:
review+
edwsmith
:
review+
|
Details | Diff | Splinter Review |
74.51 KB,
patch
|
edwsmith
:
review+
|
Details | Diff | Splinter Review |
Float/Float4 support needs access to much more ARM registers. This patch prepares the register allocator to handle overlapping registers. It does not actually include the NativeArm changes, it's just the base work to enable adding Qn and Sn registers in NativeArm
Reporter | ||
Updated•13 years ago
|
Blocks: float/float4
Reporter | ||
Comment 1•13 years ago
|
||
Reporter | ||
Comment 2•13 years ago
|
||
Reporter | ||
Comment 3•13 years ago
|
||
Note: ARM doesn't use overlapping registers, that'll be another bug. This patch just adapts the native platforms to the new regalloc interface. Tested on buildbot (works on all platforms). Not tested on PPC, SH4, Sparc - likely won't work there.
Reporter | ||
Comment 4•13 years ago
|
||
Comment on attachment 567800 [details] [diff] [review] RegAlloc base changes Please review or reassign for review,
Attachment #567800 -
Flags: review?(wmaddox)
Attachment #567800 -
Flags: review?(edwsmith)
Reporter | ||
Comment 5•13 years ago
|
||
Comment on attachment 567803 [details] [diff] [review] RegAlloc changes to core (tamarin, not nanojit) wmaddox@adobe.com, edwsmith@adobe.com,
Reporter | ||
Updated•13 years ago
|
Attachment #568029 -
Flags: review?(wmaddox)
Attachment #568029 -
Flags: review?(edwsmith)
Reporter | ||
Updated•13 years ago
|
Attachment #567803 -
Flags: review?(wmaddox)
Attachment #567803 -
Flags: review?(edwsmith)
Reporter | ||
Updated•13 years ago
|
Blocks: float&float4_JIT
Comment 6•13 years ago
|
||
Comment on attachment 567800 [details] [diff] [review] RegAlloc base changes Overall it looks ok, but the scheme for overlapping registers needs to be explained clearly and the allocation/retire code need to be more obviously correct, before R+. substance: The "RegAlloc basics" comment needs an introduction explaining overlapping registers. > inline void RegAlloc::retire(Register r) The for loop doesn't seem necessary, it will always have 1 bit set (r). I searched in this patch and the others for a way for rmask(r) to result in >1 bit set in the mask, but couldn't find it. It would be easier to follow the #ifdefd code if the #endif didn't land between an if() and the then statement, even if it means duplicating a line or two of code. > #define IS_REG_IN_MASK() etc prefer inline functions unless there's a strong justification for macros. in allocReg(): > RegisterMask tmp_set= setA_F_; > for (Register r = lsReg(tmp_set); tmp_set; r = nextLsReg(tmp_set, r)){ > if(!isFree(r)) CLR_REG_IN_MASK(r,setA_F_); > } I think you can do this all with bitwise operations. This is critical path, so its worth the effort. The logic for UnspecifiedReg that can fall through to the eviction code, is too confusing. This is the fusion reactor of the allocator, its got to be ultra clear and fast. Can you refactor the predicates to work on masks, so we can make a single test & branch to either path, like the old code was? The assert looks weird -- it seems to imply release builds can work with broken invariants, which shouldn't be the intention. I can't see how all the versions of findAvailableReg() are sure to return UnspecifiedReg on failure. in some cases they map down to msReg() which IIRC are undefined if passed a mask of 0s. nits: several places, bug for example Assembler.cpp:410 - local style is // C++ comments, capitalize, punctuate, try not to abbreviate. I noticed lots of other local style bustage. Since I'll probably be who lands this, I can do a style pass before landing.
Attachment #567800 -
Flags: review?(edwsmith) → review-
Updated•13 years ago
|
Attachment #567803 -
Flags: review?(edwsmith) → review+
Comment 7•13 years ago
|
||
Comment on attachment 568029 [details] [diff] [review] Regalloc - changes to the various NativeXXX files to work with the allocator NativeARM.cpp: > RegisterMask free = (~_allocator.activeMask()) & AllowableFlagRegs; a RegAlloc.freeMask() method would clean this up, I think i've seen the same expression in a couple of places. If you change it and its just a mechanical change, no need to re-review. > inline bool IsFpSReg(Register _r){ return _r == S0; } Please add a comment for IsFpReg, IsFpSReg, since the distinction is subtle. (IsGpReg() too, for symmetry) NativeX64.cpp: > case LIR_std2f This change looks different from the other mechanical changes... is it fixing a bug?
Attachment #568029 -
Flags: review?(edwsmith) → review+
Reporter | ||
Comment 8•13 years ago
|
||
On activeMask()/ RegAlloc.freeMask(): no, IMO the code should never get its hands on the "free" set to perform allocations (even of temporary registers). This is the only place where I made an exception, and I only made it due to laziness :P - the correct solution there is a "allocTempRegIfFree()" method. I'll do that (I'll make it platform-specific since it's simpler and only ARM appears to need it). Looking again, the loop that you mentioned above is also an artefact of the previous implementation (which was based on "clobbered registers" and is completely useless in this implementation - i.e. has no effect at all.) On the "firstAvailableReg" you are right that the assertion is unnecesary (it's simply redundant, as things are implemented today). BUT: the invariant is not violated in release code, the "if" path is for the "RA_REGISTERS_OVERLAP" case. It is clear that I need to explain the scheme better. Currently some loops appear (ARE) unnecessary because nobody uses it. Here's how it works for ARM: 0-15: GP regs 16-48: Sn regs (single precision floats) 48-64: D16-D31 regs (double precision floats) The registers 16-48 are also D0-15, and also Q0-Q7. The registers D16-D31 are also Q8-Q15 (note that this is exactly the hardware representation). What this means is that rmask(S0) = 0x10000, rmask(S1)=0x20000, and rmask(D0) = 0x30000 (and rmask(Q0)= 0x70000). On overlapping registers the mapping scheme where "rmask(reg)= exactly one bit" is no longer true (and also no longer true is that the relationship between a register and its mask is bidirectional: you can always get a mask from a register, but you can no longer tell if "0x30000" means "D0" or "S0 or S1". I'll add more comments in code, too.
Comment 9•13 years ago
|
||
with this scheme we cannot address D16-D31 or Q8-Q15, without a >64bit RegisterMask, correct? (okay with me, just confirming).
Reporter | ||
Comment 10•13 years ago
|
||
Oh yes we can, that's the beauty of it :) Sn regs are limited to S32, they don't run up to S64. i.e. D16-D31 don't have any sub-registers. We have: 16 GP regs 32 Sn regs 16 Dn regs (upper bank, D16-D31) All the others are just names given to collections of physical registers (e.g. D0 = S0,S1; Q0= S0,S1,S2,S3 ) Check out the masks in https://bugzilla.mozilla.org/show_bug.cgi?id=696730
Assignee | ||
Comment 11•13 years ago
|
||
Comment on attachment 567800 [details] [diff] [review] RegAlloc base changes I find this diff almost impossible to follow -- there needs to be an explanation of the overall scheme, as well as better comments in the code. Locally, there are a few snippets that give me pause, for example: RegisterMask set__F_ = _free; ... RegisterMask setA_F_ = setA___ & set__F_; ... RegisterMask tmp_set= setA_F_; for (Register r = lsReg(tmp_set); tmp_set; r = nextLsReg(tmp_set, r)){ if(!isFree(r)) CLR_REG_IN_MASK(r,setA_F_); } Here, every bit set in setA_A_F_ is also set in _free. But isFree() simply queries _free, so isFree(r) must always return true. There are large blocks of code that have both moved to another file entirely, and also had significant internal changes. I'd like to see the patch broken into two, in which the first does only the refactoring needed to migrate code from Assembler to RegAlloc, and the second introduces the substantive changes to support overlapping registers.
Attachment #567800 -
Flags: review?(wmaddox) → review-
Assignee | ||
Updated•13 years ago
|
Attachment #567803 -
Flags: review?(wmaddox) → review+
Assignee | ||
Comment 12•13 years ago
|
||
In general, it needs to be obvious that the register allocator is producing the same results as before (unless you have made specific improvements that are called out) when new register classes (e.g., float/float4) are not being used in the generated code.
Comment 13•13 years ago
|
||
(This item may well be completed.)
Assignee: nobody → wmaddox
Priority: -- → P2
Target Milestone: --- → Q2 12 - Cyril
Updated•13 years ago
|
Priority: P2 → P3
Comment 14•13 years ago
|
||
changeset: 6798:4f58915784a2 user: Virgil Palanciuc <virgilp@adobe.com> summary: imported patch bug-695389.regalloc.coreChanges.diff http://hg.mozilla.org/tamarin-redux/rev/4f58915784a2
Comment 15•13 years ago
|
||
changeset: 6799:49eeecaec4f1 user: Virgil Palanciuc <virgilp@adobe.com> summary: imported patch bug-695389.regalloc.natives.diff http://hg.mozilla.org/tamarin-redux/rev/49eeecaec4f1
Assignee | ||
Comment 16•12 years ago
|
||
Comment on attachment 568029 [details] [diff] [review] Regalloc - changes to the various NativeXXX files to work with the allocator Removing stale review request.
Attachment #568029 -
Flags: review?(wmaddox)
Updated•6 years ago
|
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → WONTFIX
You need to log in
before you can comment on or make changes to this bug.
Description
•