Closed Bug 516347 Opened 16 years ago Closed 15 years ago

nanojit: avoid premature regalloc state updates in the i386 backend

Categories

(Core :: JavaScript Engine, defect)

x86
Linux
defect
Not set
normal

Tracking

()

RESOLVED FIXED

People

(Reporter: n.nethercote, Assigned: n.nethercote)

References

Details

(Whiteboard: fixed-in-nanojit, fixed-in-tracemonkey, fixed-in-tamarin)

Attachments

(1 file, 1 obsolete file)

Attached patch patch (obsolete) — Splinter Review
This patch is a first step towards fixing bug 513615, and lays the groundwork for some future bugs. It's deliberately very small, so the general approach can be evaluated carefully. - prepResultReg() and freeRsrcOf() are two prematurely-updating functions that need to be removed eventually. For now, I've added prepareResultReg() and freeResourcesOf() which are safer versions. (I'm not particularly attached to the new names, but I don't like freeRsrcOf() because it's cryptic and I find it hard to type.) - asm_output2 has been added, which is the same as asm_output but takes an additional parameter which is 'PPost' if the current register state is the post-state, 'PPre' if it's the pre-state, and 'PNone' if it's neither (useful for the middle of sequences of generated instructions, when we don't bother to keep the state consistent). asm_output(...) is now equivalent to asm_output2(PPre, ...). Eventually all occurrences will print the post-state and all asm_output2 occurrences can be replaced with asm_output occurrences. The output now indicates which state is being printed, eg: rp_offset = int 8 mov eax,8 post: eax(rp_offset) ecx(ld1527) esi(cx) edi(sp) ld1526 = ld add156[0] mov ecx,0(ecx+eax*1) pre: eax(rp_offset) ecx(ld1527) esi(cx) edi(sp) and (an example involving PNone): not46 = not ld35 mov eax,edx not eax post: eax(not46) ecx(ld36) edx(ld35) ebx(an d724) esi(ld33) edi(sp) xmm0(js_UnboxDouble65) - asm_int() and asm_neg_not() have been modified to use the new state-update functions. This required adding 'pp' parameters to several of the codegen macros; these are passed on to asm_output2. The plan after this patch is to continue modifying parts of the i386 back-end similarly to the way that asm_int() and asm_neg_not() were modified. At some point, Assembler.cpp will need to be modified and then parts of all the back-ends will have to be modified in sync as the generic/arch-specific interface will change.
Attachment #400448 - Flags: review?(rreitmai)
Attachment #400448 - Flags: review?(edwsmith)
Blocks: 513615
Comment on attachment 400448 [details] [diff] [review] patch This one needs some more work before it's ready to be reviewed.
Attachment #400448 - Flags: review?(rreitmai)
Attachment #400448 - Flags: review?(edwsmith)
Depends on: 516909
Renaming this to just be about the i386 backend.
Summary: nanojit: start avoiding premature register allocation state updates → nanojit: avoid premature regalloc state updates in the i386 backend
Depends on: 531347
Attached patch patch v2Splinter Review
This patch overhauls Assembler.cpp and most of Nativei386.cpp so that regstate updates are handled in a more obvious and less error-prone manner. The starting point is the introduction of prepareResultReg() and freeResourcesOf(), which are similar to the old prepResultReg() and freeRsrcOf(), except that the new ones can be used in a prepareResultReg(ins, allow); GENERATE_CODE(); freeResourcesOf(ins); instead of: prepResultReg(ins, allow); GENERATE_CODE(); where prepResultReg() called freeRsrcOf(ins) and so you had to be very careful not to accidentally reuse the register holding 'ins' when generating code. asm_int() and asm_neg_not() are simple examples that are worth looking at first to get a sense of the change. The patch doesn't fully convert the i386 backend to the new scheme. I only changed those functions where it was easy for the new version to generate identical code to the old version, and easy for me to test in TM. I did this so I could test it all carefully, by diffing the old and new code generated for all of SunSpider. I'll do the remainder in a follow-up bug. This should help avoid regressions, and make any introduced regressions easier to find. The patch also adds a number of comments explaining non-obvious aspects of the code generation, and does some minor formatting and naming clean-ups (e.g. using 'ins' consistently for LIns variables, instead of 'i' or 'l' or 'p'). And it adds a few more assertions.
Attachment #400448 - Attachment is obsolete: true
Attachment #417434 - Flags: review?(edwsmith)
Attachment #417434 - Flags: review?(dvander)
Comment on attachment 417434 [details] [diff] [review] patch v2 > r = nRegisterAllocFromSet(set); > _allocator.addActive(r, ins); > ins->setReg(r); >+ > } else { That seems unnecessary. > void Assembler::findRegFor2(RegisterMask allow, LIns* ia, Register& ra, LIns* ib, Register& rb) > { > if (ia == ib) { > ra = rb = findRegFor(ia, allow); >+ > } else { That too. >+ >+ // 'ins' is in 'allow', in register r (different to the old r); >+ // s is the old r. > if ((rmask(s) & GpRegs) && (rmask(r) & GpRegs)) { > #ifdef NANOJIT_ARM >- MOV(s, r); // ie. move 'ins' from its pre-state reg to its post-state reg >+ MOV(s, r); // move 'ins' from its pre-state reg (r) to its post-state reg (s) > #else > MR(s, r); > #endif Since you are in Rome, maybe you want to fix this and put a define into the arm backend.
> > #ifdef NANOJIT_ARM > >- MOV(s, r); // ie. move 'ins' from its pre-state reg to its post-state reg > >+ MOV(s, r); // move 'ins' from its pre-state reg (r) to its post-state reg (s) > > #else > > MR(s, r); > > #endif > > Since you are in Rome, maybe you want to fix this and put a define into the arm > backend. Actually, it's already there: http://hg.mozilla.org/tracemonkey/file/21d3b1b65758/js/src/nanojit/NativeARM.h#l630
Attachment #417434 - Flags: review?(dvander) → review+
(In reply to comment #4) > (From update of attachment 417434 [details] [diff] [review]) > > r = nRegisterAllocFromSet(set); > > _allocator.addActive(r, ins); > > ins->setReg(r); > >+ > > } else { > > That seems unnecessary. You mean the blank line? I can remove it, I initially added a lot more comments about the register state but then I removed most of them, I think this was a leftover related to that. > > #ifdef NANOJIT_ARM > >- MOV(s, r); // ie. move 'ins' from its pre-state reg to its post-state reg > >+ MOV(s, r); // move 'ins' from its pre-state reg (r) to its post-state reg (s) > > #else > > MR(s, r); > > #endif > > Since you are in Rome, maybe you want to fix this and put a define into the arm > backend. Sure, I'll do that too.
Okay, I read all this code carefully and I'm convinced it's good. I ran it on tamarin, though, and this assertion fires: Assertion failed: !ins->getArIndex() (/Users/edwsmith/vm/njn/nanojit/Nativei386.cpp:438) it's in asm_restore for the LIR_param case where we attempt to load stack args directly from the stack instead of spilling them: > int d = (arg - abi_regcount) * sizeof(intptr_t) + 8; > LD(r, d, FP); > NanoAssert(!ins->getArIndex()); // rematerializable; shouldn't be in a spill slot > ins->markAsClear(); I can help diagnose this if you like. It is possible its due to me applying the patch to the wrong point (I just sync'd tamarin-redux backwards until the patch applied cleanly). Anyway, I'm marking this R+ since you know how to run the regressions and use the sandbox. fwiw, here's the details on what I tested: macos 10.5.8 configure.py --enable-shell --enable-debug --enable-debugger runtests.py -E (that binary) --vmargs="-Ojit -Dnodebugger"
Attachment #417434 - Flags: review?(edwsmith) → review+
I thought that rematerializable things like 32-bit constants could end up in spill slots. Turns out they can due to the findMemFor() call in handleLoopCarriedExprs(), but this doesn't manifest in TM. So I just reverted those assertions back to tests, as was in the original code. After that, TR acceptance tests all passed fine. Pushed to NJ-central: http://hg.mozilla.org/projects/nanojit-central/rev/e7b2c0d805eb
Whiteboard: fixed-in-nanojit
Whiteboard: fixed-in-nanojit → fixed-in-nanojit, fixed-in-tracemonkey
Whiteboard: fixed-in-nanojit, fixed-in-tracemonkey → fixed-in-nanojit, fixed-in-tracemonkey, fixed-in-tamarin
Merge to TM the Sparc bustage fix: http://hg.mozilla.org/tracemonkey/rev/c6fbac9adb8b
Status: ASSIGNED → RESOLVED
Closed: 15 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: