Closed Bug 573912 Opened 14 years ago Closed 14 years ago

TM: avoid two stack stores at the end of every on-trace loop

Categories

(Core :: JavaScript Engine, defect)

x86
macOS
defect
Not set
normal

Tracking

()

RESOLVED WONTFIX

People

(Reporter: n.nethercote, Assigned: n.nethercote)

Details

Here's the LIR for bitwise-and.js: ebx = parami 0 ebx esi = parami 1 esi edi = parami 2 edi state = parami 0 ecx label1: sp = ldi.o state[8] cx = ldi.o state[0] eos = ldi.o state[12] ldi3 = ldi.csro cx[0] immi3 = immi 0 eqi1 = eqi ldi3, immi3/*0*/ xf2: xf eqi1 -> pc=0x90527a6 imacpc=(nil) sp+0 rp+0 (GuardID=001) ldi2 = ldi.o eos[1376] ldi1 = ldi.o eos[1368] andi1 = andi ldi2, ldi1 sti.o eos[1376] = andi1 immi2 = immi 1 addxovi1 = addxovi ldi1, immi2/*1*/ -> pc=0x90527b5 imacpc=(nil) sp+0 rp+0 (GuardID=002) sti.o eos[1368] = addxovi1 * sti.s sp[0] = addxovi1 * immi1 = immi 600000 * sti.s sp[8] = immi1/*600000*/ * lti1 = lti addxovi1, immi1/*600000*/ * xf1: xf lti1 -> pc=0x90527c0 imacpc=(nil) sp+16 rp+0 (GuardID=003) * j -> label1 * livei state * x1: x -> pc=0x90527a6 imacpc=(nil) sp+0 rp+0 (GuardID=004) (Note that 'x1' is dead.) I'm interested in the last few instructions, those marked with '*'. Here's an equivalent formulation: sti.o eos[1368] = addxovi1 # sti.s sp[0] = addxovi1 immi1 = immi 600000 # sti.s sp[8] = immi1/*600000*/ lti1 = lti addxovi1, immi1/*600000*/ jt1: jt lti1 -> label1 livei state x1: x -> pc=0x90527a6 imacpc=(nil) sp+0 rp+0 (GuardID=004) It's better for two reasons. First, instead of the common case being "branch conditional (not taken); branch unconditional", it's just "branch conditional (taken)", ie. one fewer instruction for the branching. Second, and more importantly, we have a single guard with "sp+0" instead of two guards, one of which has "sp+16". This means that the stores to sp[0] and sp[8], marked with '#', can be eliminated by the StackFilter. This pattern occurs for just about all loop conditions that involve integer </>/<=/>=, ie. a hell of a lot of them. (Integer ==/!= and FP comparisons look a little different, for reasons I haven't yet fathomed.) Unfortunately, I have no idea how to change jstracer.cpp to use the new formulation. The loop ending code is generated by fuseIf(), emitIf() and closeLoop() and it's all rather complicated. Any suggestions, or confirmation that my analysis above is correct, would be welcome.
Hmm, now I'm not sure if my new formulation is correct. The result of the comparison isn't put on the stack. TraceRecorder::relational() has this comment, which seems relevant but which I don't fully understand: /* * There is no need to write out the result of this comparison if the trace * ends on this operation. */ if (pc[1] == JSOP_IFNE || pc[1] == JSOP_IFEQ) CHECK_STATUS_A(checkTraceEnd(pc + 1));
Yeah, at an IFNE if the op fails we leave the result on the stack. The interpreter will resume and take the other side of the branch. Maybe we can be smarter and adjust the exit pc to follow the other branch, and avoid the store?
With the current formulation, we ensure that the two operands of the comparison are on the stack at exit, but the result of the comparison is not on the stack. So the two operands must not be used after the trace ends?
We did this because its expensive to twiddle the result of the boolean comparison onto the stack. Its cheaper to just leave the operands there. And yes, the boolean comparison consumes the two operands.
Status: ASSIGNED → RESOLVED
Closed: 14 years ago
Resolution: --- → WONTFIX
You need to log in before you can comment on or make changes to this bug.