Old remnant of JM1 tracer integration and tracerecursion. Slightly complicates the call path so might as well do away with it.
Created attachment 466574 [details] [diff] [review] fix This should be a slight perf win, but will make call ICs much saner.
Jacob, this will break ARM - I can attempt a fix, but I don't have easy access to an ARM box right now (OOTO and on dialup for a week). Will post a separate patch tomorrow.
(In reply to comment #2) > Jacob, this will break ARM I think the only thing left is to remove the overridden "ret" in BaseAssembler.h: http://hg.mozilla.org/projects/jaegermonkey/file/73eb2d14f7ac/js/src/methodjit/BaseAssembler.h#l316 It will probably still break, but that's all that's obvious. The tracer stuff doesn't work yet anyway. I'll try to find time to test & update the patch later today.
Ok, I didn't get around to looking at this in detail and the rest of my day is booked up. There's a reference to scriptedReturn in a STATIC_ASSERT in MethodJIT.cpp which needs to go. Other than that, I just got a compile error because JaegerBomb tries to look up JSStackFrame::script, which is a private member. I didn't investigate further, but that didn't look ARM-specific so you'll probably hit the same problem on x86/x64.
OK, I'm going to need some help understanding this. Hopefully that mean by the end we can have really clear, specific comments on how this stuff works. First, to check my understanding, can you tell me if this is right: I. How calling JM-compiled scripts works, leaving aside tracing for now. While JIT code for JSScript |script| is running, this ABI holds: - |FrameReg| holds a pointer to |cx->fp| for this script activation. - I'm pretty sure the purpose of this is fast access to locals and other stack frame elements in JIT code. - |cx->fp->ncode| is the return address for the call to |script|. That is, if we want to return normally from |script|, jump to |cx->fp->ncode|. - I think the main reason for moving this is so that the C stack always looks "right", i.e., ready to call a stub with the VMFrame. - correctCalleeStack generates the prolog code that guarantees the above ABI. - correctReturnStack generates the epilog code to make sure a |ret| instruction returns to the place required by the above ABI. Q1. Did we ever document the reasons for the parts of the ABI? We should have that down in comments. Q2. Some of the names here don't help me understand too much. Do changes like this make sense: correctCalleeStack -> genScriptABIProlog correctReturnStack -> genScriptABIEpilog ncode -> scriptABIReturnAddr (Note that I don't think the names on the right are necessarily the best--just trying to work toward something more specifically descriptive.) II. Adding exception handling When an exception is thrown, we need to have a way to (a) do some unwinding and then (b) jump to the new place in code we should continue executing. The throwpoline handles all this without too much complexity for the user. Am I right: - To start exception handling, we return from a stub call to the throwpoline. - On entry to the throwpoline, the normal conditions of the jit-code ABI are satisfied. - To do the unwinding and find out where to continue executing, we call js_InternalThrow. - js_InternalThrow may return 0, which means the place to continue, if any, is above this JaegerShot activation, so we just return, in the same way the trampoline does. - Otherwise, js_InternalThrow returns a jit-code address to continue execution at. Because the jit-code ABI conditions are satisfied, we can just jump to that point. I'll post more questions in a bit about the hard part, which is presumably also the main point of this patch.
(In reply to comment #5) > While JIT code for JSScript |script| is running, this ABI holds: This is all right on. > Q1. Did we ever document the reasons for the parts of the ABI? I don't think so. I'll add it to the big comment at the top. Scripted ICs will change the ABI a little, but the most important thing is that VMFrame is pinned to a fixed offset from SP (for fast C++ interaction). > Q2. Some of the names here don't help me understand too much. Yeah, those are better. WebKit uses "restoreArgumentsReference" and "restoreReturnAddressBeforeReturn" FWIW. Maybe "saveReturnAddress" and "restoreReturnAddress" would work. > II. Adding exception handling This list of steps looks accurate - and descriptive enough that maybe it should go in a comment above Throwpoline :)
OK, part 2, on the hard stuff. III. Adding tracing The basic operation of RunTracer is not too complicated: 1. Call MonitorTracePoint, i.e., activate the trace monitor. If the trace monitor doesn't do anything, then we are at the same point so we just continue. 2. Otherwise, the trace monitor may have executed a bunch of stuff, either in record mode or by running a trace. The PC may now be almost anywhere. All we know for sure is that we haven't returned past the frame where we started tracing. As a minor detail, if recording/trace execution generated an exception, we may have to do some unwinding. 3. At this point, we have a PC, and we need to continue execution from there. This is done using magic. The magic part has to solve these key problems: A. The easy case is if the new PC is in the same frame we started from (|entryFrame|) and is at a safe point. If so, we mostly just need to figure out what JM code address we should start from, establish the ABI requirements, and then jump there. B. But the PC may not be at a safe point. This means the JM code for that PC expects certain values to be in registers. We can't restore those requirements directly (without adding a whole bunch of annotations that we don't want to add). If the PC is at an arbitrary point, we run the interpreter until we reach a safe point. C. The PC could also be at a RETURN. For reasons I don't understand, a RETURN apparently is not a safe point, but we do know pretty much what we have to do to set up the register state correctly and then return. So that's what we do. D. A further complication is that we might come back in a different frame from where we started. It would be nice if we could just find a safe point there and run, but we can't. The reason is that the stack frames created by the tracer do not have ncode (the native return address) set, so they would return to the interpreter, and we wouldn't get back to JM code. Also, that would crash if we did get back to a JM frame still running in the interpreter. So instead, we need to take control and run our way back to the frame we started with using the interpreter or JM as appropriate. Q3. So why exactly can't we just jump to a return like it's a safe point? Q4. Did I get it right about what we have to do to execute ourselves back to the entry frame? Problem B pretty much takes care of itself without anything too magical. Problem A also turns out to be fairly simple. The code map tells us what address to jump to for a given PC. We then return to the jit-code, thus re-establishing the ABI properties (just as a normal return from a stub call). The jit code just following the call jumps to the returned address and off we go. Problems C and D are harder, necessitating the magic. First, problem C. In this case, we need to return, but we can't just jump to the native code for the return. So instead, we want to execute a return "manually" to establish the ABI, and then jump to the code address we would return to. This happens as follows: 1. set entryFrame->ncode = fp->ncode 2. Return from this stub call to InjectJaegerReturn. This establishes the ABI coming out of a script return and then does a ret. That ABI is, for x86: edx,ecx: data,type for the JS value returned by the script eax: fp->ncode. not sure what this is for -- is this something to do with what the new fp->ncode should be? ebx: fp for the frame we are returning to Some other things I don't understand here: Q5. What is |entryFrame->ncode = f.fp->ncode| for? Q6. What is the address the |ret| in InjectJaegerReturn returns to? I expected to see a push that would put it in place, but clearly I am missing something. As you can probably tell, the return stuff confuses me a lot so I can't comment on it intelligently yet. Problem D basically boils down to starting execution at a safe point (I will call that code address |safePoint| in the middle of a method where we don't necessarily have the right fp->ncode (return address from the frame containing that safe point). This is done with a trampoline, JaegerFromTracer, that sets up fp->ncode and jumps to safePoint. Apparently, we need to set up a new VMFrame to make that call, so we can't just jump to JaegerFromTracer directly. Instead, we use the usual JaegerTrampoline for that, and pass the JaegerTrampoline a code address of JaegerFromTracer, so it starts there. And JaegerFromTracer gets its code address from the arguments area of JaegerTrampoline. I think I understand this part reasonably well. Some name/docs changes that might make this a bit easier: - How about something like "SafePointTrampoline" instead of JaegerFromTracer. The "input ABI" to SafePointTrampoline is the same as for a script, except that the "hijack" parameter must be set to the safe point. - Instead of "hijack", how about naming that "safePoint". - It strikes me that the effect of JaegerTrampoline+JaegerFromTracer could be achieved by just a variation of JaegerTrampoline that just has slightly different setup code. Am I correct in guessing that the reason for the JaegerTrampoline+JaegerFromTracer design is to avoid code duplication of the main trampoline? That seems fine, but it also makes me think we could do it more directly if we were generating the trampolines dynamically (which I think we still want to do someday, but not now). - JaegerShot/JaegerBomb seem to have a fair amount of duplicate code. Can they reasonably be factored to a common core? I think 2 entry points is great, but the interior can maybe be shared.
> Q3. So why exactly can't we just jump to a return like it's a safe point? There is no safe point, since we don't want to sync any tracked values. We could create one near the bottom, but it would be tricky if later we want to remove rval or bypass rval when returning. There's also a small advantage in that it's common for the tracer to end on a return, and injecting a return is somewhat cheaper than JaegerTrampoline. I think it's possible to do it though, that's what JM1 did. > Q4. Did I get it right about what we have to do to execute ourselves back to the entry frame? Yes. > Q5. What is |entryFrame->ncode = f.fp->ncode| for? This is part of the old code and should be removed in the patch. ncode used to be "off" by one frame, so when returning you had to propagate it down. > Q6. What is the address the |ret| in InjectJaegerReturn returns to? I expected > to see a push that would put it in place, but clearly I am missing something. You're right - I forgot a push in the MSVC version, good catch. Will fix in next patch. > How about something like "SafePointTrampoline" > Instead of "hijack", how about naming that "safePoint". Sounds good. > Am I correct in guessing that the reason for the JaegerTrampoline+JaegerFromTracer > design is to avoid code duplication of the main trampoline? Yeah, maintaining four of these is enough. > it also makes me think we could do it more directly if we were generating the > trampolines dynamically Definitely - this is hack so we can put off dynamic generation a little longer. > JaegerShot/JaegerBomb seem to have a fair amount of duplicate code. Yeah. The details are pretty different but I'll give it a shot.
Created attachment 467076 [details] [diff] [review] rebased, renamed, fixes Also includes some ARM changes, but there's one TODO left in ARM tracer support.
Attachment #467076 - Flags: review?(dmandelin) → review+
Created attachment 467414 [details] [diff] [review] Tweaks for ARM. Some tweaks to get it working on ARM. (The patch sits on top of yours; I didn't merge them.) I took the liberty of cleaning a few things up and patching a few comments (for example, where you talk about JaegerBomb or refer to scriptedReturn). Thanks for the excellent comment block in MethodJIT.cpp! Very helpful indeed. ---- ARM tracer integration is still incomplete but I'll post a separate patch to bug 588022 when it's done.
Created attachment 467434 [details] [diff] [review] Tweaks for ARM (v2). Fixes a bug in my previous patch: SafePointTrampoline didn't set up JSFrameReg (r11). The x86 implementation doesn't have to, because it's already in the proper register (ebx).
Attachment #467414 - Attachment is obsolete: true
Thanks, Jacob! Pushed both patches: http://hg.mozilla.org/projects/jaegermonkey/rev/04bc789f7a43 http://hg.mozilla.org/projects/jaegermonkey/rev/b88bab8e77c5
Status: ASSIGNED → RESOLVED
Last Resolved: 8 years ago
Resolution: --- → FIXED
I had to back both of these out due to debug trace-test failures: http://hg.mozilla.org/projects/jaegermonkey/rev/8acd89c26786 http://hg.mozilla.org/projects/jaegermonkey/rev/56d0d08bd835
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
The test failures seem to appear only on Windows. But I might be mistaken. There's a lot of Tinderbox confusion today. :-)
(In reply to comment #14) > The test failures seem to appear only on Windows. SafePointTrampoline doesn't look equivalent between _MSC_VER and __GNUC__. The "mov" appears to have its arguments the wrong way around.
Status: REOPENED → ASSIGNED
Created attachment 467830 [details] [diff] [review] Patch, rebased with SafePointTrampoline fix (In reply to comment #15) > (In reply to comment #14) > > The test failures seem to appear only on Windows. > > SafePointTrampoline doesn't look equivalent between _MSC_VER and __GNUC__. The > "mov" appears to have its arguments the wrong way around. Thanks! I think that's it--it passes trace-tests with that change. I will reland both patches once this morning's merge cycles. If anyone has an opinion on whether I should land the two separately again or combine them, let me know. I'll do separate by default.
Attachment #467076 - Attachment is obsolete: true
Either one is fine. Thanks for catching this Jacob, and for buddying this back in, Dave!
8 years ago
Status: ASSIGNED → RESOLVED
Last Resolved: 8 years ago → 8 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.