I'm splitting this into two workloads. Part 1: Move recorder instantiation from the interpreter to the method JIT. The method compiler will identify back edges. Each back edge will get an index into a vector attached to the script. Each entry in the vector will store: * hit count * recording attempts * trace trees for that bytecode location Each back edge will emit code to check the hit count/recording attempts, and if hot, invoke a stub call (let's say jsl_InvokeTracer) to record a trace. jsl_InvokeTracer will instantiate a recorder, then directly call js_Interpret with JSFRAME_RECORDING. The interpreter will never record a trace itself. Rather when this frame flag is present, it will automatically record from the get-go. When recording stops, it will immediately return to its caller (without passing through the usual exit/inline_return labels). The stub call will detect whether the recording was successful, and proceed to either invoke the trace, or blacklist, and fix up left-over inline frames so we can proceed to run method JIT'd code. To make this easier, we will only record into function calls if we can method JIT them. Blacklisting should be straightforward. We can afford a rather cheap check in the emitted code, and when we want to permanently blacklist, patch it away. Part 2: Invoke traces from compiled methods. Similar to how we'll never record traces from the interpreter, we'll also never execute traces from inside the interpreter. The ramification of this is that the tracing JIT will require the method JIT to be of any use, and that means the method JIT will need platform parity with the tracing JIT. But we want this anyway. We already have parity for our three major architectures, and someday we'd like to not have the interpreter. All of this means the fragment+blacklist tables will go away.
9 years ago
Assignee: general → dvander
Status: NEW → ASSIGNED
I should clarify: "never record traces from the interpreter." We will ALWAYS record traces from the interpreter (though someday we may JIT code calls into the recorder). The interpreter will never actually _start_ the recorder though. Only the method JIT will cause recording to start.
I'm going to conservatively estimate another week on this. It's turning out to be a large overhaul of how we manage fragments.
The tracer overhaul side of this is working. I'm having problems when the tracer or recorder bails out inside a method call. The plan was to only trace functions which could be method compiled, and then fix up return addresses on the way out. But imacros are trouble. A few options: 1) Method compile imacros... somehow. This doesn't seem easy or useful. Otherwise, we'll need to flip back and forth between the interpreter and method JIT more liberally. 2) Banish any callstack with imacros to the interpreter. 3) One-by-one, either JaegerShot() or js_Interpret() frames until the extra frames are gone. Don't do any return address fixing. 4) Do return address fixing, JaegerShot() slices of the callstack that don't have imacros. For those, js_Interpret them. Another problem is that when the method JIT or JaegerShot() invokes a function, it expects a script header, which will fix up the VMFrame stack alignment. So this will need to be reworked, but we'd need to for generators anyway.
(In reply to comment #3) > The tracer overhaul side of this is working. > > I'm having problems when the tracer or recorder bails out inside a method call. > The plan was to only trace functions which could be method compiled, and then > fix up return addresses on the way out. But imacros are trouble. A few options: > > 1) Method compile imacros... somehow. This doesn't seem easy or useful. > > Otherwise, we'll need to flip back and forth between the interpreter and method > JIT more liberally. > > 2) Banish any callstack with imacros to the interpreter. > 3) One-by-one, either JaegerShot() or js_Interpret() frames until the extra > frames are gone. Don't do any return address fixing. > 4) Do return address fixing, JaegerShot() slices of the callstack that don't > have imacros. For those, js_Interpret them. Option 4 seems most favorable to me, but I think there's also value to doing whatever is easiest; I'm not sure what that would be. Another, more exotic option, is to do a little "fixup" phase to get out of the imacro: either (a) run it forward to completion in the interpreter, then jump back to JM code, or, (b) in cases where it is possible and makes sense, play it backward to get back to the beginning of the op, then jump to JM from there. Maybe (b) doesn't apply in reality. I peeked at imacros.jsasm and it's not clear we ever call something that can bail without first doing an irreversible op sequence.
Imacros often do implicit conversion calls, which can have arbitrary effects -- hard to rewind. They're straight line code, though (unless I missed something), so going forward could work. /be
(In reply to comment #4) For exotic option "a" - is this different from option 4? To clarify, it's the same as 3 except it gives more inline calls to JaegerShot() which is a little harder but more efficient. For frames with an imacpc, the interpreter would run the imacro and then immediately return. JSOP_CALL/EVAL/APPLY would re-enter JM, in case some toString() or something was really expensive.
Preliminary change: http://hg.mozilla.org/users/danderson_mozilla.com/jaegermonkey/rev/d5969a87a8d9 Instead of "call; pop retaddr; store retaddr" the sequence is now: "pop retaddr; store retaddr; call". This is so we can enter JIT'd code at any point, which is necessary for bailing out of traces.
Not worrying about this now, but: If we _do_ decide to eventually method JIT imacros (or get rid of them), we can change snapshot()/LeaveTree() to restore directly to method JIT state.
(In reply to comment #8) > imacros (or get rid of them) What would you do instead? File a new bug or sketch in a wiki page; I'll help. It's hard to get rid of imacros without making the implicit conversion calls explicit, which is too expensive in old SM bytecode to do all the time (hence recording time only imacros). But maybe JM can specialize its code when recording to test for the implicit conversion hooks and call them if they are there, and record what it is doing. /be
(In reply to comment #9) I think it is possible to replace imacros, without making implicit conversion calls explicit, by using inline function calls instead (with some tweaks, of course). But that is a lot of work we probably don't want to do right now. For the present, I don't see the problem with, when we leave trace, finishing any currently-executing imacros in the interpreter before returning to JM-JITed code. It seems like it would be a rare case?
It's a non-issue until otherwise proven -- but I am interested in ways around the problem that resulted in imacros as solution: implicit calls that the recorder needs to PIC-test and conditionally invoke, for fat ops the interpreter wants for best fat-op performance. /be
Ok, so this is passing all tests now. I need to make another review pass over the code but I think it can land today or tomorrow. (In reply to comment #11) > It's a non-issue until otherwise proven Yep, that's why I'm not worrying about it yet. If it turns out that someday, giving imacros full frames is too slow, we could always just JIT them and when bailing out convert the imacpc frames to proper, full frames. It is a very rare case, but the logic to handle it is extremely gross.
Imacros are only for the recorder, and the recorder has to see the implicit call from the right callsite, so something "extra" (imacpc, a hidden frame, etc.) is needed. A hidden frame could work. /be
OS: Mac OS X → All
Hardware: x86 → All
JSOP_TRACE is now JSOP_TRACEPOINT. It has a UINT16 immediate that is an index into a trace tree vector for that script. http://hg.mozilla.org/users/danderson_mozilla.com/jaegermonkey/rev/5786384bb90d
Tracing and method JIT are now integrated. Fragment table is removed. http://hg.mozilla.org/users/danderson_mozilla.com/jaegermonkey/rev/acfaa77d7123 There are a few big TODOs so I'm leaving this bug open: * Not all tracepoints have method JIT hooks yet, it's just JSOP_IFNE/LT. * Perf is bad on a few benchmarks without recursion, so I will add that back in. * I haven't implemented an extremely rare error path in jsl_InvokeTracer - right now it will just assert. Trying to test-case it to get an idea of what to do.
Integrated tracing recursion: http://hg.mozilla.org/users/danderson_mozilla.com/jaegermonkey/rev/c9dc50df5060 The approach here is a little different from tracemonkey. The interpreter used to have a MONITOR_BRANCH hook on LeaveTree, to record "up"-recursive trees. This seems pointless because if a tree is only up-recursive, it is never executed (a heuristic added later to stop earley-boyer from exploding). Now, only inline calls can trigger a recursive trace, and encountering a "return" will record an up-recursive tree.
nemo: What platform was this on? I just tried linux-x64 but couldn't reproduce it.
Whoops - re-tested, hit one crash, fixed: http://hg.mozilla.org/users/danderson_mozilla.com/jaegermonkey/rev/91d2b3ae9ede
This bug was for the original JM compiler. Work will continue in bug 578727 since the approach will be significantly different this time.
Status: ASSIGNED → RESOLVED
Last Resolved: 9 years ago
Resolution: --- → INCOMPLETE
9 years ago
No longer blocks: 536277
I don't think INCOMPLETE is the correct resolution as per https://bugzilla.mozilla.org/page.cgi?id=fields.html#status Maybe rename it to cover what this bug was actually for and then resolve it as something else.
Resolution: INCOMPLETE → DUPLICATE
Duplicate of bug: 578727
You need to log in before you can comment on or make changes to this bug.