Closed Bug 506182 Opened 12 years ago Closed 10 years ago
nanojit based inline-threading
User-Agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:126.96.36.199) Gecko/2009060309 Ubuntu/9.04 (jaunty) Firefox/3.0.11
Build Identifier:

I have a work in progress patch against tracemonkey that implements inline threading largely based on http://www.cs.toronto.edu/syslab/pubs/demkea_context.ps and David Mandelin's proof of concept inline-threading patch (Bug 442379). It currently does not integrate with tracing and does not work in the presence of switch statements or generators. Exceptions being thrown cause it to bail back to the main interpreter loop.

It currently yields a ~5% performance win on OS X on both SunSpider and the v8 benchmark suite when compared against the interpreter. The win is ~10% on Windows.

Reproducible: Always
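For readers unfamiliar with the idea, here is a rough sketch of the difference between a decode-and-dispatch loop and call threading. It is not taken from the patch: the opcode set, the `VM` struct, and every function name are hypothetical, and where the patch emits native calls through nanojit, this sketch only precomputes a list of function pointers so that the per-opcode decode happens once instead of on every execution.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical toy bytecode; not SpiderMonkey's opcode set.
enum Op : uint8_t { OP_PUSH1, OP_ADD, OP_DOUBLE, OP_HALT };

struct VM {
    std::vector<int> stack;
};

using Handler = void (*)(VM&);

void opPush1(VM& vm) { vm.stack.push_back(1); }

void opAdd(VM& vm) {
    assert(vm.stack.size() >= 2);
    int rhs = vm.stack.back();
    vm.stack.pop_back();
    vm.stack.back() += rhs;
}

void opDouble(VM& vm) {
    assert(!vm.stack.empty());
    vm.stack.back() *= 2;
}

int top(const VM& vm) {
    assert(!vm.stack.empty());
    return vm.stack.back();
}

// Baseline: decode each opcode every time it is executed.
int runSwitch(const std::vector<Op>& code) {
    VM vm;
    for (Op op : code) {
        switch (op) {
          case OP_PUSH1:  opPush1(vm);  break;
          case OP_ADD:    opAdd(vm);    break;
          case OP_DOUBLE: opDouble(vm); break;
          case OP_HALT:   return top(vm);
        }
    }
    return top(vm);
}

// "Compile" step: decode once, producing a straight-line list of calls.
std::vector<Handler> compileThreaded(const std::vector<Op>& code) {
    std::vector<Handler> calls;
    for (Op op : code) {
        switch (op) {
          case OP_PUSH1:  calls.push_back(opPush1);  break;
          case OP_ADD:    calls.push_back(opAdd);    break;
          case OP_DOUBLE: calls.push_back(opDouble); break;
          case OP_HALT:   return calls;
        }
    }
    return calls;
}

// Execution no longer looks at the bytecode at all.
int runThreaded(const std::vector<Handler>& calls) {
    VM vm;
    for (Handler h : calls)
        h(vm);
    return top(vm);
}
```

Running the same program through `runSwitch` and through `compileThreaded` followed by `runThreaded` should give the same result; the threaded form simply pays the decode cost once up front.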
Since the code for handling individual opcodes needs to exist twice (once for the main interpreter loop, once for functions to be called by inline-threaded code), I move the definitions into another file which is then included from jsinterp.cpp. This patch does that movement. I have it as a separate patch so that the actual changes I have made to the opcode functions are not lost in the noise of the file move.
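The "define once, expand twice" arrangement can be sketched roughly as follows. This is only an illustration under assumptions: the patch keeps the opcode bodies in a separate file that jsinterp.cpp includes, whereas this sketch uses a macro list so that it fits in one translation unit, and the opcodes and all names (`TOY_OPCODES`, `runInterp`, `runCalls`) are made up.

```cpp
#include <cassert>
#include <vector>

// Stand-in for the shared opcode file: each entry is OP(name, body).
// Hypothetical opcodes operating on a single accumulator.
#define TOY_OPCODES(OP)            \
    OP(INC, { acc += 1; })         \
    OP(DBL, { acc *= 2; })         \
    OP(NEG, { acc = -acc; })

// Expansion 1: the opcode enum.
enum ToyOp {
#define OP(name, body) TOY_##name,
    TOY_OPCODES(OP)
#undef OP
};

// Expansion 2: the bodies become cases of the main interpreter loop.
int runInterp(const std::vector<ToyOp>& code, int acc) {
    for (ToyOp op : code) {
        switch (op) {
#define OP(name, body) case TOY_##name: body break;
            TOY_OPCODES(OP)
#undef OP
        }
    }
    return acc;
}

// Expansion 3: the same bodies become standalone functions that
// generated code could call.
#define OP(name, body) void op_##name(int& acc) body
TOY_OPCODES(OP)
#undef OP

using OpFn = void (*)(int&);

const OpFn kOpTable[] = {
#define OP(name, body) op_##name,
    TOY_OPCODES(OP)
#undef OP
};

int runCalls(const std::vector<ToyOp>& code, int acc) {
    for (ToyOp op : code)
        kOpTable[op](acc);
    return acc;
}
```

Because both expansions come from the same list, a change to an opcode body is picked up by the interpreter loop and by the callable functions at the same time.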
To be applied on top of the patch to move opcode case definitions. This patch has call threading, inlining of small opcodes, specialization, inlining of integer fast paths for arithmetic, bit operations, comparisons, and increments, and PC update elimination (where possible). The interesting bits are jsithread.cpp and the new macro definitions in jsinterp.cpp.
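As an illustration of what an inlined integer fast path for arithmetic might look like, here is a minimal sketch. It assumes a simplified tagged value (`Value`) and a hypothetical out-of-line `addSlow` helper; it does not reflect the actual jsval representation or the LIR that the patch emits.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical simplified value: either a 32-bit int or a double.
struct Value {
    bool isInt;
    int32_t i;
    double d;
};

Value makeInt(int32_t v) { return Value{true, v, 0.0}; }
Value makeDouble(double v) { return Value{false, 0, v}; }
double toNumber(const Value& v) { return v.isInt ? double(v.i) : v.d; }

// Counts how often the generic path is taken (for demonstration only).
int slowPathCalls = 0;

// Generic path: stands in for the full opcode function.
Value addSlow(const Value& a, const Value& b) {
    ++slowPathCalls;
    return makeDouble(toNumber(a) + toNumber(b));
}

// Fast path: both operands are ints and the sum fits in 32 bits.
// Anything else falls back to the generic path.
Value add(const Value& a, const Value& b) {
    if (a.isInt && b.isInt) {
        int64_t wide = int64_t(a.i) + int64_t(b.i);
        if (wide >= INT32_MIN && wide <= INT32_MAX)
            return makeInt(int32_t(wide));
    }
    return addSlow(a, b);
}
```

The point of inlining only this short check-and-add sequence is that the common case avoids a call entirely, while overflow and non-integer operands still go through the full implementation.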
(In reply to comment #0)
> It currently yields a ~5% performance win on OS X on both SunSpider and the v8
> benchmark suite when compared against the interpreter. The win is ~10% on
> Windows.

This is quite an undersell, at least of the potential. The patch yields a 3x speedup on 'friendly' microbenchmarks. We don't know why the SS speedup is much more modest, but it probably relates to the difficulty of inlining complex ops like setprop.
Status: UNCONFIRMED → NEW
Ever confirmed: true
This is a cleaned up version of the previous patch.
Attachment #390403 - Attachment is obsolete: true
This patch generates code at loop headers to check the operationCallbackFlag and abort script execution if necessary.
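The shape of that check, written as ordinary C++ rather than generated code, might look like the sketch below. Only the name `operationCallbackFlag` comes from this bug; the `ToyContext` struct, the `RunResult` enum, and the `requestAbortAt` parameter (which stands in for some other thread setting the flag) are assumptions made for the example.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical context holding the flag another thread may set.
struct ToyContext {
    std::atomic<bool> operationCallbackFlag{false};
};

enum class RunResult { Finished, Aborted };

// Runs a loop body up to `iterations` times, testing the flag at every
// loop header. `requestAbortAt` simulates an external abort request that
// arrives while the given iteration is running.
RunResult runLoop(ToyContext& cx, uint32_t iterations,
                  uint32_t requestAbortAt, uint32_t& executed) {
    executed = 0;
    for (uint32_t n = 0; n < iterations; ++n) {
        // Loop header: the only place the flag is polled.
        if (cx.operationCallbackFlag.load())
            return RunResult::Aborted;

        // Loop body.
        ++executed;
        if (executed == requestAbortAt)
            cx.operationCallbackFlag.store(true);
    }
    return RunResult::Finished;
}
```

Polling only at loop headers keeps straight-line code free of checks while still bounding how long a script can run after an abort has been requested.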
Attachment #390536 - Attachment is obsolete: true
Updated jsops move patch.
Attachment #390402 - Attachment is obsolete: true
This patch fixes a handful of bugs. Most importantly, pc update elimination is now turned off for most opcodes, after static analysis and testing revealed that nearly every important opcode can require the pc (mostly for error reporting).
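A toy model of the decision involved: a pc store is emitted only in front of opcodes that might read the pc. The per-opcode flag `mayReadPc`, the opcode names, and both functions are hypothetical and only meant to show why the optimization stops paying off once nearly every opcode is flagged.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical per-opcode metadata. The flag models "can this opcode
// observe the pc?", for example because it may report an error.
struct ToyOpInfo {
    const char* name;
    bool mayReadPc;
};

// One entry per emitted opcode: its pc and whether a pc store precedes it.
struct EmittedStep {
    std::size_t pc;
    bool storesPc;
};

// Decide, opcode by opcode, whether the pc store can be dropped.
std::vector<EmittedStep> planPcStores(const std::vector<ToyOpInfo>& script) {
    std::vector<EmittedStep> steps;
    steps.reserve(script.size());
    for (std::size_t pc = 0; pc < script.size(); ++pc)
        steps.push_back(EmittedStep{pc, script[pc].mayReadPc});
    return steps;
}

// Number of pc stores that remain after elimination.
std::size_t countPcStores(const std::vector<EmittedStep>& steps) {
    std::size_t count = 0;
    for (const EmittedStep& step : steps)
        count += step.storesPc ? 1 : 0;
    return count;
}
```

If most entries in a script have `mayReadPc` set, `countPcStores` approaches the script length and the elimination saves very little, which matches the observation above.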
Attachment #390588 - Attachment is obsolete: true
Fixed a bunch of bugs, including a really bad Windows bug and some issues with shifting. Added a workaround for an ARM nanojit deficiency. Perpetrated some (temporary!) unpleasant hacks to get tracing disabled so that tests that expect tracing to not be broken don't fail.
Attachment #391931 - Attachment is obsolete: true
This one actually builds with the new CodeAlloc stuff.
Attachment #393598 - Attachment is obsolete: true
Some bug fixes, merged with changes.
Attachment #393666 - Attachment is obsolete: true
Some notes on the organization of the code (this is pulled from a comment in jsithread.cpp):

jsithread.cpp contains all of the code generation for inline threading. Inline threading required some major refactoring of the interpreter (moving local variables into case-specific scope, eliminating inter-opcode jumps, making the BEGIN_CASE/END_CASE macros balanced, introducing new CASE macros), but most of that refactoring is not relevant to the behavior of inline threading.

jsinterp.cpp defines the *_CASE macros and other interpreter control macros, and includes a copy of jsops.cpp to declare the opcode functions. The interpreter function also sets up the state structure and tries to run inline-threaded code. Other than that, there isn't much inline-threading-specific code.

jsops.cpp has just a few bits of inline-threading-specific code: function and subroutine calls need to return code addresses to jump to, certain interpreter optimizations are turned off, and there are stubs for merging tests and branches.
Unfortunately this patch fails to apply correctly against mozilla-central. Is it possible to re-sync with trunk?
(In reply to comment #12)
> Unfortunately this patch fails to apply correctly against mozilla-central. Is it
> possible to re-sync with trunk?

It is currently a patch against tracemonkey, not mozilla-central.
This syncs up the patch with the current tip of the tree. The removal of Fragmento broke how it dealt with finally blocks. Rather than fixing this properly, I just removed the support for finally blocks for the time being. I didn't really like how it was being done anyways, and feel that it should be replaced by something based on LIR_jtbl once that becomes available.
Attachment #394603 - Attachment is obsolete: true
Guess this can't move any further?
(In reply to comment #14)
> Created an attachment (id=396826) [details]
> work in progress patch 8
>
> This syncs up the patch with the current tip of the tree.
>
> The removal of Fragmento broke how it dealt with finally blocks. Rather than
> fixing this properly, I just removed the support for finally blocks for the
> time being. I didn't really like how it was being done anyways, and feel that
> it should be replaced by something based on LIR_jtbl once that becomes
> available.

LIR_jtbl now available in patch form on bug 465582 for x86, x64, ppc, ARM. Beta testing & review input is most welcome.
Just another status update. LIR_jtbl has landed in nanojit-central.
This has been supplanted by JaegerMonkey, so I'm closing it WONTFIX.
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → WONTFIX