Closed
Bug 543637
(JaegerCompiler)
Opened 15 years ago
Closed 15 years ago
Create fragmentary baseline compiler
Categories
(Core :: JavaScript Engine, defect)
Tracking
()
RESOLVED
FIXED
People
(Reporter: dmandelin, Assigned: dvander)
Attachments
(1 file, 8 obsolete files): patch, 253.62 KB
This is the bug for the first version of the baseline compiler itself. The goal is intentionally vague: this will start as a place for patches and discussion on the baseline compiler. The plan is to implement one operation at a time, slowly expanding what we can test and gaining initial experience with the assembler and the compilation strategy.
Comment 1 • 15 years ago (Assignee)
This applies on top of Julian's "import Nitro" patch. It's very basic, and handles the following script:
|var x = 3.2;|
That is, it can generate working assembly for JSOP_[TRACE, DEFVAR, DOUBLE, SETGVAR, POP, STOP]. There are hooks in js_Interpret that compile and run scripts. If compilation fails, it won't try compiling it again.
* jager/Stubs.* - Builtin calls for handling tough cases. Right now, DEFVAR/SETGVAR.
* jager/Compiler.* - Self explanatory.
* jager/Jager.* - Frame layout, some general stuff, and the trampolines that transition from js_Interpret to JIT'd code.
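As a rough illustration of the compile-once behaviour mentioned above (a failed compile is never retried), here is a minimal sketch. All names (`JitState`, `Script`, `TryEnterJit`) are hypothetical and are not taken from the patch; the real hooks live in js_Interpret and may look quite different.

```cpp
#include <functional>

// Hypothetical names throughout; none of these come from the patch.
enum class JitState { NotTried, Compiled, Failed };

struct Script {
    JitState jitState = JitState::NotTried;
    int compileAttempts = 0;  // only here so the behaviour is observable
};

// Returns true when the script should run as JIT code, false when the
// interpreter should handle it. The outcome of the first compile attempt is
// remembered on the script, so a failed compile is never attempted again.
inline bool TryEnterJit(Script &script,
                        const std::function<bool(Script &)> &compile) {
    if (script.jitState == JitState::NotTried) {
        ++script.compileAttempts;
        script.jitState = compile(script) ? JitState::Compiled
                                          : JitState::Failed;
    }
    return script.jitState == JitState::Compiled;
}
```

Caching the failure on the script keeps the interpreter's hot path to a single state check for scripts that use unsupported opcodes.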
Some general notes:
* Exception handling is not implemented yet.
* I have only written trampolines for x64, and thus only tested on x64.
* Dave Mandelin suggested a way to print values. At least for the initial testing, I think we can just make JSOP_CALL always print the values it's given, since I don't have calls implemented anyway.
* The parts where a stack-layout or box representation change will break are sectioned off with a grep'able comment. I'm trying to abstract those.
* Oh God, MacroAssembler::peek/poke multiply your offset by sizeof(void*). This will bite me every single time (as it has already), guaranteed.
* JSScript::jager does not get freed. But, I'm not sure what that means anyway. We're using a mysterious pool allocator that should be investigated.
* At some point we'll need to investigate the thread safety of attaching random stuff to scripts.
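To make the peek/poke pitfall from the notes above concrete, here is a toy model (not the real MacroAssembler) in which the argument is a slot index that gets scaled by sizeof(void*). `ToyStack`, `pokeSlot`, `peekSlot`, and `ByteOffsetToSlot` are made-up names for illustration only.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Toy model of a stack area; this is not the real MacroAssembler.
struct ToyStack {
    unsigned char bytes[64] = {};

    // Slot-indexed access: the argument is scaled by sizeof(void*) before it
    // is used, so passing a byte offset here would address the wrong memory.
    void pokeSlot(size_t index, uintptr_t value) {
        std::memcpy(bytes + index * sizeof(void *), &value, sizeof(value));
    }
    uintptr_t peekSlot(size_t index) const {
        uintptr_t value;
        std::memcpy(&value, bytes + index * sizeof(void *), sizeof(value));
        return value;
    }
};

// Converts a byte offset to a slot index for callers that think in bytes.
// Returns false if the offset is not a multiple of the pointer size.
inline bool ByteOffsetToSlot(size_t byteOffset, size_t *slot) {
    if (byteOffset % sizeof(void *) != 0)
        return false;
    *slot = byteOffset / sizeof(void *);
    return true;
}
```

An explicit conversion helper like this is one way to keep byte offsets and slot indices from being mixed up silently.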
Immediate work items needed to get this further off the ground:
* x86 port (it's half there, needs trampolines + testing).
* Exception handling. It is really important to at least make bailing out of stubs work, so that's high on my TODO list.
* Proper memory management of allocated code attached to scripts.
* Build/configure stuff for when we want to land to $repo, pref'd off.
Updated • 15 years ago (Reporter)
Alias: JaegerCompiler
Comment 2 • 15 years ago (Assignee)
Changes from the last version
- Some more abstractions.
- Stopped using _ prefix for member vars since Waldo pointed out the C++ standard doesn't like that.
- Added about 6-7 opcodes, mostly simple ones.
(note, JSOP_GETGVAR SIGABRTs if it sees the JSOP_NAME case.)
- Added layer to track script pc to JIT addresses, and deal with forward branches.
- Added JSOP_GOTO, so it can now compile infinite loops.
- Added JMFLAGS (akin to TMFLAGS), the only flag is "abort" and it will tell you if a script can't be compiled because of an unsupported opcode.
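The pc-to-JIT-address layer with forward-branch handling listed above could be sketched roughly as follows. This is an assumption-laden illustration: `JumpMap`, its methods, and the use of a plain vector as the "code buffer" are all invented here and do not reflect the patch's actual data structures.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical sketch: names and layout are illustrative, not from the patch.
constexpr uint32_t kUnset = UINT32_MAX;

class JumpMap {
  public:
    explicit JumpMap(size_t scriptLength) : jitOffset_(scriptLength, kUnset) {}

    // Record that the code for bytecode offset `pc` starts at `offset`.
    void bind(size_t pc, uint32_t offset) { jitOffset_[pc] = offset; }

    // Register a jump stored at code[jumpSite] that targets bytecode
    // `targetPc`. Backward targets are already bound and are written
    // immediately; forward targets are queued until finish().
    void addJump(std::vector<uint32_t> &code, size_t jumpSite, size_t targetPc) {
        if (jitOffset_[targetPc] != kUnset)
            code[jumpSite] = jitOffset_[targetPc];
        else
            pending_.push_back({jumpSite, targetPc});
    }

    // After the whole script is compiled, patch every queued forward jump.
    // Returns false if some target was never bound.
    bool finish(std::vector<uint32_t> &code) {
        for (const Pending &p : pending_) {
            if (jitOffset_[p.targetPc] == kUnset)
                return false;
            code[p.jumpSite] = jitOffset_[p.targetPc];
        }
        pending_.clear();
        return true;
    }

  private:
    struct Pending { size_t jumpSite; size_t targetPc; };
    std::vector<uint32_t> jitOffset_;
    std::vector<Pending> pending_;
};
```

A backward jump such as the one JSOP_GOTO emits for a loop resolves at once, while a forward jump stays in the pending list until its target opcode has been compiled.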
Attachment #424746 -
Attachment is obsolete: true
Comment 3 • 15 years ago
Nit (but a big one): "Jaeger" has an 'e' in it. More precisely, it has an a-with-umlaut, and in German when umlauts are too hard 'ae' is an acceptable alternative.
Comment 4 • 15 years ago
Yeah, we ought to use the ae-forms consistently. Jager and Jaeger are certainly pronounced differently and would generally be regarded as different (although perhaps in this case related) words.
Comment 6 • 15 years ago (Assignee)
Fixed a bunch of bugs. Notably:
- x64 trampoline asserts are now correct.
- Stub calls no longer use poke() which always does pointer-width writes.
- Fixed some thinkos in the compiler.
This version also implements JSOP_[LT, LE, GT, GE].
Attachment #425362 -
Attachment is obsolete: true
Comment 7 • 15 years ago
It's Jaeger, not Jager. There are places in the patch that incorrectly use the latter.
Comment 8 • 15 years ago (Assignee)
I've created a wiki page to track which opcodes are implemented, not implemented, half-implemented, etc. This will matter more once this lands, but anyone else interested in adding more opcodes, please use this link :)
https://wiki.mozilla.org/JaegerMonkey/OpcodeProgress
Comment 9 • 15 years ago (Assignee)
Compare+branch fusing (with asserts on safety: incoming edges cannot "break" fusion). ifne/ifeq support. Some cleaning up of how jumps are handled. Forgotten info from v3: jaegerframes now associate with cx so GC can walk the roots used by stub calls. (This will go away with conservative stack scanning.)
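A simplified sketch of the safety condition for fusing a compare with the branch that follows it is shown here. The opcode enum and `CanFuse` are hypothetical, and each opcode is treated as occupying one slot, which is not how real bytecode is laid out.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical opcode set for illustration only.
enum Op { OP_LT, OP_LE, OP_GT, OP_GE, OP_IFEQ, OP_IFNE, OP_OTHER };

inline bool IsCompare(Op op) {
    return op == OP_LT || op == OP_LE || op == OP_GT || op == OP_GE;
}

inline bool IsBranch(Op op) { return op == OP_IFEQ || op == OP_IFNE; }

// A compare at index i can be fused with the branch at i + 1 only if no jump
// lands on the branch itself. Such an incoming edge would skip the compare
// and expect a boolean on the stack, which fused code never materializes.
inline bool CanFuse(const std::vector<Op> &ops,
                    const std::vector<bool> &isJumpTarget, size_t i) {
    if (i + 1 >= ops.size())
        return false;
    return IsCompare(ops[i]) && IsBranch(ops[i + 1]) && !isJumpTarget[i + 1];
}
```

In this model the jump-target table would come from a pre-pass over the script; whether the patch computes it that way is an assumption.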
Attachment #425393 -
Attachment is obsolete: true
Comment 10 • 15 years ago (Assignee)
Implemented JSOP_NAME.
Re-implemented JSOP_GETGVAR so the entire fast path is inlined; the slow path is a stub call to JSOP_NAME.
Also added a new "scripts" spew channel and a "full" spew channel that acts like TraceMonkey's TMFLAGS=full. So far this just logs when a script compilation is attempted and when it completes.
[jaeger] Scripts: compiling script (file "x.js") (line "1") (length "6")
[jaeger] Scripts: successfully compiled (code "0x100392000") (size "79")
Mostly to keep an eye on code explosion.
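For readers unfamiliar with spew channels, here is a minimal sketch of how a flag string could be turned into a channel mask. The channel names "abort", "scripts", and "full" come from this bug; the comma-separated syntax, the enum, and `ParseSpewFlags` are assumptions made for illustration and may not match how JMFLAGS is actually parsed.

```cpp
#include <cstdint>
#include <sstream>
#include <string>

// Hypothetical channel bits; only the channel names come from this bug.
enum SpewChannel : uint32_t {
    Spew_Abort   = 1u << 0,
    Spew_Scripts = 1u << 1,
};

// Parses a comma-separated flag string (assumed syntax) into a bitmask.
inline uint32_t ParseSpewFlags(const std::string &value) {
    uint32_t mask = 0;
    std::istringstream in(value);
    std::string name;
    while (std::getline(in, name, ',')) {
        if (name == "abort")
            mask |= Spew_Abort;
        else if (name == "scripts")
            mask |= Spew_Scripts;
        else if (name == "full")
            mask |= Spew_Abort | Spew_Scripts;  // "full" enables every channel
        // Unknown names are ignored in this sketch.
    }
    return mask;
}
```

A caller would typically read the string once at startup, for example from the environment, and test individual bits before logging.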
Attachment #425419 -
Attachment is obsolete: true
Comment 11 • 15 years ago (Assignee)
Added JSOP_ADD (no fast path), JSOP_INT8, JSOP_UINT16, JSOP_INT32. Fixed some bugs.
Now there is enough in place to run a benchmark:
> for (var i = 0; i < 50000000; i = i + 1)
> ;
All numbers are on x64:
SpiderMonkey: 2.73s
JägerMonkey: 1.83s
TraceMonkey: 0.19s
So this is promising, given that JSOP_SETGVAR and JSOP_ADD have no fast path yet. One thing that warrants investigation next week is whether this code bloat is going to be a problem. This script went from 35 bytes to 639 bytes. We should look at the memory usage of other JS engines.
On x64, storing a 64-bit immediate value to memory requires two instructions totaling a minimum of 10-11 bytes. This makes |regs.pc| updates pretty nasty, since I bake in the address right before stub calls. It also bloats baking in JSAtom*s for stub calls. Given that we already have |regs.pc| updates, one or the other should go. Or even both, it's not terrible to pass a 32-bit index and have the stub call read the atom vector, especially as we inline more fast cases.
Attachment #425600 -
Attachment is obsolete: true
Comment 12 • 15 years ago
Very encouraging...
Comment 13 • 15 years ago (Assignee)
This patch implements:
* JSOP_CALLNAME
* JSOP_BINDNAME
* JSOP_DEFFUN
* fast path for JSOP_SETGVAR
This new fast path got the above test case down to around 1.5s - encouraging, though I'm going to hold off on the "easy" inlining cases until harder stuff is finished.
This patch also lets stub calls fail now. It uses the same mechanism as Nitro - stub calls can swap their return address with a "throw" trampoline, and this implements exception handling. For now, it just bails back to js_Interpret with |ok = JS_FALSE|.
Attachment #425611 -
Attachment is obsolete: true
Comment 14 • 15 years ago (Assignee)
Re-based for latest version of the "import nitro" patch, and tm-tip. From this point on I'm going to split this up into smaller, incremental patches.
Attachment #425884 -
Attachment is obsolete: true
Comment 15 • 15 years ago (Assignee)
I've split up these patches and landed them individually on a user-repo. Future work will continue until we can get the Nitro assembler patch officially landed.
http://hg.mozilla.org/users/danderson_mozilla.com/jaegermonkey/
Status: NEW → RESOLVED
Closed: 15 years ago
Resolution: --- → FIXED
Comment 16 • 15 years ago (Reporter)
I added a few more opcodes. Here is a simple test that is newly enabled (by the way, I haven't added any fast paths for mul yet, but this test does use some fast paths made by dvander):
var x = 6;
for (var i = 0; i < 100000; ++i)
x *= 9;
print(x);
Time taken, measured with 'time':
SM: 220 ms
JM: 47 ms
TM: 20 ms
Comment 17 • 15 years ago (Reporter)
As of changeset d5c45e2e321a, we can run almost all of SunSpider, and 99.6% of the trace-tests. We are currently 1% or so faster than the interpreter on SS, although there is a bug affecting 3d-raytrace, and when that is fixed, we will get slower on that test.
We're taking a big hit on nbody and base64, so we should try to make those faster soon. We also know that we need to do fast paths for our more common ops, in particular common arithmetic ops and comparisons (and also compare/branch sequences).
Comment 18 • 15 years ago
IIRC the property cache (bug 365851 brought home the bacon) helped nbody a lot, back in the day. If you are PICless or property-cache-less that's gotta hurt.
Great progress here.
/be