Bug 650180 - (IonMonkey) [meta] Build a new optimizing compiler
Status: NEW
Product: Core
Classification: Components
Component: JavaScript Engine
Version: unspecified
Platform: x86 Mac OS X
Importance: -- normal with 30 votes
Target Milestone: ---
Assigned To: Nobody; OK to take it and work on it
: Jason Orendorff [:jorendorff]
Mentors:
Depends on: 646923 anion 677141 677953 684381 IonSpeed 716251 716682 721662 IonFuzz 728062 733661 752142 771500 771835 771864 786465 885514 650181 657816 659729 659839 661867 663575 666426 669575 669576 669577 669789 669793 669795 669796 669950 670484 670624 670816 670822 670827 671430 673026 674334 674402 674505 674507 675128 675378 676151 677066 677143 677337 677415 678377 678598 678625 678630 679794 680315 IM+TI 686595 IonOSI 692838 698778 699415 699883 700517 701125 701554 701993 702009 703376 705191 705247 705251 706778 707899 707919 708569 709731 712523 713855 713867 714428 714686 715088 715766 715772 716853 719855 724751 724875 728045 730111 730859 731142 731955 732546 732652 732653 733662 736299 738577 739808 742784 743640 LandIon 754202 755768 758181 760103 760231 764163 764432 765119 765127 766592 767349 770889 771460 772830
Reported: 2011-04-14 17:43 PDT by David Anderson [:dvander]
Modified: 2014-10-31 11:25 PDT
68 users

Description David Anderson [:dvander] 2011-04-14 17:43:04 PDT
The existing JM2 architecture suffices as a baseline compiler, but further optimization and specialization work is difficult, extremely hacky, or impossible.

There are two problems: the lack of an IR and the design of the compiler itself. The first pass computes stack depths, and the second pass splats out a little template of assembly per-opcode. Operations aren't broken down into small enough units to perform typical optimizations, making cross-basic-block regalloc & LICM rather ad-hoc or limited.
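
To make the two-pass design concrete, here is a minimal sketch of a template compiler of the kind described above. The bytecode, the opcode names, and the pseudo-assembly strings are all hypothetical and are not JM's actual instruction set; the point is only that each opcode expands to a fixed template that stores and reloads its operands, so nothing is visible to an optimizer across opcodes.

```python
# Hypothetical stack bytecode as (opcode, operand) pairs; not real SpiderMonkey bytecode.
STACK_EFFECT = {"push": (0, 1), "add": (2, 1), "mul": (2, 1), "pop": (1, 0)}  # (pops, pushes)


def compute_stack_depths(code):
    """Pass 1: record the operand-stack depth before each instruction."""
    depths, depth = [], 0
    for op, _ in code:
        pops, pushes = STACK_EFFECT[op]
        if depth < pops:
            raise ValueError(f"stack underflow at {op}")
        depths.append(depth)
        depth += pushes - pops
    return depths


def emit_templates(code, depths):
    """Pass 2: emit a fixed pseudo-assembly template per opcode."""
    out = []
    for (op, arg), d in zip(code, depths):
        if op == "push":
            out.append(f"mov [sp+{d}], {arg}")
        elif op in ("add", "mul"):
            # Operands are always reloaded from stack slots; no value is
            # kept in a register from one opcode to the next.
            out.append(f"mov r0, [sp+{d - 2}]")
            out.append(f"{op} r0, [sp+{d - 1}]")
            out.append(f"mov [sp+{d - 2}], r0")
        elif op == "pop":
            out.append("; pop: no code, depth is tracked statically")
    return out


# (1 + 2) * 3
code = [("push", 1), ("push", 2), ("add", None), ("push", 3), ("mul", None)]
depths = compute_stack_depths(code)
asm = emit_templates(code, depths)
```

Because the output is a flat list of templates with no intermediate representation, there is no natural place to run passes such as CSE or LICM, which is the limitation the description points at.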

The plan is to design a new set of IRs to assist in making new optimizations possible. We should also be able to break down existing ops into smaller units; for example, we should be able to CSE redundant shape guards or slots-array loads. On top of the IRs we will have a new code generator that reuses the Nitro assembler. Some planned features:
 * Advanced linear scan regalloc across basic blocks
 * LICM
 * CSE
 * Interval analyses
 * Advanced specialization, via profiling and type inference
 * Better handling of boxing formats, especially on x64
 * More robust recompilation, elimination of compartment-wide debug mode
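
As an illustration of the CSE item, the sketch below eliminates redundant shape guards and slots-array loads in one straight-line block. The `Ins` type, the opcode names, and the `PURE`/`CLOBBER` sets are assumptions made for this example, not IonMonkey's actual IR, and the clobber handling is deliberately conservative.

```python
from typing import NamedTuple


class Ins(NamedTuple):
    dest: str    # name of the value this instruction defines ("" if none)
    op: str
    args: tuple


# Assumed for this sketch: these ops have no side effects and can be reused.
PURE = {"guard_shape", "load_slots", "load_slot", "add"}
# Assumed for this sketch: these ops may change shapes or slot contents.
CLOBBER = {"call", "store_slot"}


def cse(block):
    """Local common-subexpression elimination over one basic block."""
    available = {}  # (op, args) -> dest of an earlier identical instruction
    rename = {}     # eliminated dest -> surviving dest
    out = []
    for ins in block:
        # Rewrite operands that refer to already-eliminated values.
        args = tuple(rename.get(a, a) for a in ins.args)
        if ins.op in CLOBBER:
            # Conservatively forget everything known so far.
            available.clear()
        elif ins.op in PURE:
            key = (ins.op, args)
            if key in available:
                if ins.dest:
                    rename[ins.dest] = available[key]
                continue  # redundant: drop the instruction
            available[key] = ins.dest
        out.append(Ins(ins.dest, ins.op, args))
    return out


# IR for something like `o.x + o.y`, with the per-op guards and loads spelled out.
block = [
    Ins("", "guard_shape", ("o", "S1")),
    Ins("s1", "load_slots", ("o",)),
    Ins("x", "load_slot", ("s1", 0)),
    Ins("", "guard_shape", ("o", "S1")),   # redundant guard
    Ins("s2", "load_slots", ("o",)),       # redundant slots-array load
    Ins("y", "load_slot", ("s2", 1)),
    Ins("r", "add", ("x", "y")),
]
optimized = cse(block)
```

The second guard and the second slots load disappear only because the property accesses were broken into smaller units first; with one opaque "get property" op per access there would be nothing to share.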

A more hopeful, but perhaps not realistic goal is to make this new compiler capable of being both an optimizing, type-specializing compiler, and a generic baseline compiler to replace JM. I think we'll know pretty early on whether this is feasible (i.e. whether it can be done without ruining the design or compile times), and if not, we can also focus on simplifying JM.
Comment 1 Nicholas Nethercote [:njn] 2011-04-19 19:58:04 PDT
> A more hopeful, but perhaps not realistic goal is to make this new compiler
> capable of being both an optimizing, type-specializing compiler, and a generic
> baseline compiler to replace JM. I think we'll know pretty early on whether
> this is feasible (i.e. whether it can be done without ruining the design or
> compile times), and if not, we can also focus on simplifying JM.

Let's assume the latter.  What does the execution pipeline look like?

- Interpret a bit
- JaegerMonkey-compile (with type inference) slightly hot code
- IonMonkey-compile (with type inference and IRs) hotter code
- Trace-compile really hot code?
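
A rough sketch of how such a tiered pipeline could pick a tier from a per-script use counter follows. The thresholds and the names `JM_THRESHOLD`, `ION_THRESHOLD`, `Script`, and `run` are invented for illustration, and the trace-compile tier is left out because the list above only poses it as a question.

```python
# Hypothetical warm-up thresholds; real engines tune these empirically.
JM_THRESHOLD = 16
ION_THRESHOLD = 1000


class Script:
    def __init__(self, name):
        self.name = name
        self.use_count = 0
        self.tier = "interpreter"


def choose_tier(use_count):
    """Map a use count to the tier that should execute the script."""
    if use_count >= ION_THRESHOLD:
        return "ion"
    if use_count >= JM_THRESHOLD:
        return "jaeger"
    return "interpreter"


def run(script):
    """Bump the counter and move the script to a higher tier when it gets hot."""
    script.use_count += 1
    new_tier = choose_tier(script.use_count)
    if new_tier != script.tier:
        # A real engine would invoke the corresponding compiler here.
        script.tier = new_tier
    return script.tier


s = Script("f")
tiers = [run(s) for _ in range(ION_THRESHOLD)]
```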

It sounds like TraceMonkey's days are numbered, which isn't necessarily a bad thing.  With type inference in place everywhere the only compelling advantage it provides is inlining of small functions.
Comment 2 David Anderson [:dvander] 2011-04-19 22:28:51 PDT
The type inference branch currently inlines functions, Ion will too. To be honest I'm not sure what the pipeline will look like yet, and the pipeline may change as we figure out how to tune it.

But, one of the up-front design decisions I'd like to make is that Ion code will never call into the tracer. It's an insanely complex path... and reducing the JIT transition matrix is probably best for everyone's sanity. *Especially* Bill's :)

My hope is that Ion won't just be fixing problems in JM, but also TM. TM suffers from an inability to despecialize both types and control flow, as it can't separate compilation from execution. Its IR is also fairly inflexible after it's been emitted. 

These are things we will fix in Ion. And for when we don't have type inference, we'll have the ability to collect type information via profiling. One backend, able to feed from multiple sources of information, should get us both JM3 and TM2. Maybe not right away, but it sounds like a good long-term goal.
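
One way to picture profiling-driven specialization with the ability to despecialize is sketched below. `TypeProfile`, `Bailout`, `compile_add`, and `AddSite` are hypothetical names for this example, Python's `int` and `float` stand in for JS int32 and double, and closures stand in for generated machine code; none of this is Ion's actual design.

```python
class TypeProfile:
    """Records the operand types observed at one operation site."""

    def __init__(self):
        self.seen = set()

    def record(self, value):
        self.seen.add(type(value))


class Bailout(Exception):
    """Raised when specialized code meets a type it was not compiled for."""


def compile_add(profile):
    """Return (code, is_specialized) based on the collected profile."""
    if profile.seen == {int}:
        def specialized(a, b):
            # Type guard: leave the fast path if the assumption is violated.
            if type(a) is not int or type(b) is not int:
                raise Bailout()
            return a + b
        return specialized, True

    def generic(a, b):
        return a + b
    return generic, False


class AddSite:
    def __init__(self):
        self.profile = TypeProfile()
        self.code = None
        self.specialized = False

    def execute(self, a, b):
        self.profile.record(a)
        self.profile.record(b)
        if self.code is None:
            self.code, self.specialized = compile_add(self.profile)
        try:
            return self.code(a, b)
        except Bailout:
            # Despecialize: recompile from the updated profile, then retry.
            self.code, self.specialized = compile_add(self.profile)
            return self.code(a, b)


site = AddSite()
first = site.execute(1, 2)        # only ints seen so far: specialized code
was_specialized = site.specialized
second = site.execute(1, 2.5)     # guard fails, site falls back to generic code
```

Keeping compilation (`compile_add`) separate from execution (`AddSite.execute`) is what lets the site throw away the specialized version and recompile, which is the capability the comment says TM lacks.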
Comment 3 Nicholas Nethercote [:njn] 2011-04-27 18:33:05 PDT
(In reply to comment #2)
> 
> But, one of the up-front design decisions I'd like to make is that Ion code
> will never call into the tracer. It's an insanely complex path...

Getting rid of the tracer would make that easier.  Otherwise the base compiler would have to decide whether to use IonMonkey or TM to do optimizing recompilation, which seems weird.

My gut feeling is that two levels of compilation (base + optimizing) should be enough, though bhackett suggested elsewhere that some kind of trace compilation might be good for white-hot loops.  Even if that were the case, it still sounds like a death knell for TM and particularly nanojit;  having two assemblers is weird, the momentum is clearly not with TM/nanojit, and nanojit has enough design limitations that redesigning a new trace jit from scratch based on what we've learnt about tracing sounds more sensible.
Comment 4 Brendan Eich [:brendan] 2011-04-28 01:20:15 PDT
We don't take perf regressions, so TM won't die without faster replacement on many benchmarks (not just the Stupids(tm)).

I suspect bhackett is right and we'll want tracing, semi-static type inference, and baseline/profiled JITting. But it sounds like dvander et al. have a good plan to support all three.

/be
Comment 5 Brian Hackett (:bhackett) 2011-04-28 07:18:54 PDT
(In reply to comment #4)
> We don't take perf regressions, so TM won't die without faster replacement on
> many benchmarks (not just the Stupids(tm)).
> 
> I suspect bhackett is right and we'll want tracing, semi-static type inference,
> and baseline/profiled JITting. But it sounds like dvander et al. have a good
> plan to support all three.
> 
> /be

We should list the benchmarks whose performance we care about and put them on AWFY.  Not just suites, but also microbenchmarks testing individual features important for the web (typed arrays, native getters/setters, ...) and shell tests synthesized from JS-bound web pages (bug 643666).  (Also, in a new page, like the test breakdown page, to avoid clutter.)  I filed this a couple weeks ago as bug 649487.

JM+TI won't replace TM, but IonMonkey could, and I think should, replace both TM and JM.  IMO, such replacements can't happen without a concrete way to do a broad evaluation of a JS engine's performance, and I agree the main benchmark suites don't cut it by themselves.
Comment 6 David Mandelin [:dmandelin] 2012-05-01 11:59:59 PDT
Right now, K9O wants bugs that are absolutely required and can be finished quickly. IM will take a bit longer and is more in the "wanted" category, so not blocking K9O for now.
Comment 7 Rob 2014-05-29 08:11:02 PDT
To avoid a Dupe I tried to find the "Compiler Optimization" / "Be 'nice' While Compiling" Bug, but this Meta was the most recent (and it is fairly old).

This Page http://glsl.heroku.com/e#12543.1 simply hangs the Browser for several minutes (I have a slow Computer) and then the compiled Code runs really fast.

All the other Examples on Page http://glsl.heroku.com/ that I tried load and compile almost instantly.

It would be great if the Compiler were faster on the first example. It is a "Bug" that when compiling the Browser hangs and we cannot switch Tabs.

The compiler needs to be 'nice' and not hog the CPU. While we wait we should be able to do other things. This Bug is titled "Build a new optimizing Compiler", this is something that should be considered in its design. 

When running an Example on http://threejs.org/ (specifically http://www.playmapscube.com/ ) the compiled output crashed. A List of these example pages would make a good Test / Benchmark Farm.
Comment 8 Rob 2014-05-29 09:50:28 PDT
(In reply to Rob from comment #7)
> ...
> All the other Examples on Page http://glsl.heroku.com/ that I tried load and
> compile almost instantly.
> ...

A great example of how slow we compile is here: https://www.shadertoy.com/browse .

1. Copy that URL.
2. Paste it into a new Tab's URL Bar.
3. Let the first example load and then start Google Chrome.
4. Paste the URL into Google Chrome's URL Bar.

Notice that even though Firefox got a head start, and is using most of the CPU, Chrome whips through that Page much faster. We are unacceptably slow.
Comment 9 Andrew McCreight [:mccr8] 2014-05-29 09:52:04 PDT
IonMonkey, the compiler described in this bug, landed a year or two ago.  Please file new bugs in the JavaScript: JIT component in Bugzilla for any issues you may have.
