Bug 650180 (IonMonkey)

[meta] Build a new optimizing compiler

NEW
Unassigned

Status

8 years ago
4 years ago

People

(Reporter: dvander, Unassigned)

Tracking

(Depends on: 16 bugs)

Firefox Tracking Flags

(blocking-kilimanjaro:-)

Details

The existing JM2 architecture suffices as a baseline compiler, but further optimization and specialization work is difficult, extremely hacky, or impossible.

There are two problems: the lack of an IR and the design of the compiler itself. The first pass computes stack depths, and the second pass splats out a little template of assembly per-opcode. Operations aren't broken down into small enough units to perform typical optimizations, making cross-basic-block regalloc & LICM rather ad-hoc or limited.
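
To make the two-pass shape concrete, here is a minimal sketch of a template compiler in Python. The opcode names, the stack-effect table, and the "assembly" strings are all invented for illustration; they are not JM2's real opcodes or templates.

```python
# Hypothetical stack bytecode: opcode -> net change in stack depth.
STACK_EFFECT = {"getlocal": +1, "int": +1, "add": -1, "setlocal": -1}

# Hypothetical per-opcode templates. Each one only sees its own operand
# and the stack depth *before* the opcode runs.
TEMPLATES = {
    "getlocal": lambda arg, d: [f"mov r0, local[{arg}]", f"mov stack[{d}], r0"],
    "int": lambda arg, d: [f"mov stack[{d}], {arg}"],
    "add": lambda arg, d: [
        f"mov r0, stack[{d - 2}]",
        f"add r0, stack[{d - 1}]",
        f"mov stack[{d - 2}], r0",
    ],
    "setlocal": lambda arg, d: [f"mov r0, stack[{d - 1}]", f"mov local[{arg}], r0"],
}


def compute_depths(code):
    """Pass 1: record the stack depth before each instruction."""
    depths, depth = [], 0
    for op, _arg in code:
        depths.append(depth)
        depth += STACK_EFFECT[op]
        if depth < 0:
            raise ValueError(f"stack underflow at {op}")
    return depths


def emit(code):
    """Pass 2: paste one fixed template per opcode, with no cross-opcode view."""
    out = []
    for (op, arg), depth in zip(code, compute_depths(code)):
        out.extend(TEMPLATES[op](arg, depth))
    return out


# x = x + 1
program = [("getlocal", 0), ("int", 1), ("add", None), ("setlocal", 0)]
for line in emit(program):
    print(line)
```

In this toy, every value round-trips through stack memory between opcodes because no template knows what its neighbors did. Removing those round trips is the kind of work that is awkward without an IR.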

The plan is to design a new set of IRs to make new optimizations possible. We should also be able to break down existing ops into smaller units; for example, we should be able to CSE redundant shape guards or slots-array loads. On top of the IRs we will have a new code generator that reuses the Nitro assembler. Some planned features:
 * Advanced linear scan regalloc across basic blocks
 * LICM
 * CSE
 * Interval analyses
 * Advanced specialization, via profiling and type inference
 * Better handling of boxing formats, especially on x64
 * More robust recompilation, elimination of compartment-wide debug mode
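
As a rough illustration of why smaller units help, the sketch below runs a simple common-subexpression elimination over one basic block of a made-up IR. The instruction names (`guard_shape`, `load_slots`, `load_slot`) and the `Instr` type are hypothetical, and the sketch assumes nothing in the block writes to the object or changes its shape; a real compiler would need alias and effect information before doing this.

```python
from typing import NamedTuple


class Instr(NamedTuple):
    dest: str    # name of the value this instruction defines
    op: str      # hypothetical IR opcode
    args: tuple  # operand names or constants


def cse(block):
    """Drop instructions that recompute an already-available (op, args) pair.

    Assumes every op in the block is side-effect free and that no
    intervening store or shape change invalidates earlier results.
    """
    available = {}  # (op, args) -> dest of the first instruction computing it
    replaced = {}   # eliminated dest -> surviving dest
    out = []
    for ins in block:
        # Rewrite operands that referred to eliminated values.
        args = tuple(replaced.get(a, a) for a in ins.args)
        key = (ins.op, args)
        if key in available:
            replaced[ins.dest] = available[key]
            continue
        available[key] = ins.dest
        out.append(Instr(ins.dest, ins.op, args))
    return out


# o.x + o.y, with each property access expanded into guard + loads.
block = [
    Instr("g1", "guard_shape", ("o", "S1")),
    Instr("s1", "load_slots", ("o",)),
    Instr("x", "load_slot", ("s1", 0)),
    Instr("g2", "guard_shape", ("o", "S1")),  # redundant guard
    Instr("s2", "load_slots", ("o",)),        # redundant slots-array load
    Instr("y", "load_slot", ("s2", 1)),
    Instr("sum", "add", ("x", "y")),
]

for ins in cse(block):
    print(ins)
```

With one monolithic "get property" op per access, the second guard and slots load would be hidden inside the op and there would be nothing for CSE to match on.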

A more hopeful, but perhaps not realistic goal is to make this new compiler capable of being both an optimizing, type-specializing compiler, and a generic baseline compiler to replace JM. I think we'll know pretty early on whether this is feasible (i.e. whether it can be done without ruining the design or compile times), and if not, we can also focus on simplifying JM.
> A more hopeful, but perhaps not realistic goal is to make this new compiler
> capable of being both an optimizing, type-specializing compiler, and a generic
> baseline compiler to replace JM. I think we'll know pretty early on whether
> this is feasible (i.e. whether it can be done without ruining the design or
> compile times), and if not, we can also focus on simplifying JM.

Let's assume the latter.  What does the execution pipeline look like?

- Interpret a bit
- JaegerMonkey-compile (with type inference) slightly hot code
- IonMonkey-compile (with type inference and IRs) hotter code
- Trace-compile really hot code?
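
One way to picture such a pipeline is a hit counter that picks the highest tier whose threshold has been reached. This is only a sketch: the thresholds are invented (the thread gives no numbers), and the tracing tier is left out because whether it exists at all is the open question here.

```python
# (threshold, tier) pairs in ascending order. Thresholds are hypothetical.
TIERS = [
    (0, "interpreter"),
    (10, "JaegerMonkey"),  # slightly hot code
    (1000, "IonMonkey"),   # hotter code
]


def select_tier(hit_count):
    """Return the highest tier whose threshold the hit count has reached."""
    chosen = TIERS[0][1]
    for threshold, name in TIERS:
        if hit_count >= threshold:
            chosen = name
    return chosen


for hits in (0, 10, 5000):
    print(hits, select_tier(hits))
```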

It sounds like TraceMonkey's days are numbered, which isn't necessarily a bad thing.  With type inference in place everywhere the only compelling advantage it provides is inlining of small functions.
The type inference branch currently inlines functions, Ion will too. To be honest I'm not sure what the pipeline will look like yet, and the pipeline may change as we figure out how to tune it.

But, one of the up-front design decisions I'd like to make is that Ion code will never call into the tracer. It's an insanely complex path... and reducing the JIT transition matrix is probably best for everyone's sanity. *Especially* Bill's :)

My hope is that Ion won't just be fixing problems in JM, but also TM. TM suffers from an inability to despecialize both types and control flow, as it can't separate compilation from execution. Its IR is also fairly inflexible after it's been emitted. 

These are things we will fix in Ion. And for when we don't have type inference, we'll have the ability to collect type information via profiling. One backend, able to feed from multiple sources of information, should get us both JM3 and TM2. Maybe not right away, but it sounds like a good long-term goal.
(In reply to comment #2)
> 
> But, one of the up-front design decisions I'd like to make is that Ion code
> will never call into the tracer. It's an insanely complex path...

Getting rid of the tracer would make that easier.  Otherwise the base compiler would have to decide whether to use IonMonkey or TM to do optimizing recompilation, which seems weird.

My gut feeling is that two levels of compilation (base + optimizing) should be enough, though bhackett suggested elsewhere that some kind of trace compilation might be good for white-hot loops.  Even if that were the case, it still sounds like a death knell for TM and particularly nanojit;  having two assemblers is weird, the momentum is clearly not with TM/nanojit, and nanojit has enough design limitations that redesigning a new trace jit from scratch based on what we've learnt about tracing sounds more sensible.
We don't take perf regressions, so TM won't die without faster replacement on many benchmarks (not just the Stupids(tm)).

I suspect bhackett is right and we'll want tracing, semi-static type inference, and baseline/profiled JITting. But it sounds like dvander et al. have a good plan to support all three.

/be
(In reply to comment #4)
> We don't take perf regressions, so TM won't die without faster replacement on
> many benchmarks (not just the Stupids(tm)).
> 
> I suspect bhackett is right and we'll want tracing, semi-static type inference,
> and baseline/profiled JITting. But it sounds like dvander et al. have a good
> plan to support all three.
> 
> /be

We should list the benchmarks whose performance we care about and put them on AWFY.  Not just suites, but also microbenchmarks testing individual features important for the web (typed arrays, native getters/setters, ...) and shell tests synthesized from JS-bound web pages (bug 643666).  (These should go on a new page, like the test breakdown page, to avoid clutter.)  I filed this a couple of weeks ago as bug 649487.

JM+TI won't replace TM, but IonMonkey could, and I think should, replace both TM and JM.  IMO, such replacements can't happen without a concrete way to do a broad evaluation of a JS engine's performance, and I agree the main benchmark suites don't cut it by themselves.
Depends on: 657816
Depends on: 659729
Depends on: 661867
Depends on: 666426
Depends on: 669575
Depends on: 669576
Depends on: 669577
Depends on: 663575
Depends on: 669789
Depends on: 669793
Depends on: 669795
Depends on: 669796
Depends on: 669950
Depends on: 670624
Depends on: 671430
Depends on: 673026
Depends on: 674334
Depends on: 674402
Depends on: 674505
Depends on: 674506
Depends on: 674507
Depends on: 674656
Depends on: 674664
Depends on: 674680
Depends on: 674689
No longer depends on: 674680
No longer depends on: 674664
No longer depends on: 674656
Depends on: 675128
No longer depends on: 674506
Depends on: 675373
No longer depends on: 675373
Depends on: 675378
Depends on: 676151
No longer depends on: 669575
No longer depends on: 669950
No longer depends on: 674507
Depends on: 669950
Depends on: 674507
Depends on: 669575
Depends on: 677066
Depends on: 677143
Depends on: 677141
Depends on: 677415
Depends on: 677953
Depends on: 678072
No longer depends on: 678072
Depends on: 678377
Depends on: 678598
Depends on: 678625
Depends on: 678630
Depends on: 679794
Depends on: 683037
Depends on: 692838
Depends on: 695017
No longer depends on: 695017
Depends on: 698778
Depends on: 699415
Depends on: 699883
Depends on: 700030
Depends on: 700517
Depends on: 701125
Depends on: 701554
Depends on: 701993
Depends on: 702009
Depends on: 703376
Depends on: 703791
No longer depends on: 703791
Depends on: 705191
Depends on: 705247
Depends on: 705251
Depends on: 705294
Depends on: 706778
Depends on: 706896
No longer depends on: 706896
Depends on: 707899
Depends on: 707919
Depends on: 708569
Depends on: 709731
Depends on: 712523
Depends on: 713855
Depends on: 713867
Depends on: 714428
Depends on: 714686
Depends on: 715088
Depends on: 715766
Depends on: 715772
Depends on: 716251
Depends on: 716682
Depends on: 716853
Depends on: 719855
Depends on: 721662
Depends on: 728045
Depends on: 728062
Depends on: 730111
Depends on: 730859
Depends on: 731142
Depends on: 731955
Depends on: 732546
Depends on: 733661
Depends on: 733662
Depends on: 736299
Depends on: 738577
Depends on: 739808
Depends on: 740563
No longer depends on: 740563
Depends on: 742784
Depends on: 743640
blocking-kilimanjaro: --- → ?
Depends on: 748986
No longer depends on: 748986
Right now, K9O wants bugs that are absolutely required and can be finished quickly. IM will take a bit longer and is more in the "wanted" category, so not blocking K9O for now.
blocking-kilimanjaro: ? → -
Depends on: 752142
Depends on: 755768
Depends on: 754202
No longer depends on: 700030
Depends on: 758181
Depends on: 760103
Depends on: 760231
Depends on: 764163
Depends on: 764432
Depends on: 765126
Depends on: 765127
Depends on: 765128
Depends on: 765119
Depends on: 766592
Depends on: 767349
Depends on: 770889
Depends on: 771460
Depends on: 771500
Depends on: 772830
Depends on: 771835
Depends on: 771864
Depends on: 774075
No longer depends on: 774075
No longer depends on: 765128
No longer depends on: 765126
Depends on: 786465

Updated

7 years ago
Depends on: 790464
No longer depends on: 790464
Depends on: 885514

Comment 7

5 years ago
To avoid a Dupe I tried to find the "Compiler Optimization" / "Be 'nice' While Compiling" Bug, but this Meta was the most recent (and it is fairly old).

This Page http://glsl.heroku.com/e#12543.1 simply hangs the Browser for several minutes (I have a slow Computer) and then the compiled Code runs really fast.

All the other Examples on Page http://glsl.heroku.com/ that I tried load and compile almost instantly.

It would be great if the Compiler were faster on the first example. It is a "Bug" that the Browser hangs while compiling and we cannot switch Tabs.

The compiler needs to be 'nice' and not hog the CPU. While we wait we should be able to do other things. This Bug is titled "Build a new optimizing Compiler", this is something that should be considered in its design. 

When running an Example on http://threejs.org/ (specifically http://www.playmapscube.com/) the compiled output crashed. A List of these example pages would make a good Test / Benchmark Farm.

Comment 8

5 years ago
(In reply to Rob from comment #7)
> ...
> All the other Examples on Page http://glsl.heroku.com/ that I tried load and
> compile almost instantly.
> ...

A great example of how slow we compile is here: https://www.shadertoy.com/browse .

1. Copy that URL.
2. Paste it into a new Tab's URL Bar.
3. Let the first example load and then start Google Chrome.
4. Paste the URL into Google Chrome's URL Bar.

Notice that even though Firefox got a head start and is using most of the CPU, Chrome whips through that Page much faster. We are unacceptably slow.
IonMonkey, the compiler described in this bug, landed a year or two ago.  Please file new bugs in the JavaScript: JIT component in Bugzilla for any issues you may have.
Assignee: general → nobody