Bug 1536612 Comment 0 Edit History

I'm filing this bug as an idea dump since Jan commented on Bug 1382650, and I remembered the issues that Apple faced with adding a 4th JIT tier. I have also been thinking about how we could do better on "library"-style code like React, which tends to be highly generic/polymorphic. I have some ideas and would like feedback so that I can flesh them out.


Why we think we need another inlining level
-------------------------------------------

Suppose we have a hot, polymorphic library function `flux(x)`. `x` will have the types/shapes `A`, `B`, and `C`. This could be a function in React, or a VM built-in.

Our implicit goal is to eliminate the polymorphism and get simple, straight-line code for a single type. This is possible, for example, when a common, polymorphic function is always called from a particular callsite with a single type. In our case, suppose that `blag(x: C)` calls `flux(x: C)`, and we want to compile `blag()`.

Our compilers currently behave as follows:

1. (Interpreter isn't a compiler, just didn't want to start at 2.)

2a. Baseline will compile `flux()`, noting that `x` has types `{A,B,C}`.
2b. Baseline will compile `blag()`, noting that `x` always has type `C`.

3a. If Ion compiles `flux()`, it will generate a function that handles all of types `{A,B,C}`.
3b. If Ion compiles `blag()`, it will attempt to inline `flux()`. The inliner first compiles `flux()` again, using the same `{A,B,C}` types, and then inserts the CFG of `flux()` into the CFG of `blag()`. Depending on the types assigned in the CFG, we may be able to simplify some of the embedded-`flux()` typesets, but the result is generally still suboptimal.
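
To make the problem in steps 2a through 3b concrete, here is a toy Python model of per-site type feedback. It is only a sketch: `feedback`, `observe`, and `types_for_inlined_callee` are hypothetical names invented for illustration and do not correspond to anything in SpiderMonkey. It shows that `blag()` only ever sees `C`, yet the inlined copy of `flux()` is still built from the `{A,B,C}` set that `flux()` recorded on its own.

```python
from collections import defaultdict

class A: pass
class B: pass
class C: pass

# Hypothetical stand-in for Baseline's type feedback:
# (function name, site label) -> set of type names observed at that site.
feedback = defaultdict(set)

def observe(func, site, value):
    """Record the type of `value` at the given site, then pass it through."""
    feedback[(func, site)].add(type(value).__name__)
    return value

def flux(x):
    observe("flux", "x", x)
    return type(x).__name__.lower()

def blag(x):
    observe("blag", "x", x)
    return flux(x)

# Warm-up: flux() is reached with A, B and C from elsewhere,
# while blag() only ever passes a C.
for value in (A(), B(), C()):
    flux(value)
for _ in range(3):
    blag(C())

def types_for_inlined_callee(callee, site):
    """Model of step 3b: the inlined body reuses the callee's own feedback,
    so the caller's narrower observation is not taken into account."""
    return feedback[(callee, site)]

print(sorted(feedback[("blag", "x")]))                 # what blag() saw
print(sorted(types_for_inlined_callee("flux", "x")))   # what the inlined flux() gets
```

The feedback is keyed by the callee alone, so nothing in it says which caller produced which type. That missing context is what the rest of this comment is trying to recover.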

At this point, we usually make the same observation that Apple did: if we just recorded the types when executing *the inlined code* in Ion, then we would have better *internal* type information for `flux-when-inlined`, and therefore we would be able to *regenerate* Ion code that has better type information. This is true.

There's another way to get trace trees that keeps 3 tiers
---------------------------------------------------------

Another way to look at this problem, though, is that what we really want is a trace tree of what happens internally in `flux()` when it's called with arguments of different types.

You could generate such a trace tree for a 4th tier by referring to the ICs of a 3rd tier; that's effectively what a 4th tier is doing.

You could also get a trace tree for the *3rd* tier compiler by just recording more information in the *2nd* tier.

AFL-inspired type tuples
------------------------

The AFL whitepaper (http://lcamtuf.coredump.cx/afl/technical_details.txt) describes a system that encodes tuples of taken branches for the fuzzer, which creates an implicit trace tree and is good enough to recreate program flow.
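
For reference, the edge encoding described in that document boils down to `shared_mem[cur_location ^ prev_location]++` followed by `prev_location = cur_location >> 1`. Below is a small Python sketch of that scheme. The block IDs are fixed, made-up values (AFL assigns random IDs at compile time), and `edge_index` and `record_path` are names chosen here for illustration.

```python
MAP_SIZE = 1 << 16  # AFL uses a 64 kB shared map

def edge_index(prev_block, cur_block):
    """Map slot for the edge prev_block -> cur_block.

    The previous ID is shifted right by one before the XOR so that
    A -> B and B -> A land in different slots, and A -> A is not slot 0.
    """
    return (cur_block ^ (prev_block >> 1)) % MAP_SIZE

def record_path(block_ids):
    """Replay a sequence of executed blocks and count each edge taken."""
    shared_mem = [0] * MAP_SIZE
    prev_location = 0
    for cur_location in block_ids:
        shared_mem[(cur_location ^ prev_location) % MAP_SIZE] += 1
        prev_location = cur_location >> 1
    return shared_mem

# Hypothetical block IDs, fixed here so the example is deterministic.
BLOCK_A = 0x1234
BLOCK_B = 0x5678

hits = record_path([BLOCK_A, BLOCK_B, BLOCK_A, BLOCK_B])
print(hits[edge_index(BLOCK_A, BLOCK_B)])  # times A -> B was taken
print(hits[edge_index(BLOCK_B, BLOCK_A)])  # times B -> A was taken
```

Each counter describes a (previous, current) pair rather than a single location, and that pair is what the next section borrows.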

We could use a similar mechanism in the Baseline ICs to record *directional* types: each IC would record the branch in the *previous* IC that was taken, and the branch in the *current* IC that resulted (without needing to worry about causality!).

Specifically, we would store `(&prev_ic_stub, &cur_ic_stub)`.

If we recorded this information, `IonBuilder` would be able to consult both the previous instruction's cache and the current instruction's cache, and would receive a list of *all possible IC types seen when the previous path was taken*. From this, you can reconstruct the complete execution flow of the function.
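
Here is a rough Python sketch of how the `(&prev_ic_stub, &cur_ic_stub)` pairs might be recorded and queried. Everything in it is an assumption made for illustration: `ICSite`, `stubs_after`, `run_flux`, and the string stub names (standing in for stub addresses) are hypothetical and are not existing SpiderMonkey APIs.

```python
from collections import defaultdict

class ICSite:
    """Toy inline-cache site. Stub names stand in for stub addresses."""

    def __init__(self, name):
        self.name = name
        # (prev_stub, cur_stub) -> number of times that pair was observed
        self.pairs = defaultdict(int)

    def record(self, prev_stub, cur_stub):
        self.pairs[(prev_stub, cur_stub)] += 1

    def all_stubs(self):
        """What a plain IC chain tells the compiler: every stub ever hit."""
        return {cur for (_, cur) in self.pairs}

    def stubs_after(self, prev_stub):
        """Directional query: stubs hit here when `prev_stub` ran just before."""
        return {cur for (prev, cur) in self.pairs if prev == prev_stub}

def run_flux(shape, caller_stub, ic0, ic1):
    """Simulate one call to flux(): two ICs in a row, both keyed on the shape."""
    stub0 = f"{ic0.name}:{shape}"
    ic0.record(caller_stub, stub0)
    stub1 = f"{ic1.name}:{shape}"
    ic1.record(stub0, stub1)

ic0 = ICSite("flux.ic0")
ic1 = ICSite("flux.ic1")

# Warm-up: other callers pass A, B and C; blag() only ever passes C.
for shape in ("A", "B", "C"):
    run_flux(shape, f"other.call:{shape}", ic0, ic1)
for _ in range(3):
    run_flux("C", "blag.call:C", ic0, ic1)

print(sorted(ic1.all_stubs()))                  # undirected view of flux.ic1
print(sorted(ic0.stubs_after("blag.call:C")))   # first IC, when reached from blag()
print(sorted(ic1.stubs_after("flux.ic0:C")))    # second IC, following the C path
```

Chaining `stubs_after` from the call-site stub in `blag()` walks a single path through `flux()`. That is the kind of per-caller information a compiler could use to specialize the inlined copy for `C` only.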

With this change, 3rd-tier Ion would have the information that a 4th-tier Ion would want, and we could compile equivalent code in the 3rd tier.