IonMonkey: Support LICM across dominator trees for OSR.

Status: NEW
Product: Core
Component: JavaScript Engine
Opened 6 years ago, updated 3 years ago

People

(Reporter: sstangl, Assigned: sstangl)

Tracking

(Blocks: 1 bug)

Firefox Tracking Flags

(Not tracked)

Details

(Whiteboard: [ion:t])

Attachments

(1 attachment)

Description

6 years ago
The OSR case is extremely common but implies significant deoptimization because it partitions the CFG into multiple dominator trees across which optimizations are not performed. This is particularly bad in the case of nested loops: since the inner loop executes more frequently, it is more likely to have an OSR block attached to its preheader, and that OSR block entirely prohibits LICM for the outer loop. Other optimizations are prohibited also, but this is the most important.
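
To make the partitioning concrete, here is a minimal sketch on a toy nested-loop CFG. It is not IonMonkey code: the block names, the `make_cfg` and `dominators` helpers, and the textbook set-based dominator computation with more than one root are all assumptions chosen for illustration.

```python
def make_cfg(with_osr):
    """Toy nested-loop CFG as successor lists; all block names are made up."""
    cfg = {
        "entry": ["outer_pre"],
        "outer_pre": ["outer_head"],
        "outer_head": ["inner_pre", "exit"],
        "inner_pre": ["inner_head"],
        "inner_head": ["inner_body", "outer_latch"],
        "inner_body": ["inner_head"],
        "outer_latch": ["outer_head"],
        "exit": [],
    }
    if with_osr:
        # Hypothetical OSR entry block attached to the inner loop's preheader.
        cfg["osr"] = ["inner_pre"]
    return cfg


def dominators(cfg, roots):
    """Iterative dominator sets; every root dominates only itself."""
    preds = {b: [] for b in cfg}
    for b, succs in cfg.items():
        for s in succs:
            preds[s].append(b)
    everything = set(cfg)
    dom = {b: ({b} if b in roots else set(everything)) for b in cfg}
    changed = True
    while changed:
        changed = False
        for b in cfg:
            if b in roots:
                continue
            # A block is dominated by whatever dominates all of its predecessors.
            new = set(everything)
            for p in preds[b]:
                new &= dom[p]
            new.add(b)
            if new != dom[b]:
                dom[b] = new
                changed = True
    return dom


plain = dominators(make_cfg(False), {"entry"})
with_osr = dominators(make_cfg(True), {"entry", "osr"})

print("outer_pre dominates inner_body without OSR:", "outer_pre" in plain["inner_body"])
print("outer_pre dominates inner_body with OSR:", "outer_pre" in with_osr["inner_body"])
```

In this toy graph the outer preheader dominates the whole loop nest when there is a single entry, but once the OSR block is a second root it no longer dominates even the outer loop header, because the back edge can be reached through the OSR entry. An instruction hoisted only into the outer preheader would therefore not be available on every path.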

We can handle LICM for the case of multiple dominator trees by duplicating instructions into each and inserting a phi at each reachable join point after the intended hoisting location. This will be a bit tricky.
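
A rough sketch of the duplicate-and-phi idea on a toy CFG follows. Everything here is hypothetical: the block names, the `place_phis` helper, the value names `mul#entry` and `mul#osr`, and the choice to put the second copy directly in the OSR block. The fixpoint is deliberately naive and only meant for this small graph; a real implementation would more likely use dominance frontiers.

```python
def place_phis(cfg, copies):
    """copies maps a block to the name of the duplicated instruction placed in it.

    Returns (value visible at the end of each block, sorted blocks needing a phi).
    """
    preds = {b: [] for b in cfg}
    for b, succs in cfg.items():
        for s in succs:
            preds[s].append(b)
    # None means "no copy of the instruction has reached this block (yet)".
    value_at = {b: copies.get(b) for b in cfg}
    changed = True
    while changed:
        changed = False
        for b in cfg:
            if b in copies:
                continue
            phi = "phi@" + b
            # Ignore unknown inputs and the block's own phi flowing around a loop.
            incoming = {value_at[p] for p in preds[b]} - {None, phi}
            if len(incoming) > 1:
                value = phi  # predecessors disagree: merge them here
            elif incoming:
                value = next(iter(incoming))
            else:
                value = None
            if value != value_at[b]:
                value_at[b] = value
                changed = True
    phis = sorted(b for b in cfg if value_at[b] == "phi@" + b)
    return value_at, phis


# Toy nested loop with a hypothetical OSR entry at the inner preheader.
cfg = {
    "entry": ["outer_pre"],
    "outer_pre": ["outer_head"],
    "outer_head": ["inner_pre", "exit"],
    "inner_pre": ["inner_head"],
    "inner_head": ["inner_body", "outer_latch"],
    "inner_body": ["inner_head"],
    "outer_latch": ["outer_head"],
    "exit": [],
    "osr": ["inner_pre"],
}

# One copy of the invariant instruction per dominator tree.
value_at, phis = place_phis(cfg, {"outer_pre": "mul#entry", "osr": "mul#osr"})
print("phis needed at:", phis)
print("inner_body uses:", value_at["inner_body"])
```

In this toy graph a phi is needed both at the inner preheader, where the OSR path joins, and at the outer loop header, because the merged value flows back along the outer back edge. That second phi is one reason the transformation is trickier than ordinary hoisting.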

Handling LICM for OSR should greatly benefit SunSpider (SS) 3d-raytrace and 3d-cube. (An interesting datum: 3d-cube runs 25% faster with --ion-osr=off.)

We should also be able to entirely ignore the OSR block when computing GVN information, and maintain correctness.

Created attachment 632474
Quick patch to list possible improvements

(In reply to Marty Rosenberg [:mjrosenb] from comment #1)
> We should also be able to entirely ignore the OSR block when computing GVN
> information, and maintain correctness.

I had the same idea, so I tried it to see what sort of improvement we can expect; the quick-and-dirty patch I used for testing is attached. On SS the improvements are in 3d-cube, 3d-morph and 3d-raytrace. The rest stays the same.

3d:         1.021x as fast   68.7ms +/- 0.6%  67.3ms +/- 0.5%   significant
  cube:     1.037x as fast   24.0ms +/- 0.8%  23.2ms +/- 0.9%   significant
  morph:    1.008x as fast   14.3ms +/- 0.5%  14.2ms +/- 0.4%   significant
  raytrace: 1.014x as fast   30.4ms +/- 0.8%  29.9ms +/- 0.3%   significant

No improvement on Kraken, and too much noise to see anything on V8.

The loop unrolling patch (bug 1039458) introduced canClone() and clone() methods on MInstruction. These should be useful for the optimization described in comment 0.