The OSR case is extremely common but costs us a lot of optimization, because it partitions the CFG into multiple dominator trees across which optimizations are not performed. This is particularly bad in the case of nested loops: since the inner loop executes more frequently, it is more likely to have an OSR block attached to its preheader, and that OSR block entirely prohibits LICM for the outer loop. Other optimizations are blocked as well, but LICM is the most important. We can handle LICM for the case of multiple dominator trees by duplicating instructions into each tree and inserting a phi at each reachable join point after the intended hoisting location. This will be a bit tricky. Handling LICM for OSR should greatly benefit SunSpider (SS) 3d-raytrace and 3d-cube. (An interesting datum: 3d-cube runs 25% faster with --ion-osr=off.)
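To make the dominator problem concrete, here is a minimal sketch in Python, not Ion code. The block names, the toy CFG, and the helpers `dominators` and `place_phis` are all made up for illustration. It assumes the OSR block enters at the inner loop's preheader, and it shows that the normal entry block then no longer dominates the outer loop, so a value hoisted only there would not dominate its uses. The phi placement is a simple conservative propagation (it may place redundant phis), not a minimal SSA construction.

```python
# Toy CFG (hypothetical): a nested loop with an OSR entry into the
# inner loop's preheader. Keys are blocks, values are successors.
CFG = {
    "entry": ["outer_header"],
    "osr": ["inner_preheader"],
    "outer_header": ["inner_preheader", "exit"],
    "inner_preheader": ["inner_header"],
    "inner_header": ["inner_body", "outer_latch"],
    "inner_body": ["inner_header"],
    "outer_latch": ["outer_header"],
    "exit": [],
}
ROOTS = {"entry", "osr"}  # both blocks have no predecessors


def predecessors(cfg):
    preds = {b: [] for b in cfg}
    for block, succs in cfg.items():
        for s in succs:
            preds[s].append(block)
    return preds


def dominators(cfg, roots):
    """Iterative dominator sets; every root dominates only itself."""
    preds = predecessors(cfg)
    everything = set(cfg)
    dom = {b: ({b} if b in roots else set(everything)) for b in cfg}
    changed = True
    while changed:
        changed = False
        for b in cfg:
            if b in roots:
                continue
            new = set(everything)
            for p in preds[b]:
                new &= dom[p]
            new |= {b}
            if new != dom[b]:
                dom[b] = new
                changed = True
    return dom


def place_phis(cfg, copies):
    """Given one copy of the hoisted value per root, return the join
    blocks where differing incoming values meet and a phi is needed.
    Conservative: may report a phi that a minimal placement would omit."""
    preds = predecessors(cfg)
    out = dict(copies)  # block -> name of the value live at block end
    changed = True
    while changed:
        changed = False
        for b in cfg:
            if b in copies or out.get(b) == "phi@" + b:
                continue
            incoming = {out[p] for p in preds[b] if p in out}
            if not incoming:
                continue
            new = next(iter(incoming)) if len(incoming) == 1 else "phi@" + b
            if out.get(b) != new:
                out[b] = new
                changed = True
    return sorted(b for b, v in out.items() if v == "phi@" + b)


dom = dominators(CFG, ROOTS)
phis = place_phis(CFG, {"entry": "copy@entry", "osr": "copy@osr"})
```

In this toy graph `dom["outer_header"]` contains only `outer_header` itself, which is why hoisting into `entry` alone is unsafe, and `phis` names the join blocks that need to merge the duplicated copies.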
We should also be able to entirely ignore the OSR block when computing GVN information, and maintain correctness.
Created attachment 632474
Quick patch to list possible improvements

(In reply to Marty Rosenberg [:mjrosenb] from comment #1)
> We should also be able to entirely ignore the OSR block when computing GVN
> information, and maintain correctness.

I had the same idea, so I tried it to see what sort of improvements we can expect. I've attached the quick and dirty patch I used to test it. The improvements on SS are in 3d-cube, 3d-morph and 3d-raytrace. The rest stays the same.

3d:          1.021x as fast    68.7ms +/- 0.6%    67.3ms +/- 0.5%    significant
  cube:      1.037x as fast    24.0ms +/- 0.8%    23.2ms +/- 0.9%    significant
  morph:     1.008x as fast    14.3ms +/- 0.5%    14.2ms +/- 0.4%    significant
  raytrace:  1.014x as fast    30.4ms +/- 0.8%    29.9ms +/- 0.3%    significant

No improvement on Kraken, and too much noise to see anything on V8.
The loop unrolling patch (bug 1039458) introduced canClone() and clone() methods on MInstruction. These should be useful for the optimization described in comment 0.
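As a rough illustration of how a clone facility would fit the duplication step from comment 0, here is a toy Python model. It is not the MInstruction API: `ToyInstr`, `can_clone`, `clone`, the `effectful` flag, and `duplicate_into` are all hypothetical names, and the real cloning rules in Ion may well differ. The only idea it shows is "check that the instruction is clonable, then make one copy per dominator tree root".

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class ToyInstr:
    """Hypothetical stand-in for a MIR instruction."""
    op: str
    operands: tuple
    block: str = ""
    effectful: bool = False  # assumption: effectful instructions are not clonable

    def can_clone(self):
        return not self.effectful

    def clone(self, block):
        # Same operation and operands, placed in a different block.
        return replace(self, block=block)


def duplicate_into(instr, targets):
    """Return one copy of instr per target block, or None if it cannot be cloned."""
    if not instr.can_clone():
        return None
    return [instr.clone(t) for t in targets]


invariant = ToyInstr("add", ("a", "b"), block="outer_header")
copies = duplicate_into(invariant, ["entry", "osr"])
blocked = duplicate_into(ToyInstr("call", (), effectful=True), ["entry", "osr"])
```

Here `copies` holds one clone in each of the two root blocks, and `blocked` is `None` because the toy call is treated as not clonable.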