Open Bug 1331568 Opened 3 years ago Updated 5 months ago

Wasm baseline: General instruction selection concerns

Categories

(Core :: Javascript: WebAssembly, task, P5)

Tracking

()

REOPENED

People

(Reporter: lth, Unassigned)

References

(Blocks 1 open bug)

Details

(Before doing work here talk to :lth about priorities.  It's not obvious that we should do this work.)

There are many instruction selection opportunities in the baseline compiler that we are not currently pursuing.  Taken together they will usually result in significantly better code.  These opportunities fall into several broad groups:

Strength reduction optimizations:
- multiply-by-constant becomes shift or shift-and-add; note it is hard to beat
  native multiplication unless one is very careful (bug 1316803)
- x^-1 (xor with all ones) becomes ~x (bug 1276009)
- (presumably others)
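
A minimal sketch of the two reductions named above, written in Python rather than the compiler's own code. The tuple-based op names (`shl`, `shl_add`, `shl_sub`, `not`) and the helper names are hypothetical, and values are modelled as unsigned 32-bit integers. It only covers constants of the form 2^k, 2^k+1 and 2^k-1; whether any of these actually beats a native multiply on a given CPU is exactly the open question noted above.

```python
MASK32 = 0xFFFFFFFF  # model wasm i32 values as unsigned 32-bit integers


def reduce_strength(op, const):
    """Return a cheaper op list for `x <op> const`, or None to keep the original.

    Hypothetical ops: ("shl", k) is x << k, ("shl_add", k) is (x << k) + x,
    ("shl_sub", k) is (x << k) - x, and ("not",) is ~x.
    """
    const &= MASK32
    if op == "xor" and const == MASK32:      # x ^ -1  ->  ~x
        return [("not",)]
    if op == "mul" and const > 1:
        for k in range(1, 32):
            if const == 1 << k:              # 2**k      ->  x << k
                return [("shl", k)]
            if const == (1 << k) + 1:        # 2**k + 1  ->  (x << k) + x
                return [("shl_add", k)]
            if const == (1 << k) - 1:        # 2**k - 1  ->  (x << k) - x
                return [("shl_sub", k)]
    return None


def run_ops(ops, x):
    """Evaluate an op list on x with 32-bit wraparound, to check equivalence."""
    for o in ops:
        if o[0] == "not":
            x = ~x & MASK32
        elif o[0] == "shl":
            x = (x << o[1]) & MASK32
        elif o[0] == "shl_add":
            x = ((x << o[1]) + x) & MASK32
        elif o[0] == "shl_sub":
            x = ((x << o[1]) - x) & MASK32
    return x
```

For example, `reduce_strength("mul", 5)` yields a single shift-and-add, while `reduce_strength("mul", 10)` returns None and leaves the multiply alone.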

Peephole optimization:
- x+0 etc. becomes x; this can result from indexing expressions and context-unaware
  code generation, but we don't want to special-case it everywhere
- optimization across masm invocation boundaries
- use three-address instructions on ARM more generally
- propagate condition codes when already available
- (very long list)
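
As an illustration of the x+0 case, here is a small sketch of a single-pass peephole filter over a straight-line wasm-style instruction sequence. It is not how the baseline compiler is structured; the tuple representation and the `peephole` and `IDENTITIES` names are assumptions made for the example, and it only handles a constant that is pushed immediately before the operator (i.e. the right-hand operand).

```python
# (opcode, constant) pairs for which `x op constant == x`
IDENTITIES = {
    ("i32.add", 0), ("i32.sub", 0), ("i32.or", 0), ("i32.xor", 0),
    ("i32.shl", 0), ("i32.shr_u", 0), ("i32.mul", 1),
}


def peephole(instrs):
    """Drop `i32.const c; op` pairs that leave the value below them unchanged.

    `instrs` is a list of tuples such as ("local.get", 0), ("i32.const", 0)
    or ("i32.add",), assumed to be straight-line code with no branch targets.
    """
    out = []
    for ins in instrs:
        if (out and out[-1][0] == "i32.const"
                and (ins[0], out[-1][1]) in IDENTITIES):
            out.pop()   # remove the constant push and skip the operator
            continue
        out.append(ins)
    return out
```

Doing this in one place on the instruction stream is what avoids special-casing the identity in every code generation routine.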

Instruction selection:
- short jumps rather than long, when we can (bug 1312147)
- bit extract / insert, when available
- LEA on x86
- (long list)
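
The short-versus-long jump choice can be sketched as follows for an unconditional x86 `jmp`, assuming the target address is already known (as it is for backward branches). `encode_jmp` is a hypothetical helper, not a masm API; forward branches, where the target is not yet known when the jump is emitted, are the hard part and are not handled here.

```python
import struct


def encode_jmp(at, target):
    """Encode an x86 `jmp` located at address `at` that goes to `target`.

    Displacements are relative to the end of the instruction: 2 bytes for
    the short form (EB rel8), 5 bytes for the near form (E9 rel32).
    """
    rel8 = target - (at + 2)
    if -128 <= rel8 <= 127:
        return b"\xEB" + struct.pack("<b", rel8)
    rel32 = target - (at + 5)
    return b"\xE9" + struct.pack("<i", rel32)
```

A jump to itself, `encode_jmp(100, 100)`, fits the short form, while a jump of about a thousand bytes needs the five-byte encoding.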

The baseline compiler needs to be fast, and any technique solving these problems probably needs to be integrated into a whole; a typical case would be an automata-based peephole optimizer integrated into a platform-specific back-end with reasonable smarts (i.e. a LIR pipeline for a very low-level LIR).  Cretonne, I'm looking at you as a possible solution, or as part of a possible solution.
Cretonne does have a very low-level LIR, and it will eventually be able to perform all of these optimizations, so it could be used as a back-end for the baseline compiler. Cretonne's pipeline boils down to:

1. Legalize any instructions not supported by the target platform. For example, expand i64 arithmetic on 32-bit targets.
2. Run these peephole optimizations.
3. Allocate registers.
4. Emit binary machine code.
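
To make step 1 concrete, here is a sketch of what expanding an i64 add into 32-bit operations computes. It is not Cretonne's legalizer; the function name and the (lo, hi) pair representation are assumptions for illustration. On real hardware the carry would typically come from an add / add-with-carry instruction pair rather than a comparison.

```python
MASK32 = 0xFFFFFFFF


def add_i64_as_i32_pairs(a_lo, a_hi, b_lo, b_hi):
    """Add two 64-bit values given as unsigned 32-bit (lo, hi) halves."""
    lo = (a_lo + b_lo) & MASK32
    carry = 1 if lo < a_lo else 0       # the low add wrapped around
    hi = (a_hi + b_hi + carry) & MASK32
    return lo, hi
```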

So the question becomes, what is the difference between running Cretonne as a second-tier compiler and running it as a backend for the baseline compiler? Some differences could be:

- Second-tier Cretonne will convert all wasm locals to SSA-form registers, and the register allocator will spill as needed. Baseline keeps each local in its own designated stack slot, which makes register allocation a lot faster because live ranges are short and there are no PHIs.
- Second-tier Cretonne will optimize heap bounds checking.
- Second-tier Cretonne will hack on the CFG and improve basic block layout, ideally with profile feedback.

We need more experimental data on Cretonne's performance before we can tell if this is a good idea.
See Also: → 1350933
URL: 1350993
See Also: 1350933
URL: 1350993
See Also: → 1350993
Per policy at https://wiki.mozilla.org/Bug_Triage/Projects/Bug_Handling/Bug_Husbandry#Inactive_Bugs. If this bug is not an enhancement request or a bug not present in a supported release of Firefox, then it may be reopened.
Status: NEW → RESOLVED
Closed: Last year
Resolution: --- → INACTIVE
Status: RESOLVED → REOPENED
Resolution: INACTIVE → ---
Component: JavaScript Engine: JIT → Javascript: Web Assembly
Type: defect → task