Open Bug 1443082 Opened 3 years ago Updated 4 months ago

ARM64: 25%-30% of wasm baseline compilation time spent in vixl assembler


(Core :: Javascript: WebAssembly, enhancement, P5)





(Reporter: lth, Unassigned)


(Blocks 2 open bugs)


(Whiteboard: [arm64:m4])

Here are the top vixl entries (from a perf recording of a run that compiles Tanks on a single core):

   8.25% vixl::MozBaseAssembler::Emit
   3.85% vixl::MacroAssembler::LoadStoreMacro
   3.05% vixl::Assembler::LoadStoreMemOperand
   2.03% vixl::MacroAssembler::AddSubMacro
   1.84% vixl::MozBaseAssembler::Emit
   1.66% vixl::MemOperand::IsImmediateOffset
   1.27% vixl::Assembler::AddSub
   1.26% vixl::Assembler::LoadStore
   0.93% vixl::Operand::IsImmediate
   0.87% vixl::MemOperand::IsPostIndex
   0.84% vixl::MacroAssembler::Ldr
   0.69% vixl::MemOperand::MemOperand
   0.64% vixl::MacroAssembler::B
   0.62% vixl::MozBaseAssembler::LinkAndGetOffsetTo
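
As a sanity check against the 25%-30% figure in the summary, the entries listed above can be added up. This is only a sketch: the array below is a hand transcription of the listing (in hundredths of a percent, so the arithmetic stays exact), and the function name is made up for illustration. Any vixl entries below the cutoff of the listing are not included, so the true total may be somewhat higher.

```cpp
// Percentages from the profile listing, in hundredths of a percent.
// Hand-transcribed; the name kListedVixlEntries is hypothetical.
constexpr int kListedVixlEntries[] = {
    825, 385, 305, 203, 184, 166, 127,
    126,  93,  87,  84,  69,  64,  62,
};

// Sum of the listed entries, still in hundredths of a percent.
constexpr int listedVixlHundredths() {
    int total = 0;
    for (int v : kListedVixlEntries) {
        total += v;
    }
    return total;
}

// The listed entries alone add up to 27.80%, inside the 25%-30% range.
static_assert(listedVixlHundredths() == 2780, "listed entries sum to 27.80%");
```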

Clearly the assembler cannot take zero time, but cutting its time even by half is probably worthwhile, e.g. by judicious inlining, or by bypassing the MacroAssembler and calling lower-level assembler functions directly in very hot cases where we don't need the MacroAssembler's generality. We could hope that PGO does this for us (if indeed we use PGO on this platform), but if it's easy to do manually we should.
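
To make the "bypass the MacroAssembler" idea concrete, here is a minimal toy sketch. It is not vixl code: `ToyAssembler`, `LoadGeneral`, `ldrUnsignedOffset`, and `loadFrameSlotFast` are all hypothetical names, and the model covers only a 64-bit load with a small non-negative offset. It shows a general macro-style entry point that classifies its operand on every call, next to a hot-path helper that skips the classification because the caller already knows the offset is encodable.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Toy model only: these names are hypothetical and are not vixl's real API.
struct ToyAssembler {
    std::vector<uint32_t> code;  // emitted instruction words
    int operandChecks = 0;       // operand-classification checks performed

    void emit(uint32_t insn) { code.push_back(insn); }

    // Low-level emitter for LDR Xt, [Xn, #offset] (64-bit, unsigned scaled
    // offset). The caller guarantees that the offset is encodable.
    void ldrUnsignedOffset(unsigned rt, unsigned rn, uint32_t offset) {
        emit(0xF9400000u | ((offset / 8) << 10) | (rn << 5) | rt);
    }

    // Low-level emitter for LDR Xt, [Xn, Xm].
    void ldrRegisterOffset(unsigned rt, unsigned rn, unsigned rm) {
        emit(0xF8606800u | (rm << 16) | (rn << 5) | rt);
    }

    // Low-level emitter for MOVZ Xd, #imm16.
    void movz(unsigned rd, uint32_t imm16) {
        emit(0xD2800000u | (imm16 << 5) | rd);
    }

    // True if the offset fits the scaled 12-bit unsigned immediate form.
    bool isEncodableOffset(int64_t offset) {
        ++operandChecks;
        return offset >= 0 && offset % 8 == 0 && offset / 8 < 4096;
    }

    // General "macro" load: classifies the offset on every call and falls
    // back to a scratch register when it does not fit. A real assembler
    // handles more cases (e.g. unscaled offsets); this toy does not.
    void LoadGeneral(unsigned rt, unsigned rn, int64_t offset, unsigned scratch) {
        if (isEncodableOffset(offset)) {
            ldrUnsignedOffset(rt, rn, static_cast<uint32_t>(offset));
            return;
        }
        assert(offset >= 0 && offset < 65536 && "toy fallback: 16-bit offsets only");
        movz(scratch, static_cast<uint32_t>(offset));
        ldrRegisterOffset(rt, rn, scratch);
    }
};

// Hot-path helper for a caller that already knows the offset is a small
// aligned constant (for example a fixed frame slot), so it skips the checks.
inline void loadFrameSlotFast(ToyAssembler& a, unsigned rt, unsigned base,
                              uint32_t slotIndex) {
    a.ldrUnsignedOffset(rt, base, slotIndex * 8);
}
```

In this toy, `LoadGeneral(0, 1, 16, 16)` and `loadFrameSlotFast(a, 0, 1, 2)` emit the same single instruction word, but only the general path pays for the operand check. Whether the saving is significant in the real assembler would have to be measured.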
A quick experiment compiling ZenGarden (which is much larger than Tanks) gives a very similar profile.
We'll get to this if it becomes necessary.

