Emunittest WebAssembly Mandelbrot benchmark is +32.9% slower in Firefox compared to Chrome
Categories
(Core :: JavaScript Engine: JIT, defect, P3)
Tracking
()
People
(Reporter: jujjyl, Unassigned)
References
(Blocks 3 open bugs)
Details
(Keywords: triage-deferred)
Attachments
(3 files)
Comment 15•4 years ago
I've been investigating this issue: most of the execution time is spent within a loop (no surprise here). As per the paper "Linear Scan Register Allocation on SSA Form", the vregs within the loop are live over the entire loop.
Within the loop, there is a WasmCall instruction. According to LWasmCall::isCallPreserved, all physical registers are clobbered by the WasmCall. Hence assigning a register to the vregs fails (indicated by the log message "rdx collides with fixed use ...") and they are added to spilledBundles.
Later, tryAllocatingRegistersForSpillBundles attempts to find a physical register but again fails due to the WasmCall. As a consequence, the vregs within the loop are spilled to the stack.
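The failure mode described above can be illustrated with a toy interval model. This is a minimal sketch, not SpiderMonkey's allocator: the register list, the positions, and the helper names (`fixed_uses_for_calls`, `try_allocate`) are assumptions made for illustration. A call is modeled as a one-position fixed use on every physical register, so any live range that spans the call collides on all of them.

```python
# Toy model only: register names, positions, and helpers are hypothetical.
REGS = ["rax", "rcx", "rdx"]


def overlaps(a, b):
    """True if the half-open ranges a = [s, e) and b = [s, e) intersect."""
    return a[0] < b[1] and b[0] < a[1]


def fixed_uses_for_calls(call_positions, regs):
    """A call that preserves nothing pins every register at [p, p + 1)."""
    return {reg: [(p, p + 1) for p in call_positions] for reg in regs}


def try_allocate(live_range, fixed):
    """Return the first register with no colliding fixed use, else None."""
    for reg, ranges in fixed.items():
        if not any(overlaps(live_range, f) for f in ranges):
            return reg
    return None  # every register collides: the range has to be spilled


fixed = fixed_uses_for_calls([197], REGS)
loop_vreg = (190, 210)   # live across the whole loop, including the call
short_vreg = (190, 197)  # dies before the call

print(try_allocate(loop_vreg, fixed))   # None
print(try_allocate(short_vreg, fixed))  # rax
```

In this model the loop-carried range fails on every register for the same reason, which matches the "collides with fixed use" pattern in the log.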
My thoughts towards alleviating the issue: the lifetime of a vreg might require a hole during a WasmCall.
Furthermore, if LWasmCall::isCallPreserved had information about callee-saved registers (depending on the ABI of the OS/arch), at least some of the vregs wouldn't need to be spilled.
If this sounds about right and someone is up to mentoring this bug I'd give it a try.
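Both ideas can be sketched in a toy interval model. Everything here is an assumption for illustration (the register sets, the positions, and the helpers `split_around` and `try_allocate`); it is not how SpiderMonkey represents live ranges. The first case cuts a hole into the live range at the call, so the pieces before and after it can each get a register and the value only needs to be saved and restored around the call. The second case assumes the call preserves one register, so the whole range can stay in it.

```python
# Toy model only: register sets, positions, and helpers are hypothetical.
def overlaps(a, b):
    """True if the half-open ranges a = [s, e) and b = [s, e) intersect."""
    return a[0] < b[1] and b[0] < a[1]


def try_allocate(live_range, fixed):
    """Return the first register with no colliding fixed use, else None."""
    for reg, ranges in fixed.items():
        if not any(overlaps(live_range, f) for f in ranges):
            return reg
    return None


def split_around(live_range, hole):
    """Cut `hole` out of `live_range` and return the non-empty pieces."""
    start, end = live_range
    pieces = [(start, min(end, hole[0])), (max(start, hole[1]), end)]
    return [p for p in pieces if p[0] < p[1]]


CALL = (197, 198)
vreg = (190, 210)

# Idea 1: the call clobbers everything, but the live range gets a hole at the call.
clobber_all = {"rax": [CALL], "rbx": [CALL]}
pieces = split_around(vreg, CALL)
print([(p, try_allocate(p, clobber_all)) for p in pieces])
# [((190, 197), 'rax'), ((198, 210), 'rax')]

# Idea 2: the call is assumed to preserve rbx, so only rax has a fixed use.
clobber_caller_saved = {"rax": [CALL], "rbx": []}
print(try_allocate(vreg, clobber_caller_saved))  # rbx
```

Either way the accesses inside the loop body would use a register, and memory traffic would be limited to the instructions around the call.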
Comment 16•4 years ago
Thanks for the investigation, sounds very plausible. This is probably pretty hard, and Julian is working on this very problem on a different bug, so let's wait for him to finish that first.
Comment 18•4 years ago
It is unfortunately indeed the case that our regalloc generates an utterly
wretched piece of code. The hottest two blocks, representing 87% of the
executed instructions, are below. They are mostly reloads and spills.
=-=-=-=-=-=-=-=-=-=-=-=-=-= begin SB rank 0 =-=-=-=-=-=-=-=-=-=-=-=-=-=
0: (167061703 66.33%) 167061703 66.33% 0x2b767065e98d
==== SB 68386 (evchecks 0) [tid 0] 0x2b767065e98d UNKNOWN_FUNCTION UNKNOWN_OBJECT+0x0
0x2B767065E98D: movl 44(%rsp),%eax
0x2B767065E991: addl 88(%rsp),%eax
0x2B767065E995: xorps %xmm0,%xmm0
0x2B767065E998: cvtsi2ss %eax,%xmm0
0x2B767065E99C: mulss 76(%rsp),%xmm0
0x2B767065E9A2: addss 84(%rsp),%xmm0
0x2B767065E9A8: movl 44(%rsp),%eax
0x2B767065E9AC: shll 0x2:I8, %eax
0x2B767065E9AF: movl %eax,40(%rsp)
0x2B767065E9B3: movl 60(%rsp),%eax
0x2B767065E9B7: addl 40(%rsp),%eax
0x2B767065E9BB: movl %eax,36(%rsp)
0x2B767065E9BF: movss (%r15,%rax),%xmm1
0x2B767065E9C5: movsd 592(%r14),%xmm2
0x2B767065E9CE: cvtsd2ss %xmm2,%xmm2
0x2B767065E9D2: ucomiss %xmm2,%xmm1
0x2B767065E9D5: jp-32 0x2B767065E9E1
=-=-=-=-=-=-=-=-=-=-=-=-=-= end SB rank 0 =-=-=-=-=-=-=-=-=-=-=-=-=-=
=-=-=-=-=-=-=-=-=-=-=-=-=-= begin SB rank 1 =-=-=-=-=-=-=-=-=-=-=-=-=-=
1: (216101528 85.80%) 49039825 19.47% 0x2b767065ec71
==== SB 68386 (evchecks 0) [tid 0] 0x2b767065ec71 UNKNOWN_FUNCTION UNKNOWN_OBJECT+0x0
0x2B767065EC71: movl 44(%rsp),%eax
0x2B767065EC75: addl $1, %eax
0x2B767065EC78: movl %eax,44(%rsp)
0x2B767065EC7C: cmpl 48(%rbp),%eax
0x2B767065EC7F: jne-32 0x2B767065E980
=-=-=-=-=-=-=-=-=-=-=-=-=-= end SB rank 1 =-=-=-=-=-=-=-=-=-=-=-=-=-=
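To put a rough number on "mostly reloads and spills", the two blocks above can be tallied by how many instructions touch an %rsp-relative stack slot. The sketch below is only an illustration: the instruction text is copied from the listing with the addresses dropped, and the classification rule (a trailing `(%rsp)` operand counts as a store, any other `(%rsp)` operand as a load) is a simplifying assumption for this AT&T-syntax output.

```python
# Instructions copied from the two hot blocks above, addresses removed.
RANK0 = """
movl 44(%rsp),%eax
addl 88(%rsp),%eax
xorps %xmm0,%xmm0
cvtsi2ss %eax,%xmm0
mulss 76(%rsp),%xmm0
addss 84(%rsp),%xmm0
movl 44(%rsp),%eax
shll 0x2:I8, %eax
movl %eax,40(%rsp)
movl 60(%rsp),%eax
addl 40(%rsp),%eax
movl %eax,36(%rsp)
movss (%r15,%rax),%xmm1
movsd 592(%r14),%xmm2
cvtsd2ss %xmm2,%xmm2
ucomiss %xmm2,%xmm1
jp-32 0x2B767065E9E1
"""

RANK1 = """
movl 44(%rsp),%eax
addl $1, %eax
movl %eax,44(%rsp)
cmpl 48(%rbp),%eax
jne-32 0x2B767065E980
"""


def tally(block):
    """Return (instructions, stack loads, stack stores) for one block."""
    insns = [line.strip() for line in block.strip().splitlines()]
    loads = stores = 0
    for insn in insns:
        if insn.endswith("(%rsp)"):   # destination operand is a stack slot
            stores += 1
        elif "(%rsp)" in insn:        # source operand is a stack slot
            loads += 1
    return len(insns), loads, stores


print(tally(RANK0))  # (17, 7, 2)
print(tally(RANK1))  # (5, 1, 1)
```

By this count, 9 of the 17 instructions in the hottest block read or write a stack slot, and the loop counter in the second block is loaded from and stored back to 44(%rsp) on every iteration.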
Comment 19•4 years ago
Do you have the ability to check this on ARM64? I'm curious whether you're seeing good regalloc there, as we saw on the other program.
Comment 20•4 years ago
I built the js shell with ac_add_options --target=aarch64-linux-android and ran the mandelbrot benchmark on a Pixel device. The regalloc logs contain plenty of "... collides with fixed use v0 [197,198)" as well. I'll upload the log and ion.json files so you can see for yourselves.
Comment 23•4 years ago
Thanks. Nothing immediately actionable here, we'll wait for other investigations to complete I guess.