Open
Bug 1062544
Opened 11 years ago
Updated 3 years ago
DOMMatrix runtime performance is much lower on windows than other platforms when NaNs are involved, due to x87 instructions in gfx::Matrix::operator*
Categories
(Core :: Graphics, defect)
NEW
People
(Reporter: cabanier, Unassigned)
Details
Run http://jsperf.com/dommatrix-perf on different platforms.
On all platforms, 'Native' and 'JS equivalent' have similar performance, except on Windows, where 'Native' is 3 to 4 times slower.
Updated•11 years ago
Component: JavaScript Engine → JavaScript Engine: JIT
Updated•11 years ago
Summary: DOMMatrix runtime performance is much lower on windows that other platforms → DOMMatrix runtime performance is much lower on windows than other platforms
Comment 1•11 years ago
I seriously doubt this is a jit issue. Need a profile on Windows.
Kyle, do you have a profiling setup, or know who does?
Flags: needinfo?(khuey)
Comment 2•11 years ago
dmajor said he will do profiling after lunch.
Couldn't this be a JIT issue, especially if people on Mac and Linux use 64-bit builds and people on Windows use 32-bit builds? We've seen slower perf on Windows quite often.
But better to wait for profiles.
Comment 3•11 years ago
I tested a 32-bit build on Mac and saw pretty much the same numbers as in a 64-bit build. Sorry, should have mentioned this in comment 1.
If dmajor can't profile this, bent can.
Flags: needinfo?(khuey)
Comment 5•11 years ago
Some more data:
1) The original testcase ended up with a lot of NaNs. http://jsperf.com/dommatrix-perf/7 doesn't have the same issue and has different numbers, but 'Native' is still a bit slower relative to JS on Windows than on other platforms.
2) dmajor's profile shows time mostly taken in mozilla::gfx::Matrix::operator* and dmajor was kind enough to pastebin the codegen from MSVC for that function. It looks sort of like this:
157 5b3e98e6 d902 fld dword ptr [edx]
157 5b3e98e8 8b4508 mov eax,dword ptr [ebp+8]
157 5b3e98eb d809 fmul dword ptr [ecx]
157 5b3e98ed d94208 fld dword ptr [edx+8]
157 5b3e98f0 d84904 fmul dword ptr [ecx+4]
157 5b3e98f3 dec1 faddp st(1),st
etc.
For comparison, here's the same function on Mac (64-bit, but I bet 32-bit is the same):
0x0000000103d22a2b <_ZNK7mozilla3gfx6MatrixmlERKS1_+23>: movss 0x4(%rbx),%xmm1
0x0000000103d22a30 <_ZNK7mozilla3gfx6MatrixmlERKS1_+28>: movss -0x18(%rbp),%xmm0
0x0000000103d22a35 <_ZNK7mozilla3gfx6MatrixmlERKS1_+33>: movaps %xmm4,%xmm2
0x0000000103d22a38 <_ZNK7mozilla3gfx6MatrixmlERKS1_+36>: mulss %xmm0,%xmm2
0x0000000103d22a3c <_ZNK7mozilla3gfx6MatrixmlERKS1_+40>: addss %xmm5,%xmm2
0x0000000103d22a40 <_ZNK7mozilla3gfx6MatrixmlERKS1_+44>: movss 0xc(%rbx),%xmm11
etc.
The point is that on Mac (even in 32-bit mode), on linux64, and in our JIT, we know we can use SSE2 instructions for floating-point math, but MSVC with our compile options uses x87 instructions.
Apparently x87 instructions on Intel hardware are really slow when dealing with non-finite floats.
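For reference, both listings above implement the same short chains of multiply-adds. Here is a minimal sketch of that arithmetic, assuming a 2D affine matrix with six float fields. The struct, the field names, and the `Multiply` and `CountNaNs` helpers are hypothetical stand-ins, not the actual gfx::Matrix source. It shows why a single NaN input matters: after one multiply it taints a whole row, and after a second multiply every element is NaN, so every later operation runs on non-finite values.

```cpp
#include <cmath>
#include <limits>

// Hypothetical stand-in for a 2D affine matrix (not the real gfx::Matrix).
struct Matrix2D {
  float _11, _12;
  float _21, _22;
  float _31, _32;
};

// Scalar multiply: each output element is two multiplies and one or two adds,
// which matches the fld/fmul/faddp and mulss/addss patterns in the listings.
Matrix2D Multiply(const Matrix2D& a, const Matrix2D& b) {
  Matrix2D r;
  r._11 = a._11 * b._11 + a._12 * b._21;
  r._12 = a._11 * b._12 + a._12 * b._22;
  r._21 = a._21 * b._11 + a._22 * b._21;
  r._22 = a._21 * b._12 + a._22 * b._22;
  r._31 = a._31 * b._11 + a._32 * b._21 + b._31;
  r._32 = a._31 * b._12 + a._32 * b._22 + b._32;
  return r;
}

// Counts how many of the six elements are NaN.
int CountNaNs(const Matrix2D& m) {
  const float values[6] = {m._11, m._12, m._21, m._22, m._31, m._32};
  int count = 0;
  for (float v : values) {
    if (std::isnan(v)) {
      ++count;
    }
  }
  return count;
}
```

Starting from an identity matrix with only `_11` set to NaN, one multiply by the identity leaves two NaN elements, and multiplying that result by itself leaves all six NaN, because NaN times zero is still NaN.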
Our options here, as far as I can tell, are basically:
1) Have a runtime-detected version of operator* that uses SSE2 stuff.
2) Ignore the issue because this NaN business should be rare and in any case we want to move people to 64-bit builds.
None of this has anything to do with the JIT.
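Option 1 could look roughly like the sketch below. It is only an illustration under stated assumptions: `Matrix2D`, `MultiplyScalar`, `MultiplySSE2`, `CpuHasSSE2`, and `PickMultiply` are hypothetical names, not existing gfx code. The vector path is only compiled when the compiler already targets SSE2, so a real 32-bit MSVC build would have to compile that function with SSE2 enabled separately and query CPUID itself. The packed single-precision intrinsics used here are technically SSE; `<emmintrin.h>` pulls them in.

```cpp
// Compile the vector path only when the compiler targets SSE2 (assumption of this sketch).
#if defined(__SSE2__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 2)
#include <emmintrin.h>
#define SKETCH_HAVE_SSE2 1
#else
#define SKETCH_HAVE_SSE2 0
#endif

// Hypothetical stand-in for a 2D affine matrix (not the real gfx::Matrix).
struct Matrix2D {
  float _11, _12, _21, _22, _31, _32;
};

// Portable fallback, same arithmetic as the scalar operator*.
Matrix2D MultiplyScalar(const Matrix2D& a, const Matrix2D& b) {
  Matrix2D r;
  r._11 = a._11 * b._11 + a._12 * b._21;
  r._12 = a._11 * b._12 + a._12 * b._22;
  r._21 = a._21 * b._11 + a._22 * b._21;
  r._22 = a._21 * b._12 + a._22 * b._22;
  r._31 = a._31 * b._11 + a._32 * b._21 + b._31;
  r._32 = a._31 * b._12 + a._32 * b._22 + b._32;
  return r;
}

#if SKETCH_HAVE_SSE2
// Vector path: each output row is a_i1 * (row 1 of b) + a_i2 * (row 2 of b),
// plus row 3 of b for the translation row. Only lanes 0 and 1 are used.
Matrix2D MultiplySSE2(const Matrix2D& a, const Matrix2D& b) {
  const __m128 b1 = _mm_setr_ps(b._11, b._12, 0.0f, 0.0f);
  const __m128 b2 = _mm_setr_ps(b._21, b._22, 0.0f, 0.0f);
  const __m128 b3 = _mm_setr_ps(b._31, b._32, 0.0f, 0.0f);

  const __m128 r1 = _mm_add_ps(_mm_mul_ps(_mm_set1_ps(a._11), b1),
                               _mm_mul_ps(_mm_set1_ps(a._12), b2));
  const __m128 r2 = _mm_add_ps(_mm_mul_ps(_mm_set1_ps(a._21), b1),
                               _mm_mul_ps(_mm_set1_ps(a._22), b2));
  const __m128 r3 = _mm_add_ps(
      _mm_add_ps(_mm_mul_ps(_mm_set1_ps(a._31), b1),
                 _mm_mul_ps(_mm_set1_ps(a._32), b2)),
      b3);

  float row[4];
  Matrix2D r;
  _mm_storeu_ps(row, r1);
  r._11 = row[0];
  r._12 = row[1];
  _mm_storeu_ps(row, r2);
  r._21 = row[0];
  r._22 = row[1];
  _mm_storeu_ps(row, r3);
  r._31 = row[0];
  r._32 = row[1];
  return r;
}
#endif

// Hypothetical detection hook. On GCC/Clang it asks the CPU at runtime;
// elsewhere it falls back to the compile-time answer.
bool CpuHasSSE2() {
#if SKETCH_HAVE_SSE2 && defined(__GNUC__) && !defined(_MSC_VER)
  return __builtin_cpu_supports("sse2") != 0;
#else
  return SKETCH_HAVE_SSE2 != 0;
#endif
}

using MultiplyFn = Matrix2D (*)(const Matrix2D&, const Matrix2D&);

// Chooses the implementation once; callers keep the returned pointer.
MultiplyFn PickMultiply() {
#if SKETCH_HAVE_SSE2
  if (CpuHasSSE2()) {
    return &MultiplySSE2;
  }
#endif
  return &MultiplyScalar;
}
```

A caller would pick the function once at startup, for example `static const MultiplyFn multiply = PickMultiply();`, so the detection cost is not paid on every multiply. Whether the extra indirect call is worth it for such a small function would need measuring.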
Component: JavaScript Engine: JIT → Graphics
Summary: DOMMatrix runtime performance is much lower on windows than other platforms → DOMMatrix runtime performance is much lower on windows than other platforms when NaNs are involved, due to x87 instructions in gfx::Matrix::operator*
Comment 6•11 years ago
Can we just re-implement DOMMatrix as a JS-implemented WebIDL component, and let the JIT take care of things instead?
Comment 7•11 years ago
The binding part will get much slower, then, sadly. The call from C++ to JS is not that cheap. :(
Comment 8•11 years ago
(In reply to Boris Zbarsky [:bz] from comment #7)
> The binding part will get much slower, then, sadly. The call from C++ to JS
> is not that cheap. :(
Would that be for live objects, where the C++ side would query/change the JS object?
Updated•3 years ago
Severity: normal → S3