Closed
Bug 707644
Opened 13 years ago
Closed 13 years ago
f2i appears very poorly implemented
Categories
(Tamarin Graveyard :: Baseline JIT (CodegenLIR), defect, P2)
Tracking
(Not tracked)
VERIFIED
FIXED
Q2 12 - Cyril
People
(Reporter: lhansen, Assigned: edwsmith)
References
Details
Test case:
package
{
function f(x:float): int
{
return x;
}
f(0.0f);
}
Generated code:
immi2 = immi 4 (in reg? -1)
ldf1 = ldf.o/c parami3[4] (in reg? -1)
0x671fb2 xorpd xmm0,xmm0
0x671fb6 movss xmm0,4(eax)
floatToAtom1 = calli.none #floatToAtom ( immi1/*0x516020 core*/ ldf1 )
(in reg? -1)
0x671fbb sub esp,8
0x671fbe sub esp,4
0x671fc1 movss 0(esp),xmm0
0x671fc6 push 5333024
0x671fcb call floatToAtom
0x671fd0 add esp,16
integer1 = calli.none #integer ( floatToAtom1 ) (in reg? -1)
0x671fd3 sub esp,12
0x671fd6 push eax
0x671fd7 call integer
0x671fdc add esp,16
ldi4 = ldi.o methodFrame[0] (in reg? -1)
0x671fdf mov ecx,-12(ebp)
sti.o immi1/*0x516020 core*/[44] = ldi4 (in reg? -1)
0x671fe2 mov 5333068(0),ecx
reti integer1 (in reg? -1)
0x671fe8 mov esp,ebp
0x671fea pop ebp
0x671feb ret
livei methodFrame (in reg? -1)
livei parami1 (in reg? -1)
livei immi1/*0x516020 core*/ (in reg? -1)
livei immi2/*4*/ (in reg? -1)
If I read that correctly we first call floatToAtom() to box the atom, then integer() to convert the atom to an int. Given the guaranteed boxing that is fairly bad.
In contrast, for a Number -> int conversion we call doubleToInt32 directly (and then only if the argument is not an integer value).
Assignee | ||
Updated•13 years ago
|
Assignee: nobody → edwsmith
Assignee | ||
Comment 1•13 years ago
|
||
Doing a fast f2i (CVTTSS2SI), then taking a slow path for inexact results is straightfoward. in the slow path, is doubleToInt32((double)f) okay, or should we develop an optimized floatToInt32() helper? (a lot of bit banging happens in doubleToInt32). I'm inclined to just reuse the code for double instead of rewrite it.
Reporter | ||
Comment 2•13 years ago
|
||
For now it's probably OK to reuse the double code but you should file a bug to follow up on it unless you have good evidence that the performance difference between that and the "optimal" code is close. Which it might be, given that it's the slow path.
Assignee | ||
Comment 3•13 years ago
|
||
It appears that promoting to float, then using all the existing double->int machenery, is the fastest thing to do at least on x86-32:
float->double->int (CVTTSD2SI fastpath, doubleToInt32 slowpath)
test avg avg %diff
Dir: asmicro/
float-toint-1 264.2 1591.4 502.4 502.4 // int-valued floats
float-toint-2 265.2 1601.4 503.8 503.8 // fractional floats
float-toint-3 207.8 437.6 110.6 110.6 // out-of-range floats
float->int (CVTTSS2SI fastpath, doubleToInt32(float->double) slowpath)
test avg avg %diff
Dir: asmicro/
float-toint-1 265.7 1272.7 379.1 379.1
float-toint-2 265.2 1254.7 373.1 373.1
float-toint-3 206.6 62.3 -69.9 -69.9
Notice how along the fast *and* slow path, its faster on x86 to use doubles.
Assignee | ||
Comment 4•13 years ago
|
||
tr-float:
changeset: 7032:16fc97ad3fc6
tag: tip
user: Edwin Smith <edwsmith@adobe.com>
date: Mon Dec 05 12:55:34 2011 -0500
summary: Bug 707644 - f2i appears very poorly implemented
This patch implements f2i as AvmCore::integer_d(double(f)), which
tested fastest on my Core i7 notebook with x83-32. I opened Bug 707738
to investigate why CVTTSS2SI wasn't faster; in the mean time this establishes
a toe-hold.
Assignee | ||
Updated•13 years ago
|
Status: NEW → RESOLVED
Closed: 13 years ago
Resolution: --- → FIXED
Updated•13 years ago
|
Status: RESOLVED → VERIFIED
You need to log in
before you can comment on or make changes to this bug.
Description
•