Closed
Bug 600585
Opened 15 years ago
Closed 15 years ago
inline fastpath for integer_d_sse2
Categories
(Tamarin Graveyard :: Verifier, defect)
Tamarin Graveyard
Verifier
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: wsharp, Assigned: wsharp)
References
Details
(Whiteboard: PACMAN)
Attachments
(2 files, 3 obsolete files)
|
91.82 KB,
text/plain
|
Details | |
|
11.11 KB,
patch
|
wmaddox
:
review+
edwsmith
:
superreview+
|
Details | Diff | Splinter Review |
An experiment with inlining the fastpath for integer_d_ss2 shows a 40% improvement in v8.5\typed\crypto.as. We inline a asm_d2i and only call our helper if the result comes back 0x80000000.
| Assignee | ||
Updated•15 years ago
|
Whiteboard: PACMAN
| Assignee | ||
Comment 1•15 years ago
|
||
Please ignore the evilness of me putting a CodegenLIR ptr inside of Specializer. This will be cleaned up under a separate refactoring bug to move the integer_d specialization logic into CodegenLIR itself. Please review the logic inside of optimize_integer_d for performing an inline CVTSD2SI with failure check and calling out to the integer_d_sse2 helper as a backup.
As I type this I realize that we should really call doubleToInt32 directly as the helper instead of integer_d_sse which will just repeat the CVTSD2SI call. I will clean this up for the next patch.
Attachment #479491 -
Flags: superreview?(edwsmith)
Attachment #479491 -
Flags: review?(wmaddox)
Updated•15 years ago
|
Attachment #479491 -
Flags: superreview?(edwsmith)
Updated•15 years ago
|
Assignee: nobody → wsharp
Comment 2•15 years ago
|
||
(In reply to comment #1)
> Created attachment 479491 [details] [diff] [review]
> inline double->int, check for 0x80000000 failure
>
> Please ignore the evilness of me putting a CodegenLIR ptr inside of
> Specializer. This will be cleaned up under a separate refactoring bug to move
> the integer_d specialization logic into CodegenLIR itself. Please review the
> logic inside of optimize_integer_d for performing an inline CVTSD2SI with
> failure check and calling out to the integer_d_sse2 helper as a backup.
>
> As I type this I realize that we should really call doubleToInt32 directly as
> the helper instead of integer_d_sse which will just repeat the CVTSD2SI call.
> I will clean this up for the next patch.
The logic looks sound, and the results are very promising! I won't give this an R+ because it looks like you've already got another patch coming.
Local labels in inline expanders can be stack allocated.
Write
CodegenLabel skip_label("goodint");
rather than
CodegenLabel &skip_label = createLabel("goodint");
See bug 562744 for some nontrivial examples of speculative inlining with inserted labels and branches.
| Assignee | ||
Comment 3•15 years ago
|
||
A 1 at the end of the line signifies that we simplified it to an integer expression.
Some opportunities for integer optimization
5289 doubleConst
726 (i2d(LIR_ldi) / intConst) (integer divide is equivalent (what about zero?)
345 LIR_negd (can we do a negi instead?)
312 intConst (weird that we get in here with an integer constant already)
219 ((ui2d(LIR_ldi) + ui2d(LIR_calli)) + intConst) (three adds wont overflow)
| Assignee | ||
Comment 4•15 years ago
|
||
Clean up this function, add double constant support.
Nanojit requires some updating since d2i is rounding and we need it to truncate.
var n:Number = 1.9
var i:int = n;
// i needs to be 1 not 2.
Removed some unused integer64 routines.
Attachment #479491 -
Attachment is obsolete: true
Attachment #479491 -
Flags: review?(wmaddox)
| Assignee | ||
Updated•15 years ago
|
| Assignee | ||
Comment 5•15 years ago
|
||
1. remove our AvmCore::integer64* calls. These are no longer required.
2 Clean up coerceNumberToInt. Add support for double constant coercion. Add SSE support for inlining our integer_d_sse2 logic. The #ifdefs are ugly but we need to support SSE_ALWAYS combined with our i386_sse2 flag check.
3. Add CVTTSD2SI support to nanojit. This is bug 601135 and required for our inline double to integer coercion to match existing behavior.
Attachment #479906 -
Attachment is obsolete: true
Attachment #480123 -
Flags: superreview?(edwsmith)
Attachment #480123 -
Flags: review?(wmaddox)
Comment 6•15 years ago
|
||
Comment on attachment 480123 [details] [diff] [review]
finalize code
This should be split this into two separate patches, one for the NanoJIT d2i changes, and the other for the Tamarin changes proper. The NanoJIT changes should be separately reviewed for bug 601135 and pushed prior to this patch. Other issues with the NanoJIT patch have already been much discussed in bug 601135. Please post a rebased Tamarin-only patch, and I'll R+ it, as the Tamarin changes look good.
Nits:
I would factor out the LIR_calld case in CodegenLIR::coerceNumberToInt().
We are specializing a call in a context that will be converted to integer.
There will likely be other cases, and this seems a good point to break up
the length of coerceNumberToInt().
I prefer to declare labels, e.g., 'CodegenLabel skip_label("goodint") at the beginning, rather than just before first use, so that the calls to the code emitters are more directly suggestive of assembler notation for the generated code itself, being less cluttered. Example:
suspendCSE();
CodegenLabel skip_label("skip_label");
...
branchToLabel(LIR_jt, c, skip_label);
...
emitLabel(skip_label);
...
resumeCSE();
return result;
Attachment #480123 -
Flags: review?(wmaddox) → review-
| Assignee | ||
Comment 7•15 years ago
|
||
Split nanojit part out which is bug 601135. If we go with d2i_z, this patch will change ever so slightly but should not require re-review. If we go with d2i for the nanojit bug, it should land unchanged.
Attachment #480123 -
Attachment is obsolete: true
Attachment #482398 -
Flags: superreview?(edwsmith)
Attachment #482398 -
Flags: review?(wmaddox)
Attachment #480123 -
Flags: superreview?(edwsmith)
Updated•15 years ago
|
Attachment #482398 -
Flags: review?(wmaddox) → review+
Comment 8•15 years ago
|
||
Comment on attachment 482398 [details] [diff] [review]
split nanojit part from patch
In imm2Int(), is there any hazard for imm = -0? (In C++, 0 == -0). I don't think so when it is called from the optimization code for LIR_muld, but what about addd/subd?
If you can think of a clean way to condense this, please do:
#if defined AVMPLUS_IA32 || defined AVMPLUS_AMD64
#ifndef AVMPLUS_SSE2_ALWAYS
SSE2_ONLY(if(core->config.njconfig.i386_sse2))
#endif // AVMPLUS_SSE2_ALWAYS
Regardless, please also add a comment ahead of the blob of inline code that summarizes lucidly what is being generated. For example:
// generate inline fast path for integer_d_sse2():
// if ((intResult = d2i(arg)) != 0xffffffff)
// intResult = doubleToInt32(arg)
Attachment #482398 -
Flags: superreview?(edwsmith) → superreview+
| Assignee | ||
Comment 9•15 years ago
|
||
The imm2Int logic has been in place for several revs (since the beginning?) so nothing has really changed there besides some organization. What is your worry? If we had a bug and were treated negative zero as integer zero in an expression that is stored in an integer, how could an error occur?
int = int *+- (negative zero)
One idea for the #ifdef ugliness would be something like this:
#if defined AVMPLUS_IA32 || defined AVMPLUS_AMD64
#ifdef AVMPLUS_SSE2_ALWAYS
# define SSE2_CHECK(...) if (1) __VA_ARGS__
#else
# define SSE2_CHECK(...) if (core->config.njconfig.i386_sse2) __VA_ARGS__
#endif
#endif
Then we could just do:
SSE2_CHECK(
{
}
)
Comment 10•15 years ago
|
||
(In reply to comment #9)
> The imm2Int logic has been in place for several revs (since the beginning?)
I hear ya, but given our want of improving things, old code is fair game when touched or called from new places.
> What is your worry? If we had a bug and were treated negative
> zero as integer zero in an expression that is stored in an integer,
> how could an error occur?
I homed in on this expression:
else if (imm->isImmD()) {
double val = imm->immD();
double cvt = (int32_t)val;
if (val == 0 || val == cvt)
^^^^^^^^
I think a -0 could slip through here and be treated as an int 0. You're probably right that it's fine; my goal in reviewing it was to ensure the
edge case had been thought about. its not clear it had (no comments or
asserts). For dormant code thats fine but if we're generating new calls
to imm2Int, then old invariants could be broken.
> int = int *+- (negative zero)
> One idea for the #ifdef ugliness would be something like this:
(snip). works for me. another idea is a few cleaned up ifdef switches,
such as a sinle #ifdef VMCFG_SSE2 (set in avmfeatures.as) along with a
runtime test. true that the runtime test is always true on mac or x64,
but optimizing that out isn't worth the eye bleed. if we *really* wanted
to optimize that out, then Config.use_sse2 could be conditionally compiled
as a static const. then runtime tests can be compiled in but optimized out.
| Assignee | ||
Comment 11•15 years ago
|
||
Status: NEW → RESOLVED
Closed: 15 years ago
Resolution: --- → FIXED
Updated•15 years ago
|
You need to log in
before you can comment on or make changes to this bug.
Description
•