Closed
Bug 477779
Opened 16 years ago
Closed 6 years ago
Add LIR_ned opcode and use for double->bool conversion
Categories
(Tamarin Graveyard :: Baseline JIT (CodegenLIR), defect, P3)
Tracking
(Not tracked)
RESOLVED
WONTFIX
Future
People
(Reporter: brbaker, Unassigned)
References
Details
(Whiteboard: PACMAN must-fix-candidate)
Attachments
(4 files, 2 obsolete files)
12.63 KB,
patch
|
n.nethercote
:
review+
|
Details | Diff | Splinter Review |
5.11 KB,
patch
|
Details | Diff | Splinter Review | |
15.33 KB,
patch
|
Details | Diff | Splinter Review | |
15.33 KB,
patch
|
n.nethercote
:
review-
|
Details | Diff | Splinter Review |
Comparing MIR vs LIR on windows 64 (only platform that has both 64bit versions of the jit) shows a performance increase across the board for LIR except for 2 testcases.
canaries/simpleflexapp.as
MIR: 327.8
LIR: 403.1
%diff: -18.7%
misc/primes.as
MIR: 12731.1
LIR: 14720.1
%diff: -13.5%
These results were repeatable every time I ran the tests.
Flags: flashplayer-triage+
Flags: flashplayer-qrb?
Comment 1•16 years ago
|
||
Out of curiosity, what's the typical performance increase on the other cases?
Comment 2•16 years ago
|
||
~5%
Assignee: nobody → edwsmith
Flags: flashplayer-qrb? → flashplayer-qrb+
Updated•15 years ago
|
Priority: P3 → P4
Updated•15 years ago
|
Status: NEW → ASSIGNED
Comment 3•15 years ago
|
||
for coerce_bool <double>, MIR emits:
MIR_ne(d, 0.0) => ucomisd, setne
and nanojit emits:
LIR_calli doubleToBool(d) => push (d), call (plus spillage)
the MIR_ne instruction has no analog in LIR; MIR_ne actually implements the ecma double->boolean conversion correctly in two instructions. It is not the opposite of MIR_eq, and is only used for double->boolean conversions!
Comment 4•15 years ago
|
||
implements slightly different (and simpler) logic than feq
fune feq
unordered 0 0
equal 0 1
less 1 0
greater 1 0
which means it's suitable for double->bool:
X fune(X,0.0)
NaN 0
+0.0 0
-0.0 0
otherwise 1
posting for prototyping and early feedback. LIR_fune might be an obscure instruction. LIR_f2b (float to bool) would be more consise and less obscure, but would also hardcode the operand of 0.0.
Net speedup is 11-12% on primes on X64 (mac)
Comment 5•15 years ago
|
||
testing simpleflexapp.abc:
$AVM -Ojit simpleflexapputil/avmglue_abc simpleflexapputil/hello_frame1_abc simpleflexapputil/hello_frame2_abc simpleflexapp.abc |grep metric
MIR: metric time 218
LIR: metric time 109
=> 50% speedup
testing primes.abc
$AVM -Ojit primes.abc
MIR: metric time 10452
LIR: metric time 9704
=> 7% speedup
Comment 6•15 years ago
|
||
In retrospect, the 'u' distinction between fune and feq seemed unimportant. all the float comparisons (including fne) return false for unordered values.
- added LIns::isFcmp() (write things once).
- SSE2 x86 and x64 implemented so far.
directional feedback? the main alternative to LIR_fne, would be LIR_f2b. (f2b(x) == fne(x,0.0)). f2b is the only thing we need for now, but it feels too special purpose and I fear we'd just need to add fne later. on the other hand, why be overly general? opinions welcome.
ARM note: on ARM, FCMPDZ apparently corresponds exactly to LIR_f2b, so perhaps f2b is not too special purpose. (ARM fne => FCMPD). on other cpu's, implementing f2b in the backend would require a temp scratch register to hold the value 0.0; fne is thus more friendly to register allocation by making the immediate 0.0 visible in the IR.
Attachment #417606 -
Attachment is obsolete: true
Attachment #417743 -
Flags: review?(nnethercote)
![]() |
||
Comment 7•15 years ago
|
||
Comment on attachment 417743 [details] [diff] [review]
adds new LIR_fne (float not equal), use it for double->bool
Looks good, although you'll have to split into NJ and TR parts.
I wonder if this will be useful for TM -- Gal, do we have a doubleToBool builtin? I couldn't find one.
Question: what is MIR?
Attachment #417743 -
Flags: review?(nnethercote) → review+
Comment 8•15 years ago
|
||
(In reply to comment #7)
> Question: what is MIR?
Err... a Soviet Space Station?
(Actually, I think it was an acronym for Macromedia Intermediate Representation... basically, the JIT that FP9/FP10 used, pre-nanojit)
Comment 9•15 years ago
|
||
I searched tracemonkey for a while, found js_ValueToBoolean() (which corresponds to tamarin's AvmCore::boolean()), but never did find the code that traces it. but i'd be surprised if it wasn't traced, somewhere.
it's apparently invoked by POP_BOOLEAN macro in jsinterp, if that's any help.
Comment 10•15 years ago
|
||
I think we do the conversion in some C code. Never really appeared in any benchmarks for us (I think, I might be wrong) as a hot spot.
Comment 11•15 years ago
|
||
(In reply to comment #7)
> (From update of attachment 417743 [details] [diff] [review])
> Looks good, although you'll have to split into NJ and TR parts.
yeah, i'm going to do the ppc32/64, arm, and sparc ports first. the asm_fcmp cleanup you did in the x86 backend made that one easy going.
the benchmark in question is tamarin-redux/test/performance/misc/primes.as, you could rip off the type annotations and try it in TM if interested.
if tracemonkey manages to make the mod in primes.as always be an int mod, then double->bool probably never shows up hot.
Comment 12•15 years ago
|
||
Yeah, we speculate aggressively on integer modulo. This might show up in the double slow path that gets hit in case of an overflow.
Comment 13•15 years ago
|
||
Suprisingly, this patch actually makes primes.as slower on PPC. Posting here so we don't lose it, but instead of spending more time on it, I plan to just add a flag and only use LIR_fne on backends that have it. initially, sparc and ppc won't have it, so we'll just use the function helper there.
on x86, x64, and ARM, LIR_fne provides about 10% speedup on primes.as
Comment 14•15 years ago
|
||
Using LIR_eq(cond,0) is equivalent to LIR_xor(cond,1), but ExprFilter knows how to optimize branches using eq(x,0) by inverting the sense of the branch, which results in better optimization across the board.
in the PPC backend, quad constants were not 8-aligned, this patch is a fix.
Attachment #419033 -
Flags: review?(rreitmai)
Updated•15 years ago
|
Attachment #419033 -
Flags: review?(stejohns)
Updated•15 years ago
|
Attachment #419033 -
Flags: review?(stejohns) → review+
Updated•15 years ago
|
Attachment #419033 -
Flags: review?(rreitmai) → review+
Comment 15•15 years ago
|
||
Comment on attachment 419033 [details] [diff] [review]
Fix alignment bug in PPC asm_quad(), use eq0 instead of xor for OP_not
pushed http://hg.mozilla.org/tamarin-redux/rev/1b59cdfc8946
Attachment #419033 -
Attachment is obsolete: true
Comment 16•15 years ago
|
||
nanojit part pushed
http://hg.mozilla.org/projects/nanojit-central/rev/5218b6ab33db
and merged to tamarin-redux
http://hg.mozilla.org/tamarin-redux/rev/7354ae1a7e72
Comment 17•15 years ago
|
||
Marking this closed/fixed now on the assumption that TM does not plan to use LIR_fne immediately and thus doesn't need to track this bug in the near term.
Status: ASSIGNED → RESOLVED
Closed: 15 years ago
Resolution: --- → FIXED
![]() |
||
Comment 18•15 years ago
|
||
TM ends up taking all the NJ commits, even the ones that don't really affect TM, and we like to have the full paper trail between Bugzilla and the repositories... so I'll reopen this.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Whiteboard: fixed-in-nanojit, fixed-in-tamarin
![]() |
||
Comment 19•15 years ago
|
||
Whiteboard: fixed-in-nanojit, fixed-in-tamarin → fixed-in-nanojit, fixed-in-tamarin, fixed-in-tracemonkey
Comment 20•15 years ago
|
||
I don't know how i missed this, but the above LIR_fne patch has never landed; what landed were some precursor patches. Good thing it was reopened!
Updated•15 years ago
|
Whiteboard: fixed-in-nanojit, fixed-in-tamarin, fixed-in-tracemonkey → fixed-in-nanojit, fixed-in-tamarin, fixed-in-tracemonkey, PACMAN
Comment 21•15 years ago
|
||
clearing whiteboard, updated subject line for clarity. The scope of this bug is to add a "fne" opcode with the truth table from comment #4 (which happily matches the truth table for ToBool(double) from ECMAScript.
Summary: x64 LIR is slower than x64 MIR on 2 performance testcases, needs investigation → Add LIR_fne opcode and use for double->bool conversion
Whiteboard: fixed-in-nanojit, fixed-in-tamarin, fixed-in-tracemonkey, PACMAN → PACMAN
![]() |
||
Comment 22•15 years ago
|
||
FWIW, I have a patch to add LIR_nei and LIR_neq that I've been thinking about resurrecting. Not having LIR_ne* always felt strange to me.
Comment 24•15 years ago
|
||
(In reply to comment #22)
> FWIW, I have a patch to add LIR_nei and LIR_neq that I've been thinking about
> resurrecting. Not having LIR_ne* always felt strange to me.
If the condition is feeding a branch, LIR_eqi and LIR_eqq plus jt/jf cover all the cases. But, LIR_nei/neq are better when you want the 0/1 value of the condition. so, +1 for completeness, we'd use those if available.
Updated•15 years ago
|
Assignee: edwsmith → wsharp
Updated•15 years ago
|
OS: Windows Vista → All
Priority: P4 → P3
Comment 25•15 years ago
|
||
From primes.as:
function prime(n:int) {
var m : int = n;
var limit : int = n/2;
for (var i : int = 2; i<=limit;++i) {
if(!(m%i)) {
return false;
}
}
return true;
}
It would be great to have enough info to do an integer modulo there and completely avoid the (int,int)->(double,double)->modulo->negate->doubleToBool roundtrip. (Outside the scope of this LIR_fne work item)
Comment 26•15 years ago
|
||
Add LIR_ned support for i386, i386 w/o SSE, x64 and ARM. NJ_NED_SUPPORTED controls whether fast path is supported on various platforms.
Attachment #452048 -
Flags: superreview?(edwsmith)
Attachment #452048 -
Flags: review?(nnethercote)
Comment 27•15 years ago
|
||
Comment on attachment 452048 [details] [diff] [review]
Updated patch for latest code (i386, i386 w/o SSE, x64, ARM)
one bug in this patch. disregard.
Attachment #452048 -
Flags: superreview?(edwsmith)
Attachment #452048 -
Flags: review?(nnethercote)
Comment 28•15 years ago
|
||
fix last patch which incorrectly used NJ_JTBL_SUPPORTED instead of NJ_NED_SUPPORTED
Attachment #452254 -
Flags: superreview?(edwsmith)
Attachment #452254 -
Flags: review?(nnethercote)
![]() |
||
Comment 29•15 years ago
|
||
Comment on attachment 452254 [details] [diff] [review]
updated patch
Werner, can you add this to your ~/.hgrc file?
[defaults]
diff=-p -U 8
It gives 8 lines of context instead of 4, which makes patches easier to review.
Also, do you know about committing to the nanojit-central repository? See
https://developer.mozilla.org/en/NanojitMerge for details.
>@@ -2507,6 +2510,7 @@
> static double FASTCALL muld(double a, double b) { return a * b; }
> static double FASTCALL divd(double a, double b) { return a / b; }
> static int32_t FASTCALL eqd(double a, double b) { return a == b; }
>+ static int32_t FASTCALL ned(double a, double b) { return a == a && a != b; }
I think this is wrong. "NaN != 0" should give true.
>diff -r 7c9acb06a202 nanojit/LIRopcode.tbl
>--- a/nanojit/LIRopcode.tbl Thu Jun 17 15:51:37 2010 +0200
>+++ b/nanojit/LIRopcode.tbl Fri Jun 18 06:53:01 2010 -0700
>@@ -244,8 +244,7 @@
> OP___(gtd, 75, Op2, I, 1) // double greater-than
> OP___(led, 76, Op2, I, 1) // double less-than-or-equal
> OP___(ged, 77, Op2, I, 1) // double greater-than-or-equal
>-
>-OP_UN(78)
>+OP___(ned, 78, Op2, I, 1) // double inequality (unordered|equal => false; else true)
I'd prefer 'ned' to follow just after 'eqd'. This will require shifting the
others down, but that's ok. You may need get some static assertions because
some of the comparisons must have even opcodes, some must have odd, this is
for the x^1 inversion, I can't remember the exact details. So you may
need to shift the opcodes around some more.
Hmm, there are some static assertions regarding the order of the comparison
operations in LIR.h around line 104. They'll need to be updated too.
> NIns* Assembler::asm_branchd(bool onFalse, LIns *cond, NIns *target) {
> LOpcode condop = cond->opcode();
> NIns *patch;
> LIns *a = cond->oprnd1();
> LIns *b = cond->oprnd2();
>- if (condop == LIR_eqd) {
>+ if (condop == LIR_ned) {
>+ // unordered and equal are "false", otherwise "true" (no need to check parity bit)
>+ if (onFalse) {
>+ JE(0, target);
>+ } else {
>+ JNE(0, target);
>+ }
>+ patch = _nIns;
>+ }
>+ else if (condop == LIR_eqd) {
> if (onFalse) {
> // branch if unordered or !=
> JP(16, target); // underrun of 12 needed, round up for overhang --> 16
Can you put the 'eqd' code before the 'ned' code?
>+ // LIR_ned:
>+ // ucomisd ZPC outcome (SETNE/JNE succeeds if Z==0)
>+ // ------- --- -------
>+ // UNORDERED 111 SETNE/JNE fails
>+ // EQUAL 100 SETNE/JNE fails
>+ // GREATER_THAN 000 SETNE/JNE succeeds
>+ // LESS_THAN 001 SETNE/JNE succeeds
As above, AIUI the UNORDERED case should succeed. Eg. "NaN != 0" gives
true. This problem sinks the patch as it stands.
I'll file a follow-up bug for LIR_nei/LIR_neq, they'd be good to have as
well for completeness.
Attachment #452254 -
Flags: review?(nnethercote) → review-
Comment 30•15 years ago
|
||
the truth table I'm going for with fne (now named ned) is:
comparison eqd ned
---------- --- ---
x == y true false
x < y false true
x > y false true
unordered false false
This is what x86/arm do naturally, and its what works for ecma-compatible double->bool conversion. It is not symmetric with eqd; but to me that's its attraction, since otherwise it'd be easy to just flip the sense of jt/jf used with eqd.
(however for nei/neq, they should be symmetric with eqi/eqq).
(in retrospect, we should write down these tables for the LIR float-comparison instructions somewhere in LIR.h or LIRopcodes.h).
Updated•15 years ago
|
Summary: Add LIR_fne opcode and use for double->bool conversion → Add LIR_ned opcode and use for double->bool conversion
Comment 31•15 years ago
|
||
(In reply to comment #30)
fwiw, i'd be open to a better name for an opcode with the above truth table. the way the proposed truth table makes sense to me, and why its not the inverse of eqd, is if you read "ned" as "<>"... i.e (x <> y). in that light, (NaN <> 0) = false since (x <> y) => (x < y || x > y) => (NaN < 0 || NaN > 0) => false||false.
![]() |
||
Comment 32•15 years ago
|
||
If what you want is really double-to-bool conversion, then this should be a unary op called LIR_d2b. We can add LIR_ned (the inverse of LIR_eqd) as part of bug 573416.
So what are the semantics of LIR_d2b? I guess 0.0, -0.0, and NaN convert to 0 and everything else to 1?
Comment 33•15 years ago
|
||
We usually stay away from adding LIR opcodes for things that have a specific JS semantics.
Comment 34•15 years ago
|
||
(In reply to comment #32)
> If what you want is really double-to-bool conversion, then this should be a
> unary op called LIR_d2b. We can add LIR_ned (the inverse of LIR_eqd) as part
> of bug 573416.
>
> So what are the semantics of LIR_d2b? I guess 0.0, -0.0, and NaN convert to 0
> and everything else to 1?
Correct. I considered tradeoffs of f2b in comment #4 and comment #6, decided to go with a binary op to simplify the backend implementations and make the instruction less special purpose. (Only ARM has a FCMPZ instruction with an implicit 0.0 argument). I'm not opposed to a LIR_ned that's the inverse of eqd, but I have no need for it either.
I dont feel strongly either way; d2b and ned both will serve the need just fine. the apparent issues are backend complexity and LIR semantics. here's how i'd summarize:
argument against special puropse ops: "support LIR_ned, maybe under a different name."
argument against over generalization: "support LIR_d2b".
In support of the first argument, here's another proposal:
1. Change LIR_ned's name to LIR_gld meaning "greater or less than", i.e. "x<>y". this more accurately reflects its truth table since NaN is not greater or less than anything (gld(NaN,x)=false). its inverse would be 'ngld' if we ever wanted it.
2. Inverses. Inverses of the floating point operators get a prepended "n". neqd, nltd, etc. These are strictly computed as !(op). Thus, nltd(x,y) != ged(x,y), since the unordered cases are different. For integer operations, LIR_nlti == LIR_gti, so we could optionally add those as aliases. LIR_nei/q would be new, there's no existing equivalent.
![]() |
||
Comment 35•15 years ago
|
||
I'm happier if it's not called LIR_ned.
Re the over/undergeneralization question: under what circumstances would you want the 2nd operand to be something other than 0.0?
Also, would isCmp() be true for LIR_gld? In my mind there are six obvious comparison operators, ==/!=/</>/<=/>= and going beyond that is a bit weird.
(In reply to comment #33)
>
> We usually stay away from adding LIR opcodes for things that have a specific JS
> semantics.
Gal, which opcode has JS-specific semantics?
Comment 36•15 years ago
|
||
(In reply to comment #35)
> I'm happier if it's not called LIR_ned.
>
> Re the over/undergeneralization question: under what circumstances would you
> want the 2nd operand to be something other than 0.0?
I can't think of any, my only concrete use case is with 0.0. A guard for div or mod (testing for nonzero/non-nan divisor) might use this opcode, but also would still use 0.0 on the RHS. I can imagine cases when, mathematically, you might want to compare X with a constant and exclude NaN, however since its not possible in C/Java/JS, there's not much justification for it.
if our optimizers were able to recognize the pattern (X == X && X != C) or (!isNaN(X) && X != C) then they could use this single opcode as part of a strength reduction. That's pretty far fetched, but it is a code pattern we could look for in real code, if we wanted to search for more use cases. doing so would stray deeply into "solution looking for a problem" territory.
> Also, would isCmp() be true for LIR_gld? In my mind there are six obvious
> comparison operators, ==/!=/</>/<=/>= and going beyond that is a bit weird.
Yes, I think isCmp(LIR_gld) would be true. NaN's make floating point comparisons weird, for sure. that there's two variants of inequality doesn't bother me though.
> (In reply to comment #33)
> >
> > We usually stay away from adding LIR opcodes for things that have a specific JS
> > semantics.
>
> Gal, which opcode has JS-specific semantics?
I think he meant that a LIR_d2b would have JS-specific semantics. If we went with that opcode, we should just specify its behavior as a truth table, with the happy coincidence that it matched JS semantics for ToBool(Number).
Sounds like you're advocating LIR_d2b(x) more strongly than I'm advocating LIR_gld(x,0.0). That works for me. There's time for more discussion/opinions, too; this isn't an urgent bug.
Comment 37•15 years ago
|
||
(In reply to comment #36)
> > Also, would isCmp() be true for LIR_gld? In my mind there are six obvious
> > comparison operators, ==/!=/</>/<=/>= and going beyond that is a bit weird.
Strictly speaking, there are 12 comparison operators for floats, due to NaN; the six above, and six more with inverted truth tables (prefixed with 'n' probably), which treat NaN the opposite way. For ints, the 12 collapse down to 6 since half of the truth tables work out the same, because of no NaN.
Updated•15 years ago
|
Attachment #452254 -
Flags: superreview?(edwsmith)
Comment 38•15 years ago
|
||
status?
Comment 39•15 years ago
|
||
Patch needs updating for the LIR_d2b change and the feedback from Nick. Also, it did not work on ARM for some reason so that needs investigation after I get set up wth some ARM hardware.
Comment 40•14 years ago
|
||
Per team mtg, removing my name from bugs I'm not actively working on.
Assignee: wsharp → nobody
Updated•14 years ago
|
Assignee: nobody → rreitmai
Updated•14 years ago
|
Whiteboard: PACMAN PACMAN → PACMAN must–fix-candidate PACMAN must–fix-candidate
Updated•14 years ago
|
Whiteboard: PACMAN must–fix-candidate PACMAN must–fix-candidate → PACMAN must-fix-candidate PACMAN must-fix-candidate
Updated•14 years ago
|
Flags: flashplayer-injection-
Target Milestone: Q3 11 - Serrano → Future
Comment 41•13 years ago
|
||
Rick, is this bug still active? Assign to nobody if you are not working on it. Removing Andre blocker.
No longer depends on: Andre
Updated•13 years ago
|
Assignee: rreitmai → nobody
Comment 42•6 years ago
|
||
Tamarin is a dead project now. Mass WONTFIX.
Status: REOPENED → RESOLVED
Closed: 15 years ago → 6 years ago
Resolution: --- → WONTFIX
Comment 43•6 years ago
|
||
Tamarin isn't maintained anymore. WONTFIX remaining bugs.
You need to log in
before you can comment on or make changes to this bug.
Description
•