Closed Bug 477779 Opened 16 years ago Closed 6 years ago

Add LIR_ned opcode and use for double->bool conversion

Categories

(Tamarin Graveyard :: Baseline JIT (CodegenLIR), defect, P3)

Product:

Component:

Platform:

x86_64

All

Type:

defect

Priority:

P3

Severity:

normal

Tracking

(Not tracked)

Status:

RESOLVED WONTFIX

Milestone:

Future

People

(Reporter: brbaker, Unassigned)

References

Details

(Whiteboard: PACMAN must-fix-candidate)

Attachments

(4 files, 2 obsolete files)

adds new LIR_fune (unordered float not equal), use it for double->bool 15 years ago Edwin Smith 6.55 KB, patch		Details \| Diff \| Splinter Review
adds new LIR_fne (float not equal), use it for double->bool 15 years ago Edwin Smith 12.63 KB, patch	n.nethercote : review+	Details \| Diff \| Splinter Review
LIR_fne support for PPC 15 years ago Edwin Smith 5.11 KB, patch		Details \| Diff \| Splinter Review
Fix alignment bug in PPC asm_quad(), use eq0 instead of xor for OP_not 15 years ago Edwin Smith 1.17 KB, patch	rreitmai : review+ stejohns : review+	Details \| Diff \| Splinter Review
Updated patch for latest code (i386, i386 w/o SSE, x64, ARM) 15 years ago Werner Sharp 15.33 KB, patch		Details \| Diff \| Splinter Review
updated patch 15 years ago Werner Sharp 15.33 KB, patch	n.nethercote : review-	Details \| Diff \| Splinter Review

Reporter

Description

•

16 years ago

Comparing MIR vs LIR on windows 64 (only platform that has both 64bit versions of the jit) shows a performance increase across the board for LIR except for 2 testcases. canaries/simpleflexapp.as MIR: 327.8 LIR: 403.1 %diff: -18.7% misc/primes.as MIR: 12731.1 LIR: 14720.1 %diff: -13.5% These results were repeatable every time I ran the tests.

Flags: flashplayer-triage+

Flags: flashplayer-qrb?

Updated

•

16 years ago

Blocks: 477785

Comment 1

•

16 years ago

Out of curiosity, what's the typical performance increase on the other cases?

Comment 2

•

16 years ago

~5%

Updated

•

16 years ago

Blocks: 478870

Updated

•

16 years ago

No longer blocks: 477785

Updated

•

16 years ago

Blocks: 481413

Updated

•

16 years ago

No longer blocks: 478870

Updated

•

16 years ago

No longer blocks: 481413

Target Milestone: --- → flash10.x

Updated

•

16 years ago

Assignee: nobody → edwsmith

Flags: flashplayer-qrb? → flashplayer-qrb+

Updated

•

15 years ago

Priority: -- → P3

Updated

•

15 years ago

Priority: P3 → P4

Updated

•

15 years ago

Status: NEW → ASSIGNED

Comment 3

•

15 years ago

for coerce_bool <double>, MIR emits: MIR_ne(d, 0.0) => ucomisd, setne and nanojit emits: LIR_calli doubleToBool(d) => push (d), call (plus spillage) the MIR_ne instruction has no analog in LIR; MIR_ne actually implements the ecma double->boolean conversion correctly in two instructions. It is not the opposite of MIR_eq, and is only used for double->boolean conversions!

Comment 4

•

15 years ago

Attached patch adds new LIR_fune (unordered float not equal), use it for double->bool (obsolete) — Details — Splinter Review

implements slightly different (and simpler) logic than feq fune feq unordered 0 0 equal 0 1 less 1 0 greater 1 0 which means it's suitable for double->bool: X fune(X,0.0) NaN 0 +0.0 0 -0.0 0 otherwise 1 posting for prototyping and early feedback. LIR_fune might be an obscure instruction. LIR_f2b (float to bool) would be more consise and less obscure, but would also hardcode the operand of 0.0. Net speedup is 11-12% on primes on X64 (mac)

Comment 5

•

15 years ago

testing simpleflexapp.abc: $AVM -Ojit simpleflexapputil/avmglue_abc simpleflexapputil/hello_frame1_abc simpleflexapputil/hello_frame2_abc simpleflexapp.abc |grep metric MIR: metric time 218 LIR: metric time 109 => 50% speedup testing primes.abc $AVM -Ojit primes.abc MIR: metric time 10452 LIR: metric time 9704 => 7% speedup

Comment 6

•

15 years ago

Attached patch adds new LIR_fne (float not equal), use it for double->bool — Details — Splinter Review

In retrospect, the 'u' distinction between fune and feq seemed unimportant. all the float comparisons (including fne) return false for unordered values. - added LIns::isFcmp() (write things once). - SSE2 x86 and x64 implemented so far. directional feedback? the main alternative to LIR_fne, would be LIR_f2b. (f2b(x) == fne(x,0.0)). f2b is the only thing we need for now, but it feels too special purpose and I fear we'd just need to add fne later. on the other hand, why be overly general? opinions welcome. ARM note: on ARM, FCMPDZ apparently corresponds exactly to LIR_f2b, so perhaps f2b is not too special purpose. (ARM fne => FCMPD). on other cpu's, implementing f2b in the backend would require a temp scratch register to hold the value 0.0; fne is thus more friendly to register allocation by making the immediate 0.0 visible in the IR.

Attachment #417606 - Attachment is obsolete: true

Attachment #417743 - Flags: review?(nnethercote)

Nicholas Nethercote [inactive]

Comment 7

•

15 years ago

Comment on attachment 417743 [details] [diff] [review] adds new LIR_fne (float not equal), use it for double->bool Looks good, although you'll have to split into NJ and TR parts. I wonder if this will be useful for TM -- Gal, do we have a doubleToBool builtin? I couldn't find one. Question: what is MIR?

Attachment #417743 - Flags: review?(nnethercote) → review+

Comment 8

•

15 years ago

(In reply to comment #7) > Question: what is MIR? Err... a Soviet Space Station? (Actually, I think it was an acronym for Macromedia Intermediate Representation... basically, the JIT that FP9/FP10 used, pre-nanojit)

Comment 9

•

15 years ago

I searched tracemonkey for a while, found js_ValueToBoolean() (which corresponds to tamarin's AvmCore::boolean()), but never did find the code that traces it. but i'd be surprised if it wasn't traced, somewhere. it's apparently invoked by POP_BOOLEAN macro in jsinterp, if that's any help.

Andreas Gal :gal

Comment 10

•

15 years ago

I think we do the conversion in some C code. Never really appeared in any benchmarks for us (I think, I might be wrong) as a hot spot.

Comment 11

•

15 years ago

(In reply to comment #7) > (From update of attachment 417743 [details] [diff] [review]) > Looks good, although you'll have to split into NJ and TR parts. yeah, i'm going to do the ppc32/64, arm, and sparc ports first. the asm_fcmp cleanup you did in the x86 backend made that one easy going. the benchmark in question is tamarin-redux/test/performance/misc/primes.as, you could rip off the type annotations and try it in TM if interested. if tracemonkey manages to make the mod in primes.as always be an int mod, then double->bool probably never shows up hot.

Andreas Gal :gal

Comment 12

•

15 years ago

Yeah, we speculate aggressively on integer modulo. This might show up in the double slow path that gets hit in case of an overflow.

Comment 13

•

15 years ago

Attached patch LIR_fne support for PPC — Details — Splinter Review

Suprisingly, this patch actually makes primes.as slower on PPC. Posting here so we don't lose it, but instead of spending more time on it, I plan to just add a flag and only use LIR_fne on backends that have it. initially, sparc and ppc won't have it, so we'll just use the function helper there. on x86, x64, and ARM, LIR_fne provides about 10% speedup on primes.as

Comment 14

•

15 years ago

Attached patch Fix alignment bug in PPC asm_quad(), use eq0 instead of xor for OP_not (obsolete) — Details — Splinter Review

Using LIR_eq(cond,0) is equivalent to LIR_xor(cond,1), but ExprFilter knows how to optimize branches using eq(x,0) by inverting the sense of the branch, which results in better optimization across the board. in the PPC backend, quad constants were not 8-aligned, this patch is a fix.

Attachment #419033 - Flags: review?(rreitmai)

Updated

•

15 years ago

Attachment #419033 - Flags: review?(stejohns)

Updated

•

15 years ago

Attachment #419033 - Flags: review?(stejohns) → review+

Updated

•

15 years ago

Attachment #419033 - Flags: review?(rreitmai) → review+

Comment 15

•

15 years ago

Comment on attachment 419033 [details] [diff] [review] Fix alignment bug in PPC asm_quad(), use eq0 instead of xor for OP_not pushed http://hg.mozilla.org/tamarin-redux/rev/1b59cdfc8946

Attachment #419033 - Attachment is obsolete: true

Comment 16

•

15 years ago

nanojit part pushed http://hg.mozilla.org/projects/nanojit-central/rev/5218b6ab33db and merged to tamarin-redux http://hg.mozilla.org/tamarin-redux/rev/7354ae1a7e72

Comment 17

•

15 years ago

Marking this closed/fixed now on the assumption that TM does not plan to use LIR_fne immediately and thus doesn't need to track this bug in the near term.

Status: ASSIGNED → RESOLVED

Closed: 15 years ago

Resolution: --- → FIXED

Nicholas Nethercote [inactive]

Comment 18

•

15 years ago

TM ends up taking all the NJ commits, even the ones that don't really affect TM, and we like to have the full paper trail between Bugzilla and the repositories... so I'll reopen this.

Status: RESOLVED → REOPENED

Resolution: FIXED → ---

Whiteboard: fixed-in-nanojit, fixed-in-tamarin

Nicholas Nethercote [inactive]

Comment 19

•

15 years ago

http://hg.mozilla.org/tracemonkey/rev/408f0bf6fb1c

Whiteboard: fixed-in-nanojit, fixed-in-tamarin → fixed-in-nanojit, fixed-in-tamarin, fixed-in-tracemonkey

Updated

•

15 years ago

Target Milestone: flash10.1 → flash10.2

Comment 20

•

15 years ago

I don't know how i missed this, but the above LIR_fne patch has never landed; what landed were some precursor patches. Good thing it was reopened!

Updated

•

15 years ago

Whiteboard: fixed-in-nanojit, fixed-in-tamarin, fixed-in-tracemonkey → fixed-in-nanojit, fixed-in-tamarin, fixed-in-tracemonkey, PACMAN

Comment 21

•

15 years ago

clearing whiteboard, updated subject line for clarity. The scope of this bug is to add a "fne" opcode with the truth table from comment #4 (which happily matches the truth table for ToBool(double) from ECMAScript.

Summary: x64 LIR is slower than x64 MIR on 2 performance testcases, needs investigation → Add LIR_fne opcode and use for double->bool conversion

Whiteboard: fixed-in-nanojit, fixed-in-tamarin, fixed-in-tracemonkey, PACMAN → PACMAN

Nicholas Nethercote [inactive]

Comment 22

•

15 years ago

FWIW, I have a patch to add LIR_nei and LIR_neq that I've been thinking about resurrecting. Not having LIR_ne* always felt strange to me.

Comment 24

•

15 years ago

(In reply to comment #22) > FWIW, I have a patch to add LIR_nei and LIR_neq that I've been thinking about > resurrecting. Not having LIR_ne* always felt strange to me. If the condition is feeding a branch, LIR_eqi and LIR_eqq plus jt/jf cover all the cases. But, LIR_nei/neq are better when you want the 0/1 value of the condition. so, +1 for completeness, we'd use those if available.

Updated

•

15 years ago

Blocks: 552542

Updated

•

15 years ago

Assignee: edwsmith → wsharp

Updated

•

15 years ago

OS: Windows Vista → All

Priority: P4 → P3

Comment 25

•

15 years ago

From primes.as: function prime(n:int) { var m : int = n; var limit : int = n/2; for (var i : int = 2; i<=limit;++i) { if(!(m%i)) { return false; } } return true; } It would be great to have enough info to do an integer modulo there and completely avoid the (int,int)->(double,double)->modulo->negate->doubleToBool roundtrip. (Outside the scope of this LIR_fne work item)

Comment 26

•

15 years ago

Attached patch Updated patch for latest code (i386, i386 w/o SSE, x64, ARM) — Details — Splinter Review

Add LIR_ned support for i386, i386 w/o SSE, x64 and ARM. NJ_NED_SUPPORTED controls whether fast path is supported on various platforms.

Attachment #452048 - Flags: superreview?(edwsmith)

Attachment #452048 - Flags: review?(nnethercote)

Comment 27

•

15 years ago

Comment on attachment 452048 [details] [diff] [review] Updated patch for latest code (i386, i386 w/o SSE, x64, ARM) one bug in this patch. disregard.

Attachment #452048 - Flags: superreview?(edwsmith)

Attachment #452048 - Flags: review?(nnethercote)

Comment 28

•

15 years ago

Attached patch updated patch — Details — Splinter Review

fix last patch which incorrectly used NJ_JTBL_SUPPORTED instead of NJ_NED_SUPPORTED

Attachment #452254 - Flags: superreview?(edwsmith)

Attachment #452254 - Flags: review?(nnethercote)

Nicholas Nethercote [inactive]

Comment 29

•

15 years ago

Comment on attachment 452254 [details] [diff] [review] updated patch Werner, can you add this to your ~/.hgrc file? [defaults] diff=-p -U 8 It gives 8 lines of context instead of 4, which makes patches easier to review. Also, do you know about committing to the nanojit-central repository? See https://developer.mozilla.org/en/NanojitMerge for details. >@@ -2507,6 +2510,7 @@ > static double FASTCALL muld(double a, double b) { return a * b; } > static double FASTCALL divd(double a, double b) { return a / b; } > static int32_t FASTCALL eqd(double a, double b) { return a == b; } >+ static int32_t FASTCALL ned(double a, double b) { return a == a && a != b; } I think this is wrong. "NaN != 0" should give true. >diff -r 7c9acb06a202 nanojit/LIRopcode.tbl >--- a/nanojit/LIRopcode.tbl Thu Jun 17 15:51:37 2010 +0200 >+++ b/nanojit/LIRopcode.tbl Fri Jun 18 06:53:01 2010 -0700 >@@ -244,8 +244,7 @@ > OP___(gtd, 75, Op2, I, 1) // double greater-than > OP___(led, 76, Op2, I, 1) // double less-than-or-equal > OP___(ged, 77, Op2, I, 1) // double greater-than-or-equal >- >-OP_UN(78) >+OP___(ned, 78, Op2, I, 1) // double inequality (unordered|equal => false; else true) I'd prefer 'ned' to follow just after 'eqd'. This will require shifting the others down, but that's ok. You may need get some static assertions because some of the comparisons must have even opcodes, some must have odd, this is for the x^1 inversion, I can't remember the exact details. So you may need to shift the opcodes around some more. Hmm, there are some static assertions regarding the order of the comparison operations in LIR.h around line 104. They'll need to be updated too. > NIns* Assembler::asm_branchd(bool onFalse, LIns *cond, NIns *target) { > LOpcode condop = cond->opcode(); > NIns *patch; > LIns *a = cond->oprnd1(); > LIns *b = cond->oprnd2(); >- if (condop == LIR_eqd) { >+ if (condop == LIR_ned) { >+ // unordered and equal are "false", otherwise "true" (no need to check parity bit) >+ if (onFalse) { >+ JE(0, target); >+ } else { >+ JNE(0, target); >+ } >+ patch = _nIns; >+ } >+ else if (condop == LIR_eqd) { > if (onFalse) { > // branch if unordered or != > JP(16, target); // underrun of 12 needed, round up for overhang --> 16 Can you put the 'eqd' code before the 'ned' code? >+ // LIR_ned: >+ // ucomisd ZPC outcome (SETNE/JNE succeeds if Z==0) >+ // ------- --- ------- >+ // UNORDERED 111 SETNE/JNE fails >+ // EQUAL 100 SETNE/JNE fails >+ // GREATER_THAN 000 SETNE/JNE succeeds >+ // LESS_THAN 001 SETNE/JNE succeeds As above, AIUI the UNORDERED case should succeed. Eg. "NaN != 0" gives true. This problem sinks the patch as it stands. I'll file a follow-up bug for LIR_nei/LIR_neq, they'd be good to have as well for completeness.

Attachment #452254 - Flags: review?(nnethercote) → review-

Comment 30

•

15 years ago

the truth table I'm going for with fne (now named ned) is: comparison eqd ned ---------- --- --- x == y true false x < y false true x > y false true unordered false false This is what x86/arm do naturally, and its what works for ecma-compatible double->bool conversion. It is not symmetric with eqd; but to me that's its attraction, since otherwise it'd be easy to just flip the sense of jt/jf used with eqd. (however for nei/neq, they should be symmetric with eqi/eqq). (in retrospect, we should write down these tables for the LIR float-comparison instructions somewhere in LIR.h or LIRopcodes.h).

Updated

•

15 years ago

Summary: Add LIR_fne opcode and use for double->bool conversion → Add LIR_ned opcode and use for double->bool conversion

Comment 31

•

15 years ago

(In reply to comment #30) fwiw, i'd be open to a better name for an opcode with the above truth table. the way the proposed truth table makes sense to me, and why its not the inverse of eqd, is if you read "ned" as "<>"... i.e (x <> y). in that light, (NaN <> 0) = false since (x <> y) => (x < y || x > y) => (NaN < 0 || NaN > 0) => false||false.

Nicholas Nethercote [inactive]

Comment 32

•

15 years ago

If what you want is really double-to-bool conversion, then this should be a unary op called LIR_d2b. We can add LIR_ned (the inverse of LIR_eqd) as part of bug 573416. So what are the semantics of LIR_d2b? I guess 0.0, -0.0, and NaN convert to 0 and everything else to 1?

Andreas Gal :gal

Comment 33

•

15 years ago

We usually stay away from adding LIR opcodes for things that have a specific JS semantics.

Comment 34

•

15 years ago

(In reply to comment #32) > If what you want is really double-to-bool conversion, then this should be a > unary op called LIR_d2b. We can add LIR_ned (the inverse of LIR_eqd) as part > of bug 573416. > > So what are the semantics of LIR_d2b? I guess 0.0, -0.0, and NaN convert to 0 > and everything else to 1? Correct. I considered tradeoffs of f2b in comment #4 and comment #6, decided to go with a binary op to simplify the backend implementations and make the instruction less special purpose. (Only ARM has a FCMPZ instruction with an implicit 0.0 argument). I'm not opposed to a LIR_ned that's the inverse of eqd, but I have no need for it either. I dont feel strongly either way; d2b and ned both will serve the need just fine. the apparent issues are backend complexity and LIR semantics. here's how i'd summarize: argument against special puropse ops: "support LIR_ned, maybe under a different name." argument against over generalization: "support LIR_d2b". In support of the first argument, here's another proposal: 1. Change LIR_ned's name to LIR_gld meaning "greater or less than", i.e. "x<>y". this more accurately reflects its truth table since NaN is not greater or less than anything (gld(NaN,x)=false). its inverse would be 'ngld' if we ever wanted it. 2. Inverses. Inverses of the floating point operators get a prepended "n". neqd, nltd, etc. These are strictly computed as !(op). Thus, nltd(x,y) != ged(x,y), since the unordered cases are different. For integer operations, LIR_nlti == LIR_gti, so we could optionally add those as aliases. LIR_nei/q would be new, there's no existing equivalent.

Nicholas Nethercote [inactive]

Comment 35

•

15 years ago

I'm happier if it's not called LIR_ned. Re the over/undergeneralization question: under what circumstances would you want the 2nd operand to be something other than 0.0? Also, would isCmp() be true for LIR_gld? In my mind there are six obvious comparison operators, ==/!=/</>/<=/>= and going beyond that is a bit weird. (In reply to comment #33) > > We usually stay away from adding LIR opcodes for things that have a specific JS > semantics. Gal, which opcode has JS-specific semantics?

Comment 36

•

15 years ago

(In reply to comment #35) > I'm happier if it's not called LIR_ned. > > Re the over/undergeneralization question: under what circumstances would you > want the 2nd operand to be something other than 0.0? I can't think of any, my only concrete use case is with 0.0. A guard for div or mod (testing for nonzero/non-nan divisor) might use this opcode, but also would still use 0.0 on the RHS. I can imagine cases when, mathematically, you might want to compare X with a constant and exclude NaN, however since its not possible in C/Java/JS, there's not much justification for it. if our optimizers were able to recognize the pattern (X == X && X != C) or (!isNaN(X) && X != C) then they could use this single opcode as part of a strength reduction. That's pretty far fetched, but it is a code pattern we could look for in real code, if we wanted to search for more use cases. doing so would stray deeply into "solution looking for a problem" territory. > Also, would isCmp() be true for LIR_gld? In my mind there are six obvious > comparison operators, ==/!=/</>/<=/>= and going beyond that is a bit weird. Yes, I think isCmp(LIR_gld) would be true. NaN's make floating point comparisons weird, for sure. that there's two variants of inequality doesn't bother me though. > (In reply to comment #33) > > > > We usually stay away from adding LIR opcodes for things that have a specific JS > > semantics. > > Gal, which opcode has JS-specific semantics? I think he meant that a LIR_d2b would have JS-specific semantics. If we went with that opcode, we should just specify its behavior as a truth table, with the happy coincidence that it matched JS semantics for ToBool(Number). Sounds like you're advocating LIR_d2b(x) more strongly than I'm advocating LIR_gld(x,0.0). That works for me. There's time for more discussion/opinions, too; this isn't an urgent bug.

Comment 37

•

15 years ago

(In reply to comment #36) > > Also, would isCmp() be true for LIR_gld? In my mind there are six obvious > > comparison operators, ==/!=/</>/<=/>= and going beyond that is a bit weird. Strictly speaking, there are 12 comparison operators for floats, due to NaN; the six above, and six more with inverted truth tables (prefixed with 'n' probably), which treat NaN the opposite way. For ints, the 12 collapse down to 6 since half of the truth tables work out the same, because of no NaN.

Updated

•

15 years ago

Attachment #452254 - Flags: superreview?(edwsmith)

Comment 38

•

15 years ago

status?

Comment 39

•

15 years ago

Patch needs updating for the LIR_d2b change and the feedback from Nick. Also, it did not work on ARM for some reason so that needs investigation after I get set up wth some ARM hardware.

Comment 40

•

14 years ago

Per team mtg, removing my name from bugs I'm not actively working on.

Assignee: wsharp → nobody

Andre Kruetzfeldt

Updated

•

14 years ago

Assignee: nobody → rreitmai

Updated

•

14 years ago

Whiteboard: PACMAN PACMAN → PACMAN mustâ€“fix-candidate PACMAN must–fix-candidate

Updated

•

14 years ago

Whiteboard: PACMAN mustâ€“fix-candidate PACMAN must–fix-candidate → PACMAN must-fix-candidate PACMAN must-fix-candidate

Andre Kruetzfeldt

Updated

•

14 years ago

Flags: flashplayer-injection-

Target Milestone: Q3 11 - Serrano → Future

Andre Kruetzfeldt

Updated

•

14 years ago

Depends on: Andre

Andre Kruetzfeldt

Comment 41

•

13 years ago

Rick, is this bug still active? Assign to nobody if you are not working on it. Removing Andre blocker.

No longer depends on: Andre

Updated

•

13 years ago

Assignee: rreitmai → nobody

Sylvestre Ledru [:Sylvestre]

Comment 42

•

6 years ago

Tamarin is a dead project now. Mass WONTFIX.

Status: REOPENED → RESOLVED

Closed: 15 years ago → 6 years ago

Resolution: --- → WONTFIX

Sylvestre Ledru [:Sylvestre]

Comment 43

•

6 years ago

Tamarin isn't maintained anymore. WONTFIX remaining bugs.

You need to log in before you can comment on or make changes to this bug.