Closed Bug 363534 Opened 13 years ago Closed 12 years ago

Combine JSOP_LT and JSOP_IFEQ, etc., pairs

Categories

(Core :: JavaScript Engine, defect, P1)

defect

Tracking

()

RESOLVED FIXED
mozilla1.9.1a1

People

(Reporter: brendan, Assigned: brendan)

References

Details

Attachments

(1 file, 1 obsolete file)

See bug 363529 for data.  SpiderMonkey's bytecode was factored at the expense of performance, to separate condition-generating bytecodes for relational, equality, etc. operators, from if-true and if-false branch instructions.  These should be combined.

/be
So JSOP_IFLT instead of JSOP_LT; JSOP_IFNE, and so forth.

/be
Assignee: general → brendan
Target Milestone: mozilla1.9alpha1 → mozilla1.9alpha3
Status: NEW → ASSIGNED
Target Milestone: mozilla1.9alpha3 → ---
Priority: -- → P1
Target Milestone: --- → mozilla1.9.1a1
Attached patch fix (obsolete) — Splinter Review
** TOTAL **:           1.052x as fast    3119.1ms +/- 0.2%   2966.3ms +/- 0.2%     significant

This patch fuses dispatch for all relational and equality ops (strict and loose). Not worrying about code size at this stage -- more on that in other bugs.

/be
Attachment #325833 - Flags: review?(igor)
Flags: wanted1.9.1+
Comment on attachment 325833 [details] [diff] [review]
fix

>diff --git a/js/src/jsinterp.cpp b/js/src/jsinterp.cpp
>+#define TRY_BRANCH_AFTER_COND(spdec)                                          \
>+    JS_BEGIN_MACRO                                                            \

Here is a place to put an assert that js_CodeSpec[op].length is one here. r+ with this fixed.

>+        op2 = (JSOp) regs.pc[1];                                              \
>+        if (op2 == JSOP_IFNE || op2 == JSOP_IFEQ) {                           \
>+            regs.sp -= spdec;                                                 \
>+            if (cond == (op2 == JSOP_IFNE)) {                                 \

Given that JSOP_IFNE == JSOP_IFEQ + 1 this can be written as:

>+        uintN diff = (uintN) (JSOp) regs.pc[1] - JSOP_IFEQ;                   \
>+        if (diff <= 1) {                                                      \
>+            regs.sp -= spdec;                                                 \
>+            if (cond == (diff != 0)) {                                        \

Or can compilers do this optimization themselves? Another thing is that IMO it is better to pass cond as a macro parameter.
Attachment #325833 - Flags: review?(igor)
Comment on attachment 325833 [details] [diff] [review]
fix

>diff --git a/js/src/jsinterp.cpp b/js/src/jsinterp.cpp
>+#define TRY_BRANCH_AFTER_COND(spdec)                                          \
>+    JS_BEGIN_MACRO                                                            \
>+        op2 = (JSOp) regs.pc[1];                                              \
>+        if (op2 == JSOP_IFNE || op2 == JSOP_IFEQ) {                           \
>+            regs.sp -= spdec;                                                 \
>+            if (cond == (op2 == JSOP_IFNE)) {                                 \
>+                ++regs.pc;                                                    \
>+                len = GET_JUMP_OFFSET(regs.pc);                               \
>+                CHECK_BRANCH(len);                                            \
>+                DO_NEXT_OP(len);                                              \
>+            }                                                                 \
>+            regs.pc += 1 + JSOP_IFEQ_LENGTH;                                  \
>+            op = (JSOp) *regs.pc;                                             \
>+            DO_OP();                                                          \

One more thing: why not to use
 
  len = 1 + JSOP_IFEQ_LENGTH
  DO_NEXT_OP(len) 

instead of the explicit 3 lines here? DO_NEXT_OP should save code with MSVC where computed goto is not available.
Attached patch fix, v2Splinter Review
Good suggestions. I also tried unifying the DO_NEXT_OP, via if/else, but it was a clear performance loss with SunSpider.

/be
Attachment #325833 - Attachment is obsolete: true
Attachment #325951 - Flags: review?(igor)
Comment on attachment 325951 [details] [diff] [review]
fix, v2

>diff --git a/js/src/jsinterp.cpp b/js/src/jsinterp.cpp
>--- a/js/src/jsinterp.cpp
>+++ b/js/src/jsinterp.cpp
>@@ -2443,16 +2443,20 @@ JS_STATIC_ASSERT(JSOP_SETNAME_LENGTH == 
> JS_STATIC_ASSERT(JSOP_SETNAME_LENGTH == JSOP_SETPROP_LENGTH);
> 
> /* Ensure we can share deffun and closure code. */
> JS_STATIC_ASSERT(JSOP_DEFFUN_LENGTH == JSOP_CLOSURE_LENGTH);
> 
> 
> /* See comments in FETCH_SHIFT macro. */
> JS_STATIC_ASSERT((JSVAL_TO_INT(JSVAL_VOID) & 31) == 0);
>+
>+/* See TRY_BRANCH_AFTER_COND. */
>+JS_STATIC_ASSERT(JSOP_IFNE_LENGTH == JSOP_IFEQ_LENGTH);
>+JS_STATIC_ASSERT(JSOP_IFNE == JSOP_IFEQ + 1);
> 
> JSBool
> js_Interpret(JSContext *cx)
> {
>     JSRuntime *rt;
>     JSStackFrame *fp;
>     JSScript *script;
>     uintN inlineCallCount;
>@@ -3427,16 +3431,34 @@ interrupt:
>           BEGIN_CASE(JSOP_BITXOR)
>             BITWISE_OP(^);
>           END_CASE(JSOP_BITXOR)
> 
>           BEGIN_CASE(JSOP_BITAND)
>             BITWISE_OP(&);
>           END_CASE(JSOP_BITAND)
> 
>+#define TRY_BRANCH_AFTER_COND(cond,spdec)                                     \
>+    JS_BEGIN_MACRO                                                            \
>+        uintN diff_;                                                          \
>+        JS_ASSERT(js_CodeSpec[op].length == 1);                               \
>+        diff_ = (uintN) regs.pc[1] - (uintN) JSOP_IFEQ;                       \
>+        if (diff_ <= 1) {                                                     \
>+            regs.sp -= spdec;                                                 \
>+            if (cond == (diff_ != 0)) {                                       \
>+                ++regs.pc;                                                    \
>+                len = GET_JUMP_OFFSET(regs.pc);                               \
>+                CHECK_BRANCH(len);                                            \
>+                DO_NEXT_OP(len);                                              \
>+            }                                                                 \
>+            len = 1 + JSOP_IFEQ_LENGTH;                                       \
>+            DO_NEXT_OP(len);                                                  \
>+        }                                                                     \
>+    JS_END_MACRO

Right, unification of DO_NEXT_OP only harms here. With computed goto it would decrease usability of branch prediction since more code paths go through the single place. And with MSVC DO_NEXT_OP(len) is just a goto so no win from the unification.


>+
> #define RELATIONAL_OP(OP)                                                     \
>     JS_BEGIN_MACRO                                                            \
>         rval = FETCH_OPND(-1);                                                \
>         lval = FETCH_OPND(-2);                                                \
>         /* Optimize for two int-tagged operands (typical loop control). */    \
>         if ((lval & rval) & JSVAL_INT) {                                      \
>             ltmp = lval ^ JSVAL_VOID;                                         \
>             rtmp = rval ^ JSVAL_VOID;                                         \
>@@ -3457,16 +3479,17 @@ interrupt:
>                 str2 = JSVAL_TO_STRING(rval);                                 \
>                 cond = js_CompareStrings(str, str2) OP 0;                     \
>             } else {                                                          \
>                 VALUE_TO_NUMBER(cx, -2, lval, d);                             \
>                 VALUE_TO_NUMBER(cx, -1, rval, d2);                            \
>                 cond = JSDOUBLE_COMPARE(d, OP, d2, JS_FALSE);                 \
>             }                                                                 \
>         }                                                                     \
>+        TRY_BRANCH_AFTER_COND(cond, 2);                                       \
>         regs.sp--;                                                            \
>         STORE_OPND(-1, BOOLEAN_TO_JSVAL(cond));                               \
>     JS_END_MACRO
> 
> /*
>  * NB: These macros can't use JS_BEGIN_MACRO/JS_END_MACRO around their bodies
>  * because they begin if/else chains, so callers must not put semicolons after
>  * the call expressions!
>@@ -3545,16 +3568,17 @@ interrupt:
>                     cond = js_EqualStrings(str, str2) OP JS_TRUE;             \
>                 } else {                                                      \
>                     VALUE_TO_NUMBER(cx, -2, lval, d);                         \
>                     VALUE_TO_NUMBER(cx, -1, rval, d2);                        \
>                     cond = JSDOUBLE_COMPARE(d, OP, d2, IFNAN);                \
>                 }                                                             \
>             }                                                                 \
>         }                                                                     \
>+        TRY_BRANCH_AFTER_COND(cond, 2);                                       \
>         regs.sp--;                                                            \
>         STORE_OPND(-1, BOOLEAN_TO_JSVAL(cond));                               \
>     JS_END_MACRO
> 
>           BEGIN_CASE(JSOP_EQ)
>             EQUALITY_OP(==, JS_FALSE);
>           END_CASE(JSOP_EQ)
> 
>@@ -3562,16 +3586,17 @@ interrupt:
>             EQUALITY_OP(!=, JS_TRUE);
>           END_CASE(JSOP_NE)
> 
> #define STRICT_EQUALITY_OP(OP)                                                \
>     JS_BEGIN_MACRO                                                            \
>         rval = FETCH_OPND(-1);                                                \
>         lval = FETCH_OPND(-2);                                                \
>         cond = js_StrictlyEqual(cx, lval, rval) OP JS_TRUE;                   \
>+        TRY_BRANCH_AFTER_COND(cond, 2);                                       \
>         regs.sp--;                                                            \
>         STORE_OPND(-1, BOOLEAN_TO_JSVAL(cond));                               \
>     JS_END_MACRO
> 
>           BEGIN_CASE(JSOP_STRICTEQ)
>             STRICT_EQUALITY_OP(==);
>           END_CASE(JSOP_STRICTEQ)
>
Attachment #325951 - Flags: review?(igor) → review+
The branch-not-taken path becomes no longer (branch on false if condition around the len = GET_JUMP_OFFSET(...) code to set len to a constant, and then to fall through from this short else into common DO_NEXT_OP), but the branch-taken path has a jump around the else, in the case where DO_NEXT_OP is unified.

This jump from end of then over short else is unconditional, so no prediction is necessary. But it apparently costs, or something else is going on -- I will look at the optimized code when I have time.

/be
I also forgot to mention that DO_OP -> DO_NEXT_OP would also help the changes dur to the bug 312354.
(In reply to comment #7)

To clarify, that unification means to replace the following code in the patch,

+            if (cond == (diff_ != 0)) {                                       \
+                ++regs.pc;                                                    \
+                len = GET_JUMP_OFFSET(regs.pc);                               \
+                CHECK_BRANCH(len);                                            \
+                DO_NEXT_OP(len);                                              \
+            }                                                                 \
+            len = 1 + JSOP_IFEQ_LENGTH;                                       \
+            DO_NEXT_OP(len);                                                  \

with:

+            if (cond == (diff_ != 0)) {                                       \
+                ++regs.pc;                                                    \
+                len = GET_JUMP_OFFSET(regs.pc);                               \
+                CHECK_BRANCH(len);                                            \
+            } else                                                            \
+                len = 1 + JSOP_IFEQ_LENGTH;                                   \
+            }                                                                 \
+            DO_NEXT_OP(len);                                                  \

I think the non-unified version is faster because the dispatching to the next opcode is carried out in 2 different places. Thus, when there is a correlation between the code in the js_Interpret that will be executed when the "if" condition is true and a separated correlation with the false case, the non-unified version through branch prediction will take advantage of it. On the other hand, the unification of jumps will destroy this correlation. 

It would be interesting to extends the opcode metering to record not only bytecode pairs, but also the jump condition for the first bytecode in the pair.
I thought from other work that gcc was tail-merging aggressively, so would merge those DO_NEXT_OP() calls (most of them -- uncommon len computation of course), but you are right: they're distinct (adjacent) basic blocks. Huzzah!

changeset:   15464:9ceee4d8b366
tag:         tip
user:        Brendan Eich <brendan@mozilla.org>
date:        Fri Jun 20 13:17:51 2008 -0700
summary:     Fuse branch after relational or equality op (363534, r=igor).

/be
Status: ASSIGNED → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Flags: in-testsuite-
Flags: in-litmus-
You need to log in before you can comment on or make changes to this bug.