Closed Bug 469942 Opened 13 years ago Closed 13 years ago

TM: 20% slower to compute -""


(Core :: JavaScript Engine, defect)

Not set





(Reporter: jruderman, Assigned: Waldo)


(Keywords: perf, testcase, verified1.9.1, Whiteboard: fixed-in-tracemonkey)


(1 file, 1 obsolete file)

About 20% slower with -j than without:

for (var i=0;i<3000000;++i) -"";
Here's a more testcase that I think is more likely to be similar to real-world scripts.  It shows a 22% slowdown:

for (var i=0;i<3000000;++i) +"3";
Abort recording (line 1, pc 14): JSOP_POS.
recording starting from -e:1@11
Assignee: general → jwalden+bmo
Attached patch patch, crashes for +[] in a loop (obsolete) — Splinter Review
This implements unary imacros for JSOP_POS (+). The same should be done for JSOP_NEG. I get the following crash with the current patch:

Assertion failure: fp->imacpc ? atoms == COMMON_ATOMS_START(&rt->atomState) && GET_INDEX(regs.pc + 0) < js_common_atom_count : (size_t)(atoms - script->atomMap.vector) < (size_t)(script->atomMap.length - GET_INDEX(regs.pc + 0)), at ../jsinterp.cpp:5332

Brendan, any suggestion?
Note: with this patch the body of the loop becomes empty (dead code elimination!). The speedup should be pretty substantial.
I didn't see that Waldo grabbed this one. Its all yours from here. You might want to squash the other unary ops in the same patch.
If you add imacros for an opcode that has not any yet, then you must update jstracer.h's RECORD_ARGS macro body, look for JSOP_IS_BINARY. We may need JSOP_IS_UNARY. (If the opcode already has imacros, you must add to the .igroup for that opcode or opcode range, of course.)

The assembler does not auto-generate anything used by RECORD_ARGS, and I didn't want to try in the first cut. We could add a JOF_IMACRO format flag to jsopcode.tbl but even then there'd be no imacro_asm.js-driven generation. We could certainly make a separate table (bitmap, even) but it means indexing and loading instead of just range-testing opcodes.

Note how (per the comment) all the logic to test whether the opcode could call an imacro in RECORD_ARGS is *compile-time* -- the point is to erase this code bloat for most ops, expanding it in the R_JSOP_*: labels (in the indirect-threaded interpreter) only for those that need it.

So a lookup table or bitmap would defeat this code size savings.

I say hand-craft RECORD_ARGS' constant condition expression for now. Add JSOP_IS_UNARY or just test for x == JSOP_NEG || x == JSOP_POS.

.igroup wants one opcode after the group name, or a dense range of opcodes. To handle JSOP_NEG and JSOP_POS, suggest swapping JSOP_NEW and JSOP_POS in jsopcode.tbl (updating JSXDR_BYTECODE_VERSION in jsxdrapi.h of course) to make POS come right after NEG.

BTW, the imacro_asm source language of course lets you write add, sub, etc. instead of JSOP_ADD, JSOP_SUB -- but for the .igroup second operand, which is an opcode or opcode range, I purposely chose to require the JSOP_ADD, e.g., spelling.

The idea is to make the opcodes in the group as concrete as possible in terms of jsopcode.tbl order and spelling.

But maybe this is just unwanted verbosity (and effort in!). Comments welcome.

We could also extend the source language to allow op1,op2-op3 and the like, but I'm loath to make that regexp any more complicated, and making grouped opcodes adjacent wins in perf/codesize anyway.

Attached patch PatchSplinter Review
h-118:~/moz/shell-js/js/src jwalden$ echo 'for (var i=0;i<3000000;++i) -"";' | time ./js
        0.66 real         0.54 user         0.07 sys
h-118:~/moz/shell-js/js/src jwalden$ echo 'for (var i=0;i<3000000;++i) -"";' | time ./js -j
        0.09 real         0.03 user         0.02 sys

There was a suggestion to do all the unary opcodes at once, but I think it's probably best to do that in other bugs, especially given how fragile the imacros.c.out generation step is.
Attachment #353367 - Attachment is obsolete: true
Attachment #353887 - Flags: review?(brendan)
Attachment #353887 - Flags: review?(brendan) → review+
Comment on attachment 353887 [details] [diff] [review]

>+.igroup unary JSOP_NEG-JSOP_POS
>+    .imacro sign                        # obj
>+        dup                             # obj obj
>+        dup                             # obj obj obj
>+        getprop valueOf                 # obj obj valueOf
>+        ifprimtop 2                     # obj obj valueOf
>+        swap                            # obj valueOf obj
>+        call 0                          # obj lval
>+        ifprimtop 1                     # obj lval
>+        goto 3                          # obj lval
>+1:      swap                            # lval obj
>+        pop                             # lval
>+        goto 4                          # lval
>+2:      pop                             # obj obj
>+3:      pop                             # obj
>+        callprop toString               # toString obj
>+        call 0                          # lval
>+        primtop                         # lval
>+4:      imacop                          # aval
>+        stop
>+    .end
> .igroup apply JSOP_APPLY
>     .imacro apply0                          # apply fun this arr

Behold the righteous asm-style inline comment start column of 40! Thanks for using it. Why didn't gal? :-P

Fix at will, attach a diff -w if you can.

>diff --git a/js/src/jsinterp.cpp b/js/src/jsinterp.cpp
>--- a/js/src/jsinterp.cpp
>+++ b/js/src/jsinterp.cpp
>@@ -6804,6 +6804,16 @@ js_Interpret(JSContext *cx)
> #endif /* JS_HAS_GENERATORS */
>+            lval = FETCH_OPND(-1);
>+            if (!JSVAL_IS_PRIMITIVE(lval)) {
>+                js_ReportValueError2(cx, JSMSG_CANT_CONVERT_TO,
>+                                    JSDVG_SEARCH_STACK, lval, NULL,
>+                                    "primitive type");
>+                goto error;
>+            }

Please put this right after JSOP_IFPRIMTOP.

>@@ -5847,7 +5851,60 @@ TraceRecorder::record_JSOP_NEG()
>         set(&v, a);
>         return true;
>     }
>-    return false;
>+    if (JSVAL_IS_NULL(v)) {
>+        jsdpun u;
>+        u.d = -0.0;
>+        set(&v, lir->insImmq(u.u64));
>+        return true;
>+    }
>+    LIns* args[] = { get(&v), cx_ins };

Looks like this is commoned for use in both remaining cases:

>+    if (JSVAL_IS_STRING(v)) {
>+        set(&v, lir->ins1(LIR_fneg, lir->insCall(&js_StringToNumber_ci, args)));
>+        return true;
>+    }
>+    if (JSVAL_TAG(v) == JSVAL_BOOLEAN) {
>+        LIns* args[] = { get(&v), cx_ins };

... but recomputed here?

>+        set(&v, lir->ins1(LIR_fneg, lir->insCall(&js_BooleanOrUndefinedToNumber_ci, args)));
>+        return true;
>+    }
>+    JS_NOT_REACHED("unhandled type in JSOP_NEG");
>+    return false;

Indeed there is no unhandled type. jsval tag is three bits, low bit set means int. You've handled all of those + 2 (double) by handling numbers, also object and null, and finally string and boolean.

So rather than waste space and beg questions, common args between the string and boolean cases, and do not test tag for boolean, rather assert it and make the final basic block be the js_BooleanOrUndefinedToNumber one that ends with return true at the bottom of the function.

>+    jsval& r = stackval(-1);
>+    if (!JSVAL_IS_PRIMITIVE(r))
>+        return call_imacro(unary_imacros.sign);
>+    if (isNumber(r))
>+        return true;
>+    if (JSVAL_IS_NULL(r)) {
>+        set(&r, lir->insImmq(0));
>+        return true;
>+    }
>+    LIns* args[] = { get(&r), cx_ins };
>+    set(&r, lir->insCall(JSVAL_IS_STRING(r)
>+                         ? &js_StringToNumber_ci
>+                         : &js_BooleanOrUndefinedToNumber_ci,
>+                         args));
>+    return true;

Which matches this better. Indeed, if you can share more code by using this pattern in the previous method, please do so.

>+#define UNUSED(n)                        \
>+JS_REQUIRES_STACK bool                   \
>+TraceRecorder::record_JSOP_UNUSED##n() { \
>+    JS_NOT_REACHED("JSOP_UNUSED" # n);   \
>+    return false;                        \

House style indents macro bodies four chars, and puts \ in column 79. Don't cheat :-P.

r=me with these picked, beware conflicts with igor's patch for bug 469233.

Checked into TM, with above changes made; will fix gal's imacro indentation at some later time when I happen to be touching the imacros, not now, given the fragility of the imacros.c.out generation step (since it triggers a rebuild of ./js, which means when you run it you have to have only changes to imacros.jsasm in your tree (or use a shell elsewhere, bleh).

I've used +str commonly in my own code for converting to number, and I wouldn't be surprised if others out there use it as well.  The speedup would definitely be nice for that purpose in 191, especially since right now we only trace these things if they're nops, which is a bit stupid when it's so easy to trace the slightly-harder cases (and with imacros easy to trace even the hard cases, so long as they stay in code we already understand).
Flags: blocking1.9.1?
Whiteboard: fixed-in-tracemonkey
merged to mc
Closed: 13 years ago
Flags: blocking1.9.1? → blocking1.9.1+
Resolution: --- → FIXED
test included in js1_8_1/trace/trace-test.js
Flags: in-testsuite+
Flags: in-litmus-
v 1.9.1, 1.9.2
You need to log in before you can comment on or make changes to this bug.