Closed Bug 418641 Opened 14 years ago Closed 14 years ago

++ and -- cases in the interpreter generates too much code

Categories

(Core :: JavaScript Engine, enhancement)

enhancement
Not set
normal

Tracking

()

VERIFIED FIXED

People

(Reporter: igor, Assigned: igor)

References

Details

(Keywords: memory-footprint)

Attachments

(2 files, 5 obsolete files)

Currently in the interpreter loop all the code implementing various pre- and post-increment/decrement bytecodes uses NONINT_INCREMENT_OP_MIDDLE macro to implement non-integer case of the operation. Since the macro contains a lot of code, its expansion at three places leads to bloat in js_Interpret just to support this the rarely executed case.

It would be nice to replace the macro by a function to shrink the code.
No longer depends on: 418128
The test cases covers most of ++/-- bytecodes to ensure the operation correctness. The only exception is fast global bytecodes.
Attached patch v1 (obsolete) — Splinter Review
The patch moves NONINT_INCREMENT_OP_MIDDLE into separated function NonIntIncDec and reorganizes fast ++/-- cases to share more code. 

With GCC 4.3 on Fedora-9 alpha the patch shrinks the size of jsinterp.o from 54756 to 52288 or by 2468 bytes (4.5%).
To implement the bug 419632 the code would need stronger rooting. So I will update this patch so the new implementation of ++/-- can provide all the necessary strong roots for allocating the double numbers.
Blocks: 419632
Attached patch v2 (obsolete) — Splinter Review
the new version makes ++/-- code not to rely on weak rooting of numbers at all so that can be switched of. For that it was necessary to have extra 2 slots on the stack so the patch added JS_TMPSLOT2. The patch also changes the meaning of the extra slots to mean an extra space besides JSCodeSpec.uses that should exist on the stack. This is the reason the patch has added JS_TMPSLOT2 for ++name-like operations despite the fact that JSCodeSpec.uses is zero and JSCodeSpec.defs is one for them. The extra slot just makes sure that the space is available after nuses. If some of it will be used for defs, then fine.
Attachment #304578 - Attachment is obsolete: true
Attachment #305838 - Flags: review?(brendan)
Version: unspecified → Trunk
Attached patch v3 (obsolete) — Splinter Review
The new version re-syncs v2 with trunk changes.
Attachment #305838 - Attachment is obsolete: true
Attachment #307360 - Flags: review?(brendan)
Attachment #305838 - Flags: review?(brendan)
Comment on attachment 307360 [details] [diff] [review]
v3

>-    jsval *vp, lval, rval, ltmp, rtmp;
>+    jsval *vp, lval, rval, ltmp, rtmp, limit, delta, delta2;

Suggest putting novelties in blocks, if necessary blocks around numerous "cases" (computed goto target / computed goto opcode implementations). We have some evidence (IIRC you did) that top-level locals hurt optimization, and anyway these additions are a bit too special to deserve to take up such space at the top ;-).

Same for op2.

>+                    sp[-1] = v + delta;

Instead of delta, perhaps incr or adjust? Just a thought.

>+          finish_prop_incop:
>+            if (cs->nuses == 0) {
>+                /* sp[-1] already contains the result of name increment. */
>+            } else {
>+                rtmp = sp[-1];
>+                sp -= cs->nuses;
>+                sp[-1] = rtmp;
>+            }
>+            len = cs->length;
>             DO_NEXT_OP(len);

The vaguer convention for label name is end_*op and it can be a label at the very end, or at an epilog such as this. Perhaps finish_*op is better, but end_*op is the current convention. Your call.

>+            if (JS_LIKELY(JSVAL_IS_INT(rval) && rval != limit)) {

In the rest of the code, "max" is greatest possible value, limit is one beyond or fencepost. Suggest maxval rather than limit.

>+ * Find the results of incrementing or decrementing *vp when it is expected
>+ * that it will not fit JS int. On exit for pre-increments both vp and vp2

s/JS int/jsval int/

/be
Attached patch v4 (obsolete) — Splinter Review
The new version addresses the nits and restricts the range of fast inc/dec operations to [-2*29, 2**29), not (-2*30, 2**30). Such int jsvals do not require overflow checks and can be tested via:

#define CAN_DO_FAST_INC_DEC(v)     (((((v) << 1) ^ v) & 0x80000001) == 1)

With GCC it saved 97 bytes of code as the new checks avoids loading of max/min constants for overflow detection.
Attachment #307360 - Attachment is obsolete: true
Attachment #307433 - Flags: review?(brendan)
Attachment #307360 - Flags: review?(brendan)
Attached patch v5 (obsolete) — Splinter Review
The new version adds static asserts to verify that CAN_DO_FAST_INC_DEC(v) is false when v is INT_TO_JSVAL(JSVAL_INT_MIN|MAX) or JSVAL_VOID.
Attachment #307433 - Attachment is obsolete: true
Attachment #307453 - Flags: review?(brendan)
Attachment #307433 - Flags: review?(brendan)
Comment on attachment 307453 [details] [diff] [review]
v5

>+#define CAN_DO_FAST_INC_DEC(v)     (((((v) << 1) ^ v) & 0x80000001) == 1)

Good to know it not de-optimize any increment ops we care about in at least one popular benchmark ;-).

>+JS_STATIC_ASSERT(JSVAL_INT == 1);
>+JS_STATIC_ASSERT(!CAN_DO_FAST_INC_DEC(JSVAL_VOID));
>+JS_STATIC_ASSERT(!CAN_DO_FAST_INC_DEC(INT_TO_JSVAL(JSVAL_INT_MIN)));
>+JS_STATIC_ASSERT(!CAN_DO_FAST_INC_DEC(INT_TO_JSVAL(JSVAL_INT_MAX)));
>+JS_STATIC_ASSERT(!CAN_DO_FAST_INC_DEC(JSVAL_VOID));

Righteous.

>+ * Find the results of incrementing or decrementing *vp. On exit for
>+ * pre-increments both vp and vp2 will contain the result. For post-increments

"For pre-increments, both *vp and *vp2 will contain the result on return." etc.

Need caffeine and I'll do another review, just wanted to note this -- great patch AFAICT at this point. Let's optimize with r+ if it passes mochitest and js tests.

/be
Attachment #307453 - Flags: review?(brendan) → review+
Attached patch v6Splinter Review
The new version addresses the nits from the comment 9.
Attachment #307453 - Attachment is obsolete: true
Comment on attachment 307501 [details] [diff] [review]
v6

The patch shrinks the interpreter by 2.5K and allows to implement faster path for allocating double values.
Attachment #307501 - Flags: review+
Attachment #307501 - Flags: approval1.9?
The patch from the comment 10 passed mochi tests (Linux build) and the test suite in jsshell (thread-safe debug build).
Comment on attachment 307501 [details] [diff] [review]
v6

a1.9+=damons
Attachment #307501 - Flags: approval1.9? → approval1.9+
I checked in the patch from comment 10 to the trunk:

http://bonsai.mozilla.org/cvsquery.cgi?module=PhoenixTinderbox&branch=HEAD&cvsroot=%2Fcvsroot&date=explicit&mindate=1204748522&maxdate=1204748925&who=igor%25mir2.org
Status: NEW → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED
Keywords: footprint
/cvsroot/mozilla/js/tests/js1_7/regress/regress-418641.js,v  <--  regress-418641.js
initial revision: 1.1
Flags: in-testsuite+
Flags: in-litmus-
v 1.9.0
Status: RESOLVED → VERIFIED
You need to log in before you can comment on or make changes to this bug.