Closed
Bug 475761
Opened 16 years ago
Closed 16 years ago
TM: js_Any_GetProp and friends can reenter
Categories
(Core :: JavaScript Engine, defect)
Tracking
()
VERIFIED
FIXED
People
(Reporter: jorendorff, Assigned: jorendorff)
References
Details
(Keywords: verified1.9.1)
Attachments
(3 files, 1 obsolete file)
|
1.77 KB,
patch
|
Details | Diff | Splinter Review | |
|
24.69 KB,
patch
|
brendan
:
review+
|
Details | Diff | Splinter Review |
|
15.19 KB,
patch
|
Details | Diff | Splinter Review |
This code reenters the interpreter:
var obj = {a:1, b:1, c:1, d:1, get e() 1000 };
for (var p in obj)
obj[p];
The is a lot like bug 468782, and I think a similar fix would work.
| Assignee | ||
Comment 1•16 years ago
|
||
This has to be fixed before bug 462027 (a release blocker) can land. Patch coming.
Assignee: general → jorendorff
| Assignee | ||
Comment 2•16 years ago
|
||
This is percolating through the try server now. There is some risk of having to review again later; but trace-tests and regression tests pass here.
Like bug 468782, this patch is a first step; the real fix comes when these new builtins become _FAIL functions that can deep-bail.
Attachment #359427 -
Flags: review?(brendan)
| Assignee | ||
Comment 3•16 years ago
|
||
The Try Server came up red on Windows due to a silly mistake.
Still waiting for Talos to finish (on Mac and Linux)...
| Assignee | ||
Comment 4•16 years ago
|
||
Attachment #359427 -
Attachment is obsolete: true
Attachment #359452 -
Flags: review?(brendan)
Attachment #359427 -
Flags: review?(brendan)
| Assignee | ||
Comment 5•16 years ago
|
||
Talos finished. No apparent perf regression.
Do I need to root the ids in the JSFastNative versions of those functions?
Comment 6•16 years ago
|
||
Comment on attachment 359452 [details] [diff] [review]
v2
>+.igroup setelem JSOP_SETELEM
>+
>+ .imacro setprop # obj name val
>+ dup # obj name val val
>+ pick 3 # name val val obj
>+ callbuiltin (JSBUILTIN_SetProperty) # name val val fun obj
>+ pick 4 # val val fun obj name
>+ pick 4 # val fun obj name val
>+ call 2 # val junk
>+ pop # val
>+ stop
>+ .end
>+
>+ .imacro setelem # obj i val
>+ dup # obj i val val
>+ pick 3 # i val val obj
>+ callbuiltin (JSBUILTIN_SetElement) # i val val fun obj
>+ pick 4 # val val fun obj i
>+ pick 4 # val fun obj i val
>+ call 2 # val junk
>+ pop # val
>+ stop
>+ .end
>+
>+.end
>+
>+.igroup initelem JSOP_INITELEM
>+
>+ .imacro initelem # obj i val
>+ pick 2 # i val obj
>+ dup # i val obj obj
>+ callbuiltin (JSBUILTIN_SetElement) # i val obj fun obj
>+ pick 4 # val obj fun obj i
>+ pick 4 # obj fun obj i val
>+ call 2 # obj junk
>+ pop # obj
>+ stop
>+ .end
>+
Indentaton wackiness (2 vs. 4 space offset) above.
>+static jsval FASTCALL
>+GetProperty_tn(JSContext *cx, JSObject *obj, JSString *name)
>+{
>+ jsid id;
>+ jsval v;
>+
>+ if (!js_ValueToStringId(cx, STRING_TO_JSVAL(name), &id)
>+ || !OBJ_GET_PROPERTY(cx, obj, id, &v)) {
Prevailing style has || and && at end of line.
>+SetProperty_tn(JSContext* cx, JSObject* obj, JSString* idstr, jsval v)
>+{
>+ jsid id;
>+
>+ if (!js_ValueToStringId(cx, STRING_TO_JSVAL(idstr), &id)
>+ || !OBJ_SET_PROPERTY(cx, obj, id, &v)) {
Ditto.
Great patch, more imacros to the rescue. r=me with fixes + JSOP_PICK stack balance issue sorted out (mentioned on IRC).
/be
Attachment #359452 -
Flags: review?(brendan) → review+
| Assignee | ||
Comment 7•16 years ago
|
||
Pushed with those fixes, rooting for the ids in {Get,Set}{Property,Element}, an Emacs modeline for imacros.jsasm, and a small bugfix:
The JSFastNative versions of Set{Property,Element} weren't setting the return value; they were just returning whatever the setter left in *vp. They must set *vp to a boolean or undefined, even though the next instruction is a POP. Otherwise the snapshot after the call will have a random type on the top of the stack (the type of the value at record-time), leading to trouble at execute-time when side exiting there.
http://hg.mozilla.org/tracemonkey/rev/76a7344912ae
Comment 8•16 years ago
|
||
This patch is a 30ms regression on ./bench.sh on my machine. According to the no-prisoners policy we should back out, but otoh we are behind schedule and this blocks the rest of Jason's work. Maybe we can fix the perf regression and leave the patch in in the meantime?
Comment 9•16 years ago
|
||
No difference in trace stats for sunspider (same traces, same exits/entries, same trigger counts).
Comment 10•16 years ago
|
||
TEST COMPARISON FROM TO DETAILS
=============================================================================
** TOTAL **: *1.022x as slow* 1061.0ms +/- 0.3% 1084.2ms +/- 0.3% significant
=============================================================================
3d: - 154.5ms +/- 0.8% 154.5ms +/- 0.4%
cube: - 41.4ms +/- 2.0% 40.8ms +/- 2.0%
morph: *1.036x as slow* 30.3ms +/- 1.6% 31.4ms +/- 1.6% significant
raytrace: 1.006x as fast 82.8ms +/- 0.4% 82.3ms +/- 0.6% significant
access: *1.057x as slow* 132.6ms +/- 1.0% 140.1ms +/- 1.4% significant
binary-trees: *1.030x as slow* 40.0ms +/- 1.5% 41.2ms +/- 0.7% significant
fannkuch: *1.107x as slow* 55.9ms +/- 0.9% 61.9ms +/- 3.1% significant
nbody: ?? 23.8ms +/- 1.9% 24.0ms +/- 0.0% not conclusive: might be *1.008x as slow*
nsieve: ?? 12.9ms +/- 1.8% 13.0ms +/- 5.2% not conclusive: might be *1.008x as slow*
bitops: ?? 37.9ms +/- 1.7% 38.8ms +/- 3.1% not conclusive: might be *1.024x as slow*
3bit-bits-in-byte: 1.25x as fast 2.0ms +/- 0.0% 1.6ms +/- 23.1% significant
bits-in-byte: 1.064x as fast 10.0ms +/- 3.4% 9.4ms +/- 3.9% significant
bitwise-and: - 2.8ms +/- 10.8% 2.7ms +/- 12.8%
nsieve-bits: *1.087x as slow* 23.1ms +/- 1.0% 25.1ms +/- 2.8% significant
controlflow: 1.021x as fast 33.8ms +/- 0.9% 33.1ms +/- 0.7% significant
recursive: 1.021x as fast 33.8ms +/- 0.9% 33.1ms +/- 0.7% significant
crypto: - 61.3ms +/- 1.0% 61.2ms +/- 0.7%
aes: ?? 35.1ms +/- 0.6% 35.3ms +/- 1.0% not conclusive: might be *1.006x as slow*
md5: 1.031x as fast 19.8ms +/- 1.5% 19.2ms +/- 1.6% significant
sha1: ?? 6.4ms +/- 5.8% 6.7ms +/- 5.2% not conclusive: might be *1.047x as slow*
date: - 173.6ms +/- 0.3% 172.9ms +/- 0.5%
format-tofte: - 67.9ms +/- 0.6% 67.7ms +/- 0.7%
format-xparb: - 105.7ms +/- 0.3% 105.2ms +/- 0.6%
math: ?? 39.8ms +/- 1.4% 40.0ms +/- 1.2% not conclusive: might be *1.005x as slow*
cordic: ?? 19.6ms +/- 1.9% 19.8ms +/- 1.5% not conclusive: might be *1.010x as slow*
partial-sums: - 14.0ms +/- 0.0% 14.0ms +/- 0.0%
spectral-norm: - 6.2ms +/- 4.9% 6.2ms +/- 4.9%
regexp: - 56.2ms +/- 0.5% 56.1ms +/- 0.7%
dna: - 56.2ms +/- 0.5% 56.1ms +/- 0.7%
string: *1.044x as slow* 371.3ms +/- 0.4% 387.5ms +/- 0.4% significant
base64: *1.006x as slow* 16.0ms +/- 0.0% 16.1ms +/- 1.4% significant
fasta: ?? 76.1ms +/- 0.7% 76.2ms +/- 0.6% not conclusive: might be *1.001x as slow*
tagcloud: *1.021x as slow* 108.3ms +/- 0.3% 110.6ms +/- 1.5% significant
unpack-code: *1.096x as slow* 138.7ms +/- 0.5% 152.0ms +/- 0.7% significant
validate-input: ?? 32.2ms +/- 1.4% 32.6ms +/- 1.1% not conclusive: might be *1.012x as slow*
Comment 11•16 years ago
|
||
unpack-code seems to have lost a lot, fannkuch as well. unpack-code uses string element lookup, fannkuch arrays.
Comment 12•16 years ago
|
||
We could have faster paths for already-string, already-interned-as-atom, id values, maybe. The imacro stack manipulation boils away. Where's the time going?
Is JSOP_CALLBUILTIN's LIR any less tight than the old way (which could re-enter)? It would be helpful to see LIR or asm diffs.
/be
Comment 13•16 years ago
|
||
var q = {};
q.z = 5.3;
var x = 5.3;
var s = "z";
for (var i = 0; i < 10; ++i)
x += q[s];
Run without and with the patch and diff -u of verbose output:
whale:src gal$ diff -u /tmp/before /tmp/after
--- /tmp/before 2009-01-29 01:24:38.000000000 -0800
+++ /tmp/after 2009-01-29 01:25:18.000000000 -0800
@@ -53,21 +53,33 @@
guard(shape) = eq shape, 148
xf2: xf guard(shape) -> 0:60 sp+24 rp+0
- js_Any_getprop1 = js_Any_getprop ( cx $global1 $global2 )
- eq1 = eq js_Any_getprop1, JSVAL_ERROR_COOKIE
- xt2: xt eq1 -> 0:60 sp+24 rp+0
+00000: swap
+00000: callbuiltin 2
+00000: pick 2
+00000: call 1
+ sti sp[16] = $global1
+ sti sp[8] = $global2
+ sti sp[24] = $global1
+ obj
+ sti sp[16] = obj
+ sti sp[8] = obj
+ sti sp[16] = $global1
+ sti sp[24] = $global2
+ GetProperty_tn1 = GetProperty_tn ( cx $global1 $global2 )
+ sti sp[8] = GetProperty_tn1
JSVAL_TAGMASK
- and1 = and js_Any_getprop1, JSVAL_TAGMASK
+ and1 = and GetProperty_tn1, JSVAL_TAGMASK
2
- eq2 = eq and1, 2
+ eq1 = eq and1, 2
JSVAL_INT
- and2 = and js_Any_getprop1, JSVAL_INT
- or1 = or and2, eq2
- eq3 = eq or1, 0
- xt3: xt eq3 -> 0:60 sp+24 rp+0
+ and2 = and GetProperty_tn1, JSVAL_INT
+ or1 = or and2, eq1
+ eq2 = eq or1, 0
+ xt2: xt eq2 -> 10:60 sp+16 rp+0
- js_UnboxDouble1 = js_UnboxDouble ( js_Any_getprop1 )
+ js_UnboxDouble1 = js_UnboxDouble ( GetProperty_tn1 )
+00000: stop
00061: 6 add
00062: 6 setgvar "x"
00066: 5 incgvar "i"
@@ -82,7 +94,7 @@
#3FF00000:0 /* 1 */
add1 = add ld4, JSVAL_INT
ov1 = ov add1
- xt4: xt ov1 -> 0:66 sp+0 rp+0
+ xt3: xt ov1 -> 0:66 sp+0 rp+0
00069: 5 pop
00070: 5 getgvar "i"
@@ -107,14 +119,14 @@
st gp[680] = add1
loop
-live instruction count 32, total 72, max pressure 9
+live instruction count 33, total 72, max pressure 9
side exits 1
state state = param 0 ecx
state sp sp = ld state[0]
state sp rp rp = ld state[4]
state sp cx cx = ld state[12]
state sp cx gp gp = ld state[8]
-state sp cx gp eos eos = ld state[JSVAL_ERROR_COOKIE]
+state sp cx gp eos eos = ld state[16]
state sp cx gp eor eor = ld state[20]
state sp cx gp globalObj globalObj = ld state[36]
state sp cx gp ld1 ld1 = ld cx[0]
@@ -128,10 +140,11 @@
state sp cx gp $global0 $global1 $global2 sti sp[16] = $global2
state sp cx gp $global0 $global1 $global2 ld2 ld2 = ld $global1[0]
state sp cx gp $global0 $global1 $global2 ld2 ops ops = ld ld2[4]
-state sp cx gp $global0 $global1 $global2 ld2 ld3 ld3 = ld ops[JSVAL_ERROR_COOKIE]
-state sp cx gp $global0 $global1 $global2 shape shape = ld ld2[JSVAL_ERROR_COOKIE]
-state sp gp $global0 js_Any_getprop1 js_Any_getprop1 = js_Any_getprop ( cx $global1 $global2 )
-state sp gp $global0 js_UnboxDouble1 js_UnboxDouble1 = js_UnboxDouble ( js_Any_getprop1 )
+state sp cx gp $global0 $global1 $global2 ld2 ld3 ld3 = ld ops[16]
+state sp cx gp $global0 $global1 $global2 shape shape = ld ld2[16]
+state sp gp $global0 GetProperty_tn1 GetProperty_tn1 = GetProperty_tn ( cx $global1 $global2 )
+state sp gp $global0 GetProperty_tn1 sti sp[8] = GetProperty_tn1
+state sp gp $global0 js_UnboxDouble1 js_UnboxDouble1 = js_UnboxDouble ( GetProperty_tn1 )
state sp gp fadd1 fadd1 = fadd $global0, js_UnboxDouble1
state sp gp stq gp[664] = fadd1
state sp gp ld4 ld4 = ld gp[680]
@@ -143,6 +156,7 @@
state loop
0x256ef0 [prologue]
+ nop6
nop9
push edi
push esi
@@ -150,10 +164,10 @@
push ebp
push ebp
mov ebp,esp
- 0x256f00 [frag entry]
+ 0x256f06 [frag entry]
sub esp,24
-patching 0x256ff3 to 0x256f03
- 0x256f03:
+patching 0x256ff3 to 0x256f09
+ 0x256f09:
compiling trunk 0x257004
state = param 0 ecx
sp = ld state[0]
@@ -173,14 +187,14 @@
sub eax,4096 eax(ld1) ecx(cx) ebx(state) esi(gp) edi(sp)
mov 0(ecx),eax eax(sub1) ecx(cx) ebx(state) esi(gp) edi(sp)
cmp eax,0 eax(sub1) ecx(cx) ebx(state) esi(gp) edi(sp)
- jle 0x2b8f9d ecx(cx) ebx(state) esi(gp) edi(sp)
+ jle 0x2b8fac ecx(cx) ebx(state) esi(gp) edi(sp)
--------------------------------------- exit block (LIR_xt|LIR_xf)
- 0x2b8f9d:
+ 0x2b8fac:
merging registers (intersect) with existing edge
mov ecx,-12(ebp)
mov eax,2453748
mov esp,ebp
- 0x2b8fa7:
+ 0x2b8fb6:
jmp 0x256ff8
--------------------------------------- end exit block 0x257138
$global0 = ldq gp[664]
@@ -191,7 +205,7 @@
sti sp[16] = $global2
ld2 = ld $global1[0]
ops = ld ld2[4]
- ld3 = ld ops[JSVAL_ERROR_COOKIE]
+ ld3 = ld ops[16]
guard(native-map) = eq ld3, OP(&js_ObjectOps)
xf1: xf guard(native-map) -> 0:60 sp+24 rp+0
movq xmm0,664(esi) ecx(cx) ebx(state) esi(gp) edi(sp)
@@ -206,85 +220,74 @@
mov -20(ebp),eax eax(ld2) ecx(cx) edx($global1) ebx(state) esi(gp) edi(sp)
mov eax,4(eax) eax(ld2) ecx(cx) edx($global1) ebx(state) esi(gp) edi(sp)
mov eax,16(eax) eax(ops) ecx(cx) edx($global1) ebx(state) esi(gp) edi(sp)
- cmp eax,647590 eax(ld3) ecx(cx) edx($global1) ebx(state) esi(gp) edi(sp)
+ cmp eax,646854 eax(ld3) ecx(cx) edx($global1) ebx(state) esi(gp) edi(sp)
mov eax,-20(ebp) ecx(cx) edx($global1) ebx(state) esi(gp) edi(sp)
- jne 0x2b8fac eax(ld2) ecx(cx) edx($global1) ebx(state) esi(gp) edi(sp)
+ jne 0x2b8fbb eax(ld2) ecx(cx) edx($global1) ebx(state) esi(gp) edi(sp)
--------------------------------------- exit block (LIR_xt|LIR_xf)
- 0x2b8fac:
+ 0x2b8fbb:
merging registers (intersect) with existing edge
mov ecx,-12(ebp)
mov eax,2453880
mov esp,ebp
- 0x2b8fb6:
+ 0x2b8fc5:
jmp 0x256ff8
--------------------------------------- end exit block 0x2571c0
- shape = ld ld2[JSVAL_ERROR_COOKIE]
+ shape = ld ld2[16]
guard(shape) = eq shape, 148
xf2: xf guard(shape) -> 0:60 sp+24 rp+0
mov eax,16(eax) eax(ld2) ecx(cx) edx($global1) ebx(state) esi(gp) edi(sp)
cmp eax,148 eax(shape) ecx(cx) edx($global1) ebx(state) esi(gp) edi(sp)
mov eax,-16(ebp) ecx(cx) edx($global1) ebx(state) esi(gp) edi(sp)
- jne 0x2b8fbb eax($global2) ecx(cx) edx($global1) ebx(state) esi(gp) edi(sp)
+ jne 0x2b8fca eax($global2) ecx(cx) edx($global1) ebx(state) esi(gp) edi(sp)
--------------------------------------- exit block (LIR_xt|LIR_xf)
- 0x2b8fbb:
+ 0x2b8fca:
merging registers (intersect) with existing edge
mov ecx,-12(ebp)
mov eax,2453968
mov esp,ebp
- 0x2b8fc5:
+ 0x2b8fd4:
jmp 0x256ff8
--------------------------------------- end exit block 0x257218
- js_Any_getprop1 = js_Any_getprop ( cx $global1 $global2 )
- eq1 = eq js_Any_getprop1, JSVAL_ERROR_COOKIE
- xt2: xt eq1 -> 0:60 sp+24 rp+0
+ GetProperty_tn1 = GetProperty_tn ( cx $global1 $global2 )
+ sti sp[8] = GetProperty_tn1
+ and1 = and GetProperty_tn1, JSVAL_TAGMASK
+ eq1 = eq and1, 2
+ and2 = and GetProperty_tn1, JSVAL_INT
+ or1 = or and2, eq1
+ eq2 = eq or1, 0
+ xt2: xt eq2 -> 10:60 sp+16 rp+0
sub esp,12 eax($global2) ecx(cx) edx($global1) ebx(state) esi(gp) edi(sp)
push eax eax($global2) ecx(cx) edx($global1) ebx(state) esi(gp) edi(sp)
- call js_Any_getprop ebx(state) esi(gp) edi(sp)
+ call GetProperty_tn ebx(state) esi(gp) edi(sp)
add esp,12 ebx(state) esi(gp) edi(sp)
- mov ecx,eax eax(js_Any_getprop1) ebx(state) esi(gp) edi(sp)
- cmp ecx,16 ecx(js_Any_getprop1) ebx(state) esi(gp) edi(sp)
- je 0x2b8fca ecx(js_Any_getprop1) ebx(state) esi(gp) edi(sp)
---------------------------------------- exit block (LIR_xt|LIR_xf)
- 0x2b8fca:
- merging registers (intersect) with existing edge
- mov ecx,-12(ebp)
- mov eax,2454060
- mov esp,ebp
- 0x2b8fd4:
- jmp 0x256ff8
---------------------------------------- end exit block 0x257274
- and1 = and js_Any_getprop1, JSVAL_TAGMASK
- eq2 = eq and1, 2
- and2 = and js_Any_getprop1, JSVAL_INT
- or1 = or and2, eq2
- eq3 = eq or1, 0
- xt3: xt eq3 -> 0:60 sp+24 rp+0
- mov eax,ecx ecx(js_Any_getprop1) ebx(state) esi(gp) edi(sp)
- and eax,7 ecx(js_Any_getprop1) ebx(state) esi(gp) edi(sp)
- cmp eax,2 eax(and1) ecx(js_Any_getprop1) ebx(state) esi(gp) edi(sp)
- sete edx ecx(js_Any_getprop1) ebx(state) esi(gp) edi(sp)
- movzx edx,edx ecx(js_Any_getprop1) ebx(state) esi(gp) edi(sp)
- mov eax,ecx ecx(js_Any_getprop1) edx(eq2) ebx(state) esi(gp) edi(sp)
- and eax,1 ecx(js_Any_getprop1) edx(eq2) ebx(state) esi(gp) edi(sp)
- or eax,edx eax(and2) ecx(js_Any_getprop1) edx(eq2) ebx(state) esi(gp) edi(sp)
- test eax,eax eax(or1) ecx(js_Any_getprop1) ebx(state) esi(gp) edi(sp)
- je 0x2b8fd9 ecx(js_Any_getprop1) ebx(state) esi(gp) edi(sp)
+ mov ecx,eax eax(GetProperty_tn1) ebx(state) esi(gp) edi(sp)
+ mov 8(edi),ecx ecx(GetProperty_tn1) ebx(state) esi(gp) edi(sp)
+ mov eax,ecx ecx(GetProperty_tn1) ebx(state) esi(gp) edi(sp)
+ and eax,7 ecx(GetProperty_tn1) ebx(state) esi(gp) edi(sp)
+ cmp eax,2 eax(and1) ecx(GetProperty_tn1) ebx(state) esi(gp) edi(sp)
+ sete edx ecx(GetProperty_tn1) ebx(state) esi(gp) edi(sp)
+ movzx edx,edx ecx(GetProperty_tn1) ebx(state) esi(gp) edi(sp)
+ mov eax,ecx ecx(GetProperty_tn1) edx(eq1) ebx(state) esi(gp) edi(sp)
+ and eax,1 ecx(GetProperty_tn1) edx(eq1) ebx(state) esi(gp) edi(sp)
+ or eax,edx eax(and2) ecx(GetProperty_tn1) edx(eq1) ebx(state) esi(gp) edi(sp)
+ test eax,eax eax(or1) ecx(GetProperty_tn1) ebx(state) esi(gp) edi(sp)
+ je 0x2b8fd9 ecx(GetProperty_tn1) ebx(state) esi(gp) edi(sp)
--------------------------------------- exit block (LIR_xt|LIR_xf)
0x2b8fd9:
merging registers (intersect) with existing edge
mov ecx,-12(ebp)
- mov eax,2454168
+ mov eax,2454128
mov esp,ebp
0x2b8fe3:
jmp 0x256ff8
---------------------------------------- end exit block 0x2572e0
- js_UnboxDouble1 = js_UnboxDouble ( js_Any_getprop1 )
+--------------------------------------- end exit block 0x2572b8
+ js_UnboxDouble1 = js_UnboxDouble ( GetProperty_tn1 )
fadd1 = fadd $global0, js_UnboxDouble1
stq gp[664] = fadd1
ld4 = ld gp[680]
add1 = add ld4, JSVAL_INT
ov1 = ov add1
- xt4: xt ov1 -> 0:66 sp+0 rp+0
+ xt3: xt ov1 -> 0:66 sp+0 rp+0
call js_UnboxDouble ebx(state) esi(gp) edi(sp)
movq xmm0,-8(ebp) ebx(state) esi(gp) edi(sp)
mov ecx,ebx ebx(state) esi(gp) edi(sp) xmm0($global0)
@@ -297,11 +300,11 @@
jo 0x2b8fe8 ecx(state) ebx(add1) esi(gp) edi(sp)
--------------------------------------- exit block (LIR_xt|LIR_xf)
0x2b8fe8:
- mov eax,2454300
+ mov eax,2454260
mov esp,ebp
0x2b8fef:
jmp 0x256ff8
---------------------------------------- end exit block 0x257364
+--------------------------------------- end exit block 0x25733c
st gp[680] = add1
sti sp[0] = add1
sti sp[8] = 10
@@ -314,11 +317,11 @@
jnl 0x2b8ff4 ecx(state) ebx(add1) esi(gp)
--------------------------------------- exit block (LIR_xt|LIR_xf)
0x2b8ff4:
- mov eax,2454416
+ mov eax,2454376
mov esp,ebp
0x2b8ffb:
jmp 0x256ff8
---------------------------------------- end exit block 0x2573d8
+--------------------------------------- end exit block 0x2573b0
st gp[680] = add1
JSVAL_INT
loop
@@ -341,7 +344,7 @@
entering trace at x.js:6@51, native stack slots: 4 code: 0x256ef0
global: double<15.9> object<0x292100:Object> string<0x294650> int<2>
stack:
-leaving trace at x.js:5@75, op=lt, lr=0x25739c, exitType=1, sp=2, calldepth=0, cycles=93900
+leaving trace at x.js:5@75, op=lt, lr=0x257374, exitType=1, sp=2, calldepth=0, cycles=146575
double<58.3> object<0x292100:Object> string<0x294650> int<10>
stack0=int<10> stack1=int<10>
recorder: started(1), aborted(0), completed(1), different header(0), trees trashed(0), slot promoted(0), unstable loop variable(0), breaks(0), returns(0), unstableInnerCalls(0)
Comment 14•16 years ago
|
||
| Assignee | ||
Comment 15•16 years ago
|
||
(In reply to comment #11)
> unpack-code seems to have lost a lot, fannkuch as well. unpack-code uses string
> element lookup, fannkuch arrays.
String element gets and dense array gets/sets should not have been affected. Investigating.
> Is JSOP_CALLBUILTIN's LIR any less tight than the old way (which could
> re-enter)? It would be helpful to see LIR or asm diffs.
I don't know why this patch makes things slower, but N.B. bug 462027 will make it less tight in several ways. There will be a snapshot (LIR_xbarrier) before the call, which forces the function arguments to be written to the native stack. There will be an extra store before the call (cx->bailExit). The guard after the call will have an extra load in it (cx->builtinStatus). The snapshot for the guard after the call will force the return value to be written to the native stack.
We *might* be able to recover the lost perf by throwing away this patch and instead just make js_Any_GetProp and friends more paranoid. Let's look closer first...
| Assignee | ||
Comment 16•16 years ago
|
||
SunSpider tells me:
fannkuch: ?? 67.1ms +/- 2.0% 67.2ms +/- 2.3% not conclusive: might be *1.001x as slow*
unpack-code: ?? 181.3ms +/- 2.0% 184.4ms +/- 1.5% not conclusive: might be *1.017x as slow*
When I run fannkuch with breakpoints on all the new call_imacro paths, none of them hit.
| Assignee | ||
Comment 17•16 years ago
|
||
A full run, after rebooting. (There is no reason for anything to be faster, though, so these numbers are at least questionable.)
TEST COMPARISON FROM TO DETAILS
=============================================================================
** TOTAL **: 1.005x as fast 2354.7ms +/- 0.2% 2342.4ms +/- 0.2% significant
=============================================================================
3d: 1.008x as fast 298.8ms +/- 0.3% 296.4ms +/- 0.4% significant
cube: 1.015x as fast 102.4ms +/- 0.4% 100.9ms +/- 0.5% significant
morph: ?? 94.9ms +/- 0.2% 95.2ms +/- 0.5% not conclusive: might be *1.003x as slow*
raytrace: 1.012x as fast 101.5ms +/- 0.4% 100.3ms +/- 0.5% significant
access: *1.009x as slow* 372.0ms +/- 0.3% 375.4ms +/- 0.4% significant
binary-trees: 1.009x as fast 44.9ms +/- 0.5% 44.5ms +/- 0.8% significant
fannkuch: *1.022x as slow* 139.6ms +/- 0.4% 142.7ms +/- 0.5% significant
nbody: ?? 139.4ms +/- 0.4% 140.1ms +/- 0.3% not conclusive: might be *1.005x as slow*
nsieve: - 48.1ms +/- 0.5% 48.1ms +/- 0.5%
bitops: 1.022x as fast 259.5ms +/- 0.3% 253.9ms +/- 0.3% significant
3bit-bits-in-byte: - 42.6ms +/- 0.9% 42.5ms +/- 1.2%
bits-in-byte: 1.082x as fast 75.3ms +/- 0.5% 69.6ms +/- 0.7% significant
bitwise-and: - 62.0ms +/- 0.5% 62.0ms +/- 0.0%
nsieve-bits: ?? 79.6ms +/- 0.5% 79.8ms +/- 0.4% not conclusive: might be *1.003x as slow*
controlflow: 1.022x as fast 37.7ms +/- 0.9% 36.9ms +/- 0.6% significant
recursive: 1.022x as fast 37.7ms +/- 0.9% 36.9ms +/- 0.6% significant
crypto: 1.010x as fast 156.2ms +/- 0.5% 154.6ms +/- 0.4% significant
aes: - 62.6ms +/- 0.6% 62.5ms +/- 0.6%
md5: - 46.3ms +/- 0.7% 46.0ms +/- 1.0%
sha1: 1.026x as fast 47.3ms +/- 0.7% 46.1ms +/- 0.5% significant
date: 1.009x as fast 205.1ms +/- 0.4% 203.2ms +/- 0.8% significant
format-tofte: ?? 79.6ms +/- 0.6% 79.8ms +/- 0.7% not conclusive: might be *1.003x as slow*
format-xparb: 1.017x as fast 125.5ms +/- 0.5% 123.4ms +/- 1.0% significant
math: 1.008x as fast 266.3ms +/- 0.3% 264.3ms +/- 0.5% significant
cordic: 1.026x as fast 112.2ms +/- 0.4% 109.4ms +/- 0.5% significant
partial-sums: ?? 101.3ms +/- 0.5% 101.6ms +/- 0.8% not conclusive: might be *1.003x as slow*
spectral-norm: ?? 52.8ms +/- 0.9% 53.3ms +/- 0.6% not conclusive: might be *1.009x as slow*
regexp: - 226.4ms +/- 0.2% 225.9ms +/- 0.5%
dna: - 226.4ms +/- 0.2% 225.9ms +/- 0.5%
string: - 532.7ms +/- 0.4% 531.8ms +/- 0.4%
base64: ?? 48.9ms +/- 0.8% 49.4ms +/- 0.7% not conclusive: might be *1.010x as slow*
fasta: - 125.7ms +/- 0.8% 125.4ms +/- 0.5%
tagcloud: - 123.6ms +/- 0.4% 123.3ms +/- 0.5%
unpack-code: 1.005x as fast 168.9ms +/- 0.4% 168.0ms +/- 0.4% significant
validate-input: ?? 65.6ms +/- 0.8% 65.7ms +/- 0.7% not conclusive: might be *1.002x as slow*
| Assignee | ||
Comment 18•16 years ago
|
||
Yeah, I captured the spew for fannkuch before and after this patch, and did
perl -p -i.orig -e 's/0x[0-9a-f]+/0xnnnnnnnn/g' fannkuch-[ab].log
perl -p -i.bak -e 's/cycles=([0-9]+)/cycles=nnnnnn/g' fannkuch-[ab].log
perl -p -i.bak -e 's/,(..)?[a-f0-9]{6,}/,\1nnnnnn/g' fannkuch-[ab].log
This produced two identical files. The new stuff isn't even called.
Comment 19•16 years ago
|
||
I just love benchmarks ... I will rerun again. Might be architecture-specific. My machine is a lot faster than yours (2x), so we must have different CPUs.
Comment 20•16 years ago
|
||
Moved perf discussion to this bug:
https://bugzilla.mozilla.org/show_bug.cgi?id=475998
Comment 21•16 years ago
|
||
Status: NEW → RESOLVED
Closed: 16 years ago
Resolution: --- → FIXED
Updated•16 years ago
|
Flags: blocking1.9.1+
Comment 22•16 years ago
|
||
Flags: in-testsuite+
Flags: in-litmus-
Comment 23•16 years ago
|
||
Keywords: fixed1.9.1
Comment 24•16 years ago
|
||
js1_8_1/trace/trace-test.js
v 1.9.1, 1.9.2
Status: RESOLVED → VERIFIED
Keywords: fixed1.9.1 → verified1.9.1
You need to log in
before you can comment on or make changes to this bug.
Description
•