Closed Bug 619858 Opened 14 years ago Closed 6 years ago

Thread-local storage via an explicit context argument to AVM runtime functions.

Categories

(Tamarin Graveyard :: Virtual Machine, enhancement)

enhancement
Not set
normal

Tracking

(Not tracked)

RESOLVED WONTFIX
Future

People

(Reporter: kpalacz, Assigned: kpalacz)

Details

Attachments

(4 files, 7 obsolete files)

No description provided.
Attached patch: Initial exploration (obsolete)
Introduces thread-local storage in the form of ExecEnv, a stack-allocated structure currently containing gc, core and toplevel. The patched codebase passes acceptance tests.

ExecEnv is passed to many VM runtime functions as well as to jit-compiled code. The jit calling convention has been modified to pass an explicit ExecEnv* argument alongside MethodEnv* to every compiled method. This brute-force approach allows jit-compiled functions to pass ExecEnv* into VM runtime calls. Not all runtime functions have been converted; ExecEnv* can be retrieved from a GCThreadLocal as a cop-out/slow path.

Despite a longer calling convention, the impact on performance is in some cases positive. Here's one datapoint from jsbench (columns: baseline, patched, change; a positive change means the patched build was faster):

untyped:
  test            baseline  patched   change
  Crypt               3090     2819     9.6%
  Euler               6047     6416    -5.7%
  FFT                 5912     5860     0.8%
  HeapSort            2931     3007    -2.5%
  LUFact              4883     4903    -0.4%
  Moldyn              8360     8161     2.4%
  RayTracer           5913     6186    -4.4%
  SOR                26917    26248     2.5%
  Series              7447     7597    -1.9%
  SparseMatmult       8232     8311    -0.9%

typed:
  test            baseline  patched   change
  Crypt                813      804     1.1%
  Euler               7269     6715     8.2%
  FFT                 1111     1125    -1.2%
  HeapSort            1115     1167    -4.4%
  LUFact              1100     1290   -14.7%
  Moldyn              3668     3333    10.0%
  RayTracer           1124     1199    -6.2%
  SOR                 3641     3719    -2.0%
  Series              6331     6405    -1.1%
  SparseMatmult       2276     2269     0.3%

Improvements in untyped Crypt and typed Moldyn appear to be significant (as is the degradation in typed LUFact). Note that not all paths treated as slow are indeed slow. Microbenchmark differences appeared to be insignificant on visual inspection.

One explanation of the observable speedups would be that calls such as ScriptObject::core() and ScriptObject::toplevel() require loads via intermediate memory locations (vtable, traits) which differ from instance to instance, while accesses via ExecEnv result in loads from the vicinity of a single hot memory location.
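To make the shape of the proposal concrete, here is a minimal sketch of an explicit context argument with a thread-local slow path. It is not code from the patch: the stub types, `tls_active_env`, `ExecEnvScope`, `allocFast` and `allocSlow` are hypothetical names invented for illustration, and a plain C++11 `thread_local` stands in for the GCThreadLocal lookup.

```cpp
#include <cassert>

// Hypothetical stand-ins for the VM types named in the comment above.
struct GC       { int allocations = 0; };
struct AvmCore  { int id = 0; };
struct Toplevel { int id = 0; };

// Stack-allocated bundle of per-thread state.
struct ExecEnv {
    GC*       gc;
    AvmCore*  core;
    Toplevel* toplevel;
};

// Slow path: a thread-local pointer to the active ExecEnv.
// (Assumption: stands in for the GCThreadLocal lookup.)
static thread_local ExecEnv* tls_active_env = nullptr;

// RAII guard that publishes an ExecEnv for unconverted code and
// restores the previous one on exit, so nested entry keeps working.
class ExecEnvScope {
public:
    explicit ExecEnvScope(ExecEnv* ee) : saved_(tls_active_env) { tls_active_env = ee; }
    ~ExecEnvScope() { tls_active_env = saved_; }
    ExecEnvScope(const ExecEnvScope&) = delete;
    ExecEnvScope& operator=(const ExecEnvScope&) = delete;
private:
    ExecEnv* saved_;
};

// Converted runtime function: receives the context explicitly.
inline void allocFast(ExecEnv* ee) { ee->gc->allocations++; }

// Unconverted runtime function: falls back to the thread-local lookup.
inline void allocSlow() {
    assert(tls_active_env != nullptr);
    allocFast(tls_active_env);
}

// Performs one fast-path and one slow-path allocation and returns the count.
inline int demoExecEnv() {
    GC gc; AvmCore core; Toplevel top;
    ExecEnv ee{&gc, &core, &top};
    ExecEnvScope scope(&ee);
    allocFast(&ee);   // explicit argument, no TLS access
    allocSlow();      // TLS lookup, then the same work
    return gc.allocations;
}
```

Both paths reach the same state; the difference is only whether the pointer arrives as an argument or through a thread-local read.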
Assignee: nobody → kpalacz
Attachment #498280 - Attachment is obsolete: true
Way cool. Definition of class ExecEnv appears to be missing.
If this lands, it would enable some JIT cleanup. A number of helper functions take Toplevel* explicitly, and instead could take ExecEnv* since they have to anyway. Other ones take MethodEnv* solely for the purpose of getting Toplevel* on the error path, to access the global exception class objects. EE* works there too.

It's also worth noting that AvmCore is the place we store a bunch of essentially thread-local heads of lists that grow & shrink with the call stack:

    CallStackNode* callStack;          // debugger callstack
    MethodFrame* currentMethodFrame;   // AS3 callstack
    ExceptionFrame* exceptionFrame;    // innermost catch on call stack

It's not clear if moving all this to EE will make anything faster, but it sure would make things simpler, and it's one of the things we'd have to do anyway to support multiple AS3 threads.
(In reply to comment #4)
> Its also worth noting that AvmCore is the place we store a bunch of essentially
> thread-local heads of lists that grow & shrink with the call stack:
>
> CallStackNode* callStack; // debugger callstack
> MethodFrame* currentMethodFrame; // AS3 callstack
> ExceptionFrame* exceptionFrame; // innermost catch on call stack
>
> Its not clear if moving all this to EE will make anything faster, but it sure
> would make things simpler, and its one of the things we'd have to do anyway to
> support multiple AS3 threads.

Even VM-threads need exception support (e.g. XML parsing can throw exceptions). So exceptionFrame is currently a VMThreadLocal in bug 582782.
ARM's calling convention supports 4 register arguments. The AS3 calling convention for interface calls would now require 5 arguments. This might not be a problem, but I'm noting it now; interface calls could get a little bit slower. The same goes for any other CPU with 4 register arguments (this affects Windows X64, MIPS, and SH4).

A long-standing desire of mine is to have a custom jit calling convention, but that's more work. Another solution might be to store the extra iid argument in the EE struct instead of passing it as a real argument.
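The last suggestion can be sketched as follows. This is an illustration only: the `iid` field on `ExecEnv`, the stub `MethodEnv`, and both function names are assumptions, and the sketch says nothing about which form is faster on a given target.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the VM type.
struct MethodEnv { int id; };

// Hypothetical ExecEnv that carries the interface id as a side channel.
struct ExecEnv {
    std::uintptr_t iid = 0;
};

// Five-argument form: with only four register arguments available,
// one of these has to travel on the stack.
inline std::uintptr_t callWithIid(ExecEnv*, MethodEnv*, int, std::uint32_t*,
                                  std::uintptr_t iid) {
    return iid;
}

// Four-argument form: the caller stores iid into the ExecEnv first,
// and the callee reads it back from there.
inline std::uintptr_t callViaEE(ExecEnv* ee, MethodEnv*, int, std::uint32_t*) {
    return ee->iid;
}

// Returns true when both conventions deliver the same iid to the callee.
inline bool demoIid() {
    ExecEnv ee;
    MethodEnv env{1};
    std::uint32_t args[1] = {0};

    std::uintptr_t viaArgument = callWithIid(&ee, &env, 0, args, 42);

    ee.iid = 42;  // the store that replaces the fifth argument
    std::uintptr_t viaStruct = callViaEE(&ee, &env, 0, args);

    return viaArgument == viaStruct;
}
```

The trade is one store and one load through `ee` in exchange for keeping the call at four arguments.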
(In reply to comment #3)
> Way cool.
>
> Definition of class ExecEnv appears to be missing.

It's defined as a struct at the end of AvmCore.h (for correctness it should actually behave more like GCAutoEnter and handle nested and recursive entry, but it's good enough for the avmshell).
(In reply to comment #7)
> It's defined as a struct at the end of AvmCore.h

I was confused because it's forward declared as 'class ExecEnv' so that's what I searched for :-)
(In reply to comment #6)
> Another solution might be to store the extra iid argument in
> the EE struct instead of passing it as a real argument.

That's what I was expecting; with the jit'd helpers we'd migrate some of the 'common' args into EE. Of course the problem will be that once we start tinkering with this, it will be mighty time consuming to measure the effects on all the platforms.
Nice concept. Probably worth doing a patch queue with subsequent patches exploring some of the likely JIT cleanup that Edwin suggests before landing, to see the full implications.
(In reply to comment #8)
> I was confused because it's forward declared as 'class ExecEnv' so that's what
> I searched for :-)

Some compilers will complain if you forward-declare as class then real-declare as struct (or vice versa)...
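For reference, mixing the two keywords is legal C++, but it can trigger diagnostics such as MSVC's C4099 or Clang's -Wmismatched-tags, which matters when warnings are treated as errors. A minimal sketch of the consistent form follows; the `coreId` field and the helper names are made up for the example.

```cpp
#include <cassert>

// Forward declaration uses the same class-key (struct) as the definition below.
struct ExecEnv;

// A function may take a pointer to the still-incomplete type.
int coreIdOf(const ExecEnv* ee);

// The real definition, also declared with 'struct'.
struct ExecEnv {
    int coreId;
};

int coreIdOf(const ExecEnv* ee) { return ee->coreId; }

// Builds an ExecEnv and reads the field back through the helper.
inline int demoForwardDecl() {
    ExecEnv ee{7};
    return coreIdOf(&ee);
}
```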
(In reply to comment #4)
> If this lands, it would enable some JIT cleanup. A number of helper functions
> take Toplevel* explicitly, and instead could take ExecEnv* since they have to
> anyway. Other ones take MethodEnv* solely for the purpose of getting Toplevel*
> on the error path, to access the global exception class objects. EE* works
> there too.

The patch contains some cleanup already, in particular Toplevel::op_call, callproperty and constructprop have been made static (for minimal source code delta).

> Its also worth noting that AvmCore is the place we store a bunch of essentially
> thread-local heads of lists that grow & shrink with the call stack:
>
> CallStackNode* callStack; // debugger callstack
> MethodFrame* currentMethodFrame; // AS3 callstack
> ExceptionFrame* exceptionFrame; // innermost catch on call stack
>
> Its not clear if moving all this to EE will make anything faster, but it sure
> would make things simpler, and its one of the things we'd have to do anyway to
> support multiple AS3 threads.

The patch already moves currentMethodFrame to ExecEnv, in fact this was one more reason why the JIT maintains an ExecEnv pointer: method prologue updates the currentMethodFrame stored in ExecEnv. For multiple AS3 threads it wouldn't be particularly appealing to call TLS functions instead, even in a prototype.
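A rough model of what "method prologue updates the currentMethodFrame stored in ExecEnv" means is sketched below, with the prologue and epilogue expressed as a C++ RAII guard. The two-field `MethodFrame`, `FrameScope`, and `depth` are simplifications invented here; the real structures and the JIT-emitted code will differ.

```cpp
#include <cassert>

// Hypothetical stand-in for the VM type.
struct MethodEnv { int id; };

// Simplified frame: one node per active AS3 call, linked to its caller.
struct MethodFrame {
    MethodFrame* next;
    MethodEnv*   env;
};

// Per-thread context holding the head of the frame list.
struct ExecEnv {
    MethodFrame* currentMethodFrame = nullptr;
};

// Prologue (constructor) pushes a stack-allocated frame;
// epilogue (destructor) pops it again.
class FrameScope {
public:
    FrameScope(ExecEnv* ee, MethodEnv* env)
        : ee_(ee), frame_{ee->currentMethodFrame, env} {
        ee_->currentMethodFrame = &frame_;
    }
    ~FrameScope() { ee_->currentMethodFrame = frame_.next; }
    FrameScope(const FrameScope&) = delete;
    FrameScope& operator=(const FrameScope&) = delete;
private:
    ExecEnv*    ee_;
    MethodFrame frame_;
};

// Counts the frames currently linked from the ExecEnv.
inline int depth(const ExecEnv& ee) {
    int n = 0;
    for (const MethodFrame* f = ee.currentMethodFrame; f != nullptr; f = f->next) ++n;
    return n;
}

// Returns 10 * (depth inside the nested call) + (depth after it returns).
inline int demoFrames() {
    ExecEnv ee;
    MethodEnv outer{1}, inner{2};
    FrameScope a(&ee, &outer);
    int nested = 0;
    {
        FrameScope b(&ee, &inner);
        nested = depth(ee);
    }
    return nested * 10 + depth(ee);
}
```

Because the list head lives in the `ExecEnv` that is already in hand, neither push nor pop needs a thread-local lookup.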
(In reply to comment #11)
> (In reply to comment #8)
> > I was confused because it's forward declared as 'class ExecEnv' so that's what
> > I searched for :-)
>
> Some compilers will complain if you forward-declare as class then real-declare
> as struct (or vice versa)...

They shall be silenced.
Comment on attachment 498281
Oops, previous submission contained the wrong patch.

NB the patch applies cleanly to TR changeset 5689:b6ff1c5b89d6 http://hg.mozilla.org/tamarin-redux/rev/b6ff1c5b89d6 (the project file goes amuck with 5690; probably easy to fix but not worth the time for quick initial evaluations.)
Attached patch: Even more ginormous refactoring (obsolete)
Passes ExecEnv* all the way to GCAlloc::AllocFromQuickList and GCAlloc::Free on at least some code paths. Improved performance observed:

untyped:
  Crypt               3090     2829     9.2%
  Euler               6047     6282    -3.7%
  FFT                 5912     5912       0%
  HeapSort            2931     3017    -2.8%
  LUFact              4883     4842     0.8%
  Moldyn              8360     8258     1.2%
  RayTracer           5913     6105    -3.1%
  SOR                26917    26111     3.0%
  Series              7447     7502    -0.7%
  SparseMatmult       8232     8324    -1.1%

typed:
  Crypt                813      812     0.1%
  Euler               7269     6698     8.5%
  FFT                 1111     1115    -0.3%
  HeapSort            1115     1129    -1.2%
  LUFact              1100     1093     0.6%
  Moldyn              3668     3380     8.5%
  RayTracer           1124     1178    -4.5%
  SOR                 3641     3609     0.8%
  Series              6331     6391    -0.9%
  SparseMatmult       2276     2311    -1.5%
Attachment #498281 - Attachment is obsolete: true
(In reply to comment #6)
> ARM's calling convention supports 4 register arguments. The AS3 calling
> convention for interface calls would now require 5 arguments. This might not
> be a problem, but I'm noting it now; interface-calls could get a little bit
> slower. Same for any other cpu with 4 register arguments. (affects Windows
> X64, MIPS, and SH4).
>
> A long-standing desire of mine is to have a custom jit calling convention, but
> that's more work. Another solution might be to store the extra iid argument in
> the EE struct instead of passing it as a real argument.

Another idea would be to pass 3 arguments (ExecEnv*, argc, argv), but move the allocation and a part of the initialization of the MethodFrame structure from the callee to the caller. Currently, regardless of whether the MethodEnv* argument is passed in a register or on the stack, it's stored on the stack in the method prologue anyway. The caller could perform the store instead, MethodEnv* wouldn't be an explicit argument to AS calls any more, and the AS method could still get cheaply to the MethodEnv* when needed. As a result, the calling sequence would still be longer compared to the baseline, but the number of stack stores and the frame size wouldn't change.

To implement this I'd need a stack pointer or a base pointer from nanojit, something to reach into the caller frame; I haven't figured out yet if it's possible.
Attached patch: fixes for win32 (obsolete)
kpalacz: please fold this (or something like it) into your patch at some point. It's necessary to get around VS2008 build warnings-as-errors. (If others want to use this for Windows builds in the meantime, note that you'll need to re-run builtin.py and shell_toplevel.py to propagate those changes through.)
Attachment #499115 - Flags: feedback?(kpalacz)
Attached patch: update (obsolete)
More passing around of ExecEnv. Quick and dirty performance measurements (avg time over 3 iterations on a Xeon MacPro):

untyped:
  Crypt            -1.0%
  Euler            -0.8%
  FFT              -0.1%
  HeapSort         -1.0%
  LUFact            0.6%
  Moldyn            1.1%
  RayTracer        -0.8%
  SOR              -0.9%
  Series           -1.0%
  SparseMatmult    -0.0%

typed:
  Crypt            -0.4%
  Euler             6.0%
  FFT               0.6%
  HeapSort         -1.1%
  LUFact            0.4%
  Moldyn            8.3%
  RayTracer        -4.3%
  SOR              -0.5%
  Series           -0.6%
  SparseMatmult    -0.7%

The differences (+ and -) seem to have disappeared in untyped code; typed/RayTracer.as is the only major negative outlier.
Attachment #498484 - Attachment is obsolete: true
Attachment #499115 - Attachment is obsolete: true
Attachment #499115 - Flags: feedback?(kpalacz)
(In reply to comment #18)
[...]
> Quick and dirty performance measurements (avg time over 3 iterations on a Xeon MacPro)

Percentage runtime improvement over baseline (positive means improvement, higher is better).
I'm surprised that I'm consistently seeing a number of the sunspider benchmarks reporting that they took 0 time to run. I am able to replicate the results with manual avm runs, though the reported times vary from 0 to 20. (This is not specific to this bug, but I just wanted to say something to acknowledge potential confusion in the benchmark results I just posted.)
Caveat: just heard on phone that using Windows7+VisualStudio2008 may yield invalid results. So I'm going to have to revise my benchmarking infrastructure to gather more trustworthy numbers.

I am learning to use VTune in order to dissect the performance issues on Windows.

My first attempt to use VTune was on jsbench/typed/LUFact, which regressed by >2x time relative to a baseline build. However, the regression did not replicate under VTune, so I switched to a different regressing benchmark.

Second, I tried jsbench/Moldyn, which regressed by >1.5x time, and that result did replicate under VTune.

                   baseline         bz619858
test               best    avg      best     avg       %dBst   %dAvg
Metric: time
jsbench\Moldyn     8031    8126     12223    12241.2   -52.2   -50.6  --

VTune indicates that doubleToAtom_sse2 is taking 1.22sec on the baseline and 2.32sec on bz619858 (the v2.patch plus some windows fixes).

Further analysis with VTune indicates that the CPI Rate of doubleToAtom_sse2 for baseline is 0.525; for bz619858 the CPI Rate is 1.220.

(Given the caveat I noted at the outset, I'm going to stop transcribing data here and wait until I have a more trustworthy setup to run.)
My money is on alignment penalties. On Windows, ordinary method calls use the THISCALL convention, which puts the this argument in ECX, and everything else on the stack.

    core->doubleToAtom_sse2(x)

puts core in ECX, and x on the stack (assume ESP is 8-aligned). Adding the ee argument after the double seems like a good idea:

    core->doubleToAtom_sse2(x, ee)

This still misaligns the stack inside doubleToAtom_sse2(), but x should still be aligned. The problem would be when you call any delegate methods and pass x, or if x spills for some reason; you'll be doing unaligned reads/writes.

A different way would be to make doubleToAtom_sse2 be FASTCALL, so ee and core can both be in registers:

    core->doubleToAtom_sse2(ee, x)

This would put ECX=core, EDX=ee, and only x on the stack, as before.
Among other changes, this patch changes argument order on AvmCore::doubleToAtom_sse2_static() for an improved calling convention.
Attachment #499342 - Attachment is obsolete: true
> VTune indicates that doubleToAtom_sse2 is taking 1.22sec on the baseline and > 2.32sec on bz619858 (the v2.patch plus some windows fixes). > > Further analysis with VTune indicates that the CPI Rate of doubleToAtom_sse2 > for baseline is 0.525; for bz619858 the CPI Rate is 1.220. Are you seeing doubleToAtom_sse2 or doubleToAtom_sse2_static? I thought the latter should be on the hot path. If I remember correctly, it appeared in v2 of the patch and I just submitted a version that fixes the order of arguments along the lines of what Ed is suggesting.
(In reply to comment #23)
> Caveat: just heard on phone that using Windows7+VisualStudio2008 may yield
> invalid results. So I'm going to have revise my benchmarking infrastructure to
> gather more trustworthy numbers.

Update from IRC channel: rickr says Windows7+VisualStudio2008 should be fine. stejohns made a stronger assertion, along the lines that it _must_ be fine. :)
(In reply to comment #26)
> > VTune indicates that doubleToAtom_sse2 is taking 1.22sec on the baseline and
> > 2.32sec on bz619858 (the v2.patch plus some windows fixes).
> >
> > Further analysis with VTune indicates that the CPI Rate of doubleToAtom_sse2
> > for baseline is 0.525; for bz619858 the CPI Rate is 1.220.
>
> Are you seeing doubleToAtom_sse2 or doubleToAtom_sse2_static? I thought the
> latter should be on the hot path. If I remember correctly, it appeared in v2 of
> the patch and I just submitted a version that fixes the order of arguments
> along the lines of what Ed is suggesting.

VTune is largely blaming doubleToAtom_sse2.
(In reply to comment #28)
> VTune is largely blaming doubleToAtom_sse2.

Then that's weird, b/c I think I haven't touched AvmCore::doubleToAtom_sse2() (the doubleToAtom_sse2 jit helper CallInfo is defined to AvmCore::doubleToAtom_sse2_static()).
For those without access to my personal account on my machine, the 'rebased to 5761' patch was rebased to hex 5648edf7ff8b.
Attachment #502031 - Attachment description: rebased to 5761, various changes → rebased to 5737:5648edf7ff8b, various changes
Comment on attachment 502031
rebased to 5737:5648edf7ff8b, various changes

This is almost certainly wrong:

+    {
+        __asm d2a_alloc:
+        return allocDouble_static(n, ExecEnv* ee);
+    }
Attached patch: windows fixes for v3 of patch (obsolete)
kpalacz: I'm using feedback flag as proxy for "please feed this back into your own patch."
Attachment #502057 - Flags: feedback?(kpalacz)
relative to 5737:5648edf7ff8b
Attachment #502031 - Attachment is obsolete: true
Attachment #502057 - Attachment is obsolete: true
Attachment #502057 - Flags: feedback?(kpalacz)
(In reply to comment #24)
> My money is on alignment penalties.

Boo yah. Went back through my email, found reference to Bug 409216, comment 7 (thanks for keeping great notes, Werner!)

That led me to tell VTune to instrument LD1_SPLIT.STORES and LD1_SPLIT.LOADS, and that is quite helpful.

I've also included AvmCore::compare in the results below. It was also one of the top functions blamed by VTune, and its time from baseline to bz619858 increased from 0.8sec to 1.8sec. The reason why becomes clear when you see these results:

AvmCore::doubleToAtom_sse2    SPLIT.LOADS    SPLIT.STORES
  baseline                              0               0
  bz619858                     55,000,000     141,000,000

AvmCore::compare              SPLIT.LOADS    SPLIT.STORES
  baseline                              0               0
  bz619858                    110,500,000               0

Other functions that have 0 split loads/stores on baseline but have many millions of one or the other with Krzysztof's patch:
- GCAlloc::Alloc
- AvmCore::number
- AvmCore::primitive
- avmplus::getprop_index
- ArrayObject::_getUintProperty
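As background for reading the split counters, the following sketch checks whether a memory access straddles two cache lines, which is the usual meaning of a split load or store. The 64-byte line size is an assumption (common on x86, not universal), and `splitsCacheLine` and `demoSplits` are names made up for this example, not VTune or Tamarin APIs.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumed cache-line size in bytes.
constexpr std::uintptr_t kCacheLine = 64;

// True if an access of `size` bytes starting at `addr` touches two cache lines.
constexpr bool splitsCacheLine(std::uintptr_t addr, std::size_t size) {
    return size != 0 &&
           (addr / kCacheLine) != ((addr + size - 1) / kCacheLine);
}

// Counts how many 8-byte values placed at 4-mod-8 offsets within one
// cache line (4, 12, ..., 60) end up straddling the line boundary.
inline int demoSplits() {
    int splits = 0;
    for (std::uintptr_t off = 4; off < kCacheLine; off += 8) {
        if (splitsCacheLine(off, 8)) ++splits;
    }
    return splits;
}
```

Under these assumptions an 8-byte value at an 8-aligned address never splits, while one that is only 4-aligned splits whenever it lands on the last four bytes of a line, which is consistent with counters that are zero on the baseline and large once the stack is misaligned.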
(In reply to comment #34)
> That led me to tell VTune to instrument LD1_SPLIT.STORES and
> LD1_SPLIT.LOADS, and that is quite helpful.

(Note: in VTune Amplifier XE 2011, the events are prefixed with "L1D", not the erroneous "LD1" I wrote in my previous comment.)
In the baseline all the JIT-compiled methods use 3 words for the incoming arguments (JIT-compiled methods are always CDECL), and then build a 3 word MethodFrame instance on the stack. If ESP was dword-aligned before the call, it would still be after the prologue. In the patched version there are now 4 words from incoming arguments, maybe that's the source of this, well, oddness?
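The word counting in this comment can be written out as a small check. This is only arithmetic under stated assumptions: 4-byte words on 32-bit x86, an 8-byte alignment target (as in the earlier "assume ESP is 8-aligned"), and everything else on the stack (return address, saved registers, locals) taken to be identical in both builds and therefore left out. The function names are invented for the example.

```cpp
#include <cassert>
#include <cstddef>

// Assumed word size on 32-bit x86.
constexpr std::size_t kWord = 4;

// Bytes taken by the incoming arguments plus the MethodFrame built in the prologue.
constexpr std::size_t frameBytes(std::size_t argWords, std::size_t methodFrameWords) {
    return (argWords + methodFrameWords) * kWord;
}

// True if that many bytes leave an 8-byte-aligned stack pointer 8-byte aligned.
constexpr bool keeps8ByteAlignment(std::size_t argWords, std::size_t methodFrameWords) {
    return frameBytes(argWords, methodFrameWords) % 8 == 0;
}
```

With 3 argument words and a 3-word MethodFrame the total is 24 bytes and alignment is preserved; with 4 argument words it is 28 bytes and the stack ends up 4 bytes off.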
Sounds plausible. Any further updates to confirm/deny this?
This patch reverts to using 3 arguments in the JIT calling convention. Instead of

    (MethodEnv* env, int argc, uint32_t* args)

it is now

    (ExecEnv* ee, int argc, ExtendedArgs* args)

where ExtendedArgs (whose name will prolly change) combines MethodFrame and AS method arguments. As a result, the ExtendedArgs object must be created in the caller and its MethodEnv* slot must be filled by the caller (in the baseline the ExecEnv* gets passed as an explicit argument and then the MethodFrame slot is initialized by the callee's prologue). In other words, work is shifted from callee to caller, which should result in roughly the same number of instructions executed, but probably slightly larger code (initialization of MethodFrames is partially "inlined").

Other side effects I can think of are in the interpreter (calling the jit from the interpreter requires an additional copy of arguments to construct ExtendedArgs), and possibly the fact that the MethodEnv* may be a bit more expensive to retrieve from the stack than it was from C arguments (wherever the calling convention puts them). Besides these, the stack space use and number of instructions used by the calling convention should not increase compared to the baseline.

The patch is even larger than the previous one (1.7MB), needs cleanup, fails on one acceptance test, and probably won't work on Windows without a tweak.

Some performance results below, measured on a MacPro, compared to 5826:f06efcc015c4 as baseline. Microbenchmarks are all over the map: there are regressions when some particular functionality is involved that hasn't been converted to the new style (and may frequently resort to ExecEnv::getActiveExecEnv(), like certain array operations), but there are also some improvements. v8 tests are surprising in that one would think they shouldn't be too different from jsbench, but they show regressions, while jsbench looks OK (except for typed/RayTracer).
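A minimal sketch of the caller-fills-the-frame idea follows. The `ExtendedArgs` layout shown here (a link field, the MethodEnv* slot, then a fixed two-slot argument array) is a guess made for illustration, as are `addMethod` and `demoExtendedArgs`; the patch's actual structure may look quite different.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins for the VM types.
struct MethodEnv { int id; };
struct ExecEnv   {};

// Assumed layout: frame bookkeeping first, then the AS method arguments.
struct ExtendedArgs {
    ExtendedArgs* next;     // link to the caller's frame
    MethodEnv*    env;      // filled in by the caller, not by the callee's prologue
    std::uint32_t args[2];  // argument slots (fixed size only for brevity)
};

// Callee using the three-argument convention (ExecEnv*, argc, ExtendedArgs*).
// It finds its MethodEnv* inside the extended argument block.
inline std::uint32_t addMethod(ExecEnv* /*ee*/, int argc, ExtendedArgs* xa) {
    assert(argc == 2);
    assert(xa->env != nullptr);
    return xa->args[0] + xa->args[1];
}

// Caller side: build the block, fill the MethodEnv* slot, then call.
inline std::uint32_t demoExtendedArgs() {
    ExecEnv ee;
    MethodEnv env{1};
    ExtendedArgs xa{nullptr, &env, {3, 4}};
    return addMethod(&ee, 2, &xa);
}
```

The store of `&env` happens once either way; the sketch only moves it from the callee's prologue to the call site, which is why code size may grow while the instruction count stays roughly flat.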
Tamarin tests started: 2011-01-25 12:16:32.621468 Executing 460 test(s) avm: ./avm-baseline version: cyclone avm2: ./avm version: cyclone iterations: 3 avm avm2 test best avg best avg %dBst %dAvg Metric: iterations/second Dir: asmicro/ alloc-1 46.5 46.5 47.1 47.1 1.1 1.2 + alloc-10 16.2 16.1 16.5 16.5 2.2 2.2 + alloc-11 15.9 15.9 15.9 15.9 0 -0.0 alloc-12 16.0 15.9 16.0 16.0 0.5 0.5 alloc-13 100.6 100.6 100.3 100.3 -0.3 -0.3 alloc-14 84.2 82.8 83.1 82.9 -1.3 0.1 alloc-2 23.2 23.0 22.6 22.6 -2.3 -1.5 - alloc-3 18.4 18.4 18.1 18.1 -1.7 -1.8 - alloc-4 53.1 53.1 54.9 54.9 3.4 3.4 + alloc-5 36.8 36.8 38.8 38.7 5.3 5.2 ++ alloc-6 81.5 81.5 81.8 81.8 0.4 0.4 alloc-7 43.7 43.6 43.0 43.0 -1.5 -1.5 - alloc-8 17.8 17.7 17.9 17.9 1.0 0.9 alloc-9 17.7 17.7 17.9 17.8 1.1 0.9 + arguments-1 828.2 823.5 719.3 718.4 -13.1 -12.8 -- arguments-2 524.5 523.3 461.1 457.1 -12.1 -12.7 -- arguments-3 23.7 23.7 23.7 23.7 0.1 0.1 arguments-4 32.1 32.1 32.5 32.5 1.3 1.2 + array-1 1942.1 1940.7 1946.1 1945.4 0.2 0.2 + array-2 671.3 670.7 672.3 672.3 0.1 0.2 + array-pop-1 371.6 368.0 361.3 360.1 -2.8 -2.1 - array-push-1 318.0 316.3 295.4 294.3 -7.1 -6.9 -- array-shift-1 162.4 160.8 140.3 140.3 -13.6 -12.8 -- array-slice-1 23.9 23.9 22.4 22.3 -6.4 -6.6 -- array-sort-1 31.2 31.1 31.4 31.4 0.8 0.9 array-sort-2 2.8 2.8 2.8 2.7 -0.3 -0.1 array-sort-3 24.2 24.2 24.2 24.1 -0.3 -0.1 array-sort-4 11.5 11.5 9.2 9.2 -20.5 -20.6 -- array-unshift-1 187.4 187.2 176.9 176.7 -5.6 -5.6 -- closedvar-read-1 4365.6 4364.3 4622.4 4614.4 5.9 5.7 ++ closedvar-write-1 4619.4 4562.4 6024.0 6011.7 30.4 31.8 ++ closedvar-write-2 5650.3 5627.4 4370.6 4367.0 -22.6 -22.4 -- do-1 4352.6 4351.3 4353.6 4353.6 0.0 0.1 for-1 4352.6 4350.6 4352.6 4352.3 0 0.0 for-2 2652.3 2651.7 2653.3 2653.0 0.0 0.1 + for-3 2773.2 2771.2 2772.2 2771.9 -0.0 0.0 for-in-1 400.6 400.1 402.6 402.5 0.5 0.6 + for-in-2 385.2 384.8 387.6 387.4 0.6 0.7 + funcall-1 386.6 386.6 351.9 351.9 -9.0 -9.0 -- funcall-2 256.7 256.6 248.0 248.0 -3.4 -3.3 - funcall-3 
350.9 350.9 337.7 337.6 -3.8 -3.8 - funcall-4 123.9 123.8 126.2 125.7 1.9 1.6 + globalvar-read-1 4371.6 4368.6 4375.6 4370.0 0.1 0.0 globalvar-write-1 4504.5 4500.5 5935.1 5802.5 31.8 28.9 ++ isNaN-1 4346.7 4344.7 4284.7 4276.7 -1.4 -1.6 - isNaN-2 4350.6 4343.3 4348.7 4348.7 -0.0 0.1 isNaN-3 3810.2 3808.9 3810.2 3809.5 0 0.0 lookup-array-fetch-1 799.2 798.9 833.2 833.2 4.2 4.3 + lookup-array-in-1 1745.3 1744.3 1845.2 1845.2 5.7 5.8 ++ lookup-negindex-array-1 482.5 482.5 474.6 474.2 -1.6 -1.7 - lookup-negindex-array-2 420.6 420.6 421.2 421.2 0.1 0.1 lookup-negindex-object-1 503.0 503.0 492.0 491.7 -2.2 -2.2 - lookup-negindex-object-2 473.5 473.5 467.1 466.9 -1.4 -1.4 - lookup-object-fetch-1 896.1 896.1 915.1 914.8 2.1 2.1 + lookup-object-in-1 1635.4 1632.4 1592.4 1592.4 -2.6 -2.4 - number-toString-1 6.3 6.3 6.3 6.3 -0.3 -0.3 number-toString-2 80.8 80.7 80.0 80.0 -0.9 -0.8 oop-1 4.2 4.2 4.2 4.2 -0.7 -0.6 parseFloat-1 83.3 83.1 80.8 80.7 -3.0 -2.9 - parseInt-1 191.6 191.5 197.2 197.1 2.9 2.9 + regex-exec-1 67.9 67.9 68.3 68.3 0.6 0.6 regex-exec-2 81.8 81.7 83.3 83.3 1.9 2.0 + regex-exec-3 124.3 124.1 127.0 126.8 2.2 2.2 + regex-exec-4 321.7 321.5 324.4 323.8 0.8 0.7 + restarg-1 826.2 819.2 717.6 715.8 -13.1 -12.6 -- restarg-2 503.5 502.8 451.5 449.1 -10.3 -10.7 -- restarg-3 42.0 42.0 43.9 43.9 4.4 4.4 + restarg-4 32.2 32.2 32.6 32.6 1.4 1.3 + string-casechange-1 27.9 27.9 28.0 28.0 0.3 0.2 string-casechange-2 27.6 27.6 27.3 27.3 -1.1 -1.1 - string-charAt-1 1560.4 1560.1 1563.4 1562.4 0.2 0.1 + string-charAt-2 90.6 90.6 89.6 89.6 -1.1 -1.1 - string-charCodeAt-1 1648.4 1648.0 1659.3 1658.0 0.7 0.6 + string-charCodeAt-2 1566.4 1565.8 1556.4 1555.8 -0.6 -0.6 - string-charCodeAt-3 1237.8 1236.8 1240.8 1240.1 0.2 0.3 + string-charCodeAt-4 2045.0 2044.6 2096.9 2095.9 2.5 2.5 + string-charCodeAt-5 1040.0 1040.0 1040.0 1040.0 0 0 string-charCodeAt-6 1650.3 1649.4 1658.3 1657.7 0.5 0.5 + string-charCodeAt-7 2045.0 2044.6 2097.9 2097.2 2.6 2.6 + string-fromCharCode-1 359.6 359.5 
350.3 350.1 -2.6 -2.6 - string-fromCharCode-2 72.3 72.3 72.3 72.2 0 -0.1 string-indexOf-1 234.1 234.0 222.1 222.0 -5.1 -5.1 -- string-indexOf-2 176.3 176.2 139.7 139.6 -20.7 -20.7 -- string-indexOf-3 125.1 125.1 126.1 126.1 0.8 0.8 string-lastIndexOf-1 564.4 564.4 593.8 593.8 5.2 5.2 ++ string-lastIndexOf-2 174.8 174.7 179.8 178.4 2.9 2.1 + string-lastIndexOf-3 170.8 170.7 175.6 175.6 2.8 2.9 + string-slice-1 131.3 131.3 129.7 129.7 -1.2 -1.2 - string-split-1 10.4 10.4 10.2 10.2 -2.0 -2.1 - string-split-2 10.2 10.2 10.0 9.9 -2.6 -2.7 - string-substring-1 139.3 139.3 137.6 137.6 -1.2 -1.2 - switch-1 470.1 470.1 460.6 460.4 -2.0 -2.0 - switch-2 126.4 126.3 125.4 125.3 -0.8 -0.8 switch-3 231.5 231.5 228.1 228.1 -1.5 -1.5 - try-1 268.2 268.1 205.8 202.1 -23.3 -24.6 -- try-2 17.7 17.6 16.8 16.7 -5.0 -5.1 - try-3 53.6 53.6 52.3 52.2 -2.5 -2.5 - vector-push-1 47.9 47.8 46.5 46.4 -2.8 -2.9 - while-1 4351.6 4351.6 4352.6 4351.3 0.0 -0.0 Metric: time Dir: jsbench/ Crypt 2831 2832 2846 2846.3 -0.5 -0.5 - Euler 5804 5811.3 5848 5852 -0.8 -0.7 - FFT 5793 5801.7 5829 5852.3 -0.6 -0.9 - HeapSort 2938 2940.7 2895 2895.3 1.5 1.5 + LUFact 4676 4676.7 4743 4747.3 -1.4 -1.5 - Moldyn 8218 8240.3 8194 8201 0.3 0.5 + RayTracer 5817 5826.7 5880 5888.7 -1.1 -1.1 - SOR 26189 26193 26202 26208.3 -0.0 -0.1 - Series 7222 7224 7312 7324.7 -1.2 -1.4 - SparseMatmult 7903 7954.7 8024 8070.3 -1.5 -1.5 - Dir: jsbench/typed/ Crypt 800 800.3 798 799 0.2 0.2 + Euler 6920 6937 6547 6569.7 5.4 5.3 ++ FFT 1076 1081.3 1067 1069.3 0.8 1.1 + HeapSort 1126 1127 1129 1130.7 -0.3 -0.3 - LUFact 1091 1091.3 1088 1089 0.3 0.2 + Moldyn 3549 3554 3266 3275 8.0 7.9 ++ RayTracer 1115 1116.3 1180 1180.7 -5.8 -5.8 -- SOR 3555 3556 3566 3569.3 -0.3 -0.4 - Series 6171 6172.3 6317 6323 -2.4 -2.4 - SparseMatmult 2146 2173 2139 2165.3 0.3 0.4 Metric: iterations/second Dir: jsmicro/ alloc-1 44.8 44.8 45.3 45.2 1.0 1.0 + alloc-10 15.9 15.8 16.1 16.0 1.7 1.4 alloc-11 14.3 14.3 14.0 14.0 -2.5 -2.4 - alloc-12 14.3 14.3 13.9 13.9 
-3.1 -3.0 - alloc-13 87.3 87.3 87.5 87.4 0.2 0.2 alloc-14 75.2 75.0 73.9 73.6 -1.8 -1.9 - alloc-2 22.3 22.3 22.0 22.0 -1.4 -1.4 - alloc-3 17.9 17.9 17.9 17.9 -0.5 -0.4 alloc-4 50.9 50.9 52.5 52.5 3.1 3.2 + alloc-5 36.1 36.1 38.7 38.7 7.3 7.3 ++ alloc-6 71.4 71.4 71.9 71.8 0.6 0.5 alloc-7 42.9 42.9 42.5 42.5 -1.0 -1.0 alloc-8 17.2 17.2 17.4 17.3 0.9 0.8 alloc-9 17.2 17.1 17.4 17.3 1.0 1.4 arguments-1 178.5 178.3 178.8 178.2 0.2 -0.1 arguments-2 128.9 128.9 129.5 129.4 0.5 0.4 arguments-3 20.3 20.3 20.6 20.6 1.2 1.3 + array-1 463.1 463.1 449.6 449.6 -2.9 -2.9 - array-2 324.0 323.9 310.7 310.3 -4.1 -4.2 - array-pop-1 79.3 79.1 74.6 74.2 -6.0 -6.2 -- array-push-1 54.1 54.0 52.1 52.1 -3.7 -3.6 - array-shift-1 65.4 65.0 62.2 62.0 -4.9 -4.6 - array-slice-1 19.8 19.8 18.7 18.7 -5.7 -5.7 -- array-sort-1 29.7 29.7 29.6 29.6 -0.5 -0.4 array-sort-2 2.7 2.7 2.7 2.7 0.4 0.4 array-sort-3 24.2 24.1 24.2 24.1 -0.4 -0.2 array-sort-4 11.9 11.9 9.7 9.7 -18.8 -18.8 -- array-unshift-1 25.6 25.6 25.4 25.4 -0.9 -0.9 closedvar-read-1 728.3 727.8 718.3 717.6 -1.4 -1.4 - closedvar-write-1 522.5 522.1 511.0 510.3 -2.2 -2.3 - closedvar-write-2 531.5 531.1 510.0 509.3 -4.0 -4.1 - do-1 738.5 738.3 719.6 719.4 -2.6 -2.6 - for-1 739.5 739.4 723.3 723.3 -2.2 -2.2 - for-2 228.8 228.5 216.4 216.4 -5.4 -5.3 -- for-3 162.8 162.7 166.0 165.9 1.9 2.0 + for-in-1 311.7 311.6 301.7 301.6 -3.2 -3.2 - for-in-2 300.7 300.3 301.1 301.0 0.1 0.2 funcall-1 252.5 252.5 229.9 229.8 -9.0 -9.0 -- funcall-2 246.0 245.9 233.5 233.5 -5.1 -5.1 -- funcall-3 244.8 244.2 223.1 223.0 -8.8 -8.7 -- funcall-4 1516.5 1515.8 1557.4 1557.1 2.7 2.7 + globalvar-read-1 728.3 727.7 715.3 715.3 -1.8 -1.7 - globalvar-write-1 532.5 532.5 510.0 509.8 -4.2 -4.3 - isNaN-1 632.7 632.7 619.4 619.4 -2.1 -2.1 - lookup-array-fetch-1 707.3 706.1 730.5 730.4 3.3 3.4 + lookup-array-in-1 1181.8 1181.2 1208.8 1208.5 2.3 2.3 + lookup-object-fetch-1 777.2 776.6 789.2 788.6 1.5 1.6 + lookup-object-in-1 1064.9 1064.6 1076.9 1076.9 1.1 1.2 + 
number-toString-1 6.2 6.2 6.2 6.2 -0.5 -0.5 number-toString-2 67.4 67.4 67.1 67.1 -0.4 -0.4 oop-1 4.3 4.3 4.3 4.3 -0.3 -0.2 parseFloat-1 60.3 60.3 60.5 60.4 0.2 0.2 parseInt-1 137.3 137.3 133.3 133.0 -2.9 -3.1 - regex-exec-1 59.8 59.8 60.6 60.6 1.4 1.3 + regex-exec-2 74.6 74.6 76.8 76.7 3.0 2.9 + regex-exec-3 112.1 111.9 115.5 115.4 3.1 3.1 + regex-exec-4 267.7 267.6 269.2 269.0 0.5 0.6 + string-casechange-1 19.2 19.2 19.3 19.3 0.4 0.4 string-casechange-2 19.1 19.1 19.1 19.1 -0.2 -0.2 string-charAt-1 120.6 120.5 115.1 115.1 -4.6 -4.5 - string-charAt-2 52.0 52.0 52.3 52.2 0.5 0.4 string-charCodeAt-1 107.4 107.4 104.1 104.1 -3.1 -3.1 - string-charCodeAt-2 107.5 107.5 105.5 105.4 -1.9 -1.9 - string-fromCharCode-1 90.7 90.6 87.6 87.6 -3.4 -3.3 - string-fromCharCode-2 45.5 45.5 44.8 44.8 -1.7 -1.7 - string-fromCharCode-3 78.8 78.5 77.2 77.0 -2.0 -1.9 - string-fromCharCode-4 82.8 82.8 82.2 82.2 -0.7 -0.7 string-indexOf-1 95.2 95.2 97.4 97.4 2.3 2.2 + string-indexOf-2 69.7 69.7 62.9 62.8 -9.8 -9.8 -- string-indexOf-3 55.7 55.6 59.9 59.9 7.7 7.6 ++ string-lastIndexOf-1 94.2 94.2 97.2 97.1 3.2 3.1 + string-lastIndexOf-2 70.4 70.3 70.2 70.1 -0.4 -0.3 string-lastIndexOf-3 70.5 70.3 69.9 69.9 -0.9 -0.6 - string-slice-1 54.9 54.9 56.8 56.7 3.3 3.4 + string-split-1 9.6 9.6 9.6 9.5 -0.6 -0.7 string-split-2 9.5 9.5 9.3 9.3 -1.7 -1.7 - string-substring-1 59.2 59.2 59.6 59.6 0.8 0.7 switch-1 332.3 332.2 340.0 339.8 2.3 2.3 + switch-2 66.8 66.8 66.5 66.4 -0.5 -0.5 switch-3 86.1 86.1 85.7 85.7 -0.5 -0.4 try-1 176.3 176.2 175.6 175.5 -0.4 -0.4 try-2 17.5 17.4 16.6 16.6 -5.2 -5.0 -- try-3 47.7 47.7 43.6 43.5 -8.6 -8.7 -- while-1 739.5 739.3 723.3 722.9 -2.2 -2.2 - Metric: time Dir: language/describetype/ desctypeperf 479 479 491 491 -2.5 -2.5 - Dir: language/e4x/ addingToXMLList 14 14.7 15 15 -7.1 -2.3 -- appendChildAndString 43 43 43 43 0 0 concatenatingStringsFromE4X 6 6 6 6.3 0 -5.6 simpleStringConcatenation 1 1.3 1 1 0 25.0 usingAppendChildAndE4X 44 44.3 45 45 -2.3 -1.5 - Dir: 
language/string/
append_concat 61 61.3 60 60 1.6 2.2 +
append_equal_plus 49 49.3 49 49.3 0 0
append_plus_equal 48 49 49 49 -2.1 0 -
charAt 142 142.3 150 150.7 -5.6 -5.9 --
charCodeAt 170 170.3 175 175.7 -2.9 -3.1 -
indexOf 244 244 250 250 -2.5 -2.5 -
lastIndexOf 171 171.7 172 172 -0.6 -0.2 -
replace 404 406.3 405 406.7 -0.2 -0.1
replace2 693 693.7 699 699.7 -0.9 -0.9 -
search 31 31 30 30.7 3.2 1.1 +
slice 234 234.7 241 241 -3.0 -2.7 -
split 264 264.3 253 253.7 4.2 4.0 +
static_ascii_array_100 179 179.3 180 180.7 -0.6 -0.7 -
static_ascii_array_50 166 166.7 166 166.3 0 0.2
static_latin1_array_100 341 341.7 343 343.7 -0.6 -0.6 -
static_latin1_array_50 170 170.7 171 171.7 -0.6 -0.6 -
substr 174 174 178 178.3 -2.3 -2.5 -
substring 166 166 169 169.7 -1.8 -2.2 -

Dir: language/string/typed/
append_concat 54 54.3 54 54.7 0 -0.6
append_equal_plus 44 44.7 44 44.3 0 0.7
append_plus_equal 44 44.7 44 44.7 0 0
charAt 9 9.3 8 8.7 11.1 7.1 +
charCodeAt 9 9 9 9 0 0
indexOf 244 244.3 250 250 -2.5 -2.3 -
lastIndexOf 171 171.7 171 171.7 0 0
replace 405 406 405 406.3 0 -0.1
replace2 691 691.3 698 699 -1.0 -1.1 -
search 30 30.7 30 30.3 0 1.1
slice 156 156.3 157 157.7 -0.6 -0.9 -
split 263 263.7 262 262.3 0.4 0.5 +
substr 122 122 121 121.3 0.8 0.5 +
substring 111 111 110 110.3 0.9 0.6 +

Dir: misc/
boids 1534 1534.3 1581 1581.3 -3.1 -3.1 -
boidshack 355 358.3 355 355 0 0.9
gameoflife 1188 1188.7 1288 1288 -8.4 -8.4 --
primes 4213 4213.3 4420 4421 -4.9 -4.9 -

Dir: mmgc/
gcbench 2430 2441.3 2444 2445 -0.6 -0.2 -
ofib-rc 183 183.3 184 184.7 -0.5 -0.7 -
ofib 863 866 859 863.3 0.5 0.3
sfib 308 308.3 318 319.3 -3.2 -3.6 -

Dir: scimark/
FFT 1697 1697.7 1713 1713.3 -0.9 -0.9 -
LU 2409 2410 2452 2454.7 -1.8 -1.9 -
MonteCarlo 2243 2244 2296 2296 -2.4 -2.3 -
SOR 1745 1746.7 1740 1740 0.3 0.4 +
SparseCompRow 79 79.3 80 80 -1.3 -0.8 -

Dir: sunspider/
access-binary-trees 28 28.3 30 30 -7.1 -5.9 --
access-fannkuch 63 63.3 66 66.7 -4.8 -5.3 -
access-nbody 68 68.3 68 68.7 0 -0.5
access-nsieve 21 21 21 21.7 0 -3.2
bitops-3bit-bits-in-byte 8 8.3 9 9 -12.5 -8.0 --
bitops-bits-in-byte 24 24.7 25 25.7 -4.2 -4.1 -
bitops-bitwise-and 175 175.7 183 185.7 -4.6 -5.7 -
bitops-nsieve-bits 34 34 34 34.3 0 -1.0
controlflow-recursive 12 12.7 13 13.3 -8.3 -5.3 -
crypto-aes 34 34 34 34.3 0 -1.0
crypto-md5 16 16 16 16.3 0 -2.1
crypto-sha1 15 15.3 15 15.3 0 0
date-format-tofte 140 141.7 147 148.3 -5 -4.7 -
math-cordic 44 44 44 44.7 0 -1.5
math-partial-sums 159 159.3 141 141.3 11.3 11.3 ++
math-spectral-norm 22 22.7 23 23 -4.5 -1.5 -
s3d-cube 55 55 54 54.3 1.8 1.2 +
s3d-morph 31 31.7 33 33.7 -6.5 -6.3 --
s3d-raytrace 64 64.7 64 64 0 1.0
string-fasta 69 69 67 67 2.9 2.9 +
string-unpack-code 141 142 157 157.3 -11.3 -10.8 --
string-validate-input 36 36.3 36 36 0 0.9

Dir: sunspider/as3/
access-binary-trees 8 8 8 8 0 0
access-fannkuch 36 36.3 36 36 0 0.9
access-nbody 6 6 6 6 0 0
access-nsieve 11 11.7 12 12 -9.1 -2.9 --
bitops-3bit-bits-in-byte 4 4.7 5 5.3 -25 -14.3 -
bitops-bits-in-byte 7 7.7 8 8 -14.3 -4.3 --
bitops-bitwise-and 2 2 2 2 0 0
bitops-nsieve-bits 22 22.3 22 22 0 1.5
controlflow-recursive 3 3 3 3 0 0
crypto-aes 25 25 25 25.3 0 -1.3
crypto-md5 19 19.3 20 20 -5.3 -3.4 --
crypto-sha1 14 14 14 14.3 0 -2.4
date-format-tofte 128 130.7 133 137.3 -3.9 -5.1
math-cordic 11 11 10 10.7 9.1 3.0 ++
math-partial-sums 51 51.7 54 54 -5.9 -4.5 --
math-spectral-norm 5 5 5 5 0 0
s3d-cube 16 16.7 16 16.7 0 0
s3d-morph 22 23 23 23 -4.5 0
s3d-raytrace 25 25.7 26 26 -4 -1.3 -
string-fasta 33 33 32 32 3.0 3.0 +
string-unpack-code 140 140.7 156 156.3 -11.4 -11.1 --
string-validate-input 27 27.3 27 27.3 0 0

Dir: sunspider/as3vector/
access-fannkuch 20 20.3 20 20 0 1.6
access-nbody 5 5.7 5 5.7 0 0
access-nsieve 10 10 9 9 10 10 ++
bitops-nsieve-bits 7 7 6 6.3 14.3 9.5 +
math-cordic 10 10.3 13 13.3 -30 -29.0 --
math-spectral-norm 13 13 13 13.3 0 -2.6
s3d-cube 12 12.7 12 12.7 0 0
s3d-morph 13 13 13 13.3 0 -2.6
string-fasta 34 34.7 34 34.7 0 0
string-validate-input 28 28.7 29 29 -3.6 -1.2 -

Dir: sunspider-0.9.1/js/
access-binary-trees 27 27.7 29 29 -7.4 -4.8 --
access-fannkuch 62 62 66 66 -6.5 -6.5 --
access-nbody 64 64.7 64 64.7 0 0
access-nsieve 20 20.7 22 22 -10 -6.5 --
bitops-3bit-bits-in-byte 7 7.3 7 7.3 0 0
bitops-bits-in-byte 24 24 24 24.7 0 -2.8
bitops-bitwise-and 180 180.3 167 167 7.2 7.4 ++
bitops-nsieve-bits 31 31.7 32 32.7 -3.2 -3.2 -
controlflow-recursive 12 12.7 13 13.3 -8.3 -5.3 -
crypto-aes 28 28.7 29 29 -3.6 -1.2 -
crypto-md5 13 13.3 14 14 -7.7 -5.0 --
crypto-sha1 14 14 14 14 0 0
math-cordic 43 43 44 44 -2.3 -2.3 -
math-partial-sums 153 153 151 151.7 1.3 0.9 +
math-spectral-norm 22 22 22 22 0 0
regexp-dna 670 690 688 690 -2.7 0 -
s3d-cube 51 51 51 51 0 0
s3d-morph 46 46 47 47.3 -2.2 -2.9 -
s3d-raytrace 59 59 58 58 1.7 1.7 +
string-fasta 48 48 47 47.3 2.1 1.4 +
string-unpack-code 3379 3401 3358 3378 0.6 0.7
string-validate-input 34 34.7 34 34.7 0 0

Dir: sunspider-0.9.1/typed/
access-binary-trees 7 7 7 7 0 0
access-fannkuch 39 39.3 39 39.3 0 0
access-nbody 3 3.3 3 3.7 0 -10.0
access-nsieve 10 10.7 10 10.7 0 0
bitops-3bit-bits-in-byte 4 4.3 4 4 0 7.7
bitops-bits-in-byte 7 7.3 7 7 0 4.5
bitops-bitwise-and 2 2 1 1.3 50 33.3 +
bitops-nsieve-bits 20 20.3 20 20.3 0 0
controlflow-recursive 2 2.7 3 3.7 -50 -37.5 -
crypto-aes 21 21 21 21 0 0
crypto-md5 3 3 3 3 0 0
crypto-sha1 3 3 3 3.7 0 -22.2
math-cordic 10 10 10 10.7 0 -6.7
math-partial-sums 9 9.7 9 9.3 0 3.4
math-spectral-norm 6 6.7 7 7 -16.7 -5.0 --
regexp-dna 664 673.3 662 662.3 0.3 1.6
s3d-cube 15 15 14 14 6.7 6.7 ++
s3d-morph 36 36.7 38 38 -5.6 -3.6 --
s3d-raytrace 7 7 7 7 0 0
string-fasta 25 25.7 25 25.7 0 0
string-unpack-code 3409 3420.3 3359 3400.3 1.5 0.6 +
string-validate-input 22 22.3 23 23 -4.5 -3.0 -

Metric: v8 custom v8 normalized metric (hardcoded in the test)

Dir: v8/
crypto 580 579 568 567.3 -2.1 -2.0 -
deltablue 1873 1871 1838 1836.7 -1.9 -1.8 -
earley-boyer 1231 1228 1195 1189.7 -2.9 -3.1 -
raytrace 3541 3536.3 3488 3487 -1.5 -1.4 -
richards 1263 1262.3 1202 1201.3 -4.8 -4.8 -

Dir: v8/typed/
crypto 605 604.3 582 582 -3.8 -3.7 -
deltablue 2874 2869.3 2801 2798 -2.5 -2.5 -
earley-boyer 1225 1223 1197 1194 -2.3 -2.4 -
raytrace 7687 7677 7495 7485 -2.5 -2.5 -
richards 2320 2318 2210 2210 -4.7 -4.7 -

Dir: v8.5/js/
crypto 517 517 505 504.7 -2.3 -2.4 -
deltablue 351 350.7 351 351 0 0.1
earley-boyer 1230 1228.3 1069 1065 -13.1 -13.3 --
raytrace 798 797.3 776 775.7 -2.8 -2.7 -
regexp 107 106.3 106 106 -0.9 -0.3 -
richards 298 298 294 294 -1.3 -1.3 -
splay 901 891.7 908 900 0.8 0.9

Dir: v8.5/optimized/
crypto 4288 4279.3 4280 4278 -0.2 -0.0
deltablue 3780 3775.3 3595 3590 -4.9 -4.9 -
earley-boyer 1223 1218.7 1064 1063.3 -13.0 -12.7 --
raytrace 8954 8945 8759 8738.7 -2.2 -2.3 -
regexp 107 106.7 107 106.7 0 0
richards 4223 4218 3911 3911 -7.4 -7.3 --
splay 6435 6417.7 6511 6498 1.2 1.3 +

Dir: v8.5/typed/
crypto 3250 3248 3252 3240.7 0.1 -0.2
deltablue 3858 3854.7 3663 3663 -5.1 -5.0 --
earley-boyer 1220 1214.3 1064 1061.7 -12.8 -12.6 --
raytrace 8958 8937.3 8724 8715.3 -2.6 -2.5 -
regexp 107 106.3 107 106.3 0 0
richards 4218 4216.3 3915 3915 -7.2 -7.1 --
splay 1115 1111 1101 1100.3 -1.3 -1.0 -

Dir: v8.5/untyped/
crypto 569 568.3 552 552 -3.0 -2.9 -
deltablue 1953 1950.7 1880 1878.7 -3.7 -3.7 -
earley-boyer 1218 1215.3 1062 1057.7 -12.8 -13.0 --
raytrace 3801 3797.3 3720 3714.7 -2.1 -2.2 -
regexp 107 107 106 106 -0.9 -0.9
richards 465 463.7 512 512 10.1 10.4 ++
splay 1059 1052.7 1055 1021 -0.4 -3.0
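For readers checking the percentage columns in the results above, the following is a minimal sketch of how they seem to be derived. It is inferred from the numbers only, not taken from the benchmark runner's source. It assumes the first value in each row is the baseline build and the third is the patched build, that time-based metrics are lower-is-better, and that the v8 metric is higher-is-better. The name `pct_diff` is hypothetical.

```python
def pct_diff(baseline: float, patched: float, higher_is_better: bool = False) -> float:
    """Relative change of `patched` versus `baseline`, in percent.

    The sign is chosen so that a positive result means the patched build
    did better (assumed convention, inferred from the posted numbers).
    """
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    delta = (patched - baseline) if higher_is_better else (baseline - patched)
    return round(100.0 * delta / baseline, 1)


# Time-based rows (lower is better): language/string charAt, sunspider math-partial-sums.
print(pct_diff(142, 150))
print(pct_diff(159, 141))

# Score-based row (v8 normalized metric, higher is better): v8 crypto.
print(pct_diff(580, 568, higher_is_better=True))
```

Under these assumptions the three calls reproduce the -5.6, 11.3 and -2.1 entries reported for those rows. The averaged column cannot always be reproduced exactly from the posted values, since the averages shown are themselves rounded.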
Flags: flashplayer-qrb+
Flags: flashplayer-injection-
Flags: flashplayer-bug-
Target Milestone: --- → Future
Tamarin is a dead project now. Mass WONTFIX.
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → WONTFIX
Tamarin isn't maintained anymore. WONTFIX remaining bugs.