Closed
Bug 556023
Opened 14 years ago
Closed 14 years ago
operator[] on integer keys in 2^28 .. 2^30 range grossly slower than on string keys
Categories
(Tamarin Graveyard :: Virtual Machine, defect, P3)
Tamarin Graveyard
Virtual Machine
Tracking
(Not tracked)
RESOLVED
FIXED
Q3 11 - Serrano
People
(Reporter: pnkfelix, Assigned: pnkfelix)
References
Details
(Whiteboard: PACMAN)
Attachments
(7 files, 11 obsolete files)
4.63 KB, text/plain
375.72 KB, text/plain
3.50 KB, patch
7.68 KB, patch
2.30 KB, patch
2.50 KB, patch
5.74 KB, patch
Ed explained in bug #555010 that integer keys > 2^28-1 are converted to strings when used as array indices. Such allocation should not be necessary for pure lookup operations: If the property exists, then we can reuse the existing string that's already present, and if it does not, then in principle we can find out without actually allocating a string on the heap. The attached test case illustrates that integer key lookup can cost more than 3x string key lookup.
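To make the "no allocation needed for a pure lookup" point concrete, here is a minimal sketch in standard C++17. It is not Tamarin code: `PropertyTable` and `lookupUintKey` are hypothetical names, and a `std::map` with a transparent comparator stands in for the VM's string-keyed property table. The integer is spelled into a stack buffer and the lookup runs against that buffer, so no heap string is built.

```cpp
#include <cassert>
#include <charconv>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <map>
#include <string>
#include <string_view>
#include <system_error>

// Hypothetical stand-in for a property table keyed by strings.
// std::less<> enables lookup by std::string_view without building a std::string.
using PropertyTable = std::map<std::string, int, std::less<>>;

// Looks up an integer key by its decimal spelling using only stack storage.
// Returns a pointer to the stored value, or nullptr if the key is absent.
inline const int* lookupUintKey(const PropertyTable& table, std::uint32_t key) {
    char buf[10];  // UINT32_MAX (4294967295) has 10 decimal digits
    const std::to_chars_result res = std::to_chars(buf, buf + sizeof buf, key);
    assert(res.ec == std::errc());  // cannot overflow a 10-byte buffer
    const std::string_view spelled(buf, static_cast<std::size_t>(res.ptr - buf));
    const auto it = table.find(spelled);  // heterogeneous lookup, no allocation
    return it == table.end() ? nullptr : &it->second;
}
```

With a table holding the key "536870912", `lookupUintKey(table, 536870912u)` finds the entry and `lookupUintKey(table, 536870913u)` returns `nullptr`, in both cases without allocating.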
Assignee | ||
Comment 1•14 years ago
|
||
(prior upload was erroneous)
Attachment #435940 -
Attachment is obsolete: true
Assignee | ||
Comment 2•14 years ago
|
||
Note that the majority of the extra time is very likely to be due to increased GC pressure from all of those unnecessary string allocations; one can confirm this by running the benchmark with -Dnogc. When I did this, the integer based lookup continued to be slower than string based lookup, but only by a factor of 1.3x rather than 3.4x.
Assignee: nobody → fklockii
Flags: flashplayer-qrb+
Priority: -- → P3
Target Milestone: --- → flash10.2
Assignee | ||
Comment 3•14 years ago
|
||
(In reply to comment #2)
> Note that the majority of the extra time is very likely to be due to increased
> GC pressure from all of those unnecessary string allocations; one can confirm
> this by running the benchmark with -Dnogc.
>
> When I did this, the integer based lookup continued to be slower than string
> based lookup, but only by a factor of 1.3x rather than 3.4x.

I tried applying my intern-avoidance patch, attachment 437851 from bug 555982, and rerunning this benchmark, but int-lookup with intern-avoidance is still 2.3-2.5x slower than string-lookup.

This surprised me; I had thought the -Dnogc experiment was a reasonable way to isolate the added cost of extra string allocations, but it seems like other artifacts of the gc creep into this experiment. (Perhaps extra time is spent scanning the stack, which holds the remaining artifacts of int->char* conversion even with the intern-avoidance patch?)
Assignee | ||
Comment 4•14 years ago
|
||
Just to include some concrete data (preserved for historical record) and provide some context, here is the performance of this benchmark on an unpatched 32-bit debug build at TR revision 4277, on my 2.8 GHz Core 2 Duo MacBook Pro.

% $AVM cmp_int_and_str_lookup.abc
int_dur: 170445ms str_dur: 7411ms factor: 22.99

Here is the same run after patching to avoid intermediate stringp allocation:

% AVM cmp_int_and_str_lookup.abc
int_dur: 15577ms str_dur: 7700ms factor: 2.02

I'll attach the referenced patch after I post this comment.
Assignee | ||
Comment 5•14 years ago
|
||
(In reply to comment #4)
> I'll attach the referenced patch after I post this comment.

No need, it's the same as attachment 437851.
Assignee | ||
Comment 6•14 years ago
|
||
Important detail about this benchmark: for each lookup "arr[key]", the key may or may not be present in arr. A little less than half the keys are present for the default setting of running from key=2^29 through 2^29+10000.

I happened to notice that if you change the benchmark to start the keys off at 5*10^8, but leave arr unchanged, then things get much, much slower for the int-lookup iteration, even with the patch to avoid intermediate stringp allocation.
Assignee | ||
Comment 7•14 years ago
|
||
(In reply to comment #6)
> I happened to notice that if you change the benchmark to start the keys off at
> 5*10^8, but leave arr unchanged, then things get much much slower for the
> int-lookup iteration, even with the patch to avoid intermediate stringp
> allocation.

In hindsight this should not have surprised me.

If a key is present, then the only additional work we are paying is to turn the integer into a string (potentially on the stack) and then looking it up in the intern table, where we find it and are done.

If a key is absent, then we actually go through the work of interning the string version of the int key (with all the presumed overheads like load balancing the table), all for naught since the key never actually gets used (since it is never going to be a key in the object).

Anyway the obvious thing is to avoid interning the string version of the int at all; that goes right along with avoiding allocating it.
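The present/absent asymmetry described above can be sketched as follows. This is an illustration under stated assumptions, not the VM's implementation: `InternTable`, `PropertyMap`, `getInterning`, and `getObserving` are hypothetical names, and the containers are standard-library stand-ins. The sketch only addresses the interning side effect; the key is still passed as a `std::string` for brevity.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <unordered_set>

// Hypothetical intern table: one canonical std::string per distinct spelling.
// Element addresses in std::unordered_set stay valid across rehashing.
struct InternTable {
    std::unordered_set<std::string> strings;

    // Inserts the spelling if needed and returns its canonical address.
    const std::string* intern(const std::string& s) {
        return &*strings.insert(s).first;
    }
    // Pure observation: returns the canonical address or nullptr, never inserts.
    const std::string* find(const std::string& s) const {
        const auto it = strings.find(s);
        return it == strings.end() ? nullptr : &*it;
    }
};

// Hypothetical object property map keyed by interned-string identity.
using PropertyMap = std::unordered_map<const std::string*, int>;

// Old shape: intern first, then look up. An absent key still grows the table.
inline const int* getInterning(InternTable& t, const PropertyMap& props,
                               const std::string& key) {
    const auto it = props.find(t.intern(key));
    return it == props.end() ? nullptr : &it->second;
}

// New shape: a key that was never interned cannot be a property anywhere,
// so the lookup can report "absent" without touching the intern table.
inline const int* getObserving(const InternTable& t, const PropertyMap& props,
                               const std::string& key) {
    const std::string* interned = t.find(key);
    if (interned == nullptr) return nullptr;
    const auto it = props.find(interned);
    return it == props.end() ? nullptr : &it->second;
}
```

In this sketch, a miss through `getObserving` leaves `strings.size()` unchanged, while the same miss through `getInterning` adds an entry that no object ever uses.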
Assignee | ||
Updated•14 years ago
|
Whiteboard: PACMAN
Assignee | ||
Comment 8•14 years ago
|
||
I have come up with a series of patches, culminating in one that avoids interning integers that are being used for a getprop lookup.

However, performance results are mixed: the changeset causes the key-absent case to go much faster, but has slowed down the key-present case significantly, which I had not expected.

I'm putting the patches up just so the work is not lost, but I do not recommend that most of these changes be incorporated until this performance issue is resolved. (The main exception is the patch to avoid intermediate stringp allocation; that seems like a clear win when it matters, as the numbers will illustrate.)

With each patch, I'll be posting the measurements I gathered via a new version of the cmp_int_and_str_lookup benchmark.
Assignee | ||
Comment 9•14 years ago
|
||
Revised benchmark that times iterated lookup for four different cases: (1) where the keys have never been interned, (2) where the keys have been interned elsewhere, (3) where the keys sparsely populate the target, and (4) where the keys densely populate the target.

Baseline performance for Mac 2.8 GHz (running on battery), 32-bit build:

initialization complete 289ms
nowhere int_dur: 7103ms str_dur: 246ms factor: 28.87
elsewhe int_dur: 398ms str_dur: 228ms factor: 1.74
herespa int_dur: 4033ms str_dur: 241ms factor: 16.73
hereden int_dur: 368ms str_dur: 200ms factor: 1.84

This illustrates that int-lookup is ~29x slower than str-lookup in case (1), and ~17x slower in case (3), but is ~1.8x slower in cases (2) and (4).
Attachment #435941 -
Attachment is obsolete: true
Assignee | ||
Comment 10•14 years ago
|
||
initialization complete 247ms
nowhere int_dur: 5435ms str_dur: 193ms factor: 28.16
elsewhe int_dur: 317ms str_dur: 276ms factor: 1.14
herespa int_dur: 3106ms str_dur: 240ms factor: 12.94
hereden int_dur: 244ms str_dur: 206ms factor: 1.18

This illustrates that avoiding the intermediate stringp allocation helps cases (2), (3), and (4). It actually helps case (1) as well, as you can see if you look at the changes in the absolute numbers; the performance factor did not drop as much because both the int- and str- tests are getting sped up by the reduced gc pressure.

(This reveals an error in my benchmarking strategy; I really should be comparing completely separate avm runs to stop such artifacts from polluting the results; but I'm trying to tie this ticket up for the short term, not make more work for myself.)
Assignee | ||
Comment 11•14 years ago
|
||
initialization complete 242ms
nowhere int_dur: 5573ms str_dur: 261ms factor: 21.35
elsewhe int_dur: 271ms str_dur: 230ms factor: 1.17
herespa int_dur: 3185ms str_dur: 245ms factor: 13
hereden int_dur: 242ms str_dur: 210ms factor: 1.15

This is a refactoring in prep for a future addition; it should not affect performance significantly.
Assignee | ||
Comment 12•14 years ago
|
||
initialization complete 243ms
nowhere int_dur: 5559ms str_dur: 250ms factor: 22.23
elsewhe int_dur: 276ms str_dur: 235ms factor: 1.17
herespa int_dur: 3164ms str_dur: 248ms factor: 12.75
hereden int_dur: 246ms str_dur: 214ms factor: 1.14

A future patch needed to calculate the hashcode of an integer, so here is that functionality. I spent a while trying to devise a more straight-forward approach, but it is a tricky problem.

Nothing in this patch should exercise the new code, so performance should be unchanged.
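The requirement here is that hashing the integer must give exactly the hash its decimal string would have, or the intern-table probe lands in the wrong place. A minimal sketch of one way to get that property is below. It assumes a byte-wise hash (32-bit FNV-1a, chosen only for illustration; the thread does not say which hash function the VM's strings use), and `hashChars` and `hashUintAsDecimal` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <string_view>

// Stand-in string hash (32-bit FNV-1a). This is an assumption for the sketch,
// not the VM's real string hash.
inline std::uint32_t hashChars(std::string_view s) {
    std::uint32_t h = 2166136261u;
    for (const unsigned char c : s) {
        h ^= c;
        h *= 16777619u;
    }
    return h;
}

// Hashes the decimal spelling of `value` without constructing a heap string.
// Digits are produced right-to-left into a stack buffer, then hashed
// left-to-right so the result equals hashChars of the equivalent string.
inline std::uint32_t hashUintAsDecimal(std::uint32_t value) {
    char buf[10];  // UINT32_MAX has 10 decimal digits
    char* const end = buf + sizeof buf;
    char* p = end;
    do {
        *--p = static_cast<char>('0' + value % 10);
        value /= 10;
    } while (value != 0);
    assert(p >= buf);
    return hashChars(std::string_view(p, static_cast<std::size_t>(end - p)));
}
```

The invariant worth checking in any real version is `hashUintAsDecimal(n) == hashChars(std::to_string(n))` for every `n`, including 0 and the 10-digit maximum.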
Assignee | ||
Comment 13•14 years ago
|
||
intern-observers.patch

initialization complete 243ms
nowhere int_dur: 5497ms str_dur: 229ms factor: 24
elsewhe int_dur: 277ms str_dur: 239ms factor: 1.15
herespa int_dur: 3148ms str_dur: 260ms factor: 12.1
hereden int_dur: 245ms str_dur: 215ms factor: 1.13

Again, this is functionality required by a future patch. It is also related to bug 561080.

(The next patch is going to actually exercise this functionality.)
Assignee | ||
Comment 14•14 years ago
|
||
The goal! Except it's not an overall win:

nowhere int_dur: 845ms str_dur: 249ms factor: 3.39
elsewhe int_dur: 785ms str_dur: 234ms factor: 3.35
herespa int_dur: 791ms str_dur: 236ms factor: 3.35
hereden int_dur: 792ms str_dur: 214ms factor: 3.7

This patch sped up the cases where the key is absent everywhere, but at the cost of slowing down the cases where the key is present somewhere (even in a different property table). That's no good.

Nothing obvious sprang out at me as to why this would be so; it seems like the workload would be the same in either case, /except/ if there's something funny going on where I am inadvertently introducing a double hash calculation on a common path, but I cannot tell where that would be coming from.

So, again, I'm not uploading this to be checked in; I'm just checkpointing the work before I move on to something else.
Assignee: fklockii → nobody
Priority: P3 → --
Target Milestone: flash10.x - Serrano → Future
Assignee | ||
Comment 16•14 years ago
|
||
The problem hasn't been fixed. I proposed one fix, but it caused regressions elsewhere (see comment 14). I could close as WONTFIX, but that makes it sound like some sort of real policy decision has been made, when really it is just that I couldn't think of anything to do about the problem at the time I stopped working on it. I'll skim over the patch I originally proposed and see if any new approach/fix jumps out at me.
Assignee | ||
Comment 17•14 years ago
|
||
Comment on attachment 441162
avoid intermediate stringp allocation when interning an int

This patch was already checked in as part of Bug 564167.
See TR changeset 4873:580383e5a6f6
http://hg.mozilla.org/tamarin-redux/rev/580383e5a6f6
Attachment #441162 -
Attachment is obsolete: true
Assignee | ||
Comment 18•14 years ago
|
||
(In reply to comment #14)
> Nothing obvious sprung out at me as to why this would be so; it seems like the
> workload would be the same in either case, /except/ if there's something funny
> going on where I am inadvertently introducing a double hash calculation on a
> common path, but I cannot tell where that would be coming from.

When running the attached benchmark with the proposed patches, the number of probes of the intern table during getUintProperty lookups seems to consistently hover around 200, which seems high to me.

(Some of my instrumentation seemed to indicate that before the last patch, the probe count was more like 50, which could explain a 4x slowdown.)

Maybe there's a policy issue that is masked with the old code because it aggressively inserts new strings into the intern table. Continuing my investigation.
Assignee | ||
Comment 19•14 years ago
|
||
(In reply to comment #18)
> (Some of my instrumentation seemed to indicate that before the last patch, the
> probe count was more like 50, which could explain a 4x slowdown.)

Even 50 seems ridiculously high though, which makes me suspect something was/is buggy with my instrumentation technique. But this still seems like a fruitful area of attack.
Assignee | ||
Comment 20•14 years ago
|
||
(In reply to comment #19)
> (In reply to comment #18)
> > (Some of my instrumentation seemed to indicate that before the last patch, the
> > probe count was more like 50, which could explain a 4x slowdown.)
>
> Even 50 seems ridiculously high though, which makes me suspect something was/is
> buggy with my instrumentation technique. But this still seems like a fruitful
> area of attack.

The 200 probe count seems like it was valid (verified via more direct instrumentation in the table lookup looping routine); the 50 probe count was not (the probe count tends to be much smaller).

Most important: This was the clue that led to the bug in my patch: I was not accessing m_index properly. It looks like String::parseIndex may be one reasonable option; another would be to make a new variant that does not compute the index if it does not already exist (and signals this via an appropriate return code).

But the point is that I've gotten the probe count down so that it looks like it matches what you'd get with the old convert-to-string before lookup approach.

Will hopefully have finished version ready soon.
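For readers unfamiliar with the "direct instrumentation in the table lookup looping routine" mentioned above, the idea can be sketched like this. It is a generic illustration, not the VM's lookup loop: linear probing, a power-of-two table, and the names `ProbeStats` and `findCounted` are all assumptions made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Counters bumped inside the probe loop itself, so they cannot drift from
// what the lookup really does.
struct ProbeStats {
    std::uint64_t lookups = 0;
    std::uint64_t probes = 0;
};

// Linear-probing lookup over a power-of-two table of nonzero keys (0 = empty).
// Every slot inspection counts as one probe.
inline bool findCounted(const std::vector<std::uint32_t>& slots,
                        std::uint32_t key, std::uint32_t hash,
                        ProbeStats& stats) {
    assert(!slots.empty() && (slots.size() & (slots.size() - 1)) == 0);
    const std::size_t mask = slots.size() - 1;
    ++stats.lookups;
    std::size_t i = hash & mask;
    for (std::size_t n = 0; n < slots.size(); ++n) {
        ++stats.probes;
        if (slots[i] == key) return true;   // found
        if (slots[i] == 0) return false;    // hit an empty slot: absent
        i = (i + 1) & mask;
    }
    return false;  // table full and key not present
}
```

Dividing `probes` by `lookups` gives the average probe count per lookup. A wrong hash or a wrong comparison key makes lookups walk long chains before giving up, which shows up as an inflated average of the kind described in this comment.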
Assignee | ||
Comment 21•14 years ago
|
||
Woot:

initialization complete 778ms
nowhere int_dur: 4289ms str_dur: 704ms factor: 6.09
elsewhe int_dur: 919ms str_dur: 965ms factor: 0.95
herespa int_dur: 2664ms str_dur: 643ms factor: 4.14
hereden int_dur: 576ms str_dur: 622ms factor: 0.92
Assignee | ||
Comment 22•14 years ago
|
||
(an overhead of 4x-6x over the string itself still seems like an opportunity for improvement, but this is much better than the 12x-24x from before.)
Assignee | ||
Updated•14 years ago
|
Attachment #441163 -
Attachment description: refactor: lift shared load balancing code to one function → 1refactor: lift shared load balancing code to one function
Assignee | ||
Updated•14 years ago
|
Attachment #441166 -
Attachment description: methods to hash an uint w/o intermediate string construction → 2methods to hash an uint w/o intermediate string construction
Assignee | ||
Updated•14 years ago
|
Attachment #441167 -
Attachment description: methods to observe whether uint is interned → 3methods to observe whether uint is interned
Assignee | ||
Updated•14 years ago
|
Attachment #441172 -
Attachment description: avoids interning at all on indexed property lookup → 4avoids interning at all on indexed property lookup
Assignee | ||
Comment 23•14 years ago
|
||
Revised to use String::parseIndex method rather than accessing the m_extra.index state directly (which may be uninitialized or simply invalid anyway). This fixes the performance problems I was seeing (and probably fixes correctness issues that I had not yet seen).
Attachment #441166 -
Attachment is obsolete: true
Assignee | ||
Comment 24•14 years ago
|
||
just a rebase of the patch.
Assignee | ||
Updated•14 years ago
|
Attachment #441167 -
Attachment is obsolete: true
Assignee | ||
Comment 25•14 years ago
|
||
Did a benchmarking run (on my MacBookPro, a 2.8 GHz Intel Core 2 Duo with 4 GB of RAM). Most of the numbers aren't terribly interesting since this is fixing a slow path for a particularly special case that we probably don't hit often.

I'm only posting the v8 numbers below. The others seemed low enough to be in the noise and not worth reporting.

There are a couple of --'s in the v8's below with about a 7% speed regression that I may want to look into a little more deeply, if I can replicate them consistently, before I put this up for review (which will hopefully be very soon).

Metric: v8 custom v8 normalized metric (hardcoded in the test)

Dir: v8/
  crypto           474    471.2    470    466.8   -0.8   -0.9
  deltablue       1497   1487     1515   1510      1.2    1.5   +
  earley-boyer    1006   1002.4   1010   1006.6    0.4    0.4
  raytrace        2932   2927.2   2926   2918     -0.2   -0.3
  richards         955    945.6    967    958.2    1.3    1.3
Dir: v8/typed/
  crypto           478    472.4    483    480.6    1.0    1.7
  deltablue       2469   2453.8   2460   2445.8   -0.4   -0.3
  earley-boyer    1006    995     1012   1000      0.6    0.5
  raytrace        6378   6349.8   6372   6360.4   -0.1    0.2
  richards        1927   1918.2   1906   1904     -1.1   -0.7   -
Dir: v8.5/js/
  crypto           423    422.6    422    421.2   -0.2   -0.3
  deltablue        306    305.8    284    281.6   -7.2   -7.9   --
  earley-boyer     995    979.6    988    978.6   -0.7   -0.1
  raytrace         652    649.6    647    644.6   -0.8   -0.8   -
  regexp          69.5     68.9   70.8     70.2    1.9    1.9   +
  richards         254    253      248    245     -2.4   -3.2   -
  splay            732    725.8    758    747      3.6    2.9   +
Dir: v8.5/optimized/
  crypto          3751   3717.4   3743   3711.8   -0.2   -0.2
  deltablue       2974   2937     2991   2930.4    0.6   -0.2
  earley-boyer    1007    982.4   1006    988.2   -0.1    0.6
  raytrace        7640   7589.8   7640   7608.8    0      0.3
  regexp          68.4     67.1   70.3     69.3    2.8    3.2
  richards        3593   3552     3611   3594.4    0.5    1.2
  splay           5754   5604.4   5311   5249.6   -7.7   -6.3   --
Dir: v8.5/typed/
  crypto          2179   2164.2   2263   2252.2    3.9    4.1   +
  deltablue       3244   3190.8   3222   3125.2   -0.7   -2.1
  earley-boyer    1009   1002.4   1010   1003.8    0.1    0.1
  raytrace        7680   7648.8   7648   7620.8   -0.4   -0.4
  regexp          68.9     68.3   70.5     70.4    2.3    3.0   +
  richards        3567   3233     3611   3490.4    1.2    8.0
  splay            942    926.4    923    863.6   -2.0   -6.8
Dir: v8.5/untyped/
  crypto           455    453.8    454    451.8   -0.2   -0.4
  deltablue       1537   1528.4   1527   1523.6   -0.7   -0.3   -
  earley-boyer    1013   1010.4   1005   1001     -0.8   -0.9   -
  raytrace        3130   3116.2   3133   3118      0.1    0.1
  regexp          68.9     68.2   70.4     69.3    2.2    1.6   +
  richards         396    390      401    400      1.3    2.6
  splay            902    858.4    910    881.6    0.9    2.7
Assignee | ||
Comment 26•14 years ago
|
||
The 7% regression on deltablue and splay seems to be attributable to the fourth (and final) queued patch (attachment 441172).
Assignee | ||
Comment 27•14 years ago
|
||
(In reply to comment #26)
> The 7% regression on deltablue and splay seems to be attributable to the fourth
> (and final) queued patch (attachment 441172).

But at the same time, control does not seem to actually flow through any of isInternedUint for the regressing benchmarks. Is the regression an artifact of program address changes after adding the code? Or am I missing something here?

(NB these particular benchmarks also seem particularly noisy. I may give my frequency locked linux box a shot and see how they look there.)

I'm going to put this up for review, and in parallel I'll generate and attach the full data set as a csv file.
Assignee | ||
Updated•14 years ago
|
Attachment #441163 -
Flags: superreview?(stejohns)
Attachment #441163 -
Flags: review?(treilly)
Assignee | ||
Updated•14 years ago
|
Attachment #481646 -
Flags: superreview?(stejohns)
Attachment #481646 -
Flags: review?(treilly)
Assignee | ||
Updated•14 years ago
|
Attachment #481647 -
Flags: superreview?(stejohns)
Attachment #481647 -
Flags: review?(treilly)
Assignee | ||
Updated•14 years ago
|
Attachment #441172 -
Flags: superreview?(stejohns)
Attachment #441172 -
Flags: review?(treilly)
Updated•14 years ago
|
Attachment #441163 -
Flags: superreview?(stejohns) → superreview+
Assignee | ||
Comment 28•14 years ago
|
||
Performance results on linux look like it's all in the noise. This, combined with my earlier investigation into the control flow of the regressing benchmarks outlined in earlier comments, leads me to think that the 7% regression is a fluke on Mac OS X.
Comment 29•14 years ago
|
||
Comment on attachment 481646
2methods to hash an uint w/o intermediate string construction

nit: should never check in code with "#if 1" (or "#if 0")... if you want the dead code in hashCodeUInt to remain, put it in a comment with an explanation
Attachment #481646 -
Flags: superreview?(stejohns) → superreview+
Updated•14 years ago
|
Attachment #481647 -
Flags: superreview?(stejohns) → superreview+
Comment 30•14 years ago
|
||
Comment on attachment 441172
4avoids interning at all on indexed property lookup

nit: add comment explaining that if the key isn't interned, it couldn't possibly be found, hence no lookup necessary
Attachment #441172 -
Flags: superreview?(stejohns) → superreview+
Comment 31•14 years ago
|
||
Comment on attachment 441163
1refactor: lift shared load balancing code to one function

+1 but numStringsCheckLoadBalance belongs in AvmCore-inlines.h no? I thought the whole point of the -inlines.h files was that REALLY_INLINE in a cpp doesn't work.
Attachment #441163 -
Flags: review?(treilly) → review+
Comment 32•14 years ago
|
||
Comment on attachment 481646
2methods to hash an uint w/o intermediate string construction

+1 but again I'd put those REALLY_INLINE helpers in StringObject-inlines.h (creating it if necessary).
Attachment #481646 -
Flags: review?(treilly) → review+
Assignee | ||
Comment 33•14 years ago
|
||
The impression I had was that the -inlines.h files were to work around scope issues when you want to write inline functions that refer to members of classes defined in other headers. I was not aware of any issue with having a REALLY_INLINE function defined in a .cpp (assuming that the inline function in question is only referred to in that .cpp).

Should private methods that are only referenced in the cpp still go into the -inlines.h? I'm happy to follow whatever directive I get, I just want to make sure I understand the issue.
Comment 34•14 years ago
|
||
Comment on attachment 481647
3methods to observe whether uint is interned

A method called is* is usually const and doesn't have a dramatic undocumented side effect like automatically interning all negative integers.
Attachment #481647 -
Flags: review?(treilly) → review-
Comment 35•14 years ago
|
||
REALLY_INLINE in a .cpp is fine (preferred, maybe) if all callers are in the same file. the goal with -inlines.h is to separate API from implementation by moving inline code out of .h files.
Comment 36•14 years ago
|
||
Comment on attachment 441172
4avoids interning at all on indexed property lookup

Why do we never have to look up integers that aren't interned? Are we implicitly relying on the fact that integer properties if they exist will be interned already? If so some comments would be nice. Otherwise looks fine.
Attachment #441172 -
Flags: review?(treilly) → review+
Assignee | ||
Comment 37•14 years ago
|
||
(In reply to comment #34)
> Comment on attachment 481647
> 3methods to observe whether uint is interned
>
> A method called is* is usually const and doesn't have a dramatic undocumented
> side effect like automatically interning all negative integers.

I assume you're referring to the logic at the beginning that potentially invokes internDouble. I'm shocked that I kept overlooking that issue. Will remove and reevaluate (that logic came from somewhere; it's just a question of whether it's a dumb cut-and-pasto or if it was working around something bad.)
Comment 38•14 years ago
|
||
(In reply to comment #35)
> REALLY_INLINE in a .cpp is fine (preferred, maybe) if all callers are in the
> same file. the goal with -inlines.h is to separate API from implementation by
> moving inline code out of .h files.

I keep forgetting that rule, never mind that criticism, Felix.
Comment 39•14 years ago
|
||
(In reply to comment #35)
> REALLY_INLINE in a .cpp is fine (preferred, maybe) if all callers are in the
> same file. the goal with -inlines.h is to separate API from implementation by
> moving inline code out of .h files.

The only exception to the rule being that 'REALLY_INLINE static' methods in cpp files do not compile with SunPro C++.
Assignee | ||
Comment 40•14 years ago
|
||
(In reply to comment #37)
> (In reply to comment #34)
> > Comment on attachment 481647
> > 3methods to observe whether uint is interned
> >
> > A method called is* is usually const and doesn't have a dramatic undocumented
> > side effect like automatically interning all negative integers.
>
> I assume you're referring to the logic at the beginning that potentially
> invokes internDouble. I'm shocked that I kept overlooking that issue. Will
> remove and reevaluate (that logic came from somewhere; it's just a question of
> whether it's a dumb cut-and-pasto or if it was working around something bad.)

Oh, I think my motivation was that I did not want to spend time trying to figure out how to compute the hashcode without doing the uint->double->string conversion first.

But the easiest workaround is to do the uint->double->string conversion on this path alone, then compute the hashcode from the resulting string.

(Another workaround is to change the name of the function, removing the "is", to make it clear that it guarantees strictly nothing more than what its docs say. The old control flow path that isUintInterned is replacing in the 4th patch (attachment 441172) would be doing the intern anyway, so this is not crazy. But I want to see if the other approach is doable first.)
Assignee | ||
Comment 41•14 years ago
|
||
Revised initial logic of isInternedUint so that it does not modify the intern table when input uint has its high-bit set. Still need to add const qualifier suggested by Tommy in his review. (I overlooked it this time.)
Attachment #481647 -
Attachment is obsolete: true
Assignee | ||
Comment 42•14 years ago
|
||
(In reply to comment #41)
> Still need to add const qualifier suggested by Tommy in his review. (I
> overlooked it this time.)

Tried adding const qualifier to the method, but then isInternedUint cannot call out to findString et al because they are not const-qualified. I think it's beyond the scope of this bug to investigate that. So I'm going to put the revised patch up for review.
Assignee | ||
Updated•14 years ago
|
Attachment #486242 -
Flags: superreview?(stejohns)
Attachment #486242 -
Flags: review?(treilly)
Comment 43•14 years ago
|
||
Comment on attachment 486242
3methods to observe whether uint is interned

if this needs to be fast could you avoid the second branch and just do:

    other = strings[iSlotForIndex];
    *result = other;
    return other > AVMPLUS_STRING_DELETED;

Looks good, consider this feedback optional.
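The single-comparison trick suggested above works when the two "no string here" markers are the smallest possible slot values, so one `>` test rules out both. A minimal sketch follows, with assumptions called out: slots are modelled as `std::uintptr_t`, `kEmpty` and `kDeleted` are hypothetical constants (0 and 1) standing in for a null slot and AVMPLUS_STRING_DELETED, and `slotIsLive` is an illustrative name. The actual value of AVMPLUS_STRING_DELETED is not given in this thread.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical slot encoding: 0 = never used, 1 = tombstone for a deleted
// entry, anything larger = address of a live string.
constexpr std::uintptr_t kEmpty = 0;
constexpr std::uintptr_t kDeleted = 1;

// Reads the slot, always reports its raw value through *result, and answers
// "is this a live entry?" with one comparison instead of two equality tests.
inline bool slotIsLive(const std::uintptr_t* slots, std::size_t index,
                       std::uintptr_t* result) {
    assert(slots != nullptr && result != nullptr);
    const std::uintptr_t other = slots[index];
    *result = other;
    return other > kDeleted;  // false for both kEmpty and kDeleted
}
```

The sketch compares integer values rather than raw pointers because relational comparison of unrelated pointers is not well defined in standard C++; a real table would need to pick whichever representation its sentinel macro actually uses.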
Attachment #486242 -
Flags: review?(treilly) → review+
Updated•14 years ago
|
Attachment #486242 -
Flags: superreview?(stejohns) → superreview+
Assignee | ||
Comment 44•14 years ago
|
||
Dan and Tommy suggested that I look a little bit more at the 7% regression -- e.g. see if it is replicable on my desktop mac, perhaps post comparative output from sharking the two binaries. Am doing that now.
Comment 45•14 years ago
|
||
Can you post one big patch and I will investigate on Windows + VTUNE?
Comment 46•14 years ago
|
||
never mind, i used the 4 little patches okay.
Assignee | ||
Comment 47•14 years ago
|
||
(just rebasing; inherits r=treilly and sr=stejohns.)
Attachment #441163 -
Attachment is obsolete: true
Assignee | ||
Comment 48•14 years ago
|
||
(rebasing and removed #if0'ed code at stejohns' request; inherits r=treilly and sr=stejohns.)
Attachment #481646 -
Attachment is obsolete: true
Comment 49•14 years ago
|
||
v8.5/optimized/splay.cpp is entirely new object creation/deletion + marking/sweeping, etc. It has some random performance results perhaps related to GC behavior.

v8.5/js/deltablue is a lot of untyped AS code with the top function being getAtomProperty. None of the new code in these patches is anywhere on the VTUNE output.

With a win32 release shell, both these tests vary a bit for each run (a couple percent?) but are basically identical with or without the patches applied.
Assignee | ||
Comment 50•14 years ago
|
||
(In reply to comment #45)
> Can you post one big patch and I will investigate on Windows + VTUNE?

(In reply to comment #46)
> never mind, i used the 4 little patches okay.

Okay, great! I have not done performance testing on windows (at the moment I only have VMware images for Windows, and I do not know if those make good basis for performance evaluation).

When I did performance eval on Linux (see comment 28), I did not see the same problem (the 7% regression) there. So I will not be surprised if you are unable to reproduce the regression on Windows.
Assignee | ||
Comment 51•14 years ago
|
||
rebased, added FIXME tag to note potential future work (with associated bugzilla ticket). (inherits r=treilly and sr=stejohns.)
Attachment #486242 -
Attachment is obsolete: true
Assignee | ||
Comment 52•14 years ago
|
||
rebased, added comment explaining logic (uninterned implies absence) at request of both reviewers. :) (inherits r=treilly and sr=stejohns.)
Attachment #441172 -
Attachment is obsolete: true
Assignee | ||
Comment 53•14 years ago
|
||
(In reply to comment #44)
> Dan and Tommy suggested that i look a little bit more at the 7% regression --
> e.g. see if it is replicable on my desktop mac, perhaps post comparative output
> from sharking the two binaries. Am doing that now.

I cannot replicate the regression any more, even on the MacBookPro that I used for the original measurements. I'm now seeing results like this:

Dir: v8.5/js/
  deltablue        315    311.2    316    314.2    0.3    1.0
Dir: v8.5/optimized/
  splay           5741   5712     5764   5747.6    0.4    0.6

Frustrating. (But it does go along with my earlier hypothesis that this was an artifact of something orthogonal to my patch.)

So I don't think this should hold me up from landing this.
Assignee | ||
Comment 54•14 years ago
|
||
Opened Bug 607627 to track adding an analogous change for the in-operator.
Assignee | ||
Comment 55•14 years ago
|
||
Pushed patches in the order I presented them. (Arguably I could have folded the last three together, but the dependency chain here is pretty sane.)

TR changeset - 5408:6bcd572a8f16
http://hg.mozilla.org/tamarin-redux/rev/6bcd572a8f16

TR changeset - 5409:c76789eefea7
http://hg.mozilla.org/tamarin-redux/rev/c76789eefea7

TR changeset - 5410:848c57c9cb86
http://hg.mozilla.org/tamarin-redux/rev/848c57c9cb86

TR changeset - 5411:520a9ddfe412
http://hg.mozilla.org/tamarin-redux/rev/520a9ddfe412
Assignee | ||
Comment 56•14 years ago
|
||
Ah, one more thing: after sharking the benchmark on this ticket (attachment 441158), I realized that we could be inlining the early-exit fast paths in parseIndex, and leave the rest of the code out-of-line.

That may or may not be outside the scope of this ticket, depending on how the inlining is implemented (i.e. should I inline the fast-paths of parseIndex at every invocation of parseIndex, or just the invocations that I added in these patches for this ticket).

Here are the concrete results I get comparing the original out-of-line parseIndex (i.e. the tip of TR) and a variant that inlines the two early-exit fast paths. Note in particular the 1.5x-2x drop in running times for nowhere int_dur and herespa int_dur below. (I'll attach the patch in a follow-up comment.)

tamarin-redux/objdir-rel32 (hg:bug607627) % $AVM.fast-parse-paths-outline ~/Dev/Bugz/bugz607627/cmp_int_and_str_lookup.abc
initialization complete 201ms
nowhere int_dur: 21322ms str_dur: 1502ms factor: 14.19
elsewhe int_dur: 2159ms str_dur: 2245ms factor: 0.96
herespa int_dur: 12470ms str_dur: 1841ms factor: 6.77
hereden int_dur: 1645ms str_dur: 1676ms factor: 0.98

tamarin-redux/objdir-rel32 (hg:bug607627) % $AVM.fast-parse-paths-inlined ~/Dev/Bugz/bugz607627/cmp_int_and_str_lookup.abc
initialization complete 203ms
nowhere int_dur: 12413ms str_dur: 1507ms factor: 8.23
elsewhe int_dur: 1795ms str_dur: 1948ms factor: 0.92
herespa int_dur: 7311ms str_dur: 1812ms factor: 4.03
hereden int_dur: 1608ms str_dur: 1664ms factor: 0.96
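The general shape of "inline the early exits, keep the rest out of line" might look like the sketch below. It is a guess at the pattern, not the patch itself: the thread does not say which two early-exit tests parseIndex performs, so the two checks here (length bound and first-character test), the accepted range (any value that fits in 32 bits, canonical decimal spelling only), and the names `parseIndexFast` and `parseIndexSlow` are all assumptions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string_view>

// Slow path, intended to stay out of line: full validation and conversion.
// Precondition (established by the caller): 1 <= s.size() <= 10.
bool parseIndexSlow(std::string_view s, std::uint32_t& out) {
    assert(!s.empty() && s.size() <= 10);
    if (s.size() > 1 && s[0] == '0') return false;  // "007" is not canonical
    std::uint64_t v = 0;  // at most 10 digits, so this cannot overflow
    for (const char c : s) {
        if (c < '0' || c > '9') return false;
        v = v * 10 + static_cast<std::uint64_t>(c - '0');
    }
    if (v > 0xFFFFFFFFull) return false;  // does not fit in 32 bits
    out = static_cast<std::uint32_t>(v);
    return true;
}

// Fast path, cheap enough to inline at a hot call site: two early exits that
// reject most non-index strings before paying for a function call.
inline bool parseIndexFast(std::string_view s, std::uint32_t& out) {
    if (s.empty() || s.size() > 10) return false;   // early exit 1: length
    if (s[0] < '0' || s[0] > '9') return false;     // early exit 2: first char
    return parseIndexSlow(s, out);
}
```

The trade-off is code size at each call site against a saved call on the common rejecting path, which is why where the inlining is applied matters.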
Assignee | ||
Comment 57•14 years ago
|
||
Here's the patch I mentioned in comment 56. I'm planning to evaluate how this performs on our benchmark suite; if it performs well, I'll open a separate ticket and document the results there. If it performs poorly, I'll revise the work to isolate the inlining to just the cases covered by the patches posted earlier here.
Assignee | ||
Comment 58•14 years ago
|
||
(In reply to comment #57)
> Created attachment 486738
> inlines 2 early-exit tests in parseIndex
>
> Here's the patch I mentioned in comment 56.
>
> I'm planning to evaluate how this performs on our benchmark suite; if it
> performs well, I'll open a separate ticket and document the results there. If
> it performs poorly, I'll revise the work to isolate the inlining to just the
> cases covered by the patches posted earlier here.

It does not perform all that well.

Mac OS X performance deltas were too noisy and small for me to evaluate quickly.

On my cpu-locked Ubuntu thinkpad, it _hurt_ most benchmarks (by a small amount, but still not good), except for these:

                          avm             avm2
test                   best    avg     best    avg    %dBst  %dAvg
Dir: asmicro/
  lookup-array-in-1     532   530.2     537   534.9     0.9    0.9   +
  vector-push-1          13    13        14    13.8     7.7    6.2   ++
Dir: jsmicro/
  arguments-3             5     5         6     5.8    20     16.0   ++

I'm noting these cases here just so I'll remember to double-check them when I revise the patch to specialize where the inlining happens.
Assignee | ||
Comment 59•14 years ago
|
||
New version of the patch that focuses the inlining on just the one case that this ticket brought up: the case where you're invoking parseIndex from the inner loop of the hash-table lookup for index properties.

Here are the performance results for the microbenchmark on this ticket. Unsurprisingly, inlining at just this call site reaps the benefit that I was trying to achieve earlier when I tried inlining all call sites of parseIndex: lookups using large integers go down to being 4-6x slower than strings rather than 6-14x slower.

test/performance (hg:bug556023) % ../../objdir-rel32/shell/avmshell.findIndexOutline ~/Dev/Bugz/bugz556023/cmp_int_and_str_lookup.abc
initialization complete 234ms
nowhere int_dur: 24465ms str_dur: 1742ms factor: 14.04
elsewhe int_dur: 2099ms str_dur: 2234ms factor: 0.93
herespa int_dur: 14345ms str_dur: 2148ms factor: 6.67
hereden int_dur: 1878ms str_dur: 2006ms factor: 0.93

test/performance (hg:bug556023) % ../../objdir-rel32/shell/avmshell.findIndexInlined ~/Dev/Bugz/bugz556023/cmp_int_and_str_lookup.abc
initialization complete 248ms
nowhere int_dur: 14761ms str_dur: 2284ms factor: 6.46
elsewhe int_dur: 2461ms str_dur: 2674ms factor: 0.92
herespa int_dur: 8527ms str_dur: 2278ms factor: 3.74
hereden int_dur: 1796ms str_dur: 1962ms factor: 0.91

Below are the performance results from my MacBookPro (2.8 GHz Core 2 Duo). The executive summary is that adding this does not make much of a difference for our benchmark suite. (Unsurprising, since it is probably uncommon to be using such large integers as property keys, at least in objects rather than arrays; I should try the above microbenchmark on arrays, actually...)

Anyway, notable exceptions to the "not much of a difference" include:
  simpleStringConcatenation (-26%)
  bitops-bitwise-and (-23%).
There are a couple positive figures sprinkled throughout but nothing jumped out at me as amazing. So, probably still should not land the change even in this more restricted form.

% python runtests.py --iterations 10 --avm ../../objdir-rel32/shell/avmshell.findIndexOutline --avm2 ../../objdir-rel32/shell/avmshell.findIndexOutline
Tamarin tests started: 2010-11-02 19:04:34.521946
Executing 460 test(s)
avm: ../../objdir-rel32/shell/avmshell.findIndexOutline version: cyclone
avm2: ../../objdir-rel32/shell/avmshell.findIndexOutline version: cyclone
iterations: 10

                avm            avm2
test            best    avg    best    avg    %dBst    %dAvg

Metric: v8 custom v8 normalized metric (hardcoded in the test)
Dir: asmicro/
alloc-1 39 38.3 39 38.7 0 1.0
alloc-10 14 14 14 14 0 0
alloc-11 12 12 12 12 0 0
alloc-12 6 6 6 6 0 0
Metric: iterations/second
alloc-13 83 82.2 83 82 0 -0.2
alloc-14 69 68.6 69 68 0 -0.9
alloc-2 17 16.7 18 17 5.9 1.8
alloc-3 15 14.6 15 14.4 0 -1.4
alloc-4 46 46 46 45.7 0 -0.7
alloc-5 34 32.6 34 33.4 0 2.5
alloc-6 63 62.3 63 62.1 0 -0.3
alloc-7 36 36 36 35.9 0 -0.3
alloc-8 15 15 15 15 0 0
alloc-9 15 14.9 15 14.9 0 0
arguments-1 675 672.3 675 672.1 0 -0.0
arguments-2 388 384.9 385 383.6 -0.8 -0.3 -
arguments-3 18 18 18 18 0 0
arguments-4 27 26.3 27 26.1 0 -0.8
array-1 1858 1853.5 1859 1853.3 0.1 -0.0
array-2 590 588.4 592 587.9 0.3 -0.1
array-pop-1 360 358.2 361 357.3 0.3 -0.3
array-push-1 232 230.4 232 229.9 0 -0.2
array-shift-1 130 129.2 131 129.3 0.8 0.1
array-slice-1 18 17.9 18 17.9 0 0
array-sort-1 25 24.6 25 24.5 0 -0.4
array-sort-2 2 2 2 2 0 0
array-sort-3 19 18.2 19 18.6 0 2.2
array-sort-4 8 8 8 8 0 0
array-unshift-1 139 138.7 139 138.1 0 -0.4
closedvar-read-1 4724 4682.4 4752 4713.5 0.6 0.7
closedvar-write-1 3808 3675.3 3825 3751.9 0.4 2.1
closedvar-write-2 3832 3804.9 3831 3810.7 -0.0 0.2
do-1 3962 3956.3 3970 3947.4 0.2 -0.2
for-1 3960 3920 3953 3925.8 -0.2 0.1
for-2 2669 2663.8 2669 2666.9 0 0.1
for-3 2774 2763.6 2776 2771.6 0.1 0.3
for-in-1 358 357.2 357 356.2 -0.3 -0.3
for-in-2 154 151.9 153 151.7 -0.6 -0.1
funcall-1 297 296 297 296.9 0 0.3
funcall-2 197 195.4 198 196.2 0.5 0.4
funcall-3 261 258.9 261 259.3 0 0.2
funcall-4 107 106.6 107 106.7 0 0.1
globalvar-read-1 4761 4750.6 4760 4739.3 -0.0 -0.2
globalvar-write-1 3876 3856.1 3881 3863.4 0.1 0.2
isNaN-1 3910 3899.1 3908 3894.3 -0.1 -0.1
isNaN-2 3954 3949.7 3954 3947 0 -0.1
isNaN-3 3941 3936.1 3931 3920.8 -0.3 -0.4
lookup-array-fetch-1 727.3 718.4 727.3 720.7 0 0.3
lookup-array-in-1 1594.4 1586.0 1594.4 1583.6 0 -0.2
lookup-negindex-array-1 418.2 415.9 417.6 415.8 -0.1 -0.0
lookup-negindex-array-2 359.3 356.1 359.6 358.7 0.1 0.7
lookup-negindex-object-1 438.1 435.0 438.7 434.4 0.1 -0.1
lookup-negindex-object-2 411.6 407.4 412.2 409.0 0.1 0.4
lookup-object-fetch-1 781.2 778.3 781.2 777.9 0 -0.1
lookup-object-in-1 1333.7 1330.2 1332.7 1327.5 -0.1 -0.2
number-toString-1 5.1 5.1 5.1 5.1 0 0.0
number-toString-2 60.6 60.4 60.7 60.3 0.1 -0.1
oop-1 3.6 3.5 3.6 3.6 0.5 0.7
parseFloat-1 68.2 67.8 68.3 68.0 0.1 0.3
parseInt-1 152.5 149.0 150.5 148.5 -1.3 -0.3
regex-exec-1 56.6 55.8 56.8 56.3 0.3 0.9
regex-exec-2 65.9 64.3 66.0 64.2 0.2 -0.2
regex-exec-3 93.1 92.5 92.9 91.9 -0.2 -0.6
regex-exec-4 258.7 257.2 258.7 257.8 0 0.3
restarg-1 671.7 668.2 673.3 666.8 0.2 -0.2
restarg-2 400.8 391.3 401.6 396.5 0.2 1.3
restarg-3 34.7 34.2 34.7 33.7 0 -1.4
restarg-4 27.3 26.8 27.3 26.7 0.2 -0.3
string-casechange-1 24.8 24.6 24.8 24.5 0.1 -0.5
string-casechange-2 24.9 24.6 24.9 24.7 0.2 0.4
string-charAt-1 1518.5 1510.8 1517.5 1510.8 -0.1 0.0
string-charAt-2 73.7 73.2 73.8 73.4 0.1 0.4
string-charCodeAt-1 1168.8 1161.8 1168.8 1162.1 0 0.0
string-charCodeAt-2 1079.9 1057.6 1079.9 1077.9 0 1.9
string-charCodeAt-3 1063.9 966.7 1065.9 997.6 0.2 3.2
string-charCodeAt-4 1971.0 1952.8 1970.0 1958.7 -0.1 0.3
string-charCodeAt-5 893.1 891.1 892.2 889.6 -0.1 -0.2
string-charCodeAt-6 1168.8 1166.6 1167.8 1167.2 -0.1 0.1
string-charCodeAt-7 1984.0 1982.3 1985.0 1982.3 0.1 0
string-fromCharCode-1 249.8 248.9 249.0 247.1 -0.3 -0.7
string-fromCharCode-2 58.3 58.2 58.4 58.3 0.1 0.2
string-indexOf-1 197.6 197.2 197.2 195.5 -0.2 -0.9
string-indexOf-2 126.0 124.8 125.9 125.6 -0.1 0.6
string-indexOf-3 68.4 68.2 68.3 67.7 -0.1 -0.8
string-lastIndexOf-1 522.0 519.2 523.5 519.8 0.3 0.1
string-lastIndexOf-2 126.0 125.4 126.0 125.4 0 0.0
string-lastIndexOf-3 128.9 128.8 128.9 128.7 0 -0.1
string-slice-1 112.2 111.8 112.3 112.1 0.1 0.3
string-split-1 8.8 8.7 8.7 8.7 -0.5 0.3
string-split-2 8.7 8.5 8.7 8.5 -0.1 -0.0
string-substring-1 117.9 117.7 117.9 117.7 0 0.0
switch-1 868.1 866.6 869.1 868.3 0.1 0.2
switch-2 132.6 132.4 132.6 132.4 0 -0.0
switch-3 214.4 214.0 214.9 214.4 0.3 0.2
try-1 197.4 196.8 197.6 196.5 0.1 -0.1
try-2 14.7 14.6 14.7 14.6 0 0.2
try-3 46.7 46.6 46.6 46.5 -0.1 -0.1
vector-push-1 40.3 40.2 40.2 40.0 -0.1 -0.4
while-1 3960.0 3957.6 3964.0 3959.6 0.1 0.1

Metric: time
Dir: jsbench/
Crypt 3706 3709 3702 3707.5 0.1 0.0
Euler 7739 7770.8 7740 7767.3 -0.0 0.0
FFT 6787 6834.4 6802 6861.2 -0.2 -0.4
HeapSort 3413 3452.6 3417 3446.3 -0.1 0.2
LUFact 5443 5461.4 5427 5450.8 0.3 0.2
Moldyn 11111 11161.2 11101 11151.9 0.1 0.1
RayTracer 7092 7148.5 7090 7144.5 0.0 0.1
SOR 31073 31247.2 31035 31175.7 0.1 0.2
Series 8734 8754.4 8738 8772.6 -0.0 -0.2
SparseMatmult 9427 9477.4 9430 9511.9 -0.0 -0.4
Dir: jsbench/typed/
Crypt 981 984.7 981 983.9 0 0.1
Euler 8575 8603.5 8561 8615.1 0.2 -0.1
FFT 1919 1922.2 1921 1924.1 -0.1 -0.1
HeapSort 1263 1267.8 1261 1267.7 0.2 0.0
LUFact 1705 1717.9 1705 1708.5 0 0.5
Moldyn 3838 3870.3 3843 3872.4 -0.1 -0.1
RayTracer 1397 1399 1395 1398.4 0.1 0.0
SOR 5698 5723.2 5701 5724.6 -0.1 -0.0
Series 7792 7819.2 7797 7813 -0.1 0.1
SparseMatmult 2919 2933.6 2932 2941.5 -0.4 -0.3

Metric: iterations/second
Dir: jsmicro/
alloc-1 37.7 37.4 37.7 37.5 0 0.4
alloc-10 13.9 13.9 13.9 13.9 0 -0.0
alloc-11 11.7 11.6 11.7 11.6 0.1 0.3
alloc-12 6.7 6.7 6.7 6.7 0.1 0.2
alloc-13 72.2 72.0 72.2 72.0 0 0.0
alloc-14 61.9 60.8 61.6 60.4 -0.4 -0.6
alloc-2 17.3 17.2 17.4 17.3 0.4 0.4
alloc-3 15.0 14.8 15.0 14.7 0.4 -0.4
alloc-4 44.0 43.8 44.1 43.7 0.1 -0.1
alloc-5 32.6 32.6 32.6 32.5 -0.2 -0.1
alloc-6 56.9 56.5 56.9 56.9 0 0.5
alloc-7 38.8 38.8 38.8 38.3 0 -1.2
alloc-8 15.4 15.2 15.4 15.3 -0.1 0.3
alloc-9 15.4 15.2 15.4 15.3 -0.2 0.3
arguments-1 142.9 142.4 143.1 142.8 0.2 0.3
arguments-2 93.6 93.3 93.5 93.2 -0.1 -0.1
arguments-3 16.4 16.3 16.4 16.4 -0.1 0.3
array-1 360.6 360.3 360.9 360.4 0.1 0.0
array-2 277.2 276.9 276.9 276.6 -0.1 -0.1
array-pop-1 57.9 57.4 57.8 57.5 -0.3 0.2
array-push-1 40.1 39.5 40.1 39.9 0 1.0
array-shift-1 52.2 51.2 52.1 51.8 -0.1 1.1
array-slice-1 15.1 15.0 15.1 15.0 0 0.1
array-sort-1 23.9 23.8 24.0 23.8 0.2 0.2
array-sort-2 2.4 2.4 2.4 2.4 0.3 0.2
array-sort-3 19.8 19.7 19.7 19.5 -0.4 -0.6
array-sort-4 9.3 9.3 9.4 9.3 0.4 0.4
array-unshift-1 21.5 21.4 21.5 21.5 0 0.2
closedvar-read-1 613.8 612.8 613.8 612.4 0 -0.1
closedvar-write-1 419.6 419.0 419.2 418.9 -0.1 -0.0
closedvar-write-2 425.6 423.8 425.1 423.9 -0.1 0.0
do-1 632.7 631.8 632.7 632.4 0 0.1
for-1 607.8 602.8 607.4 601.5 -0.1 -0.2
for-2 200.8 200.5 201.0 200.8 0.1 0.2
for-3 153.4 153.0 153.2 152.7 -0.1 -0.2
for-in-1 270.5 270.1 270.7 270.3 0.1 0.1
for-in-2 133.2 133.0 133.2 132.9 0 -0.1
funcall-1 194.6 194.4 194.6 194.4 0 0.0
funcall-2 190.6 190.3 190.4 190.3 -0.1 -0.0
funcall-3 187.4 186.6 187.6 187.1 0.1 0.2
funcall-4 1255.7 1253.6 1255.7 1252.8 0 -0.1
globalvar-read-1 617.4 615.6 617.8 616.4 0.1 0.1
globalvar-write-1 428.1 426.8 428.6 426.9 0.1 0.0
isNaN-1 546.5 545.2 547.5 546.1 0.2 0.2
lookup-array-fetch-1 609.8 608.9 610.4 609.3 0.1 0.1
lookup-array-in-1 939.1 937.4 938.1 934.5 -0.1 -0.3
lookup-object-fetch-1 645.7 640.1 645.4 640.7 -0.1 0.1
lookup-object-in-1 832.2 828.6 832.3 829.8 0.0 0.2
number-toString-1 5.1 5.1 5.1 5.1 -0.2 -0.1
number-toString-2 55.4 55.3 55.5 55.4 0.1 0.1
oop-1 3.6 3.6 3.6 3.6 0.1 0.5
parseFloat-1 50.8 50.7 50.7 50.6 -0.1 -0.3
parseInt-1 109.1 108.0 109.2 107.4 0.1 -0.5
regex-exec-1 49.5 49.4 49.5 49.3 0 -0.2
regex-exec-2 59.2 58.9 59.7 59.3 0.8 0.6
regex-exec-3 84.3 84.2 84.3 84.1 0 -0.0
regex-exec-4 215.8 214.1 215.4 213.5 -0.2 -0.3
string-casechange-1 15.8 15.8 15.8 15.8 0.1 0.0
string-casechange-2 15.9 15.9 15.9 15.9 0 0.0
string-charAt-1 85.8 84.9 85.8 84.8 0 -0.1
string-charAt-2 39.4 38.8 39.4 38.9 -0.1 0.1
string-charCodeAt-1 80.9 80.9 81.0 80.9 0.1 0.0
string-charCodeAt-2 82.0 81.8 82.0 81.9 0 0.1
string-fromCharCode-1 69.6 69.5 69.4 69.4 -0.2 -0.1
string-fromCharCode-2 35.6 35.4 35.6 35.6 0.1 0.5
string-fromCharCode-3 61.8 61.6 61.8 61.6 0 -0.0
string-fromCharCode-4 63.6 63.5 63.6 63.4 0.1 -0.0
string-indexOf-1 72.6 72.3 72.6 71.7 0 -0.7
string-indexOf-2 49.7 49.5 49.7 49.5 0 0.1
string-indexOf-3 34.5 34.3 34.6 34.2 0.2 -0.4
string-lastIndexOf-1 72.6 72.5 72.6 72.5 0.1 -0.0
string-lastIndexOf-2 49.8 49.7 49.8 49.7 0 0.0
string-lastIndexOf-3 50.2 50.2 50.3 50.3 0.1 0.1
string-slice-1 45.0 44.9 45.0 44.9 -0.0 0.0
string-split-1 7.9 7.9 8.0 7.9 0.6 0.2
string-split-2 7.9 7.9 7.9 7.9 0 -0.3
string-substring-1 45.4 45.2 45.3 45.2 -0.1 0.0
switch-1 105.6 105.3 105.6 105.2 0 -0.0
switch-2 62.5 62.2 62.3 62.1 -0.3 -0.0
switch-3 73.4 73.4 73.5 73.4 0.1 0.0
try-1 130.3 130.1 130.5 130.3 0.1 0.1
try-2 14.0 13.9 14.0 13.9 0 -0.2
try-3 36.8 36.7 36.8 36.7 0 0.0
while-1 607.8 607.1 607.4 606.5 -0.1 -0.1

Metric: time
Dir: language/describetype/
desctypeperf 577 579.8 579 582.2 -0.3 -0.4
Dir: language/e4x/
addingToXMLList 17 17.1 17 17.1 0 0
appendChildAndString 49 49.8 49 49.7 0 0.2
concatenatingStringsFromE4X 7 7 7 7.1 0 -1.4
simpleStringConcatenation 1 1.5 1 1.9 0 -26.7
usingAppendChildAndE4X 51 51.5 51 51.6 0 -0.2
Dir: language/string/
append_concat 92 92.8 92 93.1 0 -0.3
append_equal_plus 79 79.3 79 79.5 0 -0.3
append_plus_equal 79 79.4 79 79.8 0 -0.5
charAt 178 178.2 178 178.8 0 -0.3
charCodeAt 215 216.6 216 216.8 -0.5 -0.1
indexOf 529 529.7 529 529.7 0 0
lastIndexOf 200 200.5 199 200 0.5 0.2
replace 546 548.7 546 548.9 0 -0.0
replace2 1118 1119.7 1118 1121.2 0 -0.1
search 39 39.3 39 39.6 0 -0.8
slice 306 309.2 306 308.2 0 0.3
split 339 341.3 338 340 0.3 0.4
static_ascii_array_100 962 1015.7 961 1011.8 0.1 0.4
static_ascii_array_50 939 945.2 941 944.4 -0.2 0.1
static_latin1_array_100 1886 1892.7 1884 1891.7 0.1 0.1
static_latin1_array_50 946 950.5 945 956 0.1 -0.6
substr 222 223 223 224.9 -0.5 -0.9
substring 220 220.9 220 220.4 0 0.2
Dir: language/string/typed/
append_concat 84 84.8 84 85 0 -0.2
append_equal_plus 73 73.9 73 74.2 0 -0.4
append_plus_equal 73 73.4 72 73.8 1.4 -0.5
charAt 10 10 10 10 0 0
charCodeAt 10 10.1 10 10.1 0 0
indexOf 529 530.4 529 533.4 0 -0.6
lastIndexOf 199 199.9 199 199.5 0 0.2
replace 545 549 546 548.9 -0.2 0.0
replace2 1126 1127.5 1124 1125.6 0.2 0.2
search 39 39.2 39 39.1 0 0.3
slice 185 186.2 185 186.4 0 -0.1
split 364 365.2 363 364.7 0.3 0.1
substr 135 135.2 135 135.4 0 -0.1
substring 127 127.6 127 127.8 0 -0.2
Dir: misc/
boids 1873 1876.1 1873 1876.2 0 -0.0
boidshack 508 509.6 508 509 0 0.1
gameoflife 2720 2747.3 2719 2732.9 0.0 0.5
primes 4827 4837.7 4826 4838.3 0.0 -0.0
Dir: mmgc/
gcbench 2815 2924.2 2830 2913.9 -0.5 0.4
ofib-rc 271 272 272 273.5 -0.4 -0.6
ofib 1206 1228.8 1218 1225.6 -1.0 0.3
sfib 409 410.7 409 410.6 0 0.0
Dir: scimark/
FFT 2342 2345.3 2339 2344.6 0.1 0.0
LU 3020 3030.3 3018 3039.5 0.1 -0.3
MonteCarlo 2505 2511.3 2503 2510.2 0.1 0.0
SOR 2652 2656.6 2655 2660.6 -0.1 -0.2
SparseCompRow 100 101 100 100.5 0 0.5
Dir: sunspider/
access-binary-trees 35 35.2 34 35 2.9 0.6 +
access-fannkuch 83 83.4 83 84.3 0 -1.1
access-nbody 78 78.6 78 78.8 0 -0.3
access-nsieve 44 44.7 44 45.4 0 -1.6
bitops-3bit-bits-in-byte 10 10.9 11 11 -10 -0.9 --
bitops-bits-in-byte 31 31.2 31 31.4 0 -0.6
bitops-bitwise-and 189 190 189 190.6 0 -0.3
bitops-nsieve-bits 39 39.7 39 39.1 0 1.5
controlflow-recursive 17 17.5 17 17.7 0 -1.1
crypto-aes 40 40.6 40 40.7 0 -0.2
crypto-md5 19 19.1 19 19.8 0 -3.7
crypto-sha1 19 19.4 19 19.5 0 -0.5
date-format-tofte 307 316 300 312.5 2.3 1.1
math-cordic 49 49.4 49 49.3 0 0.2
math-partial-sums 167 168.1 167 168.1 0 0
math-spectral-norm 29 29.6 29 29.6 0 0
s3d-cube 64 64.5 64 64.2 0 0.5
s3d-morph 40 41.1 40 40.9 0 0.5
s3d-raytrace 77 77.9 78 78.2 -1.3 -0.4 -
string-fasta 81 81.4 81 81.2 0 0.2
string-unpack-code 190 192.5 190 191 0 0.8
string-validate-input 45 45.4 45 45.4 0 0
Dir: sunspider/as3/
access-binary-trees 9 9.3 9 9.6 0 -3.2
access-fannkuch 55 55.1 55 55.3 0 -0.4
access-nbody 6 6.6 6 6.2 0 6.1
access-nsieve 32 33 32 33 0 0
bitops-3bit-bits-in-byte 6 6.1 5 5.9 16.7 3.3 +
bitops-bits-in-byte 8 8.6 8 8.7 0 -1.2
bitops-bitwise-and 1 1.7 1 1.6 0 5.9
bitops-nsieve-bits 25 25.3 25 25.4 0 -0.4
controlflow-recursive 4 4.3 4 4.8 0 -11.6
crypto-aes 30 30.6 30 30.6 0 0
crypto-md5 23 23.4 23 23.1 0 1.3
crypto-sha1 18 18.4 18 18.2 0 1.1
date-format-tofte 294 305 300 308.5 -2.0 -1.1
math-cordic 15 15.5 15 15.4 0 0.6
math-partial-sums 59 59.5 59 59.5 0 0
math-spectral-norm 5 5.8 6 6 -20 -3.4 -
s3d-cube 20 20 20 20 0 0
s3d-morph 30 30.3 30 30.4 0 -0.3
s3d-raytrace 32 32.7 32 32.9 0 -0.6
string-fasta 37 37.8 37 38 0 -0.5
string-unpack-code 188 190 190 190.5 -1.1 -0.3 -
string-validate-input 35 38.7 35 35 0 9.6
Dir: sunspider/as3vector/
access-fannkuch 22 22.9 23 23 -4.5 -0.4 -
access-nbody 6 6.4 6 6.1 0 4.7
access-nsieve 11 11.6 11 11.5 0 0.9
bitops-nsieve-bits 7 7.1 7 7.1 0 0
math-cordic 16 16.3 16 16.2 0 0.6
math-spectral-norm 17 17.8 17 17.7 0 0.6
s3d-cube 16 16.4 16 16.5 0 -0.6
s3d-morph 17 17.7 17 17.5 0 1.1
string-fasta 41 41.1 41 41 0 0.2
string-validate-input 36 36.7 36 36.7 0 0
Dir: sunspider-0.9.1/js/
access-binary-trees 34 34.7 34 34.8 0 -0.3
access-fannkuch 81 81.4 81 81.7 0 -0.4
access-nbody 73 73.6 73 73.3 0 0.4
access-nsieve 42 43 42 42.6 0 0.9
bitops-3bit-bits-in-byte 9 9.6 9 9.7 0 -1.0
bitops-bits-in-byte 30 30.8 30 30.7 0 0.3
bitops-bitwise-and 191 191.8 191 192.6 0 -0.4
bitops-nsieve-bits 38 38.5 38 38.5 0 0
controlflow-recursive 17 17.1 17 17.2 0 -0.6
crypto-aes 34 34.8 34 34.6 0 0.6
crypto-md5 15 16 15 15.7 0 1.9
crypto-sha1 17 17.6 17 17.8 0 -1.1
math-cordic 46 46.7 47 47.1 -2.2 -0.9 -
math-partial-sums 172 172.7 172 172.7 0 0
math-spectral-norm 28 28.9 28 28.8 0 0.3
regexp-dna 1175 1192.3 1174 1200.1 0.1 -0.7
s3d-cube 57 58.2 58 58.4 -1.8 -0.3
s3d-morph 57 57.3 56 57.5 1.8 -0.3
s3d-raytrace 72 72.5 72 72.1 0 0.6
string-fasta 56 57.1 56 56.9 0 0.4
string-unpack-code 5613 5648.2 5550 5617.6 1.1 0.5
string-validate-input 42 42.6 42 42.6 0 0
Dir: sunspider-0.9.1/typed/
access-binary-trees 7 7.6 7 7.7 0 -1.3
access-fannkuch 61 62 62 62.1 -1.6 -0.2 -
access-nbody 4 4.1 4 4 0 2.4
access-nsieve 29 30 29 29.9 0 0.3
bitops-3bit-bits-in-byte 5 5.6 5 5.8 0 -3.6
bitops-bits-in-byte 7 7.8 7 7.6 0 2.6
bitops-bitwise-and 1 1.3 1 1.6 0 -23.1
bitops-nsieve-bits 24 24.3 24 24.1 0 0.8
controlflow-recursive 4 4 4 4 0 0
crypto-aes 25 26.1 25 26.5 0 -1.5
crypto-md5 4 4.7 4 4.5 0 4.3
crypto-sha1 4 4.9 4 4.5 0 8.2
math-cordic 14 14.5 14 14.3 0 1.4
math-partial-sums 10 13.1 10 11.2 0 14.5
math-spectral-norm 7 7.9 7 7.9 0 0
regexp-dna 1150 1167.6 1158 1164.4 -0.7 0.3
s3d-cube 17 17.8 17 17.5 0 1.7
s3d-morph 46 46.8 46 46.8 0 0
s3d-raytrace 8 8.9 8 9 0 -1.1
string-fasta 29 29.7 29 29.6 0 0.3
string-unpack-code 5562 5622.2 5566 5628.9 -0.1 -0.1
string-validate-input 28 29 29 29.3 -3.6 -1.0

Metric: v8 custom v8 normalized metric (hardcoded in the test)
Dir: v8/
crypto 464 460.8 464 459.8 0 -0.2
deltablue 1479 1473.2 1476 1473.8 -0.2 0.0
earley-boyer 998 993.4 999 993.1 0.1 -0.0
raytrace 2903 2897 2903 2896.8 0 -0.0
richards 959 957.9 958 956.9 -0.1 -0.1
Dir: v8/typed/
crypto 477 476 477 476.1 0 0.0
deltablue 2451 2442.7 2451 2443 0 0.0
earley-boyer 999 991.4 996 990.5 -0.3 -0.1
raytrace 6372 6348.1 6372 6354.6 0 0.1
richards 1899 1896.2 1898 1896.6 -0.1 0.0
Dir: v8.5/js/
crypto 411 406.6 411 408 0 0.3
deltablue 311 309.8 311 310.9 0 0.4
earley-boyer 1009 1002.3 1008 999.6 -0.1 -0.3
raytrace 646 643.6 645 644.1 -0.2 0.1
regexp 69.6 69.1 69.5 69.2 -0.1 0.2
richards 254 253.6 254 253.5 0 -0.0
splay 745 730.8 745 730.8 0 0
Dir: v8.5/optimized/
crypto 3494 3486.9 3496 3486.7 0.1 -0.0
deltablue 3062 2993.6 3065 3020.1 0.1 0.9
earley-boyer 1009 998.4 1009 999.2 0 0.1
raytrace 7688 7675.4 7672 7656.5 -0.2 -0.2
regexp 70 69.6 69.8 69.6 -0.3 -0.0
richards 3619 3603.2 3626 3618.2 0.2 0.4
splay 5551 5475.9 5551 5509.3 0 0.6
Dir: v8.5/typed/
crypto 2236 2232.9 2236 2227.3 0 -0.3
deltablue 3265 3255.8 3265 3255.2 0 -0.0
earley-boyer 1007 1001.1 1005 1002.6 -0.2 0.1
raytrace 7688 7670.4 7695 7674.5 0.1 0.1
regexp 70.4 69.9 70.7 69.7 0.4 -0.3
richards 3623 3587.8 3623 3596 0 0.2
splay 936 921.5 936 931.7 0 1.1
Dir: v8.5/untyped/
crypto 451 450.6 451 450.3 0 -0.1
deltablue 1526 1521.9 1526 1521.5 0 -0.0
earley-boyer 1009 1004 1008 1001.6 -0.1 -0.2
raytrace 3127 3116.5 3127 3120.7 0 0.1
regexp 69.5 69.3 69.8 69.4 0.4 0.2
richards 381 379.7 382 380.7 0.3 0.3
splay 894 870.7 894 886.9 0 1.9
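As a reading aid for the %dBst and %dAvg columns in the table above, here is a minimal sketch of how those deltas appear to be derived. The formula is inferred from the printed numbers, not read out of runtests.py, and the function name and the `lower_is_better` flag are made up for illustration. The apparent rule: the change is taken relative to the avm column, and its sign is flipped for "Metric: time" so that a positive value always means avm2 did better.

```python
def pct_delta(avm, avm2, lower_is_better):
    """Percent change from the avm column to the avm2 column.

    Positive means avm2 is better. For time-based metrics (lower is
    better) the sign is flipped. This is an inferred formula, not the
    harness's actual code.
    """
    change = (avm - avm2) if lower_is_better else (avm2 - avm)
    return round(change / avm * 100, 1)


# Rows taken from the table above as (avm value, avm2 value).
# alloc-2 is reported under "Metric: iterations/second" (higher is better).
print("alloc-2 %dBst:", pct_delta(17, 18, lower_is_better=False))
print("alloc-2 %dAvg:", pct_delta(16.7, 17, lower_is_better=False))

# simpleStringConcatenation and bitops-bitwise-and (sunspider-0.9.1/typed)
# are reported under "Metric: time" (lower is better).
print("simpleStringConcatenation %dAvg:", pct_delta(1.5, 1.9, lower_is_better=True))
print("bitops-bitwise-and %dAvg:", pct_delta(1.3, 1.6, lower_is_better=True))
```

Note that the two large negative deltas called out in the comment come from tests whose averages are only 1-2 ms, so a shift of a few tenths of a millisecond is enough to produce a swing of more than 20%.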
Attachment #486738 -
Attachment is obsolete: true
Assignee | ||
Comment 60•14 years ago
|
||
Here are results on Linux. Again, seems more bad than good, and mostly noise.

I'm going to close this bug since the gross inefficiency has been taken care of.

test/performance (hg:bug556023,inline-variant) % python runtests.py --iterations 10 --avm ../../objdir-rel32/shell/avmshell.outline.core0x2 --avm2 ../../objdir-rel32/shell/avmshell.inlined.core0x2
Tamarin tests started: 2010-11-04 20:23:01.792823
Executing 460 test(s)
avm: ../../objdir-rel32/shell/avmshell.outline.core0x2 version: cyclone
avm2: ../../objdir-rel32/shell/avmshell.inlined.core0x2 version: cyclone
iterations: 10

                avm            avm2
test            best    avg    best    avg    %dBst    %dAvg

Metric: v8 custom v8 normalized metric (hardcoded in the test)
Dir: asmicro/
alloc-1 30 29.6 30 30 0 1.4
alloc-10 10 10 10 10 0 0
alloc-11 9 9 9 9 0 0
alloc-12 5 5 5 5 0 0
alloc-13 60 60 60 59.9 0 -0.2
alloc-14 47 46.8 47 46.9 0 0.2
alloc-2 14 14 14 14 0 0
alloc-3 11 10.9 10 10 -9.1 -8.3 --
alloc-4 35 34.9 35 35 0 0.3
alloc-5 25 25 25 25 0 0
alloc-6 49 48.6 48 47.9 -2.0 -1.4 -
alloc-7 29 28.4 29 28.6 0 0.7
alloc-8 11 11 11 11 0 0
alloc-9 11 11 11 11 0 0
Metric: iterations/second
arguments-1 535 532.8 536 533.6 0.2 0.2
arguments-2 321 316.2 321 316.2 0 0
arguments-3 14 14 15 14.5 7.1 3.6 ++
arguments-4 21 20.9 21 21 0 0.5
Metric: v8 custom v8 normalized metric (hardcoded in the test)
array-1 1156 1154 1157 1153.9 0.1 -0.0
array-2 430 428 430 427.9 0 -0.0
array-pop-1 261 257.6 258 256.7 -1.1 -0.3 -
array-push-1 193 192.4 192 190.9 -0.5 -0.8
array-shift-1 103 101.3 103 102.4 0 1.1
array-slice-1 13 13 13 13 0 0
array-sort-1 20 20 20 19.9 0 -0.5
array-sort-2 1 1 1 1 0 0
array-sort-3 14 14 15 14.1 7.1 0.7 ++
array-sort-4 6 6 6 6 0 0
array-unshift-1 73 72.9 73 72.8 0 -0.1
closedvar-read-1 3270 3255.1 3270 3255.5 0 0.0
closedvar-write-1 1776 1775.6 1783 1776.5 0.4 0.1 +
closedvar-write-2 1778 1775.7 1786 1777.3 0.4 0.1 +
do-1 3727 3720.3 3725 3720.7 -0.1 0.0
for-1 2999 2998 2998 2997 -0.0 -0.0
for-2 2262 2261.3 2262 2261.5 0 0.0
for-3 2460 2451.4 2461 2459.3 0.0 0.3
for-in-1 246 245 247 245.2 0.4 0.1
for-in-2 119 118.2 119 117.2 0 -0.8
funcall-1 245 241.8 245 241.6 0 -0.1
funcall-2 164 161.8 164 160.1 0 -1.1
funcall-3 213 211.1 213 210.5 0 -0.3
funcall-4 86 85.3 86 85.9 0 0.7
globalvar-read-1 3266 3253.7 3262 3253.1 -0.1 -0.0
globalvar-write-1 1774 1773.3 1775 1774 0.1 0.0
isNaN-1 2867 2864.2 2867 2866.2 0 0.1
Metric: iterations/second
isNaN-2 2869 2868.4 2869 2868.3 0 -0.0
isNaN-3 3255 3217.5 3258 3233 0.1 0.5
Metric: v8 custom v8 normalized metric (hardcoded in the test)
lookup-array-fetch-1 558 554.3 553 549.9 -0.9 -0.8
lookup-array-in-1 1249 1243.2 1254 1247.8 0.4 0.4
lookup-negindex-array-1 341 339.8 341 341 0 0.4
lookup-negindex-array-2 297 295 298 297.2 0.3 0.7
lookup-negindex-object-1 351 350.3 351 350.6 0 0.1
lookup-negindex-object-2 335 333.9 336 335 0.3 0.3
lookup-object-fetch-1 584 574.7 577 567.6 -1.2 -1.2
lookup-object-in-1 1009 996 1014 979.1 0.5 -1.7
number-toString-1 3 3 3 3 0 0
number-toString-2 46 45.1 46 45.6 0 1.1
oop-1 2 2 2 2 0 0
parseFloat-1 45 44.9 45 44.9 0 0
parseInt-1 113 110.9 113 111.6 0 0.6
regex-exec-1 47 46 48 47.6 2.1 3.5
regex-exec-2 56 55 57 57 1.8 3.6 +
regex-exec-3 80 79.1 80 79 0 -0.1
regex-exec-4 224 222.9 226 224.4 0.9 0.7
restarg-1 535 533 535 532.9 0 -0.0
restarg-2 321 315.7 322 318.2 0.3 0.8
restarg-3 27 27 27 27 0 0
restarg-4 21 20.8 21 21 0 1.0
string-casechange-1 19 18.9 19 18.9 0 0
string-casechange-2 19 19 19 19 0 0
string-charAt-1 1000 995.7 999 992.2 -0.1 -0.4
string-charAt-2 53 53 53 52.9 0 -0.2
string-charCodeAt-1 1016 1002.4 1016 1012.9 0 1.0
string-charCodeAt-2 914 910.3 912 910.2 -0.2 -0.0
Metric: iterations/second
string-charCodeAt-3 366 363.9 366 365.2 0 0.4
string-charCodeAt-4 1308 1297.7 1311 1304.5 0.2 0.5
string-charCodeAt-5 301 296.2 296 295.5 -1.7 -0.2 -
string-charCodeAt-6 1017 1012.9 1016 1013 -0.1 0.0
string-charCodeAt-7 1311 1307.5 1308 1298.8 -0.2 -0.7
Metric: v8 custom v8 normalized metric (hardcoded in the test)
string-fromCharCode-1 226 223.7 226 223.5 0 -0.1
string-fromCharCode-2 43 42.9 43 42.8 0 -0.2
string-indexOf-1 127 126.5 127 126.5 0 0
string-indexOf-2 92 91.7 94 92.2 2.2 0.5 +
string-indexOf-3 79 76.3 79 76.9 0 0.8
string-lastIndexOf-1 380 376.8 380 373.8 0 -0.8
string-lastIndexOf-2 81 81 81 81 0 0
Metric: iterations/second
string-lastIndexOf-3 89 87.2 89 87.2 0 0
Metric: v8 custom v8 normalized metric (hardcoded in the test)
string-slice-1 87 86.7 88 87 1.1 0.3
string-split-1 6 6 6 6 0 0
string-split-2 6 6 6 6 0 0
string-substring-1 90 89.8 90 89.8 0 0
switch-1 643 643 643 643 0 0
switch-2 79 77.1 79 77.1 0 0
switch-3 148 148 148 148 0 0
try-1 164 159.6 164 159.8 0 0.1
try-2 11 11 11 11 0 0
try-3 40 40 40 39.9 0 -0.3
vector-push-1 32 32 33 33 3.1 3.1 +
while-1 2999 2997.4 2998 2997.9 -0.0 0.0

Metric: time
Dir: jsbench/
Crypt 6039 6053.5 6034 6052.1 0.1 0.0
Euler 11759 11804.4 11755 11813.2 0.0 -0.1
FFT 11402 11491.2 11391 11438.7 0.1 0.5
HeapSort 4765 4784.3 4773 4786.9 -0.2 -0.1
LUFact 8016 8033 8008 8058.8 0.1 -0.3
Moldyn 16600 16680 16873 17075.8 -1.6 -2.4
RayTracer 10485 10529.2 10525 10545.7 -0.4 -0.2
SOR 46923 47015.6 46924 47014.6 -0.0 0.0
Series 13629 13682.3 13639 13695 -0.1 -0.1
SparseMatmult 15708 15859.1 15694 15944.5 0.1 -0.5
Dir: jsbench/typed/
Crypt 1387 1389.8 1387 1391 0 -0.1
Euler 13190 13238.6 13249 13296.9 -0.4 -0.4
FFT 3772 3792.6 3757 3779.8 0.4 0.3
HeapSort 1975 1982 1975 1981 0 0.1
LUFact 3528 3546 3527 3540.3 0.0 0.2
Moldyn 5823 5887.4 5837 5864.5 -0.2 0.4
RayTracer 1780 1786 1782 1790.9 -0.1 -0.3
SOR 8918 8946.9 8920 8972.2 -0.0 -0.3
Series 11955 12025.3 11910 12036.7 0.4 -0.1
SparseMatmult 3621 3718.1 3631 3708.2 -0.3 0.3

Metric: v8 custom v8 normalized metric (hardcoded in the test)
Dir: jsmicro/
alloc-1 28 27.8 28 28 0 0.7
alloc-10 10 9.9 10 10 0 1.0
alloc-11 8 8 8 8 0 0
alloc-12 5 5 5 5 0 0
alloc-13 51 50.9 51 51 0 0.2
alloc-14 42 41.1 41 40.9 -2.4 -0.5 -
alloc-2 14 14 14 14 0 0
alloc-3 10 10 10 10 0 0
alloc-4 33 33 33 33 0 0
alloc-5 24 23.7 24 23.6 0 -0.4
alloc-6 42 42 41 41 -2.4 -2.4 -
alloc-7 28 28 28 27.8 0 -0.7
alloc-8 11 11 11 11 0 0
alloc-9 11 10.9 11 10.9 0 0
Metric: iterations/second
arguments-1 100 99.3 100 99.5 0 0.2
arguments-2 73 71.9 73 71.7 0 -0.3
arguments-3 13 13 14 13.8 7.7 6.2 ++
Metric: v8 custom v8 normalized metric (hardcoded in the test)
array-1 221 215.5 221 215.4 0 -0.0
array-2 180 178.1 180 177.2 0 -0.5
array-pop-1 44 43.9 46 45.3 4.5 3.2 +
array-push-1 32 31.9 33 32.1 3.1 0.6
array-shift-1 40 39.6 41 40.8 2.5 3.0
array-slice-1 11 11 11 11 0 0
array-sort-1 19 18.7 18 18 -5.3 -3.7 --
array-sort-2 1 1 1 1 0 0
array-sort-3 14 14 14 14 0 0
array-sort-4 7 7 7 7 0 0
array-unshift-1 16 15.4 16 15.6 0 1.3
closedvar-read-1 329 318.9 330 318.3 0.3 -0.2
closedvar-write-1 259 259 259 259 0 0
closedvar-write-2 259 259 259 258 0 -0.4
do-1 346 345.2 346 345.2 0 0
for-1 321 320.8 321 320.7 0 -0.0
for-2 97 96.9 97 96 0 -0.9
for-3 69 69 70 69 1.4 0 +
for-in-1 171 167.7 171 167.7 0 0
for-in-2 95 93.8 94 93.8 -1.1 0
funcall-1 144 140.4 142 139.4 -1.4 -0.7
funcall-2 140 138.2 140 137 0 -0.9
funcall-3 135 131.8 135 131.8 0 0
funcall-4 1005 988.8 1005 984.9 0 -0.4
globalvar-read-1 329 318.4 331 317.8 0.6 -0.2
globalvar-write-1 261 260.8 261 260.1 0 -0.3
isNaN-1 297 296.3 297 296.3 0 0
lookup-array-fetch-1 489 486.9 485 480 -0.8 -1.4
lookup-array-in-1 808 791.8 795 788.3 -1.6 -0.4
lookup-object-fetch-1 496 492 500 494.7 0.8 0.5
lookup-object-in-1 694 682.9 701 691.7 1.0 1.3
number-toString-1 3 3 3 3 0 0
number-toString-2 38 38 38 38 0 0
oop-1 2 2 2 2 0 0
parseFloat-1 33 32.3 33 32.4 0 0.3
parseInt-1 73 71.5 72 71.2 -1.4 -0.4
regex-exec-1 40 39.7 41 40.7 2.5 2.5 +
regex-exec-2 50 49.5 51 50 2 1.0
regex-exec-3 71 70.9 70 69.7 -1.4 -1.7 -
regex-exec-4 184 182 182 180.6 -1.1 -0.8
string-casechange-1 13 12.8 13 13 0 1.6
string-casechange-2 13 13 13 13 0 0
string-charAt-1 64 62.9 64 62.2 0 -1.1
string-charAt-2 29 28.9 29 29 0 0.3
string-charCodeAt-1 56 56 57 56.2 1.8 0.4 +
string-charCodeAt-2 57 56.6 57 56.3 0 -0.5
string-fromCharCode-1 55 54.7 55 54.6 0 -0.2
string-fromCharCode-2 27 26.7 27 27 0 1.1
string-fromCharCode-3 47 46.7 46 46 -2.1 -1.5 -
string-fromCharCode-4 48 47.9 48 48 0 0.2
string-indexOf-1 55 54.6 55 54.6 0 0
string-indexOf-2 38 37.9 38 38 0 0.3
string-indexOf-3 36 35.1 36 35.1 0 0
string-lastIndexOf-1 59 58.1 59 58.2 0 0.2
string-lastIndexOf-2 39 38.9 39 38.6 0 -0.8
string-lastIndexOf-3 53 52.3 53 52.1 0 -0.4
string-slice-1 31 30.1 31 30.7 0 2.0
string-split-1 5 5 5 5 0 0
string-split-2 5 5 5 5 0 0
string-substring-1 32 31.1 32 31.1 0 0
switch-1 78 77.6 78 77.9 0 0.4
switch-2 40 39.8 40 40 0 0.5
switch-3 53 52.5 53 52.6 0 0.2
try-1 104 102.1 104 102.5 0 0.4
try-2 11 11 11 11 0 0
try-3 31 31 31 30.9 0 -0.3
while-1 321 320.7 321 320.8 0 0.0

Metric: time
Dir: language/describetype/
desctypeperf 744 749.3 750 754.8 -0.8 -0.7
Dir: language/e4x/
addingToXMLList 26 26.9 26 26.6 0 1.1
appendChildAndString 71 71.9 71 71.8 0 0.1
concatenatingStringsFromE4X 9 9.8 9 9.7 0 1.0
simpleStringConcatenation 2 2.8 2 2.9 0 -3.6
usingAppendChildAndE4X 72 73 73 73.4 -1.4 -0.5
Dir: language/string/
append_concat 121 122 121 121.3 0 0.6
append_equal_plus 104 105 104 104.9 0 0.1
append_plus_equal 104 104.8 104 104.7 0 0.1
charAt 301 304.9 301 305.3 0 -0.1
charCodeAt 374 375.8 374 375.8 0 0
indexOf 377 377.7 377 377.8 0 -0.0
lastIndexOf 342 342.6 342 342.6 0 0
replace 551 552.1 550 551.6 0.2 0.1
replace2 1247 1249.4 1255 1255.7 -0.6 -0.5 -
search 46 47.7 47 47.4 -2.2 0.6
slice 480 519.3 480 484.8 0 6.6
split 461 461.5 460 462.4 0.2 -0.2
static_ascii_array_100 1270 1273.3 1268 1272.1 0.2 0.1
static_ascii_array_50 1241 1244.4 1240 1244.4 0.1 0
static_latin1_array_100 2492 2499.8 2496 2503.2 -0.2 -0.1
static_latin1_array_50 1246 1248.7 1246 1249.1 0 -0.0
substr 342 344.5 342 342.8 0 0.5
substring 335 339.9 335 339.2 0 0.2
Dir: language/string/typed/
append_concat 109 109.7 109 110 0 -0.3
append_equal_plus 95 96 95 95.6 0 0.4
append_plus_equal 95 95.9 95 95.7 0 0.2
charAt 12 12.1 12 12.2 0 -0.8
charCodeAt 12 12.4 12 12.5 0 -0.8
indexOf 377 377.6 377 377.8 0 -0.1
lastIndexOf 342 342.2 342 342.2 0 0
replace 549 551 550 551.4 -0.2 -0.1
replace2 1249 1251.7 1256 1260.1 -0.6 -0.7
search 46 47 47 47.1 -2.2 -0.2 -
slice 229 230.8 230 232.2 -0.4 -0.6
split 459 462.3 459 464.7 0 -0.5
substr 168 169.1 168 168.4 0 0.4
substring 160 161.2 160 160.3 0 0.6
Dir: misc/
boids 2598 2623.7 2598 2617.4 0 0.2
boidshack 645 648.9 645 647.4 0 0.2
gameoflife 3824 3844.4 3825 3836.9 -0.0 0.2
primes 6721 6723.3 6720 6721.2 0.0 0.0
Dir: mmgc/
gcbench 3633 3653.2 3637 3648.2 -0.1 0.1
ofib-rc 359 360.9 359 359.6 0 0.4
ofib 1493 1498.5 1489 1504.1 0.3 -0.4
sfib 543 543.9 552 553.3 -1.7 -1.7 -
Dir: scimark/
FFT 3702 3713.3 3706 3716.1 -0.1 -0.1
LU 4381 4414.8 4381 4403.9 0 0.2
MonteCarlo 3708 3745.2 3733 3742.1 -0.7 0.1 -
SOR 4137 4179.4 4135 4159.6 0.0 0.5
SparseCompRow 141 143.3 141 142 0 0.9
Dir: sunspider/
access-binary-trees 45 45.8 45 45.7 0 0.2
access-fannkuch 122 123.2 123 124.2 -0.8 -0.8
access-nbody 111 112.8 111 112 0 0.7
access-nsieve 57 58.3 58 58 -1.8 0.5 -
bitops-3bit-bits-in-byte 15 15 14 14.9 6.7 0.7 ++
bitops-bits-in-byte 40 40.2 40 40.1 0 0.2
bitops-bitwise-and 242 245.4 244 246.1 -0.8 -0.3
bitops-nsieve-bits 56 56 56 56.1 0 -0.2
controlflow-recursive 23 23 23 23 0 0
crypto-aes 54 54.7 54 54.8 0 -0.2
crypto-md5 27 27.9 27 27.9 0 0
crypto-sha1 28 28.2 28 28.4 0 -0.7
date-format-tofte 159 160.1 159 159.6 0 0.3
math-cordic 88 89.6 88 88.3 0 1.5
math-partial-sums 215 216.9 217 218.2 -0.9 -0.6
math-spectral-norm 38 39 38 39.1 0 -0.3
s3d-cube 105 105.4 104 105 1.0 0.4
s3d-morph 63 63.8 63 63.9 0 -0.2
s3d-raytrace 105 105.6 105 105.3 0 0.3
string-fasta 110 110.9 110 110.7 0 0.2
string-unpack-code 232 233.7 231 232.2 0.4 0.6
string-validate-input 54 54.9 54 54.4 0 0.9
Dir: sunspider/as3/
access-binary-trees 12 12.8 12 12.8 0 0
access-fannkuch 73 74.3 73 74.1 0 0.3
access-nbody 9 9 9 9 0 0
access-nsieve 40 40.7 40 40.3 0 1.0
bitops-3bit-bits-in-byte 7 7.5 7 7.5 0 0
bitops-bits-in-byte 10 10.3 10 10.5 0 -1.9
bitops-bitwise-and 2 2 1 2 50 0 ++
bitops-nsieve-bits 37 37 37 37 0 0
controlflow-recursive 5 5.2 5 5.2 0 0
crypto-aes 39 39.1 38 38.9 2.6 0.5 +
crypto-md5 34 34.8 34 34.8 0 0
crypto-sha1 25 25.2 25 25.1 0 0.4
date-format-tofte 145 145.9 145 146.4 0 -0.3
math-cordic 28 28.3 28 28.2 0 0.4
math-partial-sums 81 82.4 81 82.1 0 0.4
math-spectral-norm 8 8.6 8 8.8 0 -2.3
s3d-cube 25 25.7 25 25.7 0 0
s3d-morph 43 43.1 42 43 2.3 0.2 +
s3d-raytrace 45 45.1 45 45.1 0 0
string-fasta 56 56.7 57 57.1 -1.8 -0.7 -
string-unpack-code 229 231.2 230 231.4 -0.4 -0.1
string-validate-input 41 41.8 41 41.3 0 1.2
Dir: sunspider/as3vector/
access-fannkuch 35 35.6 35 35.5 0 0.3
access-nbody 9 9 9 9 0 0
access-nsieve 15 15.7 16 16 -6.7 -1.9 --
bitops-nsieve-bits 10 10 9 9.4 10 6.0 +
math-cordic 25 25.8 25 26.1 0 -1.2
math-spectral-norm 22 23 22 22.8 0 0.9
s3d-cube 20 20.6 20 20.4 0 1.0
s3d-morph 25 25.4 25 25.5 0 -0.4
string-fasta 61 62.1 62 62 -1.6 0.2 -
string-validate-input 42 42.9 42 42.5 0 0.9
Dir: sunspider-0.9.1/js/
access-binary-trees 45 45.5 45 45.2 0 0.7
access-fannkuch 122 123.3 122 123.4 0 -0.1
access-nbody 106 106.4 105 106.4 0.9 0
access-nsieve 57 57.1 57 57.7 0 -1.1
bitops-3bit-bits-in-byte 13 13.3 13 13.5 0 -1.5
bitops-bits-in-byte 40 40 39 39.7 2.5 0.7 +
bitops-bitwise-and 242 244.7 242 243.5 0 0.5
bitops-nsieve-bits 56 56.4 55 55.7 1.8 1.2
controlflow-recursive 22 22.6 22 22.7 0 -0.4
crypto-aes 48 48 47 47.8 2.1 0.4 +
crypto-md5 24 24 24 24 0 0
crypto-sha1 27 27.1 27 27.4 0 -1.1
math-cordic 86 86.7 86 86.6 0 0.1
math-partial-sums 222 223.5 221 222.6 0.5 0.4
math-spectral-norm 39 39.5 39 39.6 0 -0.3
regexp-dna 1118 1126.6 1123 1125.9 -0.4 0.1
s3d-cube 97 98.2 97 97.8 0 0.4
s3d-morph 87 88.4 88 88.8 -1.1 -0.5
s3d-raytrace 98 99.2 98 99.1 0 0.1
string-fasta 80 81.3 80 80.3 0 1.2
string-unpack-code 5287 5308.5 5297 5314.3 -0.2 -0.1
string-validate-input 52 52.9 52 52.8 0 0.2
Dir: sunspider-0.9.1/typed/
access-binary-trees 10 10.3 10 10.5 0 -1.9
access-fannkuch 81 81.1 80 80.9 1.2 0.2 +
access-nbody 6 6.7 6 6.7 0 0
access-nsieve 38 38.9 38 38.9 0 0
bitops-3bit-bits-in-byte 6 6.3 6 6.4 0 -1.6
bitops-bits-in-byte 9 9.3 9 9.2 0 1.1
bitops-bitwise-and 2 2.1 1 1.8 50 14.3 +
bitops-nsieve-bits 36 36.7 36 36.7 0 0
controlflow-recursive 4 4.6 4 4.9 0 -6.5
crypto-aes 33 33.4 33 33.2 0 0.6
crypto-md5 6 6.7 6 6.9 0 -3.0
crypto-sha1 7 7.3 7 7.4 0 -1.4
math-cordic 27 27 27 27.1 0 -0.4
math-partial-sums 21 21.5 21 21.4 0 0.5
math-spectral-norm 10 10.8 10 10.6 0 1.9
regexp-dna 1090 1093.9 1089 1092.8 0.1 0.1
s3d-cube 22 22.2 22 22.1 0 0.5
s3d-morph 68 68.7 68 68.9 0 -0.3
s3d-raytrace 12 12.5 12 12.7 0 -1.6
string-fasta 45 45.5 45 45.9 0 -0.9
string-unpack-code 5295 5313.7 5290 5322.3 0.1 -0.2
string-validate-input 35 35.5 35 35.3 0 0.6

Metric: v8 custom v8 normalized metric (hardcoded in the test)
Dir: v8/
crypto 270 268.5 269 268.3 -0.4 -0.1
deltablue 1202 1196.9 1208 1198.8 0.5 0.2
earley-boyer 769 762.1 768 764.3 -0.1 0.3
raytrace 2040 2034.2 2053 2046.1 0.6 0.6 +
richards 813 811.1 809 800.1 -0.5 -1.4
Dir: v8/typed/
crypto 278 277.3 278 277.6 0 0.1
deltablue 1939 1925.5 1932 1925.4 -0.4 -0.0
earley-boyer 769 766.4 765 763 -0.5 -0.4
raytrace 4733 4716.2 4733 4714.3 0 -0.0
richards 1497 1494.4 1499 1493.5 0.1 -0.1
Dir: v8.5/js/
crypto 244 243.9 245 244.3 0.4 0.2 +
deltablue 249 248.4 250 248.3 0.4 -0.0
earley-boyer 785 781.7 783 780.4 -0.3 -0.2
raytrace 510 505.8 506 504.7 -0.8 -0.2
regexp 66 65.7 66 65.8 0 0.1
richards 201 200.1 201 199.8 0 -0.1
splay 493 488.9 496 490.1 0.6 0.2
Dir: v8.5/optimized/
crypto 2803 2791.8 2802 2790.9 -0.0 -0.0
deltablue 2328 2313.6 2333 2320.3 0.2 0.3
earley-boyer 780 773.8 777 773.8 -0.4 0
raytrace 5881 5861.4 5881 5836.4 0 -0.4
regexp 66.3 66 66.5 66.2 0.3 0.2
richards 2878 2870.8 2881 2868.2 0.1 -0.1
splay 4441 4370.4 4479 4412.9 0.9 1.0
Dir: v8.5/typed/
crypto 1297 1291.9 1297 1292.7 0 0.1
deltablue 2491 2475.3 2494 2476.8 0.1 0.1
earley-boyer 780 775.9 777 774.7 -0.4 -0.2
raytrace 5870 5845.7 5893 5869.3 0.4 0.4 regexp 66.6 66.1 66.5 66.2 -0.2 0.2 richards 2878 2866.5 2878 2870.5 0 0.1 splay 595 590.2 596 589.7 0.2 -0.1 Dir: v8.5/untyped/ crypto 262 261.1 262 261.2 0 0.0 deltablue 1235 1224.1 1239 1232.4 0.3 0.7 earley-boyer 779 774.1 774 772 -0.6 -0.3 - raytrace 2279 2271.4 2286 2278.7 0.3 0.3 regexp 65.9 65.6 66 65.9 0.2 0.5 richards 321 318.8 320 317.1 -0.3 -0.5 splay 561 559.1 565 554.6 0.7 -0.8
Assignee
Comment 61 • 14 years ago
Opened a fresh ticket, bug 609827, for someone else to take.
Status: ASSIGNED → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED
Summary: operator[] on integer keys in 2^28 .. 2^30 range slower than on string keys → operator[] on integer keys in 2^28 .. 2^30 range grossly slower than on string keys
Updated • 13 years ago
Flags: flashplayer-bug+