Closed
Bug 691314
Opened 13 years ago
Closed 8 years ago
Regression in Peacekeeper 2.0 between Fx7 and Fx10
Categories
(Core Graveyard :: Tracking, defect)
Tracking
(firefox9+)
RESOLVED
INCOMPLETE
Tracking | Status | |
---|---|---|
firefox9 | + | --- |
People
(Reporter: zbraniecki, Unassigned)
References
Details
(Keywords: regression)
Spin off of bug 499198 comment 20. In the recently updated Peacekeeper, Fx10 is ~20% slower in most tests than Fx7. https://docs.google.com/spreadsheet/ccc?key=0Ah9TBa-qpKojdHBDNmFDNUduOHFnLUk3XzRvX0swaFE&hl=en_US#gid=0
Comment 1•13 years ago
|
||
Can we get a regression range here? I just tried turning off TI on tip and that does NOT get us back to the Fx7 numbers, for what it's worth... Then again, I'm not sure whether that actually caused the tracer to run or not. :(
Comment 2•13 years ago
|
||
Also, can we get numbers on beta and Aurora so we have an idea of which branches we need to track this on?
Keywords: regression
Comment 3•13 years ago
|
||
(In reply to Boris Zbarsky (:bz) from comment #1) > Then again, I'm not sure whether that actually caused the tracer to run or not. > :( It should, the tracer still works (and is used in, e.g., mobile chrome).
Comment 4•13 years ago
|
||
That's strange. I just ran stringFilter (http://peacekeeper.futuremark.com/run.action?repeat=1&forceSuiteName=string&forceTestName=stringstringFilter) and FF was close to Chrome (124000 vs. 140000). Running the test in the shell, I find us slightly faster than v8. So I'm not sure how we are so pathetically worse in comment 0.
I ran this on Aurora 9.0a2 (2011-10-03) and I'm seeing a slow down. My numbers are: render renderGrid01 57.81 56.91 render renderGrid02 63.36 59.69 render renderGrid03 4.4 4.75 render renderPhysics 21.6 24.68 experimental experimentalRipple01 11.58 16.67 experimental experimentalRipple02 4.44 6.38 array arrayCombined 12540.5 29007 array arrayWeighted 38008.5 46855 dom domGetElements 243153.5 361713.5 dom domDynamicCreationCreateElement 10846 11597 dom domDynamicCreationInnerHTML 16898.5 16139 dom domJQueryAttributeFilters 1956 2370 dom domJQueryBasicFilters 631 813 dom domJQueryBasics 1309.5 1617.5 dom domJQueryContentFilters 815 1103 dom domJQueryHierarchy 2423.5 2982.5 dom domQueryselector 11990 12864.5 string stringChat 28339 38985 string stringDetectBrowser 108291 143724.5 string stringFilter 2206.5 2159 The first number is from Aurora and the second number is from release. This is on Windows XP.
Comment 6•13 years ago
|
||
What about beta?
Here's with beta. The first column is Aurora, the second column is Beta, and the third is release: render renderGrid01 57.81 57.18 56.91 render renderGrid02 63.36 56.84 59.69 render renderGrid03 4.4 4.73 4.75 render renderPhysics 21.6 24.94 24.68 experimental experimentalRipple01 11.58 16.6 16.67 experimental experimentalRipple02 4.44 6.28 6.38 array arrayCombined 12540.5 27723.5 29007 array arrayWeighted 38008.5 50005 46855 dom domGetElements 243153.5 359478 361713.5 dom domDynamicCreationCreateElement 10846 12005 11597 dom domDynamicCreationInnerHTML 16898.5 15626.5 16139 dom domJQueryAttributeFilters 1956 2365.5 2370 dom domJQueryBasicFilters 631 840 813 dom domJQueryBasics 1309.5 1613.5 1617.5 dom domJQueryContentFilters 815 1083.5 1103 dom domJQueryHierarchy 2423.5 2941.5 2982.5 dom domQueryselector 11990 12800.5 12864.5 string stringChat 28339 35114.5 38985 string stringDetectBrowser 108291 148330 143724.5 string stringFilter 2206.5 2120 2159 A few tests have a small regression in beta, but the biggest regression is from beta to aurora.
(In reply to Luke Wagner [:luke] from comment #4) > That's strange. I just ran stringFilter > (http://peacekeeper.futuremark.com/run. > action?repeat=1&forceSuiteName=string&forceTestName=stringstringFilter) and > FF was close to Chrome (124000 vs. 140000). Running the test in the shell, > I find us slightly faster than v8. So I'm not sure how we are so > pathetically worse in comment 0. stringFilter is one of the few tests that didn't regress.
Comment 9•13 years ago
|
||
(In reply to developer@ckiweb.com from comment #8) Yeah, I know it's technically off-topic but 13x slower is huge (esp. when we beat them when the test is run in isolation (on my MBP just now); something weird is happening). Boris do you know any way to snapshot the test so I can edit it to stop/start shark during stringFilter?
Comment 10•13 years ago
|
||
Mats used to have an hg repo of individual tests for the old peacekeeper with shark hooks. Not sure how hard it would be to update that to the new tests... I hate benchmarks that hide the tests. Requesting tracking for 9 based on comment 7.
tracking-firefox9:
--- → ?
Comment 11•13 years ago
|
||
One more testing request (from comment 1): javascript.options.typeinference=false. (This requires a full browser restart for the change to have effect.)
Comment 12•13 years ago
|
||
I tried to figure out which nightly had the regression. It looks like it is between 8/29 and 8/30 for the Windows XP nightly. For two tests (stringDetectBrowser and domJQueryBasicFilters), I have the following numbers: 08/29/11 08/30/11 stringDetectBrowser 808 671 domJQueryBasicFilters 135000 118195
Comment 13•13 years ago
|
||
Here's the numbers with typeinference=false and after a full browser restart. render renderGrid01 57.73 render renderGrid02 63.15 render renderGrid03 4.64 render renderPhysics 24.03 experimental experimentalRipple01 13.09 experimental experimentalRipple02 4.94 array arrayCombined 20707 array arrayWeighted 44688 dom domGetElements 282188.5 dom domDynamicCreationCreateElement 11415 dom domDynamicCreationInnerHTML 16847 dom domJQueryAttributeFilters 2099 dom domJQueryBasicFilters 738 dom domJQueryBasics 1448.5 dom domJQueryContentFilters 917.54 dom domJQueryHierarchy 2677 dom domQueryselector 12371 string stringChat 33738 string stringDetectBrowser 120291.5 string stringFilter 2047.5
Comment 14•13 years ago
|
||
(In reply to Luke Wagner [:luke] from comment #9) I have a copy of the tests over at peacekeeper.therichins.net. stringFilter is test26.html. Also, bug 503107 deals with the stringFilter test. Its attachment (https://bugzilla.mozilla.org/attachment.cgi?id=387471) looks to be the test in a single javascript file.
Comment 15•13 years ago
|
||
(In reply to developer@ckiweb.com from comment #14) Thanks! Using that file I can repro the massive discrepancy and it looks like we are just missing a simple optimization: bug 691797. It seems like the link in comment 4 was just running totally different js.
Updated•13 years ago
|
Keywords: regressionwindow-wanted
Updated•13 years ago
|
Keywords: regressionwindow-wanted
Comment 16•13 years ago
|
||
(In reply to developer@ckiweb.com from comment #13) > Here's the numbers with typeinference=false and after a full browser restart. Thanks much for running these tests. Collecting, filtering, and rounding a bit, I see: Fx9/noTI Fx9 Fx8 Fx7 array arrayCombined 20700 12540 27723 29000 array arrayWeighted 44700 38000 50000 47000 dom domGetElements 282200 243150 360000 360000 string stringChat 34000 28000 35000 39000 string stringDetect 120000 108000 148000 143000 * It would help to know a typical variance here. E.g., is the stringChat regression from 7 to 8 real, or noise? - All of the above have big regressions from 8 to 9 (I picked them that way). - All of the regressions are mitigated by turning off TI. Only the stringChat regression is eliminated entirely by turning off TI. My first (rather obvious) questions are: 1. How does turning TI off help here? I.e., exactly what's going wrong with the TI optimizations, assuming that is the problem. 2. What's the remaining regression after TI is disabled? It could be from the type objects and other adaptations that enable TI. But it could also be from other sources. It would be good to get that aspect bisected or otherwise diagnosed.
Comment 17•13 years ago
|
||
> * It would help to know a typical variance here. E.g., is the stringChat
> regression from 7 to 8 real, or noise?
For stringChat, the lowest that I've seen in 7 is about 3700 but all the rest are above 3800 (I did about 10 refreshes). In 8, the highest I've seen is a little over 3500 (about 10 refreshes). So, I would think that 7 to 8 was a regression.
Comment 18•13 years ago
|
||
> 1. How does turning TI off help here? Presumably by turning TM back on, right? Some of the deps of bug 499198 were actually analyzed and fixed in TM at one point....
Comment 19•13 years ago
|
||
I haven't confirmed with profiling, but looked at the array/string tests and these are simple benchmarks which pound on array, string and regexp natives (not jitcode) over and over. The array natives are the main concern: these got slower with the TI landing because the VM does the simple, slow, dumb thing of updating type information whenever it modifies an array. e.g. a.concat(b) will cause the VM to explicitly update types in the result for every element in 'a' and 'b' (type updates in the VM are super slow; one of the main fixes for dromaeo DOM perf was moving these into jitcode for monitoring the results of property access stubs). Plan now is to do specialized natives for these. For the natives Dromaeo uses (concat/slice/splice/reverse/sort), this will be faster and easier to do than self hosted (main benefit for self hosted is things like forEach I think). The compiler knows the contents of the arrays and whether they are packed, so can update the types of modified arrays ahead of time (we already do this when inlining Array.push) and can call a native which knows its input array is packed. A lot of the array natives get simpler when the inputs are known to be packed, e.g. concat/slice/splice can be implemented as a memcpy. arrayCombine repeatedly sorts a 500 element packed array of integers. Every time it does this, we make a shadow array with every single element converted to a string, and sort the array using string comparison. We should be able to leverage knowledge about the array's contents to avoid this, using a comparator that sorts integer arrays lexicographically. stringChat calls String.replace over and over. I don't know why this would be affected by the TI merge. stringDetectBrowser does /regexp/.test(string) over and over. I noticed with the dromaeo DOM stuff that /regexp/ cloning got slower with the TI merge, haven't looked into why. The cloning can be avoided entirely for this pattern, though, since the compiler knows a regexp native is being called which won't let its 'this' value escape. Just bake in the parser-created regexp. The microbenchmarky allocate-things-in-a-loop nature of many of these tests means we can't catch Chrome without a generational GC.
Comment 20•13 years ago
|
||
Edit: 'For the natives Dromaeo uses ...' => 'For the natives Peacekeeper uses'
Comment 21•13 years ago
|
||
(In reply to Brian Hackett from comment #19) > The microbenchmarky allocate-things-in-a-loop nature of many of these tests > means we can't catch Chrome without a generational GC. I used to strongly hold this opinion, but it is becoming less clear. Try timing: var o; for (var i = 0; i < 1000000; ++i) o = {x:i}; js -m -n 32-bit is faster than v8. Ok, v8 does poorly in global code, so put it in a closure. Then we are at parity. Ok, this is unrealistic since it only stores, so add an access to o.x; SM is faster. I know TI is winning here and covering for losses due to additional cache misses due to lack of generational GC, but I found this result surprising as it seems to deemphasize generational GC's importance.
Comment 22•13 years ago
|
||
One other note. This was mentioned at http://forums.mozillazine.org/viewtopic.php?p=11198075#p11198075 about a month before this bug got filed. Too bad no bug ever got reported for that. :(
Updated•13 years ago
|
Comment 24•13 years ago
|
||
I am unable to complete Peacekeeper 2.0 benchmark on Nightly. It gives me error on String test.
Comment 25•13 years ago
|
||
Ahmad, can you please file a separate bug on that? When did the problem appear?
Comment 26•13 years ago
|
||
^OK, I am not very much expert in filing bug, if any stuff here and other occur please fix them please. I think it occur when Brian landed some performance patches for making Data benchmark faster.
Comment 27•13 years ago
|
||
(In reply to Ahmad Saleem (zlip) from comment #24) > I am unable to complete Peacekeeper 2.0 benchmark on Nightly. It gives me > error on String test. It WFM in the 11/1 nightly on Windows.
Comment 28•8 years ago
|
||
Marking all tracking bugs which haven't been updated since 2014 as INCOMPLETE. If this bug is still relevant, please reopen it and move it into a bugzilla component related to the work being tracked. The Core: Tracking component will no longer be used.
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → INCOMPLETE
Assignee | ||
Updated•8 years ago
|
Product: Core → Core Graveyard
You need to log in
before you can comment on or make changes to this bug.
Description
•