Closed
Bug 643615
Opened 15 years ago
Closed 12 years ago
Analyze v8-deltablue
Categories
(Core :: JavaScript Engine, defect)
RESOLVED
INCOMPLETE
People
(Reporter: dmandelin, Assigned: dmandelin)
Attachments
(4 files)
On 32-bit Linux, v8/nocs is 1.7x faster.
Comment 1 (Assignee) • 15 years ago
Initial observations:
- Most of the time is in the methodjit and unknown stubs. One of them is clearly for |new Array|, which I'm sure we need to speed up. Others look like Array.push and possibly some IC stubs, which might not be a big deal.
- Otherwise, the time is distributed through different JS functions, but 60% of the time is in the top 10 functions.
- 3% of time is in GC.
Looks like the next thing to do is go through the top functions individually.
Comment 2 (Assignee) • 15 years ago
Top JS function:
frac      cum       script
0.109870  0.109870  deltablue.js 776
0.097054  0.883     RUN_MJITCODE 776
0.007924  0.072     RUN_MJITCODE 778
0.004839  0.044     RUN_MJITCODE 777
0.000053  0.000     STUB_UNKNOWN 776
Plan.prototype.execute = function () {
  for (var i = 0; i < this.size(); i++) {
    var c = this.constraintAt(i);
    c.execute();
  }
}
The profiler points at |this.size()|, which calls through here:
Plan.prototype.size = function () {
  // v is an |OrderedCollection|
  return this.v.size();
}
OrderedCollection.prototype.size = function() {
  // elms is from |new Array|
  return this.elms.length;
}
On the whole test, we take about 1700 ms. According to the profiler, we spend 11% of our time in this hot line of code, or 187 ms. We run that line 12M times, so that's 15.6 ns/iteration.
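The per-iteration figure follows directly from the numbers above. As a quick check of the arithmetic (the variable names are only for illustration):

```javascript
// Figures quoted in this comment.
var totalMs = 1700;      // whole-test run time
var hotFraction = 0.11;  // share of time the profiler attributes to the hot line
var runs = 12e6;         // times the hot line executes

var hotMs = totalMs * hotFraction;     // about 187 ms
var nsPerIter = hotMs * 1e6 / runs;    // ms -> ns, divided by the run count

console.log(Math.round(hotMs) + " ms, " + nsPerIter.toFixed(1) + " ns/iteration");
```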
In a reduced microbenchmark that I used to investigate the slowdown, we run 10M iterations in 163 ms, or 16.3 ns/iteration. So the reduced version seems like a valid model. The reduced version just has an empty loop body in |execute|, thus demonstrating that |this.size()| does in fact take almost all the time in this function.
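A reduced microbenchmark along these lines might look like the sketch below. This is not the attachment on this bug: the two constructors are cut down from the deltablue.js snippets quoted above, and |runReduced|, the element count, and the repetition count are hypothetical choices made only to reach roughly 10M iterations.

```javascript
// Cut-down versions of the deltablue.js classes (assumed shapes, not the full source).
function OrderedCollection() {
  this.elms = new Array();
}
OrderedCollection.prototype.add = function (elm) {
  this.elms.push(elm);
};
OrderedCollection.prototype.size = function () {
  return this.elms.length;
};

function Plan() {
  this.v = new OrderedCollection();
}
Plan.prototype.size = function () {
  return this.v.size();
};
// Same loop header as the hot function, but with an empty body, so that
// |this.size()| is essentially the only work done per iteration.
Plan.prototype.execute = function () {
  for (var i = 0; i < this.size(); i++) {
  }
  return i; // number of loop iterations executed
};

// Hypothetical driver: elements * reps is the total iteration count.
function runReduced(elements, reps) {
  var plan = new Plan();
  for (var k = 0; k < elements; k++) plan.v.add(k);
  var total = 0;
  var start = Date.now();
  for (var r = 0; r < reps; r++) total += plan.execute();
  return { iterations: total, ms: Date.now() - start };
}

var result = runReduced(1000, 10000); // 10M iterations
console.log(result.iterations + " iterations in " + result.ms + " ms");
```

Dividing the reported time by the iteration count gives a ns/iteration figure that can be compared across engines.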
By comparing the reduced version to v8/nocs, I found that our slowdown (50 ms on the reduced version: 163 ms vs 113 ms) comes from these sources:
1. Inlining. If I inline |this.size()| all the way down, we gain 20 ms. So that's 40% of the difference here.
2. |this|. We seem to lose 6 ms from computing |this|.
3. |.v.elms|. We lose 4 ms here. That's pretty small. I think it's probably due to our object layouts and their worse cache behavior.
4. |.length|. We lose 20 ms here. It seems we are running a slow path for array length, or else our IC stub is way slower.
This explains about 10% of our slowdown so far.
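The four items above add up to the 50 ms gap (20 + 6 + 4 + 20). To make item 1 concrete, here is a sketch of what inlining |this.size()| "all the way down" at the source level looks like. The |plan| object and the two method names are hypothetical stand-ins, not code from deltablue.js.

```javascript
// Assumed object shape: plan.v.elms is a plain array, as in the snippets above.
var plan = {
  v: {
    elms: [1, 2, 3, 4],
    size: function () { return this.elms.length; }
  },
  size: function () { return this.v.size(); },

  // Original form: two method calls in every loop test.
  executeCalls: function () {
    for (var i = 0; i < this.size(); i++) {
    }
    return i;
  },

  // |size()| inlined by hand. What remains in the loop test is |this|,
  // the property chain |.v.elms|, and the array |.length| read,
  // which correspond to items 2 through 4.
  executeInlined: function () {
    for (var i = 0; i < this.v.elms.length; i++) {
    }
    return i;
  }
};

console.log(plan.executeCalls(), plan.executeInlined());
```

Timing the two variants against each other isolates the call overhead from the cost of the property and length reads.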
Comment 3 (Assignee) • 15 years ago
Bill pointed out that as far as we know v8/nocs doesn't do inlining. And in my tests inlining at the source level does help v8/nocs run faster. So it seems that the difference here is actually that they are faster at method calls, as shown by this benchmark. They are not faster at a plain call to the global function |g|, but they are faster at calling |a.g|.
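For readers without access to the attachment, a comparison of this kind could be sketched as follows. This is an assumed reconstruction, not the attached benchmark: |g| and |a.g| have the same trivial body, and |timeGlobalCall|, |timeMethodCall|, and the iteration count are hypothetical.

```javascript
// Same body, reached two ways: as a global function and as a method.
function g() { return 1; }
var a = { g: function () { return 1; } };

// Times n plain calls to the global function |g|.
function timeGlobalCall(n) {
  var sum = 0;
  var start = Date.now();
  for (var i = 0; i < n; i++) sum += g();
  return { sum: sum, ms: Date.now() - start };
}

// Times n method calls |a.g()|, which add a property lookup before the call.
function timeMethodCall(n) {
  var sum = 0;
  var start = Date.now();
  for (var i = 0; i < n; i++) sum += a.g();
  return { sum: sum, ms: Date.now() - start };
}

var N = 10000000; // arbitrary iteration count
console.log("g():   " + timeGlobalCall(N).ms + " ms");
console.log("a.g(): " + timeMethodCall(N).ms + " ms");
```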
Comment 4 (Assignee) • 15 years ago
About 10% of our time is in Array.push and Array.pop. v8/nocs is much faster (3x) on this microbenchmark.
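A push/pop microbenchmark of this sort can be as small as the sketch below. It is an assumption about the general shape, not the attached file; |pushPop| and the element count are made up.

```javascript
// Pushes n integers onto a fresh array, then pops them all off again.
// The running sum keeps the popped values live so the loop is not trivially dead.
function pushPop(n) {
  var arr = new Array();
  var sum = 0;
  for (var i = 0; i < n; i++) arr.push(i);
  while (arr.length > 0) sum += arr.pop();
  return sum;
}

var start = Date.now();
pushPop(1000000); // arbitrary size
console.log("push/pop: " + (Date.now() - start) + " ms");
```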
Comment 5 (Assignee) • 15 years ago
On this benchmark, v8/nocs is 2x faster for the not-equal case and 4x faster for the equal case.
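The comment does not say which values the attached benchmark compares. The sketch below assumes object references compared with |==|, and covers both the equal and the not-equal case; |countEqual| and the iteration count are hypothetical.

```javascript
// Counts how many of n comparisons |x == y| come out true.
function countEqual(x, y, n) {
  var hits = 0;
  for (var i = 0; i < n; i++) {
    if (x == y) hits++;
  }
  return hits;
}

var p = {};
var q = {};
var N = 10000000; // arbitrary iteration count

var start = Date.now();
countEqual(p, p, N); // equal case: same object on both sides
console.log("equal:     " + (Date.now() - start) + " ms");

start = Date.now();
countEqual(p, q, N); // not-equal case: two distinct objects
console.log("not equal: " + (Date.now() - start) + " ms");
```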
Comment 6 (Assignee) • 15 years ago
Summarizing, here are the optimizations we need here:
1. Faster method calls. This is probably about 2/3 of what's slowing us down here. See comment 3. Inlining would also fix this.
2. Faster array and object allocation.
3. Faster Array.push/pop
4. Faster equality testing
5. Faster array.length
Comment 7•15 years ago
(In reply to comment #3)
> They are not faster at a plain call to the global
> function |g|, but they are faster at calling |a.g|.
Since JSOP_CALL should do roughly the same thing for g() vs. a.g(), does that mean that it's JSOP_CALLPROP that needs loving?
Comment 8 (Assignee) • 15 years ago
(In reply to comment #7)
> (In reply to comment #3)
> > They are not faster at a plain call to the global
> > function |g|, but they are faster at calling |a.g|.
>
> Since JSOP_CALL should do roughly the same thing for g() vs. a.g(), does that
> mean that it's JSOP_CALLPROP that needs loving?
That would be my guess, but I don't really know.
Comment 9•12 years ago
This JM bug has been superseded by IonMonkey bug 768739.
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → INCOMPLETE