Open
Bug 624299
Opened 14 years ago
Updated 2 months ago
2x slower than v8 on recursion+scope chain testcase
Categories
(Core :: JavaScript Engine, enhancement, P3)
Tracking
NEW
People
(Reporter: bzbarsky, Unassigned)
References
(Blocks 1 open bug)
Details
(Whiteboard: [js:t] [js:perf])
Attachments
(2 files, 2 obsolete files)
See bug 614834 comment 27. The testcase in question is in the URL field.
Comment 1•11 years ago
The ratio improved, but we're still slowest here (all numbers on my rMBP @ 2.7GHz):
SpiderMonkey:
0.58
0.56
0.565
0.56
0.5575
JSC:
0.34
0.33
0.32
0.3175
0.3175
d8:
0.3
0.29
0.295
0.3325
0.29625
OS: Mac OS X → All
Hardware: x86 → All
Summary: 4x slower than v8 on recursion+scope chain testcase → 2x slower than v8 on recursion+scope chain testcase
Whiteboard: [js:t] [js:perf]
Assignee
Updated•10 years ago
Assignee: general → nobody
Comment 2•10 years ago
Firefox 33 is faster than Chrome 39 for me.
Firefox goes from 0.60 to 0.45 and Chrome goes from 0.70 to 0.55
Comment 3•10 years ago
For me (same setup as in comment 1), we're still slowest (and note the progress JSC has made):
SpiderMonkey:
0.46
0.44
0.45
0.4525
0.4575
JSC:
0.22
0.2
0.21
0.2175
0.215
d8:
0.26
0.31
0.28
0.2775
0.2625
Current Nightly and Canary also reflect this. Safari is about 50% slower than JSC, but still faster than us.
Comment 4•10 years ago
This is a lot faster on 32-bit. On OS X I get 0.23-0.26 ms with an x86 build, 0.39-0.42 with an x64 build.
This could be our boxing format, or us spilling more registers somewhere. We should investigate.
Reporter
Comment 5•10 years ago
Reporter
Comment 6•10 years ago
Reporter
Comment 7•10 years ago
Attachment #8527735 -
Attachment is obsolete: true
Reporter
Comment 8•10 years ago
Attachment #8527736 -
Attachment is obsolete: true
Reporter
Comment 9•10 years ago
Some thoughts in no particular order:
1) The overall time of the testcase on 32-bit is about 0.25 * (50 + 100 + 200 + 400 + 800) = 387.5ms. The x86-64 times are about 2x that, in the 800-900ms range. So we need to account for about 400-500ms of runtime.
2) The testcase executes about 300e6 Unbox:Int32 instructions. On x86, there's nothing to do for these if we know we have an int. On x86-64, these correspond to a single movl. What this means on the hardware, I don't know, but if we assume that takes one cycle, that's 300e6 cycles, the CPU is at 2.6GHz, so about 115ms. But worse yet, in some of these cases we don't know we have an int. In that case, on 32-bit we get things like:
[MoveGroup]
movl %edx, %eax
[Unbox:Int32]
cmpl $0xffffff81, %ecx
jne ((366))
And on 64-bit we get:
[Unbox:Int32]
movq %rcx, %r11
shrq $47, %r11
cmpl $0x1fff1, %r11d
jne ((383))
movl %ecx, %eax
So that's an extra move and shift, though on 32-bit presumably we paid part of that cost when we initially placed the high 32 bits of the Value in ecx.
3) On x86-64 there's an extra MoveGroup before the first CallKnown. But the actual call is cheaper, and in any case there aren't _that_ many CallKnowns here (about 75e6).
So my money is that the main culprit here is the Unbox:Int32 bits.
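The arithmetic in points 1) and 2) and the two tag checks can be modelled roughly as follows. This is only an illustrative sketch in Python, not SpiderMonkey code: the function names and the `box_int32_x64` helper are hypothetical, the tag constants and the shift amount are copied from the disassembly above, and the one-cycle cost per unbox is the same assumption the comment makes.

```python
# Rough model of the estimates in points 1) and 2). All names are illustrative
# and are not taken from SpiderMonkey.

ITERATION_COUNTS = (50, 100, 200, 400, 800)  # run sizes used in point 1)


def total_runtime_ms(ms_per_iteration):
    """Total testcase time, assuming every run costs the same per iteration."""
    return ms_per_iteration * sum(ITERATION_COUNTS)


def cycles_to_ms(cycles, clock_hz):
    """Convert a cycle count to milliseconds at the given clock rate."""
    return cycles / clock_hz * 1000.0


# Constants as they appear in the disassembly above.
X86_INT32_TAG = 0xFFFFFF81  # compared against the tag word held in %ecx
X64_INT32_TAG = 0x1FFF1     # compared against the boxed value shifted right
X64_TAG_SHIFT = 47


def to_signed32(bits):
    """Interpret the low 32 bits as a signed integer."""
    bits &= 0xFFFFFFFF
    return bits - (1 << 32) if bits & 0x80000000 else bits


def unbox_int32_x86(tag_word, payload_word):
    """32-bit shape: a single compare, the payload is already separate."""
    if tag_word != X86_INT32_TAG:
        return None  # stands in for the jne to the bailout path
    return to_signed32(payload_word)


def unbox_int32_x64(boxed):
    """64-bit shape: copy, shift, compare, then take the low 32 bits."""
    if (boxed >> X64_TAG_SHIFT) != X64_INT32_TAG:
        return None  # stands in for the jne to the bailout path
    return to_signed32(boxed)


def box_int32_x64(value):
    """Hypothetical helper: build a 64-bit pattern that passes the check above."""
    return (X64_INT32_TAG << X64_TAG_SHIFT) | (value & 0xFFFFFFFF)


if __name__ == "__main__":
    print(total_runtime_ms(0.25))       # 32-bit total from point 1)
    print(cycles_to_ms(300e6, 2.6e9))   # unbox cost from point 2)
    print(unbox_int32_x64(box_int32_x64(-7)))
```

The model only counts the extra work per unbox. It says nothing about how the hardware actually schedules the move and shift, which is the open question in point 2).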
Comment 10•10 years ago
(In reply to Please do not ask for reviews for a bit [:bz] from comment #9)
> So my money is that the main culprit here is the Unbox:Int32 bits.
Yes, I have a patch for x64 Unbox that gets us close to the 32-bit numbers. Will post soon, after testing what it does on some other benchmarks.
Comment 11•10 years ago
(In reply to Jan de Mooij [:jandem] from comment #10)
> Yes, I have a patch for x64 Unbox that gets us close to the 32-bit numbers.
> Will post soon, after testing what it does on some other benchmarks.
Bug 1104199. With the patch there:
x64 before: 0.44, 0.38, 0.425, 0.4, 0.4075
x64 after: 0.28, 0.27, 0.245, 0.26, 0.25125
x86: 0.24, 0.23, 0.25, 0.2475, 0.23625
d8 x64: 0.26, 0.23, 0.235, 0.2425, 0.22625
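To compare the four configurations at a glance, the numbers above can be averaged and turned into ratios. The sketch below is illustrative only: it uses a plain unweighted mean of the five reported values, which is an assumption about how to summarize them, and the names `timings`, `summarize`, and `ratio` are made up for this example.

```python
from statistics import mean

# Timings copied from the comment above, in the units the testcase reports.
timings = {
    "x64 before": [0.44, 0.38, 0.425, 0.4, 0.4075],
    "x64 after": [0.28, 0.27, 0.245, 0.26, 0.25125],
    "x86": [0.24, 0.23, 0.25, 0.2475, 0.23625],
    "d8 x64": [0.26, 0.23, 0.235, 0.2425, 0.22625],
}


def summarize(samples):
    """Unweighted mean per configuration (an assumed way to summarize)."""
    return {name: mean(values) for name, values in samples.items()}


def ratio(summary, slower, faster):
    """How many times slower one configuration is than another."""
    return summary[slower] / summary[faster]


averages = summarize(timings)

if __name__ == "__main__":
    for name, avg in averages.items():
        print(f"{name}: {avg:.4f}")
    print(f"patch speedup on x64: {ratio(averages, 'x64 before', 'x64 after'):.2f}x")
    print(f"x64 after vs d8 x64: {ratio(averages, 'x64 after', 'd8 x64'):.2f}x")
```

Weighting each value by its iteration count, as in comment 9, would be an equally reasonable summary. The unweighted mean is used here only because it needs no assumption about which value belongs to which run size.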
Updated•2 years ago
Severity: normal → S3
Comment 12•2 months ago
Nightly: 0.312s
Chrome: 0.18s
So we are still 2x slower here.
Updated•2 months ago
Severity: S3 → N/A
Type: defect → enhancement
Priority: -- → P3