[...array] is slower than in V8
Categories
(Core :: JavaScript Engine, enhancement, P2)
Tracking
()
People
(Reporter: mstange, Unassigned)
References
(Blocks 2 open bugs)
Details
In a comparison between Spidermonkey's BaselineInterpreter and V8's interpreter, the following function from TodoMVC-Preact stood out as being 30x slower:
function getTodos() {
return [...todos];
}
SM: https://share.firefox.dev/3JYTzKP
V8: https://share.firefox.dev/3n6J1QR
Spidermonkey is running the self-hosted next()
function, whereas V8 has a CreateArrayFromIterable
bytecode op.
Spidermonkey bytecode:
00000: 2 NewArray 0 # []
00005: 2 Zero # [] 0
00006: 2 GetGName "todos" # [] 0 todos
00011: 2 Dup # [] 0 todos todos
00012: 2 Symbol 1 # [] 0 todos todos Symbol.iterator
00014: 2 GetElem # [] 0 todos todos[Symbol.iterator]
00015: 2 Swap # [] 0 todos[Symbol.iterator] todos
00016: 2 CallIter 0 # [] 0 todos[Symbol.iterator]()
00019: 2 CheckIsObj 3 # [] 0 todos[Symbol.iterator]()
00021: 2 Dup # [] 0 todos[Symbol.iterator]() todos[Symbol.iterator]()
00022: 2 GetProp "next" # [] 0 todos[Symbol.iterator]() todos[Symbol.iterator]().next
00027: 2 Swap # [] 0 todos[Symbol.iterator]().next todos[Symbol.iterator]()
00028: 2 Pick 3 # 0 todos[Symbol.iterator]().next todos[Symbol.iterator]() []
00030: 2 Pick 3 # todos[Symbol.iterator]().next todos[Symbol.iterator]() [] 0
00032: 2 LoopHead (ic: 5, depthHint: 1) # todos[Symbol.iterator]().next todos[Symbol.iterator]() [] 0
00038: 2 DupAt 3 # todos[Symbol.iterator]().next todos[Symbol.iterator]() [] 0 todos[Symbol.iterator]().next
00042: 2 DupAt 3 # todos[Symbol.iterator]().next todos[Symbol.iterator]() [] 0 todos[Symbol.iterator]().next todos[Symbol.iterator]()
00046: 2 Call 0 # todos[Symbol.iterator]().next todos[Symbol.iterator]() [] 0 todos[Symbol.iterator]().next()
00049: 2 CheckIsObj 0 # todos[Symbol.iterator]().next todos[Symbol.iterator]() [] 0 todos[Symbol.iterator]().next()
00051: 2 Dup # todos[Symbol.iterator]().next todos[Symbol.iterator]() [] 0 todos[Symbol.iterator]().next() todos[Symbol.iterator]().next()
00052: 2 GetProp "done" # todos[Symbol.iterator]().next todos[Symbol.iterator]() [] 0 todos[Symbol.iterator]().next() todos[Symbol.iterator]().next().done
00057: 2 JumpIfTrue 78 (+21) # todos[Symbol.iterator]().next todos[Symbol.iterator]() [] 0 todos[Symbol.iterator]().next()
00062: 2 JumpTarget (ic: 8) # todos[Symbol.iterator]().next todos[Symbol.iterator]() [] 0 todos[Symbol.iterator]().next()
00067: 2 GetProp "value" # todos[Symbol.iterator]().next todos[Symbol.iterator]() [] 0 todos[Symbol.iterator]().next().value
00072: 2 InitElemInc # todos[Symbol.iterator]().next todos[Symbol.iterator]() [...] (intermediate value)
00073: 2 Goto 32 (-41) # todos[Symbol.iterator]().next todos[Symbol.iterator]() [...] (intermediate value)
00078: 2 JumpTarget (ic: 10) # todos[Symbol.iterator]().next todos[Symbol.iterator]() [] 0 todos[Symbol.iterator]().next()
00083: 2 Pick 4 # todos[Symbol.iterator]() [] 0 todos[Symbol.iterator]().next() todos[Symbol.iterator]().next
00085: 2 Pick 4 # [] 0 todos[Symbol.iterator]().next() todos[Symbol.iterator]().next todos[Symbol.iterator]()
00087: 2 PopN 3 # [] 0
00090: 2 Pop # []
00091: 3 Return #
00092: 3 RetRval # !!! UNREACHABLE !!!
V8 bytecode:
[generated bytecode for function: getTodos (0x160e0019a7cd <SharedFunctionInfo getTodos>)]
Bytecode length: 5
Parameter count 1
Register count 0
Frame size 0
Bytecode age: 0
0x160e0019b74e @ 0 : 21 00 00 LdaGlobal [0], [0]
0x160e0019b751 @ 3 : 7a CreateArrayFromIterable
0x160e0019b752 @ 4 : a9 Return
JSC bytecode:
[ 0] enter
[ 1] get_scope dst:loc4
[ 3] mov dst:loc5, src:loc4
[ 6] check_traps
[ 7] resolve_scope dst:loc7, scope:loc4, var:0, resolveType:GlobalProperty, localScopeDepth:0
[ 14] get_from_scope dst:loc8, scope:loc7, var:0, getPutInfo:2048<ThrowIfNotFound|GlobalProperty|NotInitialization|NotStrictMode>, localScopeDepth:0, offset:0
[ 22] mov dst:loc7, src:loc8
[ 25] spread dst:loc6, argument:loc7
[ 28] new_array_with_spread dst:loc6, argv:loc6, argc:1, bitVector:0
[ 33] ret value:loc6
In this test, we turned off Baseline/Ion for everything, even for self-hosted code. If self-hosted code runs in Baseline and we only force the interpreter for non-self-hosted code, the difference will probably be less pronounced.
Comment 1•1 year ago
|
||
We could maybe do something similar to OptimizeSpreadCall where we specifically handle the [...array]
case (and not, say, [1, ...array]
). It looks like that's what V8 is doing; they have a dedicated op in this case, but if I add another fixed element into the array (eg [1, ...todos]
) then they generate something much closer to what we have:
0 : 7a 00 00 25 CreateArrayLiteral [0], [0], #37
4 : 26 fa Star r1
6 : 12 01 LdaConstant [1]
8 : 26 fb Star r0
10 : 13 02 01 LdaGlobal [2], [1]
13 : 26 f7 Star r4
15 : b1 f7 03 05 GetIterator r4, [3], [5]
19 : a0 07 JumpIfJSReceiver [7] (0x8a10825017c @ 26)
21 : 61 b5 00 fb 00 CallRuntime [ThrowSymbolIteratorInvalid], r0-r0
26 : 26 f8 Star r3
28 : 28 f8 03 07 LdaNamedProperty r3, [3], [7]
32 : 26 f9 Star r2
34 : 58 f9 f8 10 CallProperty0 r2, r3, [16]
38 : 26 f7 Star r4
40 : a0 07 JumpIfJSReceiver [7] (0x8a108250191 @ 47)
42 : 61 ad 00 f7 01 CallRuntime [ThrowIteratorResultNotAnObject], r4-r4
47 : 28 f7 04 12 LdaNamedProperty r4, [4], [18]
51 : 97 13 JumpIfToBooleanTrue [19] (0x8a1082501a8 @ 70)
53 : 28 f7 05 09 LdaNamedProperty r4, [5], [9]
57 : 31 fa fb 0e StaInArrayLiteral r1, r0, [14]
61 : 25 fb Ldar r0
63 : 4c 0d Inc [13]
65 : 26 fb Star r0
67 : 8a 21 00 JumpLoop [33], [0] (0x8a108250184 @ 34)
70 : 25 fa Ldar r1
72 : aa Return
This is a medium-sized amount of work. Unless this function is really hot, or the pattern shows up elsewhere, I doubt this is our lowest-hanging fruit, but we should keep it on our radar.
Comment 2•1 year ago
|
||
JSC still handles { [1,...todos] }
at a high level:
[ 0] enter
[ 1] get_scope dst:loc4
[ 3] mov dst:loc5, src:loc4
[ 6] check_traps
[ 7] mov dst:loc6, src:Int32: 1(const0)
[ 10] resolve_scope dst:loc8, scope:loc4, var:0, resolveType:GlobalProperty, localScopeDepth:0
[ 17] get_from_scope dst:loc9, scope:loc8, var:0, getPutInfo:2048<ThrowIfNotFound|GlobalProperty|NotInitialization|NotStrictMode>, localScopeDepth:0, offset:0
[ 25] mov dst:loc8, src:loc9
[ 28] spread dst:loc7, argument:loc8
[ 31] new_array_with_spread dst:loc6, argv:loc6, argc:2, bitVector:0
[ 36] ret value:Undefined(const1)
Reporter | ||
Comment 3•1 year ago
•
|
||
getTodos() is 1.1% of the updated TodoMVC-Preact benchmark. It's probably called 9 times in total (3 renders of the App component x 3 calls per App render)
Comment 4•1 year ago
|
||
Wow, preact is small. If it's called 9 times in total, it's unlikely to get out of the interpreter.. Maybe into blinterp, if we lower our thresholds.
If we decide to do anything here, we should go look at how JSC implements their spread
op. Just looking at the bytecode makes it seem relatively generic, and one of our problems with optimizing spread calls / spread array initialization / various forms of destructuring is that we've never found a good abstraction that would let us avoid writing a bunch of bespoke implementations.
Updated•1 year ago
|
Description
•