Support MParameter in FoldLoadsWithUnbox
Categories
(Core :: JavaScript Engine: JIT, enhancement, P3)
Tracking
()
People
(Reporter: iain, Assigned: debadree333)
References
(Blocks 2 open bugs)
Details
(Whiteboard: [sp3])
Attachments
(1 obsolete file)
The FoldLoadsWithUnbox optimization combines a load from memory with an unbox to generate slightly faster assembly. We could consider expanding it to support MParameter as an input, because parameters are passed in memory (on the stack).
Updated•1 year ago
|
Updated•1 year ago
|
| Assignee | ||
Comment 1•1 year ago
|
||
Hello iain,
I was just trying to understand this, iiuc for every parameter of the function we generate this MParameter node? and we would have to add an additional case here in https://searchfox.org/mozilla-central/source/js/src/jit/IonAnalysis.cpp#4968 and create a corresponding instruction to do both like MLoadElementAndUnbox and lower it somewhat similarly?
Thank you!!
Comment 2•1 year ago
|
||
Iain is on parental leave so I'll steal this one: yes that's correct. We would handle MParameter like MLoadFixedSlot et al and add a new MIR instruction that generates more efficient code by folding the load + unbox.
Would be good to compare the codegen before/after for some types (int32, double, object are the interesting cases).
| Assignee | ||
Comment 3•1 year ago
|
||
Oh hey! (so sorry for pinging iain didnt know) so basically i looked at a script like this:
function functionWithParams(i, j) {
const arr = [];
for (let start = i; start < j; start++) {
arr.push(start + i);
}
return arr;
}
for (let i = 0; i < 1e5; i++) {
functionWithParams(i, i + 10);
}
i looked at the ion graph and the state of the first block after FoldLoadsWithUnbox looks something like this:
resumepoint 3 3 2 1 0 3 3
0 parameter THIS_SLOT
1 parameter 0
2 parameter 1
3 constant undefined
4 start
5 checkoverrecursed
6 constant shape at 103165763b80
7 newarrayobject constant6:Shape
8 unbox parameter2 to Int32 (fallible)
9 guardshape newarrayobject7:Object
memory 4
10 constant object 10316573e0e0 (Array)
11 guardshape constant10:Object
memory 4
12 slots guardshape11:Object memory 4
57 loaddynamicslotandunbox slots12:Slots (slot 6) 14 unbox parameter1 to Int32 (fallible)
16 constant function push at 103165761430
17 guardspecificfunction loaddynamicslotandunbox57:Object constant16:Object
18 constantproto constant10:Object
so we would ideally want it to look something like where instruction number 8 too look like loadparameterandunbox like we see in instruction 17 for example! right?
Thank you so much for responding!
Comment 4•1 year ago
|
||
That's correct! For the LoadParameterAndUnbox MIR/LIR instructions, you should look at Parameter and LoadFixedSlot - it will probably have elements from both of those.
| Assignee | ||
Comment 5•1 year ago
|
||
Hello jandem (sorry for the repeated needinfo's 😅 ) if we look at the implementation of visitLoadFixedSlotAndUnbox (for ex): we load the address based on the base address (which is the object ptr i believe) and the offset as seen here: https://searchfox.org/mozilla-central/source/js/src/jit/CodeGenerator.cpp#17286 , now for the case of LoadParameterAndUnbox we look at its lowering implementation we only pass around the offset value https://searchfox.org/mozilla-central/source/js/src/jit/Lowering.cpp#76 so what i am not able to figure out is what will be the base address in this case? I i tried to look at the implementation of LArgument and LAllocation i think theres a clue here: https://searchfox.org/mozilla-central/source/js/src/jit/LIR.h#92 but couldnt figure :-(
Thank you so much for your time!!
Comment 6•1 year ago
|
||
We can use the frame pointer register as base, similar to the code here (ignore the BaseRegForAddress::SP case):
Note that a.toArgument()->index() is a bit confusing: it's the offset of the argument, not the argument's index. That's also clear from the Lowering.cpp code you linked to in comment 5.
| Assignee | ||
Comment 7•1 year ago
|
||
Updated•1 year ago
|
| Reporter | ||
Comment 8•1 year ago
•
|
||
For the record, the motivating testcase for this is in bug 1911160. It looks something like this:
C.costBetween = function(from, to) {
if (from >= this.costMatrix.length || from == to)
return 0;
return this.costMatrix[from][to - from - 1];
}
In particular, disassembling the code we generate for loading from and to, we get something like this (taken from this comment):
MoveGroup [arg:8 -> r8, x]
mov r8, qword [rbp + 0x28]
Unbox (r8)
mov r11, r8
shr r11, 0x2f
cmp r11d, 0x1fff1
jnz 0x48d794
mov edi, r8d
Conversely, this.costMatrix generates:
LoadFixedSlotAndUnbox (rax)
mov r11, -0x2000000000000
xor r11, qword [rax + 0x18]
mov rcx, r11
shr r11, 0x2f
jnz 0x48d77f
The latter lets us avoid the cmp by arranging for the shr to set the right flags.
I don't know whether foldLoadsWithUnbox is the best way to accomplish this, or whether we could do the whole optimization during lowering/codegen.
| Assignee | ||
Comment 9•1 year ago
|
||
Oh thanks for the explanation iain i will try investigating more and ask any questions i find!
Updated•11 months ago
|
Comment 10•2 months ago
|
||
Worth repriotizing/reassigning/just-fix-it this?
| Reporter | ||
Comment 11•2 months ago
|
||
Bumping down in priority; I think the gap we were seeing on bug 1911160 had more to do with bug 1956655.
Description
•