Open
Bug 817937
Opened 12 years ago
Updated 2 years ago
Slim down JIT codegen for DOM bindings a bit
Categories
(Core :: JavaScript Engine, defect)
Tracking
()
NEW
People
(Reporter: bzbarsky, Unassigned)
References
(Depends on 1 open bug, Blocks 1 open bug)
Details
Attachments
(2 files)
I ran this testcase: var count = 1000000; var xhr = new XMLHttpRequest; for (var i = 0; i < count; ++i) { xhr.readyState; } through the JIT profiler and looked at the resulting asm. There are various things in there that are bit suboptimal. I'll comment on those when I attach the annotated asm.
Reporter | ||
Comment 1•12 years ago
|
||
So I see the following things that could use work: 1) Before we ever get to the GetDOMProperty there is a useless load of whatever object pointer we did the shape guard on (the proto?) into %rax. What's that about? 2) We end up loading JSRuntime* into registers twice: once when saving rt->ionTop, the other when getting rt->ionJSContext. Not sure how to best deal with that, given that the ionTop thing is somewhere inside frame setup goop. 3) We put all of our call args in scratch registers, then move them into the argument registers. We should just put them in the right registers to start with. 4) We unbox, and guard on, the completely unused return value. Not sure how important this is to fix since presumably real code would rarely not use a return value. I'll file a separate bug on #3, since I understand that one best...
Reporter | ||
Comment 2•12 years ago
|
||
I looked at the asm for a method call too, and it has two separate call instructions... not sure whether that's expected?
Comment 3•12 years ago
|
||
(In reply to Boris Zbarsky (:bz) from comment #1) > Created attachment 688098 [details] > Annotated asm > > So I see the following things that could use work: > > 1) Before we ever get to the GetDOMProperty there is a useless load of > whatever > object pointer we did the shape guard on (the proto?) into %rax. What's > that about? The guard is produced by IonBuilder::TestCommonPropFunc, and put as a dependency of MGetDOMProperty such as it would be a dependency of the call. The bug might come from the fact that the MIR dependency is not lower to a not-used LUse, during the lowering phase and the register allocator might be lost and generate weird things. > [GetDOMProperty] > subq $0x8, %rsp #bz Make stack space for jsval retval > movq %rsp, %rcx #bz Put pointer to retval in %rcx > push %rdi #bz Pushed "xhr" object on stack. > movq 0x20(%rdi), %rbx #bz Read first fixed slot from "xhr" > shlq $1, %rbx #bz Getting private on 64-bit shifts > movq %rsp, %rdi #bz Put pointer to xhr object in %rdi > movabsq $0xffffffffffffffff, %rax #bz Stick -1 in %rax > pushl $0xc0 #bz Push 0xc0 = 192? Why? This is the frame descriptor, we need it to walk the stack. > push %rax #bz Push -1? Why? This is the emulated return address, -1 is just the placeholder generated which will be patched at the link phase. This fake return address is used to index the safepoint which is used for marking live objects which are in the current frame. > movabsq $0x107dc5000, %r11 #bz Put JSRuntime* in %r11 > movq %rsp, 0x34a28(%r11) #bz save stack pointer in rt->ionTop This is used as a start point for stack iteration from the current IonActivation. rt->ionTop is used as an easy thread-safe way to hold the last stack pointer. > movabsq $0x1, %r11 #bz Stick 1 in %r11 > push %r11 #bz Push 1 > movabsq $0x0, %r11 #bz Stick 0 in %r11 > push %r11 #bz Push 0 These 2 are used to identify the kind of exit frame. In this case an ION_FRAME_DOMGETTER. > movq %rsp, %rax #bz Save stack pointer in %rax > andq $0xfffffff0, %rsp #bz align stack? 32-bit? > push %rax #bz Push pre-alignment %rsp on stack This is made by setupUnalignedABICall, we need to align the stack such as the call frame would be aligned, to respect some compiler restrictions. In practice, we might be able to get rid of them in most cases as we statically know our frame size. But this would imply that we keep it aligned for every-call which might involve some tricky manipulation if funapply and also in the underflow case. > 2) We end up loading JSRuntime* into registers twice: once when saving > rt->ionTop, the other when getting rt->ionJSContext. Not sure how to > best > deal with that, given that the ionTop thing is somewhere inside frame > setup > goop. > movabsq $0x107dc5000, %rax #bz Put JSRuntime* in %rax again, BUG > movq 0x34a30(%rax), %rax #bz Put rt->ionJSContext in %rax This is the first argument of the function. This will also appear with callVM. We should update linkExitFrame to accept a register containing a JSRuntime*. I am not sure but we have to be careful in VMwrapper on x86 as we might run out of register quickly. > subq $0x8, %rsp #bz Re-align stack to 16 bytes > 3) We put all of our call args in scratch registers, then move them into the > argument registers. We should just put them in the right registers to > start with. > > I'll file a separate bug on #3, since I understand that one best... > movq %rcx, %rcx #bz Call setup arg 4: jsval*, BUG > movq %rbx, %rdx #bz Call setup arg 3: nsXMLHttpRequest* > movq %rdi, %rsi #bz Call setup arg 2: JSHandleObject > movq %rax, %rdi #bz Call setup arg 1: JScontext* Hum … apparently the CallTempReg* are not set correctly. This should have been 4 no-op plus a knonw bug from the MoveGroup which is still producing these no-op. This should be an easy fix. > movabsq $0x10280b8f0, %rax #bz Target address into %rax > call *%rax #bz Do the call > addq $0x8, %rsp #bz Remove stack space we made above > pop %rsp #bz pop pre-alignment %rsp > movq 0x28(%rsp), %rcx #bz Read return value into %rcx > addq $0x30, %rsp #bz Pop everything off the stack > 4) We unbox, and guard on, the completely unused return value. Not sure how > important this is to fix since presumably real code would rarely not use > a > return value. We have to because TI does not trust DOM functions to always return the same value type.
Reporter | ||
Comment 4•12 years ago
|
||
Nicolas, thank you for the analysis! Are you willing to file a bug on #1 above, since you seem to understand it at least somewhat? > We should update linkExitFrame to accept a register containing a JSRuntime*. Do you want to file the bug, or should I? > Hum … apparently the CallTempReg* are not set correctly. Let's do this part in bug 817943. > We have to because TI does not trust DOM functions to always return the same value type. Well, in the unused return value case we don't care whether it returned the same value type... But this raises an interesting point in general which I should have thought of last night. Filed bug 818050.
Depends on: 818050
Reporter | ||
Updated•12 years ago
|
Summary: Slim down DOM codegen for DOM bindings a bit → Slim down JIT codegen for DOM bindings a bit
Reporter | ||
Updated•12 years ago
|
Blocks: ParisBindings
Comment 5•12 years ago
|
||
(In reply to Nicolas B. Pierron [:pierron] [:nbp] from comment #3) > > movabsq $0xffffffffffffffff, %rax #bz Stick -1 in %rax > > push %rax #bz Push -1? Why? > > movabsq $0x1, %r11 #bz Stick 1 in %r11 > > push %r11 #bz Push 1 > > movabsq $0x0, %r11 #bz Stick 0 in %r11 > > push %r11 #bz Push 0 Yuck. We can do better than this. And we do, for pushing the frame descriptor; why don't we use |push imm| in these instances, too?
Reporter | ||
Comment 6•12 years ago
|
||
> Yuck. We can do better than this. Filed bug 818138.
Depends on: 818138
Reporter | ||
Comment 7•12 years ago
|
||
Do we need a separate bug on this part:
> movabsq $0xffffffffffffffff, %rax #bz Stick -1 in %rax
> pushl $0xc0 #bz Push 0xc0 = 192? Why?
> push %rax #bz Push -1? Why?
to look more like this:
pushl $0xc0
pushl $-1
?
Reporter | ||
Comment 8•12 years ago
|
||
Or perhaps it would have to be: pushl $0xc0 pushl $0xffffffff Note that on 32-bit that snippet looks like so: movl $0xffffffff, %edi pushl $0x460 push %edi which is no better, imo...
Comment 9•12 years ago
|
||
(In reply to Boris Zbarsky (:bz) from comment #8) > Or perhaps it would have to be: > > pushl $0xc0 > pushl $0xffffffff > > Note that on 32-bit that snippet looks like so: > > movl $0xffffffff, %edi > pushl $0x460 > push %edi > > which is no better, imo... No better than what? We could educate 32-bit (and 64-bit, really) about the |push imm8/imm16| variants, which also do sign-extension and are shorter.
Reporter | ||
Comment 10•12 years ago
|
||
Oh, I see. We later patch the 0xffffffff with the actual address, so we can't treat it as a 32-bit immediate. Alright.
Reporter | ||
Comment 11•12 years ago
|
||
> No better than what?
No better than the 64-bit codegen.
So for 32-bit, we might be be able to do a pushl anyway, since pointers would fit in 32 bits there. But that might complicate the later patching...
Comment 12•12 years ago
|
||
Ah, OK. Probably easier to just keep the mov $imm, reg form for both architectures, then.
Assignee | ||
Updated•10 years ago
|
Assignee: general → nobody
Updated•2 years ago
|
Severity: normal → S3
You need to log in
before you can comment on or make changes to this bug.
Description
•