670484 - IonMonkey: Implement function calls

David Anderson [:dvander] - inactive, e-mail if emergency

Reporter

Description

14 years ago

To implement non-inlined function calls, we need some sort of calling convention. In JS, you can omit or add extra arguments to a function call, which makes it hard to make sure both calling functions and accessing arguments are fast.

In JM, a callee accesses its arguments by looking above the current frame, peeking into the previous frame's stack. The i-th argument is at position [frame - formalArgs + i]. If argc != formalArgs, the frame has to be padded or shrunk, and this process is expensive.

In IonMonkey we should be able to pretty easily fix this by reversing the order in which arguments are pushed, so the i-th argument is at [frame - i]. Then we would never need to resize the callee's frame.

But naively doing this at the CALL instruction is suboptimal. For example, consider this:

    call(a.b, c(), "d");

    0: getprop
    1: call
    2: string("d")
    3: push(2)
    4: push(1)
    5: push(0)
    6: call

If we need to spill in [0, 2] then the pushes become memory-to-memory. Instead, we want something like:

    0: getprop
    1: push(0)
    2: call
    3: push(2)
    4: string("d")
    5: push(4)
    6: call

But this isn't in the right order: "push" would have to mean "write to the correct argument slot on the stack". That seems plausible, but now we'd need a way to associate stack pushes as argument pushes. One idea: emit a JSOP_PUSHARG or something, and if necessary prefix calls by a JSOP_PREPARECALL. We want that association no matter what order we push arguments.

Note: it might be that FixupArity can be made really cheap, if we can get it down to a few instructions generated as a special trampoline.

Aside from pushing arguments:
 * Caller must communicate to the register allocator that it interferes with volatile registers.
 * Callee must save and restore the frame pointer (ebp).
 * Callee must save some nugget of compiler information in a canonical place on the frame, so that C++ can traverse the stack if necessary. (can be a follow-up bug)
 * Callee must guard against stack overflow. (can be a follow-up bug)
 * Callee must allocate a closure environment if needed. (can be a follow-up bug)
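
To make the two addressing schemes in the description concrete, here is a minimal Python sketch. It is an illustration only, not engine code: the stack is a plain list, `frame` in the reversed layout is assumed to be the address of argument 0, and the helper names (`push_in_order`, `push_reversed`, `pad_actuals`) are hypothetical. Padding missing arguments in the caller is also an assumption, not something this bug settles.

```python
UNDEFINED = "undefined"  # stand-in for the JS undefined value


def push_in_order(stack, args):
    """JM-style layout: push arguments first to last; return the frame base."""
    stack.extend(args)
    return len(stack)  # the callee frame starts right after the last argument


def jm_arg(stack, frame, formal_args, i):
    # Formula from the description: [frame - formalArgs + i].
    # Only correct when argc == formalArgs, hence the resize step in JM.
    return stack[frame - formal_args + i]


def push_reversed(stack, args):
    """Reversed layout: push arguments last to first; return argument 0's slot."""
    stack.extend(reversed(args))
    return len(stack) - 1  # argument 0 ends up closest to the callee frame


def reversed_arg(stack, frame, i):
    # Formula from the description: [frame - i]; does not depend on argc.
    return stack[frame - i]


def pad_actuals(args, formal_args):
    """Assumed caller-side padding so every formal has a slot."""
    return list(args) + [UNDEFINED] * max(0, formal_args - len(args))


# A callee with two formals, called with three actuals.
stack = ["caller-local"]
frame = push_in_order(stack, ["a", "b", "c"])
misread = jm_arg(stack, frame, 2, 0)  # reads "b", not "a": frame needs fixing up

stack = ["caller-local"]
frame = push_reversed(stack, ["a", "b", "c"])
formals = [reversed_arg(stack, frame, i) for i in range(2)]  # "a" and "b"
```

With the reversed layout the extra argument simply sits further from the frame, so the callee never has to move anything.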

Luke Wagner [:luke]

Comment 1

14 years ago

I'm a big fan of reverse-ordered arguments and dropping both FixupArity and this whole "canonicalActualArgs" business.

David Mandelin [:dmandelin]

Comment 2

14 years ago

(In reply to comment #1)
> I'm a big fan of reverse-ordered arguments and dropping both FixupArity and
> this whole "canonicalActualArgs" business.

Thirded.

(In reply to comment #0)
> For example, consider this:
>
>   call(a.b, c(), "d");
>
>   0: getprop
>   1: call
>   2: string("d")
>   3: push(2)
>   4: push(1)
>   5: push(0)
>   6: call
>
> If we need to spill in [0, 2] then the pushes become memory-to-memory.
> Instead, we want something like:
>
>   0: getprop
>   1: push(0)
>   2: call
>   3: push(2)
>   4: string("d")
>   5: push(4)
>   6: call
>
> But this isn't in the right order: "push" would have to mean "write to the
> correct argument slot on the stack".

Yes.

> That seems plausible, but now we'd need
> a way to associate stack pushes as argument pushes. One idea: emit a
> JSOP_PUSHARG or something, and if necessary prefix calls by a
> JSOP_PREPARECALL. We want that association no matter what order we push
> arguments.

AIUI, the conventional way to express this (at a HIR/MIR level) is something like:

  0: getprop
  1: call
  2: string("d")
  3: call(0, 1, 2)

The call instruction is responsible for putting the arguments in the right place. As LIR, 3 would expand to something like:

  L32: move argR0, L2
  L31: move argR1, L1
  L30: move argR2, L0 // L0 is whatever LIR thing computes the value of MIR 0
  L33: call

I've written it so that |argR0| means the first argument slot in stack order, thus the last argument in the argument list. The idea is that |argR0| is something that represents the address of that argument-area location.

Of course, it's always nice to have a way to make L0 compute its result directly into argR2 if it can.

> Note: it might be that FixupArity can be made really cheap, if we can get it
> down to a few instructions generated as a special trampoline.

What is FixupArity in this version, anyway? It seems that excess arguments are no problem at all. And can insufficient arguments be fixed up by the IC in the caller?
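
The MIR-to-LIR expansion of a call described earlier in this comment can be sketched roughly as below. This is a toy model, assuming LIR is just a list of strings; `lower_call` is a hypothetical helper and not an IonMonkey function. It only shows the slot numbering, where argR0 receives the last source argument.

```python
def lower_call(arg_defs):
    """Expand a MIR call into LIR-like moves followed by the call itself.

    arg_defs: MIR ids of the arguments in source order, e.g. [0, 1, 2].
    """
    lir = []
    for slot, mir_id in enumerate(reversed(arg_defs)):
        # argR0 is the first slot in stack order, so it takes the LAST argument.
        lir.append(f"move argR{slot}, L{mir_id}")
    lir.append("call")
    return lir


# MIR instruction 3 from the comment: call(0, 1, 2)
lowered = lower_call([0, 1, 2])
```

For call(0, 1, 2) this produces the same three moves and call as the hand-written LIR in the comment, minus the instruction labels.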

> Aside from pushing arguments:
> * Caller must communicate to the register allocator that it interferes with
>   volatile registers.

The liveness pass can compute length-0 live intervals for all the volatile registers at every call instruction, or something like that.

> * Callee must save and restore the frame pointer (ebp).
> * Callee must save some nugget of compiler information in a canonical place
>   on the frame, so that C++ can traverse the stack if necessary. (can be a
>   follow-up bug)

Do you mean a pointer to the previous stack frame, or to a map of the new stack frame?

> * Callee must guard against stack overflow. (can be a follow-up bug)

Can we use memory protection for this?

> * Callee must allocate a closure environment if needed. (can be a follow-up
>   bug)

Yep.
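
One way to read the "length-0 live intervals" suggestion is sketched below. It is a simplified model rather than the LSRA or GreedyAllocator code attached to this bug: the register set is assumed to be the usual x86 caller-saved registers, instructions are plain strings, and `fixed_intervals` and `conflicts` are hypothetical names.

```python
VOLATILE = ("eax", "ecx", "edx")  # assumption: x86 caller-saved registers


def fixed_intervals(instructions):
    """Return one zero-length (reg, start, end) interval per volatile
    register at the position of every call instruction."""
    return [
        (reg, pos, pos)
        for pos, op in enumerate(instructions)
        if op == "call"
        for reg in VOLATILE
    ]


def conflicts(reg, start, end, fixed):
    """True if a value defined at `start` and last used at `end` would be
    clobbered in `reg` by a call strictly between those two positions."""
    return any(r == reg and start < pos < end for r, pos, _ in fixed)


# MIR from this comment: 0: getprop, 1: call, 2: string("d"), 3: call(0, 1, 2)
code = ["getprop", "call", "string", "call"]
fixed = fixed_intervals(code)

# Value 0 is live across the call at position 1, so eax is unusable for it.
value0_in_eax = conflicts("eax", 0, 3, fixed)
# A non-volatile register has no fixed intervals and stays available.
value0_in_ebx = conflicts("ebx", 0, 3, fixed)
```

Because the fixed intervals have zero length, they only block values that are live across a call; a value consumed by the call itself is unaffected in this model.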

David Anderson [:dvander] - inactive, e-mail if emergency

Reporter

Comment 3

14 years ago

> The call instruction is responsible for putting the arguments in the right
> place. As LIR, 3 would expand to something like:
>
>   L32: move argR0, L2
>   L31: move argR1, L1
>   L30: move argR2, L0 // L0 is whatever LIR thing computes the value of MIR 0
>   L33: call
>
> Of course, it's always nice to have a way to make L0 compute its result
> directly into argR2 if it can.

Is there an easy way to do this optimization if the argument movement happens at the call, rather than at the arguments? (I'm not even sure if it matters - if it doesn't, the design you mention is indeed nicer.)

> What is FixupArity in this version, anyway? It seems that excess arguments
> are no problem at all. And can insufficient arguments be fixed up by the IC
> in the caller?

This just meant, "If we don't want to do the reversal thing, FixupArity could probably be made really cheap." But it's a bunch of extra complexity and everyone seems to want reversing.

> > * Callee must save and restore the frame pointer (ebp).
> > * Callee must save some nugget of compiler information in a canonical place
> >   on the frame, so that C++ can traverse the stack if necessary. (can be a
> >   follow-up bug)
>
> Do you mean a pointer to the previous stack frame, or to a map of the new
> stack frame?

A map of the new stack frame - I'm not entirely sure of what needs to be in it yet, but definitely something with the frame size so the stack can be traversed.

> Can we use memory protection for this?

That would be awesome, we should investigate. We would have to jump in before breakpad, detect that JIT code is running, and detect that it's trying to go over the stack limit.
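
As a rough illustration of why a per-frame record containing the frame size is enough to traverse the stack, consider the sketch below. The layout is invented for the example: each frame ends with a hypothetical descriptor tuple of (name, size in words), which is not the format IonMonkey ended up using.

```python
def push_frame(stack, name, local_words):
    """Append a frame whose last word is a (name, size_in_words) descriptor."""
    stack.extend(local_words)
    stack.append((name, len(local_words) + 1))  # size counts the descriptor too


def walk(stack):
    """Yield frame names from newest to oldest using only the stored sizes."""
    top = len(stack)
    while top > 0:
        name, size = stack[top - 1]  # descriptor sits in a canonical slot
        yield name
        top -= size  # skip the whole frame to reach the previous descriptor


stack = []
push_frame(stack, "outer", [1, 2, 3])
push_frame(stack, "inner", [4])
frames = list(walk(stack))  # newest frame first
```

No saved frame pointer is needed in this model; the walker only relies on the descriptor being at a known position in each frame.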

David Anderson [:dvander] - inactive, e-mail if emergency

Reporter

Comment 4

14 years ago

Follow-up: tentatively the code generator is computing frame stack offsets relative to esp, not ebp, so in theory we could allocate ebp and only save/restore it in the JIT trampoline. I guess whether we do this depends on how valuable having an extra register is on x86, and whether we care what it does to breakpad.

Implement JSOP_NOTEARG. (patch, 2.82 KB, Sean Stangl [:sstangl], 14 years ago) dvander: review+
Implement DynamicArityList. (patch, 1.41 KB, Sean Stangl [:sstangl], 14 years ago) dvander: review+
Implement Call lowering and generation. (patch, 33.53 KB, Sean Stangl [:sstangl], 14 years ago) dvander: feedback+
LSRA support for calls. (patch, 3.97 KB, Sean Stangl [:sstangl], 14 years ago)
GreedyAllocator support for calls. (patch, 3.03 KB, Sean Stangl [:sstangl], 14 years ago)
Implement JSOP_NOTEARG [v2] (patch, 3.48 KB, Sean Stangl [:sstangl], 14 years ago) dvander: review+
Implement FixedList [v2] (patch, 1.53 KB, Sean Stangl [:sstangl], 14 years ago) dvander: review+
Implement Call lowering and generation [v2] (patch, 41.69 KB, Sean Stangl [:sstangl], 14 years ago)
GreedyAllocator support for calls. (patch, 3.02 KB, Sean Stangl [:sstangl], 14 years ago) dvander: review+
LSRA support for calls [v2] (patch, 3.84 KB, Sean Stangl [:sstangl], 14 years ago) adrake: review+
Implement Call lowering and generation [v3] (patch, 41.37 KB, Sean Stangl [:sstangl], 14 years ago)
Implement Call lowering and generation [v4] (patch, 44.92 KB, Sean Stangl [:sstangl], 14 years ago) dvander: review+
Function tests, where functions are passed via argument. (application/javascript, 1.95 KB, Sean Stangl [:sstangl], 14 years ago)