Last Comment Bug 680315 - IonMonkey: Implement Calls to C++
: IonMonkey: Implement Calls to C++
Status: RESOLVED FIXED
:
Product: Core
Classification: Components
Component: JavaScript Engine (show other bugs)
: unspecified
: All All
: -- normal (vote)
: ---
Assigned To: Nicolas B. Pierron [:nbp]
:
: Jason Orendorff [:jorendorff]
Mentors:
Depends on: 685228 685273 685841 689556 692114
Blocks: IonMonkey 685838 691340 691373 694169
  Show dependency treegraph
 
Reported: 2011-08-18 19:43 PDT by David Anderson [:dvander]
Modified: 2011-11-01 08:13 PDT (History)
5 users (show)
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---


Attachments
Experiment C++ calls from CodeGenerator. (15.00 KB, patch)
2011-08-29 09:25 PDT, Nicolas B. Pierron [:nbp]
dvander: feedback+
Details | Diff | Splinter Review
Handle IonCode pieces in ${ARCH}/Assembler-${ARCH}.h (6.23 KB, patch)
2011-08-30 09:51 PDT, Nicolas B. Pierron [:nbp]
no flags Details | Diff | Splinter Review
Assembler Call & NearCall for x86/x86_64 (6.70 KB, patch)
2011-08-31 10:33 PDT, Nicolas B. Pierron [:nbp]
dvander: review+
Details | Diff | Splinter Review
Assembler Call for x86/x86_64. (4.62 KB, patch)
2011-09-15 10:10 PDT, Nicolas B. Pierron [:nbp]
dvander: review+
Details | Diff | Splinter Review
Implements C++ calls. (28.52 KB, patch)
2011-09-16 16:27 PDT, Nicolas B. Pierron [:nbp]
no flags Details | Diff | Splinter Review
[0002] Implements C++ calls. (26.30 KB, patch)
2011-09-29 03:58 PDT, Nicolas B. Pierron [:nbp]
dvander: review+
Details | Diff | Splinter Review
Assembler Call/Jump for x86/x86_64. (4.90 KB, patch)
2011-10-18 08:34 PDT, Nicolas B. Pierron [:nbp]
nicolas.b.pierron: review+
Details | Diff | Splinter Review
Implements C++ calls. (41.88 KB, patch)
2011-10-18 08:39 PDT, Nicolas B. Pierron [:nbp]
nicolas.b.pierron: review+
Details | Diff | Splinter Review
Implements C++ calls. (42.36 KB, patch)
2011-10-19 09:30 PDT, Nicolas B. Pierron [:nbp]
nicolas.b.pierron: review+
Details | Diff | Splinter Review
Implements C++ calls. (42.05 KB, patch)
2011-10-27 08:30 PDT, Nicolas B. Pierron [:nbp]
nicolas.b.pierron: review+
Details | Diff | Splinter Review
Implements C++ calls. (42.08 KB, patch)
2011-10-31 14:11 PDT, Nicolas B. Pierron [:nbp]
nicolas.b.pierron: review+
Details | Diff | Splinter Review

Description David Anderson [:dvander] 2011-08-18 19:43:00 PDT
IonMonkey needs to be able to call out to C++ from JIT code. The design space here is pretty big, but the major features we want are:

 (1) Inspecting Ion frames from inside C++, for garbage collection and possibly
     for throwing exceptions.
 (2) Handling errors returned from C++.
 (3) Minimizing memory bloat from call boilerplate (it was huge in JM).

Some random ideas that might be worth considering...

For (1), this means two things:
 (a) Somehow, when calling out to C++, we will have to be able to recover the
     topmost IonFramePrefix. The easiest thing would be to compute its address
     and store it in the JSCompartment, as part of the calling convention. Then
     when starting a new Ion activation (ion::Cannon), make sure we save and
     restore the current top (that could be part of IonActivation).

 (b) We will need a snapshot at instructions which perform C++ calls, and a way
     to find this snapshot in C++. One idea: store a vector in IonScript mapping
     (jit return pc -> snapshot). If, given an IonFramePrefix, we can recover
     both the initial C++ call's return address, and the calling IonScript,
     we could bsearch the table for the right snapshot.

(2) Haven't given this much thought yet, will post more soon.

(3) Let's imagine that our C++ calls end up looking like this:
     lea [frame] -> reg
     mov reg -> &compartment.ionTop
     mov <ptr> -> reg
     call reg
     cmp reg, <failure condition>
     je <deoptimize>

There might be ways we can remove all that boilerplate. One idea is to require that calls out to the VM be explicitly enumerated, and then caching the boilerplate for each function. That is, generating wrappers around each VM call that can be re-used. For example, take...

     bool JSObject::setProperty(JSContext *cx, jsid id, Value v);

Since we really only want to call static functions, for simplicity, we would introduce a separate, static wrapper (we can drop |cx| in IonMonkey, it's in TLS):

     bool SetProperty(JSObject *obj, jsid id, Value v);

Then we could enumerate all available functions:

     enum VMFunctions {
         VM_GetProperty
     };
     struct VMSignature {
          uint32 argc;
          bool fallible;
          IonCode *wrapper;
     };
     Functions[VM_GetProperty] = { 3, true, NULL };

In CodeGeneration, we'd write calls like...

     masm.push(...);
     masm.push(...);
     masm.push(...);
     masm.callVM(VM_GetProperty);

In the MacroAssembler, callVM would create a wrapper for this function if it doesn't exist, and then emit a call to the wrapper instead, moving all the boilerplate to shared code and reducing memory usage.

A potential benefit of this is that the "push .. push .. call" pattern works across all platforms, and then the backend callVM function could move things into the native ABI. (Masm even has the ability to compute where the frame prefix is, though I'm not sure if it should, conceptually.)

Another open question is how to deal with outparams. For example, some functions return a Value through a pointer. My gut feeling is that it'd be easier if we made these functions return a Value instead, and encoded a failure with JSVAL_TYPE_MAGIC or something. But we can worry about this later.
Comment 1 Chris Leary [:cdleary] (not checking bugmail) 2011-08-19 22:35:06 PDT
(In reply to David Anderson [:dvander] from comment #0)
> Another open question is how to deal with outparams. For example, some
> functions return a Value through a pointer. My gut feeling is that it'd be
> easier if we made these functions return a Value instead, and encoded a
> failure with JSVAL_TYPE_MAGIC or something.

OOC, how well would that work with nunboxing? It would need to direct-return 2x the word size, which I imagine would create more calling convention headache.
Comment 2 Nicolas B. Pierron [:nbp] 2011-08-22 07:08:02 PDT
(In reply to David Anderson [:dvander] from comment #0)
> (3) Let's imagine that our C++ calls end up looking like this:
>      lea [frame] -> reg
>      mov reg -> &compartment.ionTop
>      mov <ptr> -> reg
>      call reg
>      cmp reg, <failure condition>
>      je <deoptimize>

Otherwise, call with immediate values (wrapper address) can be used instead of register call.  Thus, most of call prolog and epilog can be moved inside the wrapper.  This would replace the RelativePatch needed for the (&compartment.ionTop) by a RelativePatch needed for the wrapper address. (unless I made a wrong guess)

      ; we need to restore the frame pointer.
      call <ptr>

      ; could be moved in the wrapper because the Z flag is not reset by the
      ; ret/push/pop instructions.

      je <deoptimize>
      ; the bailout is the only statement which needs
      ; to be out of the wrapper function.

> 
> There might be ways we can remove all that boilerplate. One idea is to
> require that calls out to the VM be explicitly enumerated, and then caching
> the boilerplate for each function. That is, generating wrappers around each
> VM call that can be re-used.

Couldn't we use statically compiled wrapper by using naked functions ?

>      struct VMSignature {
>           uint32 argc;

Do we really need so many arguments?  Could we leave space to store information such as needs for bailouts, snapshots and frame traversal.

>           bool fallible;
>           IonCode *wrapper;
>      };

We have no needs for enumerated types, we can just declare the signatures as follow:

VMSignature VMFun_GetProperty = { 3, true, &wrapper_GetProperty };
Comment 3 David Anderson [:dvander] 2011-08-22 09:45:09 PDT
(In reply to Nicolas B. Pierron [:pierron] from comment #2)
> Otherwise, call with immediate values (wrapper address) can be used instead
> of register call.  Thus, most of call prolog and epilog can be moved inside
> the wrapper.

Yup, though on x64 there is no immediate 64-bit relative call.

> This would replace the RelativePatch needed for the
> (&compartment.ionTop) by a RelativePatch needed for the wrapper address.
> (unless I made a wrong guess)

Somehow, we'll need to traverse the set of Ion frames - is there a way to do that without saving the current IonFramePrefix?

>       je <deoptimize>
>       ; the bailout is the only statement which needs
>       ; to be out of the wrapper function.

Nice.

> Couldn't we use statically compiled wrapper by using naked functions ?

My experience is that that ends up being pretty difficult to maintain. You end up with a lot of duplication that ends up being a major pain to modify if you change an invariant somewhere. (See methodjit/MethodJIT.cpp for what happens :)

> Do we really need so many arguments?  Could we leave space to store
> information such as needs for bailouts, snapshots and frame traversal.

What do you mean by arguments?

> We have no needs for enumerated types, we can just declare the signatures as
> follow:
> 
> VMSignature VMFun_GetProperty = { 3, true, &wrapper_GetProperty };

Good point, though assuming the wrappers will be dynamically compiled, and shared, they'll be per-compartment GC things. So we'll need some way to enumerate wrappers in that case (see IonCompartment.h's somewhat ad-hoc table for bailouts).
Comment 4 Nicolas B. Pierron [:nbp] 2011-08-22 10:45:47 PDT
(In reply to David Anderson [:dvander] from comment #3)
> (In reply to Nicolas B. Pierron [:pierron] from comment #2)
> > Otherwise, call with immediate values (wrapper address) can be used instead
> > of register call.  Thus, most of call prolog and epilog can be moved inside
> > the wrapper.
> 
> Yup, though on x64 there is no immediate 64-bit relative call.
> 
> > This would replace the RelativePatch needed for the
> > (&compartment.ionTop) by a RelativePatch needed for the wrapper address.
> > (unless I made a wrong guess)
> 
> Somehow, we'll need to traverse the set of Ion frames - is there a way to do
> that without saving the current IonFramePrefix?

The frame pointer is (would be?) a register known at compile time.  Thus, it could be saved by the wrapper if the wrapped function needs it and if it does not erase it.

> >       je <deoptimize>
> >       ; the bailout is the only statement which needs
> >       ; to be out of the wrapper function.
> 
> Nice.

> > Do we really need so many arguments?  Could we leave space to store
> > information such as needs for bailouts, snapshots and frame traversal.
> 
> What do you mean by arguments?

using uint16/uint8 instead of uint32 for argc.

> > We have no needs for enumerated types, we can just declare the signatures as
> > follow:
> > 
> > VMSignature VMFun_GetProperty = { 3, true, &wrapper_GetProperty };
> 
> Good point, though assuming the wrappers will be dynamically compiled, and
> shared, they'll be per-compartment GC things. So we'll need some way to
> enumerate wrappers in that case (see IonCompartment.h's somewhat ad-hoc
> table for bailouts).

We could use a table in which newly created wrappers would be registered, but this will waste space.  Couldn't we allocate them in a space which is not covered by the GC, after all these wrappers will live forever ?
Comment 5 David Anderson [:dvander] 2011-08-22 11:11:15 PDT
> The frame pointer is (would be?) a register known at compile time.  Thus, it
> could be saved by the wrapper if the wrapped function needs it and if it
> does not erase it.

There's no frame pointer in an Ion frame - we try to use all 7 seven registers on x86.

> We could use a table in which newly created wrappers would be registered,
> but this will waste space.  Couldn't we allocate them in a space which is
> not covered by the GC, after all these wrappers will live forever ?

The size of the table (which itself could be lazily allocated, and could be a hash table) would be less than the size of all generated wrappers. Really all the table has to be is an array of IonCode*.

We could consider making them shared at a JSRuntime level (that is, one copy of each wrapper per-thread), but you still end up needing a way to free memory on shutdown. It'd also be a little limiting. You'd have to change ion::MacroAssembler to not generate an IonCode object (which is a gc-thing) and that limits the functionality available. At that point you can't emit instructions that reference other GC objects. You'd also have to be absolutely sure that you didn't need anything compartment-specific in the wrappers, since the code would be shared amongst all compartments.

With the table/cache approach, each slot would be a weakref and wrappers would go away naturally.
Comment 6 Nicolas B. Pierron [:nbp] 2011-08-23 11:00:09 PDT
(In reply to David Anderson [:dvander] from comment #3)
> Somehow, we'll need to traverse the set of Ion frames - is there a way to do
> that without saving the current IonFramePrefix?

We have multiple options to get the IonFramePrefix:

1/  We can extract the IonFrame Prefix, which can be determined with the current stack pointer shifted by the value of framePushed (last local variable space, AFAIU)

     lea [sp + $framePushed] -> reg ;= topFrame
     mov reg -> &compartment.ionTop

2/ We can push the $framePushed value to the wrapper which would extract the content later.

     ; inlined code
     push $(framePushed + argc * stackSlot + stackSlot)

     ... ; wrapped function args (sp -= argc * stackSlot)

     mov <ptr> -> reg
     call reg                  ; (sp -= stackSlot)
     ...


     ; wrapper prolog
     lea [sp + $(argc + stackSlot)] -> reg ;= framePushed + (argc + 1) * stackSlot
     lea [sp + reg * 1] -> reg             ;= topFrame
     mov reg -> &compartment.ionTop

To summarize the code outside the wrapper:

    push imm32 (5)
  vs.
    lea [reg + disp] -> reg (3)
    mov reg -> [immPtr]     (5)

3/ We can define the topFrame when opening it, but this require extra stack space to save the previous value of the topFrame, and extra resources to save and restore it.


(In reply to David Anderson [:dvander] from comment #5)
> We could consider making them shared at a JSRuntime level  (that is, one copy of each
> wrapper per-thread) ... You'd also have to be
> absolutely sure that you didn't need anything compartment-specific in the
> wrappers, since the code would be shared amongst all compartments.

If we do so, the second solution wrapper should use the TLS to fetch &cx->compartment.ionTop and store the top frame in it, which implies more memory manipulations and more code to be executed at each call.

Are VM calls frequent in term of space and execution?
Comment 7 David Anderson [:dvander] 2011-08-23 11:10:16 PDT
> 1/  We can extract the IonFrame Prefix, which can be determined with the
> current stack pointer shifted by the value of framePushed (last local
> variable space, AFAIU)
> 
> 2/ We can push the $framePushed value to the wrapper which would extract the
> content later.

Agree, these both seem better than #3.

> If we do so, the second solution wrapper should use the TLS to fetch
> &cx->compartment.ionTop and store the top frame in it, which implies more
> memory manipulations and more code to be executed at each call.

Luckily, we shouldn't need TLS to fetch the compartment - we're allowed to bake in the JSCompartment* directly into JIT code. SpiderMonkey object graphs are isolated into compartments and compartments are single-threaded.

> Are VM calls frequent in term of space and execution?

Execution: no :) ideally, we should have good type information that lets us avoid hitting C++. If we don't know types at *all*, we'll be required to call out to C++, but unless the operation is super complicated or highly polymorphic, we should be able to use inlining caching and avoid C++ on later executions.

Space: when IonMonkey is applying type information, probably not. I can get some semi-bogus measurements on this. When we don't have type information, we'll pretty much have calls for every operation, even if they're only run once or twice.
Comment 8 Nicolas B. Pierron [:nbp] 2011-08-24 10:33:56 PDT
(In reply to David Anderson [:dvander] from comment #0)
> IonMonkey needs to be able to call out to C++ from JIT code. The design
> space here is pretty big, but the major features we want are:
> 
>  (1) Inspecting Ion frames from inside C++, for garbage collection and
> possibly
> ...
> 
> For (1), this means two things:
>  (a) ...
>  (b) We will need a snapshot at instructions which perform C++ calls, and a
> way
>      to find this snapshot in C++. One idea: store a vector in IonScript
> mapping
>      (jit return pc -> snapshot). If, given an IonFramePrefix, we can recover
>      both the initial C++ call's return address, and the calling IonScript,
>      we could bsearch the table for the right snapshot.

1/ (Question)

What is the use of a snapshot mapping register/stack when these are mess-up by the code above the call?  I guess we need to bailout registers before making the call inside the wrapper, right?


2/ (Question)

Based on the following snippet of comment from IonBuilder.cpp, I guess that a snapshot must be took after each call, am I right?

        The snapshot model differs from TraceMonkey in that we do not need to
        take snapshots for every guard. Instead, we take snapshots at two
        critical points:
          * (1) At the beginning of every basic block.
          * (2) After every non-idempotent operation.

If the function has no return value (except the condition flags) or if it is ignored then is there any reason to take a new snapshot?


3/ (Suggestion)

Before making the call, we can modify the allocation space to add an undefined variable (which correspond to the expected result) at a pre-defined location.  Then we take a snapshot if the execution of the function can fail.

One of the benefit is that the space is reserved for expected results, independently of the result size.


4/ (Suggestion)

Another advantage of taking the snapshot before making the call is that we can reach the snapshot offset by looking at the generated code (in case of fallible functions).  This avoid adding additional data structures to map return address to the corresponding snapshot offset.  Here is how we can proceed:

Based on the return address and with relativePatch ...

    ; call the wrapper
    mov rax, <wrapper ptr>
    call rax

    ; resote possibly unaligned stack
    pop rsp

    ; in case of failure, jump ...
    je <deoptimize>

... we can extract the deoptimize label.  By following it we will reach the code generated by visitOutOfLineBailout function whichc contains the snapshot offset.

    push snapshotOffset
    jmp deoptLabel_

(early optimization, I know)

In case of non-fallible functions, we would need another mean to retrieve the snapshot offset.
Comment 9 David Anderson [:dvander] 2011-08-24 11:30:34 PDT
> What is the use of a snapshot mapping register/stack when these are mess-up
> by the code above the call?  I guess we need to bailout registers before
> making the call inside the wrapper, right?

The reason the snapshot is needed is because calls out to C++ may need to inspect Ion frames, either to build an exception backtrace or to collect garbage. The snapshot is the record which maps CPU stack slots/registers to interpreter js::Values.

As for snapshots and spilling registers, most of this infrastructure will come with bug 670484 so I wouldn't worry about it yet.

> Based on the following snippet of comment from IonBuilder.cpp, I guess that
> a snapshot must be took after each call, am I right?

Yes.

> If the function has no return value (except the condition flags) or if it is
> ignored then is there any reason to take a new snapshot?

If we know that it is idempotent, infallible, and cannot trigger GC, then we can avoid taking a snapshot in MIR, and attaching a bailout in LIR.

> ... we can extract the deoptimize label.  By following it we will reach the
> code generated by visitOutOfLineBailout function whichc contains the
> snapshot offset.
> 
>     push snapshotOffset
>     jmp deoptLabel_
> 
> (early optimization, I know)

The GC has to be able to inspect ion frames - but maybe we could be clever and sniff out the bailout by inspecting the |je failure ... push snapshotOffset| pattern :)

Note that on x86 we avoid having OOL bailouts, and instead jump into a table, but the same trick would work as long as the IonFramePrefix can be recovered.
Comment 10 Nicolas B. Pierron [:nbp] 2011-08-29 09:25:16 PDT
Created attachment 556575 [details] [diff] [review]
Experiment C++ calls from CodeGenerator.
Comment 11 Nicolas B. Pierron [:nbp] 2011-08-29 09:28:02 PDT
(In reply to Nicolas B. Pierron [:pierron] from comment #10)
> Created attachment 556575 [details] [diff] [review]
> Experiment C++ calls from CodeGenerator.

This patch is compiling on top of Bug 670484 patches.
I did not test yet because I have to investigate how to reserve registers for a LIR and on how/where snapshots are produced.
Comment 12 David Anderson [:dvander] 2011-08-29 17:08:49 PDT
Comment on attachment 556575 [details] [diff] [review]
Experiment C++ calls from CodeGenerator.

Review of attachment 556575 [details] [diff] [review]:
-----------------------------------------------------------------

Nice work so far, looks like it's on the right track.

::: js/src/ion/IonCompartment.h
@@ +137,5 @@
> +
> +    static void setTopFrame(IonFramePrefix* top) {
> +        JSContext *cx = GetIonContext()->cx;
> +        IonCompartment *ioncompartment = cx->compartment->ionCompartment();
> +        ioncompartment->topFrame_ = top;

I would implement this in Ion.cpp to avoid bringing Ion.h into this header.

::: js/src/ion/shared/CodeGenerator-x86-shared.cpp
@@ +135,5 @@
> +    uint32 size_descriptor = stack_size << 1;
> +
> +    // Construct the IonFramePrefix.
> +    masm.push(Imm32(0x0)); // TODO calleeToken
> +    masm.push(Imm32(size_descriptor));

You don't technically need to construct an IonFramePrefix here, just store the current one (which is &[esp - framePushed]) into the IonCompartment.

@@ +146,5 @@
> +
> +    // Call the wrapper function.  The wrapper is in charge to unwind the
> +    // stack when returning from the call.
> +    masm.movePtr(ImmWord(f.wrapper), rax);
> +    masm.call(rax);

Here I would introduce (if it's not already there), something like masm.call(f.wrapper). IonCode* are GC pointers so we have to handle them specially when embedding them into the code stream. See Assembler-x86.h::jmp(void *target, RelocationKind kind) - for GC things, we store relocation information so the GC can trace and update those pointers.

I would advocate introducing a jmp(IonCode *) helper, and also similar functions for |call|.

::: js/src/ion/shared/CodeGenerator-x86-shared.h
@@ +48,5 @@
>  namespace ion {
>  
>  class OutOfLineBailout;
>  
> +struct VMFunction

This structure probably wants to be in Assembler-shared, we might want to use it outside of CodeGenerator-*

@@ +103,5 @@
>      CodeGeneratorX86Shared(MIRGenerator *gen, LIRGraph &graph);
>  
>    public:
> +    bool callVM(VMFunction &f, LSnapshot *snapshot);
> +    virtual bool wrapVMFunction(VMFunction &f) = 0;

I think you can make this non-virtual.

::: js/src/ion/x64/CodeGenerator-x64.cpp
@@ +60,5 @@
>  }
>  
> +
> +bool
> +CodeGeneratorX64::wrapVMFunction(VMFunction& f)

I would move the wrapVM functions to Trampolines-$(ARCH) if possible, hanging them off the IonCompartment. We might need to wrap functions outside of the code generator, for example if we're using MacroAssembler outside of the main Ion compiler.

@@ +80,5 @@
> +
> +    masm.mov(rsp, rax);
> +    masm.setupUnalignedABICall(1, rcx);
> +    masm.setABIArg(0, rax);
> +    masm.callWithABI(JS_FUNC_TO_DATA_PTR(void *, IonCompartment::setTopFrame));

You can completely inline this if you want, which might be easier - you can bake in the current IonCompartment pointer (it will never change for a given script). Then you might not even need setTopFrame.
Comment 13 Nicolas B. Pierron [:nbp] 2011-08-30 09:48:52 PDT
(In reply to David Anderson [:dvander] from comment #12)
> @@ +146,5 @@
> > +
> > +    // Call the wrapper function.  The wrapper is in charge to unwind the
> > +    // stack when returning from the call.
> > +    masm.movePtr(ImmWord(f.wrapper), rax);
> > +    masm.call(rax);
> 
> Here I would introduce (if it's not already there), something like
> masm.call(f.wrapper). IonCode* are GC pointers so we have to handle them
> specially when embedding them into the code stream. See
> Assembler-x86.h::jmp(void *target, RelocationKind kind) - for GC things, we
> store relocation information so the GC can trace and update those pointers.
> 
> I would advocate introducing a jmp(IonCode *) helper, and also similar
> functions for |call|.

callWithABI already implement this with extra.  I've looked at jumps implementation, we either need to implement rip-call to register the relative address inside a RelativePatch, or need to handle ImmGCPtr (unless I wrong and they are already handled).
Comment 14 Nicolas B. Pierron [:nbp] 2011-08-30 09:51:31 PDT
Created attachment 556886 [details] [diff] [review]
Handle IonCode pieces in ${ARCH}/Assembler-${ARCH}.h

This patch brings call of IonCode* inside the Assembler.  It implements "call imm32" on x86 and "mov ImmGCWord, reg   call reg" for x64.
Comment 15 David Anderson [:dvander] 2011-08-30 18:37:10 PDT
Comment on attachment 556886 [details] [diff] [review]
Handle IonCode pieces in ${ARCH}/Assembler-${ARCH}.h

Review of attachment 556886 [details] [diff] [review]:
-----------------------------------------------------------------

Looks good, a few nits:

::: js/src/ion/x64/Assembler-x64.h
@@ +434,5 @@
> +    }
> +
> +    void call(void *target,
> +              Relocation::Kind reloc = Relocation::EXTERNAL,
> +              const Register &scratch = ReturnReg) {

You can remove the third parameter (here and on x86), since I think scratch will always be ReturnReg.

@@ +445,5 @@
> +            movq(ImmWord(target), scratch);
> +        masm.call(scratch.code());
> +    }
> +
> +    void call(IonCode *target, const Register &scratch = ReturnReg) {

Same.

::: js/src/ion/x86/Assembler-x86.h
@@ +221,5 @@
>      // Actual assembly emitting functions.
>  
>      void movl(const ImmGCPtr &ptr, const Register &dest) {
>          masm.movl_i32r(ptr.value, dest.code());
> +        // TODO: Honor ImmGCPtr by indexing this offset.

I just filed bug 683407 for this case - could you cite the # next to the :TODO:?

@@ +261,5 @@
>          JmpSrc src = masm.jmp();
>          addPendingJump(src, target, reloc);
>      }
> +    void j(Condition cond, void *target,
> +           Relocation::Kind reloc = Relocation::EXTERNAL) {

This fits on one line (we wrap to 100 columns, or something close).
Comment 16 Nicolas B. Pierron [:nbp] 2011-08-31 10:33:53 PDT
Created attachment 557232 [details] [diff] [review]
Assembler Call & NearCall for x86/x86_64

Apply comments & Add nearCall implementation for x86_64.
Comment 17 David Anderson [:dvander] 2011-09-15 10:00:24 PDT
Comment on attachment 557232 [details] [diff] [review]
Assembler Call & NearCall for x86/x86_64

Review of attachment 557232 [details] [diff] [review]:
-----------------------------------------------------------------

Sorry for the late review!
Comment 18 Nicolas B. Pierron [:nbp] 2011-09-15 10:10:54 PDT
Created attachment 560384 [details] [diff] [review]
Assembler Call for x86/x86_64.
Comment 19 Nicolas B. Pierron [:nbp] 2011-09-16 16:27:34 PDT
Created attachment 560647 [details] [diff] [review]
Implements C++ calls.

This patch depends on modification done for Bug 685841 and includes modifications made for Bug 685228.
Comment 20 David Anderson [:dvander] 2011-09-21 17:54:19 PDT
Comment on attachment 560647 [details] [diff] [review]
Implements C++ calls.

Review of attachment 560647 [details] [diff] [review]:
-----------------------------------------------------------------

Sorry for the late review - looks like this is coming along though we should figure out how to deal with failures before landing.

::: js/src/ion/Ion.cpp
@@ +154,5 @@
> +    functionWrappers_ = cx->new_<VMWrapperMap>(cx);
> +    // :TODO: determine which initialization size fits best the usage of the
> +    // WeakMap. (at worst the number of VMFunction, in which case a WeakMap may
> +    // not be the best solution)
> +    if (!functionWrappers_ || !functionWrappers_->init())

I would just remove this :TODO: - I doubt the initialization size matters, and if it does it'll show up somewhere.

@@ +201,5 @@
>          if (bailoutTables_[i] && IsAboutToBeFinalized(cx, bailoutTables_[i]))
>              bailoutTables_[i] = NULL;
>      }
> +
> +    // Do not sweep content of the weakMap because weakMap::sweep is private.

What is responsible for sweeping the weakMap?

::: js/src/ion/IonCompartment.h
@@ +59,5 @@
> +
> +// Specialize js::MarkOf to use VMFunction * as a key of a WeakMap.
> +template <>
> +class MarkOf<ion::VMFunction *> {
> +  public:

House style is { on newline for class/struct declarations.

@@ +71,5 @@
> +
> +// Specialize js::MarkOf to use IonCode * as a value of a WeakMap.
> +template <>
> +class MarkOf<ion::IonCode *> {
> +  public:

Same here.

@@ +100,5 @@
>  class IonCompartment
>  {
> +    typedef WeakMap<VMFunction *, IonCode *,
> +                    DefaultHasher<VMFunction *>,
> +                    MarkKeyPolicy<VMFunction *, IonCode *> > VMWrapperMap;

Nice!

@@ +111,4 @@
>      IonActivation *active_;
>  
>      // Trampoline for entering JIT code.
> +    IonFramePrefix *topFrame_;

What is the difference between this and topCFrame? If we can derive topFrame_ from topCFrame, does it need to be an explicit member?

@@ +129,5 @@
> +    VMWrapperMap *functionWrappers_;
> +
> +  public:
> +    // Trampoline for entering C++ code.
> +    IonCFrameData *topCFrame;

Better to make private, and have topCFrame_, and an accessor. To expose offsetof you can do:

    static inline ptrdiff_t offsetOfTopCFrame() {
        return offsetof(IonCompartment, topCFrame_);
    }

::: js/src/ion/IonFrames.h
@@ +75,5 @@
> +
> +// This Frame is constructed when JS jited code calls a C function.
> +struct IonCFrameData
> +{
> +    ::js::Value result;

You shouldn't need ::js::Value here.

::: js/src/ion/IonRegisters.h
@@ +68,5 @@
> +# define JS_USED_REG_SAVE(name) \
> +    uint32 name = usedMask;
> +# define JS_USED_REG_RESTORE(name) \
> +    usedMask = name;
> +#else

I like the intent, though I'm skeptical of this direction. We tried this in early prototypes of JaegerMonkey and found a few disadvantages. It made surrounding code more verbose without ever catching any bugs. The problem is that uses of registers are not checked - you're limited to explaining your local register state through asserts, but the asserts won't catch actual register misuse.

What you could do instead, for cases where it applies, is use a local RegisterSet to allocate registers. It asserts internally, and for cases where you do hardcode registers, you could explicitly take them from the set. I.e. instead of:

    JS_USED_REG_DECL;

    JS_BIND_REG(ecx);
    masm.stuff(ecx);
    JS_FREE_REG(ecx);

You could end up with:
    TypedRegisterSet<Register> regs;

    Register temp = regs.takeAny();
    masm.stuff(temp);
    regs.free(temp);

This is verbose too - but in a complicated function can help.

::: js/src/ion/shared/CodeGenerator-x86-shared.h
@@ +103,4 @@
>      CodeGeneratorX86Shared(MIRGenerator *gen, LIRGraph &graph);
>  
>    public:
> +    bool callVM(VMFunction &f, LSnapshot *snapshot);

Can this be const?

::: js/src/ion/x64/Assembler-x64.h
@@ +500,5 @@
> +    if (arg < NumArgRegs)
> +        return false;
> +    *disp = (arg - NumArgRegs) * STACK_SLOT_SIZE + ShadowStackSpace;
> +    return true;
> +}

Boolean return isn't used - probably better to just assert.

::: js/src/ion/x64/Trampoline-x64.cpp
@@ +301,5 @@
> +    if (p)
> +        return p->value;
> +
> +    // Generate a separated code for the wrapper.
> +    MacroAssembler masm;

should be masm(cx), here and on x86

@@ +311,5 @@
> +    // convention used)
> +    if (!GetArgReg(0, &arg0))
> +        arg0 = rbx;
> +    if (!GetArgReg(1, &arg1))
> +        arg1 = rcx;

When would GetArgReg fail here?

@@ +320,5 @@
> +    //    frameSize
> +    //    ReturnAddr
> +
> +    // Reserve space for the rest of the IonCFrame.
> +    masm.subq(Imm32(offsetof(IonCFrameData, returnAddress)), rsp);

It took me a minute to understand this offsetof() - clever :) though pushing might be more explicit.

@@ +355,5 @@
> +    masm.setABIArg(1, MoveOperand(arg1));
> +    masm.setABIArg(0, MoveOperand(arg0));
> +    masm.callWithABI(JS_FUNC_TO_DATA_PTR(void *, IonCompartment::setTopFrame));
> +    JS_FREE_REG(arg0);
> +    JS_FREE_REG(arg1);

It's safe to inline this - you can do:

  masm.movq(ImmGCPtr(ioncompartment), reg);
  masm.movq(Operand(reg, IonCompartment::offsetOfTopCFrame()), reg2);
  ... etc ....

@@ +409,5 @@
> +    // Check for errors and set the Z flag.
> +    Assembler::Condition c = masm.testError(Assembler::Equal, ValueOperand(JSReturnReg));
> +
> +    // If the following is failing, update "callVM"'s bailoutIf condition.
> +    JS_ASSERT(c == Assembler::Equal);

It looks like this doesn't get used? What happens on failure?

::: js/src/ion/x86/Assembler-x86.h
@@ +341,5 @@
> +GetArgStackSlot(uint32 arg, uint32 *disp)
> +{
> +    *disp = arg * STACK_SLOT_SIZE;
> +    return true;
> +}

It looks like the boolean return here isn't used.

::: js/src/ion/x86/Trampoline-x86.cpp
@@ +385,5 @@
> +    masm.setABIArg(1, MoveOperand(arg1));
> +    masm.setABIArg(0, MoveOperand(arg0));
> +    masm.callWithABI(JS_FUNC_TO_DATA_PTR(void *, IonCompartment::setTopFrame));
> +    JS_FREE_REG(arg0);
> +    JS_FREE_REG(arg1);

Same thing, you should be able to inline here.
Comment 21 Nicolas B. Pierron [:nbp] 2011-09-22 05:38:24 PDT
(In reply to David Anderson [:dvander] from comment #20)
> Comment on attachment 560647 [details] [diff] [review]
> Implements C++ calls.
> 
> Review of attachment 560647 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> Sorry for the late review - looks like this is coming along though we should
> figure out how to deal with failures before landing.

Indeed, the patch is still missing way to read the snapshot and may need an extra space for setting it somewhere.

Should the wrapper save all registers?  Otherwise I have no **simple** idea how we could do the introspection of the registers referenced by the snapshot.

> ::: js/src/ion/Ion.cpp
> @@ +154,5 @@
> > +    functionWrappers_ = cx->new_<VMWrapperMap>(cx);
> > +    // :TODO: determine which initialization size fits best the usage of the
> > +    // WeakMap. (at worst the number of VMFunction, in which case a WeakMap may
> > +    // not be the best solution)
> > +    if (!functionWrappers_ || !functionWrappers_->init())
> 
> I would just remove this :TODO: - I doubt the initialization size matters,
> and if it does it'll show up somewhere.
> 
> @@ +201,5 @@
> >          if (bailoutTables_[i] && IsAboutToBeFinalized(cx, bailoutTables_[i]))
> >              bailoutTables_[i] = NULL;
> >      }
> > +
> > +    // Do not sweep content of the weakMap because weakMap::sweep is private.
> 
> What is responsible for sweeping the weakMap?

WeakMap are swept by WeakMapBase::sweepAll function which is called inside jsgc.cpp by static MarkAndSweep function.

> ::: js/src/ion/IonCompartment.h
> @@ +59,5 @@
> > +
> > +// Specialize js::MarkOf to use VMFunction * as a key of a WeakMap.
> > +template <>
> > +class MarkOf<ion::VMFunction *> {
> > +  public:
> 
> House style is { on newline for class/struct declarations.
> 
> @@ +71,5 @@
> > +
> > +// Specialize js::MarkOf to use IonCode * as a value of a WeakMap.
> > +template <>
> > +class MarkOf<ion::IonCode *> {
> > +  public:
> 
> Same here.

This patch is going to be reviewed in Bug 687843.

> @@ +100,5 @@
> >  class IonCompartment
> >  {
> > +    typedef WeakMap<VMFunction *, IonCode *,
> > +                    DefaultHasher<VMFunction *>,
> > +                    MarkKeyPolicy<VMFunction *, IonCode *> > VMWrapperMap;
> 
> Nice!

That's my goal ;)

> @@ +111,4 @@
> >      IonActivation *active_;
> >  
> >      // Trampoline for entering JIT code.
> > +    IonFramePrefix *topFrame_;
> 
> What is the difference between this and topCFrame? If we can derive
> topFrame_ from topCFrame, does it need to be an explicit member?

My mistake I forgot to remove it.  The idea was to keep everything related to hosting the next C Frame inside the IonCFrame data structure.

> @@ +129,5 @@
> > +    VMWrapperMap *functionWrappers_;
> > +
> > +  public:
> > +    // Trampoline for entering C++ code.
> > +    IonCFrameData *topCFrame;
> 
> Better to make private, and have topCFrame_, and an accessor. To expose
> offsetof you can do:
> 
>     static inline ptrdiff_t offsetOfTopCFrame() {
>         return offsetof(IonCompartment, topCFrame_);
>     }

(1) At first I prefered to use a function call instead of writing values from the assembly to prevent difficulties of extending it.  But sure, I can inline it, and use cx & ioncompartment directly.

In fact I think I would extend the IonCFrame to contain references to the topFrame, the IonCompartment and the JSContext and then call the VM function with this extra argument.

> ::: js/src/ion/IonFrames.h
> @@ +75,5 @@
> > +
> > +// This Frame is constructed when JS jited code calls a C function.
> > +struct IonCFrameData
> > +{
> > +    ::js::Value result;
> 
> You shouldn't need ::js::Value here.

You mean, I don't need the namespace or I don't need the Value ?  I used the value as a mean to store the result on the stack.  Manipulating JSReturnReg register(s) inside the wrapped function is unsafe and cause more bloat code for handling arch-specific registers.  Thus I allocate stack space for the JS return value.

> ::: js/src/ion/IonRegisters.h
> @@ +68,5 @@
> > +# define JS_USED_REG_SAVE(name) \
> > +    uint32 name = usedMask;
> > +# define JS_USED_REG_RESTORE(name) \
> > +    usedMask = name;
> > +#else
> 
> I like the intent, though I'm skeptical of this direction. We tried this in
> early prototypes of JaegerMonkey and found a few disadvantages. It made
> surrounding code more verbose without ever catching any bugs. The problem is
> that uses of registers are not checked - you're limited to explaining your
> local register state through asserts, but the asserts won't catch actual
> register misuse.

The idea was to prevent collision with the registers used by the calling convention.  Such registers may cause problems when switching to another call convention.  This is not a perfect system because it does not account of register mess created by function calls.  The final goal was to work with it during Code Generation, in which case we could be warn about wrong assumptions of register usage.

> What you could do instead, for cases where it applies, is use a local
> RegisterSet to allocate registers. It asserts internally, and for cases
> where you do hardcode registers, you could explicitly take them from the
> set. I.e. instead of:
> 
>     JS_USED_REG_DECL;
> 
>     JS_BIND_REG(ecx);
>     masm.stuff(ecx);
>     JS_FREE_REG(ecx);
> 
> You could end up with:
>     TypedRegisterSet<Register> regs;
> 
>     Register temp = regs.takeAny();
>     masm.stuff(temp);
>     regs.free(temp);
> 
> This is verbose too - but in a complicated function can help.

Ok, I will switch to it.  I hope constant propagation can get rid of it in optimized builds.

> ::: js/src/ion/shared/CodeGenerator-x86-shared.h
> @@ +103,4 @@
> >      CodeGeneratorX86Shared(MIRGenerator *gen, LIRGraph &graph);
> >  
> >    public:
> > +    bool callVM(VMFunction &f, LSnapshot *snapshot);
> 
> Can this be const?

I already made changes to it to have a constant VMFunction, but don't I think I can also make the function as constant as it produce the assembly code for calling the wrapper and it encodes the snapshot.

> ::: js/src/ion/x64/Assembler-x64.h
> @@ +500,5 @@
> > +    if (arg < NumArgRegs)
> > +        return false;
> > +    *disp = (arg - NumArgRegs) * STACK_SLOT_SIZE + ShadowStackSpace;
> > +    return true;
> > +}
> 
> Boolean return isn't used - probably better to just assert.

I agree.

> ::: js/src/ion/x64/Trampoline-x64.cpp
> @@ +301,5 @@
> > +    if (p)
> > +        return p->value;
> > +
> > +    // Generate a separated code for the wrapper.
> > +    MacroAssembler masm;
> 
> should be masm(cx), here and on x86

Isn't "linker.newCode(cx);" enough? I copied this pieces of code from the IonCompartment::generateBailoutHandler function.  Thus I guess this function has the same issue.

> @@ +311,5 @@
> > +    // convention used)
> > +    if (!GetArgReg(0, &arg0))
> > +        arg0 = rbx;
> > +    if (!GetArgReg(1, &arg1))
> > +        arg1 = rcx;
> 
> When would GetArgReg fail here?

It could fail if the call convention does not provide any register for the argument index.  If it fails, I fallback on an other register.  This is made to prevent extra moves when calling C functions.  This unlikely to fail on x64 (because call convention are using more than 2 registers for the arguments), but the same principle can apply for x86 and other arch.

> @@ +320,5 @@
> > +    //    frameSize
> > +    //    ReturnAddr
> > +
> > +    // Reserve space for the rest of the IonCFrame.
> > +    masm.subq(Imm32(offsetof(IonCFrameData, returnAddress)), rsp);
> 
> It took me a minute to understand this offsetof() - clever :) though pushing
> might be more explicit.

I did it first but I found it quite hard to follow because pushes are not named and you have to rely on comments to understand.  I found offsetof(struct, field) easier to interpret and to follow, because you know that *struct* apply to the base register and *field* is the access value.

// No need to take care of the order of the fields and instruction are talkative.
masm.mov(rax, Operand(rsp, offset(Struct, field1)));
masm.mov(rbx, Operand(rsp, offset(Struct, field2)));

vs.

// Need to take care of the order of the fields. (may break if the addresses are changing)
masm.push(rax); // ((Struct*) rsp)->field1 = rax
masm.push(rbx); // ((Struct*) rsp)->field2 = rbx

> @@ +355,5 @@
> > +    masm.setABIArg(1, MoveOperand(arg1));
> > +    masm.setABIArg(0, MoveOperand(arg0));
> > +    masm.callWithABI(JS_FUNC_TO_DATA_PTR(void *, IonCompartment::setTopFrame));
> > +    JS_FREE_REG(arg0);
> > +    JS_FREE_REG(arg1);
> 
> It's safe to inline this - you can do:
> 
>   masm.movq(ImmGCPtr(ioncompartment), reg);
>   masm.movq(Operand(reg, IonCompartment::offsetOfTopCFrame()), reg2);
>   ... etc ....

Ok. See (1) for explanation of this choice.

> @@ +409,5 @@
> > +    // Check for errors and set the Z flag.
> > +    Assembler::Condition c = masm.testError(Assembler::Equal, ValueOperand(JSReturnReg));
> > +
> > +    // If the following is failing, update "callVM"'s bailoutIf condition.
> > +    JS_ASSERT(c == Assembler::Equal);
> 
> It looks like this doesn't get used? What happens on failure?

Indeed the condition is not used, (I should use DebugOnly<...>) but it asserts that the bailout which is done out-side the wrapper use the right condition.

> ::: js/src/ion/x86/Assembler-x86.h
> @@ +341,5 @@
> > +GetArgStackSlot(uint32 arg, uint32 *disp)
> > +{
> > +    *disp = arg * STACK_SLOT_SIZE;
> > +    return true;
> > +}
> 
> It looks like the boolean return here isn't used.

Indeed, same as on x64.

> ::: js/src/ion/x86/Trampoline-x86.cpp
> @@ +385,5 @@
> > +    masm.setABIArg(1, MoveOperand(arg1));
> > +    masm.setABIArg(0, MoveOperand(arg0));
> > +    masm.callWithABI(JS_FUNC_TO_DATA_PTR(void *, IonCompartment::setTopFrame));
> > +    JS_FREE_REG(arg0);
> > +    JS_FREE_REG(arg1);
> 
> Same thing, you should be able to inline here.

Ok.
Comment 22 Nicolas B. Pierron [:nbp] 2011-09-22 12:26:18 PDT
(In reply to Nicolas B. Pierron [:pierron] from comment #21)
> (In reply to David Anderson [:dvander] from comment #20)
> > 
> > Sorry for the late review - looks like this is coming along though we should
> > figure out how to deal with failures before landing.
> 
> Indeed, the patch is still missing way to read the snapshot and may need an
> extra space for setting it somewhere.
> 

Ok, I manage to dump register and restore only registers which have been used by the wrapper and the inner functions.  I would have to modify the IonFrameIterator to accept IonCFrame in addition to BailoutEnvironment.

By doing so, I remarked that the IonCFrame is similar to a BailoutStack except that the BailoutStack does not contain a return address.  I tried unsuccessful attempts (ugly) to factor IonCFrame with Bailouts.  The current version is still ugly.  If you have any suggestion that I can try.

I wonder if it would not be best to pass the JSContext to the VMFunction, in order to avoid getting it from the Local Thread Storage inside the wrapped function.

When I look at the generated code they are still lots of places where we can optimize the code such as optional iteration of frames (do not save all registers, remove pushing the snapshot offset in the IonScript), no float operation (avoid saving xmm registers), known output type (select VMFunction to avoid boxing/unboxing), using ReturnReg to provide status information instead of comparing it with MAGIC Values (2), etc ...
Comment 23 David Anderson [:dvander] 2011-09-22 13:48:48 PDT
(In reply to Nicolas B. Pierron [:pierron] from comment #21)
> Should the wrapper save all registers?  Otherwise I have no **simple** idea
> how we could do the introspection of the registers referenced by the
> snapshot.

The register allocator should take care of this for you - so you don't have to worry about it. If a LIR is marked as a call, all registers will be spilled - the snapshot would not have any registers in it.


> You mean, I don't need the namespace or I don't need the Value ?

Sorry, I typed too much :) yeah I meant the namespace.

> The idea was to prevent collision with the registers used by the calling
> convention.  Such registers may cause problems when switching to another
> call convention.  This is not a perfect system because it does not account
> of register mess created by function calls.  The final goal was to work with
> it during Code Generation, in which case we could be warn about wrong
> assumptions of register usage.

Using argument registers should be okay - for example if you put the contents of arg0 in ArgReg1, and the contents of arg1 in ArgReg0, move resolution will just swap them.

> Ok, I will switch to it.  I hope constant propagation can get rid of it in
> optimized builds.

The operations are very fast, so it's not a big deal - generating these stubs won't be the critical path.

> I already made changes to it to have a constant VMFunction

Sorry, yeah, that's what I meant.

> > should be masm(cx), here and on x86
> 
> Isn't "linker.newCode(cx);" enough? I copied this pieces of code from the
> IonCompartment::generateBailoutHandler function.  Thus I guess this function
> has the same issue.

Ah, okay, I think it's fine. The "cx" is only for when an IonContext isn't guaranteed to exist.

> It could fail if the call convention does not provide any register for the
> argument index.  If it fails, I fallback on an other register.  This is made
> to prevent extra moves when calling C functions.  This unlikely to fail on
> x64 (because call convention are using more than 2 registers for the
> arguments), but the same principle can apply for x86 and other arch.

It's probably easier to assume the minimal number of calling conventions: on x64 that's Win64 versus Everything Else, and on x86 we can ignore fastcall.

> I did it first but I found it quite hard to follow because pushes are not
> named and you have to rely on comments to understand.

Good point!
Comment 24 Nicolas B. Pierron [:nbp] 2011-09-23 08:12:50 PDT
(In reply to David Anderson [:dvander] from comment #23)
> (In reply to Nicolas B. Pierron [:pierron] from comment #21)
> > Should the wrapper save all registers?  Otherwise I have no **simple** idea
> > how we could do the introspection of the registers referenced by the
> > snapshot.
> 
> The register allocator should take care of this for you - so you don't have
> to worry about it. If a LIR is marked as a call, all registers will be
> spilled - the snapshot would not have any registers in it.

Doing so move instructions from the IonCode to the wrapper, this make the IonCode smaller which is good for slow paths and in addition, this reduce the size of the IonCode and the computation of the register allocator.  On the other hand, many unused registers may be save and restore by the wrapper.

I am worried about the size of the code inside the IonCode and the induced cost of saving/restoring registers inside the IonCode and inside the wrapper.

May be we should save registers and restore them inside the wrapper and ask the register allocator to push the mask of used registers to the wrapper (need 5 bytes per general register for saving/restoring [shr, jp, push]).
Comment 25 Nicolas B. Pierron [:nbp] 2011-09-23 10:07:51 PDT
(In reply to Nicolas B. Pierron [:pierron] from comment #24)
> (In reply to David Anderson [:dvander] from comment #23)
> > (In reply to Nicolas B. Pierron [:pierron] from comment #21)
> > > Should the wrapper save all registers?  Otherwise I have no **simple** idea
> > > how we could do the introspection of the registers referenced by the
> > > snapshot.
> > 
> > The register allocator should take care of this for you - so you don't have
> > to worry about it. If a LIR is marked as a call, all registers will be
> > spilled - the snapshot would not have any registers in it.
> 
> Doing so move instructions from the IonCode to the wrapper, this make the
> IonCode smaller which is good for slow paths and in addition, this reduce
> the size of the IonCode and the computation of the register allocator.  On
> the other hand, many unused registers may be save and restore by the wrapper.

In addition (in favor to spilling inside the IonCode), if multiple calls are followed, this may take less time because it won't have to save & restore all registers at each call.  But this could have an undesired impact on calls which do not need to iterate on stack frames.  For these calls we need to save the minimum set of registers and let them save & restore by the wrapper.

One example would be:

for (...; i++)
  f(i); // <-- C+ call created

You don't want to save and restore "i".  The register allocator should be aware of the wrapper registers and use non-volatile registers which are not used by the wrapper.

> I am worried about the size of the code inside the IonCode and the induced
> cost of saving/restoring registers inside the IonCode and inside the wrapper.
> 
> May be we should save registers and restore them inside the wrapper and ask
> the register allocator to push the mask of used registers to the wrapper
> (need 5 bytes per general register for saving/restoring [shr, jp, push]).
Comment 26 Nicolas B. Pierron [:nbp] 2011-09-29 03:58:35 PDT
Created attachment 563365 [details] [diff] [review]
[0002] Implements C++ calls.

Tested on x64 and it supports:
- Any number of argument.
- Return js::Value.
- Iterate/Inspect ion frames.
Comment 27 David Anderson [:dvander] 2011-10-12 21:56:38 PDT
Comment on attachment 563365 [details] [diff] [review]
[0002] Implements C++ calls.

Review of attachment 563365 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/ion/IonFramesAnalysis.h
@@ +109,5 @@
>      }
>  };
>  
> +// This Frame is constructed when JS jited code calls a C function.
> +struct IonCCallStack

Name nit: IonCFrame (A call stack would be a set of frames)

::: js/src/ion/IonRegisters.h
@@ +224,4 @@
>          JS_ASSERT(!has(reg));
>          addUnchecked(reg);
>      }
> +    bool someAllocated(uint32 bits) const {

Nit: Safer to take in a RegisterSet.

::: js/src/ion/x86/Trampoline-x86.cpp
@@ +425,5 @@
> +
> +    masm.ret();
> +
> +    Linker linker(masm);
> +    IonCode *wrapper = linker.newCode(cx);

wrapper must be NULL checked here.

@@ +426,5 @@
> +    masm.ret();
> +
> +    Linker linker(masm);
> +    IonCode *wrapper = linker.newCode(cx);
> +    functionWrappers_->add(p, &f, wrapper);

And I think this can return false - comments apply to x86 as well.
Comment 28 Nicolas B. Pierron [:nbp] 2011-10-13 06:14:44 PDT
(In reply to David Anderson [:dvander] from comment #27)
> Comment on attachment 563365 [details] [diff] [review] [diff] [details] [review]
> [0002] Implements C++ calls.
> 
> Review of attachment 563365 [details] [diff] [review] [diff] [details] [review]:
> -----------------------------------------------------------------
> 
> ::: js/src/ion/IonRegisters.h
> @@ +224,4 @@
> >          JS_ASSERT(!has(reg));
> >          addUnchecked(reg);
> >      }
> > +    bool someAllocated(uint32 bits) const {
> 
> Nit: Safer to take in a RegisterSet.

Safer ?  What do you mean?
Comment 29 Nicolas B. Pierron [:nbp] 2011-10-13 07:11:34 PDT
Do not merge these patch yet, it won't work until we get rid/or merge its dependencies.
Comment 30 David Anderson [:dvander] 2011-10-17 17:32:24 PDT
> > > +    bool someAllocated(uint32 bits) const {
> > 
> > Nit: Safer to take in a RegisterSet.
> 
> Safer ?  What do you mean?

Enum values and raw uint32s will coerce here without warning, a problem that had real bugs in JM. RegisterSet discourages using raw bits.
Comment 31 Nicolas B. Pierron [:nbp] 2011-10-18 08:34:05 PDT
Created attachment 567757 [details] [diff] [review]
Assembler Call/Jump for x86/x86_64.

Update. (previous patch r=dvander)
Comment 32 Nicolas B. Pierron [:nbp] 2011-10-18 08:39:40 PDT
Created attachment 567758 [details] [diff] [review]
Implements C++ calls.

Update. (old patch: r=dvander)

See the end of the raw file to check get the dependency right from Bugzilla.
Comment 33 Nicolas B. Pierron [:nbp] 2011-10-19 09:30:41 PDT
Created attachment 568083 [details] [diff] [review]
Implements C++ calls.

- Move mark functions to jsgcmark.h
- Follow renaming of MarkCachePolicy to CacheMarkPolicy.
Comment 34 Nicolas B. Pierron [:nbp] 2011-10-27 08:30:36 PDT
Created attachment 569985 [details] [diff] [review]
Implements C++ calls.

Follow Bug 685841 modification (WeakMap -> new Cache class)
Comment 35 Nicolas B. Pierron [:nbp] 2011-10-31 14:11:09 PDT
Created attachment 570840 [details] [diff] [review]
Implements C++ calls.

Follow latest WeakCache interface change (removal of mark function).
Comment 36 David Anderson [:dvander] 2011-10-31 16:13:48 PDT
patch 1: http://hg.mozilla.org/projects/ionmonkey/rev/08998bb1b52f w/ some cosmetic changes to get rid of a warning
Comment 37 David Anderson [:dvander] 2011-10-31 17:15:17 PDT
https://hg.mozilla.org/projects/ionmonkey/rev/691c944bb025 w/ some follow-up fixes in the next commit

Note You need to log in before you can comment on or make changes to this bug.