Closed Bug 634638 Opened 13 years ago Closed 13 years ago

ANI: nativegen should emit "callin" wrappers for all methods in builtin classes

Categories

(Tamarin Graveyard :: Virtual Machine, defect)

defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: stejohns, Assigned: stejohns)

References

Details

Attachments

(1 file, 4 obsolete files)

This would enable simpler (and more robust!) C++ -> AS3 calls in glue code. It would remove lots of boilerplate code in Flash/AIR (eg, likely all of MethodEntry usage, other add-hoc bits in PlayerToplevel). If we emit wrappers for contructors too, then PlayerToplevel::constructV and friends also go away. As an added bonus, it can do C++-to-Atom types conversions appropriately as well.
My initial thought is to emit wrappers for every known method (!) -- this could be done as inline methods (thus pay-as-you-go) or non-inline (and rely on our linkers to reliably dead-strip unused ones, which one would hope is already happening...)
Definitely inline methods.  If the inline methods are generating too much code then we can look into fixing the VM's calling convention or we could un-inline the generated wrappers.
I'd say that there should be no way of calling from C++ code outside of avmplus into AS3 other than by calling a C++ function generated by nativegen.py that calls a particular AS3 method of a class.

If somebody needs to do a dynamic lookup they can put that dynamic lookup in an AS3 method.

The generated method should force the C++ to set the code context properly ( null should not be allowed ).
(In reply to comment #3)
> I'd say that there should be no way of calling from C++ code outside of avmplus
> into AS3 other than by calling a C++ function generated by nativegen.py that
> calls a particular AS3 method of a class.

+1, that's my goal.
 
> If somebody needs to do a dynamic lookup they can put that dynamic lookup in an
> AS3 method.

It's possible we may want an explicit call-a-dynamic-method API at the C++ level, but yes, I'd prefer your approach if it proves acceptable.

> The generated method should force the C++ to set the code context properly (
> null should not be allowed ).

Oh yes -- required argument, methinks :-)
+1
This wil be much easier to do after 634635, adding dependency
Depends on: 634635
See also Bug 634655; not sure if it's a dependency (yet) but probably should be considered before this happens
Depends on: 635293
So I made q quick prototype generating inline methods, but there are issues with this that may make it unsuitable...

(1) order-of-include dependencies become nontrivial, since (eg) a Class may reference its Instance in the generated wrappers (or many classes reference [Player]AvmCore), and said class may only be forward-declared at this point, meaning we can't call concrete methods on it. This can be worked around by shuffling code but my spidey-sense tells me it will be a lot of shuffling. Alternately, we could emit all the inline bodies into a separate "builtin-inlines.hh" file, included later in the include chain...

(2) I'm concerned (without evidence as of yet) that this might introduce nontrival code expansion; a lot of existing call sites for this are currently calling constructObject(), which takes a printf-style arglist, decomposed by varargs downstream; the inlined approach is theoretically more efficient but means more expanded calls to intToAtom(), etc...

Of course, if inlined methods cause code bloat, we could (and should) still generate wrappers that call to a simillar varargs-style call helper; this wouldn't cause any meaningful expansion but still improve readability and type safety. Arguably, the only way to measure the bloat is to convert everything and try it both ways.
After looking at this still further, I think that emitting wrappers for all methods is both tricky and overkill; something like MethodEntry is still probably useful, but we can move it into the VM and make it type-safe.
Attached patch Sketch of call-in wrappers (obsolete) — Splinter Review
Here's an initial sketch. Just asking for feedback for now, as some questions remain to be answered...

(The one place in this patch that actually *uses* one of the call-ins is FunctionClass:createEmptyFunction(), which used to be a wrapper to do an explicit call; I now eliminated the middleman.)

First, note that the vtable-index of MethodEnv can be determined at compiletime for all builtin classes; AOT has already relied on this, so nativegen does too: it replicates the logic in Traits to calculate method indices, and verifies this at class-creation time via SlotOffsetsAndAsserts. Thus all these are similar in cost to a C++ virtual method, with the added overhead of Atomization of args.

Things that need to be resolved:
-- This patch emits a wrapper for *every* method in *every* builtin class; I'm pretty sure we don't want to do that, unless we want to bring the C++ compiler to its knees, as we'd be generating some huge honkin' macros. My first thought is to request it on a per-method basis via metadata, eg

    [native] // request a C++->AS3 call wrapper
    public function writeBoolean(value:Boolean):void;

opinions?

-- At the moment, I simply punt on any methods with rest args; I'd rather not do these at all, frankly, since using varargs from C++ is easy to get wrong and hard to type-check. Anyone have strong feelings otherwise?

-- Syntax I'm uing for call-ins is currently "call_as3name", eg, for the method above, the C++ call would be "void call_writeBoolean(bool);" The short-term justification is that the plain name is already used by the C++ class for the implementation; however, even if it wasn't, I'm still tempted to stick with this syntax (or something similar), as a syntactic hint that you're calling something that isn't a "real" C++ method and thus has more overhead. Opinions?

-- This patch doesn't attempt to do anything with CodeContext as of yet; I'm skeptical that we want to have single call-in require CodeContext as an explicit parameter. Perhaps it would make more send to be able to opt-in for specific methods, eg

    [native,context]
    function foo(i:int):void;

yields

    void foo(CodeContext* cc, int i);

?
Assignee: nobody → stejohns
Attachment #518845 - Flags: feedback?(edwsmith)
Attachment #518845 - Attachment is patch: true
Attachment #518845 - Attachment mime type: application/octet-stream → text/plain
Attachment #518845 - Flags: feedback?(chrisb)
Attachment #518845 - Flags: feedback?(jmott)
Attachment #518845 - Flags: feedback?(rreitmai)
One more thought: an obvious possible optimization would be to special-case native methods on final classes or methods, and call from the wrapper directly to the implementation. (Not sure if there are enough of though to be worthwhile or not.)
I like the goals of the proposal, I don't have much to contribute on the implementation AVM-plus side.  The call_ prefix seems a little generic, something that indicated I was calling into ActionScript would be nice, but I'm not having any luck coming up with something short that I like.
Best thing, ever.

If we go with only generating call-ins for methods with the magic meta-data, then the magic native meta-data could say how many variants of a method with rest arguments should be generated.  Then you don't need to have var args.  Maybe something like:
[native(rest=10)]
would generate 11 call in methods that take between 0 and 10 rest arguments.  It would be nativegen.py error to have such meta-data on a method with no rest args.

Personally, I would not sweat generating massive amounts of C++ code.  I'd at least try it in the player/AIR code base before I worried about it.

I hope we can find a solution to the CodeContext thing, but I suspect the safest simplest thing is to require every call-in to have a code context parameter.
(In reply to comment #13)
> If we go with only generating call-ins for methods with the magic meta-data,
> then the magic native meta-data could say how many variants of a method with
> rest arguments should be generated. 

I like it. Still would prefer to avoid them entirely :-)

> Personally, I would not sweat generating massive amounts of C++ code.  I'd at
> least try it in the player/AIR code base before I worried about it.

If BuildForge gets even 1 second slower because of this, everyone will kill us, and I'll help.
 
> I hope we can find a solution to the CodeContext thing, but I suspect the
> safest simplest thing is to require every call-in to have a code context
> parameter.

Yeah, but the problem is that there are *lots* of existing use cases where we don't set the CodeContext, and arguably, don't need to. (Probably other cases where we don't but really need to...) But anyway, my point is that if we add this new arg to *every* call I think we'll get major pushback. Not to mention, it's extra overhead that may be unnecessary for small calls.
(In reply to comment #14)
> (In reply to comment #13)
> > Personally, I would not sweat generating massive amounts of C++ code.  I'd at
> > least try it in the player/AIR code base before I worried about it.
> 
> If BuildForge gets even 1 second slower because of this, everyone will kill us,
> and I'll help.
> 

What if you tried it and it did not make the build 1 second slower?  What other things are being added to the build that makes the build 10 seconds slower anyway?

> > I hope we can find a solution to the CodeContext thing, but I suspect the
> > safest simplest thing is to require every call-in to have a code context
> > parameter.
> 
> Yeah, but the problem is that there are *lots* of existing use cases where we
> don't set the CodeContext, and arguably, don't need to. (Probably other cases
> where we don't but really need to...) But anyway, my point is that if we add
> this new arg to *every* call I think we'll get major pushback. Not to mention,
> it's extra overhead that may be unnecessary for small calls.

Unless we can statically know the code context is not needed, then we should pass it in.  It might be possible for the compiler to annotate methods it statically knows can't possibly call a native method, but any method that could result in a native method call without the user code on the stack requires a code context.  Hiding this fact from FR engineers is not doing them any favors, its just hiding bugs.
Personally I'd like to see us only generate callins for things that flagged in AS somehow.   

Do we need to support callins for vargs and rest calls?    Seems like things would be more robust with out it and only slightly more cumbersome to use.
(In reply to comment #15)
> Unless we can statically know the code context is not needed, then we should
> pass it in.  It might be possible for the compiler to annotate methods it
> statically knows can't possibly call a native method, but any method that could
> result in a native method call without the user code on the stack requires a
> code context.  Hiding this fact from FR engineers is not doing them any favors,
> its just hiding bugs.

Hmm, here's a thought: what if we leveraged the ANI conversion to require all "strict ANI" classes to declare when a method requires a CodeContext, and pass it as an additional argument to that method? (I'm not sure this actually gains us anything short-term other than ideological purity, but in the long term, if we knew which methods needed CodeContext and which didn't, we might be able to optimize the overhead of keeping track of it in the JIT...)
(In reply to comment #17)
> (In reply to comment #15)
> > Unless we can statically know the code context is not needed, then we should
> > pass it in.  It might be possible for the compiler to annotate methods it
> > statically knows can't possibly call a native method, but any method that could
> > result in a native method call without the user code on the stack requires a
> > code context.  Hiding this fact from FR engineers is not doing them any favors,
> > its just hiding bugs.
> 
> Hmm, here's a thought: what if we leveraged the ANI conversion to require all
> "strict ANI" classes to declare when a method requires a CodeContext, and pass
> it as an additional argument to that method? (I'm not sure this actually gains
> us anything short-term other than ideological purity, but in the long term, if
> we knew which methods needed CodeContext and which didn't, we might be able to
> optimize the overhead of keeping track of it in the JIT...)

Bug 634655 talks about how native methods should be passed a code context in some form if they need it.  Even with that you should still require that C++ pass the code context to call-ins because the dynamic nature of the language makes it difficult to determine which methods and functions a given AS3 function calls.  A single dynamic look would force the AS3 compiler to be pessimistic and assume that could land back in a native method the might need the code context.

With respect to push back from the FR teams, currently these call-ins do not exist yet.  They are more maintainable that what the current code is doing.  There is no reason for anybody to complain.  Requiring the calling C++ code to identify the correct code context is essential to getting the security model right.
(In reply to comment #15)
> > > Personally, I would not sweat generating massive amounts of C++ code.  I'd at
> > > least try it in the player/AIR code base before I worried about it.

I built in FRR and found that avmglue.h went from ~1.6MB (~20k lines), to ~2.3MB (~38k lines).

What mostly concerns me is the massive increase in preprocessor usage, but that can be mitigated some by moving the inlined bodies into -classes.hh. I'll try that.
> I built in FRR and found that avmglue.h went from ~1.6MB (~20k lines), to
> ~2.3MB (~38k lines).
> 
> What mostly concerns me is the massive increase in preprocessor usage, but that
> can be mitigated some by moving the inlined bodies into -classes.hh. I'll try
> that.

Good news: moving all the inlines into -classes.hh does mitigate this, and makes things easier to read, too.

Bad news: it's not possible to compile Flash with this approach, because it would mean that all glue C++ classes would have to be included in a header file prior to the -classes.hh, which isn't currently possible and would require nontrivial restructuring.

I'm gonna just go back to plan A and actually measure compiletime to see if it makes a difference.
Attached patch Patch v2 (obsolete) — Splinter Review
I think this is good enough to land now; it's similar to the previous one, with minor bug fixes. I tested Flash compile times when generating everything and it's not measurable. I'm happy with the "call_" syntax. I'm also happy with completely punting on rest-arg methods for now; if that turns out to be a legitimate need, we can add the multiple-overloaded wrappers. And the CodeContext issue is too large to hold this up, IMHO; applying it to existing Flash code makes things sooooo much nicer that IMHO we should go ahead, and ponder the CodeContext issues in the future.
Attachment #518845 - Attachment is obsolete: true
Attachment #519553 - Flags: review?(rreitmai)
Attachment #519553 - Flags: feedback?(chrisb)
Attachment #518845 - Flags: feedback?(rreitmai)
Attachment #518845 - Flags: feedback?(jmott)
Attachment #518845 - Flags: feedback?(edwsmith)
Attachment #518845 - Flags: feedback?(chrisb)
Attachment #519553 - Flags: feedback?(chrisb) → feedback+
Applying this to existing code in Flash has pointed out one minor wart: that of friend access.

Specifically, existing callin mechanisms in Flash effectively ignore AS3 visibility; all methods are "public" as far as C++ is concerned.

The wrappers try to maintain the AS3 visibility in C++: private, protected, and public all map directly. ("internal" and other namespaces also map to public for now, for lack of a better option.)

The gotcha of course is that converting some existing usages now refuses to compile; IMHO this is a feature, not a bug, but needs addressing somehow... options, as I see them:

(1) require the C++ classes to declare friend access; this is probably the best option, except that it doesn't work for synthetic classes emitted entirely by nativegen (since the class def isn't user editable). I suppose we could finess this with metadata, eg

     [native(friend="FooClass")]
     class Bar { ... }

but it feels like metadata is getting overloaded...

(2) when an AS3 wrapper method needs to be used in this way, require it's vis to be changed, typically from "private" to "public-VM_INTERNAL". Also works but not sure if it's a pattern I'm happy to promote, at least outside the VM.

(3) I can't think of a good third option.

Thoughts, anyone?
(There is the *lame* 3rd option, of course, namely making all callin wrappers public....)
(In reply to comment #23)
> (There is the *lame* 3rd option, of course, namely making all callin wrappers
> public....)

You could make this an option on nativegen.py command line.  That way the player can make the call-ins public and the vm can keep them private or protected.

Option 2 is dangerous, changing private methods to any other namespace will make it much easier to gain access to them from another abc with hand crafted abc ( something an attacker would do ).
(In reply to comment #24)
> Option 2 is dangerous, changing private methods to any other namespace will
> make it much easier to gain access to them from another abc with hand crafted
> abc ( something an attacker would do ).

Shouldn't be possible via VM_INTERNAL.
(In reply to comment #25)
> (In reply to comment #24)
> > Option 2 is dangerous, changing private methods to any other namespace will
> > make it much easier to gain access to them from another abc with hand crafted
> > abc ( something an attacker would do ).
> 
> Shouldn't be possible via VM_INTERNAL.

 As long as you mean putting VM_INTERNAL in the version meta-data tag.
After some pondering, and discussing with Edwin, I want to try an alternate approach before landing this: instead of exposing *all* methods via wrappers by default, try requiring marking and declaring the C++ visibility, eg

    [wrapper("public")] // generate a wrapper with "public" vis to C++, even though we're still private to AS3
    private function foo();

    // no metadata, so no wrapper generated
    private function bar();

I'm not sure if this will be better or worse, but what we choose here will have longterm ramifications on our glue code, so I think it's worth trying before plowing ahead.
(In reply to comment #27)
> I'm not sure if this will be better or worse, but what we choose here will have
> longterm ramifications on our glue code, so I think it's worth trying before
> plowing ahead.

totally agree.  This approach would also give us a place to make rest work if we wanted to and it allows gives us more screws to tighten ( the as3 code is not at the mercy of the c++ code ).   You could require annotations on the member variables as well so that you only get poke and peak C++ methods for the slots that the as3 code want to expose to the C++ code.
Comment on attachment 519553 [details] [diff] [review]
Patch v2

Marking -ve for now until comment 27 is explored.

Feel free to mark ? for re-review if the issue has been resolved.
Attachment #519553 - Flags: review?(rreitmai) → review-
(In reply to comment #29)
> Marking -ve for now until comment 27 is explored.

Oops, yeah, sorry, I should have removed the R? request myself...
I've come to the conclusion that it would desirable to emit synthetic C++ wrappers for AS3 interfaces. That part isn't so hard (and isn't about this bug strictly speaking) but I'm trying to decide how to emit callin wrappers for them; we can't early bind as we do for the other callins. I suspect the most reasonable approach is to reuse the ImtThunk apparatus; I think this will be doable without too much drama, but alternate suggestions are welcomed.
What is the use case for C++ calling an AS3 object via its interface, vs. calling it in the untyped way and going through coerceEnter()?

If we emit callin wrappers, they could be like plain functions, one for each interface method, that box up parameters then call coerceEnter().  Otherwise, whats the plan, pure virtual C++ classes for the interface classes?  or, subclasses of ScriptObject with nonvirtual methods that do as3-virtual-interface-dispatch? 

I'm mildly against re-using the ImtThunk mechanism, because it's completely private to the JIT, and I want to be able to change how it works without more dependencies on it; if it leaks at all, it makes life harder (more work to change it), even if it only leaks into nativegen.py-generated code.
(In reply to comment #32)
> What is the use case for C++ calling an AS3 object via its interface, vs.
> calling it in the untyped way and going through coerceEnter()?

It's not a slamdunk use case, but as I rationalize more Flash glue code, it's feeling more and more useful:

-- ANI is clearly moving towards a design that maps as much of the AS3 static-type information into C++ static-type information as possible: if an AS3 type is well-known to the builtins, there's a concrete C++ representation of it as well, so that the amount of C++ casting magic is reduced to a minimum. Interfaces are just about the last major item that doesn't follow this rule, meaning we have code in Flash like "typedef ScriptObject IEventDispatcher;" to sorta-kinda mimic the type system.

-- there are a handful of places (eg netParser) that need to efficiently dispatch to an interface-defined function. They currently do this by looking up the Binding by name, stashing it, then using that to look up the right MethodEnv and call coerceEnter on it directly. IMHO this is way more information than code outside the VM should need to know about; at heart, they should just need to know if an object implements an interface, and if so, we should provide a way to call the statically known methods on it.
 
> If we emit callin wrappers, they could be like plain functions, one for each
> interface method, that box up parameters then call coerceEnter(). 

Right, that's the idea; the problem is that we can't emit the VTable offsets at compiletime like we do for "normal" functions, since those vary between each classes implementation of the interface. 

> Otherwise, whats the plan, pure virtual C++ classes for the interface classes? 

That would be the best approach if we knew every implementation of the interface was builtin, but I think that's not the case. (e.g., netParser allows things to implements IExternalizable, and while IExternalizable is builtin, not every implementation is.)

> subclasses of ScriptObject with nonvirtual methods that do as3-virtual-interface-dispatch? 

Right, that's the idea. The catch is that interface methods do this dispatch via ImtThunk for efficiency.

> I'm mildly against re-using the ImtThunk mechanism

I'm not wild about it either. Let me see if I can come up with a way to encapsulate it enough to make it palatable.
(In reply to comment #33)
> I'm not wild about it either. Let me see if I can come up with a way to
> encapsulate it enough to make it palatable.

AOT might provide some insite here.  At the very least, we want AOT to use an approach that is at least as performant as the approach you take for native call-ins.
> -- there are a handful of places (eg netParser) that need to efficiently
> dispatch to an interface-defined function. They currently do this by looking up
> the Binding by name, stashing it, then using that to look up the right
> MethodEnv and call coerceEnter on it directly. IMHO this is way more
> information than code outside the VM should need to know about; at heart, they
> should just need to know if an object implements an interface, and if so, we
> should provide a way to call the statically known methods on it.

I agree insofar as the need exists in C++.  there should be a native decl on the interface in AS3, and only that interface should get stubs generated for it.

Within the stubs, I would want to see performance data before we did anything more than the equivalent of OP_callproperty.  And, given that data, one option would be internally using a PIC (wrapper around avmplus::CallCache) instead of using the IMT Thunk stuff.

> > If we emit callin wrappers, they could be like plain functions, one for each
> > interface method, that box up parameters then call coerceEnter(). 
> 
> Right, that's the idea; the problem is that we can't emit the VTable offsets at
> compiletime like we do for "normal" functions, since those vary between each
> classes implementation of the interface. 

I agree but I don't understand how your reply relates to the suggestion.  The plain functions for calling interfaces, generated by nativegen, would wrap any guts that did interface dispatch; nothing in the generated code would (or even could) contain a vtable offset.
(In reply to comment #34)
> AOT might provide some insite here.  

True -- what do you do currently? I have no idea.


(In reply to comment #35)

> Within the stubs, I would want to see performance data before we did anything
> more than the equivalent of OP_callproperty. 

True -- premature optimization strikes again. I'll come up with a "dumb" option before I do anything else.
Attached patch Patch v2, rebased (obsolete) — Splinter Review
Patch v2, rebased and tweaked for current TR tip
Attachment #519553 - Attachment is obsolete: true
Attached patch Patch v3 (obsolete) — Splinter Review
Revised patch:

-- don't emit wrappers for all AS3 calls; only emit them for methods tagged with [cppcall] metadata. (If [cppcall] is specified for a method with restargs, throw an exception.)
-- don't emit multiple wrappers for optional arguments: only a single wrapper for the full-arg version. Simple is good.
-- emit callin wrappers for interface methods too, but going thru the slower lookup-by-name path.

NOTE: this patch is additive to the patch for bug 645409 and won't apply properly without it.
Attachment #539021 - Attachment is obsolete: true
Attachment #548353 - Flags: superreview?(edwsmith)
Attachment #548353 - Flags: review?(rulohani)
(In reply to comment #38)
> Created attachment 548353 [details] [diff] [review] [review]
> Patch v3

Still TBD: whether or not to include CodeContext as a required argument. I need to look at use cases to see if I think it makes sense or not.
Attached patch Patch v4 (FINAL)Splinter Review
Final patch with a few more tweaks:
-- add explicit support for getters and setters (with "call_get_foo/call_set_foo" syntax).
-- assignMethodIndices() in nativegen.py wasn't properly handling the case of overridden protected methods; now it does.
-- some other trivial cleanup to generated code

I think this patch is worthy of landing in TR and FlashRuntime, please review at your earliest convenience :-)
Attachment #548353 - Attachment is obsolete: true
Attachment #549883 - Flags: superreview?(edwsmith)
Attachment #549883 - Flags: review?(rulohani)
Attachment #548353 - Flags: superreview?(edwsmith)
Attachment #548353 - Flags: review?(rulohani)
Review ping? I'd love to land this today if possible.
Looks ok. Few comments:

- cpp_wrapper_name should be renamed ? Made me think it was generating the proper wrapper name. It is just used for sorting the method names. 
- what does cpp_vis in emitWrapperMethod() stands for?
Comment on attachment 549883 [details] [diff] [review]
Patch v4 (FINAL)

R+, once the function and variable name are changed/commented?.
Attachment #549883 - Flags: review?(rulohani) → review+
Pushed to TR as 6501:cfc49e5da644 and 6502:2ed68eb131e0 (with the requested tweaks)
Status: NEW → RESOLVED
Closed: 13 years ago
Resolution: --- → FIXED
Attachment #549883 - Flags: superreview?(edwsmith)
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: