Closed Bug 496473 Opened 15 years ago Closed 15 years ago

Intepreter and JIT use different approaches to maintaining dxns and codeContext

Categories

(Tamarin Graveyard :: Virtual Machine, defect)

defect
Not set
normal

Tracking

(Not tracked)

VERIFIED FIXED

People

(Reporter: stejohns, Assigned: stejohns)

References

Details

Attachments

(1 file, 4 obsolete files)

The Interpreter carefully saves and restores dxns and codeContext. LIR takes a set-both-on-demand approach, without bothering to save and restore. 

This has already demonstrated itself to cause one real bug (see the reopening of https://bugzilla.mozilla.org/show_bug.cgi?id=494013) and is likely to be even more problematic as work on increased jit/interp interop increases.
Attached patch Initial fix idea (obsolete) — Splinter Review
Not asking for an official review on this yet, as it's just a first-cut-to-solve-this-damn-thing sort of approach, but I definitely want comments and input, especially from the AIR guys on the cc list.

This basically sets up a per-Method frame to track the current dxns and codeContext; for both, the "current" dxns/codeContext at any time is the one derived from the current MethodEnv, unless the frame has an override set.

Stuff I'm not sure about:

-- Passes all the acceptance tests, so dxns handling is (presumably) right... but the acceptance tests do pretty much NO exercising of CodeContext, so who knows if it's correct. (Obviously we need to add such tests, but in the meantime, patching into Flash/AIR and testing there is the only way to determine if I interpreted the rules correctly...)

-- Interpreter size/speed should be essentially unchanged, but we are doing extra work in the JIT to maintain the frame: an extra 4 pointer sized fields per stack frame, with instructions to init them all and save/restore the frame pointer. (However, initial performance tests didn't show any change outside the noise level. I'll run more extensive tests tonight, but the initial look is that it's more stack-space/code-size issue than performance issue.)

-- LIR::emitSetContext and emitSetDxns are now essentially empty, but I have left them in place for the moment... if the logic is all correct then they will likely go away (which will trickle upwards into Verifier and friends).

-- I initially thought of adding this functionality to CallStackNode and having it maintained in non-Debugger builds, but that made me a little nervous: it's probably doable but ticklish to get right (eg, we don't want to allocate all the Debugger-specific fields unless a debugger is allocated; we want to ensure no one tries to access the debugger-specific fields unless a debugger is allocated...) Anyway, it may make sense to take that approach in the long run, but this was the most expedient way to test the design.
Assignee: nobody → stejohns
Attached patch Revised patch (obsolete) — Splinter Review
Slightly revised version of first patch, which wasn't compatible with the way Flash/AIR used EnterCodeContext. This one is a bit closer but probably still needs work.
Attachment #381663 - Attachment is obsolete: true
Edwin wrote:

> Lets add some docs for EnterCodeContext, specifically what the expected
> contract is.  sounds like it intends to mimic a full MethodFrame?  If that's
> the case then the dxns logic might need adjusting (the other fields won't be
> valid)

AFAIK the contract isn't well-defined, or at least explicitly documented anywhere, so that would be the first step. What I have been able to infer so far is that it isn't quite equivalent to a MethodFrame, in that Flash/AIR expect to be able to do EnterCodeContext() when there is no MethodFrame active. However, EnterCodeContext appears to be completely independent of dxns.
Thanks for looking into this Steven!

Making the JIT more like the interpreter is a good.  Flash/AIR should be be able to run with out the JIT, so any strategy for maintaining code contexts and DXNS that is more like the interpreter should not cause a problem.  If it does cause a problem, the root bug was realling Flash/AIR.

Neither stan nor I really understand how DXNS is supposed to work.  AIR does use the DXNS in intrinsic AS3 code which is something we had to fix in the past.

Personally I'm all in favor of maintaining an AS3 callstack in non-debugger builds.  I would want the Debugger to use the callstack that the non-debugger code is maintaining so we are not duplicating the code that maintains the stack.
Steven,

Thanks for looking at this!

A potential optimization is that any method that never calls a native doesn't
need to set the code context at all.  Probably something similar for the dxns,
but I'm out of my depth on that.

CallStackNode per se is as you say probably too heavyweight.  And yet... if
there were always (debugger and not) a way to chain call frames and find their
MethodEnvs, that would be really great.  For instance, people want to be able
to get call stacks even in release builds... if it can be done cheaply enough. 
So it deserves consideration... if we can maintain a call frame chain as
cheaply as we maintain the code context, that would be a really nice benefit.
The big problem is that the rules for maintaining the CodeContext are poorly defined at best. ("totally undefined" seems more like it...)
> Neither stan nor I really understand how DXNS is supposed to work.  AIR does
> use the DXNS in intrinsic AS3 code which is something we had to fix in the
> past.

DXNS is in pretty good shape in this patch, I think, and is fairly well-defined. CodeContext is the problematic one.
 
> Personally I'm all in favor of maintaining an AS3 callstack in non-debugger
> builds.  I would want the Debugger to use the callstack that the non-debugger
> code is maintaining so we are not duplicating the code that maintains the
> stack.

I agree in principle, but in practice, that will probably be a backlog item -- combining the two is a ticklish bit of coding that would probably take at least a couple of days to get right (which is a couple of days more than I have available to me)

> A potential optimization is that any method that never calls a native doesn't
> need to set the code context at all. 

OK, good to know... but is there a well-defined rule (or set of rules) for when the CodeContext needs to be set, and *how* it needs to be set?

> CallStackNode per se is as you say probably too heavyweight.  And yet... if
> there were always (debugger and not) a way to chain call frames and find their
> MethodEnvs, that would be really great.  For instance, people want to be able
> to get call stacks even in release builds... if it can be done cheaply enough. 
> So it deserves consideration... if we can maintain a call frame chain as
> cheaply as we maintain the code context, that would be a really nice benefit.

Getting callstacks even in release builds is not just about missing the callstacknode stuff -- there's also a bunch of string and other debugger-specific stuff we don't maintain, to save memory & time.
I need to correct something I wrote above...  You can omit setting the code
context if you're sure you aren't calling any _builtin_ methods (or,
_technically_, if you're sure you aren't calling any builtins care about the
context).  More in my next comment.
Is AvmCore::codeContext() ever in the critical path? I'm guessing not. And if not, seems like we could implement it by just walking up the (hypothetical) callstack, a return the first CodeContext belonging to a non-builtin MethodEnv. (special-casing for EnterCodeContext, I guess.)
AIR has had to write quite a few explicit EnterCodeContexts because we like the
dogfood: we wrote a lot of AIR in ActionScript.  But this means that a lot of
time we're calling into Flash APIs that want to know the caller's privileges
even though there is no non-builtin code on the stack!

For instance in the HTMLLoader, there are internal callbacks from WebKit into
ActionScript.  They're done on behalf of the application that created the
HTMLloader, so we use that code context before invoking the ActionScript.  It's
slightly dubious, but it's correct and there's no other good solution.

The contract for the CodeContext is pretty simple, but not as reliable as it
could be.  If any native method wants to know what user code is calling it, the
CodeContext needs to be set before it is called.  Builtin code never sets the
context because we want to know what _user_ code (however far up the stack) was
asking.

(Earlier versions of the glue layer let a native method declare whether it
cared about the code context.  That was a good idea, especially if the flag was
set _correctly_.  But that turns out to be too hard to be sure of because it's
way too easy for the glue code to check the code context.)

The interpreter's implementation of this contract is to always set the push/pop
code context on entry/exit of non-builtin code.  The JIT's implementation is to
set it immediately before any method call that could conceivably need it.  I
prefer the implementation that says push it and pop it on entry/exit in any
function that can conceivably call into another pool.
excellent summary. I too think the push/pop is much more robust. Look for a new patch soon.
Attached patch Reviewable Patch (obsolete) — Splinter Review
OK, I think this patch is review-worthy. It trims off some of the added stack size and jit overhead, and seems to be happy with the Flash/AIR testcases I've given it so far. Initial performance testing indicates no noticeable slowdown, but more testing should be done (I'll do that over the weekend). Note that AvmCore::codeContext() actually has to do some work now, so if that turns out to be the critical path then we might have some slowdown -- but I'm skeptical that will be the case (which is why I took this approach).

This patch also includes a totally-unrelated fix (in GCStack.cpp) necessary for AIR compilation.
Attachment #381680 - Attachment is obsolete: true
Attachment #381876 - Flags: review?(edwsmith)
(In reply to comment #12)

> This patch also includes a totally-unrelated fix (in GCStack.cpp) necessary 
> for AIR compilation.

What particular problem requires this fix?  Does AIR define some non-inline or global or otherwise visible placement 'new' operator?  If so, is the #ifdef here the most reasonable or is it just expedient?
AIR defines MMGC_ENABLE_CPP_EXCEPTIONS, which means that GCGlobalNew.h includes <new>, which provides a placement-new definition -- so we get a complaint about redundant definition. 

Alternately, we could fix this by having GCGlobalNew.h always define placement new (even when <new> is not included)
implementation of AvmCore:dxns is fragile, it only works if the xml methods that access DXNS are directly invoked from user code.  If any of the implementation of XML api's is in AS3, then the user's dxns will be higher up on the callstack.  (if it works in tamarin today, maybe thats enough, but we could get surprised later if we change any xml implementation code).

In TT, we had to loop to find the first !builtin frame.  Based on Stan's comments I'm not 100% sure that's the right search criteria, but maybe it's a starting point.

explicit code context mgmt:  do we really need new state in AvmCore for this or could it be done with stack-scoped objects that take care of saving and restoring?  I guess from reading the code I can't tell what we're trying to do here... maybe at least more comments are needed.

let me see if i get it:
  * normally code context is assocated with an as3 callframe (MethodFrame)
  * sometimes native code overrides that and inserts its own.
  * saving/restoring that override is what EnterCodeContext does.

if that's right, then:
  * this override actually takes precedence over any further AS3 call frames
    that are pushed inbetween EnterCodeContext / ~EnterCodeContext.  is that
    what we want? 
  * if yes, okay, but spookey and needs documentation
  * if no, how about EnterCodeContext pushes/pops another MethodFrame that's
    suitably marked as such so it can be skipped by (say) dxns() or other
    stack walkers?  (less extra logic in ExceptionFrame too).  its basically
    a native code call frame that doesn't have any info except a code context

CodegenLIR.cpp:2620
Another patch coming later to remove CodegenLIR::writeSetContext()?

CodegenLIR.cpp:4600
a single store is all we need here:
    storeIns(ns, offsetof(MethodFrame,_dxns), methodFrame)

Interpreter:
should MethodFrame live in aux_memory where the previously saved dxns and code context used to live?  it shuttles it off the C stack.

general:
class MethodFrame is declared in AvmCore.h but inline implementations of its ctor and dtor are in MethodEnv.h..  huh?
Comment on attachment 381876 [details] [diff] [review]
Reviewable Patch

r+ assuming the only patch changes are comments and the one LIR change i suggested.  

any more substantial change, pls r? again
Attachment #381876 - Flags: review?(edwsmith) → review+
(In reply to comment #15)
> implementation of AvmCore:dxns is fragile, it only works if the xml methods
> that access DXNS are directly invoked from user code.  If any of the
> implementation of XML api's is in AS3, then the user's dxns will be higher up
> on the callstack.  (if it works in tamarin today, maybe thats enough, but we
> could get surprised later if we change any xml implementation code).

It passes all our current acceptance tests. Agreed that such a change in the future might break stuff. Not sure if it's worth reworking this patch to stave off theoretical future breakage. What do you think?
 
> In TT, we had to loop to find the first !builtin frame.  Based on Stan's
> comments I'm not 100% sure that's the right search criteria, but maybe it's a
> starting point.

You mean to find the CodeContext? That's essentially what we're doing with this patch.

> explicit code context mgmt:  do we really need new state in AvmCore for this or
> could it be done with stack-scoped objects that take care of saving and
> restoring?  I guess from reading the code I can't tell what we're trying to do
> here... maybe at least more comments are needed.
> 
> let me see if i get it:
>   * normally code context is assocated with an as3 callframe (MethodFrame)
>   * sometimes native code overrides that and inserts its own.
>   * saving/restoring that override is what EnterCodeContext does.

That's my understanding.

> if that's right, then:
>   * this override actually takes precedence over any further AS3 call frames
>     that are pushed inbetween EnterCodeContext / ~EnterCodeContext.  is that
>     what we want? 

Actually, no, it doesn't. It behaves as though a new frame has been inserted. 

>   * if yes, okay, but spookey and needs documentation
>   * if no, how about EnterCodeContext pushes/pops another MethodFrame that's
>     suitably marked as such so it can be skipped by (say) dxns() or other
>     stack walkers?  (less extra logic in ExceptionFrame too).  its basically
>     a native code call frame that doesn't have any info except a code context

I thought about doing that, but such a frame would have a null MethodEnv and it made me a little scared. It might be cleaner that way though. I can give that a try and see how it feels -- it certainly would make the logic a bit less weird.
 
> CodegenLIR.cpp:2620
> Another patch coming later to remove CodegenLIR::writeSetContext()?

Er, yes, that would be good :-)

> CodegenLIR.cpp:4600
> a single store is all we need here:
>     storeIns(ns, offsetof(MethodFrame,_dxns), methodFrame)

nice catch, will fix
 
> Interpreter:
> should MethodFrame live in aux_memory where the previously saved dxns and code
> context used to live?  it shuttles it off the C stack.

Started to do it that way, but getting the ctors/dtors happy required some gyrations and I punted on it since it's a "small" structure. But yeah, it should probably go into there.
 
> general:
> class MethodFrame is declared in AvmCore.h but inline implementations of its
> ctor and dtor are in MethodEnv.h..  huh?

Order-of-include issues.... AvmCore needs to know the structure, but the ctors/dtor (which are trivial enough to warrant inlining) have to get defined after MethodEnv. Probably adding some comments about this would make it less weird...
(In reply to comment #17)

> I thought about doing that, but such a frame would have a null MethodEnv and it
> made me a little scared. It might be cleaner that way though. I can give that a
> try and see how it feels -- it certainly would make the logic a bit less weird.

Actually, there's another advantage to this admittedly-weird technique: we don't have to allocate space for a codeContext in the MethodFrame, since the codeContext is either derivable from the MethodEnv (normally) or specified in AvmCore::explicitCodeContext (when EnterCodeContext overrides it). I suspect (without proof) that the vast majority of calls won't have an overridden CodeContext and thus this is worthwhile.
(In reply to comment #17)

> > Interpreter:
> > should MethodFrame live in aux_memory where the previously saved dxns and code
> > context used to live?  it shuttles it off the C stack.

I looked into doing this, but there's a pretty decent reason to keep it on the C stack: moving it into the aux_memory section would mean we'd need a DRC on the _dxns field there (but not in the JIT, where it's always really on the stack). Since it's a minimal amount of stack space I'm inclined to leave it there to keep things simple.

(In reply to comment #18)
> (In reply to comment #17)
> 
> > I thought about doing that, but such a frame would have a null MethodEnv and it
> > made me a little scared. It might be cleaner that way though. I can give that a
> > try and see how it feels -- it certainly would make the logic a bit less weird.
> 
> Actually, there's another advantage to this admittedly-weird technique: we
> don't have to allocate space for a codeContext in the MethodFrame, since the
> codeContext is either derivable from the MethodEnv (normally) or specified in
> AvmCore::explicitCodeContext (when EnterCodeContext overrides it). I suspect
> (without proof) that the vast majority of calls won't have an overridden
> CodeContext and thus this is worthwhile.

The combination of the desire to keep this struct small, and to keep it on the stack, makes me inclined to leave this particular part as-is, and just to document the weirdness and reasoning behind it. What do you think?
Attached patch Updated Patch (obsolete) — Splinter Review
Updated version of previous patch, with some comments addressed:
-- writeSetContext nuked
-- desired behavior of MethodFrame/CodeContext documented
-- explanation of wonky EnterCodeContext behavior documented & justified
-- JIT optimizations added

I think this one is good to go.
Attachment #381876 - Attachment is obsolete: true
Attachment #382387 - Flags: review?(edwsmith)
Comment on attachment 382387 [details] [diff] [review]
Updated Patch

minor rebuttal, but i agree it's good enough

MethodFrame could point to either CodeContext or MethodEnv, it just needs one bit to distinguish, and you already have the one bit for flagging the override.  (but whatever, good commenting is enough)

adding Lars for a final set of eyes, and to double check the interpreter change.
Attachment #382387 - Flags: superreview?(lhansen)
Attachment #382387 - Flags: review?(edwsmith)
Attachment #382387 - Flags: review+
(In reply to comment #21)
> (From update of attachment 382387 [details] [diff] [review])
> minor rebuttal, but i agree it's good enough
> 
> MethodFrame could point to either CodeContext or MethodEnv, it just needs one
> bit to distinguish, and you already have the one bit for flagging the override.
>  (but whatever, good commenting is enough)

doh! you're right. that would be much cleaner. clean design beats funky-with-documentation. I will revise it in that manner and offer a new patch.
Actually, I'm now having second thoughts on my second thoughts... we could combine MethodEnv and CodeContext as you suggest, but then we'd have to take a similar approach for dxns (since it falls back to env->scope->dxns when null). This would work but would require an extra move in the prologue (instead of initing dxns to null by default, would have to be env, slightly more costly) and would also prevent us from being able to rely on MethodFrame always containing a valid MethodEnv, which I'm hunching will prove useful in the future.
The dxns() search could ignore EnterCodeContext frames (ie MethodFrames with that bit set) entirely.  I'm willing to trade off a possible future need for every MF to have a valid env, (as long as validity is testable) against the cleaner impl now.
(In reply to comment #24)
> The dxns() search could ignore EnterCodeContext frames (ie MethodFrames with
> that bit set) entirely.  

Actually, it can't. dxns doesn't search; if there isn't dxns explicitly set in the current frame, it uses the current frame's env->scope->getDefaultNamespace(). Thus, if a frame had no dxns set, but did have an explicit CodeContext set, we'd have to way to get dxns (since the env/cc slot is a cc)
aha, thats right.  well, i think it needs to search, to skip ECC frames (now), and builtin frames (later).  possible scope creep alert, but i think its just a small change inside dxns().
(In reply to comment #26)
> aha, thats right.  well, i think it needs to search, to skip ECC frames (now),
> and builtin frames (later).  possible scope creep alert, but i think its just a
> small change inside dxns().

definitely scope creep, but might as well get it right while we're here. I guess this brings up the question of ensuring I have the right definition for finding dxns... why do we need to search? 

also, seems like skipping ECC frames could produce the wrong result, as described above.
Edwin: still a bit puzzled here; I must be misunderstanding the design of dxns because I don't see why we'd even need (or want) to search up the callstack. 

My understanding of the pre-existing code (based on analyzing the Interpreter) is that it was doing:

-- if top-of-callstack is a method that sets dxns, return that custom dxns. (it appears that the code assumes that OP_dxns will always execute before any requests.)

-- if top-of-callstack doesn't set dxns, return top-of-callstack's env->scope->getDefaultNamespace().

-- EnterCodeContext doesn't have any affect on dxns.

-- whether a piece of code is builtin or not doesn't have any affect on dxns.
You've got the right idea about DXNS semantics.

IF ECC is implemented as a special MethodFrame, then the topmost MethodFrame might be an ECC frame, and the dxns code should skip those frames to find the topmost AS3-Based MethodFrame.

In a future world where some of the XML library itslef is written in AS3, then the topmost MethodFrame might also not be the one we're interested in.  (that's the potential reason for searching past builtin method frames -- and builtin is probably too coarse; really it needs to skip E4X-implementation-code frames but no other AS3 frames, builtin or not.
(In reply to comment #29)
> IF ECC is implemented as a special MethodFrame, then the topmost MethodFrame
> might be an ECC frame, and the dxns code should skip those frames to find the
> topmost AS3-Based MethodFrame.

ahhh... ok, I get it now, I was confused about what you were suggesting. I'll try it out and offer a new patch, hopefully by end of today.
i've R+'d and asked for Lars sr+, but if you're going to use MethodFrame to implement EnterCodeContext, then we have one more quick round to do.
(In reply to comment #31)
> i've R+'d and asked for Lars sr+, but if you're going to use MethodFrame to
> implement EnterCodeContext, then we have one more quick round to do.

I don't understand... the change to EnterCodeContext is already in this patch, what's the next "quick round" for?
Revised patch using the approach suggested by Edwin.
Attachment #382387 - Attachment is obsolete: true
Attachment #383815 - Flags: review?(edwsmith)
Attachment #382387 - Flags: superreview?(lhansen)
Attachment #383815 - Flags: review?(edwsmith) → review+
Attachment #383815 - Flags: superreview?(lhansen)
Comment on attachment 383815 [details] [diff] [review]
Even Better Patch

It's probable that the large comment added in Interpreter.cpp should say that it's imperative that the structure is cleaned up on /normal/ exit, not just "on exit", since a throw past the frame will not perform cleanup.
Attachment #383815 - Flags: superreview?(lhansen) → superreview+
(In reply to comment #34)
> (From update of attachment 383815 [details] [diff] [review])
> It's probable that the large comment added in Interpreter.cpp should say that
> it's imperative that the structure is cleaned up on /normal/ exit, not just "on
> exit", since a throw past the frame will not perform cleanup.

will fix prior to push.
pushed to redux as changeset:   2032:0dfac3064499
Status: NEW → RESOLVED
Closed: 15 years ago
Resolution: --- → FIXED
Depends on: 500195
Resolved fixed engineering / work item that has been pushed.  Setting status to verified.
Status: RESOLVED → VERIFIED
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: