Closed
Bug 616780
Opened 14 years ago
Closed 14 years ago
AOT optimizations
Categories
(Tamarin Graveyard :: Virtual Machine, defect)
Tamarin Graveyard
Virtual Machine
Tracking
(Not tracked)
VERIFIED
INVALID
People
(Reporter: alexmac, Unassigned)
References
Details
Attachments
(1 obsolete file)
User-Agent: Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_6_5; en-us) AppleWebKit/533.19.4 (KHTML, like Gecko) Version/5.0.3 Safari/533.19.4 Build Identifier: This patch contains various tweaks to the VM to support some of the AOT optimizations we've been implementing over the past month. - when an AbcEnv is first created for a given abc we parse all the multinames, AOT generated code digs static multinames directly out of this cache. - Each AbcEnv also contains a GenericGuard (patch coming soon!) used to do exception based caching of pushstrings and finddefs - Faster argument array creation without needing a redundant copy of the atoms - Optimistic double Atoms: the AOT compiler now provides a heap-allocated double so that calls to op_add never need to perform heap allocations. A modified version of op_add takes a pointer to this heap allocated double and fills it in with the result producing an Atom that points to this optimistic double. The AOT compiler promotes this stack-allocated double to a heap-allocated double whenever an atom that references it escapes. I've push the body of op_add into a templatized "op_add_impl" function so it can be shared by the normal and optimistic variants of op_add. I validated that gcc does indeed inline op_add_impl into the two varianst of op_add and that the resulting asm is the same as the original. - cdecl didn't change much but I re-indented it to better follow the styles used elsewhere (verify by doing a whitespace ignoring diff) - static mops storage: allows you to use one fixed size static buffer for the alchemy/mops opcodes to work on, it also allows you to disable range-checking. Obviously neither of these apply to the shipping flash player/AIR, but on certain platforms like iOS where we generate native apps with captive runtimes that can't execute untrusted code we want to expose an option for people to use this as it allows the AOt compiler the opportunity to generate significantly faster code from alchemy'ed swfs. Reproducible: Always
Reporter | ||
Comment 1•14 years ago
|
||
Attachment #495334 -
Flags: review?(stejohns)
Reporter | ||
Updated•14 years ago
|
Attachment #495334 -
Flags: review?(edwsmith)
Reporter | ||
Comment 2•14 years ago
|
||
(In reply to comment #0) > - Optimistic double Atoms: the AOT compiler now provides a heap-allocated > double so that calls to op_add never need to perform heap allocations That first line should say "stack-allocated"
Comment 3•14 years ago
|
||
Comment on attachment 495334 [details] [diff] [review] AOT optimizations Adding Lars, as several parts of this are likely to collide with ongoing ExactGC work.
Attachment #495334 -
Flags: review?(lhansen)
Comment 4•14 years ago
|
||
Comment on attachment 495334 [details] [diff] [review] AOT optimizations (In reply to comment #0) General comment: this really needs to be multiple patches; there are several completely different issues here that really need to be considered separately. (In point of fact, it needs to be multiple *bugs*, each addressing one issue.) When the patches are minor tweaks, making a grab-bag of changes has been fine, but these are major changes, several of which are on a collision course with other work that's underway: The changes to ArrayObject are definitely going to collide with the changes in bug 610063 when they land (but shouldn't be too hard to resolve); I'd recommend you look those over to see if any unpleasant surprises await. In particular, though, "getDenseCopy()" disturbs me, as it makes an assumption about the underlying Array representation (which is ok) and exposes this assumption to the outside world (which is not)... All of class AvmValue is presented without any explanation or documentation as to what it's for, when it's used, etc; all of this needs to be done. Also consider that Atom tagging schemes are under revision as part of the changes for ExactGC and also for float and float4 support; in particular, the representation of kDoubleType is changing (see bug 611484), which will definitely impact this in a major way. > - Each AbcEnv also contains a GenericGuard (patch coming soon!) used to do > exception based caching of pushstrings and finddefs If you are proposing (re)adding MMU support to do this, be aware that this was a source of extreme pain and suffering in the MIR days, and proved a huge debug-and-maintain win in removing. I'd be loathe to re-add it, and would definitely want to see evidence of *substantial* performance wins that can't be replicated in other ways. > - Optimistic double Atoms: the AOT compiler now provides a heap-allocated > double so that calls to op_add never need to perform heap allocations. A > modified version of op_add takes a pointer to this heap allocated double and > fills it in with the result producing an Atom that points to this optimistic > double. The AOT compiler promotes this stack-allocated double to a > heap-allocated double whenever an atom that references it escapes. This makes no sense to me; a double Atom is always heap-allocated in the first place. (Or at least it's supposed to be; if AOT violates this contract then severe pain awaits us) > - static mops storage: allows you to use one fixed size static buffer for the > alchemy/mops opcodes to work on, it also allows you to disable range-checking. > Obviously neither of these apply to the shipping flash player/AIR, but on > certain platforms like iOS where we generate native apps with captive runtimes > that can't execute untrusted code we want to expose an option for people to use > this as it allows the AOt compiler the opportunity to generate significantly > faster code from alchemy'ed swfs. disabling range-checking sounds like an extremely dangerous option, even for AOT code. Style nit: tab spacing is wrong in some cases (eg in the AOT specific ArrayObject ctor, it's 2 spaces rather than 4 spaces) Style nit: Tamarin prefers constant-on-RHS for new code, thus AvmAssert(!(7 & (uintptr_t)dst)); should be AvmAssert(!((uintptr_t)dst & 7));
Attachment #495334 -
Flags: review?(stejohns) → review-
Reporter | ||
Comment 5•14 years ago
|
||
(In reply to comment #4) > Comment on attachment 495334 [details] [diff] [review] > AOT optimizations > > (In reply to comment #0) > > General comment: this really needs to be multiple patches; there are several > completely different issues here that really need to be considered separately. > (In point of fact, it needs to be multiple *bugs*, each addressing one issue.) > When the patches are minor tweaks, making a grab-bag of changes has been fine, > but these are major changes, several of which are on a collision course with > other work that's underway: Ok no problem, I've started breaking the patch apart. > All of class AvmValue is presented without any explanation or documentation as > to what it's for, when it's used, etc; all of this needs to be done. Also The diff is confusing, but its mostly whitespace/tabbing changes + some minor bug fixes, the AvmValue class already exists. I've broken the cdecl refactoring out as bug 617246 > consider that Atom tagging schemes are under revision as part of the changes > for ExactGC and also for float and float4 support; in particular, the > representation of kDoubleType is changing (see bug 611484), which will > definitely impact this in a major way. Yeah we'll definitely need to update cdecl to take that into account when it lands > > > - Each AbcEnv also contains a GenericGuard (patch coming soon!) used to do > > exception based caching of pushstrings and finddefs > > If you are proposing (re)adding MMU support to do this, be aware that this was > a source of extreme pain and suffering in the MIR days, and proved a huge > debug-and-maintain win in removing. I'd be loathe to re-add it, and would > definitely want to see evidence of *substantial* performance wins that can't be > replicated in other ways. The implementation is certainly a bit low-level but the code is all contained within one file and exposed via a straightforward API, it doesn't have any ugly tendrils into the rest of the codebase. We could probably get 90% of the performance by just using plain old pointers but we need every last drop of performance on mobile. Lets debate this more when I post the patch for the Generic Guard code. > > - Optimistic double Atoms: the AOT compiler now provides a heap-allocated > > double so that calls to op_add never need to perform heap allocations. > > This makes no sense to me; a double Atom is always heap-allocated in the first > place. (Or at least it's supposed to be; if AOT violates this contract then > severe pain awaits us) I fudged the description, that should have said stack-allocated, anyway I've broken it out as a separate bug as requested: bug 617252 > > - static mops storage: allows you to use one fixed size static buffer for the > > alchemy/mops opcodes to work on, it also allows you to disable range-checking. > > disabling range-checking sounds like an extremely dangerous option, even for > AOT code. Given that devs are already allowed to inject arbitrary native code into their iOS binary I don't think this is an issue. It will of course be opt-in for those people who want speed at the expense of safety. I've broken this out as bug 617250
Reporter | ||
Updated•14 years ago
|
Attachment #495334 -
Attachment is obsolete: true
Attachment #495334 -
Flags: review?(lhansen)
Attachment #495334 -
Flags: review?(edwsmith)
Updated•14 years ago
|
Status: UNCONFIRMED → NEW
Ever confirmed: true
Reporter | ||
Comment 6•14 years ago
|
||
This bug is no longer being used as the AOT tracking bug, see bug 620678 instead
Status: NEW → RESOLVED
Closed: 14 years ago
Resolution: --- → INVALID
Whiteboard: Tracking
You need to log in
before you can comment on or make changes to this bug.
Description
•