584834 - Add simple function analyzer to avoid method overhead for simple functions

Reporter

Description

•

15 years ago

Splitting this off from bug 511873. The patch contains logic to analyzer our JITed functions and if they are considered simple, we can avoid the methodFrame and stack overflow checks.

Werner Sharp

Reporter

Updated

•

15 years ago

Blocks: 511873

Werner Sharp

Reporter

Comment 1

•

15 years ago

Attached patch same patch as from bug 511873 (obsolete) — Details — Splinter Review

Attachment #463264 - Flags: review?(lhansen)

Werner Sharp

Reporter

Updated

•

15 years ago

Assignee: nobody → wsharp

Werner Sharp

Reporter

Updated

•

15 years ago

Whiteboard: PACMAN

Lars T Hansen

Comment 2

•

15 years ago

Comment on attachment 463264 [details] [diff] [review] same patch as from bug 511873 r- because of the bugs but the approach seems good, even though I'm a tiny bit concerned about brittleness under evolution (unavoidable probably and mitigated by backward compatibility concerns anyhow). Bugs: OP_dxnslate is flagged as simple, but it can call toString (setDxnsLate ends up in AvmCore::intern, see the case for kObjectType). OP_returnvalue coerces its value to some unknown type so I'm guessing it's not simple either (could invoke a toString method?) All the load and store MOPS have range checking and can throw exceptions, only the sign extend MOPS are simple, probably. OP_deleteproperty can throw an exception (see MethodEnv::delproperty) and is not simple. Nits: A cross reference from the AbcOpcodeInfo definition to the definition of simple in MethodInfo.h would be helpful. Filenames like "out.txt" are hard core, but I for one would love to see something sissier, like "simple-function-profiler.txt". "... don't yet know it's type" should be "... don't yet know its type". Are these deliberate in the operate() method? + return; + break; It's not very good style to insert unreachable code deliberately; some compilers will complain. The only call to checkCoerce that isn't effectively guarded by a test of bIsSimple is the second call in the binop case; thus I doubt the guard at the top of checkCoerce itself is worth it. On that note, how accurate/useful can we expect opcodeFault[COMPLEX_COERCION] to be when the various guards filter out code paths that reach the point where that variable is updated?

Attachment #463264 - Flags: review?(lhansen) → review-

Lars T Hansen

Comment 3

•

15 years ago

Additional note: the analysis is only used by the JIT, but it is also only appropriate for the jit - it makes assumptions about the jit behavior, which is why the OP_returnvalue "bug" may not be a bug. It /is/ a bug for the interpreter though. Ditto (maybe) the mops opcodes. That being the case, renaming isSimple as isSimpleForJit and the "simple" bit as "simple_for_jit" would be more appropriate. At that point I would worry some about whether there's not additional brittleness here since those jit semantics are not tightly controlled by backward compatibility.

Lars T Hansen

Updated

•

15 years ago

Depends on: 545254

Edwin Smith

Updated

•

14 years ago

Depends on: 587823

Werner Sharp

Reporter

Comment 4

•

14 years ago

Partial brightspot static analysis with my latest code: 26+% is promising. Total files analyzed: 2874 Total functions verified: 1404733 Total simple functions verified: 377616 (26.8816921080376%) opcode: 0x03 (throw) count: 36 opcode: 0x04 (getsuper) count: 5866 opcode: 0x05 (setsuper) count: 5645 opcode: 0x07 (dxnslate) count: 195 opcode: 0x1e (nextname) count: 3528 opcode: 0x41 (call) count: 71 opcode: 0x42 (construct) count: 4583 opcode: 0x45 (callsuper) count: 3329 opcode: 0x46 (callproperty) count: 714 opcode: 0x49 (constructsuper) count: 67972 opcode: 0x4a (constructprop) count: 130 opcode: 0x4c (callproplex) count: 188 opcode: 0x4e (callsupervoid) count: 17766 opcode: 0x4f (callpropvoid) count: 822 opcode: 0x57 (newactivation) count: 46980 opcode: 0x58 (newclass) count: 11640 opcode: 0x5a (newcatch) count: 4 opcode: 0x5b (findpropglobalstrict) count: 39 opcode: 0x5c (findpropglobal) count: 52 opcode: 0x5d (findpropstrict) count: 36 opcode: 0x5e (findproperty) count: 3 opcode: 0x5f (finddef) count: 355049 opcode: 0x61 (setproperty) count: 11101 opcode: 0x66 (getproperty) count: 84602 opcode: 0x68 (initproperty) count: 5054 opcode: 0x6a (deleteproperty) count: 25 opcode: 0x87 (astypelate) count: 13 opcode: 0xa0 (add) count: 7199 opcode: 0xab (equals) count: 14553 opcode: 0xac (strictequals) count: 148 opcode: 0xb1 (instanceof) count: 1 opcode: 0xb3 (istypelate) count: 22 opcode: 0xb4 (in) count: 823 opcode: 0x100 (complex coercion) count: 33724 opcode: 0x102 (writeMethodCall) count: 109418 opcode: 0x103 (writeNullCheck) count: 235786 writeNullCheck is an addition from last patch and it usually caught right before setProperty, getProperty or writeMethodCall. Can we filter out findDef/getProperty for builtins? (if they are the reason for the majority of calls) How about constructsuper for trivial (and/or simple) constructors? Does a subclass of Array or Object not get marked as simple because it has to call the constructor? OP_equals could look at the type of traits for each operand to consider it simple?

Lars T Hansen

Comment 5

•

14 years ago

(In reply to comment #4) > Partial brightspot static analysis with my latest code: 26+% is promising. Nice. > How about constructsuper for trivial (and/or simple) constructors? Does a > subclass of Array or Object not get marked as simple because it has to call > the constructor? The "is trivial?" analysis in Tamarin attempts to get at that, and there's an optimization in place that avoids constructsuper calls to the Object constructor if the Object constructor is trivial (it currently is). That optimization is shallow though, so if D2 <: D1 <: Object then D2's constructor calls D1's constructor even if D1's constructor is trivial apart from the call to Object's empty constructor (and even though the latter call will be optimized away). That optimization could usefully be made more powerful by handling that cascade. In general, eliding calls to intermediate trivial constructors in constructor chains might also be good in practice, but I don't know how useful for sure - it will kick in when there are intermediate classes in a hierarchy that only add methods, no slots. People who subclass Array deserve what they get, but it would be useful for Array's constructor to be as streamlined as possible, and it is empty it should not be called at all if we early bind to Array at the call site. Eventually inlining will fix that, but it would be good to have a fix before we have good inlining. > OP_equals could look at the type of traits for each operand to consider it > simple? As could OP_add, probably - if they're both numbers or both strings it's simple. Why are OP_newcatch and OP_newactivation not simple?

Werner Sharp

Reporter

Comment 6

•

14 years ago

Beyond the 'trivial' constructor case which I am handling, it may be possible to cascade 'simple' constructors to maintain the simple flag but I worry about infinite recursion possibilities with badly formed SWFs. Can OP_constructsuper call itself or ping-pong between two classes? Or do we maintain a list of blessed internal constructors that we know are simple and/or can't recurse? Some common OP_superconstructor types are Error, MovieClip, Sprite and EventDispatcher and these are not trivial but probably simple. Some experimentation would determine if it changes our simple function percentage at all. OP_add currently catches the numeric add but not a pure string add. OP_newcatch is complex enough that I didn't feel okay marking it simple. MethodEnv::newcatch can call cloneWithNewVTable and resolveSignatures which I don't specifically know if they can't throw. OP_newactivation calls MethodEnv::newActivation which can call init->coerceEnter which is not simple.

Edwin Smith

Comment 7

•

14 years ago

OP_constructsuper looks at the base type to find the initializer to call, and since the class heirarchy is a tree (not cyclic), it can't be recursive.

Lars T Hansen

Comment 8

•

14 years ago

(In reply to comment #6) > OP_newcatch is complex enough that I didn't feel okay marking it simple. > MethodEnv::newcatch can call cloneWithNewVTable and resolveSignatures which I > don't specifically know if they can't throw. Docs say it won't throw but docs could be wrong. > OP_newactivation calls MethodEnv::newActivation which can call > init->coerceEnter which is not simple. Docs say it doesn't throw so docs are definitely wrong. (Actually in both cases the docs don't say that it will throw, they don't say that it won't throw. I tend to think of those as equivalent... in any case the "docs" in question are specs/avm2overview.doc, which is still the closest thing we have to authoritative apart from the code.)

Werner Sharp

Reporter

Comment 9

•

14 years ago

The other confusing bit from our opcodes.tbl file: //opCount throw stack simple internal name hex ABC_OP(0,1,1,0,0,newactivation) // 0x57 ABC_OP(1,1,1,0,0,newcatch) // 0x5A Both are marked "throw" but there are definitely errors in this table: ABC_OP(1,1,0,1,0,dxns) // 0x06 OP_dxns does not throw from what I can tell.

Lars T Hansen

Comment 10

•

14 years ago

(In reply to comment #9) > > ABC_OP(1,1,0,1,0,dxns) // 0x06 > > OP_dxns does not throw from what I can tell. It may throw a verifyerror (again according to the docs) but i would expect verifyerrors to not be reflected in the opcode table.

Werner Sharp

Reporter

Comment 11

•

14 years ago

Aah, makes sense since it does throw a verifyError. (same with newcatch and newactivation). Some experiments with marking individual opcodes simple on 503 brightspot SWFs: baseline data (top ten opcodes) Total files analyzed: 503 Total functions verified: 230902 Total simple functions verified: 62240 (26.9786433522178%) opcode: 0x5f (finddef) count: 57165 opcode: 0x103 (writeNullCheck) count: 38774 opcode: 0x102 (writeMethodCall) count: 17137 opcode: 0x66 (getproperty) count: 14421 opcode: 0x49 (constructsuper) count: 10878 opcode: 0x57 (newactivation) count: 8720 opcode: 0x100 (complex coercion) count: 5810 opcode: 0x4e (callsupervoid) count: 3009 opcode: 0xab (equals) count: 2342 opcode: 0x61 (setproperty) count: 2147 making OP_finddef simple and skipping nullCheck in getProperty Total files analyzed: 503 Total functions verified: 230902 Total simple functions verified: 63873 (27.662384907883%) opcode: 0x66 (getproperty) count: 46271 opcode: 0x103 (writeNullCheck) count: 41454 opcode: 0x102 (writeMethodCall) count: 22118 opcode: 0x49 (constructsuper) count: 11351 opcode: 0x4a (constructprop) count: 9927 opcode: 0x57 (newactivation) count: 8740 opcode: 0x100 (complex coercion) count: 7531 opcode: 0x4e (callsupervoid) count: 3049 opcode: 0xab (equals) count: 2825 opcode: 0x61 (setproperty) count: 2212 making OP_constructsuper simple Total files analyzed: 503 Total functions verified: 230902 Total simple functions verified: 69510 (30.1036803492391%) opcode: 0x5f (finddef) count: 58462 opcode: 0x103 (writeNullCheck) count: 39015 opcode: 0x102 (writeMethodCall) count: 17573 opcode: 0x66 (getproperty) count: 15564 opcode: 0x57 (newactivation) count: 8740 opcode: 0x100 (complex coercion) count: 5900 opcode: 0x4e (callsupervoid) count: 3009 opcode: 0xab (equals) count: 2358 opcode: 0x61 (setproperty) count: 2149 opcode: 0x58 (newclass) count: 1909 making OP_newactivation simple Total files analyzed: 503 Total functions verified: 230902 Total simple functions verified: 62353 (27.0040969762064%) opcode: 0x5f (finddef) count: 60917 opcode: 0x103 (writeNullCheck) count: 41111 opcode: 0x102 (writeMethodCall) count: 17747 opcode: 0x66 (getproperty) count: 14938 opcode: 0x49 (constructsuper) count: 11259 opcode: 0x100 (complex coercion) count: 6394 opcode: 0x4e (callsupervoid) count: 3249 opcode: 0xab (equals) count: 2499 opcode: 0x61 (setproperty) count: 2154 opcode: 0x58 (newclass) count: 1909

Werner Sharp

Reporter

Comment 12

•

14 years ago

Attached patch update patch after review (obsolete) — Details — Splinter Review

Rebased patch to latest code incorporating comments from Lars Moved simpleForJIT flag to opcodes.tbl Moved static isNullable to Traits::isNullable and changed "const Traitsp" parameter for builtinTypes to be "const Traits *" to fix MSVC complaints about casts from one to the other. CodegenLIR::callIns has debugging code to trap an unauthorized helper function Slightly better DEBUG_SIMPLE_FUNCTION_RESULTS functionality

Attachment #463264 - Attachment is obsolete: true

Attachment #475198 - Flags: superreview?(edwsmith)

Attachment #475198 - Flags: review?(lhansen)

Werner Sharp

Reporter

Comment 13

•

14 years ago

To verify the function list in CodegenLIR::callIns, I added some logging for every call to number, equals, compare, coerce and toUint32 to ensure that the simple flag was properly being maintained. This output from static analysis of 16k brightspot swfs shows that we are only calling these helper functions with simple types and won't hit any toString/defaultValue helper functions. 5783 ---equals 4400 equals: String String 796 equals: null String 534 equals: Boolean Boolean 50 equals: String null 1 equals: int Boolean 1 equals: null Boolean 1 equals: String Boolean 1723 ---compare 1720 compare: String String 2 compare: int Boolean 1 compare: String int 33 ---number 23 number: null 10 number: String 17 ---coerce 6 coerce2: undefined String 4 coerce2: undefined Number 2 coerce2: undefined Boolean 2 coerce2: undefined int 1 coerce2: undefined Point 1 coerce2: undefined Object 1 coerce2: undefined IContentState 5 ---toUInt32 3 toUInt32: null 2 toUInt32: String

Lars T Hansen

Comment 14

•

14 years ago

Comment on attachment 475198 [details] [diff] [review] update patch after review Nits: The two inline functions in SimpleFunctionAnalyzer belong in Verifier-inlines.h, the (developing) house style seems to be that it's good to segregate even simple getters and setters. I see you just moved isNullable from CodegenLIR to Traits and that's fine, but can that function use a faster branch-free decision a la Traits::isMachineType? On that note, should Traits::isNullable be REALLY_INLINE and in the header file? In opcodes.tbl it would be good to make sure, in the header block comment, that it is understood that the flags do not apply to interpreted code, that the interpreter may throw exceptions in some cases where the jit may not (I'm thinking of returnvalue, for example). In MethodInfo.h, comments for isComplex and isUnknown. In Verifier.cpp, might it be useful to encapsulate this: !in->isMachineType() && in != STRING_TYPE && in != NAMESPACE_TYPE as a predicate "isObjectReferenceType" (if I understand correctly what it does)?

Attachment #475198 - Flags: review?(lhansen) → review+

Werner Sharp

Reporter

Comment 15

•

14 years ago

I will incorporate your suggested changes. The last verifier code snippet (basically copied in codegenlir) is a bit perplexing. in->isMachineType fails for BUILTIN_OBJECT but will work for BUILTIN_NONE (which can be an object derived type) const uint32_t MACHINE_TYPE_MASK = (1<<BUILTIN_object) | (1<<BUILTIN_void) | (1<<BUILTIN_int) | (1<<BUILTIN_uint) | (1<<BUILTIN_boolean) | (1<<BUILTIN_number); We could encapsulate those three if's as: const uint32_t MYSTERY_MASK = (1<<BUILTIN_object) | (1<<BUILTIN_void) | (1<<BUILTIN_int) | (1<<BUILTIN_uint) | (1<<BUILTIN_boolean) | (1<<BUILTIN_number | (1<<BUILTIN_string | (1<<BUILTIN_namespace); What should the mask be called and helper function? isMachineTypeOrStringOrNamespace?

Lars T Hansen

Comment 16

•

14 years ago

I guess that would count as descriptive, as names go... I'm not averse to it.

Steven Johnson

Comment 17

•

14 years ago

I dunno, MYSTERY_MASK is mighty appealing. We need more completely inexplicable code bits in Tamarin.

Werner Sharp

Reporter

Comment 18

•

14 years ago

Attached patch latest patch with changes from Lars review — Details — Splinter Review

Attachment #475198 - Attachment is obsolete: true

Attachment #475939 - Flags: superreview?(edwsmith)

Attachment #475198 - Flags: superreview?(edwsmith)

Edwin Smith

Comment 19

•

14 years ago

(In reply to comment #15) > isMachineTypeOrStringOrNamespace? isScriptObjectPtr() or isObjectReferenceType() both get at it -- the predicate is making sure the runtime value is a ScriptObject* This is a desireable patch but it has the unfortunate nature that even careful review could easily miss something subtle. Reviewing the correctness of the "simple" bits for each opcode (plus the "complex" analysis that takes into account operand types, for some of them), amounts to doing a interprocedural analysis, by hand, checking for indirect code paths that * access core->currentMethodFrame() * call MethodEnv->coerceEnter() * recurse unboundedly within C++ Even if we do that by hand, its super easy to make a mistake. So the key is having a good safety net -- testing, asserts, and anything else we can think of. (the comment on SimpleFunctionAnalyzer should summarize what i just wrote -- it explains what makes a function simple, but it should also explain in concrete terms what simple functions are not allowed to do). I don't like overloading the word 'simple' to describe a quasi-leaf function, ie "simple function" and also to describe an opcode that doesn't have implicit calls, as in 'simple opcode'. the term 'complex opcode' also is fuzzy - it doesn't mean the opcode isn't simple, it means the static types of the operands must be inspected to decide if the opcode is 'simple' or not.... unfortuantely i'm having a hard time coming up with better terms. brainstorming: "simple function" => "AS3 leaf function" or "quasi leaf function" (meaning, doesn't call other AS3 functions, or access core->currentMethodFrame but can call C++ helpers). I'm searching for a concrete term with definite, checkable meaning. if we are forced to use a fuzzy word like "simple" or "quasi" or "pseudo" then the onus is to strictly define what it means. "complex opcode" => "polymorphic opcode" or "overloaded opcode" meaning the behavior depends on the operand types. The "short" jit-generated prolog should write 0xdeadbeef into AvmCore::currentMethodFrame, and each method in AvmCore that accesses currentMethodFrame should check for the poison value and assert. The patch seems to intend to do the poisoning: 1790 // poison our methodFrame to no one accesses it. 1791 methodFrame = InsConstPtr((void*)(uintptr_t)0xdeadbeef); but this doesn't do it -- the poison must be stored in AvmCore, with something like this: stp(methodFrame, coreAddr, offsetof(AvmCore,currentMethodFrame), ACCSET_OTHER); Can you think of any other safety nets? The asserts that check for the poison should have a comment that explains what is going wrong and how to fix it. Maybe by referring to the explanation in the comment for class SimpleFunctionAnalyzer. We need some way to turn the optimization on or off, for debugging. (maybe this is a followon task, that also adds a switch for the restarg optimization?) Testing: testing on AIR exposed lots of bugs with stale core->codeContext... so testing there, with this fix in place, is important.

Edwin Smith

Updated

•

14 years ago

Attachment #475939 - Flags: superreview?(edwsmith) → superreview+

Edwin Smith

Comment 20

•

14 years ago

On second thought, 0xdeadbeef is probably not the best choice for a poison value since it is used for GCEndOfObjectPoison and could mislead someone debugging. Something unique and equally bogus would be better. maybe 0xdead1eaf :-)

Lars T Hansen

Comment 21

•

14 years ago

The MMgc poison values are all defined as constants in GCHeap.h. Since poison values should be globally unique we have the option of moving them into vmbase, now that we have vmbase. "dead1eaf" is cute.

Werner Sharp

Reporter

Updated

•

14 years ago

Assignee: wsharp → nobody

Dan Smith

Updated

•

13 years ago

Flags: flashplayer-qrb+

Target Milestone: --- → Future

Dan Smith

Updated

•

13 years ago

OS: Windows 7 → All

Hardware: x86 → All

Lars T Hansen

Comment 22

•

13 years ago

Major PACMAN work item (or forms important basis for that major item); "Future" is inappropriate. Suggest Brannan.

Flags: flashplayer-qrb+ → flashplayer-qrb?

Target Milestone: Future → Q1 12 - Brannan

Dan Smith

Updated

•

13 years ago

Flags: flashplayer-qrb? → flashplayer-qrb+

Priority: -- → P2

Sylvestre Ledru [:Sylvestre]

Comment 23

•

6 years ago

Tamarin is a dead project now. Mass WONTFIX.

Status: NEW → RESOLVED

Closed: 6 years ago

Resolution: --- → WONTFIX

Sylvestre Ledru [:Sylvestre]

Comment 24

•

6 years ago

Tamarin isn't maintained anymore. WONTFIX remaining bugs.

same patch as from bug 511873 15 years ago Werner Sharp 67.98 KB, patch	lhansen : review-	Details \| Diff \| Splinter Review
update patch after review 14 years ago Werner Sharp 91.46 KB, patch	lhansen : review+	Details \| Diff \| Splinter Review
latest patch with changes from Lars review 14 years ago Werner Sharp 93.18 KB, patch	edwsmith : superreview+	Details \| Diff \| Splinter Review