Closed Bug 460000 Opened 17 years ago Closed 7 years ago

Wordcode to wordcode optimization framework

Categories

(Tamarin Graveyard :: Interpreter, enhancement)

enhancement
Not set
normal

Tracking

(Not tracked)

RESOLVED WONTFIX
Future

People

(Reporter: zherczeg, Unassigned)

References

Details

Attachments

(8 files, 1 obsolete file)

User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; hu; rv:1.9.0.3) Gecko/2008092417 Firefox/3.0.3 Build Identifier: Started @ rev 997 This bug contains the history of improvements of the optimization framework. Reproducible: Always
Hi! I have opened a bug report for the optimization framework. The patches contains only the framework, the requested infrastructure changes are reported in: https://bugzilla.mozilla.org/show_bug.cgi?id=459815 Now the framework can parse the byte code, can build basic blocks, doing control and data flow analisys, and rebuild the code. The basic blocks can be freely moved, the jumps and exception handlers are rebuild accordingly. The best feature now is the wordcode dump system: Example function with 1 basic block: Basic block <1> From: 0 To: 13 In edges: -1 Out edges: -2 [0] getlocal 0 [1] USE: INP(0) DEF: 2(0) [2] pushscope {1} USE: 0(0) [3] findpropstrict {shell_toplevel.as$1,public,,avmplus}::System {0&0} [1] DEF: 5(0) [5] getslot 1 {1} [1] USE: 3(0) DEF: 9(0) [7] getlocal 1 [1] USE: INP(0) DEF: 9(0) [9] callmethod 9 1 {1+1} [1] USE: 7(0) 5(1) DEF: 12(0) [12] pop {1} USE: 9(0) [13] returnvoid The general instruction dump format: [wordcode index] name {pops} [pushes] USE: list DEF: list wordcode index - the index of the instruction in the wordcode array name - the friendly name of the instruction {a+b&c} - the number of stack pops (if a > 0 or b or c defined) a - the number of atoms always popped from the stack +b - for calls, the number of extra atoms popped from the stack &c - for multiname parameters, the number of extra atoms popped from the stack [a] - the number of stack pushes (only if a > 0) USE: index(position) list - if the list is nonempty index - the index of instructions, which result used by this instruction An INP means the dependency is an external input argument, not a result of another instruction. position - the stack pop index. 0 = this atom popped first from the stack, 1 = this atom popped second from the stack and so on DEF: index(position) list - if the list is nonempty similar to use, just the list of the instructions, which use the result of this particular instruction. The general basic block format: Basic block <basic block index> From: first wordcode index To: last wordcode index [CatchTarget] [* * ...] In edges: list of indexes Out edges: list of indexes Basic block edges: -1 = marks the function entry -2 = marks the function exit CatchTarget - the basic block is a target of a catch handler [* * ...] - how deep this basic block is embedded into an exception guard no * = 0, * = 1, * * = 2, * * * = 3, and so on Some more complex examples: [...] Basic block <2> From: 68 To: 72 In edges: 66 Out edges: 78 74 [68] findpropstrict {public,<#internal c:\Projects\Tamarin\test.es>}::b {0&0} [1] DEF: 70(0) [70] getslot 4 {1} [1] USE: 68(0) DEF: 72(0) [72] iffalse 4 {1} USE: 70(0) Basic block <3> From: 74 To: 76 In edges: 72 Out edges: 84 [74] pushbits 2 [1] DEF: 84(0) [76] jump 6 Basic block <4> From: 78 To: 80 In edges: 72 Out edges: 84 [78] pushbits 1 [1] DEF: 84(0) [80] jump 2 Basic block <5> From: 82 To: 82 In edges: 66 Out edges: 84 [82] pushbits 0 [1] DEF: 84(0) Basic block <6> From: 84 To: 110 In edges: 82 76 80 Out edges: 116 112 [84] dup {1} [2] USE: 74(0) 78(0) 82(0) DEF: 85(0) 87(1) [85] setlocal 2 {1} USE: 84(0) DEF: 89(0) [87] setslot 5 {2} USE: 84(0) 60(1) [89] getlocal 2 [1] USE: 85(0) DEF: 91(0) [91] setlocal 1 {1} USE: 89(0) [93] findpropstrict {public,<#internal c:\Projects\Tamarin\test.es>}::print {0&0} [1] DEF: 99(0) [95] findpropstrict {public,<#internal c:\Projects\Tamarin\test.es>}::c {0&0} [1] DEF: 97(0) [97] getslot 5 {1} [1] USE: 95(0) DEF: 99(0) [99] callmethod 8 1 {1+1} [1] USE: 97(0) 93(1) DEF: 102(0) [102] setlocal 1 {1} USE: 99(0) DEF: 199(0) [104] findproperty {public,<#internal c:\Projects\Tamarin\test.es>}::c {0&0} [1] DEF: 134(0) [106] findpropstrict {public,<#internal c:\Projects\Tamarin\test.es>}::a {0&0} [1] DEF: 108(0) [108] getslot 3 {1} [1] USE: 106(0) DEF: 110(0) [110] iffalse 4 {1} USE: 108(0) Basic block <7> From: 112 To: 114 In edges: 110 Out edges: 118 [112] pushbits 1 [1] DEF: 130(0) [114] jump 2 Basic block <8> From: 116 To: 116 In edges: 110 Out edges: 118 [116] pushbits 0 [1] DEF: 130(0) Basic block <9> From: 118 To: 122 [...] And a typical catch block: Basic block <16> From: 197 To: 222 CatchTarget * In edges: Out edges: 223 [197] getlocal 0 [1] USE: INP(0) DEF: 199(0) [199] pushscope {1} USE: 197(0) [200] newcatch 1 [1] DEF: 202(0) [202] dup {1} [2] USE: 200(0) DEF: 203(0) 205(1) [203] setlocal 2 {1} USE: 202(0) [205] dup {1} [2] USE: 202(0) DEF: 206(0) 207(1) [206] pushscope {1} USE: 205(0) [207] swap {2} [2] USE: 205(0) INP(1) DEF: 208(0) 208(1) [208] setslot 1 {2} USE: 207(0) 207(1) [210] findpropstrict {public,<#internal c:\Projects\Tamarin\test.es>}::print {0&0} [1] DEF: 218(0) [212] pushstring "Except2 " [1] DEF: 218(0) [214] getscopeobject 1 [1] DEF: 216(0) [216] getslot 1 {1} [1] USE: 214(0) DEF: 218(0) [218] callmethod 8 2 {1+2} [1] USE: 216(0) 212(1) 210(2) DEF: 221(0) [221] pop {1} USE: 218(0) [222] popscope
Attachment #343188 - Attachment is obsolete: true
Attachment #343899 - Flags: review?(lhansen)
Attachment #343188 - Flags: review?
The attached patch can be used with the 1062 revision of the tamarin-redux repository. The patch currently contains the following optimizations: - Constant Propagation. (-Ocp) - Constant Folding. (-Ocf) - Unreachable Code Elimination. (-Ouce) - Loop Invariant Code Motion. (-Olicm) - Linear Elimination. (-Ole) - Global Common Subexpression Elimination. (-Ogcse) - Local Common Subexpression Elimination. (-Olcse) - Global Copy Propagation. (-Ogcp) - Local Copy Propagation. (-Olcp) - Dead Code Elimination. (-Odce) The optimizations can be turned on with the corresponding command line options. Multiple optimizations can be turned on at the same time, but their execution order is fixed. (And just to make sure: constant propagation and constant folding are mutually exclusive, as are GCSE and LCSE, and GCP and LCP.) Two typical executions would be: "-Ocp -Ouce -Olicm -Ole -Ogcse -Ogcp -Odce" and "-Ocf -Ouce -Olicm -Ole -Olcse -Olcp -Odce". All the algorithms passed the acceptance test (however, almost all of them can be further optimized - and some refoctoring would do them good as well). We have measured the number of the output wordcode instructions and the number of the executed wordcodes on the scimark, sunspider, and windscorpion benchmarks with different optimizations. The results are attached as well. Beside the optimization options there are two new debug options, with which you can dump the basic block structures to the console (-Dbcdump) or to graphml files (-Dgbcdump). The graphml files can be opened with yEd. (Since the graphmls do not contain layout information, Tools/FitNodesToLabel and Layout/Hierarchical/Classic actions should be executed in yEd once they are opened.
OptStat2.xls contains the same results as OptStat.xls but several columns are hidden. Only the results of the two configurations (mentioned in the comment for the patch) are not hidden.
(In reply to comment #5) > Created an attachment (id=348207) [details] > Results of the optimizations. Thanks, some of these results are encouraging. With luck, I'll attempt a trial integration and some benchmarking this week.
This patch integrates with TR 1142:87d398595b4a and compiles on Mac in both 32-bit and 64-bit configurations; fixes to the project files are included in this patch. It crashes quickly in 64-bit configurations. It also crashes on all benchmarks with -Ogcp, in the 32-bit configurations. I have not looked into these crashes yet; they could be my fault. (This patch does not incorporate the MMgc change from the submitted patch, perhaps that has something to do with it.) In terms of benchmark results: on the sunspider and misc benchmarks running both without any -O flags and with one of the sets of -O switches suggested in an earlier comment in this bug (except for -Ogcp), there is a slowdown across the board, in several cases above 20%, and there is little difference between no flags and a full set of flags. The slowdown without flags is presumably due to the peephole optimizer not being run; the comparable slowdown with the flags is an indication that on those benchmarks the optimizations don't make much of a difference in practice. Even so, it's early days and the results reported earlier on instruction counts are encouraging.
The performance crashes comes from the mistyped -Ogcp and -Olcp option. :) The performance test writes out 'crash' if you add an option to the avmshell which it does not recognize. In the comment of the patch I accidentally wrote -Ogcp instead of -Ogcopy and -Olcp instead of -Olcopy, but in the usage output of the tool you can read the right names. We haven't tested it on a 64bit system yet, but we will do it.
Attached the patch of Milestone 3 and performance measurements.
Attachment #343899 - Flags: review?(lhansen)
Flags: flashplayer-qrb+
Priority: -- → P3
Target Milestone: --- → Future
Blocks: Wordcode
Status: UNCONFIRMED → NEW
Ever confirmed: true
Priority: P3 → --
Component: Virtual Machine → Interpreter
Tamarin is a dead project now. Mass WONTFIX.
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → WONTFIX
Tamarin isn't maintained anymore. WONTFIX remaining bugs.
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: