Closed Bug 434571 Opened 17 years ago Closed 17 years ago

TT: Refactor LIR generation into pipeline stages

Categories

(Tamarin Graveyard :: Tracing Virtual Machine, defect)

x86
Windows XP
defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: edwsmith, Assigned: edwsmith)

References

Details

Attachments

(3 files, 6 obsolete files)

We're getting asymptotically closer to zero productivity partly due to the complexity of generating and processing LIR. The UCI approach of using a pipeline of non-destructive modifications of an instruction stream seems to hold promise. this is a tracking bug for prototyping
ready for early review. let me know if this is going in the right direction.
Attachment #321757 - Flags: review?(rreitmai)
Looks good!
Depends on: 435044
Attached patch updated, getting close to done. (obsolete) — Splinter Review
* added FuncFilter in interpreter.cpp * moved optimizations out of prim.fs into ExprFilter and FuncFilter * experimenting with cleaner verbose formats. I think i'm going to leave the verbose output the same for this patch, and work on an extra fancy (and diffable) verbose filter in an addon patch. ditto for additional frontend and backend factoring.
Assignee: nobody → edwsmith
Attachment #321757 - Attachment is obsolete: true
Status: NEW → ASSIGNED
Attachment #322035 - Flags: review?(rreitmai)
Attachment #321757 - Flags: review?(rreitmai)
Comment on attachment 322035 [details] [diff] [review] updated, getting close to done. Hmm, this latest patch doesn't seem to apply well to mainline. I'm getting 1 failure with LIR.cpp. I'm sure they're just placeholders so we should come up with better names for BeforeWriter and AfterWriter; I don't know maybe display writer or something? Also I curious behind the motive behind exposing addPage in LirWriter? Will look over patch in more detail tomorrow.
Attachment #322035 - Flags: review?(rreitmai) → review+
the first line of the patch has the hg changeset number it's based on. hg update -C ### before applying the patch with "patch -p1 -l <patch.txt". should work. * I will delete BeforeWriter * s/AfterWriter/DisplayWriter? * LirBuffer.addPage() is public so LirBufWriter can call it to add new pages. I could move ensureRoom() and insFar() from LirBufWriter to LirBuffer, for better encapsulation. There's no getting around the fact that you need a LirBufWriter to write to a LirBuffer. if we borrow from the click router design pattern, we'd merge LirReader and LirWriter into LirFilter (or something), and then LirBuffer would simply be an element (Filter) that managed an internal buffer and had a write port and a read port, and two cursors... one for writing, one for reading.
Attached patch pipeline refactoring (obsolete) — Splinter Review
Attachment #322035 - Attachment is obsolete: true
Attachment #322284 - Flags: review?(rreitmai)
sample output from new verbose mode for LIR. - names can be preassigned (eg rp, sp, core) - If an expression is written to the stack, i make a name based on the stack position. (sp[0] = "a", sp[1] = "b", etc). - otherwise i use the opcode name as the expression name - or the function name, if it's a call - if a name has already been used, i add a suffix (ssa style.. a becomes a1, a2, etc). generally the LIR is printed as [name] = [op] [operands] SOT A13052 T1 sp 97E130 rp 976188 t A12040 trace state = in ecx fromfrag = in edx rp = ld state[12] sp = ld state[8] f = ld state[0] sub0 = sub rp,f eq0 = eq sub0,52 xf eq0 -> A13052 sp+0 rp+0 f+0 T A13052 OP_label -3:9DB400 -3:9DBF40 -8:B -3:9DBF40 T 470E6A op_CORE_intck -3:9DB400 -3:9DBF40 -8:B -3:9DBF40 T 471252 CORE -3:9DB400 -3:9DBF40 -8:B -3:9DBF40 T 471254 intck -3:9DBF40 -8:B -3:9DBF40 -2:973010 T 471256 EXIT -3:9DBF40 -8:B -3:9DBF40 -2:0 T 470E6C BRF -3:9DBF40 -8:B -3:9DBF40 -2:0 sti rp[0] = #A13054 sti rp[4] = #470E6C sti sp[8] = core c0 = call intck ( core ) sti sp[8] = c0 eq1 = eq c0,0 xf eq1 -> 470E70 sp+0 rp+4 f+0
Comments from Rick: Looks good so far; couple of ideas/suggestions; - since we no longer show instruction numbers can we get rid of the immediate notation ("#") and just display the value directly. - can we print all constants in decimal, unless they are greater than 4 digits in which case we switch to hex. - could we put all assignments on the lhs, including stores? so > sti rp[0] = #A13054 becomes > rp[0] = #A13054 If we remove stores and migrate to defs, a single def may land in multiple stack locations. In that case, the form [name] = [operands] works well; e.g. add1 = 1 + 4 rp[0] = add1 <- def ... sp[4] = add1 <- def For calls, can we use call0 instead of c0 or is it too verbose? Makes it more consistent with the rest of the ops. Unless you plan to use the function name (which seems like a better idea) but that might end up being quite verbose. Would it look something like? intck1 = ( core ) Might be worth trying out anyway. - can we start the suffix numbering at 1 rather than 0. Zero usually has special meaning in many contexts so it tends to grab my attention :)
> - since we no longer show instruction numbers can > we get rid of the immediate notation ("#") and > just display the value directly. Yes. I kept # for hex numbers, but i could use 0x instead. i'm already printing decimal ints plainly. > - can we print all constants in decimal, unless they are > greater than 4 digits in which case we switch to hex. already do that, i think. i'll look for missing cases. > - could we put all assignments on the lhs, including stores? > so > > sti rp[0] = #A13054 > becomes > > rp[0] = #A13054 you mean just drop the "sti" opcode? I had done that in one iteration, and i thought you'd want to still see the opcode. also, our LIR_st[i] is overloaded, if we add LIR_fst, or stq, then would you still want to exclude the opcode? along those lines, i'll probably rename LIR_in to LIR_param. > If we remove stores and migrate to defs, a single def may land in > multiple stack locations. In that case, the form [name] = [operands] > works well; e.g. > > add1 = 1 + 4 > > rp[0] = add1 <- def > ... > sp[4] = add1 <- def that happens even today. currently what i'm doing is keeping the first name we assign, so if we haven't stored an expr to a stack slot by the time we hit the guard, then it gets an opcode-based name. if it gets stored to multiple stack slots, i still keep its original name. maybe what i'll do is associate names with stack slots based on the forth or abc signatures, and use those. for example "this", "param1", etc. and if we dont have a name, we'll use an opcode or call based name. right now the way i make up names for stack slots is: sp[0] = "a", sp[4] = "b", its kind of arbitrary and really lame for sp[-4], etc. i think that made up name is confusing. (it's stable, but confusing). > For calls, can we use call0 instead of c0 or is it too verbose? > Makes it more consistent with the rest of the ops. > > Unless you plan to use the function name (which seems like a better > idea) but that might end up being quite verbose. the code already uses function names for calls but the stack slot names take precedence. 'c0' was a coincidence. > - can we start the suffix numbering at 1 rather than 0. > Zero usually has special meaning in many contexts so it > tends to grab my attention :) yessir!
Attached patch latest (obsolete) — Splinter Review
Attachment #322284 - Attachment is obsolete: true
Attachment #322284 - Flags: review?(rreitmai)
Attachment #322288 - Attachment is obsolete: true
Here's an example of a new filter adding the ability to eliminate redundant loads when there is no intervening clobbering store. This impl is very conservative, all stores are clobbering except stores to the interpreter stack. all loads are candidates except loads from the interpreter stacks. works best when upstream of CseFilter.
Summary: Refactor LIR generation into pipeline stages → TT: Refactor LIR generation into pipeline stages
added insLoad() as a separate pipeline function, updated near tip.
Attachment #322517 - Attachment is obsolete: true
Attachment #322518 - Attachment is obsolete: true
Attachment #324997 - Flags: review?(rreitmai)
example stats output Fragment statistics for 11 entries after 0 cache flushes of 5 pages (20KB) h=loop header, x=exit trace, L=loop location calls guards main native T-trace T-interp T5 :0 1 0 0 0 0 0 avmplus::Domain/loadBytesMulti()+22 T11 :0 1 0 0 0 0 0 avmplus::Domain/loadBytesMulti()+100 T4 :0 1 0 0 0 0 0 global/private::main()+76 T3 :0 1 0 0 0 0 0 global/private::getfiles()+38 T7 bitops-nsieve-bits.as:16 5000 13 164 252 2642 0 loop 0 bitops-nsieve-bits.as$1:primes()+54 - bitops-nsieve-bits.as:16 1 0 0 0 0 4403 bitops-nsieve-bits.as$1:primes()+83 T8 bitops-nsieve-bits.as:18 159998 14 166 249 47134 24 loop 0 bitops-nsieve-bits.as$1:primes()+93 - bitops-nsieve-bits.as:19 14678 7 76 117 1918 7 cross T9 bitops-nsieve-bits.as$1:primes()+118 - bitops-nsieve-bits.as:20 6846 10 98 164 1028 9 loop 0 bitops-nsieve-bits.as$1:primes()+182 - bitops-nsieve-bits.as:18 1 0 0 0 0 0 bitops-nsieve-bits.as$1:primes()+207 T9 bitops-nsieve-bits.as:20 418240 19 201 287 163165 0 loop 0 bitops-nsieve-bits.as$1:primes()+131 - bitops-nsieve-bits.as:21 4999 19 204 343 2227 20 loop 0 code_pool+3201 - bitops-nsieve-bits.as:20 1 0 0 0 0 18 bitops-nsieve-bits.as$1:primes()+182 - bitops-nsieve-bits.as:20 338242 5 47 79 33762 3 merge M1 code_pool+5660 M1 bitops-nsieve-bits.as:20 338240 5 67 100 35631 6 loop 0 bitops-nsieve-bits.as$1:primes()+175 - bitops-nsieve-bits.as:20 7836 10 98 163 1162 22 cross T8 bitops-nsieve-bits.as$1:primes()+182 T1 :0 115 23 238 394 27 0 loop 0 Object$/private::_hideproto()+17 - :0 -7 26 0 0 0 17090 UnknownPrim code_pool+1252 - :0 61 26 247 444 570 8 loop 0 code_pool+1504 - :0 1 0 0 0 0 20384 code_pool+1252 - bitops-nsieve-bits.as:34 41 26 247 444 589 9 loop 0 code_pool+1504 - bitops-nsieve-bits.as:34 1 0 0 0 0 805 code_pool+1252 T10 bitops-nsieve-bits.as:39 1 0 0 0 0 0 avmplus::System$/trace()+23 T6 bitops-nsieve-bits.as:27 1 0 0 0 0 0 bitops-nsieve-bits.as$1:sieve()+27 T2 :0 1 0 0 0 0 0 Array/private::_set_length()+50
Comment on attachment 324997 [details] [diff] [review] updated labelers, new -Dverbose_addrs flag Looks like a nice change. One question though: Do we still have a way to emit the verbose output as its being generated? E.g. a simple switch of 'if (_outputCache)' to 'if (0 && _outputCache)' would do the trick.
Attachment #324997 - Flags: review?(rreitmai) → review+
> Do we still have a way to emit the verbose output as its being generated? E.g. Yes, the assembler reverses output same way as before. the actual output is all that's changed. ... change pushed as changeset: 410:701f40104ce7
Status: ASSIGNED → RESOLVED
Closed: 17 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: