Closed
Bug 463137
Opened 15 years ago
Closed 14 years ago
TM/nanojit: type-check LIR
Categories
(Core :: JavaScript Engine, enhancement)
Tracking
()
RESOLVED
FIXED
People
(Reporter: jimb, Assigned: n.nethercote)
References
Details
(Whiteboard: fixed-in-nanojit, fixed-in-tracemonkey, fixed-in-tamarin)
Attachments
(4 files)
36.50 KB,
patch
|
Details | Diff | Splinter Review | |
55.32 KB,
patch
|
jseward
:
review+
rreitmai
:
review+
|
Details | Diff | Splinter Review |
4.70 KB,
patch
|
jseward
:
review+
|
Details | Diff | Splinter Review |
13.09 KB,
patch
|
rreitmai
:
review+
|
Details | Diff | Splinter Review |
It would be nice if TraceMonkey developers could spend less of their time tracking down code generation errors. A verification pass that ran after LIR generation could catch some kinds of incorrect code immediately. If such a pass ran after native code generation, it could catch bugs in the assembler as well. If the pass used a type system that was sound, it would catch all segmentation faults due to JITted code, in addition to other kinds of bugs. The verifier would be an optional pass, perhaps enabled by default in debugging builds, that would disassemble machine code into LIR, analyze the LIR to infer the types of the values being operated on, and use those types to validate memory reference, arithmetic operations, and so on. The results here would be: - a type system for validation, whose types describe what the validator can know about values. These types would include what JavaScript calls types, shapes, and perhaps internally used values like shape numbers. - type rules: for each primitive operation (add; subtract; fetch), a function that looks at the types of the operands and computes the type of the result --- or complains if the operation is ill-typed. - An "abstract interpreter" for LIR that starts with the known types of the inputs and then walks the code applying the type rules for the operations in the code. We could stage the work as follows: - Start with a trivial type system, perhaps with types "tagged value", "integer", "pointer", "float". Ensure that we never ld the result of a float-valued LIR, etc. This allows us to get the interpreter working on the LIR vocabulary. - Elaborate the type system to know about objects, shapes, pointers to floats, dense arrays, etc. - Disassemble machine code to LIR, to bring the assembly pass under scrutiny as well. The type system would necessarily be specific to the layout of the objects it describes. For example, in TraceMonkey, passing a guard that checks the shape of an object would refine the validator's understanding of type of that object and permit subsequent member access. So different clients of nanojit would need different type systems. This could mean that the abstract interpreter and disassembler should be part of nanojit, and the type system should be provided by nanojit's client. Or, the type system could be part of nanojit and provide all the types all nanojit's clients need. Or perhaps the whole verifier would be TraceMonkey-specific.
Reporter | ||
Updated•15 years ago
|
Summary: TraceMonkey should verify the compiler's output → TM: TraceMonkey should verify the compiler's output
Updated•15 years ago
|
Severity: normal → enhancement
Reporter | ||
Updated•15 years ago
|
Assignee: general → jim
Reporter | ||
Comment 1•15 years ago
|
||
There's a hitch in the idea of translating machine code to LIR. Edwin Smith has said that LIR is SSA without phi nodes: before a join, all live values need to be spilled to memory. If we try to translate arbitrary graphs of machine code to LIR, there's no reason the incoming machine code would be expressible in this form. If the machine code modifies a register on both branches above a join, and then refers to that register after the join, there's no way to express the operand of that last reference: it can't point to both originators at once. Perhaps LIR produced from machine code *could* include phi nodes. The validator needs to check them anyway. The 'phi' LIR would be one that the validator would understand, but not the compiler.
Comment 2•15 years ago
|
||
Yes you probably want to introduce a LIR_phi node and have it point to expressions (hint you could use LIR_2 as the operand, aka con/cons style, to chain the refs indefinitely). Unlike the more general case; with the current compiler we'll find that all expressions across a branch get sunk to a store.
![]() |
Assignee | |
Comment 3•15 years ago
|
||
It sounds like a simple type system would be easy. Disassembling native code would be a lot harder, as would more advanced type systems. (In reply to comment #2) > (hint you could use LIR_2 as the operand, aka con/cons style, to > chain the refs indefinitely). I was hoping to get rid of LIR_2... I think it's currently only used for cmov and qcmov, because they are trinary and current LIR only handles binary ops, but with variable-width LIR (bug 492866) trinary ops could be supported. But that's not all that important.
Reporter | ||
Comment 4•15 years ago
|
||
[clueless] Why isn't a phi node an ordinary two-operand instruction?
Comment 5•15 years ago
|
||
comment #3 true, was speaking from the perspective of the current tamarin branch which is fixed LIR. comment #4: it could be, it depends on how we want to define it. I was thinking of the case where we have n incoming paths and want to use a single LIR_phi to represent the merged value. I would assume then that the stores on each path would be the operands for the phi (or more likely the expression being stored).
Reporter | ||
Comment 6•15 years ago
|
||
(In reply to comment #5) > comment #4: it could be, it depends on how we want to define it. I was > thinking of the case where we have n incoming paths and want to use a single > LIR_phi to represent the merged value. I would assume then that the stores on > each path would be the operands for the phi (or more likely the expression > being stored). How is this different from simply having a phi node whose operands are phi nodes? If one forbids the left operand of a phi node from also being a phi node after the same most recent join, then an n-ary phi node is simply a linked list of binary phi nodes chained along the right children, like a lisp list. And the semantics "just work".
Comment 7•15 years ago
|
||
Minor divergence (not about how to represent phi nodes): LIR as it stands forces all scalar (non-memory, I mean) operands to be into memory at a join point, because there are no phi nodes. In that case, why bother to continue the fragment? Wouldn't it be the same cost to jump to another fragment, which has to pull all that stuff back out of memory, as it would be to continue in the same fragment after the join point? If there's no real desire to do phi-node-ing on scalar values, then this might be a nice simplication of LIR (== get rid of forward jumps altogether, since they're just as expensive as jumping to a new fragment). Sure, phi-nodes will make LIR more general, but they'll also make the code that has to process it significantly more complex. So I guess my assumptions here are borked .. but which ones?
Reporter | ||
Comment 8•15 years ago
|
||
In case the term "abstract interpreter" in the description misleads anyone: we're just talking about the obvious type checking algorithm, traversing the LIR and checking each node. But when the type rules are handled by callbacks from nanojit into jstracer, the nanojit code for operations other than branches ends up looking like an interpreter for LIR that delegates all the actual operations (adding two values; fetching) to a class provided by jstracer. At branch and join points we do non-interpreter-like things, but making everything else look as much like an interpreter for LIR as possible gives you confidence that your type checking is true to the semantics of the language.
Updated•15 years ago
|
Assignee: jim → jseward
![]() |
Assignee | |
Comment 9•15 years ago
|
||
Should we rename this bug as "type-check LIR", and push other kinds of checking into another bug?
Comment 10•15 years ago
|
||
Comment 11•15 years ago
|
||
Comment #10 contains a first attempt at a typechecker for LIR. This is a WIP patch, but is pretty usable. I'm putting it up for people to look at and try out. It can run through the whole of trace-test.js without difficulty, and it found the type error reported in https://bugzilla.mozilla.org/show_bug.cgi?id=503427 It also reports some other invalid-LIns* reference errors which I have not yet investigated. How to use: firstly, is only built as part of a debug build. There are two settings: * TMFLAGS=lircheck. Runs the checker, but is silent unless errors are reported, in which case all error output is prefixed by the string LIR_CHECKER, for easy greppage. * TMFLAGS=lircheckverbose. Runs the checker and gives a very verbose listing of everything that goes through it. This is necessary for diagnosing type errors. When is the checker run? The basic algorithm is such that it could be placed at any reader or writer pipeline stage, although at the moment the implementation inherits from LirWriter, so it can only be used in a writer position right now. Currently the checking is done right at the front of the writer pipeline, at the point where the tracer and the regexp compiler generate new LIRs. The type system distinguishes 32-bit from 64-bit quantities and float from integer quantities. It works by collecting up all the constraints imposed on both results and arguments from all the LIR nodes, and trying to resolve them -- if they can't be resolved then a type error is reported. This roundabout mechanism is needed because LIR is overloads load and store data types. For example, LIR_ldq produces a 64-bit something, and that could be fed either to a floating or integer operation (but not both). So we have to see if later uses of the value further constrain it to 64-bit int or 64-bit float, and report any cases where it gets constrained to both as being a type error.
![]() |
Assignee | |
Updated•15 years ago
|
Summary: TM: TraceMonkey should verify the compiler's output → TM/nanojit: type-check LIR
![]() |
Assignee | |
Comment 12•15 years ago
|
||
Comment on attachment 388356 [details] [diff] [review] First attempt at a typechecker for LIR >+ case LTy_CMIN: return "CMIN(no-constraint)"; I'd print this as "Any" or "AnyXX". >+ /* If possible, merge two type constraints together, generating a >+ new type constraint. Or fail to. 'res' is unchanged whenever >+ this returns false. */ >+ bool mergeLTys ( /*OUT*/LTy* res, LTy t1, LTy t2, LogControl* logc ) I'd really really like to see the types described (where they are defined, in LIR.h) as a lattice. This function would then be meetLTys() or glbLTys() (or joinLTys() or lubLTys(), depending how the lattice is drawn). Then CMAX/CMIN would be top/bottom (or bottom/top). And it'd be good to have a brief comment explaining the lattice for those who aren't familiar with them :) And you can avoid the need for two return values by just returning CMAX where you currently return false, and then callers can check for CMAX instead of false. Also, C++ references are arguably better than pointers for out-parameters. >+ case LIR_flt: >+ case LIR_fle: >+ case LIR_fgt: >+ case LIR_fge: >+ case LIR_feq: >+ *resFml = LTy_Int32; >+ argFmls[0] = LTy_Flt64; >+ args[0] = ip->oprnd1(); >+ argFmls[1] = LTy_Flt64; >+ args[1] = ip->oprnd2(); >+ return 2; Nit: I'd reorder this (and similar cases) so that argFmls[] assignments are all together, and likewise for args[] assignments. >+ case LIR_alloc: >+ *resFml = getHostWordLTy(); >+ return 0; LIR_alloc takes an integer argument, no? >+ case LIR_cmov: >+ // kludge; args 2 and 3 are in LIR_2 pointed to by >+ // the second operand. >+ *resFml = LTy_Any32; >+ argFmls[0] = LTy_Any32; /* cond */ >+ args[0] = ip->oprnd1(); >+ return 1; You should check that operand 2 is a LIR_2 and then constrain its operands to have the same types. (This will go away when bug 501232 is landed, but it can be handled in this form.) >+ case LIR_fcall: >+ case LIR_call: { >+ ArgSize sizes[10]; Use MAXARGS. (Occurs more than once.) >+ *resFml = opc == LIR_fcall ? LTy_Flt64 >+ : LTy_Int32; Can LIR_fcall return a 64-bit integer? >+ /* Algorithm: associate with each LIR, a Ty indicating the Nit: Use "LIns" or "LIR instruction" instead of "LIR". (Occurs more than once.) >+ LIns* LirChecker::ins0(LOpcode v) { >+ LIns* i = out->ins0(v); >+ checkLIns(i); >+ return i; >+ } >+ LIns* LirChecker::ins1(LOpcode v, LIns* a) { >+ LIns* i = out->ins1(v, a); >+ checkLIns(i); >+ return i; >+ } This repetition is a shame. I wonder if it's worth making a class, say, LirWriter0, beneath LirWriter that has no virtual methods other than the constructor, make LirWriter inherit from LirWriter0, and make the pipeline a LirWriter0* instead of a LirWriter. (And feel free to use better names than LirWriter/LirWriter0.) It'd be good to know about places where the type-checking has holes. Eg. I think LIR_ov is one. And so is LIR_callh -- actually, you don't seem to have LIR_callh in there anywhere...
![]() |
Assignee | |
Comment 13•15 years ago
|
||
(In reply to comment #12) > >+ case LIR_fcall: > >+ case LIR_call: { > >+ ArgSize sizes[10]; > > Use MAXARGS. (Occurs more than once.) Actually, you probably want MAXARGS+1, to allow for the return value?
Comment 14•15 years ago
|
||
(In reply to comment #11) > For example, LIR_ldq produces a 64-bit something, and that could > be fed either to a floating or integer operation (but not both). > So we have to see if later uses of the value further constrain it > to 64-bit int or 64-bit float, and report any cases where it gets > constrained to both as being a type error. tamarin-tracing (but not -redux or tracemonkey) used a value encoding where all non-double values were tagged in a way that made them look like NaN's. Then, you'd have code like this: value = ldq ... if (value & mask < limit) { // tag test uses value as int64 // value is a double ... = fadd(value, ...) // use value as double } else { // value is tagged-something ... = qlo(value) // <-- use value as int64 } I think the type checker would reject code like this? possible solutions: 1. ignore -- TT is obsolete, not a problem in TR or TM (but we may want to bring back this encoding in TR) 2. require a no-op cast instruction in one case or the other. (this would correspond to the implicit GPU<->FPU copy that the assembler would have to emit, in one case or the other) 3. can we allow int64|double to coexist without being an error and without losing useful type checking?
Comment 15•15 years ago
|
||
Is on hold during the nanojit merging, since it visits all of the LIR. Will resume when that's completed.
Updated•15 years ago
|
QA Contact: nanojit → general
![]() |
Assignee | |
Updated•14 years ago
|
![]() |
Assignee | |
Comment 16•14 years ago
|
||
Stealing. I have a mostly-working patch that I'll put up soon. It's a lot easier now because LIR has become better typed, eg. int64 and float64 operations are distinguished (thanks in part to bug 520714).
Assignee: jseward → nnethercote
![]() |
Assignee | |
Comment 17•14 years ago
|
||
This NJ patch reimplements the LIR type-checker, adding new stages ValidateWriter and ValidateReader to NJ. It's much simpler than Julian's type checker, because LIR has been cleaned up so there's no need for merging type constraints and all that -- we can just check types immediately. There are a couple of obscure cases where LIR isn't type-clean, these are currently unchecked and marked with XXX, and discussed in detail in bug 540368. - ValidateWriter does a combination of type-checking and also "structure checking", ie. ad-hoc stuff mostly required due to hacks in LIR's design (also some 32-bit vs 64-bit differences). It replaces SanityFilter. - Unlike Julian's patch, ValidateWriter checks implicit LIR instructions, ie. those specified as args to functions like insLoad(), rather than explicit LIns objects. This is less clean -- you can't do it all in a nice single checkLIns() function -- but it is necessary if you want to check the LIR coming directly from the compiler at the start of the writer pipeline, before optimisations are applied, which I think is important. - ValidateReader is very similar to TR's, it just checks some things that ValidateWriter cannot check. - StackFilter doesn't need a LirBuffer, I removed this field. - StackFilter is now allocated via 'alloc' rather than on the stack, like the other reader pipeline stages. - TR's ValidateWriter used to abort if it hit a guard, because TR doesn't use guards. That check is now gone, it's hard to do within NJ. - TR's ValidateWriter checks that the arg to LIR_ret is word-sized, but requiring it to be int-sized is more consistent with everything else (we should have LIR_qret and LIR_pret if necessary) and how it's used in TM and TR doesn't seem to use it. I changed the comment in LIRopcode.tbl accordingly as well. - I moved two CallInfo methods from Assembler.cpp to LIR.cpp, a more sensible place for them. - I removed a bogus LIR_ov instance in ExprFilter -- it's unary but was being treated as binary. - In lirasm, I fixed some misplaced opcodes. And I fixed ill-typed qilsh/qirsh/qursh usage in --random mode. There's one lirasm test failure, for multfrag3.in on 64-bit, because it uses 'param' when it wants some int32 args, but that gives it int64 args. Not sure how to fix this, I'm not even sure if it supposed to be possible to get 32-bit int args on 64-bit platforms. - In LIRopcode.tbl I more clearly documented the nastiest instructions. - I added LIR::retType().
Attachment #422162 -
Flags: review?(jseward)
![]() |
Assignee | |
Updated•14 years ago
|
Attachment #422162 -
Flags: review?(rreitmai)
![]() |
Assignee | |
Comment 18•14 years ago
|
||
This TM patch adjust for the above NJ patch. In particular: - Fixes a type error in js_SetCall{Arg,Val}() - an int-sized arg was being passed where a word-sized one was expected. Turns out it didn't change anything because on X64 the codegen for 64-bit integers that fit into 32-bits is the same as that for 32-bit integers. But it's good to have fixed anyway.
Attachment #422163 -
Flags: review?(jseward)
![]() |
Assignee | |
Updated•14 years ago
|
Attachment #422162 -
Attachment description: patch → NJ patch
![]() |
Assignee | |
Comment 19•14 years ago
|
||
This is the TR patch. SeqReader has become a bit ugly but I couldn't see a better way to do it. It passes the acceptance tests.
Attachment #422164 -
Flags: review?(rreitmai)
Updated•14 years ago
|
Attachment #422164 -
Flags: review?(rreitmai) → review+
Comment 20•14 years ago
|
||
Comment on attachment 422162 [details] [diff] [review] NJ patch Excellent clean-up. As you mention in the comments, we are validating the LIR that's pushed into the Writer, it would be interesting to see if some of this code could be co-opted to validate the LIR that comes out the other end. Thinking out loud, could we could wire up an instruction pump that pulls LIR from a ValidateReader and feeds it to a ValidateWriter that then pushes it to a null sink?
Attachment #422162 -
Flags: review?(rreitmai) → review+
![]() |
Assignee | |
Comment 21•14 years ago
|
||
(In reply to comment #20) > > As you mention in the comments, we are validating the LIR that's pushed into > the Writer, it would be interesting to see if some of this code could be > co-opted to validate the LIR that comes out the other end. Something like that is certainly possible, but I think we'd be getting into diminishing returns territory. In TM we have a ValidateWriter at the start of the pipeline and the end of the pipeline, and I don't think any of the stages modify LIns objects as they are returned. There's also a certain amount of sanity checking in the backends.
Comment 22•14 years ago
|
||
(In reply to comment #17) > - TR's ValidateWriter checks that the arg to LIR_ret is word-sized, but > requiring it to be int-sized is more consistent with everything else (we > should have LIR_qret and LIR_pret if necessary) and how it's used in TM and > TR doesn't seem to use it. I changed the comment in LIRopcode.tbl > accordingly as well. I think this new ValidateWriter is great, but does the above comment mean we can't use it in 64-bit TR builds until we add qret/pret? I'm also in favor of those new opcodes, but if using ValidateWriter depends on them existing, then we (TR folks) should hurry up and work on the qret patch.
![]() |
Assignee | |
Comment 23•14 years ago
|
||
(In reply to comment #22) > > I think this new ValidateWriter is great, but does the above comment mean we > can't use it in 64-bit TR builds until we add qret/pret? I'm also in favor of > those new opcodes, but if using ValidateWriter depends on them existing, then > we (TR folks) should hurry up and work on the qret patch. There isn't a dependency, because the patch just doesn't type-check LIR_ret and the two other ill-typed opcodes (LIR_qlo, LIR_live). Look for the 'XXX' marks in the patch. Bug 540368 exists for those and it can be fixed after this bug.
Comment 24•14 years ago
|
||
Comment on attachment 422162 [details] [diff] [review] NJ patch Nicely done; seems a clean structure. My only comment, which definitely falls into the hair-splitting department, is that you use the term "formals" to refer to the expected argument types of primitive operations. I've only seen it used to refer to the arguments & types of subroutines.
Attachment #422162 -
Flags: review?(jseward) → review+
Comment 25•14 years ago
|
||
Comment on attachment 422163 [details] [diff] [review] TM patch Looks ok.
Attachment #422163 -
Flags: review?(jseward) → review+
![]() |
Assignee | |
Comment 26•14 years ago
|
||
(In reply to comment #24) > My only comment, which definitely falls > into the hair-splitting department, is that you use the term "formals" to refer > to the expected argument types of primitive operations. I've only seen it used > to refer to the arguments & types of subroutines. Dude, I copied that from your original patch!
Comment 27•14 years ago
|
||
(In reply to comment #26) > Dude, I copied that from your original patch! Oh .. um .. er .. my apologies .. /me feels a bit silly now :)
![]() |
Assignee | |
Comment 28•14 years ago
|
||
Pushed, and then pushed bustage fixes: http://hg.mozilla.org/projects/nanojit-central/rev/162182c5cd78 http://hg.mozilla.org/projects/nanojit-central/rev/7befbe058634
Whiteboard: fixed-in-nanojit
![]() |
Assignee | |
Comment 29•14 years ago
|
||
http://hg.mozilla.org/tracemonkey/rev/3548d8258aca http://hg.mozilla.org/tracemonkey/rev/75704a6936a8 http://hg.mozilla.org/tracemonkey/rev/37ee0e47774d
Whiteboard: fixed-in-nanojit → fixed-in-nanojit, fixed-in-tracemonkey
![]() |
Assignee | |
Comment 30•14 years ago
|
||
Mochitest bustage fix: http://hg.mozilla.org/tracemonkey/rev/7a9913df9ca4
Comment 31•14 years ago
|
||
I'm trying to pull this into TR, but there's a serious failure in our acceptance tests, and it boils down to the revised ValidateReader. In ValidateReader::read(), we have case LIR_jtbl: ... LIns* target = ins->getTarget(); but in LIns::getTarget(), we have LIns* LIns::getTarget() const { NanoAssert(isBranch() && !isop(LIR_jtbl)); return oprnd2(); } it's not clear to me which of these is expected to be true...
Comment 32•14 years ago
|
||
Should have debugged before commenting... the line LIns* target = ins->getTarget(); should be LIns* target = ins->getTarget(i);
Comment 33•14 years ago
|
||
I took the liberty to push the fix to nj as http://hg.mozilla.org/projects/nanojit-central/rev/0a2444656722 on self-review, as this looks trivial and safe.
Updated•14 years ago
|
Whiteboard: fixed-in-nanojit, fixed-in-tracemonkey → fixed-in-nanojit, fixed-in-tracemonkey, fixed-in-tamarin
![]() |
Assignee | |
Comment 35•14 years ago
|
||
Hooray for tag checking :)
Updated•14 years ago
|
Status: NEW → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED
You need to log in
before you can comment on or make changes to this bug.
Description
•