463137 - TM/nanojit: type-check LIR

Reporter

Description

16 years ago

It would be nice if TraceMonkey developers could spend less of their time tracking down code generation errors. A verification pass that ran after LIR generation could catch some kinds of incorrect code immediately. If such a pass ran after native code generation, it could catch bugs in the assembler as well. If the pass used a type system that was sound, it would catch all segmentation faults due to JITted code, in addition to other kinds of bugs.
The verifier would be an optional pass, perhaps enabled by default in debugging builds, that would disassemble machine code into LIR, analyze the LIR to infer the types of the values being operated on, and use those types to validate memory references, arithmetic operations, and so on.

The results here would be:

- a type system for validation, whose types describe what the validator can know about values. These types would include what JavaScript calls types, shapes, and perhaps internally used values like shape numbers.

- type rules: for each primitive operation (add; subtract; fetch), a function that looks at the types of the operands and computes the type of the result --- or complains if the operation is ill-typed.

- an "abstract interpreter" for LIR that starts with the known types of the inputs and then walks the code, applying the type rules for the operations in the code.

We could stage the work as follows:

- Start with a trivial type system, perhaps with types "tagged value", "integer", "pointer", "float". Ensure that we never use the result of a float-valued LIR instruction as the address of a load (ld), etc. This allows us to get the interpreter working on the LIR vocabulary.

- Elaborate the type system to know about objects, shapes, pointers to floats, dense arrays, etc.

- Disassemble machine code to LIR, to bring the assembly pass under scrutiny as well.
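
A minimal sketch of what the first stage might look like, written in Python for brevity rather than in nanojit's C++. Everything here is an assumption made for illustration: the `Ins` record, the opcode names `param`, `add`, `fadd` and `ld`, and the choice that a load yields an unrefined tagged value. It covers straight-line code only and is not nanojit's real instruction set.

```python
from dataclasses import dataclass
from enum import Enum


class Ty(Enum):
    TAGGED = "tagged value"
    INT = "integer"
    PTR = "pointer"
    FLOAT = "float"


@dataclass(frozen=True)
class Ins:
    """One toy LIR instruction; args are indices of earlier instructions."""
    op: str
    args: tuple = ()


class IllTyped(Exception):
    pass


def expect(op, got, want):
    if got != want:
        raise IllTyped(f"{op}: expected {want.value}, got {got.value}")


# Type rules: operand types in, result type out (or IllTyped).
def rule_add(a, b):
    expect("add", a, Ty.INT)
    expect("add", b, Ty.INT)
    return Ty.INT


def rule_fadd(a, b):
    expect("fadd", a, Ty.FLOAT)
    expect("fadd", b, Ty.FLOAT)
    return Ty.FLOAT


def rule_ld(addr):
    # The address operand must be a pointer; a float here is the bug to catch.
    expect("ld", addr, Ty.PTR)
    return Ty.TAGGED  # assumption: a load yields an unrefined tagged value


RULES = {"add": rule_add, "fadd": rule_fadd, "ld": rule_ld}


def check(code, param_types):
    """Abstract interpreter: walk the code once, typing every instruction."""
    types = []
    for ins in code:
        if ins.op == "param":
            # args[0] selects one of the trace's inputs, whose type is known.
            types.append(param_types[ins.args[0]])
        else:
            types.append(RULES[ins.op](*(types[i] for i in ins.args)))
    return types
```

With these rules, `check([Ins("param", (0,)), Ins("ld", (0,))], [Ty.PTR])` succeeds, while a trace that feeds the result of an `fadd` into `ld` raises `IllTyped`.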

The type system would necessarily be specific to the layout of the objects it describes. For example, in TraceMonkey, passing a guard that checks the shape of an object would refine the validator's understanding of the type of that object and permit subsequent member accesses. So different clients of nanojit would need different type systems.
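
To make the guard example concrete, here is a small Python sketch of shape refinement. The operation names `guard_shape` and `load_slot`, the tuple encoding of the trace, and the shape numbers in `SHAPE_SLOTS` are all hypothetical; real TraceMonkey shapes carry far more information than a slot count.

```python
class IllTyped(Exception):
    pass


# Hypothetical shape table: shape number -> number of slots in that layout.
SHAPE_SLOTS = {17: 2, 42: 4}


def check_trace(trace):
    """Check a linear trace of (op, object name, operand) tuples.

    Execution only stays on trace when a guard passes, so after a
    guard_shape the validator may assume that shape for the rest of
    the trace.
    """
    known_shape = {}  # object name -> shape number proved by a guard
    for op, obj, arg in trace:
        if op == "guard_shape":
            if arg not in SHAPE_SLOTS:
                raise IllTyped(f"unknown shape {arg}")
            known_shape[obj] = arg  # refine what we know about obj
        elif op == "load_slot":
            if obj not in known_shape:
                raise IllTyped(f"slot access on {obj} before any shape guard")
            if not 0 <= arg < SHAPE_SLOTS[known_shape[obj]]:
                raise IllTyped(f"slot {arg} out of range for {obj}")
        else:
            raise ValueError(f"unknown op {op}")
    return known_shape
```

A trace such as `[("guard_shape", "obj", 17), ("load_slot", "obj", 1)]` is accepted, whereas the same `load_slot` without the preceding guard is rejected.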

This could mean that the abstract interpreter and disassembler should be part of nanojit, and the type system should be provided by nanojit's client. Or the type system could be part of nanojit and provide all the types that any of nanojit's clients need. Or perhaps the whole verifier would be TraceMonkey-specific.
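
The first of these options might be sketched as follows, in Python for brevity. The `TypeSystem` interface, its two method names, the `get_slot` opcode and the string-valued types are assumptions made for illustration; none of them correspond to an existing nanojit API.

```python
from typing import List, Protocol, Sequence, Tuple

Instr = Tuple[str, Tuple[int, ...]]  # (opcode, indices of earlier instructions)


class TypeSystem(Protocol):
    """What the client (for example TraceMonkey) would have to supply."""

    def input_type(self, index: int) -> str: ...

    def result_type(self, op: str, operands: Sequence[str]) -> str: ...


def verify(code: Sequence[Instr], ts: TypeSystem) -> List[str]:
    """Client-independent walker that would live on the nanojit side."""
    types: List[str] = []
    for op, args in code:
        if op == "param":
            types.append(ts.input_type(args[0]))
        else:
            types.append(ts.result_type(op, [types[i] for i in args]))
    return types


class ToyClientTypes:
    """A stand-in client type system that knows one object operation."""

    def input_type(self, index):
        return "object"

    def result_type(self, op, operands):
        if op == "get_slot" and operands == ["object"]:
            return "tagged"
        raise TypeError(f"ill-typed {op} on {operands}")
```

Here `verify` never mentions a concrete type, so a second client could plug in a different `TypeSystem` without touching the walker. The other two options would fold `ToyClientTypes` into one side or the other.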

Jim Blandy :jimb

Reporter

Updated

16 years ago

Summary: TraceMonkey should verify the compiler's output → TM: TraceMonkey should verify the compiler's output

Robert Sayre

Updated

16 years ago

Severity: normal → enhancement

Jim Blandy :jimb

Reporter

Updated

16 years ago

Assignee: general → jim

gabe

Updated

16 years ago

Flags: wanted1.9.2?

Jim Blandy :jimb

Reporter

Comment 1

15 years ago

There's a hitch in the idea of translating machine code to LIR.

Edwin Smith has said that LIR is SSA without phi nodes: before a join, all live values need to be spilled to memory.  If we try to translate arbitrary graphs of machine code to LIR, there's no reason the incoming machine code would be expressible in this form.  If the machine code modifies a register on both branches above a join, and then refers to that register after the join, there's no way to express the operand of that last reference: it can't point to both originators at once.

Perhaps LIR produced from machine code *could* include phi nodes.  The validator needs to check them anyway.  The 'phi' LIR would be one that the validator would understand, but not the compiler.
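
The problem at the join can be shown with a few lines of Python. This is only a sketch: the register names and instruction ids are made up, and a real translator would need a full control-flow graph rather than a single two-way join.

```python
def merge_at_join(left_defs, right_defs):
    """Merge register definitions from the two branches above a join.

    Each argument maps a register name to the id of the instruction that
    last defined it on that branch. Where the branches agree, the single
    definition is kept; where they differ, the only faithful operand is
    a phi naming both originators.
    """
    merged = {}
    for reg in sorted(left_defs.keys() & right_defs.keys()):
        a, b = left_defs[reg], right_defs[reg]
        merged[reg] = a if a == b else ("phi", a, b)
    return merged
```

For example, `merge_at_join({"eax": "i3", "ebx": "i1"}, {"eax": "i5", "ebx": "i1"})` keeps `"i1"` for `ebx` but has to produce `("phi", "i3", "i5")` for `eax`, which phi-free LIR has no way to express.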

Rick Reitmaier

Comment 2

15 years ago

Yes, you probably want to introduce a LIR_phi node and have it point to expressions (hint: you could use LIR_2 as the operand, cons-style, to chain the refs indefinitely).

Unlike in the more general case, with the current compiler we'll find that all expressions across a branch get sunk to a store.
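
A rough Python sketch of the cons-style chaining idea. `Pair` stands in for a two-operand LIR_2 node; the `None` terminator, the tuple used for the phi itself, and the rule that all sources of a phi must share one type are assumptions made for illustration, not a description of nanojit.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Pair:
    """Two-operand node: one source reference plus the rest of the chain."""
    head: str
    tail: Optional["Pair"]


def make_phi(sources):
    """Build a phi whose single operand is a chain of Pair nodes."""
    chain = None
    for src in reversed(sources):
        chain = Pair(src, chain)
    return ("phi", chain)


def phi_sources(phi):
    """Walk the chain and return the source references in order."""
    _, chain = phi
    out = []
    while chain is not None:
        out.append(chain.head)
        chain = chain.tail
    return out


def phi_type(phi, types):
    """Validator rule (assumed): every source of a phi has the same type."""
    seen = {types[src] for src in phi_sources(phi)}
    if len(seen) != 1:
        raise TypeError(f"phi joins values of different types: {sorted(seen)}")
    return seen.pop()
```

Because each `Pair` holds only two operands, a phi with any number of incoming values needs no new variable-arity node, which is the appeal of chaining.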

- First attempt at a typechecker for LIR (patch, 36.50 KB), Julian Seward [:jseward], 15 years ago
- NJ patch (patch, 55.32 KB), Nicholas Nethercote [inactive], 15 years ago; jseward: review+, rreitmai: review+
- TM patch (patch, 4.70 KB), Nicholas Nethercote [inactive], 15 years ago; jseward: review+
- TR patch (patch, 13.09 KB), Nicholas Nethercote [inactive], 15 years ago; rreitmai: review+