Closed Bug 461316 Opened 16 years ago Closed 15 years ago

Re-engineering the verifier to break apart type modeling, optimization, and code generation

Categories

(Tamarin Graveyard :: Virtual Machine, defect)

x86
macOS
defect
Not set
normal

Tracking

(Not tracked)

VERIFIED FIXED

People

(Reporter: lhansen, Assigned: jodyer)

References

Details

Attachments

(8 files, 2 obsolete files)

At present the verifier combines verification and type modeling (based on an unwritten specification), analysis of optimization opportunities (some based on representational detail that is not present in the VM specification), and code generation into one blob of code.

This structure has led to problems.  In particular, as the verifier and the analyzer are integrated, verification has come to mean "anything we can and wish to prove", which is not good; also, what we wish to prove is sometimes well hidden.  There are other problems too, integration of the word-code translator into the verifier has been difficult because the translator has code generation needs that are different from those of the two JITs.

We need to break out a clear verifier spec, and redo the verifier code, probably in some sort of pipeline form, that separates verification from other tasks while making the other tasks possible by providing analysis results as appropriate.
Taking ownership. First step will be to write down the spec for modeling types and verifying code.
Assignee: nobody → jodyer
> At about line 350 I verifier.cpp:
> 
>                 if (!blockEnd || !blockState->initialized)
>                 {
>                     checkTarget(pc);
>                 }
> 
> I'm trying to understand the reason that we would merge the state of the
> previous instruction with that of the current block if control does not flow
> from that instruction into the current block (i.e. blockEnd == true). This
> appears to be the effect of the second part to the if condition above?
> 
> Do you remember what case the '!blockState->initialized' condition is
> supposed to handle?
> 
> Jd

this logic is an artifact of the verifier wanting to do a single linear pass through the code.  if the code is an inverted loop like this:

init
goto check
loop:
body
check:
if cond goto loop

then when reaching loop, we have no flow information since we havent seen the predecessor yet, because the loop was inverted.

Making the verifier strictly follow control flow paths, and support arbitrary block ordering, will require new logic to determine the reverse-postorder block list, or code to iterate until a fixed point.
But we do want the verifier to follow control flow paths, right?  The way it is now, it's hard / untractable to generate correct code for some cases because the verifier assumes control flow that does not in fact exist.
of course we do!  just pointing out the current rationale
Adding reference to related bug
Depends on: 412532
Depends on: 413522
No longer depends on: 412532
On or about line 445 in Verifier.cpp:

ExceptionHandler* handler = &info->exceptions->exceptions[i];
if (pc >= code_pos + handler->from && pc <= code_pos + handler->to)
{
   // Set the insideTryBlock flag, so the codegen can
   // react appropriately.
   state->insideTryBlock = true;

   // If this instruction can throw exceptions, add an edge to
   // the catch handler.
   if (opcodeInfo[opcode].canThrow || pc == code_pos + handler->from)
      ...

This code appears to add an edge from the start of a try block to each of the catch handlers of that block. Why and edge from the start of a try block?

Jd
iirc the idea was to establish state for the catch block before any exception-edges are verified.  having thought more about it, i don't think it's necessary.  

since there never is such a control-flow edge, if that check turns out to be important then it must be covering up a different issue.  lets plan on removing it.
I want to check the verifier spec into Tamarin. Where does it belong? In my local workspace it is in 

        ./tamarin-redux/doc/verifier.html

'doc' being a new directory. Thinking forward, where do we want to put design and other reference documents? That's probably were verifier.html should go.

Jd
A "doc" subdirectory seems right for these things.
Flags: flashplayer-triage+
Flags: flashplayer-qrb+
Blocks: 469843
Blocks: 469836
preliminary performance testing shows on difference in sunspider, but otherwise parity with TR. The difference might be explained by a rogue type cast in the jitted code. Investigating...

Jd
Attachment #358259 - Flags: review?(edwsmith)
preliminary comments & questions

* Should TeeWriter and NullWriter move from Coder.h to a cpp file?  filters dont beenfit from inlining when called through virtual functions on CodeWriter.  NullWriter appears to be dead code, maybe it should be braketed by #ifdef _DEBUG or something?  CodeReader also dead code, apparently.

* in theory, the verifier is the tip of the pipeline, feeding stuff to downstream filters.  Should Verifier then take a CodeWriter instead of a CodegenLIR* jit?  how close are we to erasing traces of lir/mir/wordcode from Verifier.cpp? (except maybe from something that sets up the pipeline?)

* from what i can tell, many ops call coder->write, which indirectly calls the jit or wordcode translator, while some still directly call the jit/wc translator.  these must be doing stuff that doesnt (yet) fit the pipeline api?

* how deeply should i look at the Edge/Block/CfgWriter code?  at first glance i'm wondering why its using int id's for blocks and edges and maintaining a list, as opposed to directly using pointers to edges in the succ/pred lists.
(In reply to comment #11)
> preliminary comments & questions
> * Should TeeWriter and NullWriter move from Coder.h to a cpp file?  filters
> dont beenfit from inlining when called through virtual functions on CodeWriter.
>  NullWriter appears to be dead code, maybe it should be braketed by #ifdef
> _DEBUG or something?  CodeReader also dead code, apparently.

Agreed. Compile out NullWriter and move implementation of CodeWriter to Coder.cpp.

> * in theory, the verifier is the tip of the pipeline, feeding stuff to
> downstream filters.  Should Verifier then take a CodeWriter instead of a
> CodegenLIR* jit?  how close are we to erasing traces of lir/mir/wordcode from
> Verifier.cpp? (except maybe from something that sets up the pipeline?)

There is still work to do to remove all direct uses of lir, mir and wordcode translator (see comment below). By my count there are 8 opcodes that directly call the backends and various other uses that appear before and after the opcode emitters are called. In principle, when they are fixed we can pass in a CodeWriter reference to Verifier::verify(). 

> * from what i can tell, many ops call coder->write, which indirectly calls the
> jit or wordcode translator, while some still directly call the jit/wc
> translator.  these must be doing stuff that doesnt (yet) fit the pipeline api?

References to jit and translator remain because of details that are difficult to abstract into the existing pipeline calls, or work that the verifier is doing that not all filters understand (early binding to accessors is not supported by wc translator). These can and should be erased in time.

> * how deeply should i look at the Edge/Block/CfgWriter code?  at first glance
> i'm wondering why its using int id's for blocks and edges and maintaining a
> list, as opposed to directly using pointers to edges in the succ/pred lists.

This code is just to demonstrate that the pipeline can be used for other purposes but currently has no practical value. IIRC, the use of int labels rather than pointers was simply because int print nicer. Anyway, this code should be compiled out for now. 

Jd
Attachment #358259 - Flags: review?(edwsmith) → review+
Patch landed, but keeping this bug open because there is still work to do. In addition to the issues mentioned in the comment above, Steven notes:

> Hey Jeff, there are two (remaining) ifdef DEBUGGER chunks in Verifier:
> 
> #if defined DEBUGGER && defined AVMPLUS_WORD_CODE
>         XLAT_ONLY( if (translator && core->debugger()) translator-
> >emitOp0(code_pos, WOP_debugenter); )
> #endif
> 
> This looks all good to me, but I’m wondering where WOP_debugexit is
> emitted?
> 
>             case OP_debugfile:
>                 //checkStack(0,0)
>                 #if defined(DEBUGGER) || defined(VTUNE)
>                 checkCpoolOperand(imm30, kStringType);
>                 #endif
>                 coder->write(state, pc, opcode);
>                 break;
> 
> This makes me nervous, shouldn’t debugger and non-debugger builds verify
> the same way? Any reason we should verify the operand in all builds?

This too needs to be investigated.

Jd
agreed, we should validate operands the same way regardless of configuration.
Attached patch Patch for MIRSplinter Review
Some debugger-conditional code in LIR needs to be mimicked in MIR as well.
Attachment #359112 - Flags: review?
Attachment #359112 - Flags: review? → review?(jodyer)
Attachment #359112 - Flags: review?(jodyer) → review+
Comment on attachment 359112 [details] [diff] [review]
Patch for MIR

pushed to redux as changeset:   1352:660082c93800
A few calls to Verifier::emitCheckNull() got lost in the refactoring, but still appear to be necessary.
Attachment #359561 - Flags: review?(jodyer)
Better fix per Jeff's suggestion: JIT is emitting the checknulls, we just have to mark the nonNull flag appropriately
Attachment #359561 - Attachment is obsolete: true
Attachment #359568 - Flags: review?(jodyer)
Attachment #359561 - Flags: review?(jodyer)
Comment on attachment 359568 [details] [diff] [review]
better patch for missing emitCheckNull calls

nice work!
Attachment #359568 - Flags: review?(jodyer) → review+
Comment on attachment 359568 [details] [diff] [review]
better patch for missing emitCheckNull calls

pushed to redux as changeset:   1369:5d20efe63165
OP_getslot is subtly broken: it maintains the value of notNull on top of stack, but after a getslot notNull must be false (since the value of the slot isn't known to be non-null). This means subsequent code might skip NPE checks leading to crashes.

I did a quick skim of other calls to state->setType() and didn't see any obviously wrong, but someone more familiar with the code than me should probably check further.

It also seems that the responsibility of calling emitCheckNull (or setNotNull) isn't completely clear; in some cases the verifier does it, in some cases the jit does it. might be worth revisiting this to make it easier to get right.
Attachment #359856 - Flags: review?(jodyer)
Comment on attachment 359856 [details] [diff] [review]
Patch for OP_getslot

that should do it.
Attachment #359856 - Flags: review?(jodyer) → review+
Comment on attachment 359856 [details] [diff] [review]
Patch for OP_getslot

pushed to redux as changeset:   1370:1b4672f57004
So there appears to be two things that need to be done before we can close this bug:

1/review the code of other such inconsistencies
2/compare an abstract form of the -Dverbose output for existing test cases to prove to ourselves that there aren't other differences lurking

Also important, if not a blocker, is the availability of a tool (an assembler) that allows us to write white box tests of ABC code so that we can exercise the verifier and the code generators in ways that we cannot with ASC.

Jd
Note, I am now blocked on the fix for Verifier.cpp that allows AVMPLUS_ABC_INTERPRETER to be enabled.
Attachment #360315 - Flags: review?(lhansen)
previous patch did not tee when word code was enabled. this patch fixes that and is otherwise simpler.

Jd
Attachment #360315 - Attachment is obsolete: true
Attachment #360315 - Flags: review?(lhansen)
Attachment #360328 - Flags: review?(lhansen)
Attachment #360328 - Flags: review?(lhansen) → review+
Blocks: 477133
Status: NEW → ASSIGNED
The last attachment is a diff of verbose output for the test case that broke Steven's flex integration, with the current patch applied. There are no signficant differences that I can see.

Jd
Attachment #361008 - Flags: review?(edwsmith)
Attachment #361008 - Flags: review?(edwsmith) → review+
Comment on attachment 361008 [details] [diff] [review]
eliminate differences in verbose output of old and new verifier

Does it make sense to have the verifier update the state instead of the jit, even though the state is not used by the interpreter?  specifically the state changes that happen in emitCheckNull(), emitCoerce(), and emitPrep()

I think the patch looks correct and can land; but if you agree we should tee this up as further cleanup work.
(In reply to comment #31)
> (From update of attachment 361008 [details] [diff] [review])
> Does it make sense to have the verifier update the state instead of the jit,
> even though the state is not used by the interpreter?  specifically the state
> changes that happen in emitCheckNull(), emitCoerce(), and emitPrep()

Absolutely. This is a work in progress. Of the methods you mention only emitCoerce() is affecting the type state, as far as I can tell. The reason this is still the case is that I wanted to move the call to emit a null check into the code generators that care about it. But doing so hides the state change that mirrors that check in the code generator. We call setNotNull() when control returns back the the verifier, but that is normally too late for correct code generation. Eventually the null-ness of a reference will not be checked by the verifier and the calls to setNotNull can be removed from the verifier code.

Jd
Fix pushed: Revision: 1423:d43606b6346c
(In reply to comment #32)

> Absolutely. This is a work in progress. Of the methods you mention only
> emitCoerce() is affecting the type state, as far as I can tell. The reason this
> is still the case is that I wanted to move the call to emit a null check into
> the code generators that care about it. But doing so hides the state change
> that mirrors that check in the code generator. 

"emits a type coercion", rather.

> We call setNotNull() when
> control returns back the the verifier, but that is normally too late for
> correct code generation. Eventually the null-ness of a reference will not be
> checked by the verifier and the calls to setNotNull can be removed from the
> verifier code.

This is true, but not particularly relevant to emitCoerce().

The point I was trying to make is that the jit needs to change the type state in emitCoerce() because that state is used by the jit before control returns back to the verifier. I should also add that this behind-the-scene mucking with the type state does not (or at least, should not) affect the type state visible inside the verifier or other code generators (e.g. the word code translator). The official type state is maintained by the verifier.

Jd
Blocks: 473441
the last patch causes my build to fail many acceptance tests (and hang on the final one), with lots of assertion failures in debug builds.
This patch is more conservative than the last one. It restores the calls to emitCoerce and emitCheckNull to the verifier, as well as reverts other changes in the first refactoring.

Jd
Attachment #361818 - Flags: review?(edwsmith)
Attachment #361818 - Flags: review+
Attachment #361818 - Flags: review?(edwsmith)
changeset:   1439:9340af10d0d6

Pushed. 

Further investigation shows that all of the changes in the jit output due to this fix occur, in reverse, in the stage 1 of refactoring.
Blocks: 478115
Status: ASSIGNED → RESOLVED
Closed: 15 years ago
Resolution: --- → FIXED
Blocks: 413522
No longer depends on: 413522
Resolved fixed engineering / work item that has been pushed.  Setting status to verified.
Status: RESOLVED → VERIFIED
marking in-testsuite? but really the whole verifier needs a thorough testing strategy, to:
  - make sure abc's that should pass, do
  - make sure ones that should not pass, don't
  - ensure that side effects of verification are correct, other than pass/fail status
  - ensure that failure modes are correct (error messages, etc)
  - ensure that failure modes don't change by accident (one error hides another, for example), maybe via differential testing.

if not this bug, then a separate tracker bug should be opened.  since comprehensive testing may require various creative invalid ABC's, abcasm may need to grow new features.
Flags: in-testsuite?
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: