Closed
Bug 413522
Opened 18 years ago
Closed 15 years ago
Upgrade verifier semantics (was: Verifier should not ignore data flow)
Categories
(Tamarin Graveyard :: Virtual Machine, defect, P2)
Tamarin Graveyard
Virtual Machine
Tracking
(Not tracked)
VERIFIED
FIXED
flash10.1
People
(Reporter: lhansen, Assigned: edwsmith)
References
Details
(Whiteboard: Tracking, verifier-cleanup)
Attachments
(1 file, 7 obsolete files)
I've had problems (and Erik reports them too) with getting the verifier to approve code generated for "finally" handling because of the curious control flow that often results. The emitted code is not usually at fault in the sense that it does not violate bytecode invariants; the problem is that the verifier ignores the data flow in the basic block graph, propagating register and stack contents to blocks in linear succession whether control flows that way or not.
This is surprising and does not seem necessary, and IMO the verifier should be fixed to honor data flow when propagating information.
The only open issue is whether such a rewritten verifier would reject programs that are now considered valid. If so we could just have a bit for this in the ABC, but a simple proof that nothing's lost would be OK. It would go along the lines of, "Data propagated at backward edges must agree with data propagated along (nonexistent) forward edges. Ergo data propagated at backward edges do not add information, for existing code, or equivalently, data propagated along (nonexistent) forward edges do not add information. Ergo one edge can be summarily removed." Discuss.
Assignee | ||
Comment 1•18 years ago
|
||
(In reply to comment #0)
> The only open issue is whether such a rewritten verifier would reject programs
> that are now considered valid.
I'm having a hard time imagining how that could happen, since (I think) with this change the verifier would be less restrictive than it currently is.
At first glance this change could be easy without losing the verifier's O(N) pass. The current main loop loops over the whole body of the method. change it to loop over one basic block at a time.
Process each block in reverse postorder according to control flow. This approach should find loop edges whether they are forward or backwards, and not misclassify non-loop edges if they are backwards
now at first, this also seems hard because MIR code is generated inline with verification. but, at second glance, this seems exactly right.. in fact we ought to generate MIR code in reverse postorder, regardless of the linear order of abc blocks. no?
this would cause inverted loops to be un-inverted. something to monitor but not likely to matter in practice except for the smallest/tightest loops. comments?
> proof...
agree.
"data propagated at backwards edges" cannot modify data you already have at the back-edge target, regardless of how it got there. when a target has data from more than one edge, the merge function is monotonic towards (*) and "maybe-null". propagating data from one less edge (the phantom forward edge) means the data there is "closer" (according to the monotonic merge) to the data from the back edge.
verifying blocks in reverse postorder may also simplify the way unreachable code is ignored, no need to scan forward to the next reachable instruction.
Assignee | ||
Updated•17 years ago
|
Assignee: nobody → edwsmith
Assignee | ||
Comment 3•17 years ago
|
||
first layer, just cosmetic fixes for verbose/debugging.
Attachment #305744 -
Flags: review?(stejohns)
Assignee | ||
Comment 4•17 years ago
|
||
I'm not 100% happy with this fix, but it's a first draft worth reviewing
new behavior:
pass 1: identify blocks
pass 2: link blocks in reverse postorder
pass 3: verify blocks in reverse postorder
when MIR is enabled, IR is generated in reverse postorder as well, so loops will be un-inverted, if the compiler inverted them. Dataflow information is only propagated along explicit and exception edges.
Illegal dataflow merges don't result in an immeiate error. The variable is marked killed. Reading from a killed variable causes an error. net: incompatible temps that are merged but never accessed again, are allowed.
Unreachable code is never verified.
machine compatibility is ignored in the dataflow merge() function, when the method will be interpreted, since the interpreter operates on * (Atom) values and still contains all necessary runtime checks.
problems
* three passes is excessive. ideally block identification could be concurrent with reverse-postorder traversal.
* dead temps that converge in a finally block then fan out, can be merged with a live exception variable, which is then killed.
* methods with exceptions are forced to be interpreted due to the above problem, and a MIR bug with merging dataflow along exception edges.
Attachment #305749 -
Flags: review?(lhansen)
Assignee | ||
Comment 5•17 years ago
|
||
one more problem:
* pass 2 is recursive in C++. should be re-written to avoid recursion, we can record the traversal state in the block list, even using union{} if we need a few scratch fields.
Updated•17 years ago
|
Attachment #305744 -
Flags: review?(stejohns) → review+
Assignee | ||
Comment 6•17 years ago
|
||
Comment on attachment 305744 [details] [diff] [review]
Cosmetic fixes
pushed, 409:3b774fe73307
Attachment #305744 -
Attachment is obsolete: true
Assignee | ||
Updated•17 years ago
|
Priority: -- → P1
Reporter | ||
Comment 7•17 years ago
|
||
Comment on attachment 305749 [details] [diff] [review]
First draft, verify in reverse-postorder
Review no longer relevant.
Attachment #305749 -
Flags: review?(lhansen)
Reporter | ||
Updated•17 years ago
|
Assignee: edwsmith → jodyer
Assignee | ||
Updated•16 years ago
|
Turning into a tracker bug.
Summary: Verifier should not ignore data flow → Upgrade verifier semantics (was: Verifier should not ignore data flow)
Comment 9•16 years ago
|
||
Patch didn't land in this sprint and needs a better perf testing strategy (and a couple of correctness bugfixes) before it can land.
Comment 10•16 years ago
|
||
Cleaned up the patch a little bit and fixed most of the acceptance test failures.
Attachment #376162 -
Attachment is obsolete: true
Comment 11•16 years ago
|
||
Key learnings from this patch are:
1. Emulating the Verifier's FrameState semantics is largely an "all or nothing" proposition. Either do it exactly, in which case use the FrameState, or recover the intent of the FrameState processing and find a way to express it in LIR.
2. Modeling instructions or instruction sequences with side effects to the frame can be done by creating a new LIR instruction sequence that's a load/store of the existing frame element. This pair should be elided by the emitter, but at a LIR level it gives one two otherwise equivalent instructions, one of which can be annotated "with" the side effect, one "without." Or what have you. Personally, I like 'em with, and with mustard.
2a. One difference between the first and second cleaned-up iteration of this patch is a side-effect clone in the procedural not-null checking. This is not required by the current single-pass processing, but a multi-pass LIR postprocessor that cared about this side effect would need that.
3. Many of the ad-hoc setNotNull() calls are due to a call to a helper (e.g., newcatch) that's known to return not-null -- but the call insn, which is correctly annotated not-null, goes through a atomToNativeRep transformation.
3a. The biggest change from the POC (which had poor hit ratios vs. the Verifier) to this patch (which consistently beats the Verifier to date) is code in the NullCheckFilter to propagate not-null information across load and store instructions. Similarly, we could detect code going through atomToNativeRep transformations, or not-null preserving transformations generally. Probably want to generate these pattern matchers from a grammar, similar to the wordcode peephole generator.
Comment 12•16 years ago
|
||
Marking as "Future", resetting assignment.
Assignee: jodyer → nobody
Flags: flashplayer-qrb+
Reporter | ||
Updated•16 years ago
|
Priority: P1 → --
Target Milestone: flash10.1 → Future
Reporter | ||
Updated•16 years ago
|
Whiteboard: Tracking
Assignee | ||
Updated•15 years ago
|
OS: Windows XP → All
Hardware: x86 → All
Assignee | ||
Updated•15 years ago
|
Attachment #305749 -
Attachment is obsolete: true
Assignee | ||
Updated•15 years ago
|
Attachment #376273 -
Attachment is obsolete: true
Assignee | ||
Comment 13•15 years ago
|
||
Posting this larg-ish patch for early feedback.
Attempts to solve individual problems with the verifier have failed, shifting the cases under which you can get VerifyErrors, but not removing any main class of error.
The problems this patch addresses are:
- unwanted verify errors due to backwards data flow that ASC can't easily manage
- unwanted propagation of dataflow along non-existant control flow edges
- unwanted restrictions on type-meet for primitive types
- unwanted special powers for OP_kill
- verifier behavior that depends on the execution mechanism
Here's what it does:
(1) Upgrades the verifier semantics by only propagating type and
notNull information along real control flow edges.
(2) removes VerifyErrors caused by the following situations:
- a notNull true->false transition along a back-edge.
- a change in an expression's type caused by dataflow along a back edge
- meet of incompatible machine types (forwards or backwards), e.g. int and *
(3) tightens the checking of exception edges in try blocks:
- do not scan forwards for OP_kill, because suppressing
machine type dataflow merges is no longer needed. OP_kill has
no more special powers.
(4) Verifies code the same way regardless of jit/interp/wordcode, or debugger/release configuration, which addresses bug 542055
(5) Iterates to a fixed-point solution, to accomplish (1) and (2) without causing new verify errors.
(6) Introduces an array of 1-byte tags in jit'd code which store the runtime representation of a value, which allows any types to merge at control flow joins, without adding code on CFG edges. The tagging scheme is based on SlotStorageType and is agnostic about the Atom representation.
(7) optimizes out the tags completely, in nearly all code. Iterating to a fixed point during typechecking makes this possible. Tags are preserved when more than one represenation is possible at a call site, at which point a helper is used to sniff the tag and manufacture an Atom (the static type is always Object or * in such cases).
(8) preserves copy-propagation of notNull within the jit. Combined with (5), jit'd code emits strictly fewer null checks than before.
Assignee: nobody → edwsmith
Status: NEW → ASSIGNED
Comment 14•15 years ago
|
||
(In reply to comment #13)
Meta-comment: Is there a recommended order to review? I'm starting at the top.
CodegenLIR.cpp
General disclaimer: I paged rapidly through most of the small changes to code-gen.
- Comments on the intent of CopyPropagation::reachable would be good. Takes a bit of back-and-forth searching to figure it out, and the place where it's used as a control value is not very prominent compared to its effects.
- I am taking some of CodegenLIR::localGetp on faith since I don't know the SST data structures. Looks amusing.
- bool isValid(int) move the @pre up to the function level? Doxygen will pick it up if you use three / chars, e.g. /// @pre
- So we're eliding dead stores within blocks now?
- Lines 1386-1391 _save_eip = _ef = NULL : is this a latent bug, or necessitated by this patch? Either way a comment would be helpful.
- Lines 2798-2827 Is there some reason to use the mask vs. just a straight if (bt & THIS || bt & THAT)? I realize it looks snazzy and is slightly more micro-efficient but from the reader's POV it looses heavily.
- CodegenLIR::emitTypedCall so all parameters to a function come in as atoms?
- CodegenLIR::emitSetslot since the first overload delegates to the second, is the emitPrep() in the first redundant?
- CodegenLIR::patchLater put another / in front of that comment and you win free Doxygen goodness.
Assignee | ||
Comment 15•15 years ago
|
||
(In reply to comment #14)
> (In reply to comment #13)
> - Comments on the intent of CopyPropagation::reachable would be good. Takes a
> bit of back-and-forth searching to figure it out, and the place where it's used
> as a control value is not very prominent compared to its effects.
done.
> - I am taking some of CodegenLIR::localGetp on faith since I don't know the SST
> data structures. Looks amusing.
Will add explanitory comment and rename. SlotStorageType is an enum already in use for slots, one enum for each kind of value the VM deals with natively: Atom, ScriptObject*, Namespace*, String*, int32, uint32, bool32, and double. these are the possible native representations for a variable.
sst_mask is an 8-bit mask, one bit for possible SlotStorageType, representing the possible representations of a given var at a given point in the control flow. The corresponding runtime value (the byte in tags[]) will be a SlotStorageType value (0-7).
> - bool isValid(int) move the @pre up to the function level? Doxygen will pick
> it up if you use three / chars, e.g. /// @pre
er, okay. any apparent attempt to make my comments work in Doxygen is accidental, but i'll tidy up.
> - So we're eliding dead stores within blocks now?
We have been for quite some time. See CodegenLIR::deadvars() in tamarin-redux. This patch extends the analysis to include both stores to vars[] and stores to tags[].
> - Lines 1386-1391 _save_eip = _ef = NULL : is this a latent bug, or
> necessitated by this patch? Either way a comment would be helpful.
necessary, will add comment.
> - Lines 2798-2827 Is there some reason to use the mask vs. just a straight if
> (bt & THIS || bt & THAT)? I realize it looks snazzy and is slightly more
> micro-efficient but from the reader's POV it looses heavily.
well, it would be (bt == THIS || bt == THAT), but i can make it more obvious.
> - CodegenLIR::emitTypedCall so all parameters to a function come in as atoms?
emitTypedCall is used when the Verifier has found an opportunity to early bind, and has already coerced arguments from whatever native type is discovered, to the required types. emitTypedCall() then just double-checks (via assert) that the arg types are already correct.
emitCoerceCall is used when the jit finds an opportunity to early bind that the verifier did not. It does the coersions using the signature of the callee, but does not mutate FrameState. (using the combination of Verifier::emitCoerceArgs() and writeMethodCall(), from within the jit, is being expressly avoided because we don't want the JIT to mutate the verifier's state [FrameState]).
> - CodegenLIR::emitSetslot since the first overload delegates to the second, is
> the emitPrep() in the first redundant?
good catch, actually the second one is redundant. Removed.
> - CodegenLIR::patchLater put another / in front of that comment and you win
> free Doxygen goodness.
cool
Comment 16•15 years ago
|
||
(In reply to comment #15)
> (In reply to comment #14)
> > (In reply to comment #13)
> > (bt & THIS || bt & THAT)? I realize it looks snazzy and is slightly more
> > micro-efficient but from the reader's POV it looses heavily.
>
> well, it would be (bt == THIS || bt == THAT), but i can make it more obvious.
We can have our cake and eat it too by commenting it.
> emitTypedCall is used when the Verifier has found an opportunity to early
> bind, and has already coerced arguments ...
> emitCoerceCall is used when the jit finds an opportunity to early bind that
> the verifier did not. ... but
> does not mutate FrameState.
Those would be great method-level comments...
Assignee | ||
Updated•15 years ago
|
Assignee | ||
Updated•15 years ago
|
Priority: -- → P2
Whiteboard: Tracking → Tracking, verifier-cleanup
Target Milestone: Future → flash10.1
Assignee | ||
Updated•15 years ago
|
Assignee | ||
Comment 17•15 years ago
|
||
(In reply to comment #15)
> > - bool isValid(int) move the @pre up to the function level? Doxygen will pick
> > it up if you use three / chars, e.g. /// @pre
>
> er, okay. any apparent attempt to make my comments work in Doxygen is
> accidental, but i'll tidy up.
DebuggerCheck needs a bunch of TLC. I have a separate patch started for that that will incorporate these changes.
Assignee | ||
Comment 18•15 years ago
|
||
Changes since first draft:
* CodegenLIR factored to never modify FrameState. Internally all references to FrameState are via const FrameState* and/or const Value&.
* cleanups and comments
* rebased
Assignee | ||
Updated•15 years ago
|
Attachment #427239 -
Attachment is obsolete: true
Assignee | ||
Comment 19•15 years ago
|
||
Changes since last patch:
* strengthened notNull tracking; sync state with Verifier at block entries.
* fixed Verifier bug where some bugs were incorrectly marked as loop headers
* suppress unnecessary coerce(Namespace)
* print notnull state in verbose mode
* merged & rebased with cleanup patches
Consider this "RC" status.
Attachment #428452 -
Attachment is obsolete: true
Assignee | ||
Updated•15 years ago
|
Attachment #429212 -
Flags: review?(tharwood)
Attachment #429212 -
Flags: review?(swilkinson)
Attachment #429212 -
Flags: review?(stejohns)
Attachment #429212 -
Flags: review?(rreitmai)
Attachment #429212 -
Flags: review?(jodyer)
Comment 20•15 years ago
|
||
(In reply to comment #19)
> Created an attachment (id=429212)
Comments through line 2149 of the new CodegenLir.cpp
CopyPropagation could use a rename, it's becoming a more general static analysis stage. Good class-level comment.
CopyPropagation::varTracker and tagTracker could use a comment
CopyPropagation::syncNotNull()
Why AvmAssert(checked->isEmpty())? I know not-null can't propagate past a loop header in this design, but what's the invariant being checked? A naive view would be that checked is just dead and should be cleared.
(helpful comments on checked and bits at the class level)
CopyPropagation::trackForwardEdge
Comment on the first assert is stale, s/trackEdge/trackForwardEdge/
So what does it mean if varTracker[i] is null? Just that no useful info has been discovered about that var's state?
General suggestion: BitSet could use methods to do intersect, union, etc.
CopyPropagation::checkBackEdge
Thanks for the truth table. Might rename the variable e.g. s/current/srcNotNull/
CopyPropagation::clearVarState
vs. clearState() -- this could use a comment on when clearVarState() gets called outside clearState(). How odd can the results be if the debugger modifies a var, but we have a copy of that var tracked as not-null? I presume it's kinda like modifying storage of a var that's in a register. Do we already account for that in some fashion?
CopyPropagation::trackTagStore
The assertion checks that d is aligned on TAGSIZE, so to speak? How about a macro? I see the same assertion elsewhere.
The logic that finds the tagTracker index by dividing d by tagsize is a hard to follow. Not so much the actual logic, or why d is so wide compared to the number of vars, but how the input parameter d is initialized; I chased it up to insStore() but didn't follow it further.
... especially since trackTagLoad() doesn't do a symmetric computation.
... oh, but insLoad() does. This is all a little hard to grok in open code.
CodegenLIR::localGet
So is it OK if the sst mask and the traits don't match? Looks OK from the storage compatibility perspective. Should that change the sst mask? Or are these types always treated as an equivalence class?
CodegenLIR::callIns
Way out of scope for this, but wouldn't it be swell if we knew whether methods could throw or not?
Why does !ci->_cse need to be tested? Because that means this call can be elided?
General comment: this new branch-tracking discipline is much better than the old way!
CodegenLIR::writeFixExceptionsAndLabels
I'm waiting with bated breath for a writeFixExceptionsAndLabels that has guts.
Maybe comment here that this one doesn't need any logic... because?
Comment 21•15 years ago
|
||
FYI, patch doesn't apply cleanly against TR or TR-argo, and bb59490e5158 doesn't appear to be a known revision...
Comment 22•15 years ago
|
||
(In reply to comment #19)
> Created an attachment (id=429212) [details]
CodegenLir.cpp lines 2161-3112
CodegenLIR::coerceArgs and CodegenLIR::emitConstructCall
Helpful comments. Move 'em up to the method level?
A general question for my understanding: emitConstructCall and emitCoerceCall both assert that the MethodInfo is resolved. Is there any circumstance where a getMethodSignature() call would be valid before the MethodInfo was resolved?
CodegenLIR::emitCoerceCall
re: the question on early binding and assertion: a failure at this point is not going to produce workable code or a good diagnostic, so an upstream check would be better. I presume this is a future/robustness issue?
Line 2861 -- new patch loses the FIXME comment from the old code, no indication of a fix. Typo or was this a non-issue? (64-bit value in 32-bit storage)
Line 3006 -- stale comment, code doesn't do nothing any longer
Line 3015 -- There go them three integer types rompin' around together again.
... and the code doesn't "do nothing,' another stale comment
Line 3089 -- Stale comment, the verifier doesn't remember, it's manifest here in the LIR
Comment 23•15 years ago
|
||
(In reply to comment #19)
> Created an attachment (id=429212) [details]
> (v3) new verifier semantics
CodegenLIR.cpp lines 3114-end (of CodegenLIR)
CodegenLIR::emitTypedCall -- the name is a little ambiguous; does this do type analysis and emit a call, or assume the caller did type analysis? The comments on the function explain it very well, but they're not necessarily visible to a reader in the hinterlands of CodegenLIR. emitEarlyBoundCall() would be more explicit.
Lines 3593, 3620 -- any reason to keep the "// will be patched" comment?
CodegenLIR::writeEpilogue
Line 5063 -- inconsistent tabbing.
Line 5110 -- helpful comment.
CodegenLIR::emitLabel
General comment on LIR labels: "patches" is nomenclature that relates to the implementation. "inEdges" goes to intent. OTOH if "patches" is ingrained in the nanojit community's collective consciousness then this sug. looses.
CodegenLIR::branchToLabel
Line 5441 - so lirout->insLabel() can return NULL? Guess I'd better have a look at it.
... any circumstances other than unconditional branch? Eliding an unconditional would be easier to understand if done be an explicit check beforehand, rather than testing the result for NULL>
the (labelIns != 0) test makes labelIns look like an int. (labelIns != NULL) would be more self-evident.
CodegenLIR::branchToPc
The routine's name is a little confusing when read elsewhere in code. It means "branch to start of block identified by this program position" but branchToPc keeps making me think it means "branch from some other part of the program to this position."
Comment 25•15 years ago
|
||
(In reply to comment #19)
> Created an attachment (id=429212) [details]
Reviewing all the files up to Verifier.cpp
Trait::bt2sst
It's too bad that we lose the assert when storing void into a slot. Is it too much churn to make bt2sst(bt) delegate to bt2sst(bt,false) (i.e., don't allow void stores into slots) and have the presumably fewer callers who need this call bt2sst(bt,true)?
Comment 26•15 years ago
|
||
(In reply to comment #19)
Verifier.cpp Lines 1-404
General comment: brace style in this file is getting to be a complete hash.
Line 133: Why is it necessary to initialize verbose_out when verify_only is set but verbose is not? Funny little thing to see there.
Great comment @line 184 re: duplicate function definition. Always wondered how that came about.
Lines 315-334: tabs s/b spaces
Line 327: Why is writePrologue() called twice? (Also called at line 360.)
Assignee | ||
Updated•15 years ago
|
Attachment #429212 -
Flags: review?(swilkinson) → review?(siwilkin)
Comment 27•15 years ago
|
||
Comment on attachment 429212 [details] [diff] [review]
(v3) new verifier semantics
Removed SJ as reviewer as he is OOO this month.
Attachment #429212 -
Flags: review?(stejohns)
Comment 28•15 years ago
|
||
(In reply to comment #19)
[nits]
CodegenLIR::writeOpcodeVerified (Line 1933)
--------------------------------------------
What is the reason for setting this->state to NULL?
CodegenLIR::state's nullness is never checked for, and is always loaded with
the current verifier state at the top of each CodegenLIR::writeXXXX method that
may require it.
Verifier.h
--------------------------------------------
If you're changing getFrameState() and a few other things to be private, then
you might as well go the whole hog and do the same for
parseExceptionHandlers(), blockStates, state, max_scope, max_stack, info, ms,
pool and emit_pass.
Assignee | ||
Comment 29•15 years ago
|
||
(In reply to comment #28)
> CodegenLIR::writeOpcodeVerified (Line 1933)
> --------------------------------------------
> What is the reason for setting this->state to NULL?
> CodegenLIR::state's nullness is never checked for, and is always loaded with
> the current verifier state at the top of each CodegenLIR::writeXXXX method that
> may require it.
It is a debugging only sanity check to poke you in the eye if we
lose one of those updates, guarding against using a stale state pointer.
I'll comment it better, at least. If its not worth its weight, easy to drop. opinions anyone?
> If you're changing getFrameState() and a few other things to be private, then
> you might as well go the whole hog and do the same for
> parseExceptionHandlers(), blockStates, state, max_scope, max_stack, info, ms,
> pool and emit_pass.
yes, good, will do.
Assignee | ||
Comment 30•15 years ago
|
||
(In reply to comment #26)
> Line 327: Why is writePrologue() called twice? (Also called at line 360.)
once for phase 1, once for phase 2. Comments added.
Assignee | ||
Comment 31•15 years ago
|
||
(In reply to comment #26)
> Line 133: Why is it necessary to initialize verbose_out when verify_only is set
> but verbose is not? Funny little thing to see there.
verbose_out was vestigal code. Removed.
Assignee | ||
Comment 32•15 years ago
|
||
(In reply to comment #25)
> Trait::bt2sst
>
> It's too bad that we lose the assert when storing void into a slot. Is it too
> much churn to make bt2sst(bt) delegate to bt2sst(bt,false) (i.e., don't allow
> void stores into slots) and have the presumably fewer callers who need this
> call bt2sst(bt,true)?
renamed bt2sst() to valueStorageType(), added slotStorageType() which preserves the assert and is called in Traits.cpp for slots. JIT calls valueStorageType().
I was tempted to rename enum SlotStorageType and SST_xxx to drop the "slot" moniker, but too much churn for this patch. added to followup todo list tho.
Assignee | ||
Comment 33•15 years ago
|
||
(In reply to comment #23)
> CodegenLIR::emitTypedCall -- the name is a little ambiguous; does this do type
> analysis and emit a call, or assume the caller did type analysis? The comments
> on the function explain it very well, but they're not necessarily visible to a
> reader in the hinterlands of CodegenLIR. emitEarlyBoundCall() would be more
> explicit.
This function and the other variants all emit early bound calls, so I dont like that name. I'm struggling to come up with a short name for "assert that the args are already the right types then emitCall()". Suggestions welcome, and the comments will need to suffice in the mean time.
> Lines 3593, 3620 -- any reason to keep the "// will be patched" comment?
removed and improved comments on branchToPc() and branchToLabel().
> CodegenLIR::emitLabel
> General comment on LIR labels: "patches" is nomenclature that relates to the
> implementation. "inEdges" goes to intent. OTOH if "patches" is ingrained in
> the nanojit community's collective consciousness then this sug. looses.
renamed:
Patch -> InEdge
InEdge br -> branchIns
CodegenLabel bb -> labelIns
CodegenLabel patches -> unpatchedEdges
> CodegenLIR::branchToLabel
> Line 5441 - so lirout->insLabel() can return NULL? Guess I'd better have a
> look at it.
> ... any circumstances other than unconditional branch? Eliding an
> unconditional would be easier to understand if done be an explicit check
> beforehand, rather than testing the result for NULL>
When we manage to eliminate a *conditional* branch, we do it -- its in branchToLabel(). however, LIR generation is done via a pipeline, and later stages in the pipeline can also eliminate a conditional branch, so we check for NULL to handle that condition.
> the (labelIns != 0) test makes labelIns look like an int. (labelIns != NULL)
> would be more self-evident.
fixed
> CodegenLIR::branchToPc
> The routine's name is a little confusing when read elsewhere in code. It means
> "branch to start of block identified by this program position" but branchToPc
> keeps making me think it means "branch from some other part of the program to
> this position."
renamed to branchToAbcPos()
Assignee | ||
Comment 34•15 years ago
|
||
(In reply to comment #22)
> CodegenLIR::coerceArgs and CodegenLIR::emitConstructCall
> Helpful comments. Move 'em up to the method level?
done
> A general question for my understanding: emitConstructCall and emitCoerceCall
> both assert that the MethodInfo is resolved. Is there any circumstance where a
> getMethodSignature() call would be valid before the MethodInfo was resolved?
getMethodSignature() also asserts this->isResolved(), inside. So we have a redundant assert. Removed.
> CodegenLIR::emitCoerceCall
> re: the question on early binding and assertion: a failure at this point is not
> going to produce workable code or a good diagnostic, so an upstream check would
> be better. I presume this is a future/robustness issue?
Yes, the asserts are all just checking that upstream tests are working.
> Line 2861 -- new patch loses the FIXME comment from the old code, no indication
> of a fix. Typo or was this a non-issue? (64-bit value in 32-bit storage)
Non issue. the argument that we pass f->iid() to is typed intptr_t, which is safe.
> Line 3006 -- stale comment, code doesn't do nothing any longer
> ... and the code doesn't "do nothing,' another stale comment
> Line 3089 -- Stale comment, the verifier doesn't remember, it's manifest here
fixed up comments
Assignee | ||
Comment 35•15 years ago
|
||
(In reply to comment #20)
> (In reply to comment #19)
> > Created an attachment (id=429212) [details]
> Comments through line 2149 of the new CodegenLir.cpp
>
> CopyPropagation could use a rename
> CopyPropagation::varTracker and tagTracker could use a comment
class VarTracker now. comments improved.
> CopyPropagation::syncNotNull()
> Why AvmAssert(checked->isEmpty())? I know not-null can't propagate past a loop
> header in this design, but what's the invariant being checked? A naive view
> would be that checked is just dead and should be cleared.
> (helpful comments on checked and bits at the class level)
>
>
> CopyPropagation::trackForwardEdge
> Comment on the first assert is stale, s/trackEdge/trackForwardEdge/
> So what does it mean if varTracker[i] is null? Just that no useful info has
> been discovered about that var's state?
After checking the code flow I decided the assert doesn't pay for itself and removed it. the invariant is that we don't have any lingering known-not-null expressions at the tops of loops. but checkBackEdge() does a better and clearer job ensuring this is the case.
> General suggestion: BitSet could use methods to do intersect, union, etc.
union is alread supported, this code needs intersect. I didn't add it because I don't want to modify nanojit itself in this patch. but noted for followup action.
> CopyPropagation::checkBackEdge
> Thanks for the truth table. Might rename the variable e.g.
> s/current/srcNotNull/
went with currentNotNull, clearer to my eyes.
> CopyPropagation::clearVarState
> vs. clearState() -- this could use a comment on when clearVarState() gets
> called outside clearState(). How odd can the results be if the debugger
> modifies a var, but we have a copy of that var tracked as not-null? I presume
> it's kinda like modifying storage of a var that's in a register. Do we already
> account for that in some fashion?
yes, comments added, and a reference to bug 544238.
> CopyPropagation::trackTagStore
> The assertion checks that d is aligned on TAGSIZE, so to speak? How about a
> macro? I see the same assertion elsewhere.
added IS_ALIGNED macro and used it in various places.
> The logic that finds the tagTracker index by dividing d by tagsize is a hard to
> follow. Not so much the actual logic, or why d is so wide compared to the
> number of vars, but how the input parameter d is initialized; I chased it up to
> insStore() but didn't follow it further.
> ... especially since trackTagLoad() doesn't do a symmetric computation.
> ... oh, but insLoad() does. This is all a little hard to grok in open code.
cleaned up with helpers and fixed the asymmetry.
> CodegenLIR::localGet
> So is it OK if the sst mask and the traits don't match? Looks OK from the
> storage compatibility perspective. Should that change the sst mask? Or are
> these types always treated as an equivalence class?
the storage type must be appropriate for the type. Tightened up the assert.
> CodegenLIR::callIns
> Way out of scope for this, but wouldn't it be swell if we knew whether methods
> could throw or not?
Agreed, more precice information would be great, if it were accurate. it is okay for the verifier to have a conservative approximation, as long as the semantics are constant over time, and okay for the JIT to be more aggressive. bad mojo, if the jit generates exception-throwing code for bytecodes which are presumed non-throwing by the verifier.
> Why does !ci->_cse need to be tested? Because that means this call can be
> elided?
ci->_cse is the jit's approximation for "cannot throw exception". canThrow() helper added to clarify.
> General comment: this new branch-tracking discipline is much better than the
> old way!
thanks, i think so too :-)
> CodegenLIR::writeFixExceptionsAndLabels
> I'm waiting with bated breath for a writeFixExceptionsAndLabels that has guts.
> Maybe comment here that this one doesn't need any logic... because?
done.
Assignee | ||
Comment 36•15 years ago
|
||
(In reply to comment #35)
> > CodegenLIR::writeFixExceptionsAndLabels
> > I'm waiting with bated breath for a writeFixExceptionsAndLabels that has guts.
> > Maybe comment here that this one doesn't need any logic... because?
>
> done.
er, i mean: comments added.
Comment 37•15 years ago
|
||
(In reply to comment #19)
> Created an attachment (id=429212) [details]
> (v3) new verifier semantics
Verifier.cpp to line 2502
try_from and try_to : first thing, they could use comments at their decls. They mark the min and max positions in the IR that have exception handlers, do they not? (I'm thinking "union" of exception handler ranges except that the regions can be discontiguous).
Their names are also not camel-cased, which I thought we used for fields, with underscores_in_local_vars.
class ScopeWriter
- class level comment would be good.
Verifier::checkParams()
- why did so much code change positions when this routine was formed? Looks like an allemand left occured.
... not an issue, just curious.
Verifier::verifyBlock()
- line 447-448: commented out code. DUC or there for debugging?
- line 460: testing for the beginning of the next block. A small comment would be in order.
Assignee | ||
Comment 38•15 years ago
|
||
(In reply to comment #37)
> try_from and try_to : first thing, they could use comments at their decls.
> They mark the min and max positions in the IR that have exception handlers, do
> they not? (I'm thinking "union" of exception handler ranges except that the
> regions can be discontiguous).
> Their names are also not camel-cased, which I thought we used for fields, with
> underscores_in_local_vars.
Correct, comments added. The style problem is bigger than just these vars, but i'll change to camelCase anyway.
> class ScopeWriter
> - class level comment would be good.
done
> Verifier::checkParams()
> - why did so much code change positions when this routine was formed? Looks
> like an allemand left occured.
> ... not an issue, just curious.
I hoisted the checks for VerifyErrors to the top, maybe that's why.
> Verifier::verifyBlock()
> - line 447-448: commented out code. DUC or there for debugging?
Good question, was already like that in Verifier, out of scope for this patch.
> - line 460: testing for the beginning of the next block. A small comment would
> be in order.
done
Assignee | ||
Comment 39•15 years ago
|
||
Attachment #429212 -
Attachment is obsolete: true
Attachment #431865 -
Flags: review?(tharwood)
Attachment #431865 -
Flags: review?(rreitmai)
Attachment #431865 -
Flags: review?(jodyer)
Attachment #429212 -
Flags: review?(tharwood)
Attachment #429212 -
Flags: review?(siwilkin)
Attachment #429212 -
Flags: review?(rreitmai)
Attachment #429212 -
Flags: review?(jodyer)
Assignee | ||
Comment 40•15 years ago
|
||
Changes since v3
* style, naming, visibility cleanups from Tom and Simon's review (thank you)
* rebased; all underlying cleanup patches have landed, patch should apply cleanly to tamarin-redux-argo
Updated•15 years ago
|
Attachment #431865 -
Flags: review?(rreitmai) → review+
Comment 41•15 years ago
|
||
Comment on attachment 431865 [details] [diff] [review]
(v4) New verifier semantics
Verifier::checkTarget
- This the only routine that adds to the worklist? Seems worth a comment on the wl_pending decl.
Verifier::mergeState
- The ISSUE noted with interfaces is a future item? Is that an optimization or is there a correctness issue? (Looks like the former to me but just asking.)
- can curValue be a const Value&?
Verifier.h
Can't all those comments on the fields go up above to make them Doxygen-able? Just put an extra / on 'em and they're API doc fodder.
Attachment #431865 -
Flags: review?(tharwood) → review+
Comment 42•15 years ago
|
||
Oh! Ed, or Dan, did you try running GO'd code through here without turning on -legacy_verifer semantics? That will give the type-meet code a good workout.
Assignee | ||
Comment 43•15 years ago
|
||
I haven't tried it yet but its on the todo list for followup testing.
Assignee | ||
Comment 44•15 years ago
|
||
(In reply to comment #41)
> (From update of attachment 431865 [details] [diff] [review])
> Verifier::checkTarget
> - This the only routine that adds to the worklist? Seems worth a comment on
> the wl_pending decl.
done
> Verifier::mergeState
> - The ISSUE noted with interfaces is a future item? Is that an optimization or
> is there a correctness issue? (Looks like the former to me but just asking.)
removed ISSUE, explained in comment. handling interfaces would generate more accurate type information but its the kind of thing we must only do in new revs of ABC. The type-flow model of each operation including type-meet should stay fixed for a given abc version. (ahem, except for when we make an exception).
> - can curValue be a const Value&?
yes, changed.
> Verifier.h
>
> Can't all those comments on the fields go up above to make them Doxygen-able?
> Just put an extra / on 'em and they're API doc fodder.
i suppose so, but it makes the file twice as long and comment-after-field is a prevailing style in the code. if we were actually using Doxygen then i'd be more compelled to change this style.
Comment 45•15 years ago
|
||
Comment on attachment 431865 [details] [diff] [review]
(v4) New verifier semantics
nothing is obviously wrong, so...hail mary!
Attachment #431865 -
Flags: review?(jodyer) → review+
Assignee | ||
Comment 46•15 years ago
|
||
Status: ASSIGNED → RESOLVED
Closed: 15 years ago
Resolution: --- → FIXED
Updated•15 years ago
|
QA Contact: vm → cpeyer
Updated•15 years ago
|
Status: RESOLVED → VERIFIED
You need to log in
before you can comment on or make changes to this bug.
Description
•