Closed Bug 640318 Opened 11 years ago Closed 11 years ago

LIR control-flow graph gml output

Categories

(Tamarin Graveyard :: Baseline JIT (CodegenLIR), defect)

defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: rreitmai, Assigned: rreitmai)

References

Details

(Whiteboard: fixed-in-nanojit, fixed-in-tamarin, fixed-in-tracemonkey)

Attachments

(2 files, 4 obsolete files)

Generate a control-flow graph of generated LIR in gml format to assist with debugging.
Attached patch wip (obsolete) — Splinter Review
Assignee: nobody → rreitmai
Attached patch produce LIR cfgs (obsolete) — Splinter Review
Attachment #518162 - Attachment is obsolete: true
Attachment #523685 - Flags: review?(nnethercote)
Attachment #523686 - Attachment description: TR component of → TR component of LIR cfg generator
Attachment #523686 - Flags: review?(wmaddox)
Attachment #523685 - Flags: review?(wmaddox)
Status: NEW → ASSIGNED
Comment on attachment 523685 [details] [diff] [review]
produce LIR cfgs

What is GML?  Google gives me "Geography Markup Language" but that doesn't seem right.

I skimmed over this, the only thing I want to check is that all the new GML-related code is debug builds only... it's not easy to tell from the patch if that's the case.
Attachment #523685 - Flags: review?(nnethercote) → review+
It's Graph Modelling Language, similar to dot, but understood by yEd, which is a pretty great graph viewer.  I'd also like to see some "see also" links in the source code.  the yEd documentation has good details on what means what.
The code is guarded around NJ_VERBOSE, the intention being one would be able to output these graphs as a debugging aid.

re: gml info - good point, I'll add some links into the code comments before the patch lands.
Comment on attachment 523685 [details] [diff] [review]
produce LIR cfgs

Neat!

What happens in CFG_BB mode to the instruction following a LIR_jtbl? From the logic in CfgLister:read(), it doesn't appear that it would begin a new basic block, i.e., is not added to _vertices.  Then again,the code to add fallthrough edges in CfgLister::printGmlCfg() would appear to do the right thing.  Is there a good reason why _vertices does not contain every instruction that begins a block by the end of the first pass?

Under what circumstances can there be an incoming edge that points to an instruction that is not the first instruction of a block? This is fundamental to both BBs and EBBs.  I understand multiple edges may *exit* an EBB at different instructions, but these are incoming edges.  I imagine this would all be made clear with a good explanation of the way the _alt mapping is used.

I will withold a review pending an answer to these questions, since the problem is likely with my understanding rather than the code itself.

Nits:

1) There needs to be a good header comment explaining the algorithm, as there are a few twists here, and having to do everything backward also causes further divergence from the standard textbook algorithm.

2) Its clear enough from the Tamarin use case how the 'proxiesFor' argument is used, but the comments in the Nanojit patch should explain this more fully. 

3) In CfgLister::printEdges(), there is a local variable named 'l' (ell).  It looks almost exactly like a '1' (digit one).  When I saw the statment 'while (l) ...', I parsed it as the common loop-forever idiom at first.

4) Multiple fprintf() calls in gmlNodePrefix() and friends could be replaced by a single call with a sequence of single-line string literals in the format string position. (Juxtaposed literals are concatenated at compile time.)
Comment on attachment 523686 [details] [diff] [review]
TR component of LIR cfg generator

R+

Nits:

Do we really want the syntax '-Dverbose=lircfg=bb'? Perhaps use a different delimiter after lircfg, or just use a compound name like lircfg_bb or lircfg-bb, since that's how it's parsed anyway.

It appears you added the missing help string for the 'lir' option to -Dverbose, but omitted it from the summary list of options a few lines earlier.

In the help string for 'lircfg=ins', 'instructions' should read 'instruction'.
Attachment #523686 - Flags: review?(wmaddox) → review+
(In reply to comment #7)
>  LIR_jtbl is not added to _vertices. 

fixed.

> 
> Under what circumstances can there be an incoming edge that points to an
> instruction that is not the first instruction of a block?

Ah yes, I didn't explain _alt well enough.  During the first pass (i.e. read()), we are only capturing edges and vertices at an instruction level and recall this is during a backwards pass so for forward branches we won't know whether an instruction we've seen already is the target of the branch or not until we reach the branch.

On the 2nd pass (i.e. printGmlCfg) we use the list of branch targets form the first pass (i.e. _vertices) and group instructions into them.  This is where _alt comes into play, it provides a map from all instructions in a block to the first instruction of a block.
(In reply to comment #7)
>
> 1) There needs to be a good header comment explaining the algorithm

agreed; will update in the final patch.

> 
> 2) 'proxiesFor' in the Nanojit patch should explain this more fully. 

will do.

> 
> 3) 'l' (ell).  

Looks fine in my editor :)

> 
> 4) Multiple fprintf() calls in gmlNodePrefix() and friends could be replaced by
> a single call 

ah good idea.
(In reply to comment #8)
> Nits:
> 
> Do we really want the syntax '-Dverbose=lircfg=bb'? 

Made them all use '-' rather than '='

> 'lir' omitted 

fixed

> In the help string for 'lircfg=ins', 'instructions' should read 'instruction'.

fixed
Hmm just noticed the patch didn't have the comments I added in response to comment #4.

Here's the verbiage in Lir.h for the interested.

    /**
      * A reverse filter for LIR that generates a control-flow graph in gml format.
      * More information on Graph Modelling Language (gml) can be found here:
      *   http://en.wikipedia.org/wiki/Graph_Modelling_Language
      * An excellent tool for manipulating the graphs produced by this code
      * is yED (http://www.yworks.com/en/products_yed_about.html).
      *
      * The raw output produced by this class contains connectively (i,e edge)
      * information (including formatting), and node (i.e. vertex) data, but
      * does not contain any positional information.
      * Thus when opening the .gml file, all the nodes will likely appear stacked
      * on one another.  An auto-layout tool like yEd can then re-position the
      * nodes for better viewing.  E.g. Tools->Fit Node to Label followed by
      * Layout->Hierarchical->Interactive produces a relatively convential
      * looking flow-control graph.
      *
      * Usage:
      *
      *         LirReader  reader(frag->lastIns);
      *         CfgLister  cfg(&reader, alloc);
      *
      *         for (LIns* ins = cfg.read(); !ins->isop(LIR_start); ins = cfg.read()) {}
      *
      *         cfg.printGmlCfg(f, frag->lirbuf->printer, proxiesFor);
      *         fclose(f);
      *
      *  The first and second parameters to printGmlCfg() are an open FILE* and
      *  a populated LInsPrinter, respectively.  The printer must be able to produce
      *  a name from either the lirNameMap or a call to formatIns().
      *
      *  The 3rd parameter to primtGmlCfg(), proxiesFor can be used to limit
      *  the number of edges in the graph. If an edge points to a vertex (LIns) in
      *  this set, then a new node is created, with the same name as the original,
      *  and the edge will point to that node.
      *
      * Algortihm:
      *
      *  During the first pass (i.e. CfgLister::read()), we capture edges and vertices
      *  at an instruction level.  Recall that this is during a backwards pass, so for
      *  forward branches we won't know whether an instruction we've seen already is
      *  the target of a branch or not, until we reach the branch.  I.e. we see the
      *  target instructions first and don't yet know they are targets.
      *
      *  On the 2nd pass (i.e. printGmlCfg) we use the list of branch targets from the
      *  first pass (i.e. _vertices) and group instructions into them.  This is where
      *  _alt comes into play, it provides a map from any instruction to the first
      *  instruction of the containing block.
      */
Comment on attachment 523685 [details] [diff] [review]
produce LIR cfgs

Clearing out request queue pending posting of revised patch promised above in response to earlier comments.
Attachment #523685 - Flags: review?(wmaddox) → review-
repost 523685 with changes addressed as per comment 7 for bill to re-review.
Attachment #523685 - Attachment is obsolete: true
Attachment #527881 - Flags: review?(wmaddox)
Attachment #527881 - Flags: review?(wmaddox) → review+
rreitmai http://hg.mozilla.org/projects/nanojit-central/rev/45ca084d9cbe
Whiteboard: fixed-in-nanojit
Attachment #523686 - Attachment is obsolete: true
Attachment #530206 - Attachment is obsolete: true
changeset: 6282:cba81c16c10d
user:      Rick Reitmaier <rreitmai@adobe.com>
summary:   Bug 640318 - LIR control-flow graph gml output (r+nnethercote,wmaddox)

http://hg.mozilla.org/tamarin-redux/rev/cba81c16c10d
Whiteboard: fixed-in-nanojit → fixed-in-nanojit,fixed-in-tamarin
changeset: 6286:924856fa9dfe
user:      Rick Reitmaier <rreitmai@adobe.com>
summary:   Bug 640318 - LIR control-flow graph gml output, tamarin portion (r+wmaddox)

http://hg.mozilla.org/tamarin-redux/rev/924856fa9dfe
cdleary-bot mozilla-central merge info:
http://hg.mozilla.org/mozilla-central/rev/ab32afc59a89
Note: not marking as fixed because fixed-in-tracemonkey is not present on the whiteboard.
Status: ASSIGNED → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Whiteboard: fixed-in-nanojit,fixed-in-tamarin → fixed-in-nanojit, fixed-in-tamarin, fixed-in-tracemonkey
Depends on: 666480
fclose() is being called twice on the gml filehandle, once in CodegenLIR::emitMD() and again in lircfg() This can cause free() to complain when buffers associated with the filehandle are released for a second time.
You need to log in before you can comment on or make changes to this bug.