Closed Bug 545412 Opened 14 years ago Closed 9 months ago

Patch that prints the control flow graph of a JavaScript program in dot format

Categories

(Core :: JavaScript Engine, enhancement)

x86
macOS
enhancement

Tracking

()

RESOLVED INCOMPLETE

People

(Reporter: rodrigosol, Unassigned)

Details

Attachments

(3 files, 4 obsolete files)

User-Agent:       Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_5_7; en-US) AppleWebKit/532.5 (KHTML, like Gecko) Chrome/4.0.249.49 Safari/532.5
Build Identifier: 

This is a new feature. The class JSControlFlowGraph draws the control flow graph of a javaScript bytecode. We are currently outputing the control flow graph, in dot format, in the standard output. To use this function the user must add the function "controlfgraph" in the script, in the same way that she would have to add the function "disscr" to print the bytecode.

For instance, given the script below:

 function cfg(n){
   for(i=0;i<n;i++){
     for(j=0;j<n;i++){
       print("The sum of i and j is:" + (i+j));
     }
   }
 }

 controlfgraph(cfg);


I produce the bytecode in the pdf here:
http://www.dcc.ufmg.br/~rsol/docs/cfg.pdf


Reproducible: Always
Attached patch Patch to print cfg in dot format (obsolete) — Splinter Review
Comment on attachment 426252 [details] [diff] [review]
Patch to print cfg in dot format

Cool! I had some ancient code to build a CFG from bytecode, never landed it and lost in some data coffin.

Global nits:

* Don't enter tabs into files (editors including vim, emacs and VS can be taught, some even obey the modeline comments at the top of the file all by themselves). Tabs are evident at least in the js.cpp patch.

* Put the left brace starting a function body on its own line (inline methods defined in classes are exempt from this rule).

/be
Status: UNCONFIRMED → NEW
Ever confirmed: true
Dear Brendan,

   Thank you for the comments.
   I have updated my patch, to meet your suggestions. I have changed the braces, to match the K&R style. I was a bit in doubt about methods where the definition comes together with the implementation in h files, but are not marked inline. I made them follow K&R.
    Regarding the tabs, I have now changed js.cpp, using the editor's automatic formatting mode. It seems that all the indentations are now 4 spaces.
Attachment #426252 - Attachment is obsolete: true
Attachment #426514 - Attachment is patch: true
Attachment #426514 - Attachment mime type: application/octet-stream → text/plain
This is cool.

I hacked on it a little bit so that it doesn't use std::string, and I hacked jsopcode.h/cpp so that you don't have to use the filesystem to get bytecode disassembly. I added #ifdef DEBUG around it since the disassembler is already #ifdef DEBUG.

For this to go in, it should support all the opcodes. The switch opcodes, throw, and return are missing. Can you add support for those?
Sure!

I'll implement your suggestions.
Hi,

I'm with some doubt to handle JSOP_RETRVAL opcode.

What should be the behavior of the CFG when this bytecode 
is seen?

Can anyone give me a js snippet where this bytecode is 
produced?
(In reply to comment #9)
> Hi,
> 
> I'm with some doubt to handle JSOP_RETRVAL opcode.
> 
> What should be the behavior of the CFG when this bytecode 
> is seen?

It's like JSOP_RETURN but it takes the return value from fp->rval, where it was stored by an earlier JSOP_SETRVAL to preserve order of evaluation in the face of finally clauses.

> Can anyone give me a js snippet where this bytecode is 
> produced?

js> dis(function(){try{nogood}catch(e){return e}})
flags: LAMBDA NULL_CLOSURE
main:
00000:  trace
00001:  try
00002:  name "nogood"
00005:  pop
00006:  goto 31 (25)
00009:  enterblock depth 0 {e: 0}
00012:  exception
00013:  setlocalpop 0
00016:  getlocal 0
00019:  setrval
00020:  leaveblock 1
00023:  retrval
00024:  leaveblock 1
00027:  goto 31 (4)
00030:  nop
00031:  stop

Source notes:
  0:     6 [   6] hidden  
  1:     9 [   3] catch   
  3:    20 [  11] xdelta  
  4:    20 [   0] hidden  
  5:    24 [   4] catch    stack depth 1
  7:    27 [   3] hidden  
  8:    30 [   3] endbrace

Exception table:
kind      stack    start      end
 catch        0        2        9

/be
I'm sending a new patch. It handles JSOP_TRY, switch, JSOP_RETURN and JSOP_RETRVAL.

Please, give me some feedback if the graphs are like they should be.

So now, I'll try to implement special treatment fo JSOP_GENERATOR and JSOP_YIELD.
Attached patch Patch that handles new operators (obsolete) — Splinter Review
Attachment #426514 - Attachment is obsolete: true
I am a bit confused about the yield and the generator constructs. How do they change the control flow graph?
JSOP_GENERATOR does not affect control flow, it's a declaration at the top of a function f indicating that it contains yield expressions.

JSOP_YIELD is like return but it is also a resume point, and an expression. Local variables survive of course. Its value as an expression is whatever the caller of g.send(v) for a generator-iterator g created by calling the generation-function f passes as v.

/be
So JSOP_YIELD does not affect control flow, in the CFG sense -- no goto, no return (edge to exit block).

/be
Hi, Brendan. 

Given that these instructions will not change the CFG, to get the extra credits, should I highlight them? 

Say, writing them in boldface? I have prepared a small gallery on this address (http://dcc.ufmg.br/~rsol/cfg_examples.html), where I display a couple programs, and the respective CFG's. 

What do you guys think about it? Would you like me to render them in any other way? 

One thing that I did not like was the lone 'no-op' after the try block. Should I remove them from the CFG? 

All the best
Highlighting (colorizing in HTML using CSS?) would be nice. Unusual ops should stand out, even if they have no control flow effects.

Eliminating no-ops might hide the truth, but perhaps they could have their visibility (CSS again) toggled. I'm going crazy with Ajax-y UI design thoughts, so start ignoring me now. ;-)

/be
Attachment #429776 - Attachment is obsolete: true
I have done some more updating on the control flow printer. Couple points:

1) I have found a bug, that now is fixed: I was not correctly printing try and catch blocks if the try had a branch inside. Now I am correctly drawing the edge that jumps from the end of the try to the catch.

2) The extra-credit is not so easy: I don't know how to color only part of the text in dot. I can color the whole basic block, for instance, but not only one instruction inside the block.

3) I am not sure about what to do about the GOSUB and RETSUB bytecodes. One RETSUB should lead to the next instruction after the GOSUB that led to it. However, I can have many GOSUBs and RETSUBs sprouting from the same basic block. This happens, for instance, when I have a finally block (exception handling). So, the control flow does not perfectly model the dynamic behavior of a program. That is, by looking at the control flow, someone may believe that it is possible to go from a RETSUB to a GOSUB that is not really related. However, I don't think it is possible to draw in a (static) figure this dynamic behavior.

4) I am using std::list to store some data in the new implementation. Could someone review this to me?
(In reply to comment #19)

> 3) I am not sure about what to do about the GOSUB and RETSUB bytecodes.

Each finally block is like a "subroutine", you have to infer them a bit but the retsub ops in a given finally block all return to any of the gosubs that target that block. Could be quite an over-connected, non-planar graph. Maybe skip for now, try/finally is not that common.

/be
Dear guys,

    I have updated an old SpiderMonkey pass that Rodrigo had coded almost two
years ago. This pass converts the TraceMonkey's bytecodes to Dot. It works
just like the dissrc() function. Well, it is all documented in this thread.
If you are interested, I could try to improve the patch. If I read the thread
correctly, there are some missing features, such as how to handle RETSUB/GOSUB.
The patch follows attached. For an example, given this very function that
Rodrigo has first used as a demo:

function cfg(n){
   for(i=0;i<n;i++){
     for(j=0;j<n;i++){
       print("The sum of i and j is:" + (i+j));
     }
   }
 }

It results in the following CFG: http://homepages.dcc.ufmg.br/~igor/gsoc2011/t2.pdf

Regards,

Igor
The previous patch was incorrect. Sorry for the mistake.
Attachment #534749 - Attachment is obsolete: true
Assignee: general → nobody
Severity: normal → S3
Status: NEW → RESOLVED
Closed: 9 months ago
Resolution: --- → INCOMPLETE
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: