Closed Bug 512181 Opened 15 years ago Closed 14 years ago

nanojit: rework TMFLAGS=assembly,regalloc,activation

Categories

(Core :: JavaScript Engine, defect)

x86
Linux
defect
Not set
normal

Tracking

()

RESOLVED FIXED

People

(Reporter: n.nethercote, Assigned: n.nethercote)

References

Details

(Whiteboard: fixed-in-nanojit, fixed-in-tracemonkey, fixed-in-tamarin)

Attachments

(3 files, 3 obsolete files)

TMFLAGS=regalloc only has an effect if TMFLAGS=assembly is also specified.  Furthermore, it augments the TMFLAGS=assembly output with a additional info about register spills and restores;  this additional info is very compact and frequently useful, so there seems little point in not always showing it with TMFLAGS=assembly.
Another possibility is to keep 'regalloc' and 'assembly', but move control of the printing of register state from 'assembly' to 'regalloc', which makes more sense.  'regalloc' would still imply 'assembly', as printing the regalloc info without the assembled code doesn't make sense.

The advantage of this would be that if you really just want to see the generated code, you can do so without all the register allocation junk.  But getting rid of one TMFLAGS alternative would be nice... I guess the question is how often would people use 'assembly' rather than 'regalloc'?
And once TMFLAGS=activation is added (bug 518724 adds it to TM, it's already in TR) we have a similar situation.  It might make sense to merge 'activation' into 'regalloc', since they're very much related information.
Depends on: 518724
Time to rename this bug.
Blocks: 513616
Summary: nanojit: merge TMFLAGS=assembly and TMFLAGS=regalloc → nanojit: rework TMFLAGS=assembly,regalloc,activation
Attached patch patch (obsolete) — Splinter Review
This patch overhauls printing via TMFLAGS of register and stack state.

- It always shows the " <= spill" and " <= restore" annotations, even if
  TMFLAGS=regalloc isn't specified.  Reason being that it's extremely useful
  but doesn't take up any extra space (it's tacked onto the end of the line)
  so it's worth always showing.

- It prints register and activation (stack) state on separate lines from the
  asm code;  this makes it clearer which state it is (pre-state or
  post-state).  Previously the pre-state was printed after the instruction,
  which was very unintuitive.

  And it only does it once per LIR instruction;  for LIR instructions that
  result in multiple native instructions, this avoids sometimes incorrect 
  intermediate register states from being printed.
  
  And printing the activation state for every LIR instruction -- not just
  when it changes -- makes it consistent with the register state, and avoids
  further confusion.

- It prints register and activation state in similar ways.  Register state is
  now preceded with "RR", activation state with "AR".

- It uses a 26-space indent for both.  Previously register state used 35, and
  activation state used 45.  Now that they're on separate lines a smaller
  indent is better as it avoids some long lines that require wrapping.

- It treats asm_restore() consistently across all back-ends.  Previously
  some of the cases had "<= restore" printed for them, some had "<= remat",
  and some had nothing.  Now all of them always print "<= restore".  And
  they print it in the right place, on X64 at least it was being printed one
  line too early.

Here's an example of the old output:

      js_UnboxDouble4 = fcall #js_UnboxDouble ( ld14 )
               call js_UnboxDouble    ebx($var0) esi(sp) edi(ld1)
               movq xmm0,-48(ebp)     ebx($var0) esi(sp) edi(ld1)  <= restore js_UnboxDouble3
                                                <FP 4(state) 8(ebx) 12(esi) 16(edi) 20(ld1) 24(cx) 28($var1) 32($args4) 36($args3)
               mov edx,ebx            ebx($var0) esi(sp) edi(ld1) xmm0(js_UnboxDouble3)
               mov ebx,-36(ebp)       esi(sp) edi(ld1) xmm0(js_UnboxDouble3)  <= restore $args3
                                                <FP 4(state) 8(ebx) 12(esi) 16(edi) 20(ld1) 24(cx) 28($var1) 32($args4)
                                                >FP 4(state) 8(ebx) 12(esi) 16(edi) 20(ld1) 24(cx) 28($var1) 32($args4) 36+(js_UnboxDouble4)
               fstpq -40(ebp)         edx($var0) ebx($args3) esi(sp) edi(ld1) xmm0(js_UnboxDouble3) f0(js_UnboxDouble4)  <= spill js_UnboxDouble4
               movq xmm1,-40(ebp)     edx($var0) ebx($args3) esi(sp) edi(ld1) xmm0(js_UnboxDouble3)  <= restore js_UnboxDouble4
                                                <FP 4(state) 8(ebx) 12(esi) 16(edi) 20(ld1) 24(cx) 28($var1) 32($args4)
      fsub2 = fsub js_UnboxDouble3, js_UnboxDouble4
               subsd xmm0,xmm1        edx($var0) ebx($args3) esi(sp) edi(ld1) xmm0(js_UnboxDouble3) xmm1(js_UnboxDouble4)

And the new output for the same code:

                            RR ecx(ld14) ebx($var0) esi(sp) edi(ld1)
                            AR 4(state) 8(ebx) 12(esi) 16(edi) 20(ld1) 24(cx) 28($var1) 32($args4) 36($args3) 44+(js_UnboxDouble4)
      js_UnboxDouble3 = fcall #js_UnboxDouble ( ld14 )
               call js_UnboxDouble
               movq xmm0,-48(ebp)  <= restore js_UnboxDouble4
               mov edx,ebx
               mov ebx,-36(ebp)  <= restore $args3
               fstpq -40(ebp)  <= spill js_UnboxDouble3
               movq xmm1,-40(ebp)  <= restore js_UnboxDouble3
                            RR edx($var0) ebx($args3) esi(sp) edi(ld1) xmm0(js_UnboxDouble4) xmm1(js_UnboxDouble3)
                            AR 4(state) 8(ebx) 12(esi) 16(edi) 20(ld1) 24(cx) 28($var1) 32($args4)
      fsub2 = fsub js_UnboxDouble4, js_UnboxDouble3
               subsd xmm0,xmm1
                            RR edx($var0) ebx($args3) esi(sp) edi(ld1) xmm0(fsub2)
                            AR 4(state) 8(ebx) 12(esi) 16(edi) 20(ld1) 24(cx) 28($var1) 32($args4)

The new output has more lines, but shorter lines, and accurate info, and
is much easier to read.
Attachment #413002 - Flags: review?(edwsmith)
Attached file old vs. new output
The old vs. new outputs in comment 4 wrap badly.  Here's an attachment that shows them more clearly.
Attached patch NJ-only patch (obsolete) — Splinter Review
The first patch included a TM-only change.  This one is NJ-only, I'll file the TM-only one separately.
Attachment #413002 - Attachment is obsolete: true
Attachment #413005 - Flags: review?(edwsmith)
Attachment #413002 - Flags: review?(edwsmith)
Attached patch TM-only patch (obsolete) — Splinter Review
Attachment #413006 - Flags: review?(graydon)
Attachment #413006 - Attachment is obsolete: true
Attachment #413007 - Flags: review?(graydon)
Attachment #413006 - Flags: review?(graydon)
Depends on: 515311
Blocks: 513615
Attachment #413007 - Flags: review?(graydon) → review+
Comment on attachment 413005 [details] [diff] [review]
NJ-only patch

in Nativei386.cpp, Assembler::asm_restore(), the new call to outputForEOL() isn't guarded by a check of the log verbose bits, like what is done inside the asm_output() macro.  This makes tamarin crash in non-verbose mode, because _thisfrag->lirbuf->names is NULL.  

This might happen in other places too?  this was the first one I found.

Other than that i couldn't find any problems and the improved output is great.
Attachment #413005 - Flags: review?(edwsmith) → review-
This patch augments the previous one significantly, in order to clean up the
mess that is the debug printing in class Assembler.

- Prints "<= restore" comments in Assembler, rather than (identically) in
  all the back ends.

- Introduces printRegState() to partner printActivationState().
  Inlines RegAlloc::formatRegisters() into printRegState() because that's a
  more sensible place for that code.

- Adds LC_Assembly guards to setOutputForEOL() calls, which fixes the
  TR problem in the first patch.

- Renames outputForEOL() to setOutputForEOL() to emphasise the fact that it
  doesn't actually produce any output itself.

- Removes output_asm() -- no more confusion over how it differs from
  asm_output().

- Removes outputAlign(), it was a complicated and confusing way of doing
  something that memset() does much more easily.

- Rejigs output() and outputf() to be simpler.

- Adds comments about all the output methods in the class declaration.

- Makes most of the output-related members of class Assembler private.

- Adds JMPX and JMPXB to the X64 backend so debugging output will be printed
  for them.

Tested on TM and TR and NJ.

BTW: vsprintf() is used in Assembler.cpp.  Should it really be
VMPI_vsprintf() like the others?
Attachment #413005 - Attachment is obsolete: true
Attachment #413536 - Flags: review?(edwsmith)
Attachment #413536 - Flags: review?(edwsmith) → review+
Landed the TM-specific patch:

http://hg.mozilla.org/tracemonkey/rev/4ca9f32effcf

This patch could be landed on TM before the NJ-specific patch because it doesn't require the NJ changes -- it's only a tweak to the TMFLAGS usage message, and one which is appropriate even without the NJ changes.
http://hg.mozilla.org/tracemonkey/rev/da93d69ff1a3
http://hg.mozilla.org/tamarin-redux/rev/9a8436a2492a
Whiteboard: fixed-in-nanojit → fixed-in-nanojit, fixed-in-tracemonkey, fixed-in-tamarin
http://hg.mozilla.org/mozilla-central/rev/da93d69ff1a3
http://hg.mozilla.org/mozilla-central/rev/4ca9f32effcf
Status: ASSIGNED → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: