Closed Bug 811349 Opened 12 years ago Closed 12 years ago

IonMonkey: add basic block hit counters to PC count information

Categories

(Core :: JavaScript Engine, defect)

Other Branch
x86
macOS
defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla20

People

(Reporter: bhackett1024, Assigned: bhackett1024)

References

Details

Attachments

(4 files, 4 obsolete files)

Attached patch patch (f41c225d1919) (obsolete) — Splinter Review
This is related to bug 771118 but considerably simpler.  IonMonkey is arranged around generating code from a set of basic blocks containing LIR nodes originating from any number of PCs (or from no PC).  Rather than accumulate counts at a PC level of granularity, it is cleaner to count how many times each basic block executes, and then associate each basic block with information about the containing code.  These counters can be accumulated and kept around even after the IonScript has been destroyed, and then converted to JSON for use by the JIT inspector (or another) addon.  (This patch is missing increment64 for non-x86 architectures, will fix that before checking in.)
Attachment #681086 - Flags: review?(nicolas.b.pierron)
This adds generated assembly to the text information for a basic block.  Instructions are interleaved with the name of the LIR node which caused that instruction to be generated, for bonus comprehensibility.
Assignee: general → bhackett1024
Attachment #681093 - Flags: review?(nicolas.b.pierron)
Screenshot from a version of the JIT inspector modified to show Ion information when prompted.  Some cleanup is still needed here (format assembly differently from LIR node headers, rm duplicate labels, ...) but that can be done within the addon itself.
bz, iirc you were interested at one point in being able to see the assembly for generated jitcode.
Does JIT inspector still depend on the old decompiler with these patches?
> iirc you were interested at one point in being able to see the assembly for generated
> jitcode.

Yes!  I wanted that for seeing what code we're actually generating for DOM binding calls from JS.
(In reply to David Mandelin [:dmandelin] from comment #4)
> Does JIT inspector still depend on the old decompiler with these patches?

Yes, the JIT inspector is still keeping the decompiler live, with and without these patches.  Basic blocks are not associated with particular expressions though, so the functionality added to the decompiler for the JIT inspector is not really needed by this stuff.
(In reply to Brian Hackett (:bhackett) [mostly gone until mid-November] from comment #6)
> (In reply to David Mandelin [:dmandelin] from comment #4)
> > Does JIT inspector still depend on the old decompiler with these patches?
> 
> Yes, the JIT inspector is still keeping the decompiler live, with and
> without these patches.  Basic blocks are not associated with particular
> expressions though, so the functionality added to the decompiler for the JIT
> inspector is not really needed by this stuff.

The new decompiler system is going out in release in Fx17 on Nov 20, so decompiler is scheduled for deletion soon after that. If you want to restore JIT Inspector support, could you fix bug 777474 as well?
Hrmm, where is the schedule posted?  I guess I'll fix bug 777474 if that's the only option, the JIT inspector is pretty critical for my productivity.
Comment on attachment 681093 [details] [diff] [review]
augment basic block information with generated code

Review of attachment 681093 [details] [diff] [review]:
-----------------------------------------------------------------

Do the same modification on other backend.
Fix indentation.
Remove trailing white space visible in the review tool of bugzilla.

::: js/src/assembler/assembler/X86Assembler.h
@@ +45,3 @@
>  #include "methodjit/Logging.h"
>  #define IPFX  "        %s"
>  #define ISPFX "        "

Are these spaces needed outside JM insns output? If so, can you add them to the format the JM spewer.

@@ +441,5 @@
> +
> +            if (i > -1) {
> +                if (printer)
> +                    printer->printf("%s", buf);
> +                js::JaegerSpew(js::JSpew_Insns, "%s", buf);

nit: Add a comment that we should not use buf as it may contain a '%' produced by vsnprintf.

Add IonSpew CodeGen here and select between JM or Ion spew to avoid confusing spew outputs.

@@ +453,1 @@
>                         IPFX "nop\n", MAYBE_PAD);

nit: Remove the extra indentation level and the extra \n.
Attachment #681093 - Flags: review?(nicolas.b.pierron)
(In reply to Brian Hackett (:bhackett) [mostly gone until mid-November] from comment #8)
> Hrmm, where is the schedule posted?  I guess I'll fix bug 777474 if that's
> the only option, the JIT inspector is pretty critical for my productivity.

Removing the decompiler isn't explicitly scheduled. It becomes possible after Nov 20 + a few weeks waiting to see if anything happens, so it'd be cool to be ready to delete fully by the end of the year. Since it's not being used, the code will likely break over time so it will be good to get it out soonish.
Clean up spew for generated code per comment 9
Attachment #681093 - Attachment is obsolete: true
Attachment #681207 - Flags: review?(nicolas.b.pierron)
Comment on attachment 681086 [details] [diff] [review]
patch (f41c225d1919)

Review of attachment 681086 [details] [diff] [review]:
-----------------------------------------------------------------

Move BlockCount code outside generateBody and only keep the increment and the spew attachment to the BlockCount inside generateBody's function.

Add support for other architecture: masm.Inc64(Address addr)

Handle inlined script properly, either by ignoring now or by adding support to the BlockCount system.

::: js/src/ion/CodeGenerator.cpp
@@ +1369,5 @@
> +
> +    RawScript script = gen->info().script();
> +
> +    if (GetIonContext()->cx->runtime->profilingScripts && !script->hasScriptCounts)
> +        script->initScriptCounts(GetIonContext()->cx);

This is not related to generateBody, move this to another function called before generateBody.

@@ +1393,5 @@
> +        iter++;
> +
> +        Maybe<Sprinter> printer;
> +        if (counts) {
> +            MBasicBlock *block = current->mir();

This scope should be moved in its own function.

@@ +1394,5 @@
> +
> +        Maybe<Sprinter> printer;
> +        if (counts) {
> +            MBasicBlock *block = current->mir();
> +            uint32 offset = block->pc() - script->code;

script does not account for inlining case. Use block->info().script().

@@ +1396,5 @@
> +        if (counts) {
> +            MBasicBlock *block = current->mir();
> +            uint32 offset = block->pc() - script->code;
> +            if (offset >= script->length)
> +                offset = 0;

Assert this!

@@ +1400,5 @@
> +                offset = 0;
> +            if (!counts->block(i).init(block->id(), offset, block->numSuccessors()))
> +                return false;
> +            for (size_t j = 0; j < block->numSuccessors(); j++)
> +                counts->block(i).setSuccessor(j, block->getSuccessor(j)->id());

This copy does not belong in generateBody, move this cloning above the call to generateBody.

@@ +1424,5 @@
>          if (masm.oom())
>              return false;
> +
> +        if (counts)
> +            counts->block(i).setCode(printer.ref().string());

I am not a fan of showing the assembly to any user.  People should learn how to code in JS, not how to code in Assembly.

::: js/src/ion/x86/MacroAssembler-x86.h
@@ +683,5 @@
>          static const double NegativeOne = 2147483648.0;
>          addsd(Operand(&NegativeOne), dest);
>      }
>  
> +    void increment64(uint64 *dest) {

Use the generic Macro assembler and fallback on addq on x64.
Attachment #681086 - Flags: review?(nicolas.b.pierron)
Comment on attachment 681207 [details] [diff] [review]
clean up instruction spew

Review of attachment 681207 [details] [diff] [review]:
-----------------------------------------------------------------

Great.

nit: sed 's/^\([ +].*\) *$/\1/' my.patch > my-new.patch

::: js/src/assembler/assembler/AssemblerBuffer.h
@@ +258,5 @@
> +        }
> +
> +        void spew(const char *fmt, ...)
> +#ifdef __GNUC__
> +            __attribute__ ((format (printf, 2, 3)))

Awesome, except that this would be:
    __attribute__ ((format (printf, 1, 2)))

because the format is used as first argument, and the variadic list starts at the second argument.

@@ +280,5 @@
> +                        printer->printf("%s\n", buf);
> +
> +                    // The assembler doesn't know which compiler it is for, so if
> +                    // both JM and Ion spew are on, just print via one channel
> +                    // (Use JM to pick up isOOLPath).

Great.

::: js/src/assembler/assembler/X86Assembler.h
@@ +40,5 @@
>  #include "assembler/wtf/Assertions.h"
>  #include "js/Vector.h"
>  
> +#define PRETTY_PRINT_OFFSET(os) (((os)<0)?"-":""), (((os)<0)?-(os):(os))
> +#define FIXME_INSN_PRINTING                                 \

Factor these macros near the GenericAssembler, as they seems to be shared by all assemblers and as they are already declared in header files.

@@ +1349,5 @@
>  #if WTF_CPU_X86_64
>      void xchgq_rr(RegisterID src, RegisterID dst)
>      {
> +        spew(
> +                       IPFX "xchgq      %s, %s",

Remove IPFX.
Attachment #681207 - Flags: review?(nicolas.b.pierron) → review+
(In reply to Nicolas B. Pierron [:pierron] [:nbp] from comment #12)
> @@ +1424,5 @@
> >          if (masm.oom())
> >              return false;
> > +
> > +        if (counts)
> > +            counts->block(i).setCode(printer.ref().string());
> 
> I am not a fan of showing the assembly to any user.  People should learn how
> to code in JS, not how to code in Assembly.

The JIT inspector addon is intended both for users looking to tune perf in their code and for developers of the JIT itself.  The latter group definitely cares about the generated assembly; displaying this assembly makes it easy to see where there are performance defects in the generated code (excessive spilling, lousy instruction selection, unnecessary use of registers, ...).  Right now the Ion output we can show will mainly be useful to people developing the JIT.  Fixing bug 771118 would make it more generally useful.
Attached patch part 1 (obsolete) — Splinter Review
Restructure code generation instrumentation per comment 12.
Attachment #681534 - Flags: review?(nicolas.b.pierron)
Attachment #681086 - Attachment is obsolete: true
Comment on attachment 681534 [details] [diff] [review]
part 1

Review of attachment 681534 [details] [diff] [review]:
-----------------------------------------------------------------

Document PCCount API. Ask Waldo if needed.
Fix & Optimize ARM Macro assembler. Ask Marty if needed.
Fix the Macro assembler signature.
Use the outer resume points to recover the enclosing basic block.

::: js/src/ion/CodeGenerator.cpp
@@ +1395,5 @@
> +
> +        // Find a PC offset in the outermost script to use. If this block is
> +        // from an inlined script, find a location in the outer script to
> +        // associate information about the inling with. Follow the inlined
> +        // block's dominators to find the nearest block in the outer script.

I will suggest another approach based on the caller() link of resume point and the block() function of the last resume point to get the outermost caller block.

::: js/src/ion/IonCode.h
@@ +476,5 @@
> +
> +    void setCode(const char *code) {
> +        char *ncode = (char *) js_malloc(strlen(code) + 1);
> +        if (ncode) {
> +            strcpy(ncode, code); // nooooooo

nit: I like your comment.

::: js/src/ion/arm/MacroAssembler-arm.cpp
@@ +65,5 @@
>      ma_b(fail, Assembler::Equal);
>  }
>  
> +void
> +MacroAssemblerARM::increment64(uint64 *dest)

This function should have the following signature and name:

    void MacroAssemblerARM::inc64(AbsoluteAddress dest)

No raw pointer are accepted in the signature of the Macro assembler functions, and other operations are named add32, shr32, and so on …

@@ +72,5 @@
> +    // store to an absolute address, so at least one new register is needed in
> +    // addition to the scratch register. Pick an arbitrary one to spill.
> +    ma_push(r0);
> +
> +    load32(AbsoluteAddress(dest), r0);

ARM is used to work with pair of consecutive registers, you should ask Marty to know if this would be more efficient on ARM by  pushing r0, r1 (with one push), and loading the 64 bit value in r0 and r1 in one load instruction. (in which case there is only one instruction to skip after in case of overflow)

@@ +77,5 @@
> +    ma_add(Imm32(1), r0);
> +    store32(r0, AbsoluteAddress(dest));
> +    Label noOverflow;
> +    ma_cmp(r0, Imm32(0));
> +    ma_b(&noOverflow, NonZero);

I don't know if this is necessary for skipping 3 instructions.  You might consider using the Conditional version of each instruction.

@@ +78,5 @@
> +    store32(r0, AbsoluteAddress(dest));
> +    Label noOverflow;
> +    ma_cmp(r0, Imm32(0));
> +    ma_b(&noOverflow, NonZero);
> +    load32(AbsoluteAddress(dest), r0);

Typo?  You might want to access the high part of the number, not the same part again.  In addition, you don't want to bake in the constant as ARM is quite verbose about loading constants, and by loading it in a register you can do the store of the low bits and move the pointer at the same time:

   ma_dtr(IsStore, ScratchRegister, Imm32(4), r0, PostIndex);

::: js/src/ion/x86/MacroAssembler-x86.h
@@ +687,5 @@
> +    void increment64(uint64 *dest) {
> +        addl(Imm32(1), Operand(dest));
> +        Label noOverflow;
> +        cmpl(Operand(dest), Imm32(0));
> +        j(NotEqual, &noOverflow);

You can just check the overflow flag after the addl instruction.

::: js/src/jsopcode.cpp
@@ +6866,5 @@
>      buf.append(']');
> +
> +    ion::IonScriptCounts *ionCounts = sac.getIonCounts();
> +    if (ionCounts) {
> +        AppendJSONProperty(buf, "ion");

Provide appropriate documentation for the output of GetPCCountJSON as this function is part of the API.  Currently there is no documentation for any of the PCCount function.

::: js/src/jsscript.cpp
@@ +846,5 @@
> +{
> +    JS_ASSERT(script->hasScriptCounts);
> +    ScriptCountsMap *map = script->compartment()->scriptCountsMap;
> +    JS_ASSERT(map);
> +    ScriptCountsMap::Ptr p = map->lookup(script);

nit: JS_ASSERT(map) is useless.
Attachment #681534 - Flags: review?(nicolas.b.pierron)
Comment on attachment 681207 [details] [diff] [review]
clean up instruction spew

Review of attachment 681207 [details] [diff] [review]:
-----------------------------------------------------------------

Ion ARM assembler is not instrumented, it at least lack  void setPrinter(Sprinter *sp).
Attached patch part 1 (obsolete) — Splinter Review
Fix some of the nits from comment 16.

Sorry, but your comments about ARM performance are pretty off base.  I'm not going to try to micro-optimize code which is only used for debugging instrumentation, especially when it's not clear that the proposed changes will actually be an improvement.

For x86, we have a jump-on-overflow but not a jump-on-non-overflow, which your change would require (or a more complicated nest of jumps).  Again, not going to optimize this.

I added docs on the PCCount interface to https://developer.mozilla.org/en-US/docs/XPCOM_Interface_Reference/nsIDOMWindowUtils.  This doesn't include docs on the contents of the JSON, as doing this properly would basically require making all the internal details of JSScript part of the API, which isn't going to happen.
Attachment #681534 - Attachment is obsolete: true
Attachment #682035 - Flags: review?(nicolas.b.pierron)
Comment on attachment 682035 [details] [diff] [review]
part 1

Review of attachment 682035 [details] [diff] [review]:
-----------------------------------------------------------------

Please address **all** remarks made during this review and the previous one.

::: js/src/ion/IonCode.h
@@ +476,5 @@
> +
> +    void setCode(const char *code) {
> +        char *ncode = (char *) js_malloc(strlen(code) + 1);
> +        if (ncode) {
> +            strcpy(ncode, code); // nooooooo

Remove comment.

::: js/src/ion/arm/MacroAssembler-arm.cpp
@@ +81,5 @@
> +    ma_b(&noOverflow, NonZero);
> +    load32(AbsoluteAddress(dest), r0);
> +    ma_add(Imm32(1), r0);
> +    store32(r0, AbsoluteAddress(((uint32 *) dest) + 1));
> +    bind(&noOverflow);

Rename this function "inc64" as previously mentioned 2 times before.

Clean-up this function, Do not create unnecessary temporary object.
   AbsoluteAddress(dest)

Hide “AbsoluteAddress(((uint32 *) dest) + 1)” logic under “AbsoluteAddress::offset(uintptr_t)” function

Fix as previously commented on this function and test on ARM.

::: js/src/ion/x86/MacroAssembler-x86.h
@@ +687,5 @@
> +    void increment64(AbsoluteAddress dest) {
> +        addl(Imm32(1), Operand(dest));
> +        Label noOverflow;
> +        cmpl(Operand(dest), Imm32(0));
> +        j(NotEqual, &noOverflow);

addl(Imm32(1), Operand(dest));
    Label noOverflow;
    j(NotOverflow, &noOverflow);

This implies adding the NotOverflow inside the enum as ConditionNO.  I don't see where we need an extra jump?

::: js/src/jsscript.cpp
@@ +845,5 @@
> +static inline ScriptCountsMap::Ptr GetScriptCountsMapEntry(JSScript *script)
> +{
> +    JS_ASSERT(script->hasScriptCounts);
> +    ScriptCountsMap *map = script->compartment()->scriptCountsMap;
> +    JS_ASSERT(map);

Remove useless assertion.
Attachment #682035 - Flags: review?(nicolas.b.pierron)
(In reply to Brian Hackett (:bhackett) from comment #18)
> I added docs on the PCCount interface to
> https://developer.mozilla.org/en-US/docs/XPCOM_Interface_Reference/
> nsIDOMWindowUtils.

That's a good first step. Thanks.

> This doesn't include docs on the contents of the JSON,
> as doing this properly would basically require making all the internal
> details of JSScript part of the API, which isn't going to happen.

“The structure of the JSON is unspecified and may vary between versions of Firefox.”, Then you should flag all of the functions related to PCCount as {{experimental_inline}}.
Attached patch part 1Splinter Review
Per discussion I made the ARM implementation a NYI.  Will file a followup after this lands to implement it, should someone want detailed Ion profiling on ARM.  I added experimental_inline to the dev docs and made other requested changes.
Attachment #682035 - Attachment is obsolete: true
Attachment #682258 - Flags: review?(nicolas.b.pierron)
Comment on attachment 682258 [details] [diff] [review]
part 1

Review of attachment 682258 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/ion/arm/MacroAssembler-arm.cpp
@@ +67,5 @@
>  
> +void
> +MacroAssemblerARM::inc64(AbsoluteAddress dest)
> +{
> +    JS_NOT_REACHED("NYI");

Considering an optimized build, this will cause the function to behave as a no-op.  Which means that any code using inc64 will have a good surprise if they are not tested, which is sad.

For this patch, not implementing this function will cause the Ion counters to never be incremented which does not cause any security issue but does provide a different behavior, which is acceptable for now.

::: js/src/ion/x64/MacroAssembler-x64.h
@@ +827,5 @@
>      void convertUInt32ToDouble(const Register &src, const FloatRegister &dest) {
>          cvtsq2sd(src, dest);
>      }
>  
> +    void inc(AbsoluteAddress dest) {

You should name it inc64, to prevent a compilation issue on x64.

::: js/src/jsscript.cpp
@@ +846,5 @@
> +{
> +    JS_ASSERT(script->hasScriptCounts);
> +    ScriptCountsMap *map = script->compartment()->scriptCountsMap;
> +    JS_ASSERT(map);
> +    ScriptCountsMap::Ptr p = map->lookup(script);

You can remove JS_ASSERT(map), as the pointer is dereferenced just after.

@@ +859,1 @@
>      return p->value.pcCountsVector[pc - code];

The same goes for JS_ASSERT(p).
Attachment #682258 - Flags: review?(nicolas.b.pierron) → review+
(I think the script looks for things that start with "[leave")
Whiteboard: leave open → [leave open]
This has regressed sunspider by about 2%.
Where do you see that? SunSpider looks fairly stable on AWFY, and this patch didn't change code generation.
(In reply to comment #30)
> Where do you see that? SunSpider looks fairly stable on AWFY, and this patch
> didn't change code generation.

Regression  SunSpider 0.9.1 increase 1.96% on MacOSX 10.6 (rev4) Mozilla-Inbound
----------------------------------------------------------------------------------
    Previous: avg 311.070 stddev 1.573 of 30 runs up to revision 5b43d9de3c05
    New     : avg 317.160 stddev 0.351 of 5 runs since revision fee11982f357
    Change  : +6.090 (1.96% / z=3.871)
    Graph   : http://mzl.la/TQfZtu

Changeset range: http://hg.mozilla.org/integration/mozilla-inbound/pushloghtml?fromchange=5b43d9de3c05&tochange=fee11982f357

Changesets:
  * http://hg.mozilla.org/integration/mozilla-inbound/rev/26eeb8c3a0ae
    : Shu-yu Guo <shu@rfrn.org> - Bug 791850 - Followup: fix computing caller stack formal position. (r=jorendorff)
    : http://bugzilla.mozilla.org/show_bug.cgi?id=791850

  * http://hg.mozilla.org/integration/mozilla-inbound/rev/fee11982f357
    : Brian Hackett <bhackett1024@gmail.com> - Show generated assembly in Ion PC count information, bug 811349. r=pierron
    : http://bugzilla.mozilla.org/show_bug.cgi?id=811349

Bugs:
  * http://bugzilla.mozilla.org/show_bug.cgi?id=791850 - Lazily clone self-hosted methods installed via js_DefineFunction
  * http://bugzilla.mozilla.org/show_bug.cgi?id=811349 - IonMonkey: add basic block hit counters to PC count information
_______________________________________________
dev-tree-management mailing list
dev-tree-management@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-tree-management


Somebody from the JS team needs to watch dev-tree-management.  :-)
We use AWFY for tracking small regressions in the JS engine as we've found it to be more reliable, especially for SunSpider.
(In reply to comment #32)
> We use AWFY for tracking small regressions in the JS engine as we've found it
> to be more reliable, especially for SunSpider.

If you feel that the Talos benchmarks that we run are not useful, please file a bug to disable them.  Otherwise, we can't ignore them just because we also use AWFY.
https://hg.mozilla.org/mozilla-central/rev/fee11982f357
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla20
(In reply to Ehsan Akhgari [:ehsan] from comment #33)
> (In reply to comment #32)
> > We use AWFY for tracking small regressions in the JS engine as we've found it
> > to be more reliable, especially for SunSpider.
> 
> If you feel that the Talos benchmarks that we run are not useful, please
> file a bug to disable them.  Otherwise, we can't ignore them just because we
> also use AWFY.

Well, not all regressions are equal, but I think you're right, we should remove the test. Where can I file a bug to do that?
(In reply to comment #35)
> (In reply to Ehsan Akhgari [:ehsan] from comment #33)
> > (In reply to comment #32)
> > > We use AWFY for tracking small regressions in the JS engine as we've found it
> > > to be more reliable, especially for SunSpider.
> > 
> > If you feel that the Talos benchmarks that we run are not useful, please
> > file a bug to disable them.  Otherwise, we can't ignore them just because we
> > also use AWFY.
> 
> Well, not all regressions are equal, but I think you're right, we should remove
> the test. Where can I file a bug to do that?

Testing::Talos would be the right component, I believe.
Blocks: 814489
I noticed some jit tests are failing with this NYI.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Pretty sure this implements inc64.  Of course that depends a bit on exactly how it is supposed to function.
Attachment #685091 - Flags: review?(bhackett1024)
Attachment #685091 - Flags: review?(Jacob.Bramley)
Attachment #685091 - Flags: review?(Jacob.Bramley) → review+
Attachment #685091 - Flags: review?(bhackett1024) → review+
https://hg.mozilla.org/mozilla-central/rev/abf833bf24de
Status: REOPENED → RESOLVED
Closed: 12 years ago12 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: