Last Comment Bug 691788 - Add type behavior info to PCCOUNTS
: Add type behavior info to PCCOUNTS
Status: RESOLVED FIXED
:
Product: Core
Classification: Components
Component: JavaScript Engine (show other bugs)
: unspecified
: x86 Mac OS X
: -- normal (vote)
: mozilla11
Assigned To: Brian Hackett (:bhackett)
:
Mentors:
Depends on:
Blocks: 771118
  Show dependency treegraph
 
Reported: 2011-10-04 09:05 PDT by Brian Hackett (:bhackett)
Modified: 2012-07-05 06:48 PDT (History)
5 users (show)
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---


Attachments
WIP (23.11 KB, patch)
2011-10-04 17:55 PDT, Brian Hackett (:bhackett)
no flags Details | Diff | Splinter Review
WIP v2 (33.10 KB, patch)
2011-10-06 19:00 PDT, Shu-yu Guo [:shu]
no flags Details | Diff | Splinter Review
Prop access patch (10.00 KB, patch)
2011-10-07 00:16 PDT, Shu-yu Guo [:shu]
no flags Details | Diff | Splinter Review
WIP v3 (39.60 KB, patch)
2011-10-13 03:44 PDT, Shu-yu Guo [:shu]
no flags Details | Diff | Splinter Review
Hardcode pc counters to be on and dump to file (2.37 KB, patch)
2011-10-13 12:04 PDT, Shu-yu Guo [:shu]
no flags Details | Diff | Splinter Review
WIP v3.1 (39.60 KB, patch)
2011-10-17 09:26 PDT, Shu-yu Guo [:shu]
no flags Details | Diff | Splinter Review
WIP v3.1 (39.67 KB, patch)
2011-10-18 03:35 PDT, Shu-yu Guo [:shu]
no flags Details | Diff | Splinter Review
scraper (2.54 KB, patch)
2011-10-18 08:26 PDT, Brian Hackett (:bhackett)
no flags Details | Diff | Splinter Review
revised (41.52 KB, patch)
2011-10-18 11:43 PDT, Brian Hackett (:bhackett)
no flags Details | Diff | Splinter Review
scraper (4.83 KB, patch)
2011-10-19 11:33 PDT, Brian Hackett (:bhackett)
no flags Details | Diff | Splinter Review
patch (8aeb207c9a2f) (68.75 KB, patch)
2011-11-01 15:56 PDT, Brian Hackett (:bhackett)
sphink: review+
Details | Diff | Splinter Review
sample output (3.35 KB, text/plain)
2011-11-03 08:40 PDT, Brian Hackett (:bhackett)
no flags Details
revised patch (8aeb207c9a2f) (68.66 KB, patch)
2011-11-03 12:35 PDT, Brian Hackett (:bhackett)
no flags Details | Diff | Splinter Review
sample output (4.47 KB, text/plain)
2011-11-03 12:37 PDT, Brian Hackett (:bhackett)
no flags Details

Description Brian Hackett (:bhackett) 2011-10-04 09:05:07 PDT
Right now the script PCCOUNTS can be used to track information about execution in both a dynamic and static sense (e.g. dynamic: where are the hotspots, static: which code is ever executed).  Additional metrics have been added to estimate the quality of the jitcode generated.

We should add additional metrics about the types that show up during execution.  This is needed to better understand and analyze TI's behavior for a writeup, and it should be useful to supplement the existing data when analyzing performance faults (more of the 'why' rather than the 'what' for the quality of generated jitcode).

Mainly:

- For each opcode, what are the type distributions of the pushed and/or popped values, in terms of both inferred and observed types.

- Counts of property accesses and calls that require type barriers.

- Counts of accesses on packed arrays and definite properties.

- Counts of accesses on missing properties and integer arithmetic which overflows.
Comment 1 Brian Hackett (:bhackett) 2011-10-04 17:55:56 PDT
Created attachment 564722 [details] [diff] [review]
WIP

Tracks information about observed types at name and property accesses, whether those accesses were inferred as monomorphic or polymorphic, and whether property accesses and calls have type barriers.

Also updates pccounts output to be both human and machine readable.

Still needs tracking for type behavior of arithmetic and characteristics of element and property accesses.
Comment 2 Shu-yu Guo [:shu] 2011-10-05 19:07:42 PDT
What is ARITH_UNKNOWN and ARITH_OTHER intended for?
Comment 3 Brian Hackett (:bhackett) 2011-10-05 19:32:14 PDT
What I'd intended for the ARITH ones:

ARITH_INT: known int inputs, int output
ARITH_OVERFLOW: known int inputs, double output
ARITH_DOUBLE: known double inputs, double output
ARITH_OTHER: known inputs of other types
ARITH_UNKNOWN: polymorphic inputs

Since this is just about the known types it saves needing to do dynamic testing to figure out how to bucket an access.
Comment 4 Shu-yu Guo [:shu] 2011-10-06 19:00:17 PDT
Created attachment 565429 [details] [diff] [review]
WIP v2

Counts arithmetic and elem accesses, still needs to count the type of prop accesses
Comment 5 Shu-yu Guo [:shu] 2011-10-07 00:16:43 PDT
Created attachment 565456 [details] [diff] [review]
Prop access patch

Couldn't figure out a way to instrument prop accesses without editing the IC stubs themselves.

This patch increments the following counters for {GET,CALL}PROP under the following conditions:

PROP_DEFINITE: is definite prop, baked in slot access
PROP_SHAPE: shape-guarded IC hit, _including_ the 'warming' up hits to ic::GetProp/ic::CallProp stubs
PROP_UNKNOWN: hits to stubs after IC has been disabled

Brian, I'm not sure my understanding of IC in JM is correct or if the above is the right way to go. If not, please rewrite.
Comment 6 Brian Hackett (:bhackett) 2011-10-07 14:38:46 PDT
Looks good, we should be able to start getting measurements off benchmarks and websites now.
Comment 7 Shu-yu Guo [:shu] 2011-10-13 03:44:51 PDT
Created attachment 566777 [details] [diff] [review]
WIP v3

Updated patch.

 - Replace saving/restoring the counter reg using frame.addressOfTop() with masm.push/pop
 - Fix offset calculation in IC stubs
Comment 8 Shu-yu Guo [:shu] 2011-10-13 12:04:06 PDT
Created attachment 566900 [details] [diff] [review]
Hardcode pc counters to be on and dump to file

Each Firefox session dumps to its own pccounts dump file.
Comment 9 Shu-yu Guo [:shu] 2011-10-13 12:05:59 PDT
Both patches are diffed against m-c 78609:46a6d0fd13d5
Comment 10 Shu-yu Guo [:shu] 2011-10-17 09:26:50 PDT
Created attachment 567475 [details] [diff] [review]
WIP v3.1

Fix bug in arg inc/dec opcodes
Comment 11 Shu-yu Guo [:shu] 2011-10-18 03:35:42 PDT
Created attachment 567703 [details] [diff] [review]
WIP v3.1

Apparently I don't know how to qref...
Comment 12 Brian Hackett (:bhackett) 2011-10-18 08:26:32 PDT
Created attachment 567755 [details] [diff] [review]
scraper

Script to scrape type information from a pccounts log and collate the info at a domain granularity.
Comment 13 Brian Hackett (:bhackett) 2011-10-18 11:43:24 PDT
Created attachment 567813 [details] [diff] [review]
revised

Fix a few codegen bugs.
Comment 14 Brian Hackett (:bhackett) 2011-10-19 11:33:57 PDT
Created attachment 568130 [details] [diff] [review]
scraper

Update scraper to report ratios among each category for adding to tables.
Comment 15 Brian Hackett (:bhackett) 2011-11-01 15:56:11 PDT
Created attachment 571171 [details] [diff] [review]
patch (8aeb207c9a2f)

Updated patch, restructures PC count storage to use less memory and make it easier to track more refined counters for particular opcodes.  As a followup I want to add an about:jitperf or something to expose this information and allow people to better understand the JIT's behavior and identify performance faults.
Comment 16 Steve Fink [:sfink] [:s:] 2011-11-02 17:41:52 PDT
Comment on attachment 571171 [details] [diff] [review]
patch (8aeb207c9a2f)

Review of attachment 571171 [details] [diff] [review]:
-----------------------------------------------------------------

This looks like really useful stuff.

As for an about:jitperf, David Flanagan (djf) wrote CoverMonkey, a great command-line tool using the previous output format. Cedric Vivier wrote an addon that borrows the CoverMonkey analysis to give a live view of what code is executing. Cedric, I don't remember the name or details of your tool, and I couldn't dig up a link. Do you have something you could show Brian & Shu? It's well beyond any addon I'd be able to write for this stuff; I really wish we could've recorded the demo at the all-hands.

This change will totally break both of their tools. I don't think that should hold up landing this patch, but I'd like to get their views on how to expose the information they were using. djf - in particular, this patch replaces the -D output with a more direct dump of the pccounts information, and in non-DEBUG builds it won't show opcode names, just the numeric opcode values. (Previously, it wouldn't work at all in a non-DEBUG build, so that's not so bad.) It also loses the various offsets that you were using to do your reachability analysis, so you'll probably end up needing both dumps (the decompilation + pccounts.) It'd probably be easiest to put the subset of the counters you care about back into the decompilation output.

CoverMonkey - http://www.davidflanagan.com/2011/08/covermonkey-cod.html

::: js/src/jsinfer.cpp
@@ +2308,5 @@
>  void
>  ScriptAnalysis::addTypeBarrier(JSContext *cx, const jsbytecode *pc, TypeSet *target, Type type)
>  {
> +    target->addType(cx, type);
> +    return;

I'm guessing this wasn't an intended part of this patch... :)

::: js/src/jsopcode.cpp
@@ +370,5 @@
> +            return countAccessNames[which - BASE_LAST - 1];
> +        if (elementOp(op))
> +            return countElementNames[which - ACCESS_LAST - 1];
> +        if (propertyOp(op))
> +            return countPropertyNames[which - ACCESS_LAST - 1];

nit: could this be either [which - (BASE_LAST + 1)] or [which - BASE_COUNT] (or a similar name?) Took me a second to figure out where the 1-bias was coming from.

@@ +394,5 @@
> +
> +        int len = js_CodeSpec[op].length;
> +        jsbytecode *next = (len != -1) ? pc + len : pc + js_GetVariableBytecodeLength(pc);
> +
> +        Sprint(sp, "%05u %5u", pc - script->code, JS_PCToLineNumber(cx, script, pc));

JS_PCToLineNumber is a linear scan, which makes the whole thing quadratic. If we ever need to care about performance, we should switch to a shared op iterator. Doesn't seem worth fixing right now, though.

@@ -306,5 @@
>      uintN len;
>  
>      SprintCString(sp, "loc   ");
> -    if (script->pcCounters)
> -        Sprint(sp, "counts%*s x ", COUNTS_LEN - strlen("counts"), "");

djf, cedricv: note that this eliminates the output format you were using and moves it to a separate counts-only dump. Would you rather we left at least the 1st 3 counts here? (interpreter, trace jit, method jit) They'll soon become 2 with the removal of the tracer.

::: js/src/jsscript.h
@@ -543,3 @@
>  #define JS_SCRIPT_INLINE_DATA_LIMIT 4
>      uint8           inlineData[JS_SCRIPT_INLINE_DATA_LIMIT];
> -#endif

What's this?

::: js/src/jstracer.cpp
@@ -7219,5 @@
> -    if (!wasInImacro && cx->hasRunOption(JSOPTION_PCCOUNT)) {
> -        JSScript *script = cx->fp()->script();
> -        if (script->pcCounters) {
> -            int offset = cx->regs().pc - script->code;
> -            LIns *pcCounter_addr_ins = w.nameImmpNonGC(&script->pcCounters.get(JSPCCounters::TRACEJIT, offset));

If you're removing this, then I'd rather you removed the tjit slot in the counts at the same time.

::: js/src/methodjit/Compiler.cpp
@@ -2803,5 @@
>                  }
>              }
>          }
>  
> -        if (script->pcCounters || pcLengths) {

I need pcLengths to be filled in independently of pcCounters, for JIT code registration.

@@ +2950,5 @@
> +    types::TypeFlags flags = types->baseFlags();
> +    bool objects = !!(flags & types::TYPE_FLAG_ANYOBJECT) || !!types->getObjectCount();
> +
> +    if (objects && !!(flags & types::TYPE_FLAG_STRING))
> +        return false;

nit: I ignored another !! earlier, but these are just too noisy. (I assume these are from shu?) A boolean operator is enough to coerce both sides, and the main reason for using 'bool' in place of an 'int' is to eliminate the need for these sorts of things in assignments.

@@ +2961,5 @@
> +}
> +
> +void
> +mjit::Compiler::updatePCTypes(jsbytecode *pc, FrameEntry *fe)
> +{

JS_ASSERT(script->pcCounts);

@@ +2971,5 @@
> +    if (frame.peekTypeInRegister(fe) && reg == frame.tempRegForType(fe)) {
> +        JS_STATIC_ASSERT(Registers::ReturnReg != Registers::ArgReg1);
> +        reg = Registers::ArgReg1;
> +    }
> +    masm.push(reg);

Is there a way to know if this register is live here, to avoid the push/pop? I suppose it's not worth bothering about.

@@ +2990,5 @@
> +          case JSVAL_TYPE_OBJECT:     counter = OpcodeCounts::ACCESS_OBJECT;     break;
> +          default:;
> +        }
> +        if (counter)
> +            masm.bumpCounter(&counts.get(counter), reg);

counter can't be zero or otherwise false here, can it?

@@ +3051,5 @@
> +
> +void
> +mjit::Compiler::updateArithCounters(jsbytecode *pc, FrameEntry *fe,
> +                                    JSValueType firstUseType, JSValueType secondUseType)
> +{

JS_ASSERT(script->pcCounters)
Comment 17 Steve Fink [:sfink] [:s:] 2011-11-02 17:43:38 PDT
Cedric, David -- see top of comment 16.
Comment 18 David Flanagan [:djf] 2011-11-02 21:10:17 PDT
Steve,

Thanks for the heads up!  Am I understanding correctly that all the same information is being collected, but that the -D dump format is no longer reporting it all?  Or has the information collected changed, too?

If it is just the reporting, could we keep the current format for -Dcoverage and switch to the new format with -Dtypes or something like that?

I remember someone (forget who) saying that in the future the PCCOUNTS info would be available through the debugging API.  That sounds like a good long term solution.  I'd rather not chase an evolving textual dump format now and then switch to the debug API later, if we could keep the current dump format (even just in debug builds) as an option.

But if necessary, I can probably adapt CoverMonkey to use the new format.  Having to sync up a counts report with a separate disassembly sounds sort of tricky, but I haven't looked at any of this code in a couple of months, so I don't really know.
Comment 19 Brian Hackett (:bhackett) 2011-11-03 08:40:55 PDT
Created attachment 571655 [details]
sample output

Sample dump output with the patch.  Each line looks like:

00045     9   getlocal [interp: 113, mjit: 888, mjit_code: 5328, infer_poly: 888, observe_int32: 888]

In order, this is the op's bytecode offset, the line number of the op, the op name, and all the counts accumulated for the op.  David, does this have all the information you need?  If so, I would really rather switch to this format outright --- it adds a lot of complexity to support multiple output formats, and if all the information you need is available here then it should be straightforward to update CoverMonkey.

I don't expect the dump format to change after this.  This format is designed so that new counts can be added while still being both human- and machine-readable.

Short term, I want to expose this information to addons, for the above mentioned about:jitperf.  Cedric, it would be great if I could get a look at / get the code for your addon.
Comment 20 David Flanagan [:djf] 2011-11-03 10:45:46 PDT
Brian,

I love the named fields that hold the counts and other data.  If you change the square brackets to curly braces, then its JSON and becomes super-easy to parse from JS and other scripting langauges.

One of the things that CoverMonkey does is a reachability analysis to identify dead code. The opcode name doesn't do me any good without a full disassembly that includes jump offsets.  So with your output format, I'd have to take the bytecode offset and go look at the actual bytecodes and deal with those directly.  I could do that, but it would mean that I can't just pipe the -D output to a file and then analyze it in a separate pass.  I'd have to add my own flag to spidermonkey that sets -D but adds in the disassembly.  Or, CoverMonkey would have to re-compile the JavaScript source to regenerate the opcodes.

I'm guessing that you include the opcode name because it is useful information, but that you include only the opcode name so that you still have a readable output with fixed column widths....

How about leaving your format as is, except that you also stick another full copy of the disassembly at the end of each line, after all the counts.  Switches give multiline disassembly for all their offsets in the current -D format, and I don't need that.  And the current -D includes source code for anonymous functions and I certainly don't need that!

Or, if you don't want to do that by default, could you do a -Dd "dump with disassembly" option that includes the extra info?

Actually, instead of putting the disassembly at the end of the line, you could even include it inside the [] (or {}) along with the count info.  {..., op: "goto 19"}

While I'm raving about JSON, would you consdier puting the pc and linenum values inside the curly braces too and making each line an JSON object?  Then each script could be a JSON array and the entire dump could be another JSON array of scripts.  If you format it carefully, you could probably still get pretty good human readability of the raw output, and you would certainly make it easier to write scripts that analyze the output
Comment 21 Steve Fink [:sfink] [:s:] 2011-11-03 11:23:12 PDT
(In reply to David Flanagan from comment #18)
> Steve,
> 
> Thanks for the heads up!  Am I understanding correctly that all the same
> information is being collected, but that the -D dump format is no longer
> reporting it all?  Or has the information collected changed, too?

There is additional information being collected, but all the old information is still present. The -D dump format is now very different and tailored specifically to reporting the counts. It does *not* have enough information for you to do your analysis.

> If it is just the reporting, could we keep the current format for -Dcoverage
> and switch to the new format with -Dtypes or something like that?

I'm not sure of the exact syntax, but I'm leaning towards doing exactly that.

> I remember someone (forget who) saying that in the future the PCCOUNTS info
> would be available through the debugging API.  That sounds like a good long
> term solution.  I'd rather not chase an evolving textual dump format now and
> then switch to the debug API later, if we could keep the current dump format
> (even just in debug builds) as an option.

Yes, I told you that. There's a now-bitrotted patch in bug 671602 where jorendorff did exactly that, but the debug API has changed a fair bit since then, and it only exposed the raw counts. You need additional information.

(In reply to Brian Hackett from comment #19)
> In order, this is the op's bytecode offset, the line number of the op, the
> op name, and all the counts accumulated for the op.  David, does this have
> all the information you need?  If so, I would really rather switch to this
> format outright --- it adds a lot of complexity to support multiple output
> formats, and if all the information you need is available here then it
> should be straightforward to update CoverMonkey.

No, it doesn't have enough information. David does a reachability analysis that requires the offsets from various branching instructions. I just attempted to explain exactly what he's doing and why, but I think I'd better let him do it. :)

> I don't expect the dump format to change after this.  This format is
> designed so that new counts can be added while still being both human- and
> machine-readable.

I think we're talking about two separate use cases, though. David is (ok, was) doing code coverage tools, for which really only the interp/TM/JM (or just the sum of them) counts are relevant, plus line numbers and control flow graph information. Not that incorporating the additional counts wouldn't be cool, but it feels like if you want to do that it would be better to use both sets of outputs.

Brian, any objections to keeping just the basic counts in the decompiled output? I can put a patch to re-add them in a separate bug, if you want.

> Short term, I want to expose this information to addons, for the above
> mentioned about:jitperf.  Cedric, it would be great if I could get a look at
> / get the code for your addon.

Cedric really wanted control flow tracing for his addon instead of op or basic block counts, but his current addon is probably pretty close to being useful with your new info.
Comment 22 Steve Fink [:sfink] [:s:] 2011-11-03 11:23:44 PDT
Whoops, collided. Listen to David, not me.
Comment 23 David Flanagan [:djf] 2011-11-03 11:47:32 PDT
(In reply to Brian Hackett from comment #19)

> Short term, I want to expose this information to addons, for the above
> mentioned about:jitperf.  Cedric, it would be great if I could get a look at
> / get the code for your addon.

Brian, 

The CoverMonkey tool is at https://github.com/davidflanagan/CoverMonkey 
It does all the analysis of the -D output that is created by a spidermonkey on exit hook.

Cedric's add-on is at: https://github.com/neonux/CodeInspector
It uses CoverMonkey for coverage and # of execution analysis.  The trick with the add on is that it updates live, rather than waiting for -D dumps when the program exits.  So it required a patched version of Firefox to get those live dumps.  I forget whether that patch is now in the tree or not.
Comment 24 Brian Hackett (:bhackett) 2011-11-03 12:35:33 PDT
Created attachment 571743 [details] [diff] [review]
revised patch (8aeb207c9a2f)

The JSON idea sounds really good.  This patch changes the output to interleave JSON for the counts with the standard script disassembly (nothing is printed in release builds).
Comment 25 Brian Hackett (:bhackett) 2011-11-03 12:37:12 PDT
Created attachment 571744 [details]
sample output
Comment 26 David Flanagan [:djf] 2011-11-03 12:41:32 PDT
Brian,

Great! Assuming that you do something reasonable with the disassembly for switch statements (like removing newlines so the interleaving strictly alternates lines) then I should quite easily be able to update CoverMonkey to use this new format.

    David
Comment 27 Brian Hackett (:bhackett) 2011-11-03 13:53:32 PDT
Opcodes with multiple lines of disassembly will be fully printed before the count information for the op:

00012:   2  tableswitch defaultOffset 33 low 0 high 3
	0: 15
	1: 15
	2: 15
	3: 24
                  {"interp": 1}

I'll check this in after the next aurora branch next week.
Comment 28 Cedric Vivier [:cedricv] 2011-11-07 02:17:27 PST
(Sorry for late replies, I guess this bug got lost in my bugmail. Don't hesitate to ping  me on irc if I don't reply swiftly [cedricv])

(In reply to Steve Fink [:sfink] from comment #16)
> Comment on attachment 571171 [details] [diff] [review] [diff] [details] [review]
> patch (8aeb207c9a2f)
> 
> I couldn't dig up a link. Do you have something you could show Brian & Shu?
> It's well beyond any addon I'd be able to write for this stuff; I really
> wish we could've recorded the demo at the all-hands.

Actually there is one video demo about it :)
http://neonux.com/tmp/live-coverage-allhands-demo.ogv

Other than that, David posted the links to get the code running (with patch attached to the bug above)


(In reply to David Flanagan from comment #23)
> So it required a patched version of Firefox to get
> those live dumps.  I forget whether that patch is now in the tree or not.

The patch is attached to https://bugzilla.mozilla.org/show_bug.cgi?id=687134
It's not in the tree, and is not in a state to be in tree as it is now.
Moreover clean up and better API, we'll likely also need an API to start/stop the counters.

--
I really like the JSON idea as well.
Otoh could we make the format less tied to the opcodes (by definition very Spidermonkey-specific - maybe even version-specific) and just expose line and column info [1] ?


[1]: looks pretty close to land, https://bugzilla.mozilla.org/show_bug.cgi?id=568142
Comment 29 Brian Hackett (:bhackett) 2011-11-10 12:36:08 PST
https://hg.mozilla.org/integration/mozilla-inbound/rev/bd9c1c9707b0
Comment 30 Marco Bonardo [::mak] 2011-11-11 02:20:05 PST
https://hg.mozilla.org/mozilla-central/rev/bd9c1c9707b0

Note You need to log in before you can comment on or make changes to this bug.