1494473 - Collect statistics about CacheIR IC execution

Assignee

Description

•

7 years ago

It may be valuable for us to gather detailed execution statistics about CacheIR stubs, to further our ability to make decisions. For one could imagine providing execution weighted data in the BaselineInspector.

Matthew Gaudet (he/him) [:mgaudet]

Assignee

Comment 1

•

7 years ago

Attached patch Add success and entered counters to Baseline's CacheIR_Regular stubs (obsolete) — Details — Splinter Review

By adding a counter increment to the entry of an IC stub, as well as on a successful-value-return Path, we are able to compute a hit-rate for a particular stub (as implicitly the failure case is |entered - success|). This information can be queried in the baseline inspector to provide more information to Ion should it be desired. Note: this version uses 64 bit counters to try to avoid overflow situations.

Attachment #9012378 - Flags: feedback?(kvijayan)

Matthew Gaudet (he/him) [:mgaudet]

Assignee

Updated

•

7 years ago

Assignee: nobody → mgaudet

Status: NEW → ASSIGNED

Matthew Gaudet (he/him) [:mgaudet]

Assignee

Comment 2

•

7 years ago

Attached patch Allow dumping information on script cleanup using DUMP_IC_INFO environment variable (obsolete) — Details — Splinter Review

This allows us to extract information from the IC chains like this: /Users/mgaudet/mozilla-unified-clean/js/src/octane/box2d.js:174:250 (sub) CacheIR_Regular 1009 236 773 CacheIR_Regular 772 772 0 Which shows there are two CacheIR stubs attached for a JSOP_SUB bytecode, and out of the 1009 usages of the IC Chain, 236 succeeded on the first IC, and 772 succeeded on the second IC.

Attachment #9012379 - Flags: feedback?(kvijayan)

Matthew Gaudet (he/him) [:mgaudet]

Assignee

Comment 3

•

7 years ago

Comment on attachment 9012378 [details] [diff] [review] Add success and entered counters to Baseline's CacheIR_Regular stubs Before landing this, this patch would of course need to be try-built, and benchmarked appropriately. This patch currently only handles Baseline CacheIR ICs. A similar approach can be taken in Ion, however, it will require some more complexity (patched addresses I believe) due to the lack of ICStubReg in the Ion case. Also note: Currently uses 64 bit counters to try to avoid caring about overflow, but one could imagine shortening the counters if we're OK with 32 bits of accuracy.

Attachment #9012378 - Flags: feedback?(jdemooij)

Matthew Gaudet (he/him) [:mgaudet]

Assignee

Comment 4

•

7 years ago

(one more comment about Patch 2: discardStubs/discardOptimizedStubs will hide data from that dump)

Kannan Vijayan [:djvj]

Comment 5

•

7 years ago

Comment on attachment 9012378 [details] [diff] [review] Add success and entered counters to Baseline's CacheIR_Regular stubs Neat! Those are pretty big differences, and I can see these sorts of distribution unevenness showing up basically everywhere, and in even more exaggerated ways (e.g. a 95%/5% split on stubs). Ion utilizing this information to selectively inline certain paths and include bailout / offramp paths for others would be valuable to keep in mind. Additionally, this approach can be extended to other areas of the engine as well: 1. Value-specialization for Ion to inject speculative path splitting based on predicted high-frequency constants. 2. Basic block outgoing-edge counts, to feed into PGO style compile decisions, including better selection of where to put bailouts / offramps into lower tiers.

Attachment #9012378 - Flags: feedback?(kvijayan) → feedback+

Kannan Vijayan [:djvj]

Comment 6

•

7 years ago

Comment on attachment 9012378 [details] [diff] [review] Add success and entered counters to Baseline's CacheIR_Regular stubs Review of attachment 9012378 [details] [diff] [review]: ----------------------------------------------------------------- Nice. ::: js/src/jit/BaselineCacheIRCompiler.cpp @@ +79,5 @@ > +static void > +EmitUpdateSuccessCounter(MacroAssembler& masm) > +{ > + Address success(ICStubReg, ICCacheIR_Regular::offsetOfSuccess()); > + masm.addPtr(Imm32(1), success); Just for noting: if we end up going the route of using uint32_ts with periodic scaling, then the addPtr() can be followed by an `jif_overflow` to a the scaling code. @@ +190,5 @@ > masm.setSecondScratchReg(BaselineSecondScratchReg); > #endif > > + if (kind_ == BaselineCacheIRStubKind::Regular) { > + EmitUpdateEnteredCounter(masm); As noted before, if we only keep a success count on the stubs, and avoid the "entered" counter.. leaving that to be a sum of the hits across all stubs including fallback stub, we can eliminate this update from the optimized cases. ::: js/src/jit/BaselineIC.h @@ +823,5 @@ > + // value. > + uint64_t entered_; > + > + // Counts the nmber of times the stub has successfully provided a value. > + uint64_t success_; We don't need two fields here. We can keep a count on the fallback stub. Additionally, this can be a uint32_t. The way to handle this is to periodically scale things So if a particular IC's total hits become UINT32_MAX count (very rare), then we actually jump to some trampoline code into C++ that walks the IC chain, and scales all the values down by some constant factor (e.g. 2 or 4). This loses some precision, in that very-low-frequency stubs will scale down to 0.. and we can adjust for that to ensure that a nonzero values never scales down to anything lower than 1. This has the small imprecision of maybe over-representing small-frequency stubs in extreme cases.. but in practice it won't be a big issue.

Kannan Vijayan [:djvj]

Comment 7

•

7 years ago

Comment on attachment 9012379 [details] [diff] [review] Allow dumping information on script cleanup using DUMP_IC_INFO environment variable Presumably this would be turned into Spew output.

Attachment #9012379 - Flags: feedback?(kvijayan) → feedback+

Jan de Mooij [:jandem]

Comment 8

•

7 years ago

Comment on attachment 9012378 [details] [diff] [review] Add success and entered counters to Baseline's CacheIR_Regular stubs Review of attachment 9012378 [details] [diff] [review]: ----------------------------------------------------------------- ::: js/src/jit/BaselineCacheIRCompiler.cpp @@ +76,5 @@ > } > }; > > +static void > +EmitUpdateSuccessCounter(MacroAssembler& masm) If there's one caller maybe just inline there. I agree with Kannan that one counter is probably sufficient. I sort of like the success one because "did this optimization succeed?" is the most useful bit of information I expect and answering that won't require more analysis. I'd consider an uint32_t that we increment without worrying about overflow. Even if we hit this stub 1 million times a second, then it will still take more than an hour to overflow, and in practice GC will probably have purged the stub long before that. For profiling/heuristics I think that's still plenty useful and not worth the extra hit/overflow/complexity of handling overflow. If that makes us uncomfortable I'd say uint64_t for simplicity/perf.

Attachment #9012378 - Flags: feedback?(jdemooij) → feedback+

Matthew Gaudet (he/him) [:mgaudet]

Assignee

Comment 9

•

7 years ago

I've been working on productizing these patches. The current barrier I'm trying to overcome is that it appears there is no guarantee that by the time emitReturnFromIC() runs ICStubReg is still actually pointed to the beginning of the ICStub. I'm still trying to find what code right now is most responsible, but it can be seen by injecting a masm.printf and looking at the value of ICStubReg just before we update the 'counter' ICStubReg: 6210003d7718 ICStubReg: 6210003af310 ICStubReg: 1fffdda13814 ICStubReg: 6210003af700 ICStubReg: 6210003af310 One of these things is not like the other. My first analytic gut instinct was BaseineCompiler::callVM, but it -appears- that should be saving and restoring ICStubReg.

Matthew Gaudet (he/him) [:mgaudet]

Assignee

Updated

•

7 years ago

Priority: -- → P3

QA Contact: sdetar

Firefox Bug Husbandry Bot

Updated

•

7 years ago

QA Contact: sdetar

Matthew Gaudet (he/him) [:mgaudet]

Assignee

Comment 10

•

7 years ago

Attached file Bug 1494473 - Add entered counters to Baseline's CacheIR_Regular and Fallback stubs r?jandem — Details

By adding a counter increment to the entry of an IC stub, it is possible to analyze the IC chain for patterns that indicate how the ICs are performing, which can influence the baseline inspector, and allow Ion to weight the choices it makes on IC data.

Matthew Gaudet (he/him) [:mgaudet]

Assignee

Comment 11

•

7 years ago

Attached file Bug 1494473 - Using IONFLAGS=bl-ic-stats dump baseline IC statistics on script cleanup r?jandem — Details

The output for a particular IC chain looks like this: box2d.js:174:250 (sub) 1009 -> 772 -> (fb) 2 Which is read like this: This sub opcode has two ICs attached. The first IC was entered 1009 times, the second 772 and the fallback stub was hit twice. There are some conclusions we can draw from this (and some we cannot) - We can say with confidence the fallback stub was only hit twice, meaning 99.998% of queries to the IC chain hit in the IC rather than the fallback. - We cannot however say necessarily that the first IC failed to provide a value 771 times. Since new ICs are added at the front, it is possible that there was a phase transition that happened, and the first 771 requests were served by the first IC, followed by a transition to a new type and addition of a new IC, and subsequently 1009 queries were served by the newly added IC. - We can conclude with confidence that there have been at least two different ICs involved in computation across the lifetime of the IC chain, and they are almost equally probable. Though we can't do temporal reasoning with entry counters, we can still do some reasoning. Depends on D8859

Matthew Gaudet (he/him) [:mgaudet]

Assignee

Updated

•

7 years ago

Attachment #9012378 - Attachment is obsolete: true

Matthew Gaudet (he/him) [:mgaudet]

Assignee

Updated

•

7 years ago

Attachment #9012379 - Attachment is obsolete: true

Matthew Gaudet (he/him) [:mgaudet]

Assignee

Comment 12

•

7 years ago

I *think* as a side effect of adding counters to fallback stubs, it's possible to infer branch taken probabilities and code execution frequencies: At a given branch, walk the bytecode forward on both the target and fallthrough looking for ops with ICEntries; When you encounter them check the IC chains: 1. If there are zero hits to the fallback, the code has never been executed, so you can infer the branch has only been taken the one way. 2. If there is an IC chain, you can sum the number of entries across the chain, and compare frequencies to give a rough branch biasing. If we wanted this to be more robust, on IC unlinking, we could roll the sum into a special field added to ICFallbackStub to help track counts more accurately. A side note: it may make sense from a reasoning perspective to reset the IC counters on the chain to 1 on the addition of a new stub; this would mean that at any time you look at an IC chain you will be able to reason about the success rate of each stub by looking at it's entry count divided by the entry count of its successors; resetting restores some semblance of the temporal reasoning discussed in Comment 11, as you'd be able to clearly answer the phase transition question.

Matthew Gaudet (he/him) [:mgaudet]

Assignee

Updated

•

7 years ago

Blocks: 1500187

Matthew Gaudet (he/him) [:mgaudet]

Assignee

Updated

•

7 years ago

Blocks: 1500189

Matthew Gaudet (he/him) [:mgaudet]

Assignee

Updated

•

7 years ago

Blocks: 1500194

Pulsebot

Comment 13

•

7 years ago

Pushed by mgaudet@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/2ac1bca7094c Add entered counters to Baseline's CacheIR_Regular and Fallback stubs r=jandem https://hg.mozilla.org/integration/autoland/rev/4bdddb2c5100 Using IONFLAGS=bl-ic-stats dump baseline IC statistics on script cleanup r=jandem

Sebastian Hengst [:aryx] (needinfo me if it's about an intermittent or backout)

Comment 14

•

7 years ago

bugherder

https://hg.mozilla.org/mozilla-central/rev/2ac1bca7094c https://hg.mozilla.org/mozilla-central/rev/4bdddb2c5100

Status: ASSIGNED → RESOLVED

Closed: 7 years ago

status-firefox65: --- → fixed

Resolution: --- → FIXED

Target Milestone: --- → mozilla65

Matthew Gaudet (he/him) [:mgaudet]

Assignee

Updated

•

7 years ago

Blocks: 1501775

Matthew Gaudet (he/him) [:mgaudet]

Assignee

Updated

•

7 years ago

Blocks: 1501779

Matthew Gaudet (he/him) [:mgaudet]

Assignee

Updated

•

7 years ago

Blocks: 1501787

Ryan VanderMeulen [:RyanVM]

Updated

•

7 years ago

status-firefox64: affected → wontfix

Add success and entered counters to Baseline's CacheIR_Regular stubs 7 years ago Matthew Gaudet (he/him) [:mgaudet] 6.94 KB, patch	djvj : feedback+ jandem : feedback+	Details \| Diff \| Splinter Review
Allow dumping information on script cleanup using DUMP_IC_INFO environment variable 7 years ago Matthew Gaudet (he/him) [:mgaudet] 3.55 KB, patch	djvj : feedback+	Details \| Diff \| Splinter Review
Bug 1494473 - Add entered counters to Baseline's CacheIR_Regular and Fallback stubs r?jandem 7 years ago Matthew Gaudet (he/him) [:mgaudet] 46 bytes, text/x-phabricator-request		Details \| Review
Bug 1494473 - Using IONFLAGS=bl-ic-stats dump baseline IC statistics on script cleanup r?jandem 7 years ago Matthew Gaudet (he/him) [:mgaudet] 46 bytes, text/x-phabricator-request		Details \| Review