Closed Bug 1209515 Opened 4 years ago Closed 4 years ago

Attempt to use Code Coverage for pruning branches in IonMonkey.

Categories

(Core :: JavaScript Engine: JIT, defect)


Tracking


RESOLVED FIXED
mozilla45
Tracking Status
firefox44 --- affected
firefox45 --- fixed

People

(Reporter: nbp, Assigned: nbp)

References

Details

Attachments

(12 files, 4 obsolete files)

2.66 KB, patch
bbouvier
: review+
Details | Diff | Splinter Review
3.41 KB, patch
bhackett1024
: review+
Details | Diff | Splinter Review
3.39 KB, patch
jandem
: review+
Details | Diff | Splinter Review
16.12 KB, patch
bhackett1024
: review+
Details | Diff | Splinter Review
4.96 KB, patch
bhackett1024
: review+
Details | Diff | Splinter Review
13.70 KB, patch
Details | Diff | Splinter Review
28.67 KB, patch
bhackett1024
: review+
Details | Diff | Splinter Review
3.56 KB, patch
jandem
: review+
Details | Diff | Splinter Review
5.35 KB, patch
jandem
: review+
Details | Diff | Splinter Review
2.63 KB, patch
jandem
: review+
Details | Diff | Splinter Review
4.00 KB, patch
bhackett1024
: review+
Details | Diff | Splinter Review
1.68 KB, patch
bbouvier
: review+
Details | Diff | Splinter Review
Now that Code Coverage is working fine in the Interpreter and in Baseline, we can make a second attempt at removing unused branches, as previously tried by Wei Wu in Bug 877872.

Also, removing the branch before Scalar Replacement should bring a 33% improvement on the for-of micro benchmark [1], as previously achieved by GVN in Bug 1165348, but this time without any of the overhead related to re-looping compiler phases.

[1] https://arewefastyet.com/#machine=28&view=single&suite=misc&subtest=basic-array-forof&start=1433992864&end=1434080570
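To make the target concrete, here is a hedged JS sketch of the kind of for-of micro benchmark this refers to (`sumForOf` is a hypothetical name, not the actual benchmark code): each iteration allocates an iterator-result object in the iterator's next(), and once the never-taken "done" side-exit branch is pruned, Scalar Replacement can remove that allocation.

```javascript
// Hypothetical reduction of a for-of micro benchmark: each loop step
// calls the Array iterator's next(), which allocates a {value, done}
// result object that Scalar Replacement can sink once the cold
// side-exit branch is pruned.
function sumForOf(arr) {
  var sum = 0;
  for (var x of arr)
    sum += x;
  return sum;
}

var data = [];
for (var i = 0; i < 1000; i++)
  data.push(i);
console.log(sumForOf(data)); // 499500
```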
With the current set of patches (part 0 to 5), running octane 997 times:

                  No PGO     With PGO
       Richards:   34683 ->   37175  7.2%  (++)
      DeltaBlue:   67346 ->   67583  0.4%
         Crypto:   32423 ->   32504  0.2%
       RayTrace:  115766 ->  112063  -3.2% (-)
    EarleyBoyer:   35002 ->   34644  -1.0% (-)
         RegExp:    4043 ->    4027  -0.4%
          Splay:   21158 ->   21202  0.2%
   SplayLatency:   23876 ->   23648  -1.0% (-)
   NavierStokes:   40103 ->   40013  -0.2%
          PdfJS:   20711 ->   20654  -0.3%
       Mandreel:   32616 ->   36339  11.4% (++)
MandreelLatency:   41156 ->   40076  -2.6% (-)
        Gameboy:   67691 ->   67690  -0.0% (whoa this is precise)
       CodeLoad:   20648 ->   19954  -3.4% (-)
          Box2D:   56376 ->   56558  0.3%
           zlib:   89513 ->   89535  0.0%
     Typescript:   35278 ->   36306  2.9%  (+)
          Score:   34944 ->   35143  0.6%  (+)

I guess CodeLoad is affected by this modification, as it implies collecting code coverage information while running the Interpreter and Baseline.

I will investigate Raytrace in case there is something obvious.

Also, the modifications to the test cases made in part 3 highlight that we should now get the 33% improvement on for-of loops (mentioned in comment 0) by removing the corner-case branch which does a side-exit.  This lets Scalar Replacement remove the object allocation created by the next function of the Array iterator.
The following corresponds to a little instrumented build that I made on top of the previous patches, which reports:
 - blocks: the total number of basic blocks produced by IonBuilder.
 - pruned: the number of basic blocks which either got removed or whose content got discarded.
 - %: the ratio of pruned branches over the number of blocks.
 - bailouts: the number of first-execution bailouts (produced by this optimization) which are taken.

Richards:     (blocks: 1350,   pruned: 54    (  3%), bailouts: 2)
DeltaBlue:    (blocks: 3238,   pruned: 77    (  2%), bailouts: 0)
Crypto:       (blocks: 6103,   pruned: 2310  ( 37%), bailouts: 22)
RayTrace:     (blocks: 3022,   pruned: 61    (  2%), bailouts: 0)
EarleyBoyer:  (blocks: 4705,   pruned: 1054  ( 22%), bailouts: 1)
RegExp:       (blocks: 406,    pruned: 116   ( 28%), bailouts: 2)
Splay:        (blocks: 1993,   pruned: 204   ( 10%), bailouts: 12)
NavierStokes: (blocks: 1287,   pruned: 169   ( 13%), bailouts: 4)
PdfJS:        (blocks: 11904,  pruned: 3831  ( 32%), bailouts: 16)
Mandreel:     (blocks: 7164,   pruned: 4984  ( 69%), bailouts: 31)
Gameboy:      (blocks: 16995,  pruned: 9940  ( 58%), bailouts: 59)
CodeLoad:     (blocks: 59,     pruned: 9     ( 15%), bailouts: 0)
Box2D:        (blocks: 12317,  pruned: 1699  ( 13%), bailouts: 47)
zlib:         (blocks: 43,     pruned: 4     (  9%), bailouts: 0)
Typescript:   (blocks: 107399, pruned: 28070 ( 26%), bailouts: 67)

Note that the current implementation only causes a recompilation after 10 bailouts of the same IonScript, even for Bailout_FirstExecution, which means that EarleyBoyer will bail out but not yet recompile the function.

We see that the RayTrace / CodeLoad benchmarks have no bailouts, which would mean that the slowdown observed in the previous comment is not caused by the code produced by this optimization but by the instrumentation which feeds this optimization phase.  Thus we should probably look into improving the collection of code coverage in Baseline.
(In reply to Nicolas B. Pierron [:nbp] from comment #8)
> I will investigate Raytrace in case there is something obvious.

After some investigation with "perf" on RayTrace, the Baseline profiles seem similar, which suggests that the overhead of incrementing code coverage counters is insignificant.  On the other hand, I noticed that raytrace.js:682:19 (Block14) is now taking a larger part of the profile, jumping from 0.6% to 1.6%.

One explanation would be that the code added in part 3 (ConvertToBailingBlock), which flags all resume point operands as having removed uses, is too conservative.  I took this conservative approach to avoid iterating over all instructions' operands of all removed basic blocks.  I will check if I can take a more precise approach to this problem.
(In reply to Nicolas B. Pierron [:nbp] from comment #10)
> One explanation would be that the code added in part 3
> (ConvertToBailingBlock) which is flagging all resume point operands has
> having removed uses is too conservative.  I took this conservative approach
> to avoid iterating over all instructions' operands of all removed basic
> blocks.  I will check if I can take a precise approach to this problem.

Ok, after lots of issues & rewrites to ensure I have the right set of less-conservative rules for marking removed uses, I finally got some encouraging results.

With the current set of patches (part 0 to 6(¹)), running octane 930 times:

                  No PGO     With PGO
       Richards:   34652 ->   37175  7.3%   (++)
      DeltaBlue:   67232 ->   67737  0.8%   (~+)
         Crypto:   32377 ->   32871  1.5%   (+)
       RayTrace:  115413 ->  114950  -0.4%
    EarleyBoyer:   34630 ->   34534  -0.3%
         RegExp:    4030 ->    4007  -0.6%  (~-)
          Splay:   21092 ->   20971  -0.6%  (~-)
   SplayLatency:   23917 ->   23752  -0.7%  (~-)
   NavierStokes:   40104 ->   39989  -0.3%
          PdfJS:   20361 ->   20786  2.1%   (+)
       Mandreel:   32436 ->   35749  10.2%  (++)
MandreelLatency:   40464 ->   41345  2.2%   (+)
        Gameboy:   67114 ->   67514  0.6%   (~+)
       CodeLoad:   20507 ->   20294  -1.0%  (-)
          Box2D:   56158 ->   57087  1.7%   (+)
           zlib:   89239 ->   89206  -0.0%
     Typescript:   34540 ->   36118  4.6%   (+)
          Score:   34728 ->   35266  1.5%   (++)

These less-conservative rules should not change the number of blocks / pruned branches / bailouts reported in comment 9.  These rules were only made to prevent other optimizations from optimizing out some instructions if we ever remove a use from the graph, such that we keep them in their original state for bailouts.

(¹) I will attach the patch for part 6 as soon as I have improved a few comments and fixed a few other patches attached here.
Delta:
 - Fix an issue when accessing getNearestPCCounts near the end of the script.
Attachment #8672751 - Attachment is obsolete: true
Delta:
 - Change the removal of blocks to be more like an analysis / transformation
 phase, and make the RemoveUses flagging precise. (With some nice ASCII art,
 because this is way easier to visualize than to read.)
Attachment #8672753 - Attachment is obsolete: true
Delta:
 - Check that we are effectively collecting results for the current script
 before any attempt to increment any counters.
Attachment #8673566 - Attachment is obsolete: true
Previously, counting code coverage in the interpreter required making use of
the interruptible flag, which basically did debug checks between every
instruction.

This change instruments the interpreter's macros to only increment counters
at branch locations.
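As an illustration only (a toy interpreter sketch, not the SpiderMonkey macros; all opcode names here are made up), counting only at branch opcodes instead of on every instruction looks roughly like:

```javascript
// Toy bytecode interpreter: coverage counters are bumped only when a
// branch-like opcode executes, instead of between every instruction.
function run(code, counts) {
  var pc = 0, stack = [];
  for (;;) {
    var op = code[pc];
    switch (op.kind) {
      case "push":
        stack.push(op.value);
        pc++;
        break;
      case "goto":
        // Branch location: the only place we touch the counters.
        counts[op.target] = (counts[op.target] || 0) + 1;
        pc = op.target;
        break;
      case "stop":
        return stack.pop();
    }
  }
}

var counts = {};
var result = run([
  {kind: "goto", target: 1},
  {kind: "push", value: 7},
  {kind: "stop"},
], counts);
console.log(result, counts); // 7 { '1': 1 }
```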
I did a bit of instrumentation (JitCode, IonScript, PCCounts) to approximate the memory usage, knowing that this optimization relies on code coverage instrumentation.  Here are the numbers from running the Octane benchmark (¹):

               No PGO           With PGO

ionScript:
  count:        1808               1884
  size:      8338096            6517784
  mean:         4611               3459
jitCode:
  count:       45130              47839
  size:     32441336           30991584
  mean:          718                647
pcCounts:
  count:           0              15671
  size:            0            2618768
  mean:            0                167

  count (the number of constructor called)
  size (the size of the allocation)
  mean (= size / count)

Note that the interesting part is not the sum: the version running with branch pruning enabled effectively compiles more frequently, but it compiles smaller code, as it is able to remove branches.

What is interesting is that this gives an overview of the number of bytes consumed per JSScript (167 bytes), which has to be compared with the delta of memory ""saved"" (²) by enabling this optimization (= 1223 = (4611 - 3459) + (718 - 647)).  This means that we are effectively saving memory if we Ion-compile at least 13.6% of the non-relazified JSScripts.

(¹) Octane is not a memory benchmark, but I used it to provide an approximation of the memory consumed for various data structures.

(²) Yes, ""saved"".  I was the first to be surprised, as I did not consider the shrinkage of the compiled code before comment 9.
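A quick sanity check of the arithmetic above (the numbers are the Octane approximations from this comment, so treat the result as a rough estimate):

```javascript
// Per-script cost of the coverage data vs. per-compile memory saved,
// using the measured mean sizes reported above.
var pcCountsPerScript = 167;                 // bytes of PCCounts per JSScript
var savedPerIonCompile = (4611 - 3459)       // smaller IonScript allocations
                       + (718 - 647);        // smaller JitCode allocations
console.log(savedPerIonCompile);             // 1223
// Break-even fraction of scripts that must be Ion-compiled:
console.log(pcCountsPerScript / savedPerIonCompile); // ~0.136, i.e. ~13.6%
```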
Attachment #8672750 - Flags: review?(benj)
Comment on attachment 8681243 [details] [diff] [review]
part 1 - IonBuilder: Attach hit counts on the MIRGraph.

Review of attachment 8681243 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/jit/MIRGraph.h
@@ +610,5 @@
> +    }
> +    uint64_t getHitCount() const {
> +        return hitCount_;
> +    }
> +    enum HitState { HS_NotDefined, HS_Count, HS_Frequency };

Note for the reviewer: Currently HS_Frequency is not used, but I intend to use it after converting the absolute counts into frequencies of visits.  The idea being that we can then tell the register allocator about cold/hot branches.
Attachment #8681243 - Flags: review?(bhackett1024)
Comment on attachment 8672752 [details] [diff] [review]
part 2 - Ensure that MPhi removal considers removed uses.

Review of attachment 8672752 [details] [diff] [review]:
-----------------------------------------------------------------

As PruneBranches comes before the MPhi removal, we need to annotate an MPhi if we removed one of its uses, and as we remove basic blocks, we need to propagate this information to the inputs of the removed phi instructions.
Attachment #8672752 - Flags: review?(bhackett1024)
Attachment #8672750 - Flags: review?(benj) → review+
This optimization (branch pruning) does almost the same as PGO in static
compilers, except that instead of compiling the unused branches in
out-of-line code, we just get rid of the code and bail out to Baseline if we
ever reach such code.

This optimization does not provide much speed-up alone, but it enables the
rest of the compiler to do cleverer optimizations such as Scalar Replacement
and Range Analysis.
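For illustration, a hedged JS sketch of the kind of code this helps (`checkedDiv` is a hypothetical example, not from the patch): the error path is never executed while profiling, so its basic blocks have a zero hit count and can be replaced by a bailout to Baseline.

```javascript
// The throwing branch is never taken during warm-up, so its hit count
// stays at zero and branch pruning can replace it with a bailout,
// shrinking the Ion-compiled code.
function checkedDiv(a, b) {
  if (b === 0)                      // cold branch: candidate for pruning
    throw new Error("division by zero");
  return a / b;
}

var acc = 0;
for (var i = 1; i <= 10000; i++)    // warm up with the hot path only
  acc += checkedDiv(i, 2);
console.log(acc); // 25002500
```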

Delta:
 - Improve comments.
 - rename a few functions.
Attachment #8681368 - Flags: review?(bhackett1024)
Attachment #8681244 - Attachment is obsolete: true
Attachment #8681245 - Flags: review?(bhackett1024)
Attachment #8673567 - Flags: review?(jdemooij)
Attachment #8681249 - Flags: review?(bhackett1024)
(In reply to Nicolas B. Pierron [:nbp] from comment #17)
> > +    enum HitState { HS_NotDefined, HS_Count, HS_Frequency };

Drive-by comment: we should use enum class everywhere instead of having these prefixes.
Attachment #8673567 - Flags: review?(jdemooij) → review+
Comment on attachment 8681243 [details] [diff] [review]
part 1 - IonBuilder: Attach hit counts on the MIRGraph.

Review of attachment 8681243 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/jit/MIRGraph.h
@@ +608,5 @@
> +        hitCount_ = count;
> +        hitState_ = HS_Count;
> +    }
> +    uint64_t getHitCount() const {
> +        return hitCount_;

Assert(hitState_ == HS_Count)?

@@ +610,5 @@
> +    }
> +    uint64_t getHitCount() const {
> +        return hitCount_;
> +    }
> +    enum HitState { HS_NotDefined, HS_Count, HS_Frequency };

ditto re: Jan about the 'enum' thing.  It's also not obvious what these enum values mean and how they relate to hitCount_, so some comments would be nice.

Maybe it will be clear later, but will there ever be any time when we'll be putting different types of data on different MIR blocks?  If not then it might be clearer to remove HitState so that there can be a consistent interpretation for the count information associated with the blocks.

@@ +679,5 @@
>  
>      BytecodeSite* trackedSite_;
>  
> +    // Record the number of times a block got visited. Note, due to inlined
> +    // scripts these numbers might not continuous.

might not be

::: js/src/jsscript.cpp
@@ +1415,5 @@
>      return elem;
>  }
>  
>  const js::PCCounts*
> +ScriptCounts::getNearestPCCounts(size_t offset) const

What is this function trying to do?  It seems from its behavior and from its use in getHitCount that it's supposed to find the count entry that is either at offset or at the closest point preceding it, which is different from what "nearest" means.

Maybe name this getImmediatePrecedingPCCounts?  This should also have a comment that because IIRC there are PC counts after jumps and at jump targets, the result counts will be in the same basic block as the supplied pc.

@@ +1438,5 @@
>      return elem;
>  }
>  
> +const js::PCCounts*
> +ScriptCounts::getNearestThrowCounts(size_t offset) const

Ditto re: renaming getNearestPCCounts, though the comment there would not apply here.
Attachment #8681243 - Flags: review?(bhackett1024) → review+
Attachment #8672752 - Flags: review?(bhackett1024) → review+
Comment on attachment 8681368 [details] [diff] [review]
part 3 - IonMonkey: Add branch pruning based on code coverage counters.

Review of attachment 8681368 [details] [diff] [review]:
-----------------------------------------------------------------

Nice!

::: js/src/jit/IonAnalysis.cpp
@@ +313,5 @@
> +            block->markUnchecked();
> +        }
> +
> +        // When removing a loop header, we should ensure that its backedge is
> +        // removed first, otherwise this trigger an assertion in

triggers
Attachment #8681368 - Flags: review?(bhackett1024) → review+
Comment on attachment 8681245 [details] [diff] [review]
part 4 - Increment code coverage counters on bailouts.

Review of attachment 8681245 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/jit/BaselineBailouts.cpp
@@ +793,5 @@
>  
> +    // When pgo is enabled, increment the counter of the block in which we
> +    // resume, as Ion does not keep track of the code coverage.
> +    if (!js_JitOptions.disablePgo && script->hasScriptCounts())
> +        script->incHitCount(pc);

Why is this necessary?  When Ion is running the basic block counts are underapproximations anyways, right?
(In reply to Brian Hackett (:bhackett) from comment #23)
> Comment on attachment 8681245 [details] [diff] [review]
> part 4 - Increment code coverage counters on bailouts.
> 
> Review of attachment 8681245 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> ::: js/src/jit/BaselineBailouts.cpp
> @@ +793,5 @@
> >  
> > +    // When pgo is enabled, increment the counter of the block in which we
> > +    // resume, as Ion does not keep track of the code coverage.
> > +    if (!js_JitOptions.disablePgo && script->hasScriptCounts())
> > +        script->incHitCount(pc);
> 
> Why is this necessary?  When Ion is running the basic block counts are
> underapproximations anyways, right?

This is needed because if we only visit this branch after a bailout caused by the branch pruning, then we might never count this branch and recompile without it.  Incrementing the preceding jump target solves this issue, as the next time we re-compile, the corresponding basic block will have a non-zero count.
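A hedged JS illustration of the scenario (`maybeLog` is a made-up function): the negative branch is never taken during warm-up, so it gets pruned; when a later call finally hits it we bail out, and without bumping the hit count at the resume pc, the recompiled code would prune the branch and bail out again.

```javascript
function maybeLog(x) {
  if (x < 0)          // never taken during warm-up, so Ion prunes it
    return -x;        // reaching this triggers a first-execution bailout
  return x;
}

for (var i = 0; i < 10000; i++)
  maybeLog(i);        // warm up: the pruned branch keeps a zero count

console.log(maybeLog(-5)); // 5  (the bailout path; its counter must be bumped)
```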
Comment on attachment 8681249 [details] [diff] [review]
part 6 - Interpreter: Instrument JOF_JUMP opcodes instead of using interpreter's interrupts.

Review of attachment 8681249 [details] [diff] [review]:
-----------------------------------------------------------------

This doesn't seem equivalent to the interrupt method.  Say we have a branch in the bytecode sequence like:

(0) ... 
(1) ifeq 3
(2) ...
(3) ...

If we have PC counts at the start of all basic blocks then all but (1) here will have counts.  With interrupts we'll update counts at all those opcodes no matter the control flow, but with this patch AFAICS if the branch at (1) is not taken then counts will be updated at (2) but not (3), since there is no explicit branch instruction associated with the fallthrough from (2) to (3).

Is this understanding correct?
Comment on attachment 8681245 [details] [diff] [review]
part 4 - Increment code coverage counters on bailouts.

Review of attachment 8681245 [details] [diff] [review]:
-----------------------------------------------------------------

OK, though there should be a comment describing why this is necessary and how it interacts with the FirstExecution bailout.
Attachment #8681245 - Flags: review?(bhackett1024) → review+
(In reply to Brian Hackett (:bhackett) from comment #25)
> This doesn't seem equivalent to the interrupt method.  Say we have a branch
> in the bytecode sequence like:
> 
> (0) ... 
> (1) ifeq 3
> (2) ...
> (3) ...
> 
> If we have PC counts at the start of all basic blocks then all but (1) here
> will have counts.  With interrupts we'll update counts at all those opcodes
> no matter the control flow, but with this patch AFAICS if the branch at (1)
> is not taken then counts will be updated at (2) but not (3), since there is
> no explicit branch instruction associated with the fallthrough from (2) to
> (3).
> 
> Is this understanding correct?

Hum, yes … I did not consider this case.  I will add a test case for the code coverage, and maybe insert an opcode for a no-op jump instruction to indicate such a fallthrough.
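A minimal JS reproduction of the missed fallthrough might look like this (a sketch of a possible test case, not the one that landed):

```javascript
// When x is truthy, control falls through from the then-block to the
// join point without executing any jump opcode, so jump-only
// instrumentation would miss the join point's counter.
function f(x) {
  var r = 0;    // (0)
  if (x)        // (1) ifeq: jumps past the then-block when x is falsy
    r = 1;      // (2) then-block; falls through to (3) with no jump
  return r;     // (3) join point
}
console.log(f(true), f(false)); // 1 0
```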
Attachment #8681249 - Flags: review?(bhackett1024)
(In reply to Brian Hackett (:bhackett) from comment #21)
> > +    enum HitState { HS_NotDefined, HS_Count, HS_Frequency };
> 
> ditto re: Jan about the 'enum' thing.  It's also not obvious what these enum
> values mean and how they relate to hitCount_, so some comments would be nice.
> 
> Maybe it will be clear later, but will there ever be any time when we'll be
> putting different types of data on different MIR blocks?  If not then it
> might be clearer to remove HitState so that there can be a consistent
> interpretation for the count information associated with the blocks.

Probably not different types, but it might not be present on some scripts if, for obscure reasons, we failed to allocate the PCCounts vector of a JSScript or an inlined JSScript.  This might cause some blocks to have PCCounts while others don't.
The last 3 patches do not change the semantics of the compiled program, but
are used to ensure that we have consistent hit counts on the basic blocks.
Before these patches, we had something like:

  Before-Loop:  42 hit
  OSR:           0 hit (By definition)
  PreHeader:  1001 hit (One extra because we OSR'd at the loopEntry)
  LoopHeader:   42 hit
  Backedge:   1000 hit
  After-Loop:   41 hit (We have not yet exited the loop)
Attachment #8682515 - Flags: review?(jdemooij)
This adds an extra heuristic which is needed to avoid both a major
regression on kraken-gaussian-blur and a noticeable regression on
kraken-ai-astar, while still removing the large unused branch of
Mandreel. (Otherwise we would lose all of the 10% improvement.)

I tried to associate some logic with the rules which keep the benchmark
scores shown in comment 11, while avoiding regressions on Kraken and
SunSpider.
Attachment #8682521 - Flags: review?(bhackett1024)
Comment on attachment 8682522 [details] [diff] [review]
part 11 - Increase JitSpewer mask size to avoid overflow.

Review of attachment 8682522 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/jit/JitSpewer.cpp
@@ +87,5 @@
>  // IonSpewer singleton.
>  static IonSpewer ionspewer;
>  
>  static bool LoggingChecked = false;
> +static uint64_t LoggingBits = 0;

Unless you've deleted it in another patch, you also need to change this line: https://dxr.mozilla.org/mozilla-central/source/js/src/jit/JitSpewer.cpp#492

@@ +627,5 @@
>  bool
>  jit::JitSpewEnabled(JitSpewChannel channel)
>  {
>      MOZ_ASSERT(LoggingChecked);
> +    return (LoggingBits & (uint64_t(1) << uint32_t(channel))) && !filteredOutCompilations;

Can you add either a MOZ_ASSERT (or a static_assert) that channel (resp. the highest value in JitSpewChannel) is less than 64, to make sure that the << doesn't overflow?

@@ +634,5 @@
>  void
>  jit::EnableChannel(JitSpewChannel channel)
>  {
>      MOZ_ASSERT(LoggingChecked);
> +    LoggingBits |= (uint64_t(1) << uint32_t(channel));

nit: maybe the outer parenthesis could be deleted here? (if the compiler warns without them, or you want to keep them, feel free to ignore this comment)
Attachment #8682522 - Flags: review?(benj) → review+
Attachment #8682515 - Flags: review?(jdemooij) → review+
Comment on attachment 8682511 [details] [diff] [review]
part 8 - IonBuilder: newOsrPreheader should not use the hit-count of the loop.

Review of attachment 8682511 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/jit/IonBuilder.cpp
@@ +7339,5 @@
>      MBasicBlock* preheader = newBlock(predecessor, loopEntry);
>      if (!osrBlock || !preheader)
>          return nullptr;
>  
> +    // In order to keep the logical progression of the hit count, we use the

Nit: "logical progression of the hit count" is very vague. Maybe just:

// Give the pre-header the same hit count as the code before the loop.
Attachment #8682511 - Flags: review?(jdemooij) → review+
Attachment #8682510 - Flags: review?(jdemooij) → review+
Comment on attachment 8682521 [details] [diff] [review]
part 10 - Prevent PruneUnusedBranches from being greedy while removing branches which have only been visited a few times.

Review of attachment 8682521 [details] [diff] [review]:
-----------------------------------------------------------------

This is OK, but the heuristic is a little bizarre and it would be nice to remove it sometime in the future.
Attachment #8682521 - Flags: review?(bhackett1024) → review+
Duplicate of this bug: 877878