IonMonkey: Sink recoverable instructions

RESOLVED FIXED in mozilla36

Status

()

RESOLVED FIXED
4 years ago
4 years ago

People

(Reporter: nbp, Assigned: nbp)

Tracking

(Blocks: 1 bug)

unspecified
mozilla36
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(firefox36 disabled, firefox37 disabled)

Details

Attachments

(4 attachments)

(Assignee)

Description

4 years ago
This bug is about moving the execution of instructions in paths which are less frequently used, such that most frequently used paths are executing less code.

This optimization is similar in some sense to DCE, except that the manipulated instruction still has live uses.  The instruction is moved to the common post-dominator block of its uses if the instruction can be recovered on bailout and if it can be cloned.

Each time we move such instruction, we have to keep a clone of it at its previous location to satisfy the resume points which might be capturing the computation of the instruction.

This optimization was used as an example of things which are doable with recover instruction in the last code sample of the blog article [1].

[1] https://blog.mozilla.org/javascript/2014/07/15/ionmonkey-optimizing-away/
(Assignee)

Comment 1

4 years ago
Created attachment 8518189 [details]
Micro benchmark (sink concat).

Without Sink:
mylog: 107.5498095703125ns
mylog: 102.99260009765625ns

With Sink: (upcoming patch)
mylog: 105.69716064453125ns
mylog: 64.1889892578125ns
Assignee: nobody → nicolas.b.pierron
Status: NEW → ASSIGNED
(Assignee)

Comment 2

4 years ago
Created attachment 8518226 [details] [diff] [review]
IonMonkey: Add Sink for instruction which can be recovered on bailout.

This patch implements a Sinking algorithm for recoverable instruction.

Instructions are moved into If branches or after join blocks of If
statements, a clone of the instruction is made to replace the instruction
in the operands of resume points / recovered instructions.

This phase runs after DCE, and it runs only once as it cannot change the
semantic of basic blocks, as opposed to folding Loads / Stores.
Attachment #8518226 - Flags: review?(sunfish)
(Assignee)

Comment 3

4 years ago
Here are Octane results, with a later version of the previous patch:

       Richards:   30216 ->   30232  0.1%
      DeltaBlue:   41921 ->   41873  -0.1%
         Crypto:   30041 ->   30060  0.1%
       RayTrace:  102823 ->  103742  0.9%
    EarleyBoyer:   33511 ->   33469  -0.1%
         RegExp:    3917 ->    3915  -0.1%
          Splay:   22347 ->   22319  -0.1%
   SplayLatency:   24701 ->   24685  -0.1%
   NavierStokes:   31770 ->   31788  0.1%
          PdfJS:   20776 ->   20821  0.2%
       Mandreel:   33107 ->   33128  0.1%
MandreelLatency:   43833 ->   43992  0.4%
        Gameboy:   68403 ->   68223  -0.3%
       CodeLoad:   22275 ->   22281  0.0%
          Box2D:   51246 ->   51164  -0.2%
           zlib:   84088 ->   83150  -1.1%
     Typescript:   32000 ->   32028  0.1%
          Score:   32719 ->   32714  -0.0%

I think I fix the zlib regressions, by not enabling the Sink phase on AsmJS code.

These results were obtained by taking the average results of 500 runs of octane with / without the sink phase enabled.
Comment on attachment 8518226 [details] [diff] [review]
IonMonkey: Add Sink for instruction which can be recovered on bailout.

Review of attachment 8518226 [details] [diff] [review]:
-----------------------------------------------------------------

Looks good.

One additional heuristic concern is register pressure. When sinking something heavyweight like MConcat, register pressure is probably not a big concern, but when sinking cheap things like MBitAnd, if both operands are non-constants, this can extend the lifetimes of the 2 operands while only shortening the lifetime of the 1 result. It's obviously hard to be optimal here, but I wonder if it would be useful to have a simple heuristic that if an instruction is a "cheap" binary instruction with two non-constant single-use operands, don't sink it.

::: js/src/jit/MIRGraph.cpp
@@ +795,5 @@
> +MInstruction *
> +MBasicBlock::safeInsertTop(MDefinition *ins)
> +{
> +    // Beta nodes and interrupt checks are required to be located at the
> +    // beginnings of basic blocks, so we must insert range assertions

This code is now used for more than just range assertions, so this comment should be updated.

::: js/src/jit/MIRGraph.h
@@ +286,5 @@
>      // Move an instruction. Movement may cross block boundaries.
>      void moveBefore(MInstruction *at, MInstruction *ins);
>  
> +    // Move an instruction to the top of the |block|. Movement may cross block
> +    // boundaries.

This comment doesn't describe this function.

::: js/src/jit/Sink.cpp
@@ +3,5 @@
> + * This Source Code Form is subject to the terms of the Mozilla Public
> + * License, v. 2.0. If a copy of the MPL was not distributed with this
> + * file, You can obtain one at http://mozilla.org/MPL/2.0/. */
> +
> +#include "jit/ScalarReplacement.h"

I believe this should be Sink.h

@@ +21,5 @@
> +// CommonDominator function returns the basic block which dominate the last
> +// common dominator and the definition. If no such block exists, then this
> +// functions return null.
> +MBasicBlock *
> +CommonDominator(MBasicBlock *commonDominator, MBasicBlock *defBlock)

Since this function isn't currently used outside this file, please make it static.

@@ +33,5 @@
> +    // block which dominates all previous uses as well as this instruction.
> +    while (!commonDominator->dominates(defBlock)) {
> +        MBasicBlock *nextBlock = commonDominator->immediateDominator();
> +        if (commonDominator == nextBlock)
> +            return nullptr;

Can this ever happen? A def must dominate all its uses, so walking up the immediate dominator chain from a use should always go through the def.

@@ +87,5 @@
> +            }
> +
> +            // Leave this instruction for DCE.
> +            if (!hasUses)
> +                continue;

Random idea: would it make sense to just do the DCE here, and then disable the discrete DCE pass?

There is a logic to folding DCE into Sink, since Sink is "partial DCE" after all :).

@@ +119,5 @@
> +                if (lastJoin->numSuccessors() > 1)
> +                    break;
> +            }
> +            if (*block == lastJoin)
> +                usesDominator = lastJoin;

When this happens, the if immediately below is always true. Can we just do a 'continue;' here, to make it more clear what happens?

::: js/src/jit/Sink.h
@@ +3,5 @@
> + * This Source Code Form is subject to the terms of the Mozilla Public
> + * License, v. 2.0. If a copy of the MPL was not distributed with this
> + * file, You can obtain one at http://mozilla.org/MPL/2.0/. */
> +
> +// This file declares scalar replacement of objects transformation.

copy+pasto

@@ +19,5 @@
> +
> +} // namespace jit
> +} // namespace js
> +
> +#endif /* jit_ScalarReplacement_h */

copy+pasto
Attachment #8518226 - Flags: review?(sunfish) → review+
(Assignee)

Comment 5

4 years ago
(In reply to Dan Gohman [:sunfish] from comment #4)
> One additional heuristic concern is register pressure. When sinking
> something heavyweight like MConcat, register pressure is probably not a big
> concern, but when sinking cheap things like MBitAnd, if both operands are
> non-constants, this can extend the lifetimes of the 2 operands while only
> shortening the lifetime of the 1 result. It's obviously hard to be optimal
> here, but I wonder if it would be useful to have a simple heuristic that if
> an instruction is a "cheap" binary instruction with two non-constant
> single-use operands, don't sink it.

I think it would make sense to have a follow-up bug for this.  The reason I am not convinced yet by implementing such heuristic is that this is a non-local decision which makes more sense to be computed in a reverse post-order traversal.

> @@ +87,5 @@
> > +            }
> > +
> > +            // Leave this instruction for DCE.
> > +            if (!hasUses)
> > +                continue;
> 
> Random idea: would it make sense to just do the DCE here, and then disable
> the discrete DCE pass?
> 
> There is a logic to folding DCE into Sink, since Sink is "partial DCE" after
> all :).

I agree that we could merge it, but at the same time this optimization is currently disabled for asmjs code at the moment, in order to avoid the regression on zlib.  On the other hand DCE is still used by Odin and Ion.
(Assignee)

Comment 6

4 years ago
Created attachment 8524586 [details] [diff] [review]
fixup - Do not skip the instructions which are recovered on bailout.

This patch change the safeInsertTop function to select which kind can be
ignored or not.  This also fix an issue which cause this issue to not be
triggered as often as wanted.

The problem (see the test case) is that the multiplication is used as
operand of the MObjectState which corresponds to "obj.a = x". This
multiplication was moved after the MObjectState, which caused the
GraphCoherency assertion to fail.  Ignoring the recover instruction as part
of safeInsertTop fix this issue by inserting the multiplication before the
MObjectState.
Attachment #8524586 - Flags: review?(sunfish)

Updated

4 years ago
Attachment #8524586 - Flags: review?(sunfish) → review+
(Assignee)

Comment 7

4 years ago
(try) https://treeherder.mozilla.org/ui/#/jobs?repo=try&revision=3da1e567d020
A few failures on Android which also appear in other pushes to Try which are also unrelated to android / User-agent.

https://hg.mozilla.org/integration/mozilla-inbound/rev/9188c8b7962b
https://hg.mozilla.org/mozilla-central/rev/9188c8b7962b
Status: ASSIGNED → RESOLVED
Last Resolved: 4 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla36
Depends on: 1105187
Depends on: 1105574
Depends on: 1106171

Updated

4 years ago
Depends on: 1106136
This is still causing the #1 (by a wide margin) top crash on Aurora and Nightly.

Here's a report with build 20141206030202 (rev 78fe054f6737), that's m-c tip (except for some ffxbld commits that don't change anything):

https://crash-stats.mozilla.com/report/index/d6b551fe-3a65-48d9-b328-d76ad2141206

Nicolas, please back out or disable for now to fix this; it has been more than two weeks.
Flags: needinfo?(nicolas.b.pierron)
(Assignee)

Comment 10

4 years ago
Created attachment 8533207 [details] [diff] [review]
Disable Sink phase.
Attachment #8533207 - Flags: review?(jdemooij)
(Assignee)

Comment 11

4 years ago
Comment on attachment 8533207 [details] [diff] [review]
Disable Sink phase.

Approval Request Comment
[Feature/regressing bug #]: this bug.
[User impact if declined]: Top crasher.
[Describe test coverage new/current, TBPL]: wait for review & sent to try [1]
[Risks and why]: low, disable code which cause bad register allocation.
[String/UUID change made/needed]: N/A

[1] https://treeherder.mozilla.org/ui/#/jobs?repo=try&revision=abc626f7af2e
Flags: needinfo?(nicolas.b.pierron)
Attachment #8533207 - Flags: approval-mozilla-aurora?

Updated

4 years ago
Attachment #8533207 - Flags: review?(jdemooij) → review+
(Assignee)

Comment 12

4 years ago
Disable this feature for the moment, until all bugs are fixed:
https://hg.mozilla.org/integration/mozilla-inbound/rev/110ecfd89e27
status-firefox36: --- → affected
status-firefox37: --- → fixed
Comment on attachment 8533207 [details] [diff] [review]
Disable Sink phase.

next time, please report a new bug to deal with the top crash. thanks
Attachment #8533207 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
status-firefox36: affected → disabled
status-firefox37: fixed → disabled
(Assignee)

Updated

4 years ago
Blocks: 1109195
You need to log in before you can comment on or make changes to this bug.