Closed Bug 1489142 Opened 6 years ago Closed 6 years ago

FlagAllOperandsAsHavingRemovedUses can take a very long time

Categories

(Core :: JavaScript Engine: JIT, defect, P1)

61 Branch
defect

Tracking

()

RESOLVED FIXED
mozilla65
Performance Impact high
Tracking Status
firefox65 --- fixed

People

(Reporter: jonco, Assigned: nbp)

References

Details

Attachments

(2 files, 3 obsolete files)

When profiling loading the google doc in bug 1488435 I saw several examples of this, e.g. when running on a helper thread this function 76% of a 651ms compilation task:

https://perfht.ml/2NfMfy3

(I was looking at the JS Helper threads with inverted thread stack, and dropping samples from HelperThread::threadLoop::wait)

May be related to bug 1329901.
This might be a duplicate of Bug 1396822.
I've seen this too in profiles (I think one in bug 1478104). 

94% under OptimizeMIR was in FlagAllOperandsAsHavingRemovedUses; that needs to be fixed.
QA Contact: sdetar
Whiteboard: [qf]
(I think sdetar is automatically added as QA contact on qf bugs? I didn't make that change, but I also saw this on other bugs.)
We need to fix this ASAP.

Looking at FlagPhiInputsAsHavingRemovedUses, I think that one has quadratic behavior? There is this limit of 128 uses we look at, but I think that can still be pretty bad. Maybe a global limit would work better?

Also, when we call def->setUseRemoved(), should we also set the flag on phis that are in the worklist?

Here's the profile where we have 94% of OptimizeMIR under this function: https://perfht.ml/2zSIqau
Flags: needinfo?(nicolas.b.pierron)
Priority: -- → P1
Blocks: 1478104
This bug should be fixed as soon as we finish moving the branch pruning to IonBuilder.

We should not see the same issue once this code is moved to IonBuilder because the costly part of this code is about ensuring that we do not make any mistake in terms of used property while this can be computed efficiently from a bunch of bit masks when computing the control flow graph.

I presume that most of the issues we see with this signature are caused either by a huge number of variable * branches, or by the inlining of of functions in unused branches.  Our inlining logic does not care about the usage of the block in which it decides to inline code into.

(In reply to Jan de Mooij [:jandem] (PTO until Oct 15) from comment #4)
> Also, when we call def->setUseRemoved(), should we also set the flag on phis
> that are in the worklist?

This is not a bad idea, however the worklist is a breath-search and not a depth-search, thus not all the Phi in the worklist should be marked as having removed uses.

The problem we have in this algorithm is that we do not have a flag which claim if a Phi is used or not, what we have is used or we-don-t-know.  If we had a flag to say used or not-used reserved for Phi, Then we could iterate over all the Phi instructions before hand, we could compute this information in time linear to the number of Phi's uses plus linear in the number of Phis (= O(#Phi-uses + #Phi) ).  However, this cost would be regardless of the number of removed blocks.
(In reply to Nicolas B. Pierron [:nbp] from comment #5)
> This bug should be fixed as soon as we finish moving the branch pruning to
> IonBuilder.

I don't want this to be blocked on pretty big change like that. We need a fix here for 64, maybe even beta 63.
Nicolas, would you be willing to take ownership of this bugs and work to get a fix into 64?
(In reply to Steven DeTar [:sdetar] from comment #7)
> Nicolas, would you be willing to take ownership of this bugs and work to get
> a fix into 64?

I will look at this issue after the LifoAlloc issue for doubling the size.
Assignee: nobody → nicolas.b.pierron
Status: NEW → ASSIGNED
Flags: needinfo?(nicolas.b.pierron)
QA Contact: sdetar
Whiteboard: [qf] → [qf:p1:64]
I am still working on it, I have a draft of a patch failing locally, and I will submit it here as soon as the test suite passes.
Whiteboard: [qf:p1:64] → [qf:p1:f64]
This patch changes the current algorithm of FlagPhiInputsAsHavingRemovedUses,
which was a depth-first search by a breath-first search which record the
Unused/Used state with 2 extra flags dedicated to Phi instructions.

This new algorithm, with the uses of the flags, ensure that this algorithm will
at most visit each Phi once.
Attachment #9020080 - Flags: review?(jdemooij)
Attachment #9019361 - Attachment is obsolete: true
Whiteboard: [qf:p1:f64] → [qf:p1:f65]
Comment on attachment 9020080 [details] [diff] [review]
Rewrite FlagPhiInputsAsHavingRemovedUses to iterate at most once per Phi.

Review of attachment 9020080 [details] [diff] [review]:
-----------------------------------------------------------------

Looks good overall, just some comments and nits. Also we should try to write a worst-case testcase for the old algorithm so we can confirm the new version fixes it.

::: js/src/jit/IonAnalysis.cpp
@@ +131,5 @@
> +        }
> +
> +        // We do not know if the Phi was Used or Unused, iterate over all uses
> +        // with a depth-search of uses. Break out of the loop as soon as one use
> +        // is found.

MOZ_ASSERT(worklist.empty());

@@ +155,3 @@
>                          break;
>                      }
>                      continue;

The breaks vs continues get a bit complicated with the nested loops. I wonder if it would be clearer to factor out some helper functions so we can just |return ...;| instead of break...

@@ +192,4 @@
>                      return false;
>                  }
> +                producer = cphi;
> +                use = cphi->usesBegin();

Because the loop is like this:

  for (; use != usesEnd; use++) {

Doesn't this mean we will always skip cphi's first use (because of the use++)? Can we add asserts/tests that would have caught this?

@@ +197,4 @@
>              }
>  
> +            if (use != usesEnd) {
> +                if (!worklist.append(std::make_pair(producer, use))) {

MOZ_ASSERT(producer->isInWorklist());

@@ +205,5 @@
> +
> +            // When unused, we cannot bubble up this information without
> +            // iterating over the consumers.
> +            producer->setPhiUnused();
> +            producer->setNotInWorklist();

Can we setNotInWorklist() when we popCopy or does that affect the algorithm?

::: js/src/jit/MIR.h
@@ +171,5 @@
> +    _(Discarded)                                                                \
> +                                                                                \
> +    /* When computing lazily the fact that Phi instructions are used or unused,
> +     * such as in FlagPhiInputsAsHavingRemovedUses, we need 3 states: Unknown,
> +     * Used and Unused.

Because there are 3 states it would be clearer to represent this as an uint8_t enum class stored in MPhi (we have enough space there to not affect sizeof(MPhi)).

Then we can also assert in setPhiUsed/setPhiUnused that the state is still Unknown :)
Attachment #9020080 - Flags: review?(jdemooij) → feedback+
(In reply to Jan de Mooij [:jandem] from comment #12)
> @@ +155,3 @@
> >                          break;
> >                      }
> >                      continue;
> 
> The breaks vs continues get a bit complicated with the nested loops. I
> wonder if it would be clearer to factor out some helper functions so we can
> just |return ...;| instead of break...

I did and named it DepthSearchUse, including the while and for loop.

> @@ +192,4 @@
> >                      return false;
> >                  }
> > +                producer = cphi;
> > +                use = cphi->usesBegin();
> 
> Because the loop is like this:
> 
>   for (; use != usesEnd; use++) {
> 
> Doesn't this mean we will always skip cphi's first use (because of the
> use++)? Can we add asserts/tests that would have caught this?

Oh, good catch!

Unfortunately I do not see how to assert that in every cases without changing the worklist representation. In the mean time I added checks for the cases where we have non observable uses.

About adding a test case, I do not think this would be possible as the first use of the Phi instruction is more than likely the entry resume point of the basic block in which they are listed.  And observable Phi instructions are more than likely to be listed as the observable slot of more than one instruction.

> @@ +205,5 @@
> > +
> > +            // When unused, we cannot bubble up this information without
> > +            // iterating over the consumers.
> > +            producer->setPhiUnused();
> > +            producer->setNotInWorklist();
> 
> Can we setNotInWorklist() when we popCopy or does that affect the algorithm?

We can do it, but we have to add an additional condition for the cases where a Phi depends on it-self in a loop.
This file is a JS shell benchmark which abuse the chain of Phi instructions which are looping before being able to figure out which are used or unused.

I will not submit it to the repository as part of our test suite because it fails with --no-threads due to the excessively large size of the bytecode.

Also, I think this highlight other issues as this code is hard to optimize through, while being fast to execute.

Before: ~20.1s (11% corresponds to FlagAllOperandsAsHavingRemovedUses)
After: ~17.4s (0.84% corresponds to FlagAllOperandsAsHavingRemovedUses)
Delta:
 - Move the Depth search to its own functions, and return from the function when one use is found.
 - Add assertions to checks that when flagging all Phi as being unused, then we visited all the Phi uses.
 - Remove the worklist flag when not in the worklist vector.
 - Add a PhiUsage enum, and a usageAnalysis field on Phi to record the 3-state analysis.
Attachment #9022947 - Flags: review?(jdemooij)
Attachment #9020080 - Attachment is obsolete: true
Comment on attachment 9022947 [details] [diff] [review]
Rewrite FlagPhiInputsAsHavingRemovedUses to iterate at most once per Phi.

Review of attachment 9022947 [details] [diff] [review]:
-----------------------------------------------------------------

Thanks for making these changes! It's still complicated code but having the separate function really helps and I don't see good ways to simplify it more.

::: js/src/jit/IonAnalysis.cpp
@@ +37,5 @@
> +
> +// Look for Phi uses with a depth-first search. If any uses are found the stack
> +// of MPhi instructions is returned in the |worklist| argument.
> +static bool
> +DepthSearchUse(MIRGenerator* mir, MPhiUseIteratorStack& worklist, MPhi* phi)

Nit: s/DepthSearch/DepthFirstSearch/, it's the term people should be familiar with.

@@ +114,5 @@
> +                continue;
> +            }
> +
> +            // We found another Phi instruction, move the use iterator to
> +            // the next use push it to the worklist stack. The, continue

Nit: s/The/Then/

@@ +241,5 @@
>          }
>  
> +        MOZ_ASSERT_IF(worklist.empty(), phi->getUsageAnalysis() == PhiUsage::Unused);
> +        if (!worklist.empty()) {
> +            // One of the Phi is used, set Used flags on all the Phi which are

Nit: s/Phi/Phis/ twice

::: js/src/jit/MIR.h
@@ +6826,5 @@
>      }
>  };
>  
> +// This is a 3 state flag used by FlagPhiInputsAsHavingRemovedUses to record and
> +// propagate the information about the consumers of a Phi instructions. This is

Nit: s/instructions/instruction/

@@ +6850,5 @@
>      bool canProduceFloat32_;
>      bool canConsumeFloat32_;
> +    // Record the state of the data flow before any mutation made to the control
> +    // flow, such that removing branches is properly accounted for.
> +    PhiUsage usageAnalysis_;

Is it okay for this flag to keep its old value if we run FlagPhiInputsAsHavingRemovedUses more than once on the same graph with some other passes in between? I think so because it's similar to the removed-uses flag, right? r=me if you agree.
Attachment #9022947 - Flags: review?(jdemooij) → review+
(In reply to Jan de Mooij [:jandem] from comment #16)
> > +    // Record the state of the data flow before any mutation made to the control
> > +    // flow, such that removing branches is properly accounted for.
> > +    PhiUsage usageAnalysis_;
> 
> Is it okay for this flag to keep its old value if we run
> FlagPhiInputsAsHavingRemovedUses more than once on the same graph with some
> other passes in between? I think so because it's similar to the removed-uses
> flag, right? r=me if you agree.

Thanks for this comment, this made me think more about this. I initially thought that this was fine as well, but:
 - This is fine if we remove uses.
 - This not fine if we add uses, and the Phi instruction was marked as Unused.

This last case could happen in GVN, with MPhi::congruentTo.
I will see if I can make a test case for it.
(In reply to Nicolas B. Pierron [:nbp] from comment #17)
> This last case could happen in GVN, with MPhi::congruentTo.

So MPhi::congruentTo should check the enum field right? Good catch.
(In reply to Jan de Mooij [:jandem] from comment #18)
> (In reply to Nicolas B. Pierron [:nbp] from comment #17)
> > This last case could happen in GVN, with MPhi::congruentTo.
> 
> So MPhi::congruentTo should check the enum field right? Good catch.

Yes, or we should add MPhi::updateForReplacement, and merge the flags.

I am going to go for this second option as it sounds better to me if we can fold data-flow, and thus have less register/stack space needed.

Adding the condition to MPhi::congruentTo would work too, but I think this might lead to other issues if multiple Phi instructions have the same inputs and if the leader Phi has a different usage analysis. This could be solved with a different hash result based on the usage analysis, but would still have the issue of lacking GVN benefits.
Delta:
 - Add something which might look like the right way to test, but is not yet
   there. I am lacking imagination for having a Phi being marked as Unused,
   while having uses of the value which would highlight a differential
   behaviour.
 - Add updateForReplacement function, in case it is ever possible to hit this
   corner case.
Attachment #9024042 - Flags: review?(jdemooij)
Comment on attachment 9024042 [details] [diff] [review]
Rewrite FlagPhiInputsAsHavingRemovedUses to iterate at most once per Phi.

Review of attachment 9024042 [details] [diff] [review]:
-----------------------------------------------------------------

Sorry for the delay. I only looked at the updateForReplacement function.

::: js/src/jit/MIR.cpp
@@ +2344,5 @@
> +        // or this == unknown && other == unused
> +        usageAnalysis_ = PhiUsage::Unknown;
> +    }
> +    // else: this == unused && other == unused
> +    //       this == unknown && other = unknown

We should assert this instead:

} else {
    MOZ_ASSERT(usageAnalysis_ == PhiUsage::Unused || usageAnalysis_ == PhiUsage::Unknown);
    MOZ_ASSERT(usageAnalysis_ == other->usageAnalysis_)
}
Attachment #9024042 - Flags: review?(jdemooij) → review+
Pushed by npierron@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/586b29eb1dae
Rewrite FlagPhiInputsAsHavingRemovedUses to iterate at most once per Phi. r=jandem
Backed out for assertion failures on js/src/jit/MIR.cpp:2349

backout: https://hg.mozilla.org/integration/mozilla-inbound/rev/2990cbc0eb72c2ae2cf7954bdd97a3d67586ccc9

push with failures: https://treeherder.mozilla.org/#/jobs?repo=mozilla-inbound&searchStr=a11y&revision=586b29eb1dae3b18d60b702441db60bd67bb1c4f&selectedJob=211519093

failure log: https://treeherder.mozilla.org/logviewer.html#?job_id=211519093&repo=mozilla-inbound&lineNumber=5352

[task 2018-11-13T19:53:59.519Z] 19:53:59     INFO - GECKO(1054) | [1054, Main Thread] WARNING: HTMLEditRules::BeforeEdit() failed to handle something: 'NS_SUCCEEDED(rv)', file /builds/worker/workspace/build/src/editor/libeditor/HTMLEditor.cpp, line 3840
[task 2018-11-13T19:53:59.519Z] 19:53:59     INFO - GECKO(1054) | [1054, Main Thread] WARNING: '!aSelection.RangeCount()', file /builds/worker/workspace/build/src/editor/libeditor/EditorBase.cpp, line 3923
[task 2018-11-13T19:53:59.520Z] 19:53:59     INFO - GECKO(1054) | [1054, Main Thread] WARNING: '!selectionStartPoint.IsSet()', file /builds/worker/workspace/build/src/editor/libeditor/HTMLEditRules.cpp, line 10233
[task 2018-11-13T19:53:59.521Z] 19:53:59     INFO - GECKO(1054) | [1054, Main Thread] WARNING: Failed to normalize Selection: 'NS_SUCCEEDED(rv)', file /builds/worker/workspace/build/src/editor/libeditor/HTMLEditRules.cpp, line 492
[task 2018-11-13T19:53:59.527Z] 19:53:59     INFO - GECKO(1054) | [1054, Main Thread] WARNING: HTMLEditRules::BeforeEdit() failed to handle something: 'NS_SUCCEEDED(rv)', file /builds/worker/workspace/build/src/editor/libeditor/HTMLEditor.cpp, line 3840
[task 2018-11-13T19:53:59.528Z] 19:53:59     INFO - GECKO(1054) | [1054, Main Thread] WARNING: '!aSelection.RangeCount()', file /builds/worker/workspace/build/src/editor/libeditor/EditorBase.cpp, line 3923
[task 2018-11-13T19:53:59.531Z] 19:53:59     INFO - GECKO(1054) | [1054, Main Thread] WARNING: '!selectionStartPoint.IsSet()', file /builds/worker/workspace/build/src/editor/libeditor/HTMLEditRules.cpp, line 10233
[task 2018-11-13T19:53:59.532Z] 19:53:59     INFO - GECKO(1054) | [1054, Main Thread] WARNING: Failed to normalize Selection: 'NS_SUCCEEDED(rv)', file /builds/worker/workspace/build/src/editor/libeditor/HTMLEditRules.cpp, line 492
[task 2018-11-13T19:54:01.111Z] 19:54:01     INFO - GECKO(1054) | Assertion failure: usageAnalysis_ == PhiUsage::Used || usageAnalysis_ == PhiUsage::Unknown, at /builds/worker/workspace/build/src/js/src/jit/MIR.cpp:2349
[task 2018-11-13T19:54:01.116Z] 19:54:01     INFO - GECKO(1054) | ExceptionHandler::GenerateDump cloned child 1313
[task 2018-11-13T19:54:01.117Z] 19:54:01     INFO - GECKO(1054) | ExceptionHandler::SendContinueSignalToChild sent continue signal to child
[task 2018-11-13T19:54:01.126Z] 19:54:01     INFO - GECKO(1054) | ExceptionHandler::WaitForContinueSignal waiting for continue signal...
[task 2018-11-13T19:54:01.333Z] 19:54:01     INFO - TEST-INFO | Main app process: exit 11
Flags: needinfo?(nicolas.b.pierron)
(In reply to Natalia Csoregi [:nataliaCs] from comment #23)
> Backed out for assertion failures on js/src/jit/MIR.cpp:2349

Hum ..., just a typo in the newly added assertion code. :/
Flags: needinfo?(nicolas.b.pierron)
Pushed by npierron@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/c0bef417dc8e
Rewrite FlagPhiInputsAsHavingRemovedUses to iterate at most once per Phi. r=jandem
Attachment #9022947 - Attachment is obsolete: true
https://hg.mozilla.org/mozilla-central/rev/c0bef417dc8e
Status: ASSIGNED → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla65
Depends on: 1511538
Depends on: 1519096
Depends on: CVE-2019-9792
No longer depends on: CVE-2019-9792
Regressions: CVE-2019-9792
Performance Impact: --- → P1
Whiteboard: [qf:p1:f65]
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: