If you think a bug might affect users in the 57 release, please set the correct tracking and status flags for Release Management.

[GDocs] Extreme amount of time spent in FlagAllOperandsAsHavingRemovedUses().

RESOLVED FIXED in Firefox 53

Status

()

Core
JavaScript Engine: JIT
RESOLVED FIXED
9 months ago
7 months ago

People

(Reporter: sstangl, Assigned: sstangl)

Tracking

(Blocks: 1 bug)

53 Branch
mozilla53
Points:
---
Dependency tree / graph
Bug Flags:
qe-verify ?

Firefox Tracking Flags

(firefox53 fixed)

Details

(URL)

Attachments

(3 attachments)

(Assignee)

Description

9 months ago
Created attachment 8825301 [details]
Flame chart of Ion compilation

STR in Bug 1269695 Comment 28 (let page load, then press Ctrl+End).

Using "perf" to profile the JS compilation helper thread shows an extreme amount of compilation time spent in FlagAllOperandsAsHavingRemovedUses(). About 35% of the time during the test, the compilation thread is actively compiling Ion code; of that time, 80% is apparently spent in that one function.

Although the compilation thread doesn't block main thread execution, it's possible that abnormally-long compilation times are masking other compilation opportunities.

Flame chart of Ion compilation attached. Note that although items are proportional to the amount of time taken, the x-axis is not time (items are just sorted alphabetically).

Nicolas, do any ideas come to mind?
Flags: needinfo?(nicolas.b.pierron)
(Assignee)

Comment 1

9 months ago
Created attachment 8825303 [details] [diff] [review]
Remove expensive isObservableOperand loop guard

FlagAllOperandsAsHavingRemovedUses() has the following loop-within-a-loop:

> for (size_t i = 0, e = rp->numOperands(); i < e; i++) {
>     if (mir->shouldCancel("FlagAllOperandsAsHavingRemovedUses inner loop"))
>         return false;

>     if (!rp->isObservableOperand(i))
>         continue;
>     rp->getOperand(i)->setUseRemovedUnchecked();
> }

Looking at the generated assembly with perf, the compiler totally messes up optimization because the guard isObservableOperand() is actually way more expensive than the behavior it's intended to protect. Instead of a tight loop, it's unrolled and all over the place, with tons of branches, touching memory all over the place, probably taking tons of page faults.

Additionally, shouldCancel() is called even though it's in the outer loop: this is only necessary because of isObservableOperand().

If we remove isObservableOperand(), then we can also remove shouldCancel(), and then the loop is able to be compiled extremely tightly. Re-measurement with perf shows that the function completely disappears with this change. Compilation time is greatly reduced.

Although the "JS Helper" thread now takes significantly fewer resources, the actual main thread doesn't get faster with this patch. But it's still an improvement.
Flags: needinfo?(nicolas.b.pierron)
Attachment #8825303 - Flags: review?(nicolas.b.pierron)
(Assignee)

Comment 2

9 months ago
Created attachment 8825305 [details]
Flame chart of Ion compilation after patch

Same chart after the patch: total time in OptimizeMIR() reduced to ~5% from ~32%.

The number of samples taken with GenerateLIR() on the stack is pretty close in both of these runs, so they probably are measuring the same thing, modulo timing differences since I had to attach perf manually before pressing End.
(Assignee)

Comment 3

9 months ago
Just as a sanity check, since it seems somewhat insane that this function could blow up so much, I took more perf recordings of Nightly on the testcase, and FlagAllOperandsAsHavingRemovedUses() dominated the compilation thread every time.

Still insane, but at least reproducible.
(Assignee)

Updated

9 months ago
Assignee: nobody → sstangl
Comment on attachment 8825303 [details] [diff] [review]
Remove expensive isObservableOperand loop guard

Review of attachment 8825303 [details] [diff] [review]:
-----------------------------------------------------------------

Last time I tried this, this was a performance regression on benchmark, because the generated compiled code was keeping too many instruction live, preventing other optimizations to kick in.

Another approach would be to cache the ranges of observable slots on the CompileInfo, either by creating a boolean array ahead when they are constructed, or something else.
Attachment #8825303 - Flags: review?(nicolas.b.pierron)
(Assignee)

Comment 5

9 months ago
If an operand is not observable, why would it matter whether it's marked as having uses removed?

This also affects Google Sheets performance: we spend ~10% of compilation time just within FlagAllOperandsAsHavingRemovedUses().

Would you happen to be able to remember which benchmark regresses? Since this is affecting a major P0 real-world use case, we may judge it to be more important than a benchmark, depending on the severity of regression.
Flags: needinfo?(nicolas.b.pierron)
(Assignee)

Comment 6

9 months ago
I ran benchmarks locally and did not notice any regressions.

Could we land it and then see what regresses on AWFY?
Comment on attachment 8825303 [details] [diff] [review]
Remove expensive isObservableOperand loop guard

Review of attachment 8825303 [details] [diff] [review]:
-----------------------------------------------------------------

Sorry, I miss read the conditions.

You are right, if the operand is observable, its computation should be kept around anyway as observable operands are considered to always have removed uses.

Which makes me wonder what condition did I added which was necessary to get some benchmark improvements with

http://searchfox.org/mozilla-central/source/js/src/jit/IonAnalysis.cpp#1042
http://searchfox.org/mozilla-central/source/js/src/jit/RangeAnalysis.cpp#2973
Attachment #8825303 - Flags: review+
Flags: needinfo?(nicolas.b.pierron)
(Assignee)

Updated

9 months ago
Keywords: checkin-needed

Comment 8

9 months ago
Pushed by ryanvm@gmail.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/c98510adc780
Remove expensive isObservableOperand() loop guards. r=nbp
Keywords: checkin-needed

Comment 9

9 months ago
This patch regressed some tests on AWFY: ss-fannkuck and 4 tests from the assorted list.
(Assignee)

Comment 10

9 months ago
(In reply to Guilherme Lima from comment #9)
> This patch regressed some tests on AWFY: ss-fannkuck and 4 tests from the
> assorted list.

The good news is that the generated MIR is identical, so the difference is just the amount of time spent within FlagAllOperandsAsHavingRemovedUses().
(Assignee)

Comment 11

9 months ago
Ah, it has nothing to do with the execution time of FlagAllOperandsAsHavingRemovedUses(). If you modify access-fannkuch.js to have n=11 at the bottom, the difference is magnified linearly with n. But that function does not show up at all in the profile.

The problem does not reproduce with --ion-offthread-compile=off, so I can't use the JITSpewer to compare the MIRGraphs.
(Assignee)

Comment 12

9 months ago
Apparently I can use the JITSpewer to compare the MIRGraphs. The generated MIR is exactly the same, to the extent that the PNG files iongraph generates match byte-for-byte.

The only point that really looks like it could be having some change in behavior is in range analysis, but I instrumented it and there is no change in behavior.

Comment 13

9 months ago
bugherder
https://hg.mozilla.org/mozilla-central/rev/c98510adc780
Status: NEW → RESOLVED
Last Resolved: 9 months ago
status-firefox53: affected → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla53
(In reply to Sean Stangl [:sstangl] from comment #11)
> The problem does not reproduce with --ion-offthread-compile=off, so I can't
> use the JITSpewer to compare the MIRGraphs.

The JITSpewer should work even with off-thread compilation (IONFLAGS=logs supports that now)

Otherwise, if you cannot reproduce it this with the logging enabled, you can record the perf issue under rr, and use the iongraph command added in gdb to display the final MIRGraph from CodeGenerator::generateBody.
(Assignee)

Updated

8 months ago
Depends on: 1330667
Depends on: 1342016
Sean, does this bug have automated coverage? Would it benefit from manual testing?
Flags: qe-verify?
You need to log in before you can comment on or make changes to this bug.