Math.max + spread seems to be very slow in Firefox compared to Chrome.
Categories
(Core :: JavaScript Engine: JIT, defect, P2)
People
(Reporter: calixte, Assigned: iain)
Details
Attachments
(6 files)
I ran the following script:
var tests = 1000000;
var input = [0, 10, 20, 30, 8, 0, 0];

// Computes the max with a spread call to the native Math.max.
var sumOutput = function() {
  return Math.max(...input);
};

// Computes the max by reducing with two-argument Math.max calls.
var maxOutput = function() {
  return input.reduce(function(a, b) {
    return Math.max(a, b);
  });
};

console.time('max');
var i = tests;
while (i--) {
  input.forEach(function() {
    return sumOutput();
  });
}
console.timeEnd('max');

console.log(' ======');

console.time('max-reduce');
var i = tests;
while (i--) {
  input.forEach(function() {
    return maxOutput();
  });
}
console.timeEnd('max-reduce');
On https://playcode.io with Nightly 84 I get:
max: 3775ms
======
max-reduce: 321ms
and when run with Chrome, I get:
max: 315ms
======
max-reduce: 71ms
Or maybe there is an issue with forEach.
Comment 1•4 years ago
Looking at the test case, especially max-reduce, I would think that this is likely an inlining issue, plus an allocation of the inner function which could have been removed by scalar replacement. I presume there is also an issue with the spread operator, which is used in the max case.
Iain, can you check if this is an inlining issue?
Assignee
Comment 2•4 years ago
We don't currently inline spread calls. We also don't optimize spread calls to native builtins. I believe we also don't inline forEach, because it's above our bytecode size limit for self-hosted code. So there are a lot of known issues that could be contributing to this.
I suspect that the difference between max and max-reduce is because we don't optimize spread calls to native builtins, and the remaining delta between us and V8 may be forEach, but I'll take a look to see if I can confirm.
Comment 3•4 years ago
Math.max(...input) is currently implemented as always executing var array = [...input]; Math.max.apply(null, array);. So we always create an array copy, which accounts for about 2/3 of the time needed when running this test case.
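In other words, the engine currently treats the spread call roughly like the following hand-written desugaring (a sketch for illustration only; the actual lowering happens at the bytecode level, not in source):

// Sketch of what Math.max(...input) effectively executes today,
// per the description above (illustration only, not engine code):
function maxViaSpread(input) {
  // The spread argument is always iterated and copied into a fresh array,
  // even when `input` is already a packed array.
  var array = [...input];
  // The native builtin is then called with the copied array as arguments.
  return Math.max.apply(null, array);
}
// The allocation and copy are what an optimized spread call can skip
// when the argument is already a packed array with no holes.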
Comment 4•4 years ago
(In reply to Iain Ireland [:iain] from comment #2)
[…] the remaining delta between us and V8 may be forEach, but I'll take a look to see if I can confirm.
For what it's worth, I introduced benchmarks a while ago which showed that we were multiple times faster than Chrome on forEach:
https://github.com/mozilla/arewefastyet/tree/master/benchmarks/misc/tests/assorted (misc-forEach-*.js)
Unfortunately this benchmark did not catch the regression, as it is no longer run as part of AWFY; it should probably be added back.
Assignee
Comment 5•4 years ago
Do we have any data on how often single-argument spread calls (foo(...input)) are used with non-array arguments? Right now we only generate OptimizeSpreadCall (to check whether the argument is already a packed array and skip making a copy) if the single argument is a rest parameter. This appears to be motivated by a micro-benchmark Arai wrote when we first added OptimizeSpreadCall, showing a slight regression for a spread call using a Map. At the time, OptimizeSpreadCall used a VM call. It has since been optimized to use an IC that returns false immediately if it has ever failed, so the overhead should be extremely low.
If I remove the restriction, the spread call case is more than twice as fast, and the slowdown on the Map case appears to be ~4%. My intuition is that spread calls with non-rest-parameter arrays are probably significantly more common than other iterables, so it seems like a worthwhile tradeoff.
Am I missing anything?
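For concreteness, these are the call shapes under discussion (hypothetical functions, written only to illustrate the distinction):

// Rest-parameter case: the only shape OptimizeSpreadCall covered so far.
// `args` is known to be a fresh packed array created by the engine itself.
function viaRest(...args) {
  return Math.max(...args);
}

// Plain-variable case: not covered before this change, so the engine always
// allocated and copied a new array, even though `input` is already packed.
var input = [0, 10, 20, 30, 8, 0, 0];
function viaArray() {
  return Math.max(...input);
}

// Non-array iterable: the packed-array check fails and the call takes the
// generic (copying) path; the measurement above puts that overhead at ~4%.
function viaMap(map) {
  return Math.max(...map.values());
}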
Assignee
Comment 6•4 years ago
I also took a shot at optimizing the native spread call to Math.max in CacheIR. The results look pretty good. Running the shell locally, I get:
Before any changes:
max: 3690.59912109375
With OptimizeSpreadCall changes:
max: 1610.087158203125
With OptimizeSpreadCall + CacheIR changes:
max: 55.56103515625
That should make us significantly faster than Chrome. I've only implemented it for int32 values so far, but it shouldn't be too hard to do the same thing for doubles.
I'll put the patches up tomorrow.
Assignee
Comment 7•4 years ago
When OptimizeSpreadCall was introduced, we conservatively only emitted it when the spread argument was a rest parameter. At the time, OptimizeSpreadCall required a VM call. It has been rewritten to use an IC, making the failure case very efficient. If we do not emit OptimizeSpreadCall, then we must allocate and copy an array object, even if the argument is already a packed array. In a microbenchmark, emitting OptimizeSpreadCall for non-rest-parameter cases is more than a 2x speedup, and generating the OptimizeSpreadCall in a non-packed-array case is <5% slowdown. The optimized case is also likely to be much more common in real code.
This patch extends OptimizeSpreadCall to be emitted whenever the single spread argument is any Name node. In theory we might also want to allow other nodes, but currently the bytecode emitter will call emitTree on the first argument twice in this case, which is fine for a Name node but broke in testing for foo(...function() {}) (because we tried emitting a non-hoisted function twice).
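To illustrate the distinction described above (hypothetical call sites, not taken from the patch):

// The single spread argument is a Name node, so after this patch the
// emitter generates OptimizeSpreadCall for this call:
function f(g, arr) {
  return g(...arr);
}

// The spread argument is not a Name node, so it still takes the old path.
// This particular form also throws at runtime (a function is not iterable),
// but the bytecode emitter must still compile it correctly, and emitting the
// non-hoisted function expression twice is what broke in testing.
function h(g) {
  return g(...function() {});
}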
Assignee
Comment 8•4 years ago
For a spread call like foo(...args), if OptimizeSpreadCall succeeds, we don't hit a breakpoint associated with ...args, because we skip past that code. Before the parent patch, this wasn't a problem in this testcase, because we were only generating OptimizeSpreadCall for rest parameters. After that patch, we are generating OptimizeSpreadCall more frequently.
We don't actually care about hitting this breakpoint, so I'm just updating the test.
Depends on D101231
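A rough sketch of the kind of test shape involved, using the SpiderMonkey shell's Debugger API (an assumed illustration, not the actual jit-test being updated):

var g = newGlobal({newCompartment: true});
g.eval(`
  function callee(a, b) { return a + b; }
  function caller(args) { return callee(...args); }
`);

var dbg = new Debugger(g);
var hits = 0;
dbg.onEnterFrame = function (frame) {
  // Set a breakpoint on every breakable offset in `caller`, including the
  // one associated with the ...args spread.
  if (frame.callee && frame.callee.name === "caller") {
    for (var bp of frame.script.getPossibleBreakpoints()) {
      frame.script.setBreakpoint(bp.offset, { hit() { hits++; } });
    }
  }
};

g.eval("caller([1, 2])");
// Once OptimizeSpreadCall succeeds, the breakpoint on the spread argument is
// skipped, so the hit count drops; the test just needs to accept that.
print(hits);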
Assignee
Comment 9•4 years ago
Depends on D101232
Assignee
Comment 10•4 years ago
On a microbenchmark evaluating Math.max(...input), this is ~30x faster.
Depends on D101233
Assignee
Comment 11•4 years ago
This is about a 12x speedup on a microbenchmark.
Depends on D101234
Assignee
Comment 12•4 years ago
Depends on D101235
Comment 13•4 years ago
Comment 14•4 years ago
bugherder
https://hg.mozilla.org/mozilla-central/rev/7c13fde9da0c
https://hg.mozilla.org/mozilla-central/rev/52be5f823ff3
https://hg.mozilla.org/mozilla-central/rev/f83cf07fe74a
https://hg.mozilla.org/mozilla-central/rev/01f4d696ae46
https://hg.mozilla.org/mozilla-central/rev/5a5fa7bca15d
https://hg.mozilla.org/mozilla-central/rev/c6c0ef525f49