Closed Bug 851769 Opened 7 years ago Closed 2 years ago

Array.reverse() is slower than a simple self-hosted version

Categories

(Core :: JavaScript Engine, defect)

22 Branch
x86
macOS
defect
Not set

Tracking

()

RESOLVED FIXED
mozilla59
Tracking Status
firefox59 --- fixed

People

(Reporter: tjmart1, Assigned: jandem)

References

(Blocks 1 open bug)

Details

Attachments

(4 files, 1 obsolete file)

User Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6; rv:22.0) Gecko/20130314 Firefox/22.0
Build ID: 20130314030914

Steps to reproduce:

Ran the tests given here: http://jsperf.com/js-array-reverse-vs-while-loop/5


Actual results:

The native Array.reverse() method was slower than the "for swap" test.  


Expected results:

The native method should probably be faster than a hand coded JS solution.
Assignee: nobody → general
Component: Untriaged → JavaScript Engine
Product: Firefox → Core
Something is very wrong with the benchmark setup.

If you run the attached code in the Scratchpad (open with Shift+F4, then paste the code, select everythink and press Cmd+L to see the output), the "while push then slice" (which uses `splice`, not `slice`, btw) method is roughly 5x as slow as the "for swap" method, which is again roughly 7x as slow as array.reverse. Combined, the method that's fastest according to JSPerf is roughly 35x slower than the native one.

This makes a lot of sense theoretically, too: Reversing an array is highly optimizable in C++, so it's unlikely that a JS implementation can match the speed. Additionally, the "while push then s[p]lice" method should be *much* slower, because it causes lots of allocations through array.splice returning a new array with the removed elements.

To make sure my results aren't only an artifact of executing in the Scratchpad, I also ran the attached benchmark in the Spidermonkey (i.e. Firefox's) and the v8 (i.e. Chrome's) JS shells (you have to wrap the last line in a print() call to see the results).
The results:
js bench.js
pushSplice: 299,swap: 4,array.reverse: 4
v8 bench.js
pushSplice: 151,swap: 3,array.reverse: 5

As you can see, these results roughly corroborate the ones from the Scratchpad, even though they differ in the details.


The morale of the story: don't rely on JSPerf benchmarks. They are weird.
Status: UNCONFIRMED → RESOLVED
Closed: 7 years ago
Resolution: --- → INVALID
Ah, thanks for looking into this!  I meant to link where I saw the benchmark, which was this stackoverflow thread: http://stackoverflow.com/questions/5276953/what-is-the-most-efficient-way-to-reverse-an-array-in-javascript

It's currently a top search result on this topic, so I'll post a link back to this bug so that (hopefully) the misleading benchmark doesn't confuse others.  (It did seem weird to me that the native method would be slower.)
The benchmark is just buggy.  The "for push then slice" test modifies "length", so all the tests after that don't do anything useful.
I fixed that bug in http://jsperf.com/js-array-reverse-vs-while-loop/9 but the "for swap" test is still about 3x faster in jsperf as array.reverse().

It's worth looking into why that is; that could be either a bug in jsperf of a bug in the test code that I'm not seeing or a bug on our end somehow....
I'm going to reopen this.

I took the attached testcase, converted it into a shell testcase (where it will actually get a JIT, which may well not be the case in scratchpad!), got rid of the super-slow splice thing, and bumped up the iteration count so the times mean something.

The result, in a reasonably current shell:

swap: 183
array.reverse: 435

which pretty much matches the jsperf numbers.
Status: RESOLVED → REOPENED
Ever confirmed: true
Resolution: INVALID → ---
Summary: Array.reverse() is pretty slow → Array.reverse() is slower than a simple self-hosted version
A profile shows nothing too terribly surprising.  Some time spent doing GetLengthProperty (could we do it faster in the isArray case?) and ensureDenseElements.  Some time taken by the hole-checking.

I sort of wonder how well a self-hosted version that has all the requisite complexity would do...

That said, on the shell testcase I see these numbers in my (somewhat old; a current one is refusing to compile) v8 shell:

swap: 230
array.reverse: 405

so it's not just us.
Attached file Shell testcase
Attachment #725770 - Attachment is obsolete: true
I took a quick stab at self-hosting this. Unfortunately, a spec-compliant implementation is way slower than the native implementation.

The problem is that we have to reverse with holes intact. That means that the simple t = array[left]; array[left] = array[right]; array[right] = t; thing doesn't work. Instead, the loop body has to look something like this:

    for (var lower = 0, upper = len - 1; lower < upper; lower++, upper--) {
      var lowerExists = lower in O;
      if (lowerExists)
        var temporary = O[lower];
      if (upper in O)
        O[lower] = O[upper];
      else
        delete O[lower];
      if (lowerExists)
        O[upper] = temporary;
      else
        delete O[upper];
    }

All that branching makes this implementation ~3.5x as slow than the attached testcase's one, and 1.5x as slow as the native implementation.

(And it's not even completely spec-compliant, as it doesn't properly handle non-writable properties or generic application to, say, strings, in strict mode.)

We might look into optimizing GetLengthProperty as mentioned in comment 6, but self-hosting this doesn't look promising.
Assignee: general → nobody
I wonder if it makes sense to revisit self-hosting reverse(), maybe the issues observed in comment #8 no longer apply. FWIW the native array_reverse implementation seems to waste quite a bit of time in pre/post barriers... :-/
(In reply to André Bargull from comment #9)
> I wonder if it makes sense to revisit self-hosting reverse(), maybe the
> issues observed in comment #8 no longer apply. FWIW the native array_reverse
> implementation seems to waste quite a bit of time in pre/post barriers... :-/

We could optimize that more though.. Self-hosting could be slower if this is used in polymorphic code.
(In reply to Jan de Mooij [:jandem] from comment #10)
> (In reply to André Bargull from comment #9)
> > I wonder if it makes sense to revisit self-hosting reverse(), maybe the
> > issues observed in comment #8 no longer apply. FWIW the native array_reverse
> > implementation seems to waste quite a bit of time in pre/post barriers... :-/
> 
> We could optimize that more though.. Self-hosting could be slower if this is
> used in polymorphic code.

If it's not too complicated to improve the native version, we should probably look into it. bug 1365361, comment #12 has array_reverse on the 20th place when comparing number of ticks used, even though there are only 102111 calls to that function.
Blocks: 1365361
anba reminded me of reverse() perf being bad still. I'll attach 3 patches to improve Boris' attached testcase from:

  swap: 157
  array.reverse: 653

to:

  swap: 157
  array.reverse: 228

This makes about as fast as V8, maybe a tiny bit faster:

  swap: 92
  array.reverse: 231
Changing |length == 0| to |length <= 1| is correct AFAICS and makes reversing single-element arrays much faster. The micro-benchmark below improves from 176 ms to 114 ms.

André, please double check this is correct for frozen elements and arrays with holes, but I think it is.

function f() {
    var a = [1];
    var t = new Date;
    for (var i = 0; i < 10000000; i++) {
        a.reverse();
    }
    print(new Date - t);
}
f();
Assignee: nobody → jdemooij
Status: REOPENED → ASSIGNED
Attachment #8936836 - Flags: review?(andrebargull)
This adds a fast path for reversing elements when no pre-barrier or property-suppression is needed.
Attachment #8936840 - Flags: review?(andrebargull)
A bit sad but a pretty big speedup. The single-element micro-benchmark improves from 135 ms to 113 ms.
Attachment #8936844 - Flags: review?(andrebargull)
Comment on attachment 8936840 [details] [diff] [review]
Part 2 - Add fast path

Review of attachment 8936840 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/vm/NativeObject-inl.h
@@ +286,5 @@
> +
> +    MOZ_ASSERT(!denseElementsAreCopyOnWrite());
> +    MOZ_ASSERT(!denseElementsAreFrozen());
> +
> +    MOZ_ASSERT(length > 0);

I should change this to length > 1 since that's what we rely on below (it lets us use a do-while loop instead of a while-loop).
Comment on attachment 8936836 [details] [diff] [review]
Part 1 - Handle single-element arrays like empty arrays

Review of attachment 8936836 [details] [diff] [review]:
-----------------------------------------------------------------

Spec-wise we could even directly return at [1] if |len <= 1| is true. 

[1] https://searchfox.org/mozilla-central/rev/110706c3c09d457dc70293b213d7bccb4f6f5643/js/src/jsarray.cpp#1597
Attachment #8936836 - Flags: review?(andrebargull) → review+
Attachment #8936840 - Flags: review?(andrebargull) → review+
Attachment #8936844 - Flags: review?(andrebargull) → review+
(In reply to André Bargull [:anba] from comment #17)
> Spec-wise we could even directly return at [1] if |len <= 1| is true. 

Hm yeah, I might make that change (and assert |length > 1| in ArrayReverseDenseKernel), because I added some logging and (not too surprising) the length == 0 and length == 1 cases are pretty common on some websites.
Pushed by jandemooij@gmail.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/9b216b26741a
part 1 - Handle arrays with length <= 1 more efficiently in Array.prototype.reverse. r=anba
https://hg.mozilla.org/integration/mozilla-inbound/rev/a1c58d11bcb4
part 2 - Fast path Array.prototype.reverse when no pre-barriers are needed. r=anba
https://hg.mozilla.org/integration/mozilla-inbound/rev/6d82e132348f
part 3 - Inline GetLengthProperty. r=anba
https://hg.mozilla.org/mozilla-central/rev/9b216b26741a
https://hg.mozilla.org/mozilla-central/rev/a1c58d11bcb4
https://hg.mozilla.org/mozilla-central/rev/6d82e132348f
Status: ASSIGNED → RESOLVED
Closed: 7 years ago2 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla59
You need to log in before you can comment on or make changes to this bug.