Improve allocations and hard-coded limits when to use quick-sort when sorting typed arrays

RESOLVED FIXED in Firefox 56

Status

()

enhancement
RESOLVED FIXED
2 years ago
2 years ago

People

(Reporter: anba, Assigned: anba)

Tracking

Trunk
mozilla56
Points:
---

Firefox Tracking Flags

(firefox56 fixed)

Details

Attachments

(1 attachment, 1 obsolete attachment)

bug 1381474 improves %TypedArray%.prototype.sort when a custom comparator function is present. The improvements can lead to situations where it's faster to use a custom comparator than to use the default comparator, simply because our default comparators perform some unnecessary allocations or are using outdated limits on when to switch to QuickSort (http://searchfox.org/mozilla-central/rev/01d27fdd3946f7210da91b18fcccca01d7324fe2/js/src/builtin/Sorting.js#98-102; and completely missing for CountingSort).

The attached patch gave the following performance improvements.


Int32Array [N=200, iterations=10000], improves from 375ms to 60ms
    var n = 200;
    var source = new Int32Array(n);
    var j = 0; for (var x of XorShiftGenerator(0xCAFED00D, n)) source[j++] = x;
    var ta = new Int32Array(n);
    var q = 0;
    var t = 0;
    for (var i = 0; i < 10000; ++i) {
        for (var j = 0; j < n; ++j) ta[j] = source[j];
        dt = dateNow();
        ta.sort();
        t += dateNow() - dt;
        q += ta[0];
    }
    print(t, q);


Int32Array [N=1000, iterations=10000], improves from 725ms to 485ms
    var n = 1000;
    var source = new Int32Array(n);
    var j = 0; for (var x of XorShiftGenerator(0xCAFED00D, n)) source[j++] = x;
    var ta = new Int32Array(n);
    var q = 0;
    var t = 0;
    for (var i = 0; i < 10000; ++i) {
        for (var j = 0; j < n; ++j) ta[j] = source[j];
        dt = dateNow();
        ta.sort();
        t += dateNow() - dt;
        q += ta[0];
    }
    print(t, q);


Int8Array [N=100, iterations=100000], improves from 830ms to 265ms
    var n = 100;
    var source = new Int8Array(n);
    var j = 0; for (var x of XorShiftGenerator(0xCAFED00D, n)) source[j++] = x;
    var ta = new Int8Array(n);
    var q = 0;
    var t = 0;
    for (var i = 0; i < 100000; ++i) {
        for (var j = 0; j < n; ++j) ta[j] = source[j];
        dt = dateNow();
        ta.sort();
        t += dateNow() - dt;
        q += ta[0];
    }
    return [t, q];


Int8Array [N=1000, iterations=50000], improves from 680ms to 460ms
    var n = 1000;
    var source = new Int8Array(n);
    var j = 0; for (var x of XorShiftGenerator(0xCAFED00D, n)) source[j++] = x;
    var ta = new Int8Array(n);
    var q = 0;
    var t = 0;
    for (var i = 0; i < 50000; ++i) {
        for (var j = 0; j < n; ++j) ta[j] = source[j];
        dt = dateNow();
        ta.sort();
        t += dateNow() - dt;
        q += ta[0];
    }
    print(t, q);


function* XorShiftGenerator(seed, size, mask = 0xfffff) {
  var x = seed;
  for (var i = 0; i < size; i++) {
    x ^= x >> 12;
    x ^= x << 25;
    x ^= x >> 27;
    yield x % mask;
  }
}
Posted patch bug1381489.patch (obsolete) — Splinter Review
Changes for CountingSort:
- Default to quick-sort for small arrays to avoid the overhead of allocating the |buffer| array.
- Use a hard-coded array literal for the |buffer|, so we don't need to initialise all entries manually.
- When traversing the buffer, minimise read/write accesses to |buffer|.


Changes for RadixSort:
- Use a higher limit before using radix-sort, the additional allocations/setup code only makes it worth to use radix-sort for larger arrays.
- Switch to arrays instead of List, now that _DefineDataProperty is more optimised.
- Similar to CountingSort, use a hard-coded array literal for |counts|.
- And re-use the |counts| array, so we don't need allocate a new array for each SortByColumn call.
Attachment #8887060 - Flags: review?(till)
Comment on attachment 8887060 [details] [diff] [review]
bug1381489.patch

Review of attachment 8887060 [details] [diff] [review]:
-----------------------------------------------------------------

I guess for the non-comparefn case, we could in theory allocate the buffer/count arrays once and reuse them for all invocations. Might not be worth it, though, not sure. In any case, these are very nice improvements!
Attachment #8887060 - Flags: review?(till) → review+
I had to amend the patch a bit:
- Added markers so eslint won't complain about missing whitespace after commas.
- Changed tests/ecma_6/TypedArray/sort_basics.js to test arrays up to 512 elements to ensure the RadixSort implementation is tested (typed arrays with less than 512 elements call QuickSort with this patch).
- Replaced assertDeepEq with assertEqArray in TypedArray/sort_basics.js to ensure the test runtime is still acceptable even when typed arrays with 512 elements are tested.
- And most importantly, fixed a off-by-one error in CountingSort: The counter variable |val| was initialised with 0 and incremented with the post-increment operator, while it should be initialised with -1 and the incremented with pre-increment operator. This error was caught by tests/ecma_6/TypedArray/sort_basics.js.

I think it's okay to carry the r+, but in case someone wants to take a second look, I'll keep the bug open for a bit before requesting check-in.
Attachment #8887060 - Attachment is obsolete: true
Attachment #8887386 - Flags: review+
(In reply to Till Schneidereit [:till] from comment #2)
> I guess for the non-comparefn case, we could in theory allocate the
> buffer/count arrays once and reuse them for all invocations. Might not be
> worth it, though, not sure. In any case, these are very nice improvements!

I was also wondering about that, but decided not to take this route, because it'd mean the arrays would be kept alive forever.
(In reply to André Bargull from comment #4)
> I was also wondering about that, but decided not to take this route, because
> it'd mean the arrays would be kept alive forever.

Good point. I'm not sure it'd matter much, given the arrays' small size, but given that the gains also wouldn't be all that big, it makes sense regardless.

Also, feel free to land this whenever, the updated version looks good.
(In reply to Till Schneidereit [:till] from comment #5)
> Also, feel free to land this whenever, the updated version looks good.

Okay, and thanks for having a second look at the updated patch!
Pushed by ryanvm@gmail.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/32c86747823b
Improve allocations and length limits when sorting typed arrays without a custom comparator. r=till
Keywords: checkin-needed
https://hg.mozilla.org/mozilla-central/rev/32c86747823b
Status: ASSIGNED → RESOLVED
Closed: 2 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla56
You need to log in before you can comment on or make changes to this bug.