Horrible performance of Array.prototype.sort(cmp_fn) compared to other JS engines
Categories
(Core :: JavaScript Engine, defect, P3)
Tracking
()
People
(Reporter: qwertiest, Unassigned)
References
Details
Attachments
(1 file)
1.38 KB,
text/html
|
Details |
User Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:63.0) Gecko/20100101 Firefox/63.0 Steps to reproduce: Run the following fiddle in FF, IE Edge and Chrome: https://jsfiddle.net/c42vd3m9/2171/ Actual results: 6-9x slower sort in FF than IE: FF: 18.57ms Chrome: 2.69ms IE: 2.34ms Comparison counts: FF: always ~120k Chrome: nearly always 64k IE: usually in the range of 50-60k (goes as low as 45k and as high as 65k)
Comment 1•6 years ago
|
||
I guess most of the time is spent in VM calls for relational string comparison. Changing the comparison function to: --- var str_cmp_first_charcode = function (a, b) { return (a.charCodeAt(0) < b.charCodeAt(0) ? -1 : a.charCodeAt(0) > b.charCodeAt(0) ? 1 : 0); } --- makes it much faster.
Reporter | ||
Comment 2•6 years ago
|
||
Note: even after removing `toString` + and using `str_cmp_full` (e.g. comparing numbers instead of strings) tests show quite a slowdown in FF: Edge: 1.07ms FF: 2.2ms
Reporter | ||
Comment 3•6 years ago
|
||
Comparison count difference is caused by a difference in sort algorithm: https://v8.dev/blog/array-sort PS: Timsort is epic on expensive comparisons and doubly epic with partially sorted data.
Comment 4•6 years ago
|
||
I don't know if my knowledge is still current (it's been years), but IIRC our merge sort uses an iterative algorithm that merges larger and larger power-of-2 chunks together, copying them back and forth between two arrays. This avoids the overhead of the recursive version, but depending on the size of the array to be sorted there might be an additional copy at the end (if an in-place sort is required). Timsort at its core actually sounds very similar - IIUC it's just smarter about taking advantage of partially sorted data. If we still use the implementation of merge sort that I remember, Timsort might be a reasonably good fit for the existing code. It's been a long time though, so maybe that's a big if.
Comment 5•6 years ago
|
||
Chromium 70 needs 3.5s when comparing numbers with `str_cmp_full` whereas Firefox 63 needs 2.7s (on Ubuntu running in a VM). And when running on Win10 (also in a VM), Edge is indeed twice as fast for number comparison with `str_cmp_full`. But I guess that's only a side-effect of using QuickSort in Edge and only sorting 21 unique numbers. After changing the test case to use `Math.random() * 20000000` instead of `Math.random() * 20`, Firefox was faster for me when compared to Edge. Btw. simple number comparison can be further speed up in Firefox by changing the comparison function to: --- function num_cmp_full_fast(a, b) { return a - b; } --- because that form is special-cased, so it doesn't actually call the function: https://searchfox.org/mozilla-central/rev/8848b9741fc4ee4e9bc3ae83ea0fc048da39979f/js/src/builtin/Array.cpp#2023-2077 :-)
Reporter | ||
Comment 6•6 years ago
|
||
Well, my primary scenario was sorting with non-trivial string comparison, which was simplified as `str_cmp_full`. It seems that the performance hit when using custom non-trivial comparator has several causes: 1. Extra comparisons (merge-sort vs tim-sort). 2. Slow string comparison in general (e.g. `str_cmp_first_charcode` vs `str_cmp_first`).
Updated•5 years ago
|
Comment 7•5 years ago
|
||
Posted some patches which should improve this case:
- bug 1538690 should make
str_cmp_first
as fast asstr_cmp_first_charcode
. - bug 1538692 and bug 1291463 should improve
str_cmp_full
. In my dev-environment, from 13.8ms to 5.5ms; 4.5ms when--spectre-mitigations=off
is used. (For comparison: JSC needs 3.2ms and V8 needs 3.3ms in the same environment.)
Reporter | ||
Comment 8•5 years ago
|
||
That's awesome! Thanks for all the work!
It seems the only remaining problem is the sorting algorithm: merge-sort is not really suitable for expensive comparisons - my example used rather primitive comparison function (which happened to be expensive, because of other bugs that are fixed with the patches above), but it was more complex (and expensive) in the real scenario.
Comment 9•5 years ago
|
||
(In reply to TheQwertiest from comment #8)
That's awesome! Thanks for all the work!
\o/
It seems the only remaining problem is the sorting algorithm: merge-sort is not really suitable for expensive comparisons - my example used rather primitive comparison function (which happened to be expensive, because of other bugs that are fixed with the patches above), but it was more complex (and expensive) in the real scenario.
There have been attempts to use tim-sort (bug 715181), but in the end a simpler solution (= merge-sort) was chosen. Maybe it makes sense to revisit this decision. For the test case from comment #0, tim-sort seems to be better than merge-sort simply because the input contains many duplicates (~8M comparisons for tim-sort and ~12M comparisons in merge-sort). When fewer duplicates are present (comment #5), tim-sort will also call the comparator function about 12M times. (Which shouldn't be too surprising, because the worst case for both algorithms is O(n * log n). :-)
Comment 10•5 years ago
|
||
It seems the only remaining problem is the sorting algorithm: merge-sort is not really suitable for expensive comparisons - my example used rather primitive comparison function (which happened to be expensive, because of other bugs that are fixed with the patches above), but it was more complex (and expensive) in the real scenario.
In defense of merge sort, it is very efficient in terms of comparisons in the general case. Indeed, it approaches the theoretical limit for larger arrays (the best known sorting function in terms of comparison efficiency is a hybrid of Ford-Johnson sort and merge sort with some special case logic).
With that being said, I'm not denying that there are real world examples where you can do much better - e.g. arrays that start out nearly sorted (or sorted in reverse). Detecting such cases adds more overhead to the general case, of course, but I'm sure they're common enough that it's worth the cost. I just wanted to push back a little against the claim that "merge-sort is not really suitable for expensive comparisons" :)
Reporter | ||
Comment 11•5 years ago
|
||
Fair enough, my statement was way too categorical =)
And I didn't test it in SM, so it might be not as useful in this case (e.g. bug 715181).
PS: a really good MIT-licensed tim-sort implementation is available here - https://github.com/tvanslyke/timsort-cpp
Reporter | ||
Comment 12•5 years ago
|
||
It seems to be fixed in ESR68: it actually outperforms Chrome and IE Edge most of the time now!
Reporter | ||
Comment 13•5 years ago
|
||
Updated JS fiddle with working console.log (the repro code is the same): https://jsfiddle.net/3k0jn6Ly/
Updated•2 years ago
|
Comment 14•3 months ago
|
||
Comment 15•3 months ago
|
||
Numbers from latest Nightly. And I have increased the number of iterations by 10X :
Nightly: https://share.firefox.dev/3TUlKA1 (2s)
Chrome: https://share.firefox.dev/3ShwYxd (1.4s)
Description
•