sort() should work better on already- or mostly-sorted input

VERIFIED FIXED in mozilla1.0

Status

()

defect
VERIFIED FIXED
18 years ago
5 years ago

People

(Reporter: pschwartau, Assigned: khanson)

Tracking

({js1.5, perf})

Trunk
mozilla1.0
x86
All
Points:
---
Dependency tree / graph
Bug Flags:
in-testsuite +

Firefox Tracking Flags

(Not tracked)

Details

()

Attachments

(5 attachments, 6 obsolete attachments)

Reporter

Description

18 years ago
This arose out of bug 98012. The script at the given URL creates arrays
of 237 elements, each element of which is an array. Then it invokes the sort() 
method without using a comparison function. 


In bug 98012, jrgm made the following observations:

"There is a 237-element array of arrays that is sorted. This is slower than
IE since (a) mozilla uses quicksort (right?) and that 237-array
is already sorted (hence mozilla goes n-squared; IE apparently uses some
heap sort), and (b) when sorting an array of arrays, and not given a 
comparison function, mozilla js just flattens each subarray into a string, 
probably chucking the result after each of about 28000 comparisons
(and possibly IE is doing element-wise comparison as long as the elements
remain strings and numbers, but I can't really prove that).

That web page will load much faster if instead of calling ary.sort(), the 
authors did ary.sort(sortFunc), with 

  function sortFunc(a, b) {
      if (a[0] < b[0])
          return -1; 
      if (a[0] > b[0])
          return 1;
      return 0;
  }

That cuts the n-squared sort down from about 14 seconds to 0.5 seconds on my 
win2k system (500MHz)."


See bug 98012 for a standalone JS shell testscase -
Reporter

Comment 1

18 years ago
Adding perf keyword; cc'ing contributors from bug 98012 -
Keywords: perf
Assignee

Comment 2

18 years ago
Quicksort works well with unsorted data.  Sorted data causes long sort times
because the dividing or pivot point is the first element of the subarray being
sorted (usually a reasonable choice for shuffled input.)  Some implementations
select the median value or the median value of the first, middle and last
value.	This strategy fails for inputs that are all equal.  This patch
determines quickly if the input needs sorting and then proceeds to sort,
otherwise it skips the sort.  It does not solve the more difficult problem of
nearly sorted input.
Assignee

Comment 3

18 years ago
I tried putting a 2 pass early exit bubble sort, bubbling from the end (assumes
new items have been appended) of the list to detect sorted and some almost
sorted lists.  The overhead of a 2000 item almost sorted list was excessive
(10%).  Then there is the subjective problem of almost sorted lists: 1)are the
non ordered items in the middle, begining or end, 2) how many items are unsorted
1, 2 or some small number, 3) which sorting algorithm to use.  I decided the
most efficent patch was to check the list for unsorted elements and proceed to
existing quicksort when the first unsorted pair is found.  If no unsorted pair
is found skip the sort.

The patch is included above.
Status: NEW → ASSIGNED
Target Milestone: --- → mozilla0.9.7

Comment 4

18 years ago
What about switching to heapsort in Mozilla too?

Or if you want to continue using quicksort, use a better "find pivot element"
algorithm that will only choose a pivot element after seeing two different
values. If it fails to do so, that partition is already "sorted". I don't know
quite how to implement it in a good way but it should be possible.

Comment 5

18 years ago
Or a third approach that will handle the "all equal case" better, partition the
data into three partitions.

	  +-----------+---------+------------+
	  |     <     |    =    |     >      |
	  +-----------+---------+------------+

Then recurse only on the left and right parts. 

Sample implementation presented to me:

int quicksort(int p, int r) {
  int x = A[r];  /* pivot */
  int l = p-1;
  int e = p-1;
  int j, t;

  if (r <= p)
    return;
  for (j = p; j <= r; j++) {   /* partition */
    if (A[j] < x) {
      l++; e++; t = A[j]; A[j] = A[e]; A[e] = A[l]; A[l] = t; 
    } else if (A[j] == x) {
      e++;      t = A[j]; A[j] = A[e]; A[e] = t;
  }
  /* A[p..l] < x    A[l+1..e] = x    A[e+1..r] > x  */
  quicksort(p, l);
  quicksort(e+1, r);
}


Assignee

Comment 6

18 years ago
Daniel, thanks for the suggestions.  
THe current suggested patch examines the list only once, and jumps to the sort
as soon as an out of order pair is found (not unequal as you suggested).  The
quicksort algorithm you suggested would certainly speeds up sorting of lists
containing all equal elements, but would still perform poorly on almost ordered
lists. The heapsort while slightly slower (16N lg N vs 11.67N lg N, on average,
Knuth) has a worst case of order N log N while quicksort worst case is of order
N*N.  Might be worth considering. 
Assignee

Comment 7

18 years ago
I am convinced that heapsort is the proper sorting algorithm.  While slightly
less speedy than quicksort there are no pathalogical cases that cause heapsort
to degrade to n^2.  Patch coming soon.
Assignee

Comment 8

18 years ago
Updating milestone to 9.8
Target Milestone: mozilla0.9.7 → mozilla0.9.8
Assignee

Comment 9

18 years ago
Posted patch heapsort replaces quicksort (obsolete) — Splinter Review
Replaces quicksort with heapsort.  Heapsort is about 15% slower on random data,
but much faster on sorted and mostly sorted cases (see data).

			     random   sorted  reverse identical
					       sorted	 
quicksort   (100 values)	 15	  78	   77	    77ms
heapsort    (100 values)	 18	  19	   17	     7

quicksort  (2000 values)	565    30596	30598	 30490
heapsort   (2000 values)	621	 639	  593	   137

quicksort  (5000 values)       1501   189465   189428	188845
heapsort   (5000 values)       1753	1815	 1683	   344

Comment 10

18 years ago
Looks like heapsort is the winner. :-)

Btw, heapsort don't have to be less speedy than quicksort. The analysis Knuth
made was on a hypothetical computer so the constants doesn't fit normal
computers today. I've heard about, and read analysis about heapsort being faster
than quicksort. I guess it all depends on the hardware and implementation. 
Reporter

Comment 11

18 years ago
cc'ing Waldemar - this is the bug we discussed at the last JS Engine meeting.
Assignee

Comment 12

18 years ago
An interesting test case that shows the performance difference in the current
quicksort algorithm and the proposed heapsort algorithm is a large array of
random 0's and 1's.
Attachment #61513 - Attachment is obsolete: true
Comment on attachment 62882 [details] [diff] [review]
Tab free revision of previous patch.  Includes a minor tweak.

Sorry I didn't review this in time for 0.9.8 -- can I make it up by helping get
r=?

Nit: js_sortHeap violates the prevailing naming convention, which would favor
js_SortHeap (js_qsort_r follows a non-prevailing style from the POSIX world,
never mind it).

/be
Attachment #62882 - Flags: superreview+
Reporter

Comment 14

18 years ago
From Comment #9:

> Replaces quicksort with heapsort.  Heapsort is about 15% slower on 
> random data, but much faster on sorted and mostly sorted cases (see data).

Is the 15% slowdown on random data an acceptable price?
Comment on attachment 62882 [details] [diff] [review]
Tab free revision of previous patch.  Includes a minor tweak.

I believe the random input case is less common than nearly sorted or reversed.

IE's probable use of a heap-sort is compelling too, because people will expect
others to match.

But my review was lacking.  Trying again:

>-                    js_qsort(table, (size_t)j, sizeof *table, CompareOffsets,
>+                    js_hsort(table, (size_t)j, sizeof *table, CompareOffsets,
>                              NULL);

More naming nits: the js_qsort name aped Unix qsort(3), but I see on reason to
spread that meme.  How about js_HeapSort for the extern friend function, and
HeapSortHelper or some such for the static, instead of js_sortHeap (or
js_SortHeap as I mentioned last time, which now seems a silly name next to
js_HeapSort ;-)?

>@@ -649,49 +650,46 @@
>     (fastmove ? (void)(*(jsval*)(p) = *(jsval*)(q)) : (void)memmove(p, q, n))
> 
>-    while (lo < hi) {
>-        i = lo;
>-        j = hi;
>-        a = (char *)vec + i * elsize;
>+    if (lo == 1) {
>+        i2 = 2;
>+        b = (char *)vec + elsize;
>+        if ((i2 < hi) && (cmp((char *)vec, b, arg) < 0)) i2++;

Please put then statements on their own lines (i2++; here), for
breakpoint-ability.

Also, no need to overparenthesize the operands of && in that if condition.

>+        a = (char *)vec + (hi - 1) * elsize;
>+        b = (char *)vec2 + i2 * elsize;
>+        if (cmp(a, b, arg) >= 0) return;

return on its own line, and no else after return here.

>+        else {
>         MEMMOVE(pivot, a, elsize);
>-        while (i < j) {
>-            b = (char *)vec + j * elsize;
>-            if (cmp(b, pivot, arg) >= 0) {
>-                j--;
>-                continue;
>-            }
>             MEMMOVE(a, b, elsize);
>-            while (cmp(a, pivot, arg) <= 0) {
>-                i++;
>-                a = (char *)vec + i * elsize;
>-                if (i == j)
>-                    goto store_pivot;
>+            lo = i2;
>             }
>-            MEMMOVE(b, a, elsize);
>         }
>-        if (i > lo) {
>-      store_pivot:
>-            MEMMOVE(a, pivot, elsize);
>+    else {
>+        a = (char *)vec2 + lo * elsize;
>+        MEMMOVE(pivot, a, elsize);
>         }
>-        if (i - lo < hi - i) {
>-            lohi = i - 1;
>-            if (lo < lohi)
>-                js_qsort_r(qa, lo, lohi);
>-            lo = i + 1;
>-        } else {
>-            hilo = i + 1;
>-            if (hilo < hi)
>-                js_qsort_r(qa, hilo, hi);
>-            hi = i - 1;
>+    
>+    while (lo <= hi/2) {

Capture hi/2 in a local variable before the loop, just for feeble compilers and
my quick-scan sanity.

>+        i2 = lo + lo;

Is i2 the best name?  Canonical next single-letter name would be j, but maybe
there's a longer, yet still pithy name?

>+        a = (char *)vec2 + i2 * elsize;
>+        b = (char *)vec + (i2 - 1) * elsize;
>+        if ((i2 < hi) && (cmp(a, b, arg) < 0)) i2++;
>+        b = (char *)vec2 + i2 * elsize;
>+        if (cmp(pivot, b, arg) >= 0) break;
>+        else {

break on its own line, and get rid of the else and braces (and indentation).

>+            a = (char *)vec2 + lo * elsize;
>+            MEMMOVE(a, b, elsize);
>+            lo = i2;
>         }
>     }
> 
>+    a = (char *)vec2 + lo * elsize;
>+    MEMMOVE(a, pivot, elsize);
> #undef MEMMOVE
> }
> 
> JSBool
>-js_qsort(void *vec, size_t nel, size_t elsize, JSComparator cmp, void *arg)
>-{
>+js_hsort(void *vec, size_t nel, size_t elsize, JSComparator cmp, void *arg) {
>     void *pivot;
>     QSortArgs qa;
>+    int i;
> 
>     pivot = malloc(elsize);
>@@ -703,5 +701,10 @@
>     qa.cmp = cmp;
>     qa.arg = arg;
>-    js_qsort_r(&qa, 0, (int)(nel - 1));
>+
>+    for (i = nel/2; i > 0; i--)
>+        js_sortHeap(&qa, i, (int) nel);
>+    while ((int) nel > 2)

Is the (int) cast needed here by some compiler?

/be
Attachment #62882 - Flags: superreview+

Comment 16

18 years ago
I was meaning to comment in this bug for some time as it was my initial 
musing about a "quicker quicksort" ;-), which I suppose led to this bug/rfe.

I was wondering about the tradeoff we are making here, and while I agree 
somewhat with brendan that partially-sorted data is probably more the norm,
I wish we had a little more empirical data to back this up. [I'll note also
that the difference in the two is likely not to show up for typical web 
content, where arrays are usually "short" (N < 50), although JS is used in 
other places that may have different typical use].

One area that I may be impacted is in the machv UI. I can think of one area 
where a JS sort is used (fonts) and that data may be unsorted when retrieved 
(and on some systems with many fonts installed, may be >50 items long). I 
should check this tomorrow. 


jrgm, I agree we need more data.  My belief is based on cases that have come up
in the last year or so, and intuition.  Hardly enough to decide the issue!

One thought: we could instrument js_qsort in a test build so it compiles and
reports disorder stats, and run it through a bunch of tests and UI.  Or do
something smarter -- anyone?

/be
Assignee

Comment 18

18 years ago
Posted patch Changes suggested by Brendan (obsolete) — Splinter Review
Cosmetic changes suggested by Brendan. Code should be the same.

My intuition suggests that the 15% performance hit will not be noticed on small
or random sorts.  I'd be willing to build an instrumented version, possibly
including both sort algorithms selectable by the user.
Attachment #62882 - Attachment is obsolete: true
khanson: or sort both ways, keeping score, and mail the results periodically and
automatically to jrgm!  Half-kidding, we could actually do this fairly easily.

/be

Comment 20

18 years ago
Did you prove that this implementation of heapsort will terminate and not do
nasty things such as overrun buffers when given an inconsistent comparison
function?  An inconsistent comparison function might state that x>y, y>z, and
z>x, or it might state that x>y and later change its mind and state that x<y. 
Heapsorts are usually safe against such nastiness but individual implementations
may vary.
Assignee

Comment 21

18 years ago
Yes, this implementation will terminate always and will not overwrite memory
outside the sort range. Of course, an inconsistent comparison function probably
makes the sort results meaningless.  Thanks for the input.

=Kenton=

Assignee

Comment 22

18 years ago
The timings reported for the heap sort of random arrays is very close to the
worst case sort, i.e., the computation time upper limit for the heap sort
algorithm is very well behaved.  This limit is the same for inconsistent
comparison functions.

=Kenton=
Assignee

Comment 23

18 years ago
targeting 9.9.
Target Milestone: mozilla0.9.8 → mozilla0.9.9

Updated

18 years ago
Blocks: js-perf

Comment 25

18 years ago
Comment on attachment 65277 [details] [diff] [review]
Changes suggested by Brendan

The patch has a mix of tabs and spaces in the indentation. That should probably
be only spaces. Also, in jsarray.h qsort is still mentioned in a comment and in
some data structures, that better be renamed or corrected.

Comment 26

18 years ago
Comment on attachment 65277 [details] [diff] [review]
Changes suggested by Brendan

Forgot to mention that at least one source line grows beyond the 80 char width
limit.

Updated

18 years ago
Blocks: 121136
Oops, meant to say this here, not in the now-invalidated bug 121136:

We should be careful not to over-optimize one not particularly representative
benchmark.  I say this only because it seems like the benchmarks for sorting
tend to use the default comparison, and arrays of numbers.  Real-world sort
cases I've seen sort based on strings (e.g., directory listings for the filepicker).

/be
Assignee

Comment 28

18 years ago
Posted patch Revised patch (obsolete) — Splinter Review
Revised patch.	Maybe this time I caught all the tabs!

=Kenton=
Attachment #65277 - Attachment is obsolete: true

Comment 29

18 years ago
The test measures speed of Array.sort() for arrays with numbers in ascending
and  descending orders and filled by calls to Math.random(), filled by calls to
Math.sin()

Comment 30

18 years ago
Tipical results for the above test (with array length = 100) under RedHat Linux
7.2, Celeron 466, 512MB:

without the patch:

Ascending array sort time, ms:  12.13
Descending array sort time, ms: 7.83
Random array sort time, ms:     77.88
Sinus array sort time, ms:      40.21

with the patch:

Ascending array sort time, ms:  6.31
Descending array sort time, ms: 6.02
Random array sort time, ms:     55.78
Sinus array sort time, ms:      47.92

Observations: heapsort do better not only for sorted, but for random input 
as well, it is only slower for "sinus" order (5 periods of Math.sin()), but then
time is closer to the random case as a proof of heapsort stability
Comment on attachment 66143 [details] [diff] [review]
Revised patch

>@@ -649,49 +650,45 @@
>     (fastmove ? (void)(*(jsval*)(p) = *(jsval*)(q)) : (void)memmove(p, q, n))
> 
>-    while (lo < hi) {
>-        i = lo;
>-        j = hi;
>-        a = (char *)vec + i * elsize;
>+        if (lo == 1) {
>+        j = 2;

Even in a diff -w, something's amiss that tabs can't cause (and don't -- no
tabs in sight): that if needs to be exdented one stop, I think.

>+        b = (char *)vec + elsize;
>+        if (j < hi && cmp((char *)vec, b, arg) < 0) j++;
>+        a = (char *)vec + (hi - 1) * elsize;
>+        b = (char *)vec2 + j * elsize;
>+        if (cmp(a, b, arg) >= 0) return;

Return on same line as if considered harmful to debugability.

>+        
>         MEMMOVE(pivot, a, elsize);
>-        while (i < j) {
>-            b = (char *)vec + j * elsize;
>-            if (cmp(b, pivot, arg) >= 0) {
>-                j--;
>-                continue;
>-            }
>             MEMMOVE(a, b, elsize);
>-            while (cmp(a, pivot, arg) <= 0) {
>-                i++;
>-                a = (char *)vec + i * elsize;
>-                if (i == j)
>-                    goto store_pivot;
>-            }
>-            MEMMOVE(b, a, elsize);
>+            lo = j;
>         }
>-        if (i > lo) {
>-      store_pivot:
>-            MEMMOVE(a, pivot, elsize);
>-        }
>-        if (i - lo < hi - i) {
>-            lohi = i - 1;
>-            if (lo < lohi)
>-                js_qsort_r(qa, lo, lohi);
>-            lo = i + 1;
>-        } else {
>-            hilo = i + 1;
>-            if (hilo < hi)
>-                js_qsort_r(qa, hilo, hi);
>-            hi = i - 1;
>+    else {
>+        a = (char *)vec2 + lo * elsize;
>+        MEMMOVE(pivot, a, elsize);
>         }

Hmm, maybe I am reading the diff -uw wrong -- how about a diff -u as well?

>+
>+    hiDiv2 = hi/2;
>+    while (lo <= hiDiv2) {
>+        j = lo + lo;
>+        a = (char *)vec2 + j * elsize;
>+        b = (char *)vec + (j - 1) * elsize;
>+        if ((j < hi) && (cmp(a, b, arg) < 0)) j++;

More then-on-same-line-as-if(condition)-is-hard-to-breakpoint whining from me.

>+        b = (char *)vec2 + j * elsize;
>+       if (cmp(pivot, b, arg) >= 0) break;

Ditto.

>+ 
>+        a = (char *)vec2 + lo * elsize;
>+        MEMMOVE(a, b, elsize);
>+        lo = j;
>     }
> 
>+    a = (char *)vec2 + lo * elsize;
>+    MEMMOVE(a, pivot, elsize);
> #undef MEMMOVE
> }
> 
> JSBool
>-js_qsort(void *vec, size_t nel, size_t elsize, JSComparator cmp, void *arg)
>-{
>+js_HeapSort(void *vec, size_t nel, size_t elsize, JSComparator cmp, void *arg) {

Opening brace of function body on its own line is prevailing style, and helps
vi/vim lusers like me find the top quickly (via [[).

>@@ -1850,5 +1850,5 @@
> 			}
> 		    }
>-                    js_qsort(table, (size_t)j, sizeof *table, CompareOffsets,
>+                    js_HeapSort(table, (size_t)j, sizeof *table, CompareOffsets,
>                              NULL);

Totally aesthetic nit: make that NULL underhang the first (table) actual
parameter.

sr=brendan@mozilla.org with the above issues, if real, fixed.

/be
> 		}
>
Attachment #66143 - Flags: superreview+
Assignee

Comment 32

18 years ago
Posted patch revised patch (format only) (obsolete) — Splinter Review
Contains format changes only.

Kenton
Attachment #66143 - Attachment is obsolete: true

Comment 33

18 years ago
It is possible to speedup sorting for the default lexicographical sort farther
by using small cache to keep results of 3 resent calls to js_ValueToString
during heapsort. For number-only arrays it gives another 20-30% speedup and
brings JS on pair with IE. 
Igor, how about a patch?

Kenton, I'll bring my sr= forward to the latest patch.  Can you get an r= from
someone (bratell, rogerl, shaver)?  Thanks.

/be
Attachment #66374 - Flags: superreview+
Assignee

Comment 35

18 years ago

*** This bug has been marked as a duplicate of 120992 ***
Status: ASSIGNED → RESOLVED
Last Resolved: 18 years ago
Resolution: --- → DUPLICATE

Comment 36

18 years ago
Don't think this is a dupe :)
Status: RESOLVED → REOPENED
Resolution: DUPLICATE → ---
Assignee

Comment 37

18 years ago
Ugh! I hit the wrong button!

Comment 38

18 years ago
I did a quick run where I tried with libc 2.2.5's quicksort:

qsort:
asc: 17.01
des: 5.74
rand: 61.44
sinus: 52:12

qsort-libc:
asc: 3.67
des: 3.52
rand: 48.26
sinus: 40.57

hs:
asc: 4.63
des: 4.79
rand: 69.4
sinus: 63.41

and now the % time saved:

hs compared to qsort: 
asc: 0.72
des: 0.16
rand: -0.12
sinus: -0.21

qsort-libc compared to qsort:
asc: 0.78
des: 0.38
rand: 0.21
sinus: 0.22

qsort-libc compared to hs:
asc: 0.20
des: 0.26
rand: 0.30
sinus: 0.36

qsort still wins :)
Dennis, how did you compare?  Did you hack GNU libc 2.2.5 qsort into the JS
engine?  Just wondering what it does differently (better pivot selection?) than
the old js_qsort.

/be
Status: REOPENED → ASSIGNED
Keywords: js1.5, mozilla0.9.9

Comment 40

18 years ago
Yup... Simply took the source and changed the call to the compare function (to
send the arg with).
Direct from libc's implementation:
/*
Order size using quicksort.  This implementation incorporates
four optimizations discussed in Sedgewick:

1. Non-recursive, using an explicit stack of pointer that store the
   next array partition to sort.  To save time, this maximum amount
   of space required to store an array of SIZE_MAX is allocated on the
   stack.  Assuming a 32-bit (64 bit) integer for size_t, this needs
   only 32 * sizeof(stack_node) == 256 bytes (for 64 bit: 1024 bytes).
   Pretty cheap, actually.

2. Chose the pivot element using a median-of-three decision tree.
   This reduces the probability of selecting a bad pivot value and
   eliminates certain extraneous comparisons.

3. Only quicksorts TOTAL_ELEMS / MAX_THRESH partitions, leaving
   insertion sort to order the MAX_THRESH items within each partition.
   This is a big win, since insertion sort is faster for small, mostly
   sorted array segments.

4. The larger of the two sub-partitions is always pushed onto the
   stack first, with the algorithm then concentrating on the
   smaller partition.  This *guarantees* no more than log (total_elems)
   stack size is needed (actually O(1) in this case)!  */

In less words:
First it uses a non-recursive quicksort to sort into non-sorted bloks of 4
elements, then uses insertion-sort to sort the rest. The quicksort uses the most
optimal of the 1st, middle and last element for its pivot thus avoiding
worst-case runtime.
Assignee

Comment 41

18 years ago
Comment on attachment 66183 [details]
Test case for array sorting

Test has flaws.

1)  Althouhg numeric data is being put into the array, it is being sorted be
lexical order, thus, the array containing the ordered numbers 1..100 is  not
sorted lexically.  All the examples in the test are not ordered by the default
sorting criteria used.

2)  Most sorting algorithms hvae similar speeds for arrays less than 100,
larger arrays that are more problematic exhibit different speeds (see comment 9
above.)

=Kenton=

Comment 42

18 years ago
sorry if this comment is a duplicate - i'm having bugzilla log-in trouble.

It seems to me that it is not the actual sort algorithm that is slowing stuff
down, rather the conversion on non-string data into strings for the default
lexical search.

My tests on a shortish non-random (keyboard bashing) array:

http://halo.gen.nz/moz/sorttest.html

                                     Moz 0.9.7     MSIE 5.0     NN 4.77

plain array sort over numbers         52.775        0.7          5.66
plain array sort over arrays         161.935        1.85        24.935
plain array sort over strings          2.005        0.3          1.9
array of arrays, sorted by function    7.06         6.61         3.305
sorted by anonymous function           5.61         6.61         2.605

I'm just wondering if this quicksort vs heapsort discussion is missing the point
a bit.
Assignee

Comment 43

18 years ago
The test program has flaws.

1)  for (w=0;w<no;w++){ ary.sort()}

loop sorts the array on the first iteration and the remaining iterations are a
sort of an already sorted array (a bad case for the current quick sort algorithm).

2)  I still claim that sorting performance issues are related to large arrays (
> 100).   The original bug report had an example array of size 237.  This test
program uses arrays of size 43.

3) I?d rather see the anonymous function as follows,

        ary.sort(function (a,b) {return a - b;})

4)  Bug #120992 is addressing some of the binary <-> decimal conversion
performance issues.  More to come.

Comment 44

18 years ago
When I added a cache (see the following patch) to keep results of few last calls
to js_NumberToString I got a speedup of factor 2.5 for sorting numeric arrays by
lexicographical sort:  

WITHOUT CACHE
time js array.js 
Ascending array sort time, ms:  1037.3
Descending array sort time, ms: 998.1
Random array sort time, ms:     667.7
Sinus array sort time, ms:      770.1

real    0m35.517s
user    0m35.150s
sys     0m0.190s

WITH CACHE
time js array.js 
Ascending array sort time, ms:  194.1
Descending array sort time, ms: 592.7
Random array sort time, ms:     253.3
Sinus array sort time, ms:      359.7
NumToStrCache hit/access: 55% 2558358
real    0m14.501s
user    0m14.330s
sys     0m0.170s


So indeed it is the number to string conversion that slows down the sort.

Comment 45

18 years ago
To make the patch fully working I need to know how to teach GC about cached
strings.

The patch adds to JSContext a small cache to store few (the number used by
patch is 5) recent results of js_NumberToString. It seems that this is
especially useful for sorting where speedup up to 2.5 is observed, but it works
in general too: number of hits during execution of
mozilla/js/benchmarks/all_bench.js was 99%...

Comment 46

18 years ago
This is really off topic for this bug, but anyway, every discussion has to start
somewhere.  

Brendan, could the cache idea be used as a short term solution until we have a
better way to handle strings? It really brings a good speedup and it doesn't
seem too complex.

Igor, I see that the cache is only for doubles. In the real world there are lots
of different types of objects so this only works for this test case. A real
(useful) cache would have to handle all kinds of objects. Hmm... Could every js
object keep itself as a string efter the conversion has been done once?

Comment 47

18 years ago
IMO caching strings from conversion of anything beyond numbers would be waster
of efforts. In typical JS objects are not converted to strings that much, but it
happens with numbers quite often. In any case it would be nice to profile more
scripts to see if it worth to have a cache for numbers in general and not just
while sorting.

Comment 48

18 years ago
Comment on attachment 66374 [details] [diff] [review]
revised patch (format only)

r=waldemar
Attachment #66374 - Flags: review+
The last few comments ask for a cache of recent/frequent number=>string
conversions.  I would like to see real-world sorting benchmarks, or any other
benchmarks, that would benefit from this.  IMHO we are focusing too much on a
synthetic benchmark.  I won't repeat what I wrote in comment 27.

/be
Moved to Mozilla 1.2 as a temporary holding place, as bugs remaining on 0.9.9
are being scrutinized, now that the 
milestone has passed.

If this must be fixed for MachV, then it needs to be assigned the "nsbeta1"
keyword.
Target Milestone: mozilla0.9.9 → mozilla1.2
Come on, we have a reviewed patch that seems to make a big difference on real
world workloads -- why shouldn't this go into 0.9.9 for 1.0?  What was the
hold-up that kept it from getting in last week?  I thought khanson had commit
access now.

I'm disturbed that one (of several) Netscape product keywords is mentioned as
the only way to pull this bug back.  Evangelism and embedding might well want
it.  But more to the point: good stewardship of open source means not treating
reviewed, technically good patches this way.

/be

Updated

18 years ago
Keywords: mozilla1.0+
Brendan -- my statement which disturbed you, was ambiguous at best. I could have
been clearer.

I did not mean to be understood that the `nsbeta1' keyword was the only option.
It makes sense that you nominated this `mozilla1.0'.
.

Comment on attachment 66374 [details] [diff] [review]
revised patch (format only)

a=asa (on behalf of drivers) for checkin to the 0.9.9 branch and the 1.0 trunk
Attachment #66374 - Flags: approval+
Comment on attachment 66374 [details] [diff] [review]
revised patch (format only)

a=asa (on behalf of drivers) for checkin to the 1.0 trunk
This has approval, why hasn't the patch gone in yet?  Retargeting at 1.0.

/be
Target Milestone: mozilla1.2 → mozilla1.0
Assignee

Comment 56

17 years ago
Posted file sorting test program (obsolete) —
The attached test program no longer runs with or without the patch.  I am gun
shy to commit the patch without a working test program.  The test program
produces an ?out of memory error?.  The array size being sorted is 1000.  I
have to reduce it to less than 175 to run on the Mac.

Kenton
Assignee

Comment 57

17 years ago
text format
Attachment #73783 - Attachment is obsolete: true
Kenton, thanks for the great testcase -- note that it was causing explosive
memory growth even without your heap-sort patch, due to the property tree patch
from bug 62164 combining badly with an old, non-ECMA and sub-optimal call to
OBJ_DEFINE_PROPERTY for each element being created by InitArrayObject.	I
changed that call to OBJ_SET_PROPERTY.

ECMA Array.prototype.sort (15.4.4.11) allows an implementation-dependent
combination of [[Get]], [[Put]], and [[Delete]] -- the last to allow impls to
sort sparse arrays without "filling holes" -- and we were using [[Define]], in
effect -- a non-ECMA primitive that creates a new property even if an old one
existed.  This causes massive property tree growth because it permutes paths in
the tree, creating new paths, for each sort call.

BTW, your test defines its own reverse function, but JS has
Array.prototype.reverse().

My patch is yours, plus indentation fixes (diff to see them), plus the DEFINE
=> SET change.	Its memory use is well-behaved.

/be
Attachment #66374 - Attachment is obsolete: true
Comment on attachment 73803 [details] [diff] [review]
khanson's last patch with indentation fixes and crucial perf fix to InitArrayObject

Carrying over marks, and shaver says sr=shaver on the DEFINE=>SET change; I'm
certain that jband will r=.  This is going in now for 1.0.

/be
Attachment #73803 - Flags: superreview+
Attachment #73803 - Flags: review+
Attachment #73803 - Flags: approval+
I picked on shaver and jband because they were my property-tree (bug 62164)
buddies.  Fix is in, sorry for the memory-growth scare, and thanks again.

/be
Status: ASSIGNED → RESOLVED
Last Resolved: 18 years ago17 years ago
Resolution: --- → FIXED
from cvs log -N -r3.33 jsarray.c:

revision 3.33
date: 2002/03/13 01:50:13;  author: brendan%mozilla.org;  state: Exp;  lines:
+50 -48
khanson@netscape.com's patch to switch from QuickSort to heap-sort, plus a
crucial ECMA-purity/property-tree-perf fix to InitArrayObject (to SET rather
than DEFINE; bug 99120, r=waldemar&jband, sr=shaver&brendan, a=asa).

/be
Assignee

Comment 62

17 years ago
Results from comment #9 updated to reflect Brendan's latest update.
An order of magnitude improvement on all cases.


                         random   sorted  reverse  identical
                                           sorted     
quicksort   (100 values)     15       78       77        77ms
heapsort    (100 values)     18       19       17         7
heapsort*   (100 values)      2        2        2         1

quicksort  (2000 values)    565    30596    30598     30490
heapsort   (2000 values)    621      639      593       137
heapsort*  (2000 values)     69       70       65        12

quicksort  (5000 values)   1501   189465   189428    188845
heapsort   (5000 values)   1753     1815     1683       344
heapsort*  (5000 values)    195      198      185        31

*includes Brendan's latest crucial performance fix to
InitArrayObject
Reporter

Comment 63

17 years ago
Verified Fixed.

The standalone JS shell testscase in bug 98012 now completes in 1.25 seconds
on my WinNT box in the debug JS shell. It used to take 8-10 seconds.
Note: it now completes in 0.66 seconds in the optimized JS shell.

On my Linux box, it completes in 1.12 seconds in the debug shell,
and 0.64 seconds in the optimized shell.
Status: RESOLVED → VERIFIED

Comment 64

17 years ago
The sort() test at
http://www.formula-one.nu/mozilla/jsTimeTest.htm
shows a huge difference (70ms vs. 1292ms) between msie6 and the latest trunk 
build.
Is this a different problem?

Markush, did you note a regression on the trunk?  Is someone running these tests
daily and charting results?

/be
Reporter

Comment 66

17 years ago
Here is the test mentioned in Comment #64:

<SCRIPT>
testSortArray();

function testSortArray()
{
  var theArray = new Array;
  for(i=0; i<=10000; i++)
    theArray[i] = parseInt(Math.random()*10000)

  start = new Date();
  theArray.sort();
  end = new Date();
  
  alert('Result in ms: ' + (end-start) + '\n');
}
</SCRIPT>



------------------------------  MY RESULTS  ------------------------------
WinNT 4.0 (SP6); 500MHz CPU; 128M RAM
All times in ms:

                  IE6                     172
                  NN4.7                   1800   
                  Moz (current trunk)     1700
                  Opt JS shell (current)  780
                  Opt JS shell (RC4a)     780
                  Opt JS shell (RC4)      1350
                  Opt JS shell (RC3)      1400


The change in the JS shell occurs between RC4 (Jan 01, 2002)
and RC4a (Mar 21, 2002). The fix for this bug went in Mar 12-13.

However, we see that on my box, IE6 is 10x faster than Mozilla
on this test, and 4.5x faster than the current JS shell.
Assignee

Comment 67

17 years ago
In response to comment #64, this is a different problem.  This bug addresses the
problem of "sort() should work better on already- or mostly-sorted input".  This
patch corrects that problem.

Sorting pointers to the objects, and moving the objects only once after their
final destination was determined might help.  Also, storing any redundant and
expensive conversions used for compares during the sort would be helpful.
Reporter

Comment 68

17 years ago
Kenton is right: testing sort() on an array of random numbers is
orthogonal to this bug. Accordingly, I have filed bug 143354,
"Can we improve Array.prototype.sort() performance?", and added it
as a dependency for bug 117611 (meta bug for JS Engine performance).

Updated

14 years ago
Flags: testcase?

Comment 69

14 years ago
Checking in regress-99120-01.js;
/cvsroot/mozilla/js/tests/js1_5/Array/regress-99120-01.js,v  <-- 
regress-99120-01.js
initial revision: 1.1
Checking in regress-99120-02.js;
/cvsroot/mozilla/js/tests/js1_5/Array/regress-99120-02.js,v  <-- 
regress-99120-02.js
initial revision: 1.1
Flags: testcase? → testcase+
You need to log in before you can comment on or make changes to this bug.