Last Comment Bug 224128 - Array.sort isn't a stable sort (switch to MergeSort)
: Array.sort isn't a stable sort (switch to MergeSort)
Status: RESOLVED FIXED
:
Product: Core
Classification: Components
Component: JavaScript Engine (show other bugs)
: Trunk
: x86 Windows 2000
: P2 enhancement with 5 votes (vote)
: mozilla1.8alpha1
Assigned To: Nicholas Allen
:
: Jason Orendorff [:jorendorff]
Mentors:
: 321803 331148 (view as bug list)
Depends on: 360681
Blocks: 248969
  Show dependency treegraph
 
Reported: 2003-10-29 14:02 PST by Martin Thomson
Modified: 2014-04-26 03:10 PDT (History)
18 users (show)
asa: blocking‑aviary1.5-
bob: in‑testsuite?
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---


Attachments
Simple test case (772 bytes, text/html)
2003-10-29 14:04 PST, Martin Thomson
no flags Details
Quick and dirty benchmark (2.93 KB, text/html)
2003-11-08 09:51 PST, Nicholas Allen
no flags Details
Replace Array.sort heapsort with mergesort (8.73 KB, patch)
2003-11-13 14:00 PST, Nicholas Allen
no flags Details | Diff | Splinter Review
Replace Array.sort heapsort with mergesort - updated for the tip (8.81 KB, patch)
2004-01-15 21:06 PST, Nicholas Allen
no flags Details | Diff | Splinter Review
Replace Array.sort heapsort with mergesort (9.48 KB, patch)
2004-06-11 10:19 PDT, Nicholas Allen
no flags Details | Diff | Splinter Review
An updated patch (11.22 KB, patch)
2006-10-03 11:03 PDT, Thue Janus Kristensen
no flags Details | Diff | Splinter Review
An updated patch taking garbage collection into account (14.97 KB, patch)
2006-10-04 07:33 PDT, Thue Janus Kristensen
no flags Details | Diff | Splinter Review
Updated patch, with Igor's fixes implemented (check malloc, add free, memcpy opt) (15.38 KB, patch)
2006-10-04 13:20 PDT, Thue Janus Kristensen
no flags Details | Diff | Splinter Review
Move JS_STATIC_ASSERT outside function to be kind to older compilers (15.67 KB, patch)
2006-10-04 13:44 PDT, Thue Janus Kristensen
no flags Details | Diff | Splinter Review
An opdated version of Nick "Gyre" Allen's test program (3.42 KB, text/html)
2006-10-05 13:24 PDT, Thue Janus Kristensen
no flags Details
Updated patch (15.85 KB, patch)
2006-10-05 13:32 PDT, Thue Janus Kristensen
igor: review+
Details | Diff | Splinter Review
patch I'm checking in (17.11 KB, patch)
2006-10-05 15:03 PDT, Brendan Eich [:brendan]
brendan: review+
Details | Diff | Splinter Review

Description Martin Thomson 2003-10-29 14:02:44 PST
User-Agent:       Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.5) Gecko/20031007
Build Identifier: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.5) Gecko/20031007

The java script reference
http://devedge.netscape.com/library/manuals/2000/javascript/1.5/reference/array.html#1196882
 describes the Array.sort method as stable.  A quick test shows that this is not
the case.

Reproducible: Always

Steps to Reproduce:
1. Create a simple struct.
2. Create a comparison method that sorts based on a single key.
3. Create an Array and sort it.

Actual Results:  
The array is sorted on the key correctly, but where there are duplicate values
for the key, the original order is NOT preserved.

Expected Results:  
"JavaScript uses a stable sort: the index partial order of a and b does not
change if a and b are equal. If a's index was less than b's before sorting, it
will be after sorting, no matter how a and b move due to sorting."
Comment 1 Martin Thomson 2003-10-29 14:04:13 PST
Created attachment 134421 [details]
Simple test case

I've tested this in IE - which works fine.
Comment 2 Boris Zbarsky [:bz] (still a bit busy) 2003-10-29 15:49:26 PST
Note that ECMA-262 does not require the sort to be stable, by the way....
Comment 3 Phil Schwartau 2003-10-29 16:13:22 PST
Martin has a good point: that's a direct quote he gives from the
DevEdge JavaScript reference URL.

But Boris is correct; the ECMA-262 Ed. 3 spec specifically warns that
array sorts are not necessarily stable. Here is the relevant section 
(available at http://www.mozilla.org/js/language/):


15.4.4.11 Array.prototype.sort (comparefn)

The elements of this array are sorted. The sort is not necessarily stable
(that is, elements that compare equal do not necessarily remain in their
original order).

                      etc.
Comment 4 Mike Shaver (:shaver -- probably not reading bugmail closely) 2003-10-29 16:21:34 PST
More compelling even than that quote is the fact that IE's sort is stable.  I
presume that 4.x's -- and Mozilla's, at some point? -- sort was stable, based on
that doc, so we should probably fix this.  (A strict-ECMA mode could use an
unstable sort just to be petulant, though. =) )
Comment 5 Boris Zbarsky [:bz] (still a bit busy) 2003-10-29 16:44:50 PST
I wasn't saying we shouldn't fix this... ;)
Comment 6 Nicholas Allen 2003-10-29 20:09:00 PST
A serious question here would be how much of a performance hit to Array.sort is
acceptable in exchange for making it stable?
Comment 7 Nicholas Allen 2003-11-07 17:34:57 PST
Unless someone else covets this bug, I'll hold on to it.  My plan is to start
with a copying merge sort, and see from there how much interest there is to
trade speed for memory use (unless people are willing to just stick with the
non-stable heapsort).
Comment 8 Brendan Eich [:brendan] 2003-11-07 18:31:52 PST
Stability is good, but we switched to heap sort from (also unstable) QuickSort
because of "benchmark-itis" time performance pressure from users and testers. 
It's not at all clear that JS sort performance is on any critical path, but it
could be.  So whatever we do, we shouldn't regress time perf.

Dynamic memory footprint can swell a bit temporarily, I think, for desktop
embeddings.  It doesn't seem as though memory use would be horrendously larger
in any other sorting algorithm.

/be
Comment 9 Nigel McFarlane 2003-11-07 23:20:20 PST
IIRC there are some esoteric Fibonacci sorts that are parallel
to quicksort, but are more time efficient.
Rather than split on powers of 2, you split on 1,2,3,5,8,13,etc.

Binary tree sort is alleged to be stable here (no details):
http://camino.rutgers.edu/ut/utsa/cs3343/lecture10.html
A fibonacci tree sort should be stable as well, but variable
size buckets is a problem ..
Comment 10 Nicholas Allen 2003-11-08 09:51:59 PST
Created attachment 135062 [details]
Quick and dirty benchmark

Times from Moz (Firebird current opt):

num	int random*	int sorted	string random	string sorted	object
reverse*
500	0	30	0	20	10
1000	20	130	0	10	41
2000	30	240	20	10	40
4000	60	451	20	20	110
8000	160	902	40	40	251
16000	311	--	110	90	591
32000	691	--	341	210	--
64000	--	--	711	400	--

Times from IE6:

num int random* int sorted string random string sorted object reverse* 
500	20	0	0	10	30 
1000	40	10	10	10	120 
2000	110	20	20	20	471 
4000	331	40	40	40	1893 
8000	1683	120	150	141	-- 
16000	--	370	581	481	-- 
32000	--	1072	--	1702	-- 
64000	--	--	--	--	--
Comment 11 Nicholas Allen 2003-11-08 10:03:44 PST
So it looks like Moz does quite well speedwise with heap sort except when doing
lex compares of numbers (but we knew that?).  By the way, IE really does have an
O(n^2) sort happening in that last test.

I expect merge sort to be a big win for nearly sorted data and not do too badly
for the others.
Comment 12 Brendan Eich [:brendan] 2003-11-08 11:00:02 PST
> So it looks like Moz does quite well speedwise with heap sort except when doing
> lex compares of numbers (but we knew that?).

See bug 64065, bug 99120, bug 121136, and especially bug 143354.

/be
Comment 13 Nicholas Allen 2003-11-13 14:00:01 PST
Created attachment 135431 [details] [diff] [review]
Replace Array.sort heapsort with mergesort

Here is an initial patch.  I haven't tested it extensively yet.  The sort uses
an insertion sort on small blocks followed by a non-recursive mergesort.  Times
are looking fairly fast compared to heapsort, at the cost of additional memory
use.
Comment 14 Nicholas Allen 2003-11-13 20:21:53 PST
And here are those times for the benchmark-itis sufferers:

IE6
num 	int random* 	int sorted 	string random 	string sorted 	object reverse*
1000	80	10	10	10	130 
2000	201	30	30	20	380 
4000	451	60	60	50	-- 
8000	--	110	160	121	-- 
16000	--	250	521	300	-- 
32000	--	--	--	--	-- 
64000	--	--	--	--	-- 

Firebird trunk -O
num 	int random* 	int sorted 	string random 	string sorted 	object reverse*
1000	30	90	10	10	10
2000	41	180	10	10	40
4000	90	411	20	20	100
8000	160	--	50	40	230
16000	330	--	130	80	501
32000	--	--	310	171	--
64000	--	--	--	351	--

Firebird w/ patch -O
num 	int random* 	int sorted 	string random 	string sorted 	object reverse*
1000	30	10	10	0	10
2000	30	40	10	10	20
4000	50	80	20	10	60
8000	100	140	40	40	90
16000	211	360	110	50	190
32000	421	--	221	120	400
64000	--	--	471	191	--
Comment 15 Nicholas Allen 2003-11-20 16:44:25 PST
Comment on attachment 135431 [details] [diff] [review]
Replace Array.sort heapsort with mergesort

I've been using this for a while and there haven't been any problems.  The
tradeoff is additional memory usage during sorting for a stable sort and
significant speed gains.
Comment 16 Brendan Eich [:brendan] 2003-11-20 18:07:58 PST
Cool, thanks for doing this.  It's definitely 1.7alpha fodder -- can you set the
Target Milestone field?  I'll review before then, but probably not in the next
few days, due to higher priority stuff.

/be
Comment 17 Nicholas Allen 2004-01-15 21:06:31 PST
Created attachment 139177 [details] [diff] [review]
Replace Array.sort heapsort with mergesort - updated for the tip

Dragging this patch into the new year.
Comment 18 Brendan Eich [:brendan] 2004-02-10 22:32:16 PST
Comment on attachment 139177 [details] [diff] [review]
Replace Array.sort heapsort with mergesort - updated for the tip

I'll review this tomorrow and help get it approved by drivers for 1.7a.

/be
Comment 19 Brendan Eich [:brendan] 2004-03-09 19:08:22 PST
I'm sorry I didn't help get this into 1.7a.  At this point, we can certainly get
it in for 1.7b, but is it alpha material?  If it's well-tested for all the
classes of inputs we're likely to care about, maybe so.  Nick, what do you think?

/be
Comment 20 Nicholas Allen 2004-03-10 09:50:53 PST
I think this should wait for the next alpha.  I've tested this patch quite a bit
but a rare sorting error could lead to very hard to track down bustage.
Comment 21 Nicholas Allen 2004-04-27 10:04:33 PDT
Pinging Brendan to get this back on track.  This needs some careful eyes to make
sure there aren't any bad edge cases.
Comment 22 Brendan Eich [:brendan] 2004-06-10 18:09:00 PDT
Ok, looking at this for 1.8a2.  Fresh patch?

/be
Comment 23 Nicholas Allen 2004-06-11 10:19:30 PDT
Created attachment 150540 [details] [diff] [review]
Replace Array.sort heapsort with mergesort

Refreshed.
Comment 24 Nicholas Allen 2004-06-25 16:59:44 PDT
Comment on attachment 150540 [details] [diff] [review]
Replace Array.sort heapsort with mergesort

I'm back from my trip and ready to answer any questions about this patch.  cvs
says this is still good to apply against the trunk.
Comment 25 Igor Bukanov 2004-07-11 14:54:22 PDT
The last patch generates warning with gcc 3.3:

jsarray.c: In function `MergeSortHelper':
jsarray.c:669: warning: suggest parentheses around && within ||


In addition it seems that making the initial insert sort slice to be just 3
instead of 4 makes sorting faster by 1% or so on my Linux box. Is it just
specific to my box?
Comment 26 Nicholas Allen 2004-07-11 18:16:42 PDT
Igor-

The merge passes get maximal cache action on most processors when the pass sizes
are powers of two.  The first merge pass would like to be the same size as the
insertion sort chunk size.  Were you testing for correctness in addition to
speed?  Four is generally going to be the optimal value here.

I can overparenthesize if needed to fix the warning.
Comment 27 Shane Milton 2005-01-12 07:58:56 PST
Could we get an update on if the sort will be stable in v1.8?  It seems as 
though there was no final resolution.
Comment 28 Jack Eidsness 2005-06-10 09:11:07 PDT
Since we already have a fix and a speed boost attached to this bug, I think it
would be a pity not to include it in 1.1.

I might add, since no one mentioned it, if you have a table that should be
sortable on two keys, having a stable sort makes doing so trivial in javascript,
whereas it may be possible with a lot of extra work (sort, split into multiple
arrays, sort again), with most web dev languages, it takes less developer time
now to use server-side code to sort this data now - and that is exactly the way
that I see it done everywhere that I see sortable data on a page.  Quite
wasteful, if you ask me.  Allowing this to be done via js is a big win for the
world.

I am just a peon in the world of bugzilla, I wonder if i can nominate this as a
blocker...  I'm going to try and I hope that's not considered rude or bad.
Comment 29 Nicholas Allen 2005-06-10 09:23:55 PDT
The patch has rotted a bit due to other js array changes and bug fixes so it
can't go in as is.  What has held this back is that I don't particularly like
the temporary increase to memory footprint this leads to although stable and
faster sorting is good.
Comment 30 Jack Eidsness 2005-06-10 09:46:49 PDT
(In reply to comment #29)

If it's /actually/ temporary, I don't see how that's a problem.  "temporary" in
my eyes means that the memory usage goes back down when the sort function
completes.  If the memory is not released until window close or program exit,
then I see your latter point as well as the unmissable point about the code
getting old.
Comment 31 timeless 2005-06-10 15:59:32 PDT
jack@hh.peril.org: there are hazards, like risking running out of memory while
trying to do an out of place sort where you wouldn't have run out of memory if
you did an inplace sort....
Comment 32 Markus Hübner 2005-09-06 11:12:47 PDT
Can anyone run the test at https://bugzilla.mozilla.org/show_bug.cgi?
id=143354#c47 with this patch applied?
Comment 33 Brendan Eich [:brendan] 2006-01-07 10:20:19 PST
*** Bug 321803 has been marked as a duplicate of this bug. ***
Comment 34 :Gavin Sharp [email: gavin@gavinsharp.com] 2006-03-25 21:19:30 PST
*** Bug 331148 has been marked as a duplicate of this bug. ***
Comment 35 Thue Janus Kristensen 2006-09-23 05:07:09 PDT
This should definitely be applied - it fixes a bug, as per the reference posted in the initial post.

You say that you do not want to apply it because of memory concerns, but mergesort only uses O(n) extra memory (which is really not that much IMO).

According to http://en.wikipedia.org/wiki/Sorting_algorithm , only one stable nlog(n) sorting algorithm has space usage less than O(n). This is a modified merge sort with space usage O(1), which saved space by making the algiritm slower and more complex.

What would make you apply this patch, and fix the bug?

Comment 36 Brendan Eich [:brendan] 2006-09-23 10:00:26 PDT
Nothing is stopping us from trying the patch out in the 1.9 alpha trunk.  It can be tested by many people there.  It just needs to be updated, reviewed, and landed.  I'd appreciate it if Igor Bukanov did the review, since I'm short on time and he has lots of experience with array_sort and jsarray.c.

/be
Comment 37 Igor Bukanov 2006-09-23 16:11:18 PDT
(In reply to comment #36)
> Nothing is stopping us from trying the patch out in the 1.9 alpha trunk.  It
> can be tested by many people there.  It just needs to be updated, reviewed, and
> landed.  I'd appreciate it if Igor Bukanov did the review, since I'm short on
> time and he has lots of experience with array_sort and jsarray.c.

The patch in the current form has 2 problems:

1. No rooting of the temporary array.

2. No propagation of exceptions and errors from the compare function. 

But these are straightforward to fix.
Comment 38 Thue Janus Kristensen 2006-10-03 11:03:10 PDT
Created attachment 241085 [details] [diff] [review]
An updated patch

I have created an updated version of Nick "Gyre" Allen's patch.

It compiles and seems to work correctly, but this is the first time I touch the Mozilla code, so it might be a good idea to review my work.

> The patch in the current form has 2 problems:
> 
> 1. No rooting of the temporary array.

I don't know what this is. If it is related to the localroot var in that file, then no updates seem to be necessary.

> 2. No propagation of exceptions and errors from the 
> compare function. 

I copied the CALL_CMP function from current CVS into the patch.

Regards, Thue
Comment 39 Thue Janus Kristensen 2006-10-03 11:12:35 PDT
(In reply to comment #32)
> Can anyone run the test at https://bugzilla.mozilla.org/show_bug.cgi?
> id=143354#c47 with this patch applied?

The patch improves performance, from 132ms to 123ms on my computer.
Comment 40 Thue Janus Kristensen 2006-10-03 11:16:37 PDT
Just a note, In the current incarnation the patch may cause problems if an error occurs in a sort function, and the sort aborts midway through.

In that case the original src array which is input to mergesort may contain 2 copies of one element from the original array, and 0 of another.

If this is a problem (I don't know, I am a newbie :) ), then it could be fixed by doing the sorting in two new malloced arrays, and then copy the result to the original src array in the end.
Comment 41 Igor Bukanov 2006-10-03 11:21:27 PDT
(In reply to comment #38)
> > The patch in the current form has 2 problems:
> > 
> > 1. No rooting of the temporary array.
> 
> I don't know what this is. If it is related to the localroot var in that file,
> then no updates seem to be necessary.

The patch creates a temporary array as a scratch space for merge. Then at some point some elements of the original array would only exist at the scratch space. If at this moment a compare is called, then it is possible that the garbage collector would run.

Since SpiderMonkey uses a precise GC, it must know about all the roots where to scan for live things. Since the scratch space is not registered with GC, the patch introduces a GC hazard.

Such hazard existed for some time in the heapsort implementation. There GC was not aware about pivot memory. The fix was not to allocate the memory in js_HeapSort but rather force the caller to provide the memory for the temporary. This also suggests how to fix this problem with the merge sort. That is, it should require the caller to provide the scratch array delegating the rooting responsibility there. 

It would add an extra benefit of not allocating the memory twice since arrray_sort can use single malloc call to allocate the space for the original and scratch space.
Comment 42 Igor Bukanov 2006-10-03 11:25:46 PDT
(In reply to comment #40)
> Just a note, In the current incarnation the patch may cause problems if an
> error occurs in a sort function, and the sort aborts midway through.
> 
> In that case the original src array which is input to mergesort may contain 2
> copies of one element from the original array, and 0 of another.

No, this is not a problem. As long as an aborted sort does not leave partial copies, it is OK. That is, copies are fine as long as they are the complete copies, not like 1 byte from one element and 3 bytes from another.

AFAICS there is no such problems with the patch. 
Comment 43 Thue Janus Kristensen 2006-10-04 07:33:56 PDT
Created attachment 241174 [details] [diff] [review]
An updated patch taking garbage collection into account

I created an updated patch which GC-roots the used temp memory correctly. I used the single GC-rooted malloc to create both the vector and the temp space, as you suggested.

A few comments and a few minor cleanups also added.
Comment 44 Igor Bukanov 2006-10-04 08:53:29 PDT
Comment on attachment 241174 [details] [diff] [review]
An updated patch taking garbage collection into account

>+    /*
>+     * The first newlen elements of vec are copied from the array
>+     * object.
>+     *
>+     * Of the remaining 2*len-newlen positions, newlen are used as GC
>+     * rooted temp space for mergesort, and the last (2*len-2*newlen)
>+     * positions are unused.
>+     *
>+     * Here we clear the tmp-values before GC-rooting the array.
>+     */
>+    for (i = newlen; i < 2*newlen; i++) {
>+        vec[i] = JSVAL_NULL;
>+        tvr.count = tvr.count + 1;
>+    }
>+    mergesort_tmp = vec + newlen;

You can use here something like:
    JS_ASSERT(JSVAL_NULL == 0);
    memset(mergesort_tmp, 0, newlen * sizeof *mergesort_tmp)

>@@ -3568,18 +3568,21 @@ Decompile(SprintStack *ss, jsbytecode *p
>                             }
>                             table[j].key = INT_TO_JSVAL(low + i);
>                             table[j].offset = off2;
>                             table[j].order = j;
>                             j++;
>                         }
>                         pc2 += jmplen;
>                     }
>-                    js_HeapSort(table, (size_t) j, &pivot, sizeof(TableEntry),
>-                                CompareOffsets, NULL);
>+                    tmp = (TableEntry *)
>+                        JS_malloc(cx, (size_t)j * sizeof *table);
>+                    ok = js_MergeSort(table, (size_t) j, sizeof(TableEntry),
>+                                      CompareOffsets, NULL, tmp);
>+                    JS_ASSERT(ok);
>                 }

There is no free for tmp and the result of malloc is not checked.

Otherwsie it is OK.
Comment 45 Igor Bukanov 2006-10-04 08:58:30 PDT
For references, bug 311497 is that bug about GC hazard through heap sort pivot. 
Comment 46 Brendan Eich [:brendan] 2006-10-04 09:56:32 PDT
(In reply to comment #44)
> You can use here something like:
>     JS_ASSERT(JSVAL_NULL == 0);
>     memset(mergesort_tmp, 0, newlen * sizeof *mergesort_tmp)

Lots of code knows that JSVAL_NULL == 0, so the loop definitely should become a memset.

The JS_ASSERT should be JS_STATIC_ASSERT at top level, near the very top of the file, in jsapi.c, I think.

/be
Comment 47 Thue Janus Kristensen 2006-10-04 13:20:06 PDT
Created attachment 241207 [details] [diff] [review]
Updated patch, with Igor's fixes implemented (check malloc, add free, memcpy opt)

Embarrassing about the missed malloc check and free :(. Thanks to Igor for the catch!

malloc is now checked, free is now called, and memcpy is now used as suggested by Igor.

I added the
  JS_STATIC_ASSERT(JSVAL_NULL == 0);
just above where it is used, for code readability. That does of course not preclude it from being added other places too.
Comment 48 Igor Bukanov 2006-10-04 13:32:17 PDT
(In reply to comment #47)
> I added the
>   JS_STATIC_ASSERT(JSVAL_NULL == 0);
> just above where it is used, for code readability.

That does not work with older compilers. The macro uses a typedef which in C89 can only present at the beginning of the block. Ubfortunately due to older MSVC bugs it is not possible to write even 
{
   JS_STATIC_ASSERT(JSVAL_NULL == 0);
...
}

as there typedefs AFAIK inside blocks lead to compilation errors. So JS_STATIC_ASSERT can only be used either outside of any function or the begining of the function where other declarations are present. So just move it to the begining of the function with comments about using memset.

Comment 49 Thue Janus Kristensen 2006-10-04 13:44:13 PDT
Created attachment 241213 [details] [diff] [review]
Move JS_STATIC_ASSERT outside function to be kind to older compilers

As pointed out by Igor, JS_STATIC_ASSERT inside a function does not work in older compilers. So I moved it out just before the function, with an appropriate comment.
Comment 50 Igor Bukanov 2006-10-04 13:52:40 PDT
Please ask igor.bukanov@gmail.com for review=? and brendan@mozilla.org for superreview=? . I would plus the patch but then I am not that good at following the code style in SM, so just to be sure we need Brendan's blessing.
Comment 51 Igor Bukanov 2006-10-04 14:23:16 PDT
Comment on attachment 241213 [details] [diff] [review]
Move JS_STATIC_ASSERT outside function to be kind to older compilers

>+#define MEMCPY(p,q,n) \
>+    (fastcopy ? (void)(*(jsval*)(p) = *(jsval*)(q)) : (void)memcpy(p, q, n))

Nit: memcpy is also used to copy chunks of array in the new code, so it is better IMO to rename MEMCPY into something like COPY_ONE. 

>+    for (lo = 0; lo < nel; lo += 4) {

Curiosity: do you have benchmarks that 4, not 3-5-6 is a beter choice? I am fine with the value, but it would be interesting to know.


>+/* the array_sort function below assumes JSVAL_NULL==0 to perform an
>+ * initialization with JSVALL_NULL using memset.
>+ */

Nit: multi-line comments should be like

/*
 * Start with a capital letter and write text ...
 * that spans several lines.
 */
Comment 52 Brendan Eich [:brendan] 2006-10-04 18:52:24 PDT
Comment on attachment 241213 [details] [diff] [review]
Move JS_STATIC_ASSERT outside function to be kind to older compilers

Looks good at a glance, just some nits:

* MergeSort_MergeArrays should be renamed to just MergeArrays.

* Agree with Igor on COPY_ONE instead of MEMCPY.

I'll sr+ when Igor r+'s.

/be
Comment 53 Thue Janus Kristensen 2006-10-05 13:21:52 PDT
(In reply to comment #51)
> Curiosity: do you have benchmarks that 4, not 3-5-6 is a beter choice? I am
> fine with the value, but it would be interesting to know.

I just did a few recompiles and tests with different values, and 4 seems to be optimal. Using an version of Nick "Gyre" Allen's test program with some fixes and changes I got the following results:

2:
num 	int random 	int sorted 	string random 	string sorted 	object reverse
500	3	0	2	1	2
1000	7	1	7	2	4
2000	12	2	16	4	11
4000	36	4	34	9	25
8000	65	8	79	19	55
16000	154	17	182	46	122
32000	292	35	410	85	275
64000	649	77	952	184	577

3:
num 	int random 	int sorted 	string random 	string sorted 	object reverse
500	3	0	3	1	2
1000	7	1	7	2	5
2000	12	2	16	5	11
4000	48	4	35	10	26
8000	62	8	84	22	60
16000	135	17	179	45	145
32000	292	34	403	91	291
64000	628	75	928	188	612

4:
num 	int random* 	int sorted 	string random 	string sorted 	object reverse*
500	2	0	2	1	2
1000	5	1	6	2	4
2000	13	2	15	4	10
4000	30	3	35	8	24
8000	61	7	78	18	54
16000	131	16	171	45	122
32000	263	34	384	85	268
64000	609	69	853	165	569

5:
num 	int random 	int sorted 	string random 	string sorted 	object reverse
500	3	0	3	1	2
1000	7	1	8	3	5
2000	12	2	15	5	11
4000	48	4	35	10	29
8000	67	7	79	20	61
16000	138	16	178	39	138
32000	322	35	395	77	296
64000	646	73	894	174	630

6:
num 	int random 	int sorted 	string random 	string sorted 	object reverse
500	3	0	3	1	2
1000	6	1	6	2	6
2000	14	2	15	5	12
4000	31	4	35	10	28
8000	67	9	81	23	63
16000	143	17	182	46	142
32000	320	37	413	93	300
64000	673	79	904	189	641

8:
num 	int random 	int sorted 	string random 	string sorted 	object reverse
500	2	0	3	1	2
1000	6	1	7	2	5
2000	12	1	16	4	14
4000	50	4	35	8	29
8000	64	7	76	18	64
16000	138	15	173	46	173
32000	299	34	397	83	310
64000	643	73	894	161	665


All of them is massively (ca. x2) faster than then ones in Firefox 2.0rc1:

num 	int random 	int sorted 	string random 	string sorted 	object reverse
500	4	4	5	4	5
1000	10	10	11	12	13
2000	23	25	26	27	30
4000	50	57	62	57	72
8000	110	122	134	127	161
16000	239	265	303	283	359
32000	522	577	694	575	794
64000	--	--	--	--	-- (the script stops if the values in the previous row got too large)
Comment 54 Thue Janus Kristensen 2006-10-05 13:24:08 PDT
Created attachment 241360 [details]
An opdated version of Nick "Gyre" Allen's test program
Comment 55 Thue Janus Kristensen 2006-10-05 13:32:10 PDT
Created attachment 241363 [details] [diff] [review]
Updated patch

Ok, I opdated the patch:
-Comments now in line with coding style
-MEMCPY renamed to COPY_ONE
-MergeSort_MergeArrays renamed to MergeArrays
-Created a "#define INS_SORT_INT 4" for the size of the insertion-sorted interval.
Comment 56 Igor Bukanov 2006-10-05 13:41:19 PDT
Comment on attachment 241363 [details] [diff] [review]
Updated patch

Thanks for your efforts!
Comment 57 Thue Janus Kristensen 2006-10-05 13:42:53 PDT
(In reply to comment #53)
> (In reply to comment #51)
> > Curiosity: do you have benchmarks that 4, not 3-5-6 is a beter choice? I am
> > fine with the value, but it would be interesting to know.
> 
> I just did a few recompiles and tests with different values, and 4 seems to be
> optimal. Using an version of Nick "Gyre" Allen's test program with some fixes
> and changes I got the following results:
>
> [...]
>
> All of them is massively (ca. x2) faster than then ones in Firefox 2.0rc1:

Just a through: the test uses arrays with 4 copies of each element value, and so may happen to hit a special case where mergesort is extra efficient.

The sort currently in firefox 2 may do comparatibly better of there are no copies of the values. I don't think the copies have much influence on the optimal insertion sort interval, as I think it is fairly unlikely that there are too many copies inside the small intervals.
Comment 58 Igor Bukanov 2006-10-05 14:07:22 PDT
(In reply to comment #57)
> Just a through: the test uses arrays with 4 copies of each element value, and
> so may happen to hit a special case where mergesort is extra efficient.
> 
> The sort currently in firefox 2 may do comparatibly better of there are no
> copies of the values. I don't think the copies have much influence on the
> optimal insertion sort interval, as I think it is fairly unlikely that there
> are too many copies inside the small intervals.

The sort must be dominated by the number of compare calls. Swaps and moves does not mater in the face of script function calls or even just C function invocation through a pointer. The merge sort has less compare calls so it must be better.

What would be interesting is to replace insertion sort over 4-element chunks by  the optimal 7-compare sort for chunks of 5. But that is for a different bug!
Comment 59 Brendan Eich [:brendan] 2006-10-05 15:03:59 PDT
Created attachment 241375 [details] [diff] [review]
patch I'm checking in

Mostly tw=78 and whitespace policing, a few comment tweaks.  Also, important fix to jsopcode.c to initialize ok in all paths leading to the new test of its value.

/be
Comment 60 Brendan Eich [:brendan] 2006-10-05 16:29:56 PDT
Fixed on trunk:

Checking in jsarray.c;
/cvsroot/mozilla/js/src/jsarray.c,v  <--  jsarray.c
new revision: 3.98; previous revision: 3.97
done
Checking in jsarray.h;
/cvsroot/mozilla/js/src/jsarray.h,v  <--  jsarray.h
new revision: 3.14; previous revision: 3.13
done
Checking in jsopcode.c;
/cvsroot/mozilla/js/src/jsopcode.c,v  <--  jsopcode.c
new revision: 3.190; previous revision: 3.189
done

Thanks to Thue and Nick (Nick, are you still reading mail?  Drop me a line if so. Thanks again).

/be
Comment 61 Nicholas Allen 2006-10-11 00:02:56 PDT
I'm glad to see this fix finally going in.  Thue, thanks for picking this up and driving it to resolution!

Brendan, I've tried to keep an eye on things but I've got a new employment agreement that requires moving mountains to even look at an active bug in the projects I used to work on.  It's frustrating at times but I can't really argue about the propriety.
Comment 62 Ken Johanson 2008-07-21 20:53:46 PDT
Has this fix not been committed? In UA:

Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.15) Gecko/20080623 Firefox/2.0.0.15

The sort is still not stable, sample code below should (but does not) output:
1 1
1 2
1 3
1 4
2 3
2 4
2 5
2 5
2 6

<html>
<script>
var input = [
	[2,5],
	[2,6],
	[2,3],
	[2,5],
	[2,4],
	[1,1],
	[1,3],
	[1,4],
	[1,2]
];
function compare(a, b, reverse) {
	a = a.comparable;
	b = b.comparable;
	var ret = a<b ? -1 : a>b ? 1 : 0;
	return reverse ? 0-ret : ret;
}
function sortTable(col, neg)
{
	var composite = [];
	for (var i=0; i<input.length; i++)
	{
		var data = new Object();
		data.lineitem = input[i];
		data.comparable = input[i][col];
		composite[composite.length] = data;
	}
	composite.sort(function(a,b,neg){return compare(a,b,neg);});
	var ret = [];
	var newrows = [];
	for (var i=0; i<composite.length; i++)
	{
		var lineitem = composite[i].lineitem;
		newrows[i] = [];
		for (var j=0; j<lineitem.length; j++)
		{
			newrows[i][j] = lineitem[j];
			if (j==0) ret[ret.length] = "<br>";
			else ret[ret.length] = "&nbsp;";
			ret[ret.length] = lineitem[j];
		}
	}
	input = newrows;
	document.getElementById("inf").innerHTML += "Sorting ["+col+"]:<br>"+ret.join("")+"<p>";
}
</script>
<body onload="sortTable(1,true);sortTable(0,true);">
<div id="inf"></div>
</body>
</html>
Comment 63 Bob Clary [:bc:] 2008-07-22 00:45:48 PDT
Ken, it was fixed in Firefox 3 not Firefox 2. Please attach testcases rather than clutter up bugs with html source. Thanks!

Note You need to log in before you can comment on or make changes to this bug.