Closed Bug 255555 Opened 20 years ago Closed 19 years ago

Broken array comparison logic (ECMA-262 Ed. 3 15.4.4.11) iloop

Categories

(Core :: JavaScript Engine, defect)

defect
Not set
normal

Tracking

()

VERIFIED FIXED
mozilla1.8beta3

People

(Reporter: kemin.zhou, Assigned: mrbkap)

References

()

Details

(Keywords: hang, js1.5)

Attachments

(2 files, 1 obsolete file)

User-Agent:       Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.2) Gecko/20040803
Build Identifier: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.2) Gecko/20040803

I have an array of array ( a table 2066 rows, 17 columns)
when envoking the sort function tab.sort(arrorder) it never finished the sorting
job.  
The sort function is defined as follows
function arrorder(a, b) {
   if (counter++%1000 == 0) {
      d.write("Comparing " + a[colidx] + " x " + b[colidx] + " " + counter);
   }
   if (a[colidx] < b[colidx]) return -1;
   else if (a[colidx] > b[colidx]) return 1;
   else return 0;
}
The colidx is a global variable to tell which column to sort
the cocument is for debug output,
The probram stop at the 38,000 comparison and never get out of the sort function.

This same script can finish perfectly in IE on windows but not on mozilla on linux.




Reproducible: Always
Steps to Reproduce:
1. Read in the global tab javascript object from the HTML table
2. Sort(arrorder)
3. Stop

Actual Results:  
The browser went to sleep and the debuging window hang with the mouse pointer
shown as timer.  

Expected Results:  
Finish sorting

Here is the tab loading function
It loads from the current web page with a table with 2066 rows and 17 columns 
with text info in each cell.

function readTable(htmltab) {
   var rows = htmltab.rows;
   alert("table has " + rows.length + " rows");
   var nrow = rows.length;
   var cells = rows[0].cells;
   var nfields = cells.length;
                                                                                
   //var w = window.open("", "Debug",
"status,height=500,width=800,resizable,scrollbars");
   //var d = w.document;
   //d.write("<h1>Convert HTML table to JavaScript Object</h1>");
                                                                                
   tab = new Array(nrow);
   for (var i=1; i<nrow; i++) {
   //for (var i=1; i<100; i++) {
      //d.write("working on row: " + (i-1) + "<br>\n");
      var jsr=tab[i-1] = new Array(nfields);
      cells = rows[i].cells;
      for (var j=0; j<nfields; j++) {
         //d.write("column: " + j + " ");
         var cc=cells[j].firstChild;
         if (cc.nodeType == 3) {
            jsr[j] = cc.nodeValue;
         }
         else if (cc.nodeType == 1) {
            jsr[j] = cc.toString();
         }
         else {
            alert("cell nodeType: " + cc.nodeType + " not supported in readtable
()");
         }
      }
   }
}
Need a complete testcase, please.  If arrorder returns inconsistent values, the
sort may not complete.

1.  Attach a complete testcase.
2.  Log the arrorder return value, if possible, and see whether it is consistent.

/be
I think I know where the problem is here: arrorder is called with undefined as 
an argument because the readTable function leaves the last element of the |tab| 
array undefined: 

tab = new Array(nrow);
for (var i=1; i<nrow; i++) {
   var jsr=tab[i-1] = new Array(nfields);

tab[nrow-1] will still be undefined after the loop. 

Now when you call ntab.sort(arrorder), Mozilla will at some point pass undefined 
as the value for a or b, arrorder will try to access undefined[colidx] which 
ends execution.

If I read the ECMA 15.4.4.11 correctly, arrorder should be safe to assume that 
its arguments will never be undefined. This is true for both Opera and IE.

Quick testcase (see URL):
javascript:[ 1, 2, undefined ].sort(function(a,b) { if (a == undefined || b == 
undefined) alert("undefined passed"); return a-b; })
Erik, thanks for digging into this.  Note that ECMA-262 Edition 3 15.4.4.11 does
not say that undefined is never passed to the comparefn, in fact it specifies
how undefined and absent properties are sorted in penultimate NOTE.

The problem here is the bug you identified, that the tab Array is dimensioned
one too long, or else (likelier) the loop condition should be i <= nrow.

Given this script bug, the rest follows.  When arrorder is called with two
arguments, one of which is undefined, it is not a consistent comparison function
by the definition in 15.4.4.11, because undefined converts to NaN when used with
< and >, and Nan is neither less than nor greater than any other number.  So the
implementation may behave in an unspecified fashion, per the spec.

Should we behave better?  Yes, if the bad behavior is an iloop.  Does IE do the
right thing?  Not clear: in the old days, IE (and MSVC) had NaN bugs and didn't
follow IEEE-754 and ECMA-262 properly in all cases dealing with non-finite
values.  Someone might test more to find out, but the issue here is how to fix
jsarray.c.  Confirming.

/be
Status: UNCONFIRMED → NEW
Ever confirmed: true
Er, I meant: undefined[colidx] is a reference error, which causes arrorder to
fail, which again makes it not a consistent comparison function by ECMA-262
Edition 3.  But we should not iloop.

Nallen, I owe you much for your patience in getting the new sort implementation
into SpiderMonkey (bug 224128 -- I will make time to review that patch this
week).  Is there any chance you would take this bug too?

/be
Keywords: hang
Summary: sort(orderfunction) broken when arrray has 2066 elements → Inconsistent array comparison function (ECMA-262 Ed. 3 15.4.4.11) iloop
If the new sort implementation doesn't already fix this, I'll make it happen.
Assignee: general → nallen
Brendan, is there an errata for this section that I am not aware of? I looks to 
me as if steps 3-5 and 10-12 of SortCompare make sure comparefn is never called 
with undefined as an argument:

3. If this object does not have a property named by Result(1), and this object 
does not have a property named by Result(2), return +0.
4. If this object does not have a property named by Result(1), return 1.
5. If this object does not have a property named by Result(2), return –1.
(...)
10. If x and y are both undefined, return +0.
11. If x is undefined, return 1.
12. If y is undefined, return −1.
13. If the argument comparefn is undefined, go to step 16.
14. Call comparefn with arguments x and y.
15. Return Result(14).

This here is a case where SortCompare is called, isn't it?
Erik: you are right, I am wrong -- I should have read that part!  It looks like
sort_compare in jsarray.c was never changed to track ECMA.  Going back to
revision 3.1 shows the same bug: special cases for undefined properties are done
only if no comparefn is passed.  D'oh!

Nallen: thanks for taking this.

/be
This bug is ancient, it predates ECMA-262's first edition and probably goes back
to the dawn of time (Spring 1995).  No one caught it until now.  Thanks again, Erik!

/be
Summary: Inconsistent array comparison function (ECMA-262 Ed. 3 15.4.4.11) iloop → Broken array comparison logic (ECMA-262 Ed. 3 15.4.4.11) iloop
js/tests/js1_5/Array/regress-255555.js checked in.
Any hope of getting this in for 1.8b2?

/be
Keywords: js1.5
Attached patch patch v1 (obsolete) — Splinter Review
This seems to work. It simply follows what the spec says to do in the face of
holes and undefined values.
Assignee: nallen → mrbkap
Status: NEW → ASSIGNED
Attachment #185626 - Flags: review?(brendan)
Comment on attachment 185626 [details] [diff] [review]
patch v1

> /* 2^32 - 1 as a number and a string */
> #define MAXINDEX 4294967295u
> #define MAXSTR   "4294967295"
>+/* A useful value for identifying a hole in an array */

How about one empty line before this comment, to set it off from the MAXINDEX
and MAXSTR group?

>+#define JSVAL_HOLE 0x16

BOOLEAN_TO_JSVAL(2) is better, sorry I didn't write that on the whiteboard ;-).

> static JSBool
> InitArrayElements(JSContext *cx, JSObject *obj, jsuint length, jsval *vector)
> {
>     jsuint index;
>     jsid id;
> 
>     for (index = 0; index < length; index++) {
>+        jsval v = vector[index];
>+        if (v == JSVAL_HOLE)
>+            continue;

Comment on this change being only for array_sort, but done here until the
interest of minimal code size increase.

Uber-nit: I typed this code too, but old style in js/src/*.c was to hoist
declarations to the function top-level body-block.  Not sure we shouldn't break
with tradition, but it's still mostly observed.  Your call.

> static int
> sort_compare(const void *a, const void *b, void *arg)
> {
>     jsval av = *(const jsval *)a, bv = *(const jsval *)b;
>     CompareArgs *ca = (CompareArgs *) arg;
>     JSContext *cx = ca->context;
>     jsdouble cmp = -1;
>-    jsval fval, argv[2], rval;
>+    jsval fval, argv[2], rval, special = JSVAL_NULL;
>     JSBool ok;
> 
>     fval = ca->fval;
>-    if (fval == JSVAL_NULL) {
>+
>+    /* 
>+     * By ECMA 262, 15.4.4.11 existance of the property takes precedence
>+     * over an undefined property
>+     */
>+    if (av == JSVAL_HOLE || bv == JSVAL_HOLE) {
>+        special = JSVAL_HOLE;
>+    }
>+    else if (av == JSVAL_VOID || bv == JSVAL_VOID) {
>+        special = JSVAL_VOID;
>+    }

Thought: instead of initializing special at its declaration, set it in a final
else clause here.

Nit (this one I'd fix): we don't overbrace single-statement then and else
clauses in js/src/*.c, unless the corresponding else or then (if present) is a
compound statement or looks like one because it's multi-line (in which case, "}
else {" is the style, after K&R C and most old Unix kernel source).

>+
>+    if (special != JSVAL_NULL) {
>+        if (av == bv)
>+            cmp = 0;
>+        else if (av != special)
>+            cmp = -1;
>+        else
>+            cmp = 1;

Nice, no extra braces! ;-)

> /* XXXmccabe do the sort helper functions need to take int?  (Or can we claim
>  * that 2^32 * 32 is too large to worry about?)  Something dumps when I change
>  * to unsigned int; is qsort using -1 as a fencepost?
>+ * XXX Wasn't qsort dumped a while ago? Is this a real concern?

Yeah, don't add to the XXX stale mystery madness, just remove this comment. 
It's another proof that the FIXME: <bug #> convention that nat@nat.org has
written about is better than the old XXX convention.

Fix these and I'll give a quick look and r+a= -- thanks for taking this one on
too -- fun bug, JSVAL_HOLE is truly righteous.

/be
Attached patch patch v2Splinter Review
It turns out that I'd managed to slip both ways of not plugging holes
incorrectly into that patch. JS_ASSERT() is sufficient in the face of newlen, I
think.
Attachment #185626 - Attachment is obsolete: true
Attachment #185638 - Flags: review?(brendan)
Comment on attachment 185638 [details] [diff] [review]
patch v2

Argh, we had just talked about that earlier today. Thanks for catching it. 
r+a=me.

/be
Attachment #185638 - Flags: review?(brendan)
Attachment #185638 - Flags: review+
Attachment #185638 - Flags: approval1.8b3+
Attachment #185626 - Flags: review?(brendan)
Looks good.  I tried running this today.  This had been hung up because of
problems with the other bug.
Fix checked in.
Severity: blocker → normal
Status: ASSIGNED → RESOLVED
Closed: 19 years ago
OS: Linux → All
Hardware: PC → All
Resolution: --- → FIXED
Target Milestone: --- → mozilla1.8beta3
Oops, thought of this last night as I was trying to get to sleep:

js> b = []

js> b[0] = undefined
js> b[2] = "hi"
hi
js> b.sort()
hi,,hi
js>

Also, "existence" misspelled.  Patch next.

/be
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Attached patch followup fixSplinter Review
Attachment #185693 - Flags: review?(mrbkap)
Attachment #185693 - Flags: approval1.8b3+
Comment on attachment 185693 [details] [diff] [review]
followup fix

I get it! r=me.
Attachment #185693 - Flags: review?(mrbkap) → review+
Followup fix checked in.

/be
Status: REOPENED → RESOLVED
Closed: 19 years ago19 years ago
Resolution: --- → FIXED
v 1.8 and later.
Status: RESOLVED → VERIFIED
Flags: testcase+
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: