Closed Bug 290592 Opened 15 years ago Closed 15 years ago

Array extras: forEach, indexOf, filter, map, some, every

Categories

(Core :: JavaScript Engine, enhancement)

enhancement
Not set

Tracking

()

VERIFIED FIXED
mozilla1.8beta2

People

(Reporter: shaver, Assigned: shaver)

References

Details

(Keywords: js1.5)

Attachments

(2 files, 6 obsolete files)

js> [1, 2, 4, 5].forEach(function (id, v) { print("[" + id + "]: " + v); });
[0]: 1
[1]: 2
[2]: 4
[3]: 5
js> [1, 2, 4].indexOf(2);
1
js> [1, 2, 4].indexOf(3);
-1
js>

Patch coming up.
Attached patch patch (obsolete) — Splinter Review
(Love to get these into 1.8b2/1.1a for developer previewing, I would!)
Attachment #180890 - Flags: review?(brendan)
Attached patch plus map (obsolete) — Splinter Review
I'll add filter on the next pass.  Also reordered the arguments to forEach's
lambda, to match map's (map callees will usually only care about the value, so
it is first).
Attachment #180890 - Attachment is obsolete: true
Attachment #180891 - Flags: review?(brendan)
I'm wondering whether the second argument for forEach is needed. Wouldn't 
'[].forEach(fun, optionalThis)' be the same as  	
'[].forEach.call(optionalThis, fun)' ?
Forget my last comment, it wouldn't be the same. Don't know what I was thinking.
Comment on attachment 180891 [details] [diff] [review]
plus map

>+        if (!IndexToId(cx, i, &id) ||
>+            !OBJ_GET_PROPERTY(cx, obj, id, &v)) {
>+            return JS_FALSE;
>+        }

Nit: instead of four lines with braces, how about:

	if (!IndexToId(cx, i, &id))
	    return JS_FALSE;
	if (!OBJ_GET_PROPERTY(cx, obj, id, &v))
	    return JS_FALSE;

>+        if (v == argv[0]) {
>+            *rval = INT_TO_JSVAL(i);
>+            return JS_TRUE;
>+        }

You want === here, so you need to equate -0 and 0, prevent NaN === NaN, and
compare strings, doubles, and ints carefully.  See NEW_EQUALITY_OP in
jsinterp.c.  That macro even handles non-canonized doubles that should have
been stored as int jsvals, just in case some native method fails to use
JS_NewNumberValue, and uses JS_NewDoubleValue or JS_NewDouble.

>+    if (JSVAL_IS_FUNCTION(cx, argv[0])) {
>+        funobj = JSVAL_TO_OBJECT(argv[0]);
>+    } else {
>+        JSFunction *fun = js_ValueToFunction(cx, &argv[0], 0);
>+        if (!fun)
>+            return JS_FALSE;
>+        funobj = fun->object;

Need to set the argv[0] local root here (to OBJECT_TO_JSVAL(funobj)).

>+    }
>+
>+    if (argc > 1) {
>+        if (!js_ValueToObject(cx, argv[1], &thisp))
>+            return JS_FALSE;

You might want to set argv[1] as a local root, even if there are currently no
more GC-thing allocations after this and before thisp is pushed on the new sta
>+    } else {
>+        thisp = OBJ_GET_PARENT(cx, funobj);
>+    }

You've just exposed Call objects.  This won't fly with ECMA, so use funobj's
global (as my patch to Function.prototype.call for bug 290488 does):

	while ((tmp = OBJ_GET_PARENT(cx, obj)) != NULL)
	    obj = tmp;

>+    /* We call with 2 args, value and index, plus room for rval. */
>+    origsp = js_AllocStack(cx, 2 + 3, &mark);
>+    if (!origsp)
>+        return JS_FALSE;
>+
>+    /* Lift current frame to include our args. */
>+    fp = cx->fp;
>+    oldsp = fp->sp;
>+
>+    for (i = 0; i < len; i++) {
>+        jsid id;
>+        jsval v;
>+
>+        if (!IndexToId(cx, i, &id) ||
>+            !OBJ_GET_PROPERTY(cx, obj, id, &v)) {
>+            ok = JS_FALSE;
>+            break;
>+        }

Again, avoid multi-line conditions, braces, and explicit JS_FALSE assignment to
ok with:

	ok = IndexToId(cx, i, &id);
	if (!ok)
	    break;
	ok = OBJ_GET_PROPERTY(cx, obj, id, &v);
	if (!ok)
	    break;

It uses one more line, but it's clearer and cleaner.

Similar comments apply to map, and because the functions are so similar, it
would be best to write a common subroutine controlled by a flag.  Especially if
you are adding filter, not to mention some, every, notAny, notEvery, ....

/be
The functionification of js_StrictlyEqual should help mitigate the codesize
cost of these tasty morsels, but I haven't done any measurements yet.
Attachment #180891 - Attachment is obsolete: true
Attachment #180957 - Flags: review?(brendan)
A brief demonstration!  (I'll update the new wiki-homed JS language reference
once this lands.)

js> [1, 4, 9].map(Math.sqrt);
1,2,3
js> [1, 2, 3, 4].filter(function (v) { return v % 2; })
1,3
js> [1, 2, 3, 4].some(function (v) { return v > 3; })
true
js> [1, 2, 3, 4].every(function (v) { return v > 3; })
false
js> [1, 2, 3].forEach(function (v, i) { print(i + ": " + v); });
0: 1
1: 2
2: 3
js> [1, 2, 3, 5].indexOf(5);
3
js>
Status: NEW → ASSIGNED
Flags: blocking1.8b2?
OS: Linux → All
Hardware: PC → All
Summary: [].forEach(fun, optionalThis) and [].indexOf(maybeMember) → Array extras: forEach, indexOf, filter, map, some, every
Target Milestone: --- → mozilla1.8beta2
Tasty; reviewing in a second.

/be
Severity: normal → enhancement
Keywords: js1.5
Is this just a Mozilla extension to JS just for Mozilla-specific scripts, a
Mozilla extension to JS that other vendors are supposed to find out about and
implement, or part of an existing or developing standard?
If it bears out in Mozilla-specific use, I'd be more than willing to submit it
to ECMA, should they ever want to produce another revision of 262 that isn't the
JS2-grade overhaul.  I'm not sure what their thinking is there, to be honest.

(I would hope to see these methods on JS2's array objects too, of course.)
Comment on attachment 180957 [details] [diff] [review]
comments addressed plus: filter, some, every

One bug, a couple of code-size ideas, and some very minor nits.  r=me with the
bug fixed.

/be

>+JSBool
>+js_StrictlyEqual(jsval lval, jsval rval, JSBool ifNaN)

Bug: now that you've made a === function, ifNaN is a false degree of freedom. 
Eliminate it and its macro-param counterparts, always passing JS_FALSE to
JSDOUBLE_COMPARE below.

>+{
>+    jsval ltag = JSVAL_TAG(lval), rtag = JSVAL_TAG(rval);
>+    jsdouble ld, rd;

Nit: extra newline here.

>+    if (ltag == rtag) {
>+        if (ltag == JSVAL_STRING) {
>+            JSString *lstr = JSVAL_TO_STRING(lval),
>+                *rstr = JSVAL_TO_STRING(rval);

Nit: make the declarators line up.

>+            return js_CompareStrings(lstr, rstr) == 0;
>+        }
>+        if (ltag == JSVAL_DOUBLE) {
>+            ld = *JSVAL_TO_DOUBLE(lval);
>+            rd = *JSVAL_TO_DOUBLE(rval);
>+            return JSDOUBLE_COMPARE(ld, ==, rd, ifNaN);

Bug, continued: the JSOP_NEW_NE case in js_Interpret when operating on two NaNs
will pass JS_TRUE for ifNaN to this function, which will evaluate (on Windows,
with broken MSVC -- did VS.NET fix this bug?) to JS_TRUE explicitly via the ?:
(see jsnum.h).	Then NEW_EQUALITY_OP will set cond to JS_TRUE != JS_TRUE or
false, or (0/0 !== 0/0) === false in JS, or 0/0 === 0/0 -- which violates the
NaN invariant that NaN != any number including any NaN.

>+++ jsinterp.h	17 Apr 2005 17:14:48 -0000
>@@ -299,6 +299,7 @@ js_CheckRedeclaration(JSContext *cx, JSO
> extern JSBool
> js_Interpret(JSContext *cx, jsbytecode *pc, jsval *result);
> 
>-JS_END_EXTERN_C
>+extern JSBool
>+js_StrictlyEqual(jsval lval, jsval rval, JSBool ifNaN);
> 
> #endif /* jsinterp_h___ */

Nit: declare this in the same order as its definition in jsinterp.c, right
after js_CheckRedeclaration.

>+    if (mode == MAP || mode == FILTER) {
>+        newlen = mode == MAP ? len : 0;

Nit: ken and dmr would overparenthesize the condition, so I would too.

>+    for (i = 0; i < len; i++) {
>+        jsid id;
>+        jsval v, rval2;

Nit: extra newline here.

>+#define CHECK_RETURN                                                          \
>+        if (rval2 == JSVAL_NULL) {                                            \
>+            b = JS_FALSE;                                                     \
>+        } else if (JSVAL_IS_BOOLEAN(rval2)) {                                 \
>+            b = JSVAL_TO_BOOLEAN(rval2);                                      \
>+        } else {                                                              \
>+            ok = js_ValueToBoolean(cx, rval2, &b);                            \
>+            if (!ok)                                                          \
>+                goto out;                                                     \
>+        }
>+
>+        switch (mode) {
>+          case FOREACH:
>+            break;
>+          case MAP:
>+            ok = OBJ_SET_PROPERTY(cx, newarr, id, &rval2);
>+            break;

All modes but FOREACH and MAP call CHECK_RETURN, so get rid of that macro and
just do it before the switch if mode > MAP (you control the order of
enumerators -- comment that enum to say "order is critical, see ...").

>+          case FILTER:
>+            CHECK_RETURN;
>+            if (!b)
>+                break;
>+            /* Filter passed v, push as result. */
>+            ok = IndexToId(cx, newlen++, &id);
>+            if (!ok)
>+                break;

goto out, you've got the label already.

>+            ok = OBJ_SET_PROPERTY(cx, newarr, id, &v);

move the if (!ok) break test from after the switch to here, but of course goto
out if !ok.

>+            break;
>+          case SOME:
>+            CHECK_RETURN;
>+            if (b) {
>+                *rval = JSVAL_TRUE;
>+                goto out;
>+            }
>+            break;
>+          case EVERY:
>+            CHECK_RETURN;
>+            if (!b) {
>+                *rval = JSVAL_FALSE;
>+                goto out;
>+            }
>+            break;
>+        }
>+
>+        if (!ok)
>+            break;

Lose this, it's unnecessary except in the FILTER case.

Nit: should the "extras" naming cite Lisp precedent?  Hmm, not for indexOf.

>+#define JS_HAS_ARRAY_EXTRAS     1       /* has indexOf, forEach, map, filter */

Or at least this comment, since it can't exhaustively list all the functions
anyway.
Attachment #180957 - Flags: review?(brendan) → review+
Flags: blocking1.8b2? → blocking1.8b2+
shaver: k, cool.
(In reply to comment #9)
> Is this just a Mozilla extension to JS just for Mozilla-specific scripts, a
> Mozilla extension to JS that other vendors are supposed to find out about and
> implement, or part of an existing or developing standard?

Hixie knows this, but I'd like to cite it for all and sundry.  From ECMA-262
Edition 3 Section 16:

An implementation may provide additional types, values, objects, properties, and
functions beyond those described in this specification. This may cause
constructs (such as looking up a variable in the global scope) to have
implementation-defined behaviour instead of throwing an error (such as
ReferenceError).

JS native object method innovation stopped in 1997 or so.  There was no reason
it should have stopped, except for the browser wars stalemating JS1 via the ECMA
standard, modulo a few design-by-committee-compromised innovations in Edition 3.

I'm happy shaver took up the cause here after I said something about "add
indexOf as a built-in" on IRC, in response to bryner asking about that missing
method.

ECMA TG1 is active again.  I'll certainly propose these methods when the time is
right, but it'll be a battle to get them into anything not Edition 4, and it may
be that Edition 4 focuses more on classes and packages than on filling native
object method gaps.

/be
Moving brendan's r forward, hoping for an a!
Attachment #180957 - Attachment is obsolete: true
Attachment #180966 - Flags: review+
Attachment #180966 - Flags: approval1.8b2?
(brendan knows this too, but just to clarify in case people misinterpreted what
I asked earlier: I wasn't suggesting there was anything wrong with extending JS.
I was just asking what the status of this was, for information and so that I
could advise other vendors, such as my employer, as to the status of these
extensions. There's no controversy here, before anyone posts to the forums or
something...)
Comment on attachment 180966 [details] [diff] [review]
For commit, nits picked.

>+#if JS_HAS_ARRAY_EXTRAS

More nits, can you stand it?  Extra newline here in jsarray.c, to match other
#if JS_HAS_... / #endif usage.

>+/* Order is important; extras which use a caller's predicate must follow MAP.*/

s/which/that/ and put a space after the period.

Net code size reduction?

a=me for 1.8b2.

/be
Attachment #180966 - Flags: approval1.8b2? → approval1.8b2+
Writing docs at
http://developer-test.mozilla.org/docs/Core_JavaScript_1.5_Reference:Objects:Array#Methods
made me realize that I wanted the array object available to the callback
(though it can be captured in a closure easily by most callees), and for
indexOf to support a fromIndex parameter to match String.prototype.indexOf.

Recklessly reusing this bug for a patch to remedy those.
Attachment #180980 - Flags: review?(brendan)
Attachment #180980 - Flags: approval1.8b2?
Comment on attachment 180980 [details] [diff] [review]
fromIndex on indexOf, pass array object as third argument to callback

>+    if (argc == 0)
>+        goto not_found;

Could avoid this by using a jsdouble start (better name for d, at function body
scope, and set it to 0 before the if (argc > 1) {...} block revised as follows:

>-    for (i = 0; i < len; i++) {
>+    if (argc == 1) {

    start = 0;
    if (argc > 1) {
>+        if (!js_ValueToNumber(cx, argv[1], &d))
>+            return JS_FALSE;
>+        d = js_DoubleToInteger(d);
>+        if (d < 0) {
>+            d += len;
>+            if (d < 0)
>+                d = 0;
>+        } else if (d > len) {
>+            d = len;
>+        }
>+        i = (jsuint)d;

s/d/start/g and note behavior matches slice, Python, etc. -- not
String.prototype.indexOf (boo to that for lamely following Java) -- by
interpreting negative start from end of string, instead of just clamping to 0.

>+    for (; i < len; i++) {

    for (i = (jsuint)start; i < len; i++) {
>+ not_found:
>     *rval = INT_TO_JSVAL(-1);
>     return JS_TRUE;

Could explicitly common the *rval = ...; return true code _a la_
String.prototype.indexOf, using another jsuint index = -1 before-hand, and by
setting index = i when found, and breaking, instead of returning, in the for
loop body.  Question: should no args mean nothing found, or should (as the
arity property Array.prototype.indexOf.length specifies) calling with no args
mean "return first index of undefined"?  I think both code and spec are simpler
with the latter, and users won't go wrong with the simpler spec.

/be
So tasty.

(I thought that starting at |length| rather than |length - 1| was a bit odd,
since |length| is known to be undefined for all arrays, but that sure is what
the String version does.  I can swing either way, at my esteemed reviewer's
discretion.)
Attachment #180980 - Attachment is obsolete: true
Attachment #181290 - Flags: review?(brendan)
Attachment #181290 - Flags: approval1.8b2?
Comment on attachment 181290 [details] [diff] [review]
review comments addressed, plus lastIndexOf

> #if JS_HAS_ARRAY_EXTRAS
>+typedef enum IndexOfDirection {
>+    FORWARD,
>+    BACKWARD
>+} IndexOfDirection;

I keep expecting your plain-named enumerators to collide with some Windows
macro or some-such, but more power to you if they don't.

>+
> static JSBool
>-array_indexOf(JSContext *cx, JSObject *obj, uintN argc, jsval *argv,
>-               jsval *rval)
>+array_indexOf_inner(JSContext *cx, JSObject *obj, uintN argc, jsval *argv,

Tradition here in jsarray.c would call this array_indexOf_sub.	_helper is en
vogue over in jsxml.c, but I say tradition wins.

>+                    jsval *rval, IndexOfDirection dir)
> {
>-    jsuint len, i;
>-    
>+    jsuint len, i, index = -1, end, step;

I would not hide index's initialization here in the declaration forest, but put
it down below, immediately before the for loop (about which, more in a minute).

>+    jsdouble start;
>+
>     if (!js_GetLengthProperty(cx, obj, &len))
>         return JS_FALSE;
>+
>+    if (dir == FORWARD) {
>+        start = 0;
>+        end = len;
>+        step = 1;
>+    } else {
>+        start = len;
>+        end = -1;
>+        step = -1;
>+    }

So instead of an enum and random logic in the helper (er, sub), why not make
these parameters?

>+    for (i = (jsuint)start; i != end; i += step) {
>         jsid id;
>         jsval v;
> 

It isn't right to index at the array length.  jsstr.c's str_lastIndexOf is
careful not to index at text[i + j] for i + j >= textlen, but intentionally
starts i at textlen (without a second arg) in order to make
"abcd".lastIndexOf("") return 4 (empty string match at end of string, and
before every character in the string).

Array.prototype.lastIndexOf is different, since there is no analogue for the
empty string.

In light of this, I would write array_lastIndexOf separately, with the right
boundary conditions, and then see whether there was enough code to make a
common subroutine worthwhile.

/be
I also changed the extras to use |length| instead of |len|, to better fit the
rest of that file (which also uses |length| where I might expect |index|, but
that's for another day).
Attachment #181290 - Attachment is obsolete: true
Attachment #181420 - Flags: review?(brendan)
Attachment #180980 - Flags: review?(brendan)
Attachment #180980 - Flags: approval1.8b2?
Attachment #181290 - Flags: review?(brendan)
Attachment #181290 - Flags: approval1.8b2?
Flags: blocking1.8b2+ → blocking1.8b3+
I found an unpleasant bug in which operating on an empty array returned a void
value.	I've fixed this so that it returns an empty array for map/filter, and
false for some/every.

I also made the behaviour of [].map() consistent with [].map("string"), which
is to say that the former now correctly errors out complaining that undefined
is not a function, which indeed it is not.

Best array extras ever, and fixed a bug that was busting my Lightning agenda
hacking.
Attachment #181420 - Attachment is obsolete: true
Comment on attachment 182500 [details] [diff] [review]
As above, with fixed behaviour for operations on an empty array and with no arguments

>+        newlen = (mode == MAP ? length : 0);

Parens around the condition, not the whole RHS of assignment, and r+a=me.

/be
Attachment #182500 - Flags: review?(brendan)
Attachment #182500 - Flags: review+
Attachment #182500 - Flags: approval1.8b2+
Flags: blocking1.8b2+
Flags: testcase?
/cvsroot/mozilla/js/tests/js1_5/Array/array-002.js,v  <--  array-002.js
initial revision: 1.1
Flags: testcase? → testcase+
OS: All → Windows 98
Hardware: All → PC
woah, I don't know what I did to change the OS/Platform fields...
OS: Windows 98 → All
Hardware: PC → All
Attachment #181420 - Flags: review?(brendan)
Attachment #180891 - Flags: review?(brendan)
Attachment #180890 - Flags: review?(brendan)
Fixed, though the docs on Devmo could use some updating.
Status: ASSIGNED → RESOLVED
Closed: 15 years ago
Resolution: --- → FIXED
I'll work on the docs.
Blocks: 305002
The test for this regressed on 10/4 branch and trunk 

Testcase js1_5/Array/array-002.js failed Bug Number 290592
[ Top of Page ]
STATUS: Array extras: forEach, indexOf, filter, map
Failure messages were:
FAILED!: Array.forEach: mutate
FAILED!: Expected value 'hello,mutated,undefined,', Actual value 'hello,mutated,'
FAILED!:
FAILED!: Array.forEach on sparse array
FAILED!: Expected value 'undefined,undefined,sparse,', Actual value 'sparse,'
FAILED!:
FAILED!: every element is a string
FAILED!: Expected value 'false', Actual value 'true'
FAILED!:
FAILED!: every element is a string, via object callback
FAILED!: Expected value 'false', Actual value 'true'
FAILED!: 
This test is now js1_6/Array/regress-290592.js

verified fixed.
Status: RESOLVED → VERIFIED
Blocks: es5
Depends on: 507904
No longer depends on: 507904
You need to log in before you can comment on or make changes to this bug.