Closed Bug 486356 Opened 12 years ago Closed 10 years ago

Type Specialized Arrays

Categories

(Core :: JavaScript Engine, defect)

defect
Not set
normal

Tracking

()

RESOLVED DUPLICATE of bug 566767

People

(Reporter: robarnold, Assigned: robarnold)

References

Details

Attachments

(1 file, 1 obsolete file)

Now that we have dense arrays for various index patterns, let's specialize the values stored so that we can avoid boxing/unboxing doubles and ints for arrays.

Attack plan:
Add a new JSClass for raw arrays (actually rename js_ArrayClass to js_DenseObjectArrayClass) - this class would be a dense array (reusing much of the existing code).
New arrays are integer arrays, these promote to double arrays or dense object arrays as needed.

Design points (feedback appreciated):
* Storage location: reinterpret_cast dslots and teach the gc not to touch dslots
* Type information: Can we still a bit or two from capacity? Upper or lower? If I can get two bits, is a separate JSClass needed?
* Public interface: Tracing will want to be able to set/get integers/doubles from arrays and check the array type and...anything else?
We still need dense generic (jsval element type) arrays, so js_DenseArrayClass is the right name, eh?

/be
The GC already doesn't touch dslots except as instructed to by the mark hook defined for dense arrays.

A separate JSClass will let us replace the current denseness guard, rather than add another one.  We want to preserve the ability to have dense AnyType arrays as well.

We have an existing "get doubles from array" API hook that canvas uses, we could extend it and combine with a type check to let native code get efficient access to (especially) the doubles case.  The tracer should be poking directly into the structures via inlined LIR, and not using the public API.
Mid-aired with shaver, recomposing slightly:

An idea that seems very much worth thinking about is sharing dense array goodness with objects having named properties (where access is optimized differently), so we are not stuck with either-or.

Lua does this, and Jason and I have talked about it in terms of extending stored 32-bit shapes to have an 8 bit "array tag". This tag would overlay the property-cache four-bit scope/proto-chain coordinates (see jsinterp.h PCVCAP_TAG*). 8 bits is enough to express "array of shape to the left, or else one of the machine-typed primitives, or of JSString or other jsval primitives".

How would this help? We should also get rid of a map per dense array, which is killing us right now (see bug 486321). Instead we would want shape to be in a dedicated object field, right after classword in the current world (we might get to a different world but let's do it with patches).

To unify fully, we would have an aslots optional field of JSObject, in addition to dslots, and probably void*. It would be scanned by the GC at the discretion of the class, as is done today even (see array_trace).

If you take this slightly longer road, I think you avoid higher-overhead steps and dead-end traps along the way. Having more than one *ArrayClass is one trap. The ongoing lack of array/object unification is another, bigger than this bug and of course not a good precedent -- but let's not have more, and let's try to fix it here and now. Or soon, anyway.

/be
I got the interpreter seemingly to work with native integer and double arrays (promotion included) except for Array.concat because I'm having trouble thinking up a good implementation.

At the beginning of the function, it checks if "this" is a dense array and then proceeds to copy it to a new object and then pops it off the argument vector. Is this an optimization that is worth doing for specialized arrays? (currently this crashes and dies a horrible death when "this" is an array with doubles or even integers) I can add a templated clone function to do this efficiently I think. I don't want to promote this" to a dense array so that this code will work as is. Sound good?

Also, it is worth looking into speeding up the rest of concat for the case when the arrays are all the same type? The conversion to jsval from int/double could be expensive for specialized arrays going through concat. The conversion isn't lossy however so concat-ing a bunch of int arrays should give an integer array (s/int/double/g is true as well).
IMO, concat (and splice and other array ops) should do their best to keep things in the native format as much as possible... so if you're calling concat on two int or double arrays, that should just extend + memcpy, no?
I just tried the current version of the patch (at http://hg.mozilla.org/users/robarnold_cmu.edu/array_spec/file/9f064e755f1b/array-spec) and it crashes on browser startup for me (or rather crashes in opt builds, fatally asserts in debug builds).  The code it's running is this part of httpd.js:

145 const HTTP_ERROR_CODES = array2obj(range(400, 417).concat(range(500, 505)));

There are at least two problems here:

1)  The immediate reason for the assert is that the 
    ArraySpec<unsigned int>::copyFromVector call we end up on from
    InitArrayObject, called from js_NewArrayObject, called from array_concat
    mutates the class of the object.  In particular, in InitArrayObject we
    dispatch on the class and end up in the unsigned int specialization, but
    the |fromJSVal(v, &getRawValue(obj, idx))| call in
    ArraySpec<unsigned int>::storeValue returns false, so we promote from a
    specialized array to a dense one.  But we're still in the unsigned int
    specialization of copyFromVector, so our next storeValue call that ends up
    having to promote hits a fatal assert like so:

#0  JS_Assert (s=0x4013a0 "OBJ_IS_SPECIALIZED_ARRAY(BOGUS_CX, obj)", file=0x401360 "/Users/bzbarsky/mozilla/tracemonkey/mozilla/js/src/jsarray.h", ln=125) at /Users/bzbarsky/mozilla/tracemonkey/mozilla/js/src/jsutil.cpp:69
#1  0x0027dc13 in js_SpecArrayType (obj=0x14b849e0) at jsarray.h:125
#2  0x0027df3f in ArraySpecConversion<double, unsigned int>::updateObject (obj=0x14b849e0) at /Users/bzbarsky/mozilla/tracemonkey/mozilla/js/src/jsarray.cpp:270
#3  0x00288b41 in ArraySpec<unsigned int>::promoteTo<double> (cx=0x104fc00, obj=0x14b849e0) at /Users/bzbarsky/mozilla/tracemonkey/mozilla/js/src/jsarray.cpp:160
#4  0x0028943a in ArraySpec<unsigned int>::promoteToHoldValue (cx=0x104fc00, obj=0x14b849e0, idx=2, v=402) at /Users/bzbarsky/mozilla/tracemonkey/mozilla/js/src/jsarray.cpp:195
#5  0x0028950a in ArraySpec<unsigned int>::storeValue (cx=0x104fc00, obj=0x14b849e0, idx=2, v=402) at jsarraytemplates.h:111
#6  0x00289560 in ArraySpec<unsigned int>::copyFromVector (cx=0x104fc00, obj=0x14b849e0, start=0, count=18, vector=0x8a4194) at /Users/bzbarsky/mozilla/tracemonkey/mozilla/js/src/jsarray.cpp:218

2)  The reason there's any promoting going on at all is that in InitArrayObject
    things look like this:

(gdb) frame 7 
#7  0x00286045 in InitArrayObject (cx=0x104fc00, obj=0x14b849e0, length=18, vector=0x8a4194, holey=0) at /Users/bzbarsky/mozilla/tracemonkey/mozilla/js/src/jsarray.cpp:1857
(gdb) p vector
$13 = (jsval *) 0x8a4194
(gdb) x/18d vector
0x8a4194:       400     401     402     403
0x8a41a4:       404     405     406     407
0x8a41b4:       408     409     410     411
0x8a41c4:       412     413     414     415
0x8a41d4:       416     417

So someone passed in an integer-specialized array's dslots as the jsval* arg to js_NewArrayObject (and that someone is array_concat, to be precise).  Then we treat those values as jsvals, decide that 400 is an JSObject, promote from specialized to dense array and store it as a jsval, get to 401, we're still running inside the integer specialization, so we just store it in our array as 400, then get to 402, think it's a double, try to promote again and hit the above assert stack.
Note that with the latest change to the patch the above concat works fine, but browser startup still dies with the assert above (after some creative xpconnect asserts, actually).  Once again, the issue is trying to promote an already-dense array.  The relevant code is in nsExtensionManager.js and looks like this, basically:

  var singleProps = ["version", "updateURL", "updateService", "optionsURL",
                     "aboutURL", "iconURL", "internalName", "updateKey"];
  singleProps = singleProps.concat(["locked", "hidden"]);

We die when copying the second string from the original singleProps into the new array we created in the js_NewArrayObject call from array_concat.
Attached patch functional, but slower (obsolete) — Splinter Review
Lots more aborts due to inner tree trying to grow
Attachment #376661 - Flags: review?(gal)
Comment on attachment 376661 [details] [diff] [review]
functional, but slower

> #ifdef JS_TRACER
> JSBool FASTCALL
> js_Array_dense_setelem(JSContext* cx, JSObject* obj, jsint i, jsval v)
> {
>-    JS_ASSERT(OBJ_IS_DENSE_ARRAY(cx, obj));
>+    if (OBJ_IS_SPECIALIZED_ARRAY(BOGUS_CX, obj))
>+        (void)JSARRAY_SPECDISPATCH(obj, promoteTo<jsval>(cx, obj));
>+    return ArraySpec<jsval>::setelem(cx, obj, i, v);
>+}
> 
>-    /*
>-     * Let the interpreter worry about negative array indexes.
>-     */
>-    JS_ASSERT((MAX_DSLOTS_LENGTH > JSVAL_INT_MAX) == (sizeof(jsval) != sizeof(uint32)));
>-    if (MAX_DSLOTS_LENGTH > JSVAL_INT_MAX) {
>-        /*
>-         * Have to check for negative values bleeding through on 64-bit machines only,
>-         * since we can't allocate large enough arrays for this on 32-bit machines.
>-         */
>-        if (i < 0)
>-            return JS_FALSE;
>-    }
>+JSBool FASTCALL
>+js_Array_dense_setelem_d(JSContext* cx, JSObject* obj, jsint i, jsdouble v)
>+{
>+    JS_ASSERT(OBJ_IS_SPECIALIZED_ARRAY(BOGUS_CX, obj));

Why bogus cx? You have a real cx.

>+    if(JSAS_INT == js_SpecArrayType(obj))
>+        (void)ArraySpec<uint32>::promoteTo<jsdouble>(cx, obj);
>+    return ArraySpec<jsdouble>::setelem(cx, obj, i, v);
>+}
> 
>-    /*
>-     * If needed, grow the array as long it remains dense, otherwise fall off trace.
>-     */
>-    jsuint u = jsuint(i);
>-    jsuint capacity = js_DenseArrayCapacity(obj);
>-    if ((u >= capacity) && (INDEX_TOO_SPARSE(obj, u) || !EnsureCapacity(cx, obj, u + 1)))
>-        return JS_FALSE;
>-
>-    if (obj->dslots[u] == JSVAL_HOLE) {
>-        if (js_PrototypeHasIndexedProperties(cx, obj))
>-            return JS_FALSE;
>-
>-        if (u >= jsuint(obj->fslots[JSSLOT_ARRAY_LENGTH]))
>-            obj->fslots[JSSLOT_ARRAY_LENGTH] = u + 1;
>-        ++obj->fslots[JSSLOT_ARRAY_COUNT];
>-    }
>-
>-    obj->dslots[u] = v;
>-    return JS_TRUE;
>+JSBool FASTCALL
>+js_Array_dense_setelem_i(JSContext* cx, JSObject* obj, jsint i, uint32 v)
>+{
>+    JS_ASSERT(OBJ_IS_SPECIALIZED_ARRAY(BOGUS_CX, obj) && JSAS_INT ==
>+            js_SpecArrayType(obj));

Same here.

>+    return ArraySpec<uint32>::setelem(cx, obj, i, v);
> }
> JS_DEFINE_CALLINFO_4(extern, BOOL, js_Array_dense_setelem, CONTEXT, OBJECT, INT32, JSVAL, 0, 0)
>+JS_DEFINE_CALLINFO_4(extern, BOOL,   js_Array_dense_setelem_d, CONTEXT, OBJECT, INT32, DOUBLE,   0, 0)
>+JS_DEFINE_CALLINFO_4(extern, BOOL,   js_Array_dense_setelem_i, CONTEXT, OBJECT, INT32, UINT32,   0, 0)
> #endif
> 

>diff --git a/js/src/jsinterp.cpp b/js/src/jsinterp.cpp
>--- a/js/src/jsinterp.cpp
>+++ b/js/src/jsinterp.cpp
>@@ -4787,17 +4787,24 @@ js_Interpret(JSContext *cx)
>             if (JSVAL_IS_INT(rval)) {
>                 if (OBJ_IS_DENSE_ARRAY(cx, obj)) {
>                     jsuint length;
> 
>                     length = js_DenseArrayCapacity(obj);
>                     i = JSVAL_TO_INT(rval);
>                     if ((jsuint)i < length &&
>                         i < obj->fslots[JSSLOT_ARRAY_LENGTH]) {
>-                        rval = obj->dslots[i];
>+                        if (OBJ_IS_SPECIALIZED_ARRAY(cx, obj)) {
>+                            // This will write JSVAL_HOLE for hole cases
>+                            if (!JSARRAY_SPECDISPATCH(obj,
>+                                                      getValue(cx, obj, i, &rval)))
>+                                goto error;
>+                        } else {
>+                            rval = obj->dslots[i];
>+                        }
>                         if (rval != JSVAL_HOLE)
>                             goto end_getelem;
> 
>                         /* Reload rval from the stack in the rare hole case. */
>                         rval = FETCH_OPND(-1);
>                     }
>                 }
>                 id = INT_JSVAL_TO_JSID(rval);
>@@ -4835,24 +4842,37 @@ js_Interpret(JSContext *cx)
>             FETCH_ELEMENT_ID(obj, -2, id);
>             do {
>                 if (OBJ_IS_DENSE_ARRAY(cx, obj) && JSID_IS_INT(id)) {
>                     jsuint length;
> 
>                     length = js_DenseArrayCapacity(obj);
>                     i = JSID_TO_INT(id);
>                     if ((jsuint)i < length) {
>-                        if (obj->dslots[i] == JSVAL_HOLE) {
>-                            if (js_PrototypeHasIndexedProperties(cx, obj))
>-                                break;
>-                            if (i >= obj->fslots[JSSLOT_ARRAY_LENGTH])
>-                                obj->fslots[JSSLOT_ARRAY_LENGTH] = i + 1;
>-                            obj->fslots[JSSLOT_ARRAY_COUNT]++;
>-                        }
>-                        obj->dslots[i] = rval;
>+                        if (!OBJ_IS_SPECIALIZED_ARRAY(cx, obj)) {
>+                            if (obj->dslots[i] == JSVAL_HOLE) {
>+                                if (js_PrototypeHasIndexedProperties(cx, obj))
>+                                    break;
>+                                if (i >= obj->fslots[JSSLOT_ARRAY_LENGTH])
>+                                    obj->fslots[JSSLOT_ARRAY_LENGTH] = i + 1;
>+                                obj->fslots[JSSLOT_ARRAY_COUNT]++;
>+                            }
>+                            obj->dslots[i] = rval;
>+                        } else {
>+                            if (JSARRAY_SPECDISPATCH(obj, hasHole(obj, i))) {
>+                                if (js_PrototypeHasIndexedProperties(cx, obj))
>+                                    break;
>+                                if (i >= obj->fslots[JSSLOT_ARRAY_LENGTH])
>+                                    obj->fslots[JSSLOT_ARRAY_LENGTH] = i + 1;
>+                                obj->fslots[JSSLOT_ARRAY_COUNT]++;
>+                            }
>+                            if (!JSARRAY_SPECDISPATCH(obj,
>+                                                      storeValue(cx, obj, i, rval)))
>+                                goto error;
>+                        }
>                         goto end_setelem;
>                     }
>                 }
>             } while (0);
>             if (!OBJ_SET_PROPERTY(cx, obj, id, &rval))
>                 goto error;
>         end_setelem:
>           END_SET_CASE_STORE_RVAL(JSOP_SETELEM, 3)
>diff --git a/js/src/jsobj.cpp b/js/src/jsobj.cpp
>--- a/js/src/jsobj.cpp
>+++ b/js/src/jsobj.cpp
>@@ -2889,17 +2889,21 @@ js_DropObjectMap(JSContext *cx, JSObject
>     return map;
> }
> 
> static void
> FreeSlots(JSContext *cx, JSObject *obj)
> {
>     if (obj->dslots) {
>         JS_ASSERT((uint32)obj->dslots[-1] > JS_INITIAL_NSLOTS);
>-        JS_free(cx, obj->dslots - 1);
>+        // Arrays have an extra negative slot
>+        if (OBJ_IS_DENSE_ARRAY(cx, obj))
>+            JS_free(cx, obj->dslots - 2);
>+        else
>+            JS_free(cx, obj->dslots - 1);
>         obj->dslots = NULL;
>     }
> }
> 
> #define SLOTS_TO_DYNAMIC_WORDS(nslots)                                        \
>   (JS_ASSERT((nslots) > JS_INITIAL_NSLOTS), (nslots) + 1 - JS_INITIAL_NSLOTS)
> 
> #define DYNAMIC_WORDS_TO_SLOTS(words)                                         \
>@@ -2907,16 +2911,19 @@ FreeSlots(JSContext *cx, JSObject *obj)
> 
> JSBool
> js_ReallocSlots(JSContext *cx, JSObject *obj, uint32 nslots,
>                 JSBool exactAllocation)
> {
>     jsval *old, *slots;
>     uint32 oslots, nwords, owords, log, i;
> 
>+    if (OBJ_IS_DENSE_ARRAY(cx, obj)) {
>+        return js_ReallocArraySlots(cx, obj, nslots, exactAllocation);
>+    }
>     /*
>      * Minimal number of dynamic slots to allocate.
>      */
> #define MIN_DYNAMIC_WORDS 4
> 
>     /*
>      * The limit to switch to linear allocation strategy from the power of 2
>      * growth no to waste too much memory.
>diff --git a/js/src/jstracer.cpp b/js/src/jstracer.cpp
>--- a/js/src/jstracer.cpp
>+++ b/js/src/jstracer.cpp
>@@ -251,16 +251,17 @@ js_InitJITStatsClass(JSContext *cx, JSOb
> }
> 
> #define AUDIT(x) (jitstats.x++)
> #else
> #define AUDIT(x) ((void)0)
> #endif /* JS_JIT_SPEW */
> 
> #define INS_CONST(c)        addName(lir->insImm(c), #c)
>+#define INS_CONSTF(c)       addName(lir->insImmf(c), #c)
> #define INS_CONSTPTR(p)     addName(lir->insImmPtr(p), #p)
> #define INS_CONSTFUNPTR(p)  addName(lir->insImmPtr(JS_FUNC_TO_DATA_PTR(void*, p)), #p)
> #define INS_CONSTWORD(v)    addName(lir->insImmPtr((void *) v), #v)
> 
> using namespace avmplus;
> using namespace nanojit;
> 
> static GC gc = GC();
>@@ -5711,24 +5712,44 @@ TraceRecorder::incProp(jsint incr, bool 
>     return JSRS_CONTINUE;
> }
> 
> JS_REQUIRES_STACK JSRecordingStatus
> TraceRecorder::incElem(jsint incr, bool pre)
> {
>     jsval& r = stackval(-1);
>     jsval& l = stackval(-2);
>+    // Not really a jsval* in the case of a specialized array
>     jsval* vp;
>     LIns* v_ins;
>     LIns* addr_ins;
>     CHECK_STATUS(elem(l, r, vp, v_ins, addr_ins));
>     if (!addr_ins) // if we read a hole, abort
>         return JSRS_STOP;
>-    CHECK_STATUS(inc(*vp, v_ins, incr, pre));
>-    box_jsval(*vp, v_ins);
>+
>+    if (JSVAL_IS_OBJECT(l) && 
>+        OBJ_IS_SPECIALIZED_ARRAY(BOGUS_CX, JSVAL_TO_OBJECT(l))) {

This BOGUS_CX stuff is scary.

>+        JSObject *obj = JSVAL_TO_OBJECT(l);
>+        JSArraySpecialization spec = js_SpecArrayType(obj);
>+        jsdouble d = JSAS_INT == spec
>+                     ? static_cast<jsdouble>(*reinterpret_cast<int32*>(vp))
>+                     : *reinterpret_cast<jsdouble*>(vp);
>+        jsval v;
>+        if (!JS_NewNumberValue(cx, d, &v))
>+            return JSRS_ERROR;
>+        // Garbage collection won't happen during this call and inc doesn't
>+        // store it away somewhere so we don't need to root
>+        CHECK_STATUS(inc(v, v_ins, incr, pre));
>+        // demote to int if needed
>+        if (isi2f(v_ins) && spec == JSAS_INT)
>+            v_ins = iu2fArg(v_ins);
>+    } else {
>+        CHECK_STATUS(inc(*vp, v_ins, incr, pre));
>+        box_jsval(*vp, v_ins);
>+    }
>     lir->insStorei(v_ins, addr_ins, 0);
>     return JSRS_CONTINUE;
> }
> 
> static bool
> evalCmp(LOpcode op, double result)
> {
>     bool cond;
>@@ -6606,17 +6627,52 @@ TraceRecorder::guardClass(JSObject* obj,
>     JS_snprintf(namebuf, sizeof namebuf, "guard(class is %s)", clasp->name);
>     guard(cond, addName(lir->ins2(LIR_eq, class_ins, INS_CONSTPTR(clasp)), namebuf), exit);
>     return cond;
> }
> 
> JS_REQUIRES_STACK bool
> TraceRecorder::guardDenseArray(JSObject* obj, LIns* obj_ins, ExitType exitType)
> {
>-    return guardClass(obj, obj_ins, &js_ArrayClass, snapshot(exitType));
>+    bool isSpec = OBJ_IS_SPECIALIZED_ARRAY(BOGUS_CX, obj);
>+    JSClass *cls = isSpec
>+                   ? &js_ArrayClass
>+                   : &js_DenseArrayClass;
>+    if (!guardClass(obj, obj_ins, cls, snapshot(exitType)))
>+        return false;
>+    if (isSpec) {
>+        JSArraySpecialization spec = js_SpecArrayType(obj);
>+        LIns* dslots_ins = lir->insLoad(LIR_ldp, obj_ins, offsetof(JSObject, dslots));
>+
>+        // ty = JSAS_INT;
>+        // if (dslots)
>+        //   ty = dslots[-2];
>+        // guard(ty)
>+        LIns *specSlot = lir->insAlloc(sizeof(spec));
>+        lir->store(lir->insImm(JSAS_INT), specSlot, 0);
>+
>+        LIns *br = lir->insBranch(LIR_jt,
>+                                  lir->ins_eq0(dslots_ins),
>+                                  NULL);
>+        // load specialization (dslots[-2])
>+        lir->store(lir->insLoad(LIR_ld, dslots_ins,
>+                                -2 * static_cast<signed int>(sizeof jsval)),
>+                   specSlot, 0);
>+        br->target(lir->ins0(LIR_label));
>+        // To be helpful, include names
>+        LIns *expectedSpec = spec == JSAS_INT
>+                           ? INS_CONST(JSAS_INT)
>+                           : INS_CONST(JSAS_DOUBLE);
>+        guard(true,
>+              lir->ins2(LIR_eq,
>+                        expectedSpec,
>+                        lir->insLoad(LIR_ld, specSlot, 0)),
>+              snapshot(exitType));
>+    } 
>+    return true;
> }
> 
> JS_REQUIRES_STACK JSRecordingStatus
> TraceRecorder::guardPrototypeHasNoIndexedProperties(JSObject* obj, LIns* obj_ins, ExitType exitType)
> {
>     /*
>      * Guard that no object along the prototype chain has any indexed properties which
>      * might become visible through holes in the array.
>@@ -7245,18 +7301,18 @@ TraceRecorder::newArray(JSObject* ctor, 
>         guard(false, lir->ins_eq0(arr_ins), OOM_EXIT);
>         if (argc == 1) {
>             // array_ins.fslots[JSSLOT_ARRAY_LENGTH] = length
>             lir->insStorei(f2i(get(argv)), // FIXME: is this 64-bit safe?
>                            arr_ins,
>                            offsetof(JSObject, fslots) + JSSLOT_ARRAY_LENGTH * sizeof(jsval));
>         }
>     } else {
>-        // arr_ins = js_NewUninitializedArray(cx, Array.prototype, argc)
>-        LIns *args[] = { INS_CONST(argc), proto_ins, cx_ins };
>+        // arr_ins = js_NewUninitializedArray(cx, Array.prototype, argc, 1)
>+        LIns *args[] = { lir->insImm(1), INS_CONST(argc), proto_ins, cx_ins };
>         arr_ins = lir->insCall(&js_NewUninitializedArray_ci, args);
>         guard(false, lir->ins_eq0(arr_ins), OOM_EXIT);
> 
>         // arr->dslots[i] = box_jsval(vp[i]);  for i in 0..argc
>         LIns *dslots_ins = NULL;
>         for (uint32 i = 0; i < argc && !lirbuf->outOMem(); i++) {
>             LIns *elt_ins = get(argv + i);
>             box_jsval(argv[i], elt_ins);
>@@ -8165,20 +8221,53 @@ TraceRecorder::record_JSOP_SETELEM()
> 
>     // Fast path for dense arrays accessed with a non-negative integer index. In case the trace
>     // calculated the index using the FPU, force it to be an integer.
>     idx_ins = makeNumberInt32(idx_ins);
> 
>     // Box the value so we can use one builtin instead of having to add one builtin for every
>     // storage type.
>     LIns* boxed_v_ins = v_ins;
>-    box_jsval(v, boxed_v_ins);
>-
>+    const CallInfo *setelem_ci = NULL;
>+    if (!OBJ_IS_SPECIALIZED_ARRAY(BOGUS_CX, obj)) {
>+        setelem_ci = ArraySpec<jsval>::setelem_ci;
>+        box_jsval(v, boxed_v_ins);
>+    } else if (!JSVAL_IS_NUMBER(v) && !JSVAL_IS_INT(v)) {
>+        setelem_ci = ArraySpec<jsval>::setelem_ci;
>+        box_jsval(v, boxed_v_ins);
>+    } else {
>+        switch(js_SpecArrayType(obj)) {
>+            default:
>+                JS_ASSERT(false && "Unknown array specialization");
>+                // Fall through
>+            case JSAS_INT:
>+                setelem_ci = ArraySpec<uint32>::setelem_ci;
>+                if (v_ins->isQuad()) {

Wouldn't the right thing here be makeNumberInt32, but then make sure if that exit is taken that the array is converted from int to double? (new exit type, array type mismatch exit).

>+                    boxed_v_ins = makeNumberInt32(v_ins);
>+                    /*if (isPromote(v_ins))
>+                        boxed_v_ins = ::demote(lir, v_ins);
>+                    else
>+                        // Don't want to call f2i on trace...
>+                        ABORT_TRACE("Avoiding f2i just to maintain int array");*/
>+                }
>+                break;
>+            case JSAS_DOUBLE:
>+                setelem_ci = ArraySpec<jsdouble>::setelem_ci;
>+                if (!boxed_v_ins->isQuad()) {
>+                    if (boxed_v_ins->isconst()) {
>+                        boxed_v_ins = lir->insImmf(static_cast<jsdouble>(boxed_v_ins->constval()));
>+                    } else if (JSVAL_IS_INT(v)) {
>+                        boxed_v_ins = lir->ins1(LIR_i2f, boxed_v_ins);
>+                    }
>+                }


What if we write a string into a double array on trace? I don't see where that is stopped.

>+                break;
>+        }
>+    } 
>     LIns* args[] = { boxed_v_ins, idx_ins, obj_ins, cx_ins };
>-    LIns* res_ins = lir->insCall(&js_Array_dense_setelem_ci, args);
>+    LIns* res_ins = lir->insCall(setelem_ci, args);
>     guard(false, lir->ins_eq0(res_ins), MISMATCH_EXIT);
> 
>     jsbytecode* pc = cx->fp->regs->pc;
>     if (*pc == JSOP_SETELEM && pc[JSOP_SETELEM_LENGTH] != JSOP_POP)
>         set(&lval, v_ins);
> 
>     return JSRS_CONTINUE;
> }
>@@ -8766,17 +8855,20 @@ TraceRecorder::elem(jsval& oval, jsval& 
>         return JSRS_STOP;
> 
>     JSObject* obj = JSVAL_TO_OBJECT(oval);
>     LIns* obj_ins = get(&oval);
>     jsint idx = JSVAL_TO_INT(ival);
>     LIns* idx_ins = makeNumberInt32(get(&ival));
> 
>     /* make sure the object is actually a dense array */
>-    if (!guardDenseArray(obj, obj_ins))
>+    ExitType exitType = OBJ_IS_SPECIALIZED_ARRAY(BOGUS_CX, obj)
>+                      ? BRANCH_EXIT
>+                      : MISMATCH_EXIT;
>+    if (!guardDenseArray(obj, obj_ins, exitType))
>         return JSRS_STOP;
> 
>     VMSideExit* exit = snapshot(BRANCH_EXIT);
> 
>     /* check that the index is within bounds */
>     LIns* dslots_ins = lir->insLoad(LIR_ldp, obj_ins, offsetof(JSObject, dslots));
>     jsuint capacity = js_DenseArrayCapacity(obj);
>     bool within = (jsuint(idx) < jsuint(obj->fslots[JSSLOT_ARRAY_LENGTH]) && jsuint(idx) < capacity);
>@@ -8845,21 +8937,39 @@ TraceRecorder::elem(jsval& oval, jsval& 
>     /* Guard array capacity */
>     guard(true,
>           lir->ins2(LIR_ult,
>                     idx_ins,
>                     lir->insLoad(LIR_ldp, dslots_ins, 0 - (int)sizeof(jsval))),
>           exit);
> 
>     /* Load the value and guard on its type to unbox it. */
>-    vp = &obj->dslots[jsuint(idx)];
>+    vp = reinterpret_cast<jsval*>(JSARRAY_TYPEDISPATCH(obj, getSlotLocation(obj, idx)));
>     addr_ins = lir->ins2(LIR_piadd, dslots_ins,
>-                         lir->ins2i(LIR_pilsh, idx_ins, (sizeof(jsval) == 4) ? 2 : 3));
>-    v_ins = lir->insLoad(LIR_ldp, addr_ins, 0);
>-    unbox_jsval(*vp, v_ins, exit);
>+                         lir->ins2i(LIR_pilsh, idx_ins,
>+                                    (JSARRAY_TYPEDISPATCH(obj, value_size) == 4) ? 2 : 3));
>+    v_ins = lir->insLoad(JSARRAY_TYPEDISPATCH(obj, load_opcode), addr_ins, 0);
>+    if (OBJ_IS_SPECIALIZED_ARRAY(BOGUS_CX, obj)) {
>+        JSArraySpecialization spec = js_SpecArrayType(obj);
>+        LOpcode eqOp;
>+        LIns *hole_value_ins;
>+        if (JSAS_INT == spec) {
>+            eqOp = LIR_eq;
>+            hole_value_ins = INS_CONST(ArraySpec<uint32>::hole_value);
>+        } else {
>+            eqOp = LIR_feq;
>+            hole_value_ins = INS_CONSTF(ArraySpec<jsdouble>::hole_value);
>+        }
>+        guard(false, lir->ins2(eqOp, v_ins, hole_value_ins), exit);
>+        if (JSAS_INT == spec)
>+            v_ins = lir->ins1(LIR_i2f, v_ins);
>+        return JSRS_CONTINUE;
>+    } else {
>+        unbox_jsval(*vp, v_ins, exit);
>+    }
> 
>     if (JSVAL_TAG(*vp) == JSVAL_BOOLEAN) {
>         /*
>          * If we read a hole from the array, convert it to undefined and guard that there
>          * are no indexed properties along the prototype chain.
>          */
>         LIns* br = lir->insBranch(LIR_jf,
>                                   lir->ins2i(LIR_eq, v_ins, JSVAL_TO_PSEUDO_BOOLEAN(JSVAL_HOLE)),
>@@ -10446,27 +10556,66 @@ TraceRecorder::record_JSOP_NEWARRAY()
> TraceRecorder::record_JSOP_NEWARRAY()
> {
>     LIns *proto_ins;
>     CHECK_STATUS(getClassPrototype(JSProto_Array, proto_ins));
> 
>     uint32 len = GET_UINT16(cx->fp->regs->pc);
>     cx->fp->assertValidStackDepth(len);
> 
>-    LIns* args[] = { lir->insImm(len), proto_ins, cx_ins };
>+    // Check for all double or all int arrays
>+    int type = JSAS_INT << 1;
>+
>+    for (uint32 i = 0; i < len; i++) {
>+        jsval& v = stackval(int(i) - int(len));
>+        LIns* elt_ins = get(&v);
>+        if (!JSVAL_IS_NUMBER(v) && !JSVAL_IS_INT(v)) {
>+            type = 1;
>+            break;
>+        }
>+        if (elt_ins->isQuad() && !isPromote(elt_ins)) {
>+            type = JSAS_DOUBLE << 1;
>+            continue;
>+        }
>+    }//*/
>+
>+    if (type != 1) {
>+        debug_only_v(printf("newarray can create a spec array of %d (len = %d)\n", type >> 1, len));
>+    }
>+
>+    // Temporarily treat all integer arrays as double arrays
>+    if ((JSAS_INT << 1) == type)
>+        type = JSAS_DOUBLE << 1;

aroo??

>+
>+    LIns* args[] = { lir->insImm(type), lir->insImm(len), proto_ins, cx_ins };
>     LIns* v_ins = lir->insCall(&js_NewUninitializedArray_ci, args);
>     guard(false, lir->ins_eq0(v_ins), OOM_EXIT);
> 
>     LIns* dslots_ins = NULL;
>     for (uint32 i = 0; i < len; i++) {
>         jsval& v = stackval(int(i) - int(len));
>         LIns* elt_ins = get(&v);
>-        box_jsval(v, elt_ins);
>-        stobj_set_dslot(v_ins, i, dslots_ins, elt_ins, "set_array_elt");
>-    }
>+        if (1 == type) {

We don't use const == type in SM I think (never seen it, just a random uber nit).

>+            box_jsval(v, elt_ins);
>+            stobj_set_dslot(v_ins, i, dslots_ins, elt_ins, "set_array_elt");
>+        } else {
>+            if (!dslots_ins)
>+                dslots_ins = lir->insLoad(LIR_ldp, v_ins, offsetof(JSObject, dslots));
>+            if (!elt_ins->isQuad()) {
>+                if (elt_ins->isconst()) {
>+                    elt_ins = lir->insImmf(static_cast<jsdouble>(elt_ins->constval()));
>+                } else if (JSVAL_IS_INT(v)) {
>+                    elt_ins = lir->ins1(LIR_i2f, elt_ins);
>+                }
>+            }
>+            // Don't have to worry about holes for doubles
>+            lir->insStorei(elt_ins, dslots_ins, i * sizeof(jsdouble));
>+        }
>+    }
>+
> 
>     stack(-int(len), v_ins);
>     return JSRS_CONTINUE;
> }
> 
> JS_REQUIRES_STACK JSRecordingStatus
> TraceRecorder::record_JSOP_HOLE()
> {
Impressive patch. The trace code looks pretty wrong. I think you want to check that its a specialized array, and that a number type is being written (isNumber(v)), and then emit code. For integer I would accept !isQuad and makeInt32, but for the latter make sure to downconvert the array in the exit case.
Comment on attachment 376661 [details] [diff] [review]
functional, but slower

>Array Type Specialization
>
>diff --git a/js/src/jsarray.cpp b/js/src/jsarray.cpp
>--- a/js/src/jsarray.cpp
>+++ b/js/src/jsarray.cpp
>  * We track these pieces of metadata for arrays in dense mode:
>  *  - the array's length property as a uint32, in JSSLOT_ARRAY_LENGTH,
>  *  - the number of indices that are filled (non-holes), in JSSLOT_ARRAY_COUNT,
>  *  - the net number of slots starting at dslots (capacity), in dslots[-1] if
>  *    dslots is non-NULL.
>+ *  - the array's specialization in dslots[-2] if it is a specialized array
>+ *    and dslots is non-NULL

The JSObject.fslots[JSSLOT_ARRAY_UNUSED] slot is much better place to store array's specialization than dslots[-2].
copyFromVector needs some serious comments (in particular, the fact that |start| is not used for indexing into the given vector is completely non-obvious from the signature).  Please document!
That last attached patch doesn't compile for me (on Mac):

../jsarray.cpp: In static member function ‘static JSBool ArraySpec<T>::setelem(JSContext*, JSObject*, jsint, T)’:
../jsarray.cpp:316: error: there are no arguments to ‘EnsureCapacity’ that depend on a template parameter, so a declaration of ‘EnsureCapacity’ must be available
../jsarray.cpp:316: error: (if you use ‘-fpermissive’, G++ will accept your code, but allowing the use of an undeclared name is deprecated)
Now with less BOGUS_CX!
Revised the setelem code - on trace functions are much more friendly towards dense arrays and arrays of the other specialization.
fslots[JSSLOT_ARRAY_SPEC] is now used (this is the third time the location of the specialization has been moved!)
copyFromVector has some more comments, but likely not enough
Hopefully the OSX (likely gcc) build error is fixed.

Sunspider tests of interest (significant sub-par performance):
3d-raytrace
bitops-nsieve-bits
crypto-aes
crypto-sha1
date-format-tofte

All the other sunspider tests are equal or slightly slower in speed.
Attachment #376661 - Attachment is obsolete: true
Attachment #376837 - Flags: review?(gal)
Attachment #376661 - Flags: review?(gal)
I'd test it on mac, but that diff doesn't apply to t-m tip.
I'm not at all a fan of JSVAL_HOLE as a magic value; having that removes one of the key reasons for having type specialized arrays, that is, being able to grab a raw pointer to the data in native code.  I guess you still can do that if count == length, and hope that noone stores the magic value (at which point it's no longer type specialized).

Isn't the most common case of holes in an array when there are holes at index >= N?  This is (I think) the case of |var x = Array(k)| (holes at index >= 0), and then walking over 0..k and setting values, which would move the start of the holes later on.  Are there other cases where holes show up in common usage?

I'd suggest that there be no magic hole value, but instead just that the index at which holes start is tracked.  If a hole is needed at any point other than in the contiguous ending portion of the array, then convert the array into an unspecialized one.
(In reply to comment #16)
> Isn't the most common case of holes in an array when there are holes at index
> >= N?  This is (I think) the case of |var x = Array(k)| (holes at index >= 0),
> and then walking over 0..k and setting values, which would move the start of
> the holes later on.  Are there other cases where holes show up in common usage?

Some JS uses code like:

  var a = Array(length);
  a[length - 1] = 0; 
  for (var i = 0; i != length; ++i)
    a[i] = real_initialization;

I suspect that this comes from some ancient advice to preallocate the space for array as a way to optimize things.
Ugh, that is horrible.

Ok, maybe have both a hole-start and hole-end index, to specify a range of indices which are holes.  Or worst case, a bit array would also work..
Blocks: 497458
Blocks: 497795
So I just tried Rob's latest patch here on the testcase in bug 497458, and it's a significant (15-20%) performance loss.  It's even a loss after I merge in the changes from bug 497794.

Looking at the shark profiles, dense_setelem_d is taking up a lot of time, both itself and via calling hasHole().  Specializing hasHole() for the double to use 64-bit int compares instead of double compares doesn't seem to help.

I wonder how much of the performance loss is actually due to the extra realloc we end up doing when we go from int to double for our specialized array: that reallocs from 65MB to 130MB...  More importantly, we then go touch all that memory to see whether it's holes, on every set.  So we're presumably completely blowing out the L2 cache (which just isn't 130MB here, I tell you) and might be gating on memory bandwidth.

I suspect that testcase is not that atypical as canvas testcases go.

Would it make sense to just create a byte-specialized array now, for use solely from canvas, which will ideally do clamp+round on set and immediately despecialize to a dense array on any weird stuff (holes introduced, length changed after initial creation, non-double-or-int value put in, whatever)?  If we only created such arrays via a special constructor used by canvas (a la bug 497794) we might be able to get away with this, right?  And I doubt people normally create arrays of millions of doubles...
In particular, the special constructor could guarantee no holes, by allocating exactly the right amount of memory for the array, since canvas guarantees it'll fill in all the rgba components for all the pixels it has.
Rob has on his todo list to rework this patch to avoid magic hole values, and instead optimize for the no-holes case (or no-holes from indices A..B, all holes otherwise from 0..length), given that that's the most common use case where we actually want this to be fast.
(hit enter too soon)

Creating a special object here for the canvas case would make sense though, because as you say, we can guarantee no holes and bounds etc.  However, we would have to do extra work to make sure things like data.forEach()/map()/other helpers continue to work (as I think they should).
The canvas-targetted byte-specialized array would also be able to use calloc to allocate for the CreateImageData, rather than malloc-and-then-set-a-word-at-a-time jsval stuff now, which is a non-trivial win I'm sure.

Is ImageData.prototype meant to have Array.prototype on the prototype chain?  I thought that people would just do Array.forEach(data, f) because it was array-like, but not instanceof Array.
That's a good question; I'm happy to have Array.forEach(d, f) be the only way to do that (I didn't realize that worked).  The spec doesn't mention this at all, it just has:

[IndexGetter, IndexSetter]
interface CanvasPixelArray {
  readonly attribute unsigned long length;
};

for the interface for the data element.
We live to serve!  That IDL is array-like but not instanceof-Array, so now the only question is whether there's content out there that depends on our current Arrayness.  I would bet not!
For what it's worth, just avoiding magic hole might not be enough: the real problem is that we do multiple passes over the array and move it all in and out of memory... and that we get gated on memory bandwidth.

I'd be pretty happy to not have imagedata be instanceof-Array.  And yes, with specialized arrays, whether byte or not, we can either calloc or memset in CreateImageData; we still have to do work in GetImageData.
Attachment #376837 - Flags: review?(gal)
Status: ASSIGNED → RESOLVED
Closed: 10 years ago
Resolution: --- → DUPLICATE
Duplicate of bug: 566767
(In reply to comment #3)
> 
> An idea that seems very much worth thinking about is sharing dense array
> goodness with objects having named properties (where access is optimized
> differently), so we are not stuck with either-or.

See bug 586842.
You need to log in before you can comment on or make changes to this bug.