Last Comment Bug 668024 - Array.prototype.splice test262 bugfixing and rewriting
: Array.prototype.splice test262 bugfixing and rewriting
Status: RESOLVED FIXED
:
Product: Core
Classification: Components
Component: JavaScript Engine (show other bugs)
: Other Branch
: All All
: -- normal (vote)
: mozilla10
Assigned To: Jeff Walden [:Waldo] (remove +bmo to email)
:
:
Mentors:
Depends on: 694210
Blocks: test262 675164 845713
  Show dependency treegraph
 
Reported: 2011-06-28 14:09 PDT by Jeff Walden [:Waldo] (remove +bmo to email)
Modified: 2013-02-27 02:02 PST (History)
8 users (show)
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---


Attachments
WIP - works but slow (9.90 KB, patch)
2011-07-21 16:40 PDT, Paul Biggar
no flags Details | Diff | Splinter Review
WIP - Compare against old array_splice (14.27 KB, patch)
2011-07-29 10:50 PDT, Paul Biggar
no flags Details | Diff | Splinter Review
Final-ish version (15.21 KB, patch)
2011-08-01 17:26 PDT, Paul Biggar
no flags Details | Diff | Splinter Review
performant version (17.39 KB, patch)
2011-08-03 17:22 PDT, Paul Biggar
no flags Details | Diff | Splinter Review
emulate old splice(0) bug (18.76 KB, patch)
2011-08-15 17:25 PDT, Paul Biggar
no flags Details | Diff | Splinter Review
Fix all Fx uses of splice (41.70 KB, patch)
2011-08-17 11:43 PDT, Paul Biggar
no flags Details | Diff | Splinter Review
With dense arrays separated and sparse arrays specified (55.04 KB, patch)
2011-08-25 18:29 PDT, Paul Biggar
no flags Details | Diff | Splinter Review
Slightly updated/tweaked tests (16.12 KB, patch)
2011-10-03 16:32 PDT, Jeff Walden [:Waldo] (remove +bmo to email)
jwalden+bmo: review+
Details | Diff | Splinter Review
Patch to reimplement the splice algorithm (21.43 KB, patch)
2011-10-03 17:09 PDT, Jeff Walden [:Waldo] (remove +bmo to email)
bhackett1024: review+
Details | Diff | Splinter Review

Description Jeff Walden [:Waldo] (remove +bmo to email) 2011-06-28 14:09:16 PDT
On my system these test262 tests for Array.prototype.splice fail:

S15.4.4.12_A2_T1	If start is positive, use min(start, length).	fail
S15.4.4.12_A2_T4	If start is negative, use max(start + length, 0).	fail

I haven't fully investigated to determine why.  Our algorithm doesn't look like the spec's algorithm, so it's difficult to immediately pinpoint the exact issue.  I see other issues trivially resulting from that divergence, too -- this, say:

try
{
  Array.prototype.splice.call({ get length() { throw 'error'; } });
  throw new Error("should have thrown, didn't");
}
catch (e)
{
  assertEq(e, "error");
}

It looks to me like now's a good time to rewrite the algorithm to read like the spec, fixing the two test262 failures and any other latent issues in the process.  There'll doubtless be some fun involved in preserving the dense-array element optimizations here, but it shouldn't be too bad.  (For all I know, the refactoring might make it possible to introduce even more of them.  I definitely remember [].splice being a hairy mess when I added them in the first place, such that it was hard enough to do it that I didn't try very hard when those optimizations didn't obviously fit.)

Paul, you want to take this?
Comment 1 Jeff Walden [:Waldo] (remove +bmo to email) 2011-06-28 14:37:39 PDT
These tests, which I missed when making comment 0, probably also would be fixed by this:

S15.4.4.12_A3_T2	length is arbitrarily	fail
S15.4.4.12_A3_T3	length is arbitrarily	fail
Comment 2 Paul Biggar 2011-07-21 16:40:17 PDT
Created attachment 547559 [details] [diff] [review]
WIP - works but slow

So this works, but is pretty slow. There's a 30%ish regression, and profiling it shows that we're spending a lot more time in slot allocation. Hopefully this will be a trivial indexing error, and a few printfs might sort this out.
Comment 3 Paul Biggar 2011-07-29 10:50:29 PDT
Created attachment 549418 [details] [diff] [review]
WIP - Compare against old array_splice

Although everything worked before, there were some performance regressions. This copies the array and calls the old array_splice, and compares the results. There are assertions which show that is no case do we have a greater capacity, or different result, and the asserts don't trigger in any of our test suite or the microbenchmark I have.

Next step is to remove all this and polish the result.
Comment 4 Paul Biggar 2011-08-01 17:26:30 PDT
Created attachment 549982 [details] [diff] [review]
Final-ish version

This is very nearly the final version, but am hunting down a 8-9% regression. I think it's due to allocating fixed slots and then allocating dynamic slots immediately afterwards. The fixed slot allocation certainly lead to a huge increase in indirect cache misses. I'm going to try allocating 0 slots for arrays of a certain size, and see if that helps.
Comment 5 Paul Biggar 2011-08-03 17:22:49 PDT
Created attachment 550562 [details] [diff] [review]
performant version

This fixes all the sputnik tests and others that I found in searching bugzilla. The performance thing turned out to be that the array was getting shrunk (with the corresponding writing of holes) in js_SetLengthProperty.
Comment 6 Paul Biggar 2011-08-04 19:52:19 PDT
Comment on attachment 550562 [details] [diff] [review]
performant version

Tryserver came back with a few bugs that need to be fixed, but I'd appreciate a preliminary look at this.
Comment 7 Paul Biggar 2011-08-15 17:25:37 PDT
Created attachment 553316 [details] [diff] [review]
emulate old splice(0) bug

The old version of array_splice had a bug in 15.4.4.12 step 7, where calling x.splice(0) cleared x. We use this in at least 10 places in Firefox, and it was the cause of an xpcshell test, and I expect the other test failures I'm seeing (investigating).

Chrome, Safari and Opera all have the incorrect behaviour, and the latest IE has the correct behaviour. Test262 does not test correct behaviour, so I've emulated the bug. I think fixing the bug is worthwhile, but I'd rather do that in a follow-on bug, and just get this one done.
Comment 8 Paul Biggar 2011-08-17 11:43:02 PDT
Created attachment 553859 [details] [diff] [review]
Fix all Fx uses of splice

This removes all uses of the non-standard behaviour I can find in FF. Since IE doesn't have this, we might want to remove the feature, but I've left it in for now.
Comment 9 Jeff Walden [:Waldo] (remove +bmo to email) 2011-08-17 17:40:38 PDT
Comment on attachment 553316 [details] [diff] [review]
emulate old splice(0) bug

Review of attachment 553316 [details] [diff] [review]:
-----------------------------------------------------------------

A note on types: in a large number of cases you use |uint|.  You should use |uint32| for all array indexes (excepting the final length of the array, where |actualDeleteCount < itemCount| means you could overflow into double-space).

I was sufficiently through this when the new patch got posted that it seemed a waste of time to transfer comments over to it.  Interdiff says changes were non-substantive, so this is probably good enough, I hope.

If you can make the requested changes before about noon Eastern tomorrow, I might be able to review them before I head out.  (I probably won't leave the house for the flight until about noon Pacific, but I'll probably end up doing some last-minute shuffling of packing, so exactly when I'll go incommunicado is uncertain.)  Otherwise I'm not going to have a chance to review again until September.  I don't think we're so hurried by this that it must happen during that time, so you might end up sitting on this review a bit longer, alas, unless you want to find someone else to review an updated patch.

::: js/src/jit-test/tests/basic/bug590780.js
@@ +2,5 @@
> +Array.prototype.splice.call(o,0,1);
> +
> +assertEq(o[0], 2);
> +assertEq(o[1], 3);
> +assertEq(o.length, 2);

Assert that there's no "2" property here, please.  (Assuming I remember splice's semantics well enough, if I don't assert whatever the appropriate thing is for property 2's semantics.)

::: js/src/jit-test/tests/basic/bug668024.js
@@ +1,1 @@
> +try

Please name this file something more descriptive: splice-throwing-length-getter.js would be more helpful when skimming file listings, and would autocomplete much better.

::: js/src/jit-test/tests/basic/bug675164.js
@@ +8,5 @@
> +list.push('a');
> +
> +list.splice(0, 1);
> +
> +print(list.length + ' ' + list[0]);

This test doesn't actually, er, test anything.  Please make it do so, and give it a more descriptive name.

::: js/src/jsarray.cpp
@@ +2287,5 @@
> + * finalIndex is the greatest index into the caller will access directly. Omit
> + * finalIndex if the caller will extend the array before operating on it.
> + */
> +static inline bool
> +IsContiguousDenseArray(JSContext *cx, JSObject *arr, uint finalIndex = 0)

I keep reading this name and half-forgetting that it also checks the prototype chain for potential side effects that might happen during deleting, putting, etc.  How about the name IsDenseArrayWithUnindexedProtoChain to also that facet of meaning?

@@ +2291,5 @@
> +IsContiguousDenseArray(JSContext *cx, JSObject *arr, uint finalIndex = 0)
> +{
> +    return arr->isDenseArray()
> +        && !js_PrototypeHasIndexedProperties(cx, arr)
> +        && finalIndex <= arr->getDenseArrayCapacity();

SpiderMonkey style doesn't put && at the beginning of lines like this, rather at the end.

@@ +2301,5 @@
> +    if (!JS_CHECK_OPERATION_LIMIT(cx))
> +        return false;
> +
> +    JSBool hole;
> +    AutoValueRooter fromValue(cx);

This can just be a Value (with appropriate changes elsewhere).

@@ +2306,5 @@
> +    if (!GetElement(cx, source, sourceIndex, &hole, fromValue.addr()))
> +        return false;
> +
> +    if (!SetOrDeleteArrayElement(cx, target, targetIndex, hole, fromValue.value()))
> +        return false;

You might as well do |return GetElement(...) && SetOrDeleteArrayElement(...)|, doesn't seem the extra whitespace provides a benefit in readability.

@@ +2314,5 @@
> +
> +static bool
> +CopySparseArray(JSContext *cx, jsuint count, JSObject* source, jsuint sourceIndex, JSObject* target, jsuint targetIndex)
> +{
> +    for (uint i = 0; i < count; i++)

uint32

@@ +2324,5 @@
> +
> +static inline bool
> +CopyArray(JSContext *cx, uint count, JSObject *source, uint sourceIndex, JSObject *target, uint targetIndex)
> +{
> +    JS_ASSERT (source != target);

Extra space.

@@ +2325,5 @@
> +static inline bool
> +CopyArray(JSContext *cx, uint count, JSObject *source, uint sourceIndex, JSObject *target, uint targetIndex)
> +{
> +    JS_ASSERT (source != target);
> +    if (IsContiguousDenseArray(cx, source, sourceIndex + count)) {

|sourceIndex + count| screams integer overflow.  It's not here, by the nature of the way sourceIndex and count are computed, but it's a close thing.  At least add a JS_ASSERT(sourceIndex + count >= sourceIndex) here, although that only catches things for people who are thinking about it.  Really it would be better if this method could be made only usable by array_splice, but C++ doesn't have nested methods, alas.

But no, I lie!  This does overflow!  And dangerously -- the isDenseArray check I thought prevented badness if overflow happened doesn't save you.  It's all due to that |argc == 1| deviation.  Witness:

  var arr = [1, 2, 3, 4, 5, 6, 7, 8];
  arr.length = Math.pow(2, 32) - 2;
  arr.splice(5); // kaboom

You could hack around this problem here with an integer overflow check.  Looking at the code, I think it would be better to compute actualDeleteCount the spec way if |argc >= 2|, and then as |len - actualStart| in the |argc < 2| alternative.

But beyond that, I think you should just inline this method into array_splice.  I found it difficult to correlate the code for the relevant step with the code in this method due to the variable name change and the distance from the surrounding code that contextualized it.

@@ +2329,5 @@
> +    if (IsContiguousDenseArray(cx, source, sourceIndex + count)) {
> +        const Value *sourceVector = source->getDenseArrayElements() + sourceIndex;
> +        return InitArrayElements(cx, target, targetIndex, count, sourceVector);
> +    } else {
> +        return CopySparseArray(cx, count, source, sourceIndex, target, targetIndex);

Mozilla style is to not have an else block when the if block ends in a return statement.  (Although this becomes irrelevant when this method is inlined into array_splice.)

@@ +2335,5 @@
> +}
> +
> +/* Move the contents of an array to the left, but don't actually change the array's capacity. */
> +static inline bool
> +ShrinkArray(JSContext *cx, uint count, JSObject* arr, uint sourceIndex, uint targetIndex)

Just as with CopyArray, I find ShrinkArray more confusing to read when extracted from array_splice than if it were directly inside it.

@@ +2358,5 @@
> +{
> +    JS_ASSERT(sourceIndex < targetIndex);
> +
> +    if (IsContiguousDenseArray(cx, arr) &&
> +        arr->ensureDenseArrayElements(cx, targetIndex, count) == JSObject::ED_OK) {

If ensureDenseArrayElements returns the ED_FAILED enum (I think), then it has failed, and you should return false immediately, rather than continue on and attempt to treat this like a non-optimizable case.

@@ +2363,5 @@
> +        arr->moveDenseArrayElements(targetIndex, sourceIndex, count);
> +        return true;
> +    }
> +
> +    /* Go backwards to avoid overwriting uncopied parts of the array. */

This comment misleads somewhat.  While this is a reason to go backwards, even if it were somehow magically the case that going forwards didn't overwrite, the spec algorithm requires that we go backwards to properly order side effects like invoking setters on the prototype, calling proxy delete hooks, and so on.  The reason we go backwards is that that's how the spec orders it.  Placing this loop directly in array_splice, as I think we should do for some of these other methods (and have noted above), makes this much clearer.  So this should be inlined in array_splice, too.

@@ +2366,5 @@
> +
> +    /* Go backwards to avoid overwriting uncopied parts of the array. */
> +    for (uint i = count; i > 0; i--)
> +        if (!CopyArrayElement(cx, arr, sourceIndex + i - 1, arr, targetIndex + i - 1))
> +            return false;

This loop needs braces, because its body covers multiple lines.

Another problem with this loop: |sourceIndex + i - 1| and the identical one for |targetIndex| can overflow.  You'll start trashing low elements of the array if you don't terminate the loop before either addition forms a value that's not a uint32_t.  I suggest you have one loop that contains mathy logic to make sure it stops copying at the 2**32-1 boundary.  Then, after it, have a second loop that does identical stuff but computes its indexes with doubles, so as not to lose precision.  It's ugly, sure.  It's also what the spec requires.  :-\

@@ +2376,5 @@
> +DeleteArrayEndSlow(JSContext *cx, JSObject *arr, uint start, uint count)
> +{
> +    JS_ASSERT(!arr->isDenseArray());
> +    JS_ASSERT_IF(arr->isArray(), start + count == arr->getArrayLength());
> +    for (uint i = start; i < start + count; i++)

This order of deletion, from low to high, is wrong.  In the spec it's actually high to low.  This would be observable if someone inserted a proxy along the prototype chain and watched deletion attempts using it.

Also, as with previous methods, I think this is clearer if this method is inlined into array_splice.  Sure, all these inlinings make array_splice longer, and slightly harder to read.  But to be honest, Array.prototype.splice is hard to read in the spec.  That's not changing in the foreseeable future.  Trying to clarify it by changing the implementation only takes one hard-to-read problem (that the spec's obscure) and compounds it with another (that the implementation doesn't obviously match it).  Please just do the spec steps instead here, without extra abstraction.

@@ +2394,5 @@
>  array_splice(JSContext *cx, uintN argc, Value *vp)
>  {
> +    Value *argv = JS_ARGV(cx, vp);
> +
> +    /* Step 1 */

End the Step # comments with periods.

@@ +2395,5 @@
>  {
> +    Value *argv = JS_ARGV(cx, vp);
> +
> +    /* Step 1 */
> +    JSObject *O = ToObject(cx, &vp[1]);

I am sympathetic to naming this |O|, as that's what the spec uses.  At the same time, particularly when we're dealing with indexes, O is extremely easy to confuse with 0.  That goes even if your font distinguishes the two, because you might not be certain of the exact way they were distinguished (slash?  dot?  width alone) and would still end up confused.  (For that matter, I keep getting confused myself just trying to Splinter-review this in a nightly build.)  Please use |obj| instead.  At risk of bikeshedding, I'd probably even agitate for the spec to change name as well, to avoid confusion there (although there, width plus italics make the difference somewhat clearer).

@@ +2408,5 @@
> +
> +    /* Step 5 */
> +    Value start = argc >= 1 ? argv[0] : UndefinedValue();
> +    double relativeStart;
> +    if (!ToInteger(cx, start, &relativeStart))

|start|'s never used again, so just use the assignment expression directly in the ToInteger call.

@@ +2412,5 @@
> +    if (!ToInteger(cx, start, &relativeStart))
> +        return false;
> +
> +    /* Step 6 */
> +    double actualStart;

I believe |actualStart| can actually (pun intended) be a uint32.  The max/min used to initialize it is definitely, per ToInteger and ToUint32, constrained to the set of integers.  The max means it's less than len but not less than 0, and since len is a uint32, so would be the JS_MAX(...).  The min means it's len or a (non-negative) integer less than len, which means it too is a uint32.  So this should be uint32 for efficiency and simplicity, and there should be casts around the results of the overall min and max expressions.

@@ +2419,5 @@
> +    else
> +        actualStart = JS_MIN(relativeStart, len);
> +
> +    /* Step 7: DeleteCount is >=0, but capped by the number of elements after actualStart. */
> +    Value deleteCount = argc >= 2 ? argv[1] : UndefinedValue();

Again I'd get rid of |deleteCount| since it's used only once.

@@ +2420,5 @@
> +        actualStart = JS_MIN(relativeStart, len);
> +
> +    /* Step 7: DeleteCount is >=0, but capped by the number of elements after actualStart. */
> +    Value deleteCount = argc >= 2 ? argv[1] : UndefinedValue();
> +    double actualDeleteCount;

Use a fresh variable for the ToInteger result -- less confusing than giving it the name of a different quantity.

@@ +2435,5 @@
> +    if (!A)
> +        return false;
> +
> +    /* Step 17: return A */
> +    vp->setObject(*A);

I think some of the recent stack-walking code, and some of the recent debugger work, is beginning to dislike early return-value sets like this that blow away the identity of the callee function.  There's no real harm doing this in the actual stepwise location, I think (I don't see any potential early success returns), so let's do that.

@@ +2445,5 @@
> +    /* Step 10 */
> +    Value *items = &argv[2];
> +
> +    /* Step 11 */
> +    double itemCount = (argc >= 2) ? (argc - 2) : 0;

uint32

@@ +2452,5 @@
> +    double finalLength = len - actualDeleteCount + itemCount;
> +
> +    /* Step 12: Remove the deleted items */
> +    if (itemCount < actualDeleteCount) {
> +

Kill the blank line.

@@ +2460,5 @@
> +         */
> +        if (!ShrinkArray(cx, origLength - actualDeleteCount - actualStart, O,
> +                         actualStart + actualDeleteCount,
> +                         actualStart + itemCount))
> +            return false;

This would need to be braced because the if-head takes up more than one line, if it were to remain this way, rather than have ShrinkArray inlined.

@@ +2464,5 @@
> +            return false;
> +
> +
> +        /* Steps 12c-d: Delete remaining elements (non-array case). */
> +        if (!O->isArray())

This if should be braced, because its body takes up more than one line.

But beyond that, if we're going to special-case here for array versus non-array, I think we should take it all the way, and make it |if (!O->isArray()) { ...everything remaining in 12... } else { ...everything remaining in 12... }|.  (I would invert this ordering to make the conditions positive tests -- if is array, else if not.)  The intermingling that happens here is confusing to read.  And the way you've ensured there's only one length-property-set for all cases -- despite there being only one case in the spec -- is more confusing still.

@@ +2471,5 @@
> +                return false;
> +
> +        /* Steps 12c-d: Delete remaining elements (array case). */
> +        /* Step 16: set length. This also shrinks arrays (dense and sparse). */
> +        if (!js_SetLengthProperty(cx, O, finalLength))

I see one problem, if currently a theoretical one, with relying on setting the length to delete properties from the array.  If the length property is non-writable, then attempting to set the length property like this will promptly throw, without deleting any properties from the array.  Of course that's not the case now, because array length can't be made non-writable.  But we should consider it when doing this rewrite, I think, and not rely on currently-wrong behavior.  (We should have a test written for this case, marked as failing, to ensure correct behavior is implemented here when [].length can be made non-writable.)

But I see another problem, a real one, with leaning on length-setting to implement deletion here.  If that range of properties to be deleted contains a non-configurable property, then, because [[Delete]] is passed true for its strict/throw argument, deleting that property will throw a TypeError.  That's all fine, setting the length property will do the same.  But note carefully: deleting properties in step 12d does not affect the length property!  The only way the length property of the object being spliced is changed is if step 16 is reached.  But setting the length property of an array behaves differently per spec -- if it encounters a non-configurable property, it deletes all properties above it and changes the length property to match.  (We don't implement this facet of behavior properly, to be sure, because js_SetLengthProperty doesn't pass along a strict flag.  But we'll change it to be correct pretty soon, I'm sure.)  So setting the length property like this is not equivalent to looping to delete all high properties, then setting the appropriate length.

I'm still pretty sure there's some sort of optimization you can do here for dense arrays, possibly along the length lines -- at least in the case where there are no indexed properties along the prototype chain.  But for non-dense arrays, or for dense arrays when the prototype chain has objects with indexed properties, I think you need to just do the spec steps.

@@ +2478,5 @@
> +        /*
> +         * Fix running enumerators for deleted items (handled in
> +         * js_SetLengthProperty in the sparse case).
> +         */
> +        if (O->isDenseArray())

If this survives the changes mentioned just previously, make sure to brace it, because its body occupies multiple lines.

@@ +2486,5 @@
> +
> +    } else if (itemCount > actualDeleteCount) {
> +
> +        /* The index at the end of what we deleted and at the start of what we're keeping. */
> +        double intersection = actualStart + actualDeleteCount;

This can be a uint32, because the sum of both is definitely less than the length gotten from the array, after conversion to uint32, ergo must be a uint32.  And you'd have even bigger problems just down, if |len - intersection| could underflow.

@@ +2504,5 @@
> +     * Step 16: Length was set early in the shrinking case, to avoid
> +     * duplicating the shrinking effort.
> +     * */
> +    if (itemCount >= actualDeleteCount) {
> +        if (!js_SetLengthProperty(cx, O, finalLength))

I suspect after the refactorings requested above that this should occur unconditionally.  (Maybe.)

Past that, this is going to be treated as a non-strict assignment to length, because js_SetLengthProperty doesn't pass along the |true| that gets passed to [[Put]] in the spec algorithm.  But probably we should fix all such assignments to length to pass the right thing all at once, and not quite so much haphazardly.
Comment 10 Jeff Walden [:Waldo] (remove +bmo to email) 2011-08-17 17:44:54 PDT
Comment on attachment 553859 [details] [diff] [review]
Fix all Fx uses of splice

Review of attachment 553859 [details] [diff] [review]:
-----------------------------------------------------------------

Canceling as the comments on the older r? apply, and at a very quick skim the other changes look like a reasonable move...
Comment 11 Paul Biggar 2011-08-25 18:26:38 PDT
Thanks for the detailed review!

(In reply to Jeff Walden (remove +bmo to email) (gone August 18-September 4) from comment #9)
> ::: js/src/jit-test/tests/basic/bug668024.js
> Please name this file something more descriptive:
> splice-throwing-length-getter.js would be more helpful when skimming file
> listings, and would autocomplete much better.

I agree, but find the bug number useful in tinderbox logs. How do you feel about "splice-throwing-length-getter-668024.js"?


> I keep reading this name and half-forgetting that it also checks the
> prototype chain for potential side effects that might happen during
> deleting, putting, etc.  How about the name
> IsDenseArrayWithUnindexedProtoChain to also that facet of meaning?

Rather than exposing what it does, I want to expose its purpose, which is that we can access this array by reaching directly into memory. So I've renamed this to "canDirectlyAccessDenseArray", but other suggestions welcome.


> @@ +2301,5 @@
> > +    AutoValueRooter fromValue(cx);
> 
> This can just be a Value (with appropriate changes elsewhere).

Ah yeah, it's already rooted by being a property of |source|.



> @@ +2306,5 @@
> > +    if (!GetElement(cx, source, sourceIndex, &hole, fromValue.addr()))
> > +        return false;
> > +
> > +    if (!SetOrDeleteArrayElement(cx, target, targetIndex, hole, fromValue.value()))
> > +        return false;
> 
> You might as well do |return GetElement(...) &&
> SetOrDeleteArrayElement(...)|, doesn't seem the extra whitespace provides a
> benefit in readability.

Personally, I find the former more readable, as its clearly two operations. I know that we have a lot of places where do multiple operations in a single expression (via booleans ops or trinary expressions), but I never find these clear or readable, even though I recognize the idiom now. This is debatable, obviously, so if you insist, I'll change it.



> @@ +2325,5 @@
> > +static inline bool
> > +CopyArray(JSContext *cx, uint count, JSObject *source, uint sourceIndex, JSObject *target, uint targetIndex)
> > +{
> > +    JS_ASSERT (source != target);
> > +    if (IsContiguousDenseArray(cx, source, sourceIndex + count)) {
> 
> |sourceIndex + count| screams integer overflow.  It's not here, by the
> nature of the way sourceIndex and count are computed, but it's a close
> thing.  At least add a JS_ASSERT(sourceIndex + count >= sourceIndex) here,
> although that only catches things for people who are thinking about it. 
> Really it would be better if this method could be made only usable by
> array_splice, but C++ doesn't have nested methods, alas.
> 
> But no, I lie!  This does overflow!  And dangerously -- the isDenseArray
> check I thought prevented badness if overflow happened doesn't save you. 
> It's all due to that |argc == 1| deviation.  Witness:
> 
>   var arr = [1, 2, 3, 4, 5, 6, 7, 8];
>   arr.length = Math.pow(2, 32) - 2;
>   arr.splice(5); // kaboom
>
> You could hack around this problem here with an integer overflow check.

I don't see the "kaboom". In the previous version of this patch, actualDeleteCount was set to |len| rather than |len - actualStart|, which was an overflow, but I thought the version you reviewed had that fixed (esp due to your comments in the next paragraph).

Conceptually, I don't think there can be an overflow (except with the bug I mentioned above). actualDeleteCount + actualStart must be less than the length, unless I misunderstand the invariants in Steps 6 and 7.

In any case, I've added the assertions, and they don't trigger. (I verified it on your test, but I can't add the test to the suite because it iterates through the whole array of 2^32-2 elements).

 
> Looking at the code, I think it would be better to compute actualDeleteCount
> the spec way if |argc >= 2|, and then as |len - actualStart| in the |argc <
> 2| alternative.

I don't follow what you're saying here. I think this is exactly what I am doing.



> But beyond that, I think you should just inline this method into
> array_splice.  I found it difficult to correlate the code for the relevant
> step with the code in this method due to the variable name change and the
> distance from the surrounding code that contextualized it.

Yeah, after reading the rest of your comments on this issue (and especially noting the bugs in the implementation as regards ordering and overflow), I think you're right. Which is a pity :(

After inlining them, I pulled the dense version into it's own path, which I think makes it a lot simpler, and allowed me to leave the non-dense algorithm pretty much as stated in the spec (which answers a lot of your later comments too).

 
> @@ +2358,5 @@
> > +{
> > +    JS_ASSERT(sourceIndex < targetIndex);
> > +
> > +    if (IsContiguousDenseArray(cx, arr) &&
> > +        arr->ensureDenseArrayElements(cx, targetIndex, count) == JSObject::ED_OK) {
> 
> If ensureDenseArrayElements returns the ED_FAILED enum (I think), then it
> has failed, and you should return false immediately, rather than continue on
> and attempt to treat this like a non-optimizable case.

I'm not sure why? (I copied this behaviour from the old implementation). I'm not sure if we can write reasonably write a test for this, unfortunately, since the transition point is at 2^29 elements, which is 4GB of RAM.




> @@ +2408,5 @@
> > +
> > +    /* Step 5 */
> > +    Value start = argc >= 1 ? argv[0] : UndefinedValue();
> > +    double relativeStart;
> > +    if (!ToInteger(cx, start, &relativeStart))
> 
> |start|'s never used again, so just use the assignment expression directly
> in the ToInteger call.

There's mild performance benefits to this change, but at the cost of readability against the spec. Are you sure I should change it?

 
> @@ +2412,5 @@
> > +    if (!ToInteger(cx, start, &relativeStart))
> > +        return false;
> > +
> > +    /* Step 6 */
> > +    double actualStart;
> 
> So this should be uint32 for efficiency and
> simplicity, and there should be casts around the results of the overall min
> and max expressions.

I'm not sure about the casts. What's the reason for them?


> @@ +2435,5 @@
> > +    if (!A)
> > +        return false;
> > +
> > +    /* Step 17: return A */
> > +    vp->setObject(*A);
> 
> I think some of the recent stack-walking code, and some of the recent
> debugger work, is beginning to dislike early return-value sets like this
> that blow away the identity of the callee function.  There's no real harm
> doing this in the actual stepwise location, I think (I don't see any
> potential early success returns), so let's do that.

I did this so that A was rooted. Perhaps I misunderstand when to root things - could this not be GCed if I moved 17 to the end?
Comment 12 Paul Biggar 2011-08-25 18:29:06 PDT
Created attachment 555917 [details] [diff] [review]
With dense arrays separated and sparse arrays specified

This takes into account nearly all your comments from last time, and adds tests for a lot of the edge cases you highlighted.
Comment 13 Jeff Walden [:Waldo] (remove +bmo to email) 2011-09-19 18:15:52 PDT
Comments in response to your comments below, for what they're worth.

I'm not sure -- are you still working on this, or no?  If you are, I'll continue with the review -- if not, it'll be faster for me to make the changes myself.  Let me know either way what to do.

(In reply to Paul Biggar from comment #11)
> I agree, but find the bug number useful in tinderbox logs. How do you feel
> about "splice-throwing-length-getter-668024.js"?

I don't personally find the bug number hugely useful just when skimming test names.  If I find myself interested in a test, I'm usually more interested in the contents of the test.  If those are interesting enough, they'll mention the bug number, and I can go to town from that point.

But hey, if you get value out of this, I don't mind it.


> > I keep reading this name and half-forgetting that it also checks the
> > prototype chain for potential side effects that might happen during
> > deleting, putting, etc.  How about the name
> > IsDenseArrayWithUnindexedProtoChain to also that facet of meaning?
> 
> Rather than exposing what it does, I want to expose its purpose, which is
> that we can access this array by reaching directly into memory. So I've
> renamed this to "canDirectlyAccessDenseArray", but other suggestions welcome.

It's a fair cop.  CanOptimizeForDenseStorage?  That name might survive the sparse/dense storage change, as an added bonus.


> This is debatable, obviously, so if you insist, I'll change it.

It's not that big a deal.  Although, it seems to me there's a tradeoff between the noise of redundant return-falses and such and separation of distinct concerns.


> > @@ +2325,5 @@
> > > +static inline bool
> > > +CopyArray(JSContext *cx, uint count, JSObject *source, uint sourceIndex, JSObject *target, uint targetIndex)
> > > +{
> > > +    JS_ASSERT (source != target);
> > > +    if (IsContiguousDenseArray(cx, source, sourceIndex + count)) {
> > 
> > |sourceIndex + count| screams integer overflow.  It's not here, by the
> > nature of the way sourceIndex and count are computed, but it's a close
> > thing.  At least add a JS_ASSERT(sourceIndex + count >= sourceIndex) here,
> > although that only catches things for people who are thinking about it. 
> > Really it would be better if this method could be made only usable by
> > array_splice, but C++ doesn't have nested methods, alas.
> > 
> > But no, I lie!  This does overflow!  And dangerously -- the isDenseArray
> > check I thought prevented badness if overflow happened doesn't save you. 
> > It's all due to that |argc == 1| deviation.  Witness:
> > 
> >   var arr = [1, 2, 3, 4, 5, 6, 7, 8];
> >   arr.length = Math.pow(2, 32) - 2;
> >   arr.splice(5); // kaboom
> >
> > You could hack around this problem here with an integer overflow check.
> 
> I don't see the "kaboom". In the previous version of this patch,
> actualDeleteCount was set to |len| rather than |len - actualStart|, which
> was an overflow, but I thought the version you reviewed had that fixed (esp
> due to your comments in the next paragraph).

Hm.  I definitely downloaded a patch from this bug, applied it, and reproduced the kaboom.  Maybe it was an earlier patch.  Or something.  *shrug*  As long as the issue's fixed, it's good, not important how it all played out at the time.


> In any case, I've added the assertions, and they don't trigger. (I verified
> it on your test, but I can't add the test to the suite because it iterates
> through the whole array of 2^32-2 elements).

Yeah, sigh.  I hit that problem implementing the dense-storage optimizations for the Array methods Back In The Day, too.


> > Looking at the code, I think it would be better to compute actualDeleteCount
> > the spec way if |argc >= 2|, and then as |len - actualStart| in the |argc <
> > 2| alternative.
> 
> I don't follow what you're saying here. I think this is exactly what I am
> doing.

What I meant was something like this:

jsdouble actualDeleteCount;
if (argc != 1) {
  if (!ToInteger(cx, argv[1], &actualDeleteCount))
    return false;
  actualDeleteCount = JS_MIN(JS_MAX(actualDeleteCount, 0), len - actualStart);
} else {
  /*
   * Non-standard: if start was specified but deleteCount was omitted,
   * delete to the end of the array.
   */
  actualDeleteCount = len - actualStart;
}

This nicely cordons off the non-standard thing into a sibling code path to the standard code, so the reader doesn't have to consider the non-standard code at all if splice is called with typical arity.  (Semantically, of course, the reader doesn't have to very much now, if he's internalized that ToInteger won't have side effects when passed |undefined|.  Yet at the same time, he does have to consider it, in ways he doesn't when the if-guard clearly splits the code in two like this.)


> > @@ +2358,5 @@
> > > +{
> > > +    JS_ASSERT(sourceIndex < targetIndex);
> > > +
> > > +    if (IsContiguousDenseArray(cx, arr) &&
> > > +        arr->ensureDenseArrayElements(cx, targetIndex, count) == JSObject::ED_OK) {
> > 
> > If ensureDenseArrayElements returns the ED_FAILED enum (I think), then it
> > has failed, and you should return false immediately, rather than continue on
> > and attempt to treat this like a non-optimizable case.
> 
> I'm not sure why? (I copied this behaviour from the old implementation).

As I recall, this would happen with OOM when attempting to extend -- a rare case, but a possible one.  I'd generally not bother with testing something like this; I know some people have done testing of OOM like this in the past, but it's been fragile enough to get accurately tested that I don't worry about it.


> > @@ +2408,5 @@
> > > +
> > > +    /* Step 5 */
> > > +    Value start = argc >= 1 ? argv[0] : UndefinedValue();
> > > +    double relativeStart;
> > > +    if (!ToInteger(cx, start, &relativeStart))
> > 
> > |start|'s never used again, so just use the assignment expression directly
> > in the ToInteger call.
> 
> There's mild performance benefits to this change, but at the cost of
> readability against the spec. Are you sure I should change it?

It's a matter of taste, I think.  As long as the code-to-step correspondence is clear, I haven't seen the need for rigorous adherence to use of exact spec names for everything.  Here, just introducing the |start| name doesn't seem to add that much to me, since the very next step for a curious reader would be to determine what it computed.  Working from the computed value to its spec referent seems at least as easy to me.  I can live with this as-is.

As far as perf's concerned, I'd be surprised at the compiler (on platforms we care about) that didn't inline it.  So it's pure aesthetics.


> I'm not sure about the casts. What's the reason for them?

If I remember right, assigning a double to a uint32_t will warn with some compilers, absent those casts.  They seem like perfectly valid warnings, and we can make the (safe) narrowing explicit to silence them.


> I did this so that A was rooted. Perhaps I misunderstand when to root things
> - could this not be GCed if I moved 17 to the end?

The conservative stack scanner will root stack-allocated values, so long as those stack-allocated values are live and might actually be used again.  (And of course if it's used as a return value here, it must live at least that long, so it's fine to wait til the end to store it.)
Comment 14 Jeff Walden [:Waldo] (remove +bmo to email) 2011-09-22 16:39:42 PDT
I split out the fix-up-existing-splice-users changes (so that anyone who used |arr.splice(k)| and expected extensionland behavior now uses |arr.splice(k, arr.length)| instead) into a distinct patch and pushed it to mozilla-inbound:

https://hg.mozilla.org/integration/mozilla-inbound/rev/743e78a1a24d

Actual functional changes, and tests of rewritten splice behavior, will be in separate patches.  I'm working on rebasing the latest patch here through tip now, should have something in the next few days.
Comment 15 Jeff Walden [:Waldo] (remove +bmo to email) 2011-10-03 16:32:18 PDT
Created attachment 564389 [details] [diff] [review]
Slightly updated/tweaked tests

These are the tests from the previous patch here, split out into their own change.  They're mostly as they were, except that I added a couple new tests, and I adjusted some of them so I could understand them more easily.  The changes were all pretty trivial.

I'll land the two together, but until then it seems slightly nice to keep the two separate.
Comment 16 Jeff Walden [:Waldo] (remove +bmo to email) 2011-10-03 17:09:48 PDT
Created attachment 564396 [details] [diff] [review]
Patch to reimplement the splice algorithm

Type inference landing kind of made a hash out of the existing patch, so I spent a bit of time working through the ramifications of it in the fast paths.  I *think* I did it right, but a TI hand is desirable here.

Also, looking at the way the old code worked, I noticed some bugs, so I refactored things a fair bit.  I think the previous patch's single can-optimize test was slightly incomplete with respect to noticing the possible ways things could have to go slow, at all stages of execution of splice, so it was unfortunately a necessary bit of changing.

Note also that I fixed [].splice with respect to the problem noted in bug 690622.  To do this I decided to make the fast-path dense optimizations only apply to packed dense arrays.  It's possible some users really want fast-path behavior even for holey dense arrays, but that seems a bit crazy.  Also possible is that they want it for cases where the initialized length is less than the length.  (That seems somewhat more likely, for things like |new Array(20)|, but it's still unclear.)  If they really need it (for splice!  I am skeptical of such needs where the splicee is holey), we can figure out something else.  But let's not take on unnecessary complexity until we clearly need it.

Last, I fixed a bug in the previous patches, in my additions -- the Array.isArray test in splice-check-steps.js was wrong for the throwing tests.  I extended the test to also accommodate |undefined| as a return value.
Comment 17 Brian Hackett (:bhackett) 2011-10-06 10:31:06 PDT
Comment on attachment 564396 [details] [diff] [review]
Patch to reimplement the splice algorithm

Review of attachment 564396 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/jsarray.cpp
@@ +2714,5 @@
> +    return arr->isDenseArray() &&
> +           arr->isPackedDenseArray() &&
> +           !js_PrototypeHasIndexedProperties(cx, arr) &&
> +           length <= arr->getDenseArrayInitializedLength();
> +}

JSObject::isPackedDenseArray() will be going away soon, it would be better to use the packed information on the type object itself (testing whether all objects of the type are packed).
Comment 18 Jeff Walden [:Waldo] (remove +bmo to email) 2011-10-06 14:53:04 PDT
https://hg.mozilla.org/integration/mozilla-inbound/rev/570ff6efe5ef
Comment 19 Ed Morley [:emorley] 2011-10-07 04:05:35 PDT
https://hg.mozilla.org/mozilla-central/rev/570ff6efe5ef
Comment 20 Guilherme Lima 2011-10-07 06:16:32 PDT
Is it possible that this patch have regressed kraken-crypto-sha256-iterative by 40%? (on AWFY for JM+TI)
Comment 21 Brian Hackett (:bhackett) 2011-10-07 07:40:42 PDT
(In reply to Guilherme Lima from comment #20)
> Is it possible that this patch have regressed kraken-crypto-sha256-iterative
> by 40%? (on AWFY for JM+TI)

Yeah, I think this is likely, the test uses splice.  Jeff, can you look into this?
Comment 22 Christian Holler (:decoder) 2011-10-11 16:48:36 PDT
This patch seems to have regressed the test at js/src/tests/ecma_3/Array/regress-322135-03.js

Running that test manually on 64 bit instantly gives me:

Assertion failure: UINT32_MAX - startingIndex >= count, at jsarray.cpp:2711
Comment 23 Jeff Walden [:Waldo] (remove +bmo to email) 2011-10-11 16:51:04 PDT
Looking now.
Comment 24 David Mandelin [:dmandelin] 2011-10-12 17:00:02 PDT
(In reply to Christian Holler (:decoder) from comment #22)
> This patch seems to have regressed the test at
> js/src/tests/ecma_3/Array/regress-322135-03.js
> 
> Running that test manually on 64 bit instantly gives me:
> 
> Assertion failure: UINT32_MAX - startingIndex >= count, at jsarray.cpp:2711

Can you file a new bug on this?
Comment 25 Christian Holler (:decoder) 2011-10-12 17:24:02 PDT
(In reply to David Mandelin from comment #24)
> Can you file a new bug on this?

Sure, done at bug 694210.

Note You need to log in before you can comment on or make changes to this bug.