Closed Bug 1129313 Opened 6 years ago Closed 4 years ago

ES6 Map iterator next is unusably slow

Categories

(Core :: JavaScript Engine, defect, P3)

38 Branch
x86_64
Windows 8.1
defect

Tracking

()

RESOLVED FIXED

People

(Reporter: kael, Assigned: till)

References

(Blocks 2 open bugs, )

Details

(Keywords: perf)

Attachments

(4 files, 3 obsolete files)

Tried porting JSIL's dictionary and hashset containers to use ES6 map instead of a JS Object. Performance decreased measurably, despite the fact that this lets me avoid the use of Object.keys. Almost all the time spent is spent in the implementation of MapIterator.next.
Source code of MoveNext:

    function MoveNext () {
      var state = this.state;
      var valueIndex = ++(state.valueIndex);

      while (true) {
        var bucket = state.currentBucket;

        if (bucket && (valueIndex >= 0) && (valueIndex < bucket.length)) {
          state.current = bucket[state.valueIndex].key;
          return true;
        } else {
          var i = state.buckets.next();
          if (i.done) {
            return false;          
          }

          state.currentBucket = i.value;
          valueIndex = state.valueIndex = 0;
        }
      }

      return false;
    }

this.state initialization:

      this.state = {
        t: t,
        valueIndex: -1,
        buckets: hashSet.$getBucketIterator(),
        currentBucket: null,
        current: JSIL.DefaultValue(t)
      };

I use the same underlying representation (Map keyed on .NET HashCode, containing lists of items in the bucket) for both Dictionary and HashSet. For Dictionary I iterate the bucket to find a key that exact-matches (since multiple values can share the same HashCode)
Also, in my test adding values to Map is also quite slow but it doesn't seem to be slower for my scenario than Object, so that probably doesn't matter. (It'd be nice if it were faster.) My cost for adding would go down if I didn't have to maintain a separate Array of keys to avoid using iterators.
Whoops. Oh right.

  $.RawMethod(false, "$getBucketIterator", function HashContainer_GetBucketIterator () {
    return this._dict.values();
  });

Sorry I left that out.
Yep, the testcase below takes > 500 ms for us, ~60 ms in V8 :(

We spend a ton of time adding properties under CreateItrResultObject, we can make that faster but I think it'd be better to self-host this and use intrinsics. Then we can create the result object in JS and it should be much faster.

var map = new Map();
for (var i=0; i<1000000; i++)
    map.set(i, i);

function f(m) {
    var res;
    for (var v of m) {
	res = v;
    }
    return res;
}
var t = new Date;
f(map);
print(new Date - t);
Self-hosting it in JS seems like it would help a lot, yeah.

Reusing a single result object would also be a massive improvement, but the spec doesn't allow it. I consider that a defect in the spec.

The only reason I can think of to require each next() to return a new instance is if you want to support end-users storing the result of next(), which is kinda absurd. If they want to store all the elements, they're already stored in the container. (-:

When I tested my code in V8 it wasn't actually much faster - slower than Object likewise. I wonder what they're doing there - the hotspots seemed to be in the same place as SM, but it's way harder to debug since their profiler doesn't instrument across C++ boundaries.
Keywords: perf
(In reply to Jan de Mooij [:jandem] from comment #4)
> We spend a ton of time adding properties under CreateItrResultObject, we can
> make that faster but I think it'd be better to self-host this and use
> intrinsics. Then we can create the result object in JS and it should be much
> faster.

Would self-hosting CreateItrResultObject alone be much faster? ISTM that we'd have to also self-host Map's and Set's `next_impl` functions, no? That seems doable, provided we create an (ideally inlineable) intrinsic for getting the next item's value.

Since we want to do as little as possible in native code, but have to somehow return both a result, the value, and a state, whether iteration is complete, perhaps we can do the latter with a sentinel value? Because the value sometimes is an array, we'll want to allocate that in self-hosted code, too, probably.

Something like:

// Outside the function:
const iteratorDoneSentinel = {};
const iteratorResultValue = [undefined,undefined];

// Roughly, steps 3-12 of %MapIteratorPrototype%.next ( ), with parts in _nextIteratorListItem 
// and parts in the self-hosted function:
let done = _nextIteratorListItem(iterableObj, iteratorResultValue, iteratorDoneSentinel) === iteratorDoneSentinel;
let resultObject = {};
resultObject.done = done;
if (done) {
  resultObject.value = undefined;
} else {
  resultObject.value = [ depending on type of iteration, iteratorResultValue[0], iteratorResultValue[1], or [iteratorResultValue[0], iteratorResultValue[1]] ];
}
return resultObject;

If _nextIteratorListItem can be inlined, and `next` itself can be, too, then that in turn might allow scalar replacement to get rid of the resultObject allocation itself.
(In reply to Till Schneidereit [:till] from comment #6)
> Would self-hosting CreateItrResultObject alone be much faster? ISTM that
> we'd have to also self-host Map's and Set's `next_impl` functions, no?

Agreed, we should self-host next_impl.

Your code looks good. ArrayIteratorNext calls multiple intrinsics, but using one intrinsic is probably faster. For Ion it's easier to inline multiple simple intrinsics, but I doubt we want to inline _nextIteratorListItem (at least for now).

> If _nextIteratorListItem can be inlined, and `next` itself can be, too, then
> that in turn might allow scalar replacement to get rid of the resultObject
> allocation itself.

I don't think _nextIteratorListItem has to be inlined for scalar replacement to work. Scalar replacement worked for for-of + ArrayIteratorNext, so creating the result object like ArrayIteratorNext does is probably a good idea :)
(In reply to Jan de Mooij [:jandem] from comment #7)
> (In reply to Till Schneidereit [:till] from comment #6)
> > Would self-hosting CreateItrResultObject alone be much faster? ISTM that
> > we'd have to also self-host Map's and Set's `next_impl` functions, no?
> 
> Agreed, we should self-host next_impl.
> 
> Your code looks good. ArrayIteratorNext calls multiple intrinsics, but using
> one intrinsic is probably faster. For Ion it's easier to inline multiple
> simple intrinsics, but I doubt we want to inline _nextIteratorListItem (at
> least for now).

I hadn't seen ArrayIteratorNext. IIUC, we'd only need to add intrinsics to get a Map/Set entry or key for an index and to get the number of entries, respectively, to use exactly the same approach, right? Wouldn't inlining those intrinsics be fairly straight-forward? If so, then it seems like that'd be the fastest approach and we should do essentially the same as for Array iterators.

Intrinsics to add for that version:

_GetMapEntryCount(mapObj)
_GetMapEntryByIndex(mapObj, index)
_GetMapKeyByIndex(mapObj, index)
CallMapIteratorMethodIfWrapped(mapObj, funName)

The implementation of next would look something like this:


function MapIteratorNext() {
    if (!IsObject(this) || !IsMapIterator(this)) {
        return callFunction(CallMapIteratorMethodIfWrapped, this,
                            "MapIteratorNext");
    }

    var a = UnsafeGetObjectFromReservedSlot(this, ITERATOR_SLOT_ITERATED_OBJECT);
    var index = UnsafeGetReservedSlot(this, ITERATOR_SLOT_NEXT_INDEX);
    var itemKind = UnsafeGetInt32FromReservedSlot(this, ITERATOR_SLOT_ITEM_KIND);
    var result = {value: undefined, done: false};

    if (index >= _GetMapEntryCount(a)) {
        // When the above is changed to ToLength, use +1/0 here instead
        // of MAX_UINT32.
        UnsafeSetReservedSlot(this, ITERATOR_SLOT_NEXT_INDEX, 0xffffffff);
        result.done = true;
        return result;
    }

    UnsafeSetReservedSlot(this, ITERATOR_SLOT_NEXT_INDEX, index + 1);

    if (itemKind === ITEM_KIND_VALUE) {
        result.value = _GetMapEntryByIndex(a, index);
        return result;
    }

    if (itemKind === ITEM_KIND_KEY_AND_VALUE) {
        var pair = NewDenseArray(2);
        pair[0] = _GetMapKeyByIndex(a, index);
        pair[1] = _GetMapEntryByIndex(a, index);
        result.value = pair;
        return result;
    }

    assert(itemKind === ITEM_KIND_KEY, itemKind);
    result.value = _GetMapKeyByIndex(a, index);
    return result;
}
Attached patch Self-host MapIteratorObject#next (obsolete) — Splinter Review
I took a stab at this. The attached patch works and brings the runtime of the benchmark in comment 4 down from ~400ms to ~75ms on my machine. That's pretty nice, but I don't understand why it doesn't do even better than that. `--ion-scalar-replacement=off` doesn't make any difference to the score whatsoever, so while we're now at least creating the return object in JS, we do in fact still always create it. This is also true for the following modified benchmark where `for .. of` is out of the picture:

var map = new Map();
for (var i=0; i<1000000; i++)
    map.set(i, i);

function f(m) {
    var res;
    var itr = m[Symbol.iterator]();
    do {
    	res = itr.next();
    } while (!res.done);
    return res;
}
var t = new Date;
f(map);
print(new Date - t);

ISTM that this benchmark shouldn't cause any allocations whatsoever with scalar replacement enabled. Jandem, do you see anything obvious that's preventing that?
Assignee: nobody → till
Status: NEW → ASSIGNED
Attachment #8564993 - Flags: feedback?(jdemooij)
IIRC I talked to Nicolas about this, and probably Step 2/3 the callFunction invocation prevents us from optimizing this?
(In reply to Tom Schuster [:evilpie] from comment #10)
> IIRC I talked to Nicolas about this, and probably Step 2/3 the callFunction
> invocation prevents us from optimizing this?

Huh, I don't see how that'd be possible. callFunction doesn't exist in bytecode at all: `callFunction(foo, bar) is compiled to exactly the same thing as `bar.foo()` would be if `foo` was a property of `bar`.

Also, two notes: not returning `res` in the benchmark doesn't make a difference, so that's also not what's causing escape analysis to fail. And I mistakenly reported the score for my modified version of the benchmark as if it were for the original, `for .. of` version. That version takes a bit longer at ~100ms.
I just tested a version of the benchmark that uses a plain object and a `for .. in` loop: if I have it get the actual value instead of just the key, it's slower than jandem's original benchmark even without this patch at ~440ms. If I just use the key, it's about 1.8x as slow as jandem's benchmark with the patch applied at ~178ms.

var map = {};
for (var i=0; i<1000000; i++)
	map[i] = i;

function f(m) {
    var res;
    for (var e in m)
    	res = e; // Or res = m[e] to use the value instead of the key.
    return res;
}
var t = new Date;
print(f(map));
print(new Date - t);

A version using Object.keys() is either also much slower at ~335ms when retrieving values, or just slightly faster for only using keys at ~71ms. The difference isn't huge anymore, though.

var map = {};
for (var i=0; i<1000000; i++)
	map[i] = i;

function f(m) {
    var res;
    var keys = Object.keys(m);
    for (var i = keys.length - 1; i >= 0; i--) {
    	res = keys[i]; // Or res = m[keys[i]] to use the value instead of the key.
    };
    return res;
}
var t = new Date;
print(f(map));
print(new Date - t);

Katelyn, you said that the Object.keys() version was much faster. Is that still true, or does this patch make up the difference completely?
Flags: needinfo?(kg)
I don't have a build environment right now so I probably can't test the patch right away. Is it possible to get a try build?
Comment on attachment 8564993 [details] [diff] [review]
Self-host MapIteratorObject#next

Review of attachment 8564993 [details] [diff] [review]:
-----------------------------------------------------------------

I think scalar replacement doesn't work here (IONFLAGS=escape has some useful info) due to the phi we add for the return value, when we inline MapIteratorNext. So there's a phi with 3 operands, one for each return in MapIteratorNext... Ideally there'd be only one "return {value: v, done: done};" statement.

The first return may be hard to avoid though. I wonder if V8 and JSC can handle such cases...

Even without scalar replacement, it's a great win :)
Attachment #8564993 - Flags: feedback?(jdemooij) → feedback+
So even with a single return statement, there's still a problem because we add a post barrier for the object literal initialization, because "result" can be an object. This is silly: if we can get rid of the object allocation, we should just eliminate the post barrier...
You can test that by commenting out these lines in IonBuilder::jsop_initprop:

    if (NeedsPostBarrier(info(), value))
        current->add(MPostWriteBarrier::New(alloc(), obj, value));

Let me know if you can get it working without these lines. Locally the debug spew indicates it works with for-of if I comment out the "return callFunction(...)" and use a single return at the end...
(In reply to Jan de Mooij [:jandem] from comment #16)
> You can test that by commenting out these lines in IonBuilder::jsop_initprop:
> 
>     if (NeedsPostBarrier(info(), value))
>         current->add(MPostWriteBarrier::New(alloc(), obj, value));
> 
> Let me know if you can get it working without these lines. Locally the debug
> spew indicates it works with for-of if I comment out the "return
> callFunction(...)" and use a single return at the end...

Hmm, I must be doing something wrong, then: even with the `callFunction` removed and the change you describe above done, I still get a the write barrier:

[Escape] Object newobject234 = newobject constant233
resumepoint mode=After (caller in block3) constant202 constant203 phi29 phi29 phi232 constant204 constant204 constant204 constant204 newobject234
  at self-hosted:2337
  in quick.js:7
[Escape]   is escaped by
postwritebarrier335 = postwritebarrier newobject234 phi318
  after self-hosted:2350
  in quick.js:7

Also, I don't see a way to eliminate the second return. We can change things to have the check in MapIteratorNext and the real implementation in a new function MapIteratorNext_impl, but in a quick test that didn't make a difference (nor do I see how it would).
Attachment #8564993 - Attachment is obsolete: true
Attachment #8565430 - Flags: feedback?(jdemooij)
(In reply to Till Schneidereit [:till] from comment #17)
> [Escape]   is escaped by
> postwritebarrier335 = postwritebarrier newobject234 phi318

A simple way is to white-list the MPostWriteBarrier in the Scalar Replacement ""escape"" analysis, and to remove it when we remove all the load & stores on the object.
(In reply to Till Schneidereit [:till] from comment #8)
> The implementation of next would look something like this:
> 
> 
> function MapIteratorNext() {
>     if (!IsObject(this) || !IsMapIterator(this)) {
>         return callFunction(CallMapIteratorMethodIfWrapped, this,
>                             "MapIteratorNext");
>     }
> 
>     […]
>     var result = {value: undefined, done: false};
>     […]
>     return result;
> }

Note that this would prevent optimization from Scalar Replacement, as Scalar Replacement is (so-far) only capable to remove an object allocation if there is no Phi after its creation & before its uses.

Having a different branch able to create and return another object for the "next" function, will prevent any scalar replacement optimization.  The reason being that when a function is inlined, we add a join block where the returned value of the function is expressed by a Phi instruction.
(In reply to Nicolas B. Pierron [:nbp] from comment #19)
> Note that this would prevent optimization from Scalar Replacement, as Scalar
> Replacement is (so-far) only capable to remove an object allocation if there
> is no Phi after its creation & before its uses.
> 
> Having a different branch able to create and return another object for the
> "next" function, will prevent any scalar replacement optimization.  The
> reason being that when a function is inlined, we add a join block where the
> returned value of the function is expressed by a Phi instruction.

See also jandem's feedback in comment 14 and comment 15. I tried two different approaches: one where I create an object per return statement, and one where I create a single object to be returned and modify it based on the desired result. Either one of those has to happen, obviously. Please advise which one is better.

Also, can you comment on the multiple returns thing? I think there's no way to have less than two of those, because we have to be able to handle wrapped objects somehow.
Flags: needinfo?(nicolas.b.pierron)
Sorry for making the patch, but I needed it for testing Bug 1112158 in a
funny way.
Attachment #8566084 - Flags: review?(till)
Comment on attachment 8566084 [details] [diff] [review]
Scalar Replacement: Remove PostWriteBarrier at the same time as the stores.

Review of attachment 8566084 [details] [diff] [review]:
-----------------------------------------------------------------

While I *think* I understand this well enough to review it. However, there's a substantial likelihood of that just being the Dunning-Kruger effect in action, so I'll defer to someone definitely more knowledgeable.
Attachment #8566084 - Flags: review?(till) → review?(hv1989)
(In reply to Till Schneidereit [:till] from comment #11)
> (In reply to Tom Schuster [:evilpie] from comment #10)
> > IIRC I talked to Nicolas about this, and probably Step 2/3 the callFunction
> > invocation prevents us from optimizing this?
> 
> Huh, I don't see how that'd be possible. callFunction doesn't exist in
> bytecode at all: `callFunction(foo, bar) is compiled to exactly the same
> thing as `bar.foo()` would be if `foo` was a property of `bar`.
> 
> Also, two notes: not returning `res` in the benchmark doesn't make a
> difference, so that's also not what's causing escape analysis to fail. And I
> mistakenly reported the score for my modified version of the benchmark as if
> it were for the original, `for .. of` version. That version takes a bit
> longer at ~100ms.

Except for the PostWriteBarrier issue, the problem is that the branch is not removed before GVN, and Scalar Replacement runs before GVN to benefit to the Apply types phase which can use refined types compared to boxing everything under Values.

As the first branch is not removed at the time of the scalar replacement, we cannot perform the optimization.  We could do it if we had a single visited-bit from baseline to prune the branch directly from IonBuilder.

Having one or multiple return statement does not matter as long as they are returning the same object allocation.  If they are not returning the same object allocation, then we are unable to do any scalar replacement as we do not know which template object is used for the allocation of the object.  This issue can be solved with better compiler transformations, such as Scalar Promotion (Bug 1044202) with the support for Recovered-Phi.

In the mean time, if this is spec-compliant, I suggest to do the following modification:


function MapIteratorNext() {
    var result = {value: undefined, done: false};

    […]
    if (!IsObject(this) || !IsMapIterator(this)) {
        var tmp = callFunction(CallMapIteratorMethodIfWrapped, this,
                               "MapIteratorNext");
        result.value = tmp.value;
        result.done = tmp.done;
        return result;
    }

    […]
    return result;
}


If the previous one is not spec compliant and that we do not care about the performance of this branch, I guess we could do:


function MapIteratorNext() {
    if (!IsObject(this) || !IsMapIterator(this)) {
        try { throw 0; } catch (zero) { // unlikely & slow
            return callFunction(CallMapIteratorMethodIfWrapped, this,
                                "MapIteratorNext");
        }
    }

    var result = {value: undefined, done: false};
    […]
    return result;
}


as the throw statement end the control flow, and Ion does not compile any catch statements.  Which means that all return statement seen by IonMonkey will use the same allocation.  On the other hand, this means that using this iterator on any other object would cause the code to live in baseline after ~20 bailouts.
Flags: needinfo?(nicolas.b.pierron)
Comment on attachment 8565430 [details] [diff] [review]
Self-host MapIteratorObject#next, v2

Clearing feedback as nbp already replied.
Attachment #8565430 - Flags: feedback?(jdemooij)
Comment on attachment 8566084 [details] [diff] [review]
Scalar Replacement: Remove PostWriteBarrier at the same time as the stores.

Review of attachment 8566084 [details] [diff] [review]:
-----------------------------------------------------------------

lgtm

::: js/src/jit/ScalarReplacement.cpp
@@ +459,5 @@
> +ObjectMemoryView::visitPostWriteBarrier(MPostWriteBarrier *ins)
> +{
> +    // Skip loads made on other objects.
> +    if (ins->object() != obj_)
> +        return;

For documentation purpose:

I first thought this could be incorrect by having a MGuardShape.
MPostWriteBarrier (MGuardShape MNewObject)
So object() would point to the MGuardShape instead of the MNewObject.

But the logic goes like this: MGuardShape gets visited first and all uses of the GuardShape are replaced by the object. So when we start visiting MPostWriteBarrier it will indeed point to MNewObject.
Attachment #8566084 - Flags: review?(hv1989) → review+
(In reply to Nicolas B. Pierron [:nbp] from comment #23)
> In the mean time, if this is spec-compliant, I suggest to do the following
> modification:
> 
> function MapIteratorNext() {
>     var result = {value: undefined, done: false};
> 
>     […]
>     if (!IsObject(this) || !IsMapIterator(this)) {
>         var tmp = callFunction(CallMapIteratorMethodIfWrapped, this,
>                                "MapIteratorNext");
>         result.value = tmp.value;
>         result.done = tmp.done;
>         return result;
>     }
> 
>     […]
>     return result;
> }

I may be skimming too cursorily, but I don't believe this is compliant.  With this implementation, if I did Map.prototype[Symbol.iterator]().next.call(new otherGlobal.Map()), I'd get a value instanceof Object.  But really it would be expected to be instanceof otherGlobal.Object.  Sorry.  :-(
Comment on attachment 8566084 [details] [diff] [review]
Scalar Replacement: Remove PostWriteBarrier at the same time as the stores.

https://hg.mozilla.org/integration/mozilla-inbound/rev/2a0481539f38
Attachment #8566084 - Flags: checkin+
Flagging as leave-open for till, as the Scalar Replacement part was not enough.
Keywords: leave-open
Depends on: 1165348
Depends on: 1166711
No longer depends on: 1166711
Till, So far scalar replacement should work again, without any of the work-around of comment 23.  Should we pursue with landing the self-hosted MapIteratorObject#next patch?
Flags: needinfo?(till)
(In reply to Nicolas B. Pierron [:nbp] from comment #30)
> Till, So far scalar replacement should work again, without any of the
> work-around of comment 23.  Should we pursue with landing the self-hosted
> MapIteratorObject#next patch?

I just tested this again, and the Ion spew does look quite a bit better: the object isn't escaped anymore, though the array still is. However, the benchmark results are the same as before, which is a bit puzzling. Maybe the object allocations just don't matter that much because they're never tenured and nursery allocation is really fast?

Anyway, we should get this in, so I'm requesting review (from jandem because he saw an earlier version of the patch already).
Flags: needinfo?(till)
Attachment #8622389 - Flags: review?(jdemooij)
(In reply to Till Schneidereit [:till] from comment #31)
> I just tested this again, and the Ion spew does look quite a bit better: the
> object isn't escaped anymore, though the array still is. However, the
> benchmark results are the same as before, which is a bit puzzling. Maybe the
> object allocations just don't matter that much because they're never tenured
> and nursery allocation is really fast?

Actually the object escaping in jandem's benchmark from comment 4 makes sense: iterator.next isn't inlined into the test function because that isn't ever compiled, so the object escapes. Modified version that changes that:

var map = new Map();
for (var i=0; i<100; i++)
    map.set(i, i);

function f(m) {
    var res;
    var values = m;
    for (var i = 10000; i--;) {
        for (var v of m) {
            res = v[0];
        }
    }
    return res;
}
var t = new Date;
f(map);
print(new Date - t);

For this version, --ion-scalar-replacement=off does cost ~10% performance. According to the attached spew log, the array is escaped by the for .. of loop itself, no matter what the containing function does with it. nbp, is this something we could potentially improve?
Attachment #8565430 - Attachment is obsolete: true
Flags: needinfo?(nicolas.b.pierron)
(In reply to Till Schneidereit [:till] from comment #32)
> According to the attached spew log, the array is escaped by the for .. of
> loop itself, no matter what the containing function does with it. nbp, is
> this something we could potentially improve?

The problem reported by this message

[Escape]   is escaped by
phi225 = phi unbox183 newarray198

is that we have one allocation has a life-span which is blend with some other paths, which in the current case is not even type-consistent.

> // Step 6.
> var itemKind = UnsafeGetInt32FromReservedSlot(this, ITERATOR_SLOT_ITEM_KIND);
> 
> var result;
> // Step 10.d.i.
> if (itemKind === ITEM_KIND_KEY) {
>     result = mapIterationResultPair[0];
> // Step 10.d.ii.
> } else if (itemKind === ITEM_KIND_VALUE) {
>     result = mapIterationResultPair[1];
> // Step 10.d.iii.
> } else {
>     assert(itemKind === ITEM_KIND_KEY_AND_VALUE, itemKind);
>     result = [mapIterationResultPair[0], mapIterationResultPair[1]];
> }

Which means that result is mixed with values and arrays.

To get rid of this "Phi" node, I think the best way would be to make the Jit be aware on the fact the ITERAOR_SLOT_ITEM_KIND slot can be considered as a constant property of the iterator.

A potential work-around would be if the "next" function was specialized ahead of time based on the item-kind, instead of testing the item kind at each iteration.

Still, the same problem will appear with 

> // Step 10.a, 11.
> var done = _GetNextMapEntryForIterator(O, mapIterationResultPair);
> if (!done) {
>     …
>     } else {
>         assert(itemKind === ITEM_KIND_KEY_AND_VALUE, itemKind);
>         result = [mapIterationResultPair[0], mapIterationResultPair[1]];
>     }
>
>     retVal.value = result;
>     …
> }
> // retVal.result is either undefined or […, …]
> 
> // Steps 7,12.
> return retVal;

This problem also exists with for-of on Arrays, and it cause us to be unable to hoist the bound check introduce with the array access.

To do fix the issue in this case, we would have to duplicate basic blocks based on known conditions, and then have scalar replacement replace the array allocation.

Scalar Replacement never handled such case, so I suggest we open separate & specific issues against IonMonkey.
Both optimizations should lead to improvements on misc-basic-array-forof [1].

[1] http://arewefastyet.com/#machine=28&view=single&suite=misc&subtest=basic-array-forof
Flags: needinfo?(nicolas.b.pierron)
Comment on attachment 8622389 [details] [diff] [review]
Part 2: self-host MapIteratorObject#next()

Review of attachment 8622389 [details] [diff] [review]:
-----------------------------------------------------------------

Sorry for the delay! Looks great, but some questions below.

::: js/src/builtin/Map.js
@@ +67,5 @@
> +        // Step 10.d.i.
> +        if (itemKind === ITEM_KIND_KEY) {
> +            result = mapIterationResultPair[0];
> +        // Step 10.d.ii.
> +        } else if (itemKind === ITEM_KIND_VALUE) {

Nit: the location and indentation of this comment seems a bit strange. It might be clearer to move it inside the if-branch or at the end of the "if (..)" line?

@@ +69,5 @@
> +            result = mapIterationResultPair[0];
> +        // Step 10.d.ii.
> +        } else if (itemKind === ITEM_KIND_VALUE) {
> +            result = mapIterationResultPair[1];
> +        // Step 10.d.iii.

Same here.

@@ +81,5 @@
> +        retVal.value = result;
> +        retVal.done = false;
> +    }
> +
> +    // Steps 7,12.

Nit: space after ','

::: js/src/builtin/MapObject.h
@@ +156,5 @@
> +  public:
> +    static const Class class_;
> +
> +    enum { TargetSlot, RangeSlot, KindSlot, SlotCount };
> +#ifdef DEBUG

Since static_assert has no runtime overhead, I'd prefer removing the #ifdef.

::: js/src/builtin/String.js
@@ +188,5 @@
>      }
>      return T;
>  }
>  
> +#define ITERATOR_SLOT_ITERATED_OBJECT 0

Hm this one is also defined in SelfHostingDefines.h It also seems a bit strange to use "OBJECT" here even though it's a string. I think it'd be nicer to keep the different ARRAY_ITERATOR_*, STRING_ITERATOR_* and MAP_ITERATOR_* values, to make it easier to change one of them later.
Attachment #8622389 - Flags: review?(jdemooij)
Thanks for the review!

> Nit: the location and indentation of this comment seems a bit strange. It
> might be clearer to move it inside the if-branch or at the end of the "if
> (..)" line?

Yeah, this is somewhat annoying, and I haven't fully made up my mind on where the comments should go for cases like this. I moved them as you suggested.

> Since static_assert has no runtime overhead, I'd prefer removing the #ifdef.

I had that to only include the SelfHostingDefines header in debug builds. Since that's slightly silly, I removed it.

> > +#define ITERATOR_SLOT_ITERATED_OBJECT 0
> 
> Hm this one is also defined in SelfHostingDefines.h

Oops, forgot to remove that, fixed.

> It also seems a bit
> strange to use "OBJECT" here even though it's a string. I think it'd be
> nicer to keep the different ARRAY_ITERATOR_*, STRING_ITERATOR_* and
> MAP_ITERATOR_* values, to make it easier to change one of them later.

Right, the "OBJECT" part is strange. I'm not fond of duplicating all of the defines, though, so I chose a middle ground: renamed ITERATOR_SLOT_ITERATED_OBJECT to ITERATOR_SLOT_TARGET, and duplicated ITERATOR_SLOT_RANGE_OR_NEXT_INDEX into ITERATOR_SLOT_RANGE and ITERATOR_SLOT_NEXT_INDEX. WDYT?

BTW, we're at ~85ms now, with jsc at ~73ms and d8 at ~56ms. --ion-scalar-replacement=off still doesn't make a difference, so perhaps fixing that would get us to parity?
Attachment #8622389 - Attachment is obsolete: true
Attachment #8634692 - Flags: review?(jdemooij)
(In reply to Till Schneidereit [:till] from comment #35)
> BTW, we're at ~85ms now, with jsc at ~73ms and d8 at ~56ms.
> --ion-scalar-replacement=off still doesn't make a difference, so perhaps
> fixing that would get us to parity?

For the moment, the patches which were made to make scalar replacement works well with iterators got backed out, because of too many crashes on the web (including jquery).
Comment on attachment 8634692 [details] [diff] [review]
Part 2: self-host MapIteratorObject#next(), v2

Review of attachment 8634692 [details] [diff] [review]:
-----------------------------------------------------------------

Sorry for the delay!

::: js/src/builtin/MapObject.h
@@ +88,5 @@
>  
>  class MapObject : public NativeObject {
>    public:
>      enum IteratorKind { Keys, Values, Entries };
> +#ifdef DEBUG

Nit: can remove the #ifdef DEBUG here.

::: js/src/vm/SelfHosting.cpp
@@ +1289,5 @@
>      JS_FN("std_Math_log2",                       math_log2,                    1,0),
>  
>      JS_FN("std_Map_has",                         MapObject::has,               1,0),
>      JS_FN("std_Map_iterator",                    MapObject::entries,           0,0),
> +    JS_FN("std_Map_iterator",                    MapObject::entries,           0,0),

Can remove this line; it's a copy of the previous one.
Attachment #8634692 - Flags: review?(jdemooij) → review+
url:        https://hg.mozilla.org/integration/mozilla-inbound/rev/51d2109c72dcb27394e043c0390bdc982c2771de
changeset:  51d2109c72dcb27394e043c0390bdc982c2771de
user:       Till Schneidereit <till@tillschneidereit.net>
date:       Sat Aug 01 00:13:26 2015 +0200
description:
Bug 1129313 - Part 2: self-host MapIteratorObject#next(). r=jandem
Is the leave-open still needed, i.e. is the bug fully fixed?
Flags: needinfo?(till)
We're still half as fast as Chrome here, so yes, still relevant.

nbp, is this still blocked on scalar replacement, and if so, can we do something about it?
Flags: needinfo?(till) → needinfo?(nicolas.b.pierron)
This would be fixed by Bug 1229813 [1], as soon as all the huge performance cliff are fixed.

Bug 1165348 was a large source of issues in the compiler.  I preferred to back it out because the risk was way too high from my point of view.  Scalar Replacement is designed on the idea that we can rely on the Apply Type phase for fixing up the types.  The problem is not much about fixing up the types, but more about having GVN capable of running on the result of Scalar Replacement, as it is today.  Thus I added a backward goto to re-run GVN, and this is where the complexity stands, as the rest of the compiler was not made for re-entry.

[1] https://arewefastyet.com/#machine=28&view=single&suite=misc&subtest=basic-array-forof&start=1450843547&end=1450925950
Depends on: 1229813
No longer depends on: 1165348
Flags: needinfo?(nicolas.b.pierron)
Depends on: 1248289
Flags: needinfo?(kg)
Is there anything left to be done here?
No! We're much faster on all benchmarks in this bug.

Numbers:

SM ~current m-c / d8 5.1.281.47 / jsc current Safari
Comment 4
19ms / 63ms / 112ms

Comment 9
25ms / 50ms / 110ms

Comment 32
25ms / 67ms / 107ms


For comparison, here are the current numbers for the plain object-using benchmarks from comment 12:
Comment 12 a
172ms / 132ms / 394ms

Comment 12 b
83ms / 105ms / 604ms


Oh, and as of tomorrow's Nightly, we should get roughly the same numbers for Set iterators, because bug 1288541 just landed.
Status: ASSIGNED → RESOLVED
Closed: 4 years ago
Resolution: --- → FIXED
Blocks: jsperf
Priority: -- → P3
Removing leave-open keyword from resolved bugs, per :sylvestre.
Keywords: leave-open
You need to log in before you can comment on or make changes to this bug.