Closed Bug 1354554 Opened 5 years ago Closed 5 years ago

Remove iterator result object from the stack before the back edge in for-of loop to optimize more in JIT

Categories

(Core :: JavaScript Engine: JIT, enhancement)

enhancement
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla55
Tracking Status
firefox55 --- fixed

People

(Reporter: arai, Assigned: arai)

References

(Blocks 1 open bug)

Details

Attachments

(2 files, 2 obsolete files)

separated from bug 1354114 comment #2.
Marvelous work. I just tested your try push and the object allocation for {value, done} is correctly eliminated. On a simple benchmark this improves our times from ~500ms to ~310ms.
However now we run into a new problem: The undefined and a[index] value flow in the same phi and this degrades our type information. For example instead of an int32 add, we will now emit a double addition.
Maybe we could immediately leave the loop in the done: true case? Probably this unbalances the stack again.
is it okay to place backedge in the then-branch?
actually, just putting the backedge with JSOP_GOTO inside the then-branch breaks the JIT compilation
it crashes with "unknown goto case" assertion in IonControlFlow.cpp.

I'll try putting forward goto in done:true case
can you check if this fixes the both phi-related issues?
Attachment #8856077 - Flags: feedback?(evilpies)
Attached file testcase
Sadly with this patch the object allocation is back. It seems like the iterator object now shares a stack position with the value at some point? I am not sure if this is even possible to optimize with just bytecode.

Anyway if you are interested, usually I first run this test case with IONFLAGS=escape and --not-threads.

That should give output like
[Escape] check object
newobject = or newarrayiterator =
[Escape] object is not escaped

Afterwards I usually run with with IONFLAGS=logs and use https://github.com/sstangl/iongraph to check that at the end the add is typed Int32.
now the then-branch for done:true removes the RESULT object from the stack by replacing with UNDEFINED

# loop {

...

00159:  ifeq 172 (+13)                  # ITER RESULT
# if {
00164:  jumptarget                      # ITER RESULT
00165:  pop                             # ITER
00166:  undefined                       # ITER undefined
00167:  goto 184 (+17)                  # ITER undefined
# }

# from ifeq @ 00159
00172:  jumptarget                      # ITER RESULT
00173:  getprop "value"                 # ITER VALUE
00178:  false                           # ITER VALUE false
00179:  ifeq 27 (-152)                  # ITER VALUE

# }

# from goto @ 00167
00184:  jumptarget                      # ITER merged<undefined>
00185:  pop                             # ITER
00186:  pop                             #


in "IONFLAGS=escape", I see "[Escape]   Object is not escaped" only,
and in iongraph I see Int32-typed add.
Attachment #8856077 - Attachment is obsolete: true
Attachment #8856077 - Flags: feedback?(evilpies)
Attachment #8856098 - Flags: feedback?(evilpies)
Comment on attachment 8856098 [details] [diff] [review]
Remove iterator result object from the stack before the back edge of for-of loop.

You did it! Very nice. No more object allocation and the add is typed as int32. I also verified that on six-speed we get a string, instead of a value.
Attachment #8856098 - Flags: feedback?(evilpies) → feedback+
The only problem I now see is that we seem to box the value in every iteration now. But we are already much faster.
Maybe Ion would have an easier job optimizing this if we used synthetic local variables.
placed IteratorNext and result.done check at the top of the loop:
  https://treeherder.mozilla.org/#/jobs?repo=try&revision=43099208c9c5c0b0d1c3b45cddc7da72759a2bfd

I kept the loop slots (2) from the previous patch, since using 3 slots and allocate slot for RESULT object doesn't look simplifies things for current situation
(since we should dup, swap, or push-undefined several times)

now the code looks like the following:

  do {
    var result = iter.next();
    if (result.done)
      break;

    try {
      HEAD = result.value;

      BODY;
    } catch (e) {
      IteratorClose(iter);
      throw e;
    }
  } not_while (false);

(|not_while| because I use JSOP_IFEQ. maybe we could just switch to JSOP_IFNE tho...)


btw, is synthetic local variables available?
in that case I would like to try that for some more places
(In reply to Tom Schuster [:evilpie] from comment #11)
> The only problem I now see is that we seem to box the value in every
> iteration now.

is it something we can fix in bytecode, by tweaking branch or stack slots or something?
if not, I'll go with the current patch.
the box sounds like because of the initial undefined for the VALUE slot in the stack.
maybe we should stop using stack slot also for value, or maybe using synthetic local variable?
(or stop initializing the slot with undefined, but push VALUE for first iteration...?)

(if removing VALUE slot from the stack, it might be difficult to balance stack around try-catch tho...
> the box sounds like because of the initial undefined for the VALUE slot in the stack.

this wasn't correct.
using JSOP_ZERO instead of JSOP_UNDEFINED doesn't remove the "undefined" in the graph.
I'm thinking it might be the {value:undefined,done:true}
> I'm thinking it might be the {value:undefined,done:true}

also not correct :P
changing |var result = { value: undefined, done: false };| to |var result = { value: 0, done: false };| in ArrayIteratorNext also doesn't remove the undefined in the graph.


what I'm seeing is

> 1 constant undefined
> ...
> 11 box constant1
> ...
> 18 phi box11 phi88
> ...
> 79 unbox phi75 to Int32 (typebarrier)
> 80 typebarrier unbox79
> ...
> 83 add phi17 typebarrier80 [int32] I[?, ?] (&lt; pow(2, 32+1)) : Int32
> ...
> 85 box typebarrier80
> ...
> 88 phi phi18 box85

there, phi18 and phi88 are not used elsewhere (only mutually referencing)
maybe we should fix that in Ion, to just eliminate unused phis.
I'll go with current patch that doesn't touch the box things here.
https://treeherder.mozilla.org/#/jobs?repo=try&revision=c6037f6277c5fc1a920b2c3c82a9bfc3e6244176
Changed the following:
  * the for-of loop becomes infinite-loop + break on done:true (emitSpecialBreakForDone)
    to avoid having RESULT object in the stack slot
  * the loop depth of for-of becomes 2 (ITER, VALUE), from 3 (ITER, RESULT, VALUE)
  * IteratorNext, IteratorComplete, and IteratorValue are placed at the top of the loop
    (theoretically we can remove the VALUE from the stack slot, but kept to balance the depth around the try-catch)
Attachment #8856098 - Attachment is obsolete: true
Attachment #8856210 - Flags: review?(shu)
(In reply to Tooru Fujisawa [:arai] from comment #13)

> btw, is synthetic local variables available?
> in that case I would like to try that for some more places

While there's no special framework for this yet, we can use the special "dot names" for this, I imagine. If we take care to clear the values from them before using, we wouldn't need a synthetic lexical scope, either. For example, if we parse for-of loops we can declare ".forOfIterValue" and ".forOfIterDone" or whatever, then use them as we would any other binding in BCE.
Comment on attachment 8856210 [details] [diff] [review]
Remove iterator result object from the stack and place IteratorNext at the top of the loop to avoid having unnecessary values at the backedge.

Review of attachment 8856210 [details] [diff] [review]:
-----------------------------------------------------------------

I admire your thoroughness for also dealing with the comprehension case. Nice patch.

::: js/src/frontend/BytecodeEmitter.cpp
@@ +270,5 @@
>      }
>  
> +    MOZ_MUST_USE bool emitSpecialBreakForDone(BytecodeEmitter* bce) {
> +        // This doesn't pop stack values, nor handle any other controls.
> +        // Should be called on the topleve of the loop.

Typo: toplevel
Attachment #8856210 - Flags: review?(shu) → review+
Pushed by arai_a@mac.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/bbbc7f946b47
Remove iterator result object from the stack and place IteratorNext at the top of the loop to avoid having unnecessary values at the backedge. r=shu
https://hg.mozilla.org/mozilla-central/rev/bbbc7f946b47
Status: NEW → RESOLVED
Closed: 5 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla55
You need to log in before you can comment on or make changes to this bug.