Bug 889456 (Closed): Opened 11 years ago, Closed 10 years ago

BaselineCompiler: Compile JSOP_INITELEM_INC and JSOP_SPREAD

Category: Core :: JavaScript Engine (defect, severity: normal)
Status: RESOLVED FIXED
Target Milestone: mozilla32
Reporter: jandem
Assignee: arai
Attachments: 1 file, 2 obsolete files

We need to compile JSOP_INITELEM_INC, JSOP_SPREAD, JSOP_SPREADCALL, etc. (bug 762363).
Spread calls are useful for self-hosting fun.bind.
Blocks: 1000780
Implemented JSOP_INITELEM_INC and JSOP_SPREAD in the Baseline Compiler for now.
I'll implement the rest of the opcodes later.
Before that, could you review this patch and tell me if it has any problems?


jstests and jit-tests (--tbpl) pass on Mac OS X 10.9.2 64-bit.
The performance comparison follows:

[opt build]

code:
  let s = elapsed(); for (let i = 0; i < 1000000; i ++) [1, 2, ...[3, 4, 5]]; print(elapsed() - s);

result:
  original --baseline-eager: 716018
  patched  --baseline-eager: 418642
  original --no-baseline:    694965
  patched  --no-baseline:    705402

[debug build]

code:
  let s = elapsed(); for (let i = 0; i < 100000; i ++) [1, 2, ...[3, 4, 5]]; print(elapsed() - s);

result:
  original --baseline-eager: 1806429
  patched  --baseline-eager:  809671
  original --no-baseline:    1776485
  patched  --no-baseline:    1826102
Attachment #8412389 - Flags: review?(jdemooij)
Comment on attachment 8412389 [details] [diff] [review]
Part1: Implement JSOP_INITELEM_INC and JSOP_SPREAD in the baseline compiler.

Review of attachment 8412389 [details] [diff] [review]:
-----------------------------------------------------------------

Thanks for the patch, looks great! I'll take another look once the comments below are addressed, but should be good then. Please also check if this affects your perf numbers :)

::: js/src/jit/BaselineCompiler.cpp
@@ +2367,5 @@
> +    // Pop the rhs
> +    frame.pop();
> +
> +    // Increment index
> +    masm.loadValue(frame.addressOfStackValue(frame.peek(-1)), R0);

Instead of a pretty slow VM call just to increment the index, you can emit code to increment the index inline:

Address indexAddr = frame.addressOfStackValue(frame.peek(-1));
masm.incrementInt32Value(indexAddr);

Then you have to implement incrementInt32Value in the x86/x64/ARM macro-assemblers (jit/x86/MacroAssembler-x86.h etc)

For x86 and ARM it's like this:

void incrementInt32Value(const Address &addr) {
    add32(Imm32(1), payloadOf(addr));
}

On x64 the boxing format is different, so you need something like this:

void incrementInt32Value(const Address &addr) {
    addPtr(Imm32(1), addr);
}
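
For what it's worth, the reason the two versions differ comes down to the Value boxing layouts; roughly (a sketch of the layouts, not copied from the headers):

    // x86/ARM (NUNBOX32): a Value is two 32-bit words, payload and tag, so we
    // bump only the payload word:
    //     add32(Imm32(1), payloadOf(addr));
    //
    // x64 (PUNBOX64): a Value is a single 64-bit word with the int32 payload
    // in the low 32 bits and the tag in the upper bits, so adding 1 to the
    // whole word only touches the payload, as long as the index stays far
    // below INT32_MAX (which it does for an array literal's element index):
    //     addPtr(Imm32(1), addr);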

@@ +2385,5 @@
> +
> +typedef bool (*SpreadFn)(JSContext *, HandleObject, HandleValue,
> +                         HandleValue, MutableHandleValue);
> +static const VMFunction SpreadInfo =
> +    FunctionInfo<SpreadFn>(js::SpreadOperation);

Nit: this fits on 1 line (SpiderMonkey uses 99 chars per line for code, 80 for comments).
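
For example, the same declaration joined onto one line (well within 99 columns):

    static const VMFunction SpreadInfo = FunctionInfo<SpreadFn>(js::SpreadOperation);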

@@ +2404,5 @@
> +    masm.extractObject(frame.addressOfStackValue(frame.peek(-3)), R0.scratchReg());
> +    pushArg(R0.scratchReg());
> +
> +    if (!callVM(SpreadInfo))
> +        return false;

Returning the updated count, clever :) I was wondering about incrementing the index in JIT code, but this seems fine and might be easier to reuse for Ion later on.

::: js/src/vm/Interpreter.cpp
@@ +3955,5 @@
> +{
> +    int32_t count = countVal.toInt32();
> +    ForOfIterator iter(cx);
> +    RootedValue iterVal(cx);
> +    iterVal.set(iterable);

RootedValue iterVal(cx, iterable); should work I think.
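
For reference, putting these pieces together, the VM function ends up shaped roughly like this. This is only a sketch based on the signature and the fragments quoted above; the parameter names and the DefineArrayElement helper are placeholders, not the patch's actual code:

    static bool
    SpreadOperation(JSContext *cx, HandleObject arr, HandleValue countVal,
                    HandleValue iterable, MutableHandleValue resultCountVal)
    {
        int32_t count = countVal.toInt32();
        ForOfIterator iter(cx);
        RootedValue iterVal(cx, iterable);
        if (!iter.init(iterVal))
            return false;

        RootedValue val(cx);
        while (true) {
            bool done;
            if (!iter.next(&val, &done))
                return false;
            if (done)
                break;
            // Placeholder for the patch's element-definition call, which
            // defines arr[count] = val as an enumerable property.
            if (!DefineArrayElement(cx, arr, count, val))
                return false;
            count++;
        }

        resultCountVal.setInt32(count);
        return true;
    }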
Attachment #8412389 - Flags: review?(jdemooij)
Thank you for reviewing!
Here is the updated patch.

> Instead of a pretty slow VM call just to increment the index, you can emit
> code to increment the index inline:
Wow, that's what I was looking for :)
Thank you for letting me know that.


jstests and jit-tests (--tbpl) pass on Mac OS X 10.9.2 64-bit.
I'd like to leave testing on other architectures to the try server; is that okay?


The performance comparison follows:

[opt build]

code:
  let s = elapsed(); for (let i = 0; i < 1000000; i ++) [1, 2, ...[3, 4, 5]]; print(elapsed() - s);
result:
  original --baseline-eager: 704647
  patched  --baseline-eager: 407070
  original --no-baseline   : 685092
  patched  --no-baseline   : 680039

[debug build]

code:
  let s = elapsed(); for (let i = 0; i < 100000; i ++) [1, 2, ...[3, 4, 5]]; print(elapsed() - s);

result:
  original --baseline-eager: 1776636
  patched  --baseline-eager:  769392
  original --no-baseline   : 1745850
  patched  --no-baseline   : 1744289


A single JSOP_SPREAD takes much more time than a single JSOP_INITELEM_INC,
so here is another comparison focused on JSOP_INITELEM_INC:

[opt build]

code:
  let s = elapsed(); for (let i = 0; i < 1000000; i ++) [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, ...[]]; print(elapsed() - s);
result:
  original       --baseline-eager: 1243559
  previous patch --baseline-eager:  465825
  patched        --baseline-eager:  344661

[debug build]

code:
  let s = elapsed(); for (let i = 0; i < 20000; i ++) [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, ...[]]; print(elapsed() - s);
result:
  original       --baseline-eager: 583111
  previous patch --baseline-eager:  87594
  patched        --baseline-eager:  74621

Yay, replacing the VM call with inline code makes it faster :)
Attachment #8412389 - Attachment is obsolete: true
Attachment #8412923 - Flags: review?(jdemooij)
(In reply to Tooru Fujisawa [:arai] from comment #4)
> [opt build]
> 
> code:
>   let s = elapsed(); for (let i = 0; i < 1000000; i ++) [1, 2, 3, 4, 5, 6,
> 7, 8, 9, 10, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, ...[]]; print(elapsed() - s);
> result:
>   original       --baseline-eager: 1243559
>   previous patch --baseline-eager:  465825
>   patched        --baseline-eager:  344661
> 
> [debug build]
> 
> code:
>   let s = elapsed(); for (let i = 0; i < 20000; i ++) [1, 2, 3, 4, 5, 6, 7,
> 8, 9, 10, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, ...[]]; print(elapsed() - s);
> result:
>   original       --baseline-eager: 583111
>   previous patch --baseline-eager:  87594
>   patched        --baseline-eager:  74621

Are you sure that these numbers aren't switched around? The debug build being a lot faster than the opt build seems highly unlikely. Assuming that they are indeed switched around (and in general), numbers for debug builds are only interesting in pathological cases, where something gets hugely slower in debug builds, and might heavily affect the run time of our test suites. Otherwise, you should only ever compare numbers for opt builds.
Note that the debug testcase uses fewer iterations :) That got me at first as well.

Per-iteration numbers would be:

opt:
  original:  1.243559
  old patch: 0.465825
  new patch: 0.344661

debug:
  original:  29.15555
  old patch:  4.37970
  new patch:  3.73105

... which seems a lot more reasonable :)
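(These per-iteration numbers are just the totals divided by the iteration counts, e.g. 1243559 / 1000000 ≈ 1.24 for the original opt run and 583111 / 20000 ≈ 29.16 for the original debug run.)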
(In reply to Emanuel Hoogeveen [:ehoogeveen] from comment #6)
> Note that the debug testcase uses fewer iterations :) That got me at first
> as well.

Ah! Sorry for the noise in that case. :)

Also: don't do perf comparisons of debug builds ;)
Comment on attachment 8412923 [details] [diff] [review]
Part1: Implement JSOP_INITELEM_INC and JSOP_SPREAD in the baseline compiler.

Review of attachment 8412923 [details] [diff] [review]:
-----------------------------------------------------------------

Thanks, nice work! Do you have access to the Try server or should I push this for you?

::: js/src/jit/BaselineCompiler.cpp
@@ +2364,5 @@
> +
> +    // Increment index
> +    Address indexAddr = frame.addressOfStackValue(frame.peek(-1));
> +    masm.incrementInt32Value(indexAddr);
> +

Nit: remove empty line.

@@ +2393,5 @@
> +        return false;
> +
> +    frame.popn(2);
> +    frame.push(R0);
> +

Same here.

::: js/src/vm/Interpreter.cpp
@@ +3973,5 @@
> +                                     JSPROP_ENUMERATE))
> +            return false;
> +    }
> +    resultCountVal.setInt32(count);
> +

Nit: remove this empty line.
Attachment #8412923 - Flags: review?(jdemooij) → review+
Removed those empty lines.

(In reply to Jan de Mooij [:jandem] from comment #8)
> Thanks, nice work! Do you have access to the Try server or should I push
> this for you?

I don't have access to the Try server; could you push this?

(In reply to Emanuel Hoogeveen [:ehoogeveen] from comment #6)
> ... which seems a lot more reasonable :)
(In reply to Till Schneidereit [:till] from comment #7)
> Also: don't do perf comparisons of debug builds ;)

Thank you for pointing that out.
I'll do it that way in the next patch.
Attachment #8412923 - Attachment is obsolete: true
Attachment #8413689 - Flags: review?(jdemooij)
Comment on attachment 8413689 [details] [diff] [review]
Part1: Implement JSOP_INITELEM_INC and JSOP_SPREAD in the baseline compiler.

Review of attachment 8413689 [details] [diff] [review]:
-----------------------------------------------------------------

Thanks again. I fixed some minor x86/ARM compilation failures locally and pushed to Try:

https://tbpl.mozilla.org/?tree=Try&rev=8ee6503fb938
Attachment #8413689 - Flags: review?(jdemooij) → review+
needinfo myself to land this if it's green.
Flags: needinfo?(jdemooij)
Try was green so I pushed this:

https://hg.mozilla.org/integration/mozilla-inbound/rev/c050752335d6

Thanks again, Tooru.

Let's do the JSOP_SPREAD* ops for calls in a new bug. Are you interested in adding these as well?
Assignee: general → arai_a
Status: NEW → ASSIGNED
Flags: needinfo?(jdemooij)
Summary: BaselineCompiler: Compile JSOP_INITELEM_INC and JSOP_SPREAD* ops → BaselineCompiler: Compile JSOP_INITELEM_INC and JSOP_SPREAD
Thank you!

Here is the new bug:
  bug 1003149 - BaselineCompiler: Compile JSOP_SPREAD* ops

The patch for JSOP_SPREAD* ops is almost ready.
Now I'm tuning :)
https://hg.mozilla.org/mozilla-central/rev/c050752335d6
Status: ASSIGNED → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla32