Closed Bug 742829 Opened 12 years ago Closed 12 years ago

IonMonkey: Don't allocate space for content of NewArray when data isn't set immediately

Categories

(Core :: JavaScript Engine, defect)

x86_64
Linux
defect
Not set
normal

Tracking

()

RESOLVED FIXED

People

(Reporter: h4writer, Unassigned)

References

Details

Attachments

(3 files)

Testcase:

function AESEncryptCtr() {
    new Array(256);
}
for (var i = 0; i < 300000; i++) {
    var cipherText = AESEncryptCtr();
}

IM: 341ms
IM patched: 14ms
JM:  12ms
V8:  60ms

Looking into JM I noticed the following:
[...], new Array(1, 2 ....) check if content fits into arrayslot. If it does fit the fast path is used (no allocation needed). Else the slow path gets taken. This because there is not enough space, so allocation is needed to store all data we want in array.

new Array() or new Array(size) don't check if content fit into arrayslot. This because we don't know when data will get into the array. So we lazily take the fast path and when to much information gets into array it gets allocated.

In IM we currently always check if content fits into arrayslot. So for new Array(size) we take slow path and create space for the array immediately, instead of lazily.

Fixed by patch
Attachment #612629 - Attachment is patch: true
Attachment #612629 - Flags: review?(sstangl)
Comment on attachment 612629 [details] [diff] [review]
Don't allocate space for content of NewArray when data isn't set immediately

Review of attachment 612629 [details] [diff] [review]:
-----------------------------------------------------------------

It seems strange to be optimizing Array behavior for the case of an unused allocation. Before committing this patch, would it be possible to measure SS/v8/Kraken performance deltas and post the results to this bug? I'm unsure what effect this patch will have on real benchmarks with Array allocations that are actually used.

Furthermore, maybe there are heuristics that can be used. If the allocation is small, say < 10, maybe it would be faster to perform the allocation when requested? Plenty of possible future work involved here.

::: js/src/ion/CodeGenerator.cpp
@@ +882,5 @@
>      uint32 count = lir->mir()->count();
> +    bool allocates = lir->mir()->allocates();
> +
> +    if (!gen->cx->typeInferenceEnabled() || !typeObj)
> +        return visitNewArrayCallVM(lir);

I'm not a fan of calling visitNewArrayCallVM() from two locations. It would be clearer to express all the conditions in one go:

if (!gen->cx->typeInferenceEnabled() || !typeObj ||
    (allocates && count > maxArraySlots))
{
    return ...;
}

::: js/src/ion/MIR.h
@@ +963,5 @@
>      uint32 count_;
>      // Type of the object.
>      HeapPtr<types::TypeObject> type_;
> +    // Content of array is directly allocated
> +    bool allocates_;

Instead of using a boolean, could we use a class-specific enum? (MStart contains a good example). Then instead of writing

> MNewArray(..., ..., true /* allocates */);
> MNewArray(..., ..., false /* allocates */);

we could write

> MNewArray(..., ..., NewArray_Allocating);
> MNewArray(..., ..., NewArray_Unallocating);

which is a lot nicer aesthetically. Boolean flags are really messy. Even though they're all over the codebase, we shouldn't encourage more use of them.

@@ +982,5 @@
>      types::TypeObject *type() const {
>          return type_;
>      }
> +    
> +    bool allocates() const {

isAllocating()?
Attachment #612629 - Flags: review?(sstangl) → review+
Some measurements:
https://docs.google.com/spreadsheet/ccc?key=0ArexmzLDWSfydDFEZVFwbUlEVE9VNkxZYmdTWFd4enc

Conclusions with new Array(x) and filling with x elements:
- x < 16: patched and not patched are almost identical in speed.
- x > 16: not patched version is much much faster

So I assume the reason JM does this isn't speed concerns, but memory concerns. With new Array(x) we don't know when the content will get initialized. E.g. assume in the beginning of a js file one init. dozen new Array(x) to be used in script, but doesn't need the information immediately or even ever. We will take a huge penalty at creating the array's immediately.

So to do the same as JM, we should land this. If we want to be smarter, I think we should do it for JM and IM in follow-up bug. (Didn't check interpreter, but maybe even interpreter).

A follow-up bug could be:
- Save the requested length when creating array and whenever first element get's added initialize the array with the requested length instead of doubling length every time length it is too small.
- Another one (if first idea is bad). When doubling the length check if length will be higher than the requested length and only allocate the requested length. There is a reason a developer has taken the time to put a requested length in "new Array()". If a higher value is accessed we can start to double the length again.

What do you think Sean?

Thanks for the review. I'll update the patch with the review comments if you agree we can take this possible speed penalty?
I think it's OK to land this, but I'd like to see whether the following tests using NewArray regress: SS 3d-cube, v8-splay, kraken crypto-ccm.

Additionally, it would be interesting to see the performance change in a microbenchmark that actually fills in the array entirely. Is the slowdown noticeable? It might not be.

If we get perf numbers in this bug, and they're not regressing too much -- I'd expect them to be acceptable, possibly even better depending on benchmark -- let's land it.

I wouldn't worry about any follow-up work currently; there are bigger, non-experimental fish to fry.
(In reply to Sean Stangl from comment #4)
> Additionally, it would be interesting to see the performance change in a
> microbenchmark that actually fills in the array entirely. Is the slowdown
> noticeable? It might not be.

I posted that in Comment 3. The numbers for x ranging from 1 to 1000 are in the spreadsheet. 
Patched is 1.1x to 2x slower in that case.

For completeness testcase was:

function test() {
    var a = 99;
    var arr = new Array(a);
    for(var i=0; i<a; i++) {
        arr[i] = 0
    }
}
for(var i=0; i<1000000; i++) {
    test()
}
Attached file Sunspider test results
Sunspider results

Summary:
total: 1.003x slower - not conclusive
3d-cube: 1.004x slower - not conclusive
date-format-tofte: 1.016x slower - not conclusive
date-format-xparb: 1.012x slower - not conclusive
regexp-dna: 1.031x as fast - significant
Attached file Kraken test results
total: 1.001x slower
stanford-crypto-ccm: 1.01x faster - not conclusive

Hard to tell something about individual tests as variance is too high. But doesn't look like a big hit.
Looks OK to me.
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: